IRC channel logs

2021-12-31.log

<muurkha>I could be wrong but I don't think any of us here has significant hardware expertise, so we may not be in the best position to assess the threat
<muurkha>I mean I don't, and you don't, oriansj. maybe fossy or stikonas[m] or bauen1 or somebody does
<oriansj>muurkha: absolutely correct.
*vagrantc is racking up an extensive track record for breaking hardware
<oriansj>It is kinda my last bootstrapping goal before I am *done* with a minor hope someone else does it first. Then one needs only that information printed and a simple hex0-monitor
<bauen1>oriansj: that's a shame, it would be interesting to read that paper
<bauen1>muurkha: i have experience with building toy cpus / vhdl / logic, but not with the reverse engineering of hardware (and the electrical / transistor knowledge that requires)
<oriansj>also it'll be nice to see knight in an FPGA and hopefully in real hardware
<stikonas>fossy: how do you compute checksums in qemu? hacked the build to output it to the screen?
<stikonas>at some point I'll have to recalculate kexec checksum...
<stikonas>I briefly tried changing sources directory, it's only a few lines of change but then a lot of checksums change
<muurkha>bauen1: yeah, that's more or less where I am too. I've spent maybe 100 hours in my life debugging logic circuits on breadboards, but my VHDL/Verilog knowledge is all spectator-sport armchair-quarterback bullshit, unlike yours
<fossy>stikonas[m]: hm? now that I'm checksumming packages rather than individual files, i'm just running sha256sum in the final bash prompt after everything finishes
<fossy>i don't have hardware experience, muurkha
<doras>If we're discussing sources, the next annoying part I'm struggling with is the nontrivial logic which determines where each tar should be staged under "/after".
<stikonas>doras: well, that can be changed but we need to first wait for fossy to land his changes
<stikonas>doras: the simplest change is to leave everything in sources and move logic of moving tarballs into bootstrap chain
<stikonas>or actually copying rather than moving
<stikonas>since we only build cp binary in stage0-posix
<doras>Makes sense.
<stikonas>or maybe make source packages know that sources are in /sources
<stikonas>and unpack from there
<stikonas>it makes it a bit less nice from development perspective but I guess that will be manageable
<stikonas>right now one has to maintain only downloadable list in python and checksum file
<stikonas>moving into bootstrap means that we'll have a third place
<stikonas>although logic in python will be simplified (only download into /sources directory)
<doras>I was thinking of staging sources in tmp/after/sources and then have the bootstrap figure out things from there.
<doras>But it's not so simple because the current logic just extracts every tar under tmp/after/<package>/src.
<doras>So each "package" extracts its own sources and doesn't touch others'.
<doras>I guess it can be achieved by being more explicit in the package logic and list exactly which tar is required for each package at the bootstrap level.
<stikonas>doras: well, I think just in tmp/sources
<stikonas>I'll later rename after to sysa too
<stikonas>but that operation changes a lot of checksums
<stikonas>and fossy is reworking checksum code right now
<doras>From what I could tell the sources are only relevant for the current tmp/after packages.
<doras>But either works
<stikonas>well, and sysc also has source tarballs
<stikonas>they are in sysc/usr/src/
<stikonas>but at that time we have bash, so easier to script
<doras>Haven't gotten to sysc yet :)
<stikonas>anyway, we can simplify quite a bit of preparation code
<stikonas>but a bit will still be needed
<stikonas>possibly something like this
<stikonas>1) clone stage0-posix
<stikonas>2) clone live-bootstrap
<stikonas>if you want to run manually without python then 3) copy live-bootstrap over on top of stage0-posix
<stikonas>4) download sources into /sources directory
<stikonas>and run kaem-optional-seed
<stikonas>I think that's the maximum we can achieve
<stikonas>or maybe run bchroot kaem-optional-seed
<doras>That would be so much better if we could do that.
<stikonas>yeah, I think that's doable
<stikonas>at least run with some isolation
<stikonas>not sure about run without isolation (just on normal system) without any chroots or anything
<stikonas>stage0-posix can run like that but not live-bootstrap
<doras>If we could also convert the current source download logic to rely on a yaml manifest or similar, it would be so much easier to work with.
<stikonas>well, that should be also doable
<stikonas>yaml in python is easy enough to work with
<stikonas>just need to decide on format of that file
<stikonas>could be something like package at one level, then tarball names as child nodes, and then each tarball's child node is the url
<stikonas>but anyway, that's for later
<bauen1>i do remember that i still have some code that uses gnu stow to make pseudo packages, is that still of interest for live-bootstrap ?
<doras>It's better to keep the format as flat as possible. I think this would work well: each node contains: url, destination directory and file name.
<doras>So for example for mes you'd have: url: https://github.com/oriansj/mes-m2/archive/75a50911d89a84b7aa5ebabab52eb09795c0d61b.tar.gz, destination: sysa/tmp/after/mes/src, filename: mes.tar.gz
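[Editor's sketch, not from the channel: a minimal illustration of the flat manifest doras describes, assuming each entry carries exactly the url, destination and filename fields. The names MANIFEST, target_path and validate are hypothetical; the manifest is written as a plain Python list so the sketch needs no YAML library, though the same shape could be loaded from a YAML file.]

```python
import posixpath

# Hypothetical flat manifest: one entry per tarball, mirroring the
# url / destination / filename fields proposed in the discussion.
MANIFEST = [
    {
        "url": "https://github.com/oriansj/mes-m2/archive/75a50911d89a84b7aa5ebabab52eb09795c0d61b.tar.gz",
        "destination": "sysa/tmp/after/mes/src",
        "filename": "mes.tar.gz",
    },
]

REQUIRED_KEYS = {"url", "destination", "filename"}


def validate(manifest):
    """Raise ValueError if any entry lacks one of the required keys."""
    for entry in manifest:
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"manifest entry missing {sorted(missing)}")
    return True


def target_path(entry):
    """Return the path where the tarball for this entry would be staged."""
    return posixpath.join(entry["destination"], entry["filename"])
```

[Keeping every entry flat means the download code only has to loop over the list once; grouping by package, if wanted, can be derived from the destination field.]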
<stikonas>bauen1: I think fossy is replacing it with something else
<stikonas>but I'm not sure yet how common fossy's and your packages are
<stikonas>i.e. are they one or the other or orthogonal
<stikonas>yeah, maybe flat is better...
<doras>stikonas: I guess it would also be useful to have the hash in the manifest for downloading purposes, but that would duplicate the package hashes to two files, won't it?
<doras>I mean the tar hashes.
<stikonas>hmm, it would...
<stikonas>but maybe you can just download and run sha256sum on everything?
<stikonas>we'll see
<stikonas>that's a bit too far to plan now
<doras>Hmmm...
<stikonas>first I wanted to simplify after->sysa rename
<doras>When are the hashes currently checked?
<stikonas>on download
<doras>Only on download?
<stikonas>rootfs.py checks them
<stikonas>yes, only on download
<doras>Oh
<stikonas>well, download happens on each run
<doras>So it wouldn't duplicate the hashes, it would just move them to the manifest.
<stikonas>hmm, that's right
<doras>And then when a download completes, the hash would be verified.
<stikonas>so maybe it's fine
<stikonas>although now producing hash file was really nice...
<stikonas>just sha256sum * >> ../SHA256SUMS.sources
<doras>I think we can create a helper script for this.
<stikonas>but it's fine, we don't change them often
<stikonas>yeah, we can do a script too
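[Editor's sketch, not from the channel: one way the helper script mentioned above might look, assuming it should both emit sha256sum-style lines for a directory of tarballs and verify a downloaded file against an expected hash. The function names sha256_file, checksum_lines and verify are hypothetical, and this is not how rootfs.py is actually written.]

```python
import hashlib
import os


def sha256_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(directory):
    """Build 'hash  name' lines (sha256sum text format) for regular files, sorted by name."""
    lines = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            lines.append(f"{sha256_file(path)}  {name}")
    return lines


def verify(path, expected_hex):
    """Return True if the file at path hashes to expected_hex."""
    return sha256_file(path) == expected_hex.lower()
```

[The output of checksum_lines could be written to a file such as SHA256SUMS.sources, or the per-file hashes copied into the manifest; verify would then run once each download completes.]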
<stikonas>ok, I'll be back in an hour or so...
<doras>It may even be useful to have separate manifests for sysa, sysb and sysc, so it's easy to track exactly which sources each require, so if someone only wants to bootstrap sysa, they wouldn't need to download all the rest.
<doras>I couldn't find where the stage0 binary checksums are verified. The ones under sysa/stage0-posix/checksums.
<doras>Perhaps they aren't?
<stikonas>doras: they are verified by stage0-posix
<stikonas>doras: https://github.com/oriansj/stage0-posix/blob/master/x86/kaem.run#L44
<stikonas>using the sha256sum that we just built
<doras>stikonas: so is sysa/stage0-posix/checksums an unused leftover in live-bootstrap?
<stikonas>probably, let me check
<stikonas>yes, it's not used
<stikonas>I can delete it
<stikonas>it might be from before stage0-posix had added checksums
<stikonas>doras: pushed
<stikonas>anyway, keep in mind that these checksum files are just for us to be more confident that bootstrap is reproducible
<stikonas>but it's not "a proof"
<stikonas>i.e. one can imagine some malicious sha256sum binary that prints fake checksum
<stikonas>for somebody to prove that there is no backdoor in sha256sum, one has to externally check hex0
<stikonas>(i.e. not with sha256sum that we just built)
<stikonas>this whole thing is I guess related to Goedel's incompleteness theorem
<stikonas>that no consistent system containing arithmetic can prove its own consistency
<doras>I just gave a Matrix-level thumbs up. I wonder what it did in IRC, if at all.
<bauen1>doras: nothing
<doras>:)
<doras>👍️
<doras>Does that look better?
<bauen1>well, i probably don't have the appropriate font installed because it renders as a box
<bauen1>how far is live-bootstrap at this point ? from the readme it can finally compile a linux kernel, so it can't be that far away, right ?
<bauen1>oh and it can build guile, so that should be pretty close to bootstrapping guix ?
<stikonas>I can see thumbs up icon on my font
<stikonas>bauen1: yes, it can build guile and gcc 4.7.4
<stikonas>bauen1: although, guix bootstrap would probably belong in a project on top of live-bootstrap
<stikonas>(and other distros too)
<bauen1>stikonas: i agree
<stikonas>bauen1: we initially built guile to be able to run autogen
<bauen1>so that means live-bootstrap still needs to build a newer version of gcc ?
<stikonas>bauen1: but autogen turned out to be fairly impossible to bootstrap without using pre-generated files
<stikonas>bauen1: well, we can get a newer version
<stikonas>it's just a couple of packages on top
<doras>stikonas: is SHA256SUMS.sources currently copied to sysa/tmp/after for no reason?
<stikonas>although due to lack of autogen, binutils and gcc build scripts are a bit nasty
<stikonas>doras: yes, right now for no reason
<bauen1>stikonas: and the pregenerated files are too big to write from hand / modify ?
<stikonas>it was for potential future use if we decide to check tarballs inside bootstrap
<stikonas>bauen1: for autogen it's not clear how to approach that
<stikonas>for gcc/binutils, Makefile.in uses autogen mostly as templating engine
<stikonas>i.e. one could rewrite it to use something else (e.g. jinja2)
<stikonas>but that's still quite a bit of work
<stikonas>for autogen, all the versions that we found (even the very first version in git) have files that were generated with autogen
<stikonas>if you want, you can take a look
<stikonas>but both me and fossy tried to look a bit
<stikonas>and it's just too scary
<bauen1>> AutoGen will accept either its own definition format, or XML files as definition input, in addition to CGI data (for producing dynamic HTML) and traditional AutoGen definitions.
<stikonas>bauen1: well, maybe one can try to do the same with autogen
<bauen1>yes that does indeed look scary
<stikonas> https://git.savannah.gnu.org/cgit/autogen.git/tree/autoopts?id=8c4ae21e5a19d32036965da9753c6b2be9b753e0
<stikonas>this is the first version
<stikonas>that is in git
<stikonas>although, I seem to remember that fossy said it was broken
<bauen1>stikonas: how do i recognise a file built by autogen ?
<stikonas>well, tpl files are all autogen input files
<stikonas>maybe makefile has some hint
<stikonas>well, in gcc it explicitly says autogenerated by
<bauen1>oh what the fuck
<stikonas>e.g. https://raw.githubusercontent.com/gcc-mirror/gcc/master/Makefile.in
<bauen1>fossy: what's broken about the first version of autogen ?
<stikonas># Makefile.in is generated from Makefile.tpl by 'autogen Makefile.def'.
<stikonas>bauen1: also autogen tightly integrates with guile and probably won't always work with guile 3, so that might give extra headache
<stikonas>although, maybe that's solvable
<bauen1>stikonas: so the first version contains 460 + 274 + 278 lines of autogen input, so that doesn't seem that bad, and most of it could be skipped if it's about options we don't care about
<stikonas>bauen1: I can't even run configure there configure: error: can not find sources in . or ..
<stikonas>although, maybe one can try to build without autotools
<stikonas>bauen1: ok, another folder clearly contains autogen generated c file https://git.savannah.gnu.org/cgit/autogen.git/tree/compat?id=8c4ae21e5a19d32036965da9753c6b2be9b753e0
<stikonas>strsignal.h is autogenerated
<bauen1>yeah, why would you even do that ....
<stikonas>fossy even tried to write to the autogen maintainer
<stikonas>but we didn't get any useful answer. Just something like: you either need autogen or have to use the files from the repo/tarball
<bauen1>urgh
<stikonas>bauen1: so that is basically why nobody added autogen to live-bootstrap yet...
<stikonas>if somebody can untangle this mess it would definitely be good
<stikonas>but I'm not very hopeful on this
<stikonas>maybe somebody who is familiar with autogen and scheme already can do a quicker job
<stikonas>but starting from scratch is hard
<bauen1>stikonas: if building the version that you've linked "initial revision" will help, that should be doable at least
<stikonas>well, it's just a guess now, but presumably one can try to build on top of that
<stikonas>well, that's how I managed to build perl
<stikonas>but it was nowhere near as bad
<stikonas>I had to rewrite some perl scripts from perl 5.000 in awk
<stikonas>and then we did a few jumps via intermediate perl versions (maybe 6 of them)
<bauen1>stikonas: i suspect you'll need a few more for autogen :(
<stikonas>probably
<stikonas>e.g. replaced opcode.perl with https://github.com/fosslinux/live-bootstrap/blob/master/sysa/perl-5.000/files/opcode.awk
<stikonas>but these are fairly small scripts
<bauen1>of course somebody had the smart idea to add an eval / shell command to a text processor
<bauen1>yep that first commit is horribly broken
<bauen1>like, where tf did the `src` directory go
<bauen1>stikonas: by cleverly defining the HAVE_XXX macros you could exclude most (if not all) code in the compat directory, but the src folder is just missing
<stikonas>yeah, so maybe don't look at that first commit
<bauen1>yeah, i'll go back to playing with hardware i think
<stikonas>yeah, autogen is mean :D
<stikonas>but unfortunately gcc and binutils use it for Makefile.in :(