IRC channel logs

2023-11-27.log

<Googulator>Had this idea today to further secure the bootstrap process by encrypting the newly created FS on the disk, e.g. using luks, to exclude the HDD firmware from potentially tampering with the build
<Googulator>If all that the HDD firmware ever sees is ciphertext, it can't possibly inject a backdoor, unless it cracks the encryption first
<Googulator>That leaves only the boot drive (which isn't necessarily the final HDD holding the bootstrapped system - see my "trusted Flash drive" plan), the CPU, the RAM, the chipset, and the NIC handling plaintext
<Googulator>NIC can be eliminated by including anything needed before HTTPS is available in srcfs
<Googulator>Oh, and of course, the BIOS / system firmware, at least on x86
<Googulator>RAM I think we can reasonably trust, as current RAM technology is pure storage
<Googulator>No place to secretly hide processing capability
<Googulator>CPU... well, if you can't trust that, all bets are off
<Googulator>Chipset generally does no processing on its own, though watch out for things like Intel ME...
<Googulator>For the bootstrap drive, I've already outlined my plans
<Googulator>And for the BIOS - ideally we can bootstrap something like u-boot, coreboot or edk2 on top of just the CPU's embedded boot ROM, which is probably unavoidable as the last vendor binary in the boot path
<Googulator>(though bootstrapping RAM training could get tricky - again, RK3588 or another similar SoC with significant SRAM available right from the reset vector is helpful here)
<Googulator>Also, the NIC interfering isn't really an issue - we already do hash checking on everything that comes through the network before using it
<Googulator>So if the NIC injects a backdoor, it will cause the hash to not match, unless the attack also includes colliding SHA-256.
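The hash check described above can be illustrated with a minimal Python sketch. This is not live-bootstrap's own code; the function name `verify_sha256` and the example payload are hypothetical, and the only assumption is that each download has a SHA-256 digest pinned ahead of time.

```python
import hashlib


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True only if data hashes to the pinned SHA-256 digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


# Hypothetical payload and pinned checksum, for illustration only.
payload = b"example source tarball contents"
pinned = hashlib.sha256(payload).hexdigest()

# Stands in for bytes that were altered somewhere between server and disk.
tampered = payload + b"\x00"

if __name__ == "__main__":
    print("clean download accepted:", verify_sha256(payload, pinned))
    print("tampered download accepted:", verify_sha256(tampered, pinned))
```

Anything modified in transit fails the comparison unless the attacker can also produce a SHA-256 collision, which is the point made in the message above.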
<notgull>Are there CPU boot ROMs that are open source?
<pabs3>does the NIC need to be involved in bootstrap?
<oriansj>notgull: not yet but we can change that too
<pabs3>surely some of the RISC-V ones are?
<oriansj>pabs3: perhaps but I haven't seen one for a RISC-V chip that I can buy and use yet.
<oriansj>Googulator: it only takes about 3K gates to backdoor a RAM chip; which on a 1Gb chip isn't much
<Googulator>Open-source doesn't help with CPU boot ROMs
<Googulator>Because they're true mask ROM
<Googulator>You can't swap it out for a source-built version
<Googulator>& a CPU with a backdoored mask ROM will probably also include a copy of a clean version of the boot ROM & redirect subsequent attempts to read it to the clean one, thwarting comparison with a source-compiled one
<Googulator>Meanwhile, testing the "simplify" branch - why do we now predownload guile-3.0.9, and especially why in 2 copies (once as gz, once as xz)?
<Googulator>In fact, looks like it now predownloads everything
<Googulator>fossy: is this a bug?
<Googulator>(I'm testing on bare metal with kernel bootstrap)
<Googulator>oriansj: can that be done in practice on a modern (LP)DDR4/5 chip while still having it function as a proper DRAM, including performance?
<Googulator>Assuming APTs don't have access to silicon manufacturing technology far more advanced than what legit DRAM makers currently use
<Googulator>fossy: right before I was about to boot into simplify on bare metal, I can see a reason why kernel bootstrap could be failing
<Googulator>The ext3 root partition is never created, only /external is
<Googulator>(why the move back to ext3, btw?)
<Googulator>sfdisk is never called during bootstrap, while generator.py only creates /external
<fossy>Googulator: yes, bug/not yet implemented, i haven't quite fixed up the external sources code
<Googulator>Why is it even hitting the externals path?
<fossy>?
<fossy>why is what hitting the externals path
<Googulator>Why is externals.img even being created? I didn't specify --external-sources
<fossy>see above, the external-sources handling/lack thereof is not quite fixed up yet
<fossy>regarding creation of ext3 root partition, what partition are you talking about exactly?
<Googulator>The one that becomes the root partition after kexecing into Linux
<fossy>ah, yes, that is as of yet unimplemented - we do not get anywhere nearly that far yet in kernel bootstrap mode
<Googulator>The mount command is there, but not the mkfs that creates it
<fossy>it fails with a segfault in make
<Googulator>Oh...
<Googulator>We fail earlier than that?
<fossy>yes, and i have no idea why :P
<Googulator>Was there any change to tcc?
<fossy>minorly - but none that are specific to kernel bootstrap
<fossy>just to be completely clear: when using a Linux kernel, it works fine
<fossy>when using kernel bootstrap, it does not work
<fossy>with precisely the same code
<Googulator>Sounds like Fiwix build issue
<Googulator>Or memory layout
<Googulator>IIRC we never use libc to build Fiwix
<Googulator>Just tcc
<fossy>Fiwix build issue, how so? the Fiwix build does not fail, again, the problem is a segmentation fault when make-3.82 runs
<Googulator>I mean Fiwix does get built, but it's wrong
<fossy>ah - right
<fossy>yes, that is not an impossibility
<Googulator>A broken tcc would explain that
<Googulator>Not outright segfaulting or throwing a compilation error, but outputting wrong code
<Googulator>Test running on baremetal now, with some of my fixes from pre-simplify ported
<fossy>memory layout was my first thought. i injected a busybox into fiwix and used that to fill the disk a bit, which did not induce any errors when i then tried to do basic functions, so i don't think it's the ramdisk's memory mapping. it could be memory mapping of userspace processes, but i'd be surprised we weren't running into it before...
<Googulator>Is the final tcc binary from the new 3-step process identical to the old 5-step one?
<fossy>not sure, but 1. simplify branch as pushed is still on 5-step process, i haven't pushed rebase, and 2. now that i have rebased locally, it still fails identically
<Googulator>Also, just noticed: simplify is "22 commits ahead, 27 commits behind master"
<fossy>(repeat) not sure, but 1. simplify branch as pushed is still on 5-step process, i haven't pushed rebase, and 2. now that i have rebased locally, it still fails identically
<Googulator>oh, right... that's a different branch
<Googulator>The make-3.82 that breaks is the one that was built with kaem, right?
<fossy>yep :)
<fossy>i've also checked binary equivalence between fiwix and non-fiwix environments, so it's not irreproducibility
<fossy>Mikaku: how do you tend to debug binaries *within* Fiwix? do you have a gdb build for Fiwix, or is it just good old printf?
<Googulator>That early make-3.82 is checksummed, and I guess it must match if it didn't fail even earlier with a checksum mismatch
<fossy>yeah exactly
<Googulator>That confirms make is being built correctly
<fossy>well, actually, make 3.82 doesn't have same checksum on simplify branch as master
<fossy>but that is to be expected due to path differences
<fossy>(i haven't updated checksums in simplify branch yet either)
<fossy>so make isn't being built irreproducibly, but it could be consistently being built incorrectly in a way that only exposes itself within Fiwix
<fossy>which would be very odd...
<Googulator>My bet would be on Fiwix itself
<Googulator>Make just happens to be what triggers it
<fossy>i agree...
<Googulator>First thing to use fork() maybe?
<fossy>nah, kaem uses fork
<Googulator>Hmm...
<Googulator>define: JOBS = 1 ( KERNEL_BOOTSTRAP == True )
<Googulator>That's different from how it was handled before...
<fossy>defined differently, but functionally the same
<Googulator>Would this mean we're passing -j1 to make now, whereas before, we would pass no -j at all?
<Googulator>"make" and "make -j1" aren't fully synonymous
<fossy>no, patch's make invocation is
<fossy>make -f Makefile PREFIX=${PREFIX}
<fossy>which is identical to before (no ${MAKEJOBS})
<fossy>(patch is first thing to use make)
<Googulator>How does JOBS get passed?
<Googulator>Is it an envvar?
<fossy>MAKEJOBS="-j${JOBS}", and then most invocations of make use ${MAKEJOBS}, but not most kaem scripts
<fossy>JOBS is unused until bash, i think
<fossy>(which maybe shouldn't be the case, but that's a separate concern)
<Googulator>OK, so make itself isn't listening for an environment variable named "JOBS"
<fossy>correct
<Googulator>I was worried "JOBS=1 make" would have the same effect as "make -j1"
<Googulator>Do you have a build of Fiwix captured on disk from one of your "Fiwix-with-disk" runs?
<Googulator>I'd like to diff it against a pre-simplify build
<fossy>no, but i can make one reasonably easily
<fossy>i think you will get a huge diff against a pre-simplify build due to the /sysa -> /steps, but happy to provide anyway
<Googulator>Do paths get included in the binary?
<Googulator>Anyway, got [error.o] Segmentation fault on bare metal too
<Googulator>This is the same error as in qemu, right?
<fossy>yes, precisely the same
<fossy>yes, paths do get included in many binaries, mostly due to mes libc + static linking
<Googulator>Hmm... weird binary size limitation or alignment requirement in Fiwix?
<Googulator>If paths are included, they could shift code around
<Googulator>Hmm... this gets even weirder
<Googulator>Just got fiwix built using chroot in both pre- and post-simplify
<Googulator>They're *binary identical*
<fossy>you beat me to it :P
<fossy>i was doing the same thing but you found the conclusion before me
<fossy>oh that would make sense cause fiwix doesnt include libc so path change shouldnt affect it
<fossy>(well unless there was something else going on but apparently there isnt)
<Googulator>Maybe fiwix only gets miscompiled if it's being built under builder-hex0?
<fossy>possibly, what's the sha for the fiwix built in chroot, i'll check the sha when built under hex0
<fossy>oh wait im dumb that's in the repo lol
<fossy>(un?)fortunately the checksums do match
<fossy>AH!
<fossy>it is something much less sinister than any of that
<Googulator>Did you find it?
<fossy>it appears to be a bad PATH variable, and environment does differ between fiwix and non-fiwix, because of the kexec and so script change
<Googulator>Oh...
<fossy>i figured this out by recompiling with -g, taking the eip emitted by fiwix's segfault code, and cross referencing with an out-of-fiwix disassembly
<fossy>yeah, because default kaem PATH is /bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games, and PATH should be /usr/bin, and /bin doesn't exist, and make is buggy with a nonexisting PATH dir so it segfaults
<fossy>gah that is really annoying but at least its found
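The failure mode above (PATH entries pointing at directories that do not exist) can be checked for with a short Python sketch. It is a hypothetical diagnostic, not the fix applied in live-bootstrap; the helper names are made up, and the `exists` predicate in the example is an assumption that pretends only /usr/bin is present (the log only states that /bin is missing).

```python
import os

PATH_SEP = ":"  # POSIX PATH separator, as in the kaem default quoted above


def missing_path_dirs(path_value, exists=os.path.isdir):
    """List PATH entries (in order, duplicates kept) that do not exist."""
    return [d for d in path_value.split(PATH_SEP) if d and not exists(d)]


def sanitized_path(path_value, exists=os.path.isdir):
    """Drop nonexistent and duplicate entries, preserving search order."""
    kept = []
    for d in path_value.split(PATH_SEP):
        if d and d not in kept and exists(d):
            kept.append(d)
    return PATH_SEP.join(kept)


# Default kaem PATH as quoted in the log.
KAEM_DEFAULT_PATH = "/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games"


def only_usr_bin(directory):
    """Assumed filesystem state for the example: only /usr/bin exists."""
    return directory == "/usr/bin"


if __name__ == "__main__":
    print("missing:", missing_path_dirs(KAEM_DEFAULT_PATH, only_usr_bin))
    print("sanitized:", sanitized_path(KAEM_DEFAULT_PATH, only_usr_bin))
```

Passing the existence check in as a parameter keeps the sketch testable without touching the real filesystem; with the default `os.path.isdir` it inspects the machine it runs on.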
<Mikaku>fossy: yes I use printk() (with -D__DEBUG__) mostly for debugging purposes
<Mikaku>fossy: sorry for a late response, now I see you managed to solve it :-)
<fossy>Mikaku: no need to apologise, thanks regardless :)
<Harzilein>hi
<matrix_bridge><Andrius Štikonas> Harzilein: hi and welcome!
<oriansj>Googulator: well yes, it can operate on the backend refresh cycle
<oriansj>Harzilein: welcome
<oriansj>fossy: great work as always ^_^
<oriansj>Mikaku: loving your Fiwix work; we really need to find some people to help you port to more architectures.
<Mikaku>oriansj: yes, I'm now focused on implementing UNIX domain sockets but I also have in mind to include at least one architecture (Raspberry Pi?) to help organize the code better
<Mikaku>having help on this process would be amazing, indeed
<GoogulatorMobile>oriansj: builder-hex0 is your project, right?
<matrix_bridge><Andrius Štikonas> GoogulatorMobile: no, that is rickmasters
<GoogulatorMobile>Oh, right
<matrix_bridge><Andrius Štikonas> And in retrospect I would suggest doing early stages slightly differently... But oh well...
<GoogulatorMobile>I was thinking, how hard would it be to make it use LBA instead of CHS for reading srcfs?
<matrix_bridge><Andrius Štikonas> We now have 2 staged approach where stage1 is very small and is basically hex0 + jump to builder-hex0
<GoogulatorMobile>I'm seeing some weird issues on my boards that appear to be CHS-related
<matrix_bridge><Andrius Štikonas> But might be nicer to do multistage where second stage understands hex1 or hex2
<matrix_bridge><Andrius Štikonas> But that's unrelated to your issue...
<oriansj>improvements are always welcome; although the earliest stages tend to be the most annoying (due to manual offset calculations)
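For readers unfamiliar with the two addressing schemes, the standard CHS/LBA mapping can be sketched in Python as below. This is not builder-hex0 code (which is written in hex2/hex0); the function names are hypothetical and the geometry values (16 heads, 63 sectors per track) are assumptions chosen for the example, since real values come from the BIOS.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cyl, sectors_per_track):
    """Map a CHS address to a linear block address (sectors are 1-based)."""
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range for this geometry")
    if not 0 <= head < heads_per_cyl:
        raise ValueError("head out of range for this geometry")
    if cylinder < 0:
        raise ValueError("cylinder must be non-negative")
    return (cylinder * heads_per_cyl + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cyl, sectors_per_track):
    """Inverse mapping: linear block address back to (cylinder, head, sector)."""
    if lba < 0:
        raise ValueError("lba must be non-negative")
    cylinder = lba // (heads_per_cyl * sectors_per_track)
    head = (lba // sectors_per_track) % heads_per_cyl
    sector = lba % sectors_per_track + 1
    return cylinder, head, sector


# Assumed geometry for illustration only.
HEADS = 16
SECTORS = 63

if __name__ == "__main__":
    print(chs_to_lba(1, 0, 1, HEADS, SECTORS))
    print(lba_to_chs(63, HEADS, SECTORS))
```

The mapping depends on the geometry the BIOS reports, so two boards that report different geometries for the same disk will translate the same CHS triple to different sectors; an LBA read avoids that dependency.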
<GoogulatorMobile>rickmasters: is there a "high-level prototype" version of builder-hex0 stage2 that can be compiled into hex0 from a higher level language?
<rickmasters>It's developed in hex2 and then you run hex2tohex0.sh to compute the offsets but hex2 is the highest language
<rickmasters>development notes here https://github.com/ironmeld/builder-hex0/blob/main/DEVELOP.md
<rickmasters>By the way, I'm testing a release of Fiwix 1.5.0 patched for live-bootstrap and it will have your PAE passthrough PR.
<rickmasters>Should be ready soon
<stikonas>nice!
<stikonas>and that hex2tohex0 is precisely the reason why it would be nice to have another small intermediate stage that is written in hex0 but can do hex2
<stikonas>(or possibly 2 stages, with hex1 in the middle)
<rickmasters>stikonas: yes I think having a hex2 stage would be better.
<rickmasters>stikonas: Initially it wasn't so clear because builder-hex0 was small enough to develop in hex0.
<rickmasters>stikonas: And we already had a hex2 compiler in user land so I focused on being able to run that to compile a later kernel.
<stikonas>well, that's some improvement for the future
<stikonas>probably should first try to get everything upstreamed
<stikonas>(Fiwix stuff)
<rickmasters>stikonas: Over time builder-hex0 just grew organically and the porting to a higher language became a big job.
<stikonas>well, complexity of more complex programs is exactly the reason why stage0-posix has those intermediate stages
<stikonas>but yes, refactoring this low level stuff is a lot of effort
<rickmasters>stikonas: but porting builder-hex0.hex0 to builder-hex0.hex2 was reasonable and writing a converter to hex0 was fairly easy with a full tool chain (although arguably cheating)
<stikonas>well, hex2 to M0 shouldn't be too hard either
<stikonas>but M0 to C would be far bigger change
<stikonas>hex2 and M0 mostly differ by having nicer macros rather than hex numbers
<rickmasters>I agree its something we can do in the future. Right now, Fiwix upstreaming, fossy's simplification, Googulator's bare metal work are higher priorities.
<stikonas>indeed, I'm especially looking at Googulator's baremetal work
<stikonas>would be nice to actually start using live-bootstrap for installing new system
<fossy>baremetal is looking *very* tricky, to be honest
<stikonas>yeah...
<stikonas>fossy: so what are the main challenges?
<stikonas>there is that BIOS map thing, but that's in review now
<rickmasters>fossy: curious what your status is on your simplification work? If you have a fix for the PATH issue perhaps I can help with kernel bootstrap if it needs further work
<stikonas>rickmasters: I guess just mkdir would workaround PATH issue
<fossy>rickmasters: I *think* that all issues relating to the kernel bootstrap directly are resolved. my current run is testing Fiwix->Linux transition. but i think that might be nearly OK; i'll let you know though :)
<fossy>stikonas: i just made sure that PATH exists in the env file
<stikonas>ok, that would work too
<rickmasters>from my notes, for bare metal we need builder-hex0 to expose the BIOS memory map,
<rickmasters>kexec-fiwix needs to pass the memory map to linux
<rickmasters>we need to remove a linux tar ball and change the linux headers accordingly,
<rickmasters>we need to change the linux config to include more drivers,
<fossy>stikonas: the main challenge that I'm not sure the best way to solve at this point is; there's a fair few drivers we could need to mount a disk (IDE/AHCI, NVME, USB). adding more drivers requires more ramdisk space, which is scarce. to get more space is difficult
<fossy>we can 2-stage linux for ethernet, etc drivers, that's pretty easily doable
<fossy>disk drivers are my primary concern
<rickmasters>we need to fix deblob of a network driver, googulator filed an issue.
<stikonas>yeah, we only need storage drivers for 1st stage
<stikonas>well, I guess we could skip deblob...
<stikonas>though the question then is. Let's say somebody has a NIC that needs blobs and still want to run bootstrap on it
<fossy>RE: linux headers, https://github.com/fosslinux/live-bootstrap/pull/334/commits/cc436d47cc801f5391c48cd9a85c2060d9ef17a9
<stikonas>how do they inject those blobs into sysc (or whatever it will become after simplification)
<fossy>don't the blobs get included in the driver's .o?
<stikonas>fossy: no, it's in /lib if I remember correctly
<stikonas>unfortunately, in the future we'll need more and more HW blobs...
<stikonas>for Wifi, the last generation that worked on linux-libre was 802.11n. For GPUs, there used to be some old nouveau cards that worked without them and up to recently Intel GPUs too. But I think not anymore...
<stikonas>well, I guess for blobless stuff people will have to turn to non-x86 arches anyway
<stikonas>fossy: so linux-firmware repo installs stuff to /lib/firmware
<fossy>ah, right
<stikonas>perhaps we need an additional parameter to inject extra stuff...
<stikonas>(subject to space availability)
<stikonas>that might actually be useful for non-kernel bootstrap testing of other arches
<stikonas>e.g. right now early stages of live-bootstrap run on riscv64 but if you want to test it on x86_64 machine (much faster than native riscv64 bootstrap), you need to inject qemu-riscv64 binary into bootstrap environment
<stikonas>(or in bwrap case it could be another --ro-try-bind argument)
<rickmasters>For bare metal storage the hardware needs to be supported by builder-hex0, Fiwix, and Linux.
<rickmasters>builder-hex0 uses BIOS with CHS interface so that's a bit different.
<rickmasters>Googulator is asking about using LBA instead of CHS in builder-hex0, which would have some effect on supported hardware.
<rickmasters>I'm thinking it may make sense to have a small list of supported hardware to be used (rarely) for verification purposes, otherwise the primary platform is qemu
<stikonas>so then people would be mostly expected to bootstrap new system using pre-existing Linux kernel?
<rickmasters>well, that's where we are now, and we're working on improving it, but I think it's reasonable to set limits
<stikonas>true...
<stikonas>though if we try to keep system small enough to fit in memory
<stikonas>then one could use builder-hex0 BIOS calls to read everything
<stikonas>Fiwix will just use memory
<stikonas>and then Linux can use storage again
<stikonas>assuming that your storage is supported by the currently bootstrapped Linux version
<rickmasters>stikonas: you're right, thanks for the correction.
<stikonas>well, you were not wrong either
<stikonas>it's just different paths we could take
<stikonas>though it might be that in the future this path will become less useful too
<stikonas>if there is completely new hardware that Linux 4.x wouldn't support
<stikonas>(same true for NICs, probably even worse)
<stikonas>perhaps that's why we need USB drivers...
<stikonas>then you could use some older USB NIC and storage...
<stikonas>anyway, it's a hard problem to make bootstrap work across wide range of hardware
<stikonas>it is indeed much simpler to target a small set
<rickmasters>fossy: Could you elaborate more on the ramdrive size limit?
<fossy>we have X MB of RAM available for the ramdisk, currently I think that's around 1280MB? (expanding that number may be a possibility in the future, but will require prospective bootstrapping systems to have more RAM...) building more drivers requires more space to build those drivers, space we don't really have in RAM
<stikonas>yeah, but how big are those drivers...
<stikonas>even full distro kernel is not that big
<fossy>most drivers are kernel modules usually
<fossy>on my system that is 155M
<fossy>but compiling it uses a fair bit more space than that :\
<fossy>due to intermediate files
<stikonas>well, my Gentoo kernel (that I didn't specifically optimize for size) is 7 MiB for kernel and 37 MB for modules
<fossy>hm, not so bad
<stikonas>but yes, compiling would use more
<stikonas>I think my modules might be compressed too
<stikonas>yeah, I have them in .ko.xz format
<rickmasters>fossy: sorry where does the 1280MB limit come from exactly?
<fossy>i *think* that is the space guaranteed by Fiwix's memory map, if i understand correctly. Googulator would know exactly
<stikonas>that might be the number on his system
<stikonas>might be BIOS dependent
<stikonas>due to those PAE maps
<fossy>it is bios dependent but there is a maximum size of the MMIO which is the upper limit
<stikonas>but yes, we should ask Googulator
<rickmasters>fossy: sounds like you're talking about the Fiwix ram drive used to build the linux kernel?
<rickmasters>If so, that is limited by available memory in builder-hex0 to hold the ext2 initrd file which is passed to Fiwix.
<rickmasters>The current initrd is 1152MB but perhaps removing a linux tar ball increased it.
<stikonas>rickmasters: do you mean decreased it?
<stikonas>we should be using less space in initrd after removing linux tarball
<rickmasters>we'd be using less memory in builder-hex0 so we'd have more room to create a bigger initrd that has more space available for building
<stikonas>I see
<stikonas>yeah, makes sense
<stikonas>I've slightly misunderstood the logic but in the end it's the same thing
<stikonas>available memory is increased, not initrd is increased....