IRC channel logs
2023-11-11.log
<Googulator>I've just realized that the part of my proposal where the boot sector just prints dots until it can see the real sector 1 has a slight vulnerability: a backdoored host could write a boot sector that actually wants to load more binary code from sector 1 and beyond, but implements the same wait-with-dots behavior until sector 1 becomes visible.
<Googulator>Also, a malicious boot sector could just read from sector 2, 4, 6, and so on, ignoring the inaccessible odd-numbered ones. All of this could enable the malicious boot sector to emulate the clean one. Luckily, it's fixable.
<Googulator>Instead of a single switch on the A8 line, have switches on all address lines A8 and up. On boot, the user needs to close them in such an order that the A8 switch is closed last, since that's what the legit boot sector checks.
<Googulator>This way, on boot, truly only the first sector is accessible.
<Googulator>Then, rather than just printing a dot every time the boot sector sees itself as sector 1, it prints a dot, followed by the entire contents of sector 1 (i.e. itself), hex-encoded. This way, a malicious boot sector has no chance of embedding a compressed copy of the clean one within itself and emulating this behavior.
<Googulator>This can be made even stronger if instead of padding the boot sector with zeros, we pad with new, cryptographically secure pseudorandom data every time we generate a bare metal image.
<Googulator>This maximally reduces the compressibility of the boot sector, leaving no space for a malicious boot sector's own code if it chooses to embed a compressed clean copy.
<Googulator>Ideally, the source of the random data should be something like /dev/random, or even a true hardware RNG. Alternatively, seed a PRNG with way more than 512 bytes of data.
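The padding argument above can be illustrated with a rough Python sketch. It is not code from live-bootstrap or builder-hex0: the names `pad_sector`, `hex_dump` and `compressed_size` are hypothetical, the 200 bytes of "boot code" are placeholder data, zlib is only a stand-in for whatever compressor an attacker might use, and the sketch ignores any boot signature bytes a real BIOS boot sector would need.

```python
import secrets
import zlib

SECTOR_SIZE = 512  # bytes per sector, as in the discussion above


def pad_sector(code: bytes, random_padding: bool) -> bytes:
    """Pad boot code to one full sector with zeros or fresh CSPRNG output."""
    if len(code) > SECTOR_SIZE:
        raise ValueError("boot code does not fit in one sector")
    gap = SECTOR_SIZE - len(code)
    padding = secrets.token_bytes(gap) if random_padding else bytes(gap)
    return code + padding


def hex_dump(sector: bytes) -> str:
    """The proposed printout: a dot, then the whole sector hex-encoded."""
    return "." + sector.hex()


def compressed_size(sector: bytes) -> int:
    """Size after zlib compression, a rough proxy for compressibility."""
    return len(zlib.compress(sector, 9))


# Hypothetical stand-in for real boot code: 200 bytes of placeholder data.
boot_code = bytes(range(200))
zero_padded = pad_sector(boot_code, random_padding=False)
random_padded = pad_sector(boot_code, random_padding=True)

# The zero-padded sector shrinks a lot; the randomly padded one does not,
# so a compressed copy of it leaves no spare room inside 512 bytes.
print(compressed_size(zero_padded), compressed_size(random_padded))
```

With zero padding the compressed copy is much smaller than a sector, which is the space a malicious boot sector could use for its own code. With random padding the compressed size stays at or above the sector size, so that space disappears.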
<oriansj>stikonas: we can always optimize the generated code in M2-Planet, it was just my first guess at how to do it (there definitely are potential improvements I missed)
<oriansj>Googulator: you could set all of the lines to be toggled off by a single switch
<Googulator>That could lead to the lines getting connected in a nondeterministic order. It's important that the lowest-numbered line (in my example, A8) is connected last. However, 2 switches should be doable.
<oriansj>assuming CRC is sufficient, you could have sector 0 validate itself prior to its use and use a manually set (and impossible to guess) 100-byte random seed to ensure an attacker can't just hardcode a crc value
<Googulator>It's actually the _legit_ sector 0 that can get confused if line A8 comes up before A9
<Googulator>In that case, it will read sector 1 just fine, but read garbage instead of subsequent sectors
<oriansj>stikonas: now to begin fuzzing the crap out of the new switch functionality
<Googulator>2 switches are needed to defend not against attacks, but rather against random glitching from a nondeterministic switching order
<Googulator>BTW, is anyone still working on bootstrapping Haskell?
<stikonas>there were a few attempts but not too successful
<stikonas>Googulator: by the way, can't you write boot sector to e.g. optical media and inspect under microscope?
<Googulator>All attempts I found so far seem to focus on v0.29
<Googulator>stikonas: not sure if that's possible with a plain old light microscope, and using an electron microscope introduces trust issues (electron microscopes are usually digital with lots of firmware, among other issues)
<stikonas>Googulator: there were attempts at far newer GHC
<Googulator>"GHC~0.26 doesn't build with HBC. (It could, but we haven't put in the effort to maintain it.)"
<stikonas>and lit files I think are not bootstrapped either
<stikonas>there was some attempt at 2.something I think
<stikonas>oh, maybe I'm confusing it with something else
<stikonas>I remember early ghc depended on something starting with l...
<stikonas>or maybe it was some other compiler, nh98 or something like that
<stikonas>anyway, I'm not most up to date on this...
<Googulator>It appears GHC v0.24 may have been the last version explicitly buildable using HBC - unfortunately it's lost, and so are all earlier versions
<Googulator>But AFAIK that was only used in super early releases (all of which are lost)
<stikonas>and these old versions would be tricky to run on new systems in any case
<Googulator>[ 4.046888] RAMDISK: gzip image found at block 0
<Googulator>[ 4.241804] RAMDISK: incomplete write (9780 != 16384)
<Googulator>[ 4.248331] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6
<Googulator>Is there some hard limit on initramfs file size?
<stikonas>I've seen cases where live-bootstrap failed to load big initramfs...
<stikonas>but that might be before kernel bootstrapping work
<stikonas>but probably we still have various limits
<stikonas>Googulator: but which bootstrap option are you using?
<Googulator>qemu with kernel bootstrapping, but I made some changes
<stikonas>I think builder-hex0 does not create gzipped initramfs
<stikonas>but I sometimes had cases of initramfs not booting when I forgot to remove some big files from /sysa/distfiles
<Googulator>I'm trying to bring the compiled Linux kernel image into the final system, so that after bootstrap, I can have a bootable image
<stikonas>Googulator: yeah, that can definitely exceed the size
<stikonas>some of the limits that we have might be fairly tight
<stikonas>partly because we keep everything in RAM before Linux
<stikonas>that's an option (though needs work in live-bootstrap)
<stikonas>but then we'll lose ability to bootstrap on hardware with e.g. nvme which Fiwix doesn't support
<Googulator>Meanwhile, just realized another problem with my Trusting-Trust-resistance plan: some of the files in srcfs are gzipped; just printing those as-is won't give us a readable printout before their contents are executed
<stikonas>Googulator: well, that's not hard to solve
<stikonas>just need to print their contents as we unzip them
<stikonas>and tars on the other hand are mostly human readable
<Googulator>builder-stage2 could then use a "zip" command, almost synonymous with "src", but not printing what it reads (because it's useless binary garbage at that point)
<oriansj>well nothing should be gzip'd or bzip'd until after we properly bootstrap their unpackers
<oriansj>well an hour of fuzzing thus far and no segfaults found yet
<oriansj>but many more tests need to be written before we know the instructions it generates produce correct behavior over all normally written code
<stikonas>well, you could also start dogfooding a bit of that in stage0
<stikonas>i.e. mescc-tools or M2-Mesoplanet could use switch now
<stikonas>some projects like to claim that they write their compiler in its own language (and make bootstrapping hard) in order to test features. But as we have shown, having bootstrap chain and also various helper tools gives you just as many tests...
<muurkha>I mean adding a feature to your language and being able to use it is fun
<muurkha>maintaining multiple slightly different copies of your compiler codebase maintained under different restrictive language subsets is the opposite of fun for most people
<stikonas>not necessarily different copies, e.g. you can have your language X written in C++, compiled with GCC, but then all other tooling (which most languages have too) can be written in that new language
<muurkha>sure, that's a thing you can do if you're a total masochist
<muurkha>or if your language is worse than C++ for writing compilers in, and a few such languages do exist. Fortran 77, for example
<stikonas>well, even inside gcc, you have stuff like fortran
<stikonas>just one of the already bootstrapped languages
<muurkha>yeah, Scheme is fairly decent. jcowan pointed out the other day that there is a widely supported (but withdrawn!) SRFI that supplies pattern matching
<muurkha>which I probably would have used for Ur-Scheme if I'd known about it
<Googulator>In kexec-fiwix.c, we tell Fiwix how much RAM to reserve for the eventual Linux kernel to kexec into
<oriansj>muurkha: I guess I am a masochist then, because I find writing a compiler in even assembly to be fun
<oriansj>and assembly is a much worse language than c++ to work in
<oriansj>(in terms of productivity and ease of improvement)
<oriansj>well it is a bad language for figuring out what you want to do
<oriansj>but it is pretty quick if you know exactly what you want to do and how to do it
<muurkha>it might depend in part on your programming environment
<muurkha>I think Minsky wrote a paper about how computers were good vehicles for poorly-thought-out ideas?
<muurkha>he liked interactively programming in machine code in DDT
<oriansj>well pdp-10s running ITS were much more fun than any system around these days
<muurkha>but I think he wrote that paper before ITS
<oriansj>and it is probably the inspiration for the LISP behavior on exceptions
<oriansj>drop into a lisp shell and fix the code and inspect/mess with the variables if needed
<muurkha>ITS probably isn't the inspiration for that; you could do that in DDT in the 01950s
<muurkha>and all the BASIC environments I've used work that way too, which I suspect is what the original Dartmouth BASIC system did in the early 01960s
<muurkha>but it's very plausible that DDT was the inspiration for that LISP behavior
<muurkha>larsb described DDT the other day as "Emacs for machine code", and I think there's something to that
<oriansj>plus the 36bit machine was quite beautiful assembly
<muurkha>it looks nice but I haven't tried using it
<muurkha>because the consistent instruction format makes it easy for DDT to follow a pointer from an instruction word to the operand word that it's referencing; it has a key for that
<muurkha>obviously that makes less sense when you're referring to a local variable in a stack frame or an instance variable in an object, but those were a lot less common than they are now
<oriansj>and they actually passed syscall arguments in registers; which meant their .text sections were actually pure assembly instructions and no data
<oriansj>which from first hand experience makes for a much simpler disassembler
<jcowan>What made ITS special was that DDT was its shell
<oriansj>13 hours of fuzzing, no segfaults yet
<vagrantc>hrm. struggling with updating the debian packages for mescc-tools 1.5+ due to the git submodule for M2libc ... which is a dependency for updating mes
<vagrantc>seems i have a few suboptimal approaches ... treat M2libc as a patch, manually create a tarball of the relevant M2libc bits, package M2libc as its own source package ... or convince upstream to not use git submodules :)
<vagrantc>or is it plausible to build M2libc independently?
<stikonas>vagrantc: why do you need to treat M2libc separately?
<vagrantc>the tarball of mescc-tools i have does not include it
<stikonas>the one in the release announcement should have
<stikonas>or the one in savannah download area too
<vagrantc>i think i'm using one auto-generated from git as a web interface
<vagrantc>and then there's the question of where do i find the most up-to-date gpg key
<vagrantc>i have yet to even find a release announcement for mescc-tools
<matrix_bridge><Andrius Štikonas> vagrantc: announcement is on bootstrappable list
<matrix_bridge><Andrius Štikonas> I pushed 1.5.0 tag to github (which had no tags) without realising that oriansj pushed 1.5.0 tag to savannah
<matrix_bridge><Andrius Štikonas> So later oriansj tagged savannah commit with 1.5.1
<vagrantc>so ... what is tagged as 1.5.1 is what is shipped as 1.5.0 ?
<vagrantc>or what is tagged as 1.5.1 is not a tag that should be used?
<vagrantc>can we get a 1.5.2 that makes this madness go away? :)
<vagrantc>or at least, sweeps the madness under the rug
<matrix_bridge><Andrius Štikonas> I had no push access to savannah before this ...
<matrix_bridge><Andrius Štikonas> There is one other small fix since then anyway
<vagrantc>maybe i am not subscribed to the bootstrappable list...
<muurkha>oriansj: as opposed to, say, the PDP-8, where jcowan tells me you normally put the arguments after the call instruction?
<stikonas>vagrantc: there is also changelog on github release page
<vagrantc>stikonas: should i pull tarballs from the github release page?
<vagrantc>let's see if i can find the right one :)
<stikonas>what changed... Though it's mostly the same content as changelog.org
<vagrantc>ok, seem to be on the right path finally...
<vagrantc>stikonas: thanks for wrapping up the confusion, but would definitely appreciate a new release that reduces the chances of confusion :)
<stikonas>yeah, I'll make a new one as soon as I sort out my misbehaving mouse (probably due to being in the middle of upgrade, none of the buttons work, so I'm keyboard only)