IRC channel logs

2023-11-11.log

<Googulator>I've just realized that the part of my proposal where the boot sector just prints dots until it can see the real sector 1 has a slight vulnerability: a backdoored host could write a boot sector that actually wants to load more binary code from sector 1 and beyond, but implements the same wait-with-dots behavior until sector 1 becomes visible.
<Googulator>Also, a malicious boot sector could just read from sector 2, 4, 6, and so on, ignoring the inaccessible odd-numbered ones. All of this could enable the malicious boot sector to emulate the clean one. Luckily, it's fixable.
<Googulator>Instead of a single switch on the A8 line, have switches on all address lines A8 and up. On boot, the user needs to close them in such an order that the A8 switch is closed last, since that's what the legit boot sector checks.
<Googulator>This way, on boot, truly only the first sector is accessible.
<Googulator>Then, rather than just printing a dot every time the boot sector sees itself as sector 1, it prints a dot, followed by the entire contents of sector 1 (i.e. itself), hex-encoded. This way, a malicious boot sector has no chance of embedding a compressed copy of the clean one within itself and emulating this behavior.
<Googulator>This can be made even stronger if instead of padding the boot sector with zeros, we pad with new, cryptographically secure pseudorandom data every time we generate a bare metal image.
<Googulator>This maximally reduces the compressibility of the boot sector, leaving no space for a malicious boot sector's own code if it chooses to embed a compressed clean copy.
<Googulator>Ideally, the source of the random data should be something like /dev/random, or even a true hardware RNG. Alternatively, seed a PRNG with way more than 512 bytes of data.
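A minimal sketch of the padding idea above, in Python. The 512-byte sector size comes from the discussion; the function names, and the choice to reserve the last two bytes for a BIOS-style 0x55 0xAA boot signature, are assumptions for illustration and not the actual image builder.

```python
import os

SECTOR_SIZE = 512             # sector size mentioned in the discussion
BOOT_SIGNATURE = b"\x55\xaa"  # assumption: BIOS-style signature in the last two bytes

def pad_boot_sector(code: bytes) -> bytes:
    """Pad boot code to one sector with fresh OS-provided random bytes instead of zeros."""
    room = SECTOR_SIZE - len(BOOT_SIGNATURE) - len(code)
    if room < 0:
        raise ValueError("boot code does not fit in one sector")
    # os.urandom draws from the operating system's cryptographic random source
    return code + os.urandom(room) + BOOT_SIGNATURE

def dot_and_hex_dump(sector: bytes) -> str:
    """What the legit boot sector would print: a dot, then itself hex-encoded."""
    return "." + sector.hex()
```

Because the padding is regenerated for every image, each printout is unique to that image, so the hex dump has to be compared against the image that was actually generated.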
<oriansj>stikonas: we can always optimize the generated code in M2-Planet, it was just my first guess at how to do it (there definitely are potential improvements I missed)
<oriansj>Googulator: you could set all of the lines to be toggled off by a single switch
<Googulator>That could lead to the lines getting connected in a nondeterministic order. It's important that the lowest-numbered line (in my example, A8) is connected last. However, 2 switches should be doable.
<Googulator>One for A8, one for everything A9 and up
<oriansj>assuming CRC is sufficient, you could have sector 0 validate itself prior to its use and use a manually set (and impossible-to-guess) 100-byte random seed to ensure an attacker can't just hardcode a CRC value
<Googulator>Sector 0 isn't the problem
<Googulator>It's actually the _legit_ sector 0 that can get confused if line A8 comes up before A9
<Googulator>In that case, it will read sector 1 just fine, but read garbage instead of subsequent sectors
<oriansj>stikonas: now to begin fuzzing the crap out of the new switch functionality
<Googulator>2 switches are needed to defend not against attacks, but rather against random glitching from a nondeterministic switching order
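The switch-ordering argument above can be modelled with a small Python sketch. It assumes that an open switch forces its address line to read as 0 and that A8 is the lowest bit of the sector index; both are assumptions made for illustration, and `visible_sector` is a hypothetical helper, not part of any existing tool.

```python
SECTOR_SHIFT = 8  # assumption: A8 is the lowest bit of the sector index

def visible_sector(requested, closed_lines):
    """Return the sector actually read when only `closed_lines` are connected.

    Lines whose switch is still open are modelled as stuck at 0.
    """
    mask = 0
    for line in closed_lines:
        mask |= 1 << (line - SECTOR_SHIFT)
    return requested & mask

# All switches open: every request lands on sector 0.
print(visible_sector(1, set()))
# A9 and up closed, A8 still open: only even-numbered sectors are reachable.
print(visible_sector(5, {9, 10, 11}))
# A8 closed before A9: sector 1 reads fine, but sector 2 aliases to sector 0.
print(visible_sector(1, {8}), visible_sector(2, {8}))
```

In this model, closing A8 last means that the moment sector 1 becomes visible, every higher sector is already addressable, which is the property the legit boot sector relies on.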
<Googulator>BTW, is anyone still working on bootstrapping Haskell?
<stikonas>Googulator: I don't think so
<stikonas>there were a few attempts but they weren't too successful
<stikonas>Googulator: by the way, can't you write boot sector to e.g. optical media and inspect under microscope?
<Googulator>Because it seems we do have surviving sources for GHC v0.26: https://hub.darcs.net/simon/ghc_old/patch/93ed5c1c4b5ca702e69fc1d21c847bd824035eff
<Googulator>All attempts I found so far seem to focus on v.029
<Googulator>v0.29
<Googulator>stikonas: not sure if that's possible with a plain old light microscope, and using an electron microscope introduces trust issues (electron microscopes are usually digital with lots of firmware, among other issues)
<Googulator>GHC v0.26 sources in Git, a bit easier to read because it has a proper web interface: https://gitlab.haskell.org/ghc/ghc/-/tree/e7d21ee4f8ac907665a7e170c71d59e13a01da09
<stikonas>Googulator: there were attempts at far newer GHC
<Googulator>newer, yes
<Googulator>but not older
<Googulator>this predates 0.29
<stikonas>oh, this is the last blog post https://www.joachim-breitner.de/blog/802-More_thoughts_on_a_bootstrappable_GHC
<Googulator>Unfortunately, https://gitlab.haskell.org/ghc/ghc/-/blob/e7d21ee4f8ac907665a7e170c71d59e13a01da09/ghc/docs/install_guide/installing.lit#L1480
<stikonas>yeah, the old ones have this problem
<Googulator>"GHC~0.26 doesn't build with HBC.  (It could, but we haven't put in the effort to maintain it.)"
<stikonas>and lit files I think are not bootstrapped either
<stikonas>there was some attempt at 2.something I think
<Googulator>isn't lit just documentation?
<stikonas>oh, maybe I'm confusing it with something else
<stikonas>I remember early ghc depended on something starting with l...
<stikonas>or maybe it was some other compiler, nh98 or something like that
<stikonas>anyway, I'm not most up to date on this...
<Googulator>It appears GHC v0.24 may have been the last version explicitly buildable using HBC - unfortunately it's lost, and so are all earlier versions
<Googulator>IIRC it was LML (Chalmers Lazy ML)
<stikonas>oh indeed
<Googulator>But AFAIK that was only used in super early releases (all of which are lost)
<Googulator>Before self-hosting was achieved
<Googulator>I already posted on r/lostmedia, but nothing found so far before v0.26: https://www.reddit.com/r/lostmedia/comments/17qcemw/fully_lost_old_versions_of_the_glasgow_haskell/?sort=new
<stikonas>and these old versions would be tricky to run on new systems in any case
<Googulator25>network issues...
<Googulator>Any ideas what could be causing this error?
<Googulator>[    4.046888] RAMDISK: gzip image found at block 0
<Googulator>[    4.241804] RAMDISK: incomplete write (9780 != 16384)
<Googulator>[    4.247450] write error
<Googulator>[    4.248331] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6
<stikonas>in live-bootstrap?
<stikonas>possibly truncated initramfs?
<Googulator>That's what it seems to be, but why?
<Googulator>Is there some hard limit on initramfs file size?
<stikonas>there might be
<stikonas>but I don't know why
<stikonas>I've seen cases where live-bootstrap failed to load big initramfs...
<stikonas>but that might be before kernel bootstrapping work
<stikonas>but probably we still have various limits
<stikonas>Googulator: but which bootstrap option are you using?
<Googulator>qemu with kernel bootstrapping, but I made some changes
<stikonas>I think builder-hex0 does not create gzipped initramfs
<stikonas>hmm
<Googulator>This is the 2nd initramfs
<Googulator>built in Fiwix for Linux
<stikonas>but I sometimes had cases of the initramfs not booting when I forgot to remove some big files from /sysa/distfiles
<Googulator>I'm trying to bring the compiled Linux kernel image into the final system, so that after bootstrap, I can have a bootable image
<stikonas>x86 imposes quite a few limits
<stikonas>due to 32-bit memory addressing
<stikonas>and kernel uses 1GiB internally
<stikonas>so only 3 GiB are left
<stikonas>Googulator: yeah, that can definitely exceed the size
<stikonas>some of the limits that we have might be fairly tight
<stikonas>partly because we keep everything in RAM before Linux
<stikonas>Fiwix could use hard drives
<stikonas>that's an option (though needs work in live-bootstrap)
<stikonas>but then we'll lose ability to bootstrap on hardware with e.g. nvme which Fiwix doesn't support
<Googulator>Meanwhile, just realized another problem with my Trusting-Trust-resistance plan: some of the files in srcfs are gzipped; just printing those as-is won't give us a readable printout before their contents are executed
<stikonas>Googulator: well, that's not hard to solve
<stikonas>just need to print their contents as we unzip them
<stikonas>and tars on the other hand are mostly human readable
<Googulator>Yeah, printing on ungz could help there
<Googulator>builder-stage2 could then use a "zip" command, almost synonymous with "src", but not printing what it reads (because it's useless binary garbage at that point)
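A rough Python sketch of the "print on ungz" idea: decompress the stream in chunks and echo every chunk as it is unpacked, so the readable contents appear on the printout. The function name, its parameters, and the chunk size are assumptions for illustration; the real unpacker in the bootstrap chain is not written in Python.

```python
import gzip
import io
import sys

def ungz_and_print(compressed, out=sys.stdout, chunk_size=4096):
    """Decompress a gzip stream chunk by chunk, echoing each chunk as it is unpacked."""
    unpacked = bytearray()
    with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            # latin-1 maps every byte to exactly one character, so nothing is dropped
            out.write(chunk.decode("latin-1"))
            unpacked.extend(chunk)
    return bytes(unpacked)
```

The compressed input itself is never printed here, which corresponds to the proposed "zip" command that reads without echoing.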
<oriansj>well nothing should be gzip'd or bzip'd until after we properly bootstrap their unpackers
<oriansj>well an hour of fuzzy thus far and no segfaults found yet
<oriansj>^fuzzy^fuzzing^
<stikonas>that's good
<oriansj>but many more tests need to be written before we know the instructions it generates produce correct behavior over all normally written code
<stikonas>well, you could also start dogfooding a bit of that in stage0
<stikonas>i.e. mescc-tools or M2-Mesoplanet could use switch now
<oriansj>and mescc-tools-extras
<stikonas>some projects like to claim that they write their compiler in its own language (and make bootstrapping hard) in order to test features. But as we have shown, having a bootstrap chain and also various helper tools gives you just as many tests...
<muurkha>they're more of a pain to maintain
<muurkha>I mean adding a feature to your language and being able to use it is fun
<muurkha>maintaining multiple slightly different copies of your compiler codebase under different restrictive language subsets is the opposite of fun for most people
<stikonas>not necessarily different copies, e.g. you can have your language X written in C++, compiled with GCC, but then all other tooling (which most languages have too) can be written in that new language
<muurkha>sure, that's a thing you can do if you're a total masochist
<muurkha>or if your language is worse than C++ for writing compilers in, and a few such languages do exist. Fortran 77, for example
<stikonas>well, even inside gcc, you have stuff like fortran
<stikonas>yeah...
<stikonas>well, doesn't have to be C++
<stikonas>just one of the already bootstrapped languages
<muurkha>yeah, Scheme is fairly decent. jcowan pointed out the other day that there is a widely supported (but withdrawn!) SRFI that supplies pattern matching
<muurkha>which I probably would have used for Ur-Scheme if I'd known about it
<Googulator>Found the initrd size limit...
<Googulator>In kexec-fiwix.c, we tell Fiwix how much RAM to reserve for the eventual Linux kernel to kexec into
<Googulator> kexec_proto=linux kexec_size=67000
<oriansj>muurkha: I guess I am a masochist then, because I find writing a compiler in even assembly to be fun
<muurkha>:)
<oriansj>and assembly is a much worse language than c++ to work in
<oriansj>(in terms of productivity and ease of improvement)
<muurkha>maybe, I like assembly better
<muurkha>but it's certainly slower going
<oriansj>well it is a bad language for figuring out what you want to do
<oriansj>but it is pretty quick if you know exactly what you want to do and how to do it
<muurkha>it might depend in part on your programming environment
<muurkha>I think Minsky wrote a paper about how computers were good vehicles for poorly-thought-out ideas?
<muurkha>he liked interactively programming in machine code in DDT
<muurkha>yes, in 01967: "WHY PROGRAMMING IS A GOOD MEDIUM FOR EXPRESSING POORLY UNDERSTOOD AND SLOPPILY-FORMULATED IDEAS" https://web.media.mit.edu/~minsky/papers/Why%20programming%20is--.html
<oriansj>well pdp-10s running ITS were much more fun than any system around these days
<muurkha>you can run one now if you want
<muurkha>but I think he wrote that paper before ITS
<oriansj>and it is probably the inspiration for the LISP behavior on exceptions
<oriansj>drop into a lisp shell and fix the code and inspect/mess with the variables if needed
<muurkha>ITS probably isn't the inspiration for that; you could do that in DDT in the 01950s
<muurkha>and all the BASIC environments I've used work that way too, which I suspect is what the original Dartmouth BASIC system did in the early 01960s
<muurkha>but it's very plausible that DDT was the inspiration for that LISP behavior
<muurkha>larsb described DDT the other day as "Emacs for machine code", and I think there's something to that
<oriansj>plus the 36bit machine was quite beautiful assembly
<muurkha>it looks nice but I haven't tried using it
<muurkha>it's a big benefit for DDT though
<muurkha>because the consistent instruction format makes it easy for DDT to follow a pointer from an instruction word to the operand word that it's referencing; it has a key for that
<muurkha>obviously that makes less sense when you're referring to a local variable in a stack frame or an instance variable in an object, but those were a lot less common than they are now
<oriansj>and they actually passed syscall arguments in registers; which meant their .text sections were actually pure assembly instructions and no data
<oriansj>which from first hand experience makes for a much simpler disassembler
<jcowan>What made ITS special was that DDT was its shell
<oriansj>13 hours of fuzzing, no segfaults yet
<vagrantc>hrm. struggling with updating the debian packages for mescc-tools 1.5+ due to the git submodule for M2libc ... which is a dependency for updating mes
<vagrantc>seems i have a few suboptimal approaches ... treat M2libc as a patch, manually create a tarball of the relevant M2libc bits, package M2libc as its own source package ... or convince upstream to not use git submodules :)
<vagrantc>or is it plausible to build M2libc independently?
<vagrantc>and then just use that?
<stikonas>vagrantc: why do you need to treat M2libc separately?
<stikonas>it's included in the tarball
<vagrantc>stikonas: which tarball?
<stikonas>vagrantc: release tarball
<vagrantc>the tarball of mescc-tools i have does not include it
<stikonas>where did you get your tarball from?
<stikonas>the one in the release announcement should have
<stikonas>or the one in savannah download area too
<vagrantc>ok, will look for that one.
<vagrantc>i think i'm using one auto-generated from git as a web interface
<vagrantc>and then there's the question of where do i find the most up-to-date gpg key
<vagrantc>hrm. https://download.savannah.nongnu.org/releases/mescc-tools/ only has 1.5.0 and 1.5.1 is out ...
<vagrantc>i have yet to even find a release announcement for mescc-tools
<matrix_bridge><Andrius Štikonas> vagrantc: announcement is on bootstrappable list
<matrix_bridge><Andrius Štikonas> 1.5.0 is the latest
<matrix_bridge><Andrius Štikonas> Though we messed up with tarballs a bit
<matrix_bridge><Andrius Štikonas> Or git tags
<matrix_bridge><Andrius Štikonas> I pushed 1.5.0 tag to github (which had no tags) without realising that oriansj pushed 1.5.0 tag to savannah
<matrix_bridge><Andrius Štikonas> So later oriansj tagged savannah commit with 1.5.1
<matrix_bridge><Andrius Štikonas> vagrantc: and my key is at https://stikonas.eu/andrius.asc
<vagrantc>so ... what is tagged as 1.5.1 is what is shipped as 1.5.0 ?
<matrix_bridge><Andrius Štikonas> On savannah yes
<vagrantc>or what is tagged as 1.5.1 is not a tag that should be used?
<matrix_bridge><Andrius Štikonas> On github is 1.5.0 both tag and release
<vagrantc>can we get a 1.5.2 that makes this madness go away? :)
<matrix_bridge><Andrius Štikonas> I guess we could...
<vagrantc>or at least, sweeps the madness under the rug
<matrix_bridge><Andrius Štikonas> I had no push access to savannah before this ...
<matrix_bridge><Andrius Štikonas> So didn't pay so much attention to it :(
<matrix_bridge><Andrius Štikonas> Anyway, for now use that tarball
<matrix_bridge><Andrius Štikonas> I can later create a new one
<matrix_bridge><Andrius Štikonas> There is one other small fix since then anyway
<vagrantc>maybe i am not subscribed to the bootstrappable list...
<muurkha>oriansj: as opposed to, say, the PDP-8, where jcowan tells me you normally put the arguments after the call instruction?
<stikonas>vagrantc: there is also changelog on github release page
<vagrantc>stikonas: should i pull tarballs from the github release page?
<stikonas>should be the same
<stikonas>I'll upload to both
<stikonas>so savannah might be nicer in principle
<stikonas>but up to you
<vagrantc>let's see if i can find the right one :)
<stikonas>but you can at least read here https://github.com/oriansj/mescc-tools/releases/tag/Release_1.5.0
<stikonas>what changed... Though it's mostly the same content as changelog.org
<vagrantc>ok, seem to be on the right path finally...
<vagrantc>stikonas: thanks for wrapping up the confusion, but would definitely appreciate a new release that reduces the chances of confusion :)
<stikonas>yeah, I'll make a new one as soon as I sort out my misbehaving mouse (probably due to being in the middle of upgrade, none of the buttons work, so I'm keyboard only)