IRC channel logs

2022-04-03.log

back to list of logs

<oriansj>muurkha: I don't think it solves the exact same problem but perhaps a related one
<muurkha>well, to solving the Karger–Thompson attack, which is one objective of stage0 etc.
<oriansj>true
<nimaje>well, you want proofs for (most of) your bootstrap, that doesn't mean that you need to verify them via some tiny seed, you could run the verification on a bootstrapped system
<vagrantc>verifying the bootstrap using the tools the bootstrap generated?
<oriansj>well the problem of creating a trusted system for doing the verifying; is kinda why it is related but not exactly the same thing
<oriansj>which honestly is why the assumption: user has a trusted way to get the root binaries into RAM to run our builds is unavoidable
<muurkha>nimaje: if the proofs of your bootstrap are unsound, perhaps the unsoundness is that your bootstrap contains a backdoor that looks for your proof verifier and tweaks its answer to be "accept" in some case
<oriansj>punting it to hand toggling memory and/or specialized hardware is a much easier thing to do
<muurkha>so it would be worthwhile to run the proof verifier on a system that isn't vulnerable to the thing you're trying to verify
<muurkha>and there are Metamath Zero verifiers that are quite small in machine code
<littlebobeep>oriansj: Why does bootstrap seed for x86-32 need to be under 510 bytes?
<oriansj>littlebobeep: the Master Boot Record is only 512 bytes in size and the last 2 bytes are reserved for a signature that is mandatory
<littlebobeep>Oh it needs to fit in MBR?
<muurkha>heh, "x86-32"
<oriansj>littlebobeep: you can fit a whole operating system in that space: https://github.com/nanochess/bootOS
<oriansj>or a lisp: https://github.com/jart/sectorlisp
<oriansj>or a FORTH: https://github.com/cesarblum/sectorforth
<oriansj>so why if you wanted a minimal and trivial to audit root binary would you go larger?
<vagrantc>easter eggs
<oriansj>well a trusting trust attack would certainly be one hell of an easter egg
<oriansj> https://gitlab.com/bauen1/stage0-backdoor.git
<muurkha>not sure if bootOS is actually an OS, but it's about as much of an OS as MS-DOS
<muurkha>it can load and run programs, and you can write programs in it
<littlebobeep>muurkha: Yeah just specifying bit depth
<littlebobeep>speaking of bit depth why does TCC not work on x86_64?
<oriansj>littlebobeep: TCC does work on x86_64
<oriansj>it is Gnu Mes that only supports x86-linux and armhf-linux
<muurkha>littlebobeep: it's not wrong!
<littlebobeep>oriansj: Oh wut okay someone told me in #guix that it was TCC's limitations that cause all Guix bootstrap binaries to be x86-32
<littlebobeep>oriansj: Do you know what it is about GNU Mes that only supports those two 32-bit ISAs? Why not x86-64 or aarch64 or riscv64?
<oriansj>littlebobeep: short answer is because no one has taken the time to do the porting work yet nor funded janneke to work on it yet
<oriansj>So it can be done and stage0-posix+TCC both support AMD64 so it would enable live-bootstrap to have a start on that port too. (not sure where the next blocker will be yet though)
<littlebobeep>Hmmmm okay well isn't it written in C so should be portable...?
<littlebobeep>Ahh I thought stage0 was 32-bit only
<oriansj>littlebobeep: I wonder how seeing AArch64, AMD64 and riscv64 https://github.com/oriansj/stage0-posix would indicate it was 32bit only?
<oriansj>also C can be extremely portable if done with care and interpreted languages like scheme and even common lisp can become non-portable if one with careless
<oriansj>So the language isn't the deciding factor on how portable something is, so much as the consideration done by the developer(s)
<oriansj>(even macro assembly can be portable to different hardware architectures and operating systems)
<muurkha>C can be portable or nonportable
<oriansj>(things below macro assembly however tend not to be portable simply due to the high bar required to make them portable)
<oriansj>(portable hex0 is possible if one is willing to do custom hex0 ELF headers)
<oriansj>also MesCC is portable to Mes.c and guile; not sure how many other schemes it runs well on though
<oriansj>and Guile does run on 6x86-pae, AMD64, AArch64, armel, armhf, mips, mips64el, mipsel, ppc64el, s390x (but not on mips64, ppc64, ppc32el, ppc32, s390, riscv32, riscv64 or a bunch of other architectures that having working C compilers)
<muurkha>interesting, I didn't know that about Guile!
<oriansj>and because M2-Planet supports Knight and there are C compilers for both FORTH machines and Lisp machines. One could argue C is the most portable language; it just has a bunch of undefined behaviors that people tend to abuse.
<muurkha>I think C has a lot of implementation-specific behavior too
<muurkha>like, 65535u + 1u isn't undefined, but whether it ends up being 65536u or 0u depends on the machine's word size
<oriansj>muurkha: I mean one can't even assume -1+1 = 0 in C (it can be -0)
<muurkha>I think -0 == 0 though
<oriansj>not in terms of behavior: https://en.wikipedia.org/wiki/Ones'_complement#Negative_zero
<muurkha>I don't know, I've never programmed on a one's-complement machine
<muurkha>but I thought C guaranteed that they tested equal
<oriansj>well I am not certain of the current standard in that regard but I have seen C code which made a serious point about ((0 == a) || (-0 == a))
<muurkha>I see
<muurkha>I don't think -1+1 is undefined behavior either though
<muurkha>(independent of whether C guarantees that -1+1 == 0 I think it does guarantee it won't make demons fly out of your nose)
<oriansj>one also has to remember not all C compilers behave according to the spec
<oriansj>and valid C code for one compiler might not be valid for another C compiler
<muurkha>aye
<oriansj>but then again, I don't know of any real programming language that it couldn't also be said about.
<muurkha>sure, once you have multiple implementations they have differences
<muurkha>C tries to expose the machine more than high-level languages do, though ANSI has rolled that back pretty far in recent decades
<oriansj>the best one can hope is the Turing core and import functionality is universal or eventually one implementation will "win" and then bleh
<muurkha>burn it down, burn it all down
<muurkha>we don't need no water
<oriansj>muurkha: wouldn't work, we have a C compiler and scheme bootstrapped
<oriansj>and neither of those languages pass on the universal import functionality front
<oriansj>so there is always going to be a "winner" for those languages and no way to avoid it
<oriansj>not that it would be hard to make a "P" language which would be 99% compatible with C and without that sort of detail but I doubt we could convince anyone to bother to adopt it
***alMalsamo is now known as littlebobeep
<oriansj>heck just replace int, char, MES and long in M2-Planet with int32_t, int8_t, uint64_t and int64_t; pull out the macro functionality and stict with the CONSTANT keyword and you'll have it
<oriansj>^stict^stick^
<oriansj>p $LIBRARY_PATH/libp.p hello-world.p -o hello-world
<oriansj>might make for some ugly compile commands but it'll be impossible for one not to know where the source they are building from and it'll be impossible for any particular implementation of that language to "win" as any competitor could just support that turing core and compile behavior and build every program written in that language
<muurkha>yeah
<oriansj>possibly toss in a rule in the language specification banning the use of ASM() outside of a single standard library which provides access to low level functionality (syscalls sort of fun)
<oriansj>then autotools doing things like: checking size of int8_t and checking size int64_t wouldn't happen
<littlebobeep>oriansj: You mentioned coding a kernel in ASM so we wouldn't depend on Linux binary to bootstrap, but ASM for what ISA? This would not be portable to other ISAs...
<oriansj>littlebobeep: it is something I will have to do several times honestly; just like I wrote C compilers in assembly multiple times (knight, x86, amd64) and with luck other people might want to help (like I got for AArch64, riscv32 and riscv64) and they just leverage my work to speed up their own
<oriansj>So ideally I would have a minimal one written in C that is good enough for live-bootstrap to build TCC. Snapshot that code and just hand compile to assembly; then continue the development on the C version until it is powerful enough to run all the steps needed to bootstrap Linux (or Guix) and then we do ports to as many architectures as possible and then I get to move on to another fun road-block
<muurkha>you only need to bootstrap once to defeat the Karger–Thompson attack
<muurkha>if you do it on ppc64el you can then use it to cross-compile kernels and compilers for other architectures
<littlebobeep>muurkha: Is Karger-Thompson attack the same as Trusting Trust backdoored compiler attack?
<muurkha>yeah, Thompson's Turing Award lecture on it was called "Reflections on Trusting Trust"
<littlebobeep>muurkha: OKay yes I never had a system that is not vulernable to this :/
<littlebobeep>muurkha: I never heard of Karger tho
<muurkha>Paul Karger and Roger Schell published the attack in Multics Security Evaluation: Vulnerability Analysis in 01974
<muurkha>cf. https://en.wikipedia.org/wiki/Backdoor_%28computing%29#Compiler_backdoors
<muurkha>oh, I bet mode +c removed my underlining of the paper title
<muurkha>but you can probably figure out what it is
<oriansj>muurkha: true but I'm going after the Nexus Intruder Attack; Sorting out the Trusting Trust attack is just a step on that journey
<oriansj>you know, solve the software problems first before getting into the more expensive work
<muurkha>you explained what the Nexus Intruder attack was before, but I forget
<littlebobeep>Ohhh okay I read about the Multics attack
<nimaje>does autotools even work on platforms where CHAR_BIT != 8? well, you wouldn't have int8_t on those and int64_t only if CHAR_BIT is in (16, 32, 64) (int{8,16,32,64}_t 'signed integer type with width of exactly 8, 16, 32 and 64 bits respectively with no padding bits and using 2's complement for negative values (provided if and only if the implementation directly supports the type)')
<littlebobeep>oriansj: wut is Nexus Intruder attack, I just grepped the logs for this channel and nothing shows up
<oriansj>muurkha, littlebobeep: short version hardware compromises operating systems, which compromise compilers and VLSI tools; which ensure future hardware is malicious and will ensure the cycle continues. It was a proposal to subvert the Soviet computing infrastructure *forever*
<oriansj>that being said, there is no evidence that it was done nor that it is even possible.
<oriansj>But it makes for a fun little puzzle for me to solve, even if it has no real world applications.
<muurkha>oh, right! yeah, you can definitely do that but you probably don't need to involve the compilers
<oriansj>I guess I never defined what the size field means in GFK filesystem in dnodes for folders
<oriansj> https://github.com/oriansj/GFK/blob/main/GFK%20standard.org
<oriansj>hmmm how does this sound: for files be the size of the file in bytes but for folders be: the sum of all the contents of that directory or zero to simplify bootstrapping implementations done in assembly
<muurkha>maybe you should implement it first and see what you need it for
<oriansj>I don't need it but I did define a standard dnode
<muurkha>original Unix didn't have a readdir() or getdents() system call, you just read() the directory
<muurkha>then there was a library function to parse the directory contents into filenames and inode numbers
<muurkha>so naturally the file size was the number of bytes you would read in that fashion
<oriansj>muurkha: I am also trying to figure out what would be easy to manually create in hex0
<muurkha>yeah
<oriansj>well fat12 did: For other entries than files then file size field should be set to 0.
<oriansj>and I guess just doing that probably is good enough
<oriansj>ok so: if pointing to a file/Indirect File: the size of that file in bytes and if anything else: ZERO to simplify bootstrapping implementations.
***alMalsamo is now known as littlebobeep