IRC channel logs

2024-03-26.log


<fossy>"can anyone help me to write the rootfs.py file in bash language"????
<fossy>confusion
<oriansj>ludocode: love the idea, glad to see something in progress. Odd to not support arrays and structs when they take only a few lines of assembly and really clean up the C code you can write.
<stikonas>fossy: I think it's the same Guest that asked something similar a month or so ago
<stikonas>but yes, I don't really understand the need
<stikonas>and if you don't like python, you could just follow manual instructions
<stikonas>and write bash script or similar based on that
<stikonas>fossy: have you managed to make any progress on that monthly reproducibility issue?
<stikonas>was it diffutils 3.10?
<oriansj>although the hex and vm implementation appears to be over 6KB.
<oriansj>which is a bit of an audit problem.
<oriansj>and there is the question of the driver for the vm flows
<oriansj>and have you taken a look at the sweet-16 instruction set?
<ludocode>Yeah, in retrospect putting arrays in the first stage compiler would have made life a bit easier for some of the second stages. Mostly at the time I was just really trying to limit the amount of assembly I had to write
<ludocode>It can always be changed after the fact. Right now the second stage compiler is completely done aside from whatever bugs are gonna crop up, and it implements most of C89 plus some C99 so it's not really an issue anymore.
<ludocode>Hex and VM together are above 6 KB of hexadecimal machine code, yep. A lot of that is padding and data, I really did not want to be conservative with error messages and such, but yes it's a lot
<ludocode>Minimizing size of the machine code is not really a primary goal. I'm a lot more interested in the aliens/archaeology/etc. bootstrapping problem than the trusting trust problem to be honest, so the VM is really the critical feature of this, and the fact that it can be implemented in pure machine code is kind of a bonus / side goal
<ludocode>And in any case the machine code is really heavily documented, it's more comments than code. I do think it's reasonably auditable, maybe not trivially so but certainly within the capabilities of anyone who cares about the trusting trust problem
<ludocode>I hadn't seen sweet-16, that's pretty cool. I probably should have spent more time researching instruction sets before starting because there are some obvious mistakes in Onramp, like the lack of shift instructions. I intend to change the bytecode over time to fix some of these problems, right now it works so I figure it's more important to work on the full compiler
<oriansj>ludocode: completely fair.
<oriansj>rotate instructions don't really end up being useful and fixed instruction size isn't as helpful as you would expect for a single state machine
<oriansj>and call reg (or jump-link reg reg) is all you need to support true function pointers in C efficiently.
<oriansj>if your goal was a minimal VM for C, a stack machine with 8-bit instructions probably would be a tighter fit
<ludocode>Yeah the reason it doesn't have function pointers is mainly about type storage, nothing to do with the instruction set. Types aren't stored as a tree of declarators, there are no references from types to other types. Really simple to implement but it means function pointer types aren't really possible
<ludocode>I used rotate just because I was really trying to limit to 16 instructions and it was possible to implement both shifts with one rotate. Turned out to be a terrible idea, implementing a shift using rotate takes like 15 instructions, and shifts are used all the time and rotate never. (whereas xor can be implemented with just three instructions, and+or+sub, so I should have just gotten rid of xor and put in both shifts. That's what I
<ludocode>intend to change later)
<ludocode>Anyway the reason the VM is so big isn't really the instruction set. Most of the VM is about host integration: passing command-line arguments and environment variables into the VM and bridging the filesystem through all the system calls
<oriansj>and it appears you are using a full byte per register?
<ludocode>Yeah the bytecode is really not meant to be space efficient, it's meant to be really easy to read/write. That's another advantage of the fixed instruction size, everything lines up in a hexdump
<oriansj>well you could go MMIX and just have 256 registers
<ludocode>Another big advantage of fixed instruction size is it's way easier to manually calculate relative jumps, the relative jump instruction uses words rather than bytes
<oriansj>it wouldn't add any complexity to the VM, just to its runtime requirements
<ludocode>That would make the bytecode pretty terrible to write because there would be no way to write small immediate values. Right now with 0x80-0x8F as registers, values 0x00-0x7F are positive constants and 0x90-0xFF are negative constants
<oriansj>well you would have instructions like addi and the like; which could then use the space used by the registers to provide the immediate and you'd have the full range
<ludocode>I originally made the bytecode variable instruction length, the immediate load instruction was 6 bytes and syscall was 2 bytes for example, and I thought about making divide 5 bytes so it could output quotient and modulus into separate registers. But fixed instruction size turned out to be way better
<ludocode>Yeah sure but then you have a lot more than 16 instructions :)
<oriansj>ludocode: sometimes added complexity in one part reduces complexity in other parts
<oriansj>but I agree on the byte counting being annoying.
<ludocode>Plus it's way easier to audit with distinct values for immediate vs. register bytes. If you see 0x80 it's pretty much guaranteed to be a register. As opposed to having separate add vs. addi, you have to look at the instruction to figure out whether its argument is a register or immediate
<ludocode>I find the bytecode extremely readable, probably has a lot to do with it being so space inefficient lol
<oriansj>well, self-written bytecode usually is quite easy to follow. Reading other people's is a good bit harder.
<ludocode>true, I suppose that will be the real test, if anyone can understand what I wrote haha
<oriansj>well reading bytecode requires either me being familiar with it or having to look at documentation constantly
<oriansj>but as your goal is universal bootstrap that even aliens could use; it could be ignored entirely
<ludocode>Yeah I wanted to make a one-page cheat sheet that could be printed or put up on a second monitor or whatever, sort of like Robert Elder's one page CPU. I had a separate quick reference document at one point, I think I didn't include it when I ported the code to GitHub
<oriansj>have you seen the blynn-compiler vm?
<ludocode>Neat, haven't seen that no
<oriansj>or how about Binary Lambda Calculus? https://tromp.github.io/cl/Binary_lambda_calculus.html
<oriansj>the entire VM was implemented in 29 bytes
<ludocode>Nope, not that either
<oriansj>one can implement SKI calculus in it in just 34 bits
<ludocode>I'm sure there are lots of VM designs that can be made super small, but those don't really seem practical for implementing a C compiler
<ludocode>Performance is an issue, the Onramp VM isn't super fast despite having an instruction set that maps reasonably well to real hardware. Those VMs that operate on individual bits are surely not going to be fast enough to build anything real on top
<oriansj>well any VM design can be used to implement a C compiler (assuming your assembly language isn't garbage)
<oriansj>The problem with any VM is getting a nice assembly language bootstrapped and after that it is all relatively simple.
<ludocode>Sure but if your VM is 100,000x slower than real hardware then the bootstrap could take months. That's not practical
<oriansj>well, 100Kx slower is very unlikely
<ludocode>I actually did this with Onramp, I implemented a VM in POSIX shell :) it's under platform/vm/sh/. It's totally useless because it's about 100,000x slower than the VMs written in machine code and C
<oriansj>1Kx slower isn't that bad; it is fast enough to bootstrap all the way up in 3 minutes
<oriansj>so 100Kx slower would just be 300 minutes
<ludocode>If you implement a VM for the Binary Lambda Calculus in machine code, and you built, say, a 16-bit assembly language on top of it, do you think it would be 1000x slower than a real CPU? I would guess it would be even much slower than that
<oriansj>well it is fast enough to run this: https://paste.debian.net/1312031 fast enough that one never sees a pause in its rotation.
<ludocode>Is that already implemented? A C compiler on top of the binary lambda calculus?
<oriansj>ludocode: bootstrapped -> no; compiled as a target yes
<ludocode>neat
<oriansj>which is why your goal is a good bit easier than trusting trust bootstrapping.
<ludocode>Sure, I mean I'm not a CPU designer by any means, I'm sure someone could design a much better instruction set with a much smaller VM. I designed what I felt was practical, and what made bootstrapping comfortable, without worrying too much about size.
<ludocode>In any case the main size overhead is in the filesystem. I wanted the VM to have syscalls such that the compiler can interact with the host environment's filesystem exactly like a normal compiler would. The interface to Onramp is a normal unix compiler like GCC, it works the same way.
<ludocode>If you make a really tiny VM with the intention of building the filesystem inside of it, there's no way to interface that naturally with a host environment.
<oriansj>well FORTH does a pretty solid job of that
<ludocode>Onramp supports both freestanding and hosted. I haven't written a freestanding VM yet but I suspect it will be about half the size of the hosted VM since you don't have to implement any filesystem syscalls or any argument/environment passing
<ludocode>3KB is still a lot bigger than the VMs you're suggesting of course but it seems reasonably small to me for what I consider to be a really comfortable bytecode
<oriansj>well supporting disk and user I/O is a good bit more complex than just 5 POSIX I/O syscalls
<oriansj>as hardware these days lies
<ludocode>A scripted bootstrap from stored media doesn't technically require any I/O at all. It could just implement an in-memory filesystem and bootstrap all the way up to compiling a real native kernel and chainboot into it.
<ludocode>Obviously you would want a console output but even that is just for debugging
<oriansj>aside from the initial load into memory (assuming it is large enough to hold everything)
<oriansj>and it all disappears on reboot
<ludocode>Yeah sure. I would eventually want to implement read/write sector syscalls to page stuff to disk during this process. builder-hex0 does something like this to output its final image to disk right? how much machine code does that take?
<oriansj>well that very much depends on the machine in question.
<ludocode>The disappearing on reboot seems like an advantage to me :) Once you're in the native kernel it seems to me the first thing you'd want to do is rebuild itself with itself, persist that to disk instead of whatever the bootstrap process built
<oriansj>ludocode: would a 2 stage vm be a valid option for you?
<ludocode>Sure. I talk about that a bit in one of the docs in Onramp actually, you can certainly implement the VM in multiple stages
<oriansj>like a minimal VM for the steps up till you have a C compiler and then do a massively expanded VM for better performance and full features?
<ludocode>Oh I see what you mean. I suppose that is also possible, but I'm not sure if there's much point to that
<ludocode>Instead of an expanded VM I imagine you'd be better off compiling to something like LLVM or QBE, something that can be translated to machine code
<oriansj>well if the second stage VM is more well known, you could leverage the work of others to speed up your development and reduce the amount of effort required.
<oriansj>for example a RISC-V 64bit instruction set VM
<oriansj>then you could leverage the RISC-V bootstrap work for GCC and the like
<ludocode>I'm not sure what you mean, or what the use case is for that. Do you mean to use Onramp to compile a RISC-V emulator to then run the GUIX bootstrap on top of it?
<ludocode>I suppose that's already possible with what I've built so far for Onramp. I don't really see why you'd want to though
<oriansj>well I guess if GCC and the like isn't a goal, then I can see how trying to support it seems like a wasted or pointless effort.
<ludocode>No, GCC is definitely the goal, but I'm hoping to be able to take a much more direct path
<oriansj>otherwise porting TCC, GCC and related would require you to add support for your custom architecture to all of them; which will take years of effort.
<oriansj>RISC-V has been a work in progress for a couple years now
<ludocode>Yeah but I don't intend for any of those to output Onramp bytecode. I just want to get TinyCC running on Onramp, and outputting machine code
<oriansj>outputting machine code for which architectures?
<ludocode>Well x86 or x86_64 to start.
<oriansj>so Onramp is designed to stop once TCC is bootstrapped and everything else is just some host native work?
<ludocode>If I can get TinyCC running, then I'm hoping I can get it to compile a small toy POSIX kernel like Fiwix or Tilck (or even an old version of Linux which TinyCC did actually compile at one point.) I realize TinyCC can't compile those toy kernels at the moment, it's missing linker scripts for example, but I think it's possible to fill in the blanks myself, and the authors of those kernels might be interested in helping as well if they
<ludocode>learn that it might be possible to bootstrap them this way
<ludocode>Yeah pretty much. The idea would be Onramp -> TinyCC -> Kernel+musl+BusyBox, then chainboot into it.
<oriansj>ludocode: We use TinyCC to compile Fiwix
<ludocode>Oh great. So yeah, that's what I want to do
<oriansj>the steps are already in live-bootstrap
<ludocode>Cool. I didn't know live-bootstrap compiled Fiwix. I haven't looked at it in a while I guess
<oriansj>it is the bare-metal bootstrapping work stikonas and fossy and rick-masters have put together with the help of the Fiwix kernel's author.
<ludocode>Awesome. Sounds like they already did exactly what I was imagining, that's great news
<oriansj>(there are also a few other people who chipped in and did amazing work but as I am aiming to do a brief intro, I apologize for skipping the mentions)
<ludocode>So yeah I basically want to try to do the same thing on Onramp. I'm probably a year away from even starting that because I need to get the final stage compiler and preprocessor done enough that it can compile TinyCC, but that's one of my goals
<oriansj>have you seen https://github.com/cosinusoidally/tcc_bootstrap_alt ?
<oriansj>it uses the Javascript VM to bootstrap TCC
<ludocode>Oh neat, I had not seen that.
<oriansj>You might find a few more gems in the IRC history logs (that I can't remember off the top of my head) that may inspire or provide alternates you prefer.
<ludocode>Thanks
<oriansj>and thanks for exploring the universal bootstrap VM solution space
<ludocode>Thanks for spending so much time with me talking about this, it's really helpful. I'm glad I finally got somebody to look at my silly compiler for aliens :)
<oriansj>ludocode: I get it, for the first couple years there were only a handful of us here to discuss the stage0 bootstrap work.
<oriansj>You'll only get encouragement, suggestions and recommendations for improvement; people here are nice and helpful.
<oriansj>and don't be afraid to chat, just please understand we are kind of spread around the world and it might take a bit before a solid response can be generated.
<ludocode>Thanks
<lrvick2>Was trying to drop savannah in stagex and noticed http://git.savannah.gnu.org/cgit/coreutils.git/snapshot/coreutils-8.32.tar.gz vs https://mirrors.kernel.org/gnu/coreutils/coreutils-8.32.tar.gz have substantial differences. Many files exclusive to each side. For instance the latter includes things like primes.h and the former does not.
<lrvick2>Has anyone eyeballed this one already?
<Foxboron>lrvick2: You are comparing apples to oranges, most likely. The snapshot/ is a "git archive" of the repo while mirrors.kernel.org is probably a separate process
<Foxboron> https://github.com/coreutils/coreutils/blob/master/src/local.mk#L67
<Foxboron>Yes, so `make dist` vs `git archive`. The git snapshot is not something that should be used in this case
<lrvick2>Fair. It's the only case, out of replacing a dozen savannah links with other mirrors, where a hash changed, and in this case there's a 10x size increase as well.
<lrvick2>the kernel.org copy is also signed by a maintainer at least
<Foxboron>Well yes, if you care about authenticating sources then the git snapshots are the worst place you can go.
<oriansj>well not the worst but yeah, sha1sum is breakable with a bit of money.
<Foxboron>oriansj: Not really breakable; collisions are currently possible only in some very specific scenarios, and there's no proof yet that it would work on git repos
<Foxboron>it's *probably* just a matter of time though
<pabs3>plasma41: from #breezy:
<pabs3><jelmer> older versions of Breezy (3.4 and older) are still pure-Python
<pabs3><jelmer> but we have no plans to make the rust code optional
<matrix_bridge><Andrius Štikonas> you need rust for desktop stuff anyway, so there isn't much point in avoiding it after initial system bootstrap
<matrix_bridge><cosinusoidally> Have there been any attempts to create some intermediate steps between M0 and cc_x86? For me it's a bit of a leap from small assembly language programs to something like cc_x86. I've been playing around with otccelf (https://bellard.org/otcc/ a version of the precursor to tcc but with elf output support). When compiled, otccelf is slightly smaller than cc_x86, but it's very hard to understand and...
<matrix_bridge>... doubly so in assembly form.
<oriansj>well we did do a Lisp in M0 and a FORTH in M0
<oriansj>they are both smaller and simpler than cc_*
<oriansj>it is just that Lisp and FORTH don't end up making a much shorter C compiler than M0 assembly
<oriansj>it is basically 1-3 assembly instructions per line of C code
<oriansj>considerable time has been spent on both FORTH and Lisp (doing a garbage collected lisp in assembly is a bitch)
<matrix_bridge><cosinusoidally> This one https://github.com/oriansj/stage0/blob/master/stage2/forth.s ? Is that knight assembly? I did start making an attempt to port otccelf to forth, specifically the buzzard2 dialect of forth, but I gave up pretty quickly when I ran into issues trying to break out of a nested loop. Probably also doesn't help that I've never written forth before.
<matrix_bridge><cosinusoidally> buzzard2 starts off with a ~200 line implementation of a dialect it calls "first" and then uses first to implement what it calls "third" (which is its own dialect of not quite forth)
<oriansj>cosinusoidally: correct
<oriansj>yeah Caleb Ristvedt did an impressive job with stage3/inital_library.fs, making that FORTH much more ANSI standard
<oriansj>but yeah, basically we found that there is no magic in implementing a compiler, only a bunch of tedious detail tracking.
<oriansj>stack state, where variables are located, how to operate on them, in what order and how to load/store them
<matrix_bridge><Andrius Štikonas> And cc_x86 is very readable if you start working on it
<matrix_bridge><Andrius Štikonas> Especially with new gas style macros
<oriansj>also cc_x86.M1 took less than 24 hours to write, debug and self-host M2-Planet from. So it shouldn't take more than a day or two for someone to know every detail about it (or look at the C version of it that exists in stage0's high level prototypes folder)