IRC channel logs
2022-04-09.log
<oriansj> bootstrapping: come for the technicals; stay for the great people you meet along the way
<mid-kid> I haven't checked up on this project for a looong while. What are some projects that will help me bootstrap a new machine?
<mid-kid> From the smallest set of binaries possible.
<littlebobeep> mid-kid: Yeah it's awesome but you need a Linux kernel binary too so ummmmm dunno how to safely compile that, best option might be trusting GNU Guix live environment or something
<mid-kid> yeah I'll have to trust the compiler on that one
<mid-kid> I've bootstrapped my current gentoo install manually from mes once, so that's the "safest" thing I have rn.
<littlebobeep> mid-kid: That is cool but stage0 does not even start with a C compiler haha
<littlebobeep> if just from Mes as I understand it you need a C compiler somehow
<mid-kid> true, does live-bootstrap start from stage0 or something else?
<mid-kid> I started from mes+small busybox last I did it.
<mid-kid> not the best but I didn't feel like breaking my horns
<stikonas> mid-kid: yes, it starts from stage0-posix
<stikonas> basically there are 3 binaries: kernel, hex0 and kaem-minimal
<stikonas> right now only Linux kernel works, though it can be fairly stripped down
<stikonas> at some point maybe more minimalistic kernel would work, oriansj is working on one
<stikonas> mid-kid: after live-bootstrap you'll end up with gcc 4.7.4 now
<stikonas> so you'll need some intermediate toolchain jump before you can get to latest GCC 11
<stikonas> mid-kid: in principle it would be nice to automate steps from live-bootstrap to Gentoo...
<stikonas> but at the moment we haven't built Python yet in live-bootstrap
<stikonas> mid-kid: also if you are going to run bootstrap on baremetal, you might need some bootloader
<stikonas> to load linux kernel + initramfs with live-bootstrap
<mid-kid> right almost forgot about that, yeah
<mid-kid> and yeah I know there's some intermediate steps and extra tools needed in bootstrapping gentoo
<mid-kid> gentoo itself can't be bootstrapped even with the bootstrap.sh script without nudging it and forcing it to build certain packages to break dependency loops
<mid-kid> it's a mess of a bootstrap but once there, a full rebuild and you have a clean system.
<mid-kid> (I actually wanted to try bootstrapping slackware this time around, by just running make_world.sh in a loop until hopefully everything compiles except for rust probably)
<mid-kid> (though I realized I don't have a big enough hard drive for that)
<bauen1> mid-kid: it would also be interesting if you managed to rebuild the exact same binary as the prebuilt stages
<mid-kid> the build bots keep a cache of binary packages around
<mid-kid> This leads to inconsistencies where glibc is updated but nothing is rebuilt against the new glibc
<mid-kid> And lol I'm not figuring out how to reproduce *that*
<bauen1> mid-kid: maybe you can ask whoever manages that to kick off a "clean" build and then try to come very close to that? but yes reproducing that would be ... a challenge
<mid-kid> At that point I might as well run a stage3 build on my current machine.
<mid-kid> But I still doubt it's reproducible due to timestamps everywhere at the least
<bauen1> mid-kid: well that's basically the idea, but you want the stage3 you build to be the same as the stage3 gentoo currently offers so you can be reasonably sure that the gentoo stage3 is "good"
<mid-kid> yeah... it might be worth proposing adding at least some reproducibility aware options to the upstream build system but... not today.
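A minimal sketch of the comparison bauen1 describes above: hash a locally built stage3 and the published one, and treat them as the same artifact only if the digests match. The function names and any file paths passed in are hypothetical; nothing here is part of Gentoo's or live-bootstrap's tooling, and SHA-256 is simply an assumed choice of hash.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(local_path, published_path):
    """True only if both files are bit-for-bit identical (by SHA-256)."""
    return sha256_of(local_path) == sha256_of(published_path)
```

A single differing byte, such as an embedded build timestamp, is enough to make the digests differ, which is why mid-kid expects the comparison to fail without reproducibility-aware build options.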
<stikonas> yeah, there is no way to reproduce official stage3
<stikonas> but there is less need to do that with gentoo
<mid-kid> yeah I'd be perfectly content just bootstrapping with a reproducible bare minimum set of binaries
<mid-kid> One comment I'd like to give to the live-bootstrap repo is that I'd like a bit more insight into what it's actually doing when generating the initramfs.
<mid-kid> Even if it's just console output that'd be nice.
<mid-kid> Having a hard time figuring out where the /init binary is coming from.
<oriansj> well the kernel problem is going to have to be solved in several steps: modified-BootOS => minimal filesystem => filesystem-library => text editor => hex0 => hex1 => hex2 => M0 => custom POSIX written in assembly + stage0-posix => stage0-posix steps till M2-Planet => custom POSIX written in C => rest of live-bootstrap
<mid-kid> Or why it's using a 4.9 kernel when chrooting into sysb
<mid-kid> (do I have to configure this kernel? can I just tell it to keep running the previous kernel?)
<oriansj> mid-kid: the init binary is just kaem-optional-seed
<oriansj> it reads kaem.$arch and just runs the script
<mid-kid> I saw it was running kaem.x86 but didn't know that was default behavior for kaem
<oriansj> mid-kid: it is default behavior for kaem-optional-seed so that multiple architectures can be supported on the same filesystem
<mid-kid> I see. kaem-minimal.hex0 is the canonical source code for it? It's not generated from somewhere and *then* commented?
<oriansj> like doing art with colored grains of sand
<oriansj> it is under 800 bytes so only 1600 hex chars needed; only jumps are a pain but you learn the tricks of making good comments pretty quick
<mid-kid> So I assume linux-4.9 is the most recent version that can be built with sysa
<mid-kid> Can I cheat and provide my own kernel image?
<mid-kid> I'm sure the machine I'm gonna run this on right now will do just fine with 4.9 but I have other machines that straight up won't lol.
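On the "under 800 bytes so only 1600 hex chars" remark above: each byte of the binary is written as two hex digits, and everything else in a .hex0 file is commentary. The sketch below decodes text in that style. It assumes the usual hex0 conventions (`#` and `;` start a comment that runs to the end of the line, and any other non-hex character is ignored); `hex0_to_bytes` is a hypothetical helper written for illustration, not code from stage0-posix.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def hex0_to_bytes(source):
    """Decode hex0-style text: every pair of hex digits becomes one byte."""
    out = bytearray()
    high = None  # first digit of a pending pair, if any
    for line in source.splitlines():
        for ch in line:
            if ch in "#;":
                break  # assumed convention: the rest of the line is a comment
            if ch in HEX_DIGITS:
                value = int(ch, 16)
                if high is None:
                    high = value
                else:
                    out.append(high * 16 + value)
                    high = None
            # any other character (spaces, tabs) is skipped
    if high is not None:
        raise ValueError("odd number of hex digits")
    return bytes(out)


# Five bytes written as ten hex digits, with comments explaining each part.
sample = "7F 45 4C 46 ; ELF magic\n01 # class byte\n"
decoded = hex0_to_bytes(sample)
```

Because two digits map to exactly one byte, an 800-byte program needs 1600 hex characters regardless of how much commentary surrounds them.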
<oriansj> well aside from potential missing syscalls; you in theory could use any POSIX kernel you want
<oriansj> which would be the potential missing syscalls bit
<mid-kid> Oh I see I can just set CHROOT=true to skip the whole kernel shtick and keep running with the kernel I built for the machine.
<oriansj> and the reason for the kexec if I remember correctly was to deal with the RAM disk running out of memory (stikonas/fossy correct me if I remembered that wrong)
<mid-kid> oh right I see sysb will mount a disk
<mid-kid> Yeah that's exactly what I'm worried about, not having the disk drivers or other stuff
<mid-kid> I'd rather be dropped off into a shell after sysa honestly...
<oriansj> they do enjoy contributions and I don't think they would reject a patch adding a flag for turning that sort of behavior on/off
<stikonas[m]> And as for rootfs.py, we are now working on simplifying steps
<oriansj> well you have kaem but it isn't an interactive shell (yet)
<mid-kid> well yeah of course, I meant dropping me into a shell *after* sysa so I can mount the filesystem and mess with whatever
<mid-kid> Yeah I think I'll end up just making a rootfs in chroot mode, copying that to the disk and booting it. Saves headaches in terms of figuring out how to preserve the kernel.
<stikonas> oriansj: kexec was actually to avoid assumptions on device drivers in bootstrap kernel
<stikonas> that should have reasonably good storage drivers
<stikonas> and should be able to mount storage for sysc
<stikonas> mid-kid: your bootstrap kernel is not provided by live-bootstrap at all
<mid-kid> yeah I know, I'm just worried the kexec'd kernel won't support the machine properly
<stikonas> unless you have some really new storage disk
<stikonas> but I think 4.9.10 supports everything modern (hard drives, ssds, nvme, etc)
<mid-kid> does 4.9 support NVMe, just out of curiosity?
<stikonas> in any case, you can simply use usb stick for sysc
<stikonas> right now rootfs.py does some not completely trivial copying of files when preparing sysa
<stikonas> I'm trying to move most of that into kaem/bash scripts in live-bootstrap
<stikonas> but now got some strange error that I can't reproduce interactively
<stikonas> +> patch -Np0 -i ../../patches/mes-libc.patch
<stikonas> patch: **** Can't create file ../x86/artifact/poiljpqaerror 02:
<stikonas> but if I log into busybox shell and run the same command, patch applies just fine
<unmatched-paren> because this is one of these constructs: `typedef struct XXX XXX; struct XXX {...};`
<unmatched-paren> By the way... i've decided on a plan for implementing my Pascal compiler: write it in M2-Planet C, outputting QBE IL, then add an M1 backend once that's done to allow building it from M2.
<stikonas> unmatched-paren: but do you need Pascal so early in the boot chain?
<stikonas> unless you really need it, buildable by tcc or even gcc is usually good enough
<stikonas> we only really need M2 compatibility for early tools before tcc is built
<oriansj> unmatched-paren: M2 requires everything to be defined *BEFORE* use; (functions have to be prototyped($args); or implemented($args){$statements})
<oriansj> Types have to be fully defined prior to use and typedef is a use
<stikonas> anyway, rather than trying to figure out why patch binary is failing, I have fixed make-3.80 build script not to need patching at all...
<oriansj> stikonas: perhaps just a minor sanity test of is it the binary or the environment
<oriansj> aka put a statically compiled binary from gcc in there and have it do the patch and see if it works or if it fails because something is missing from the environment
<stikonas> I can try that if patch binary fails later with other packages...
<stikonas> right now I can build make without patching it at all. Just had to add a single extra -D define
<stikonas> I think I removed TMPDIR=/tmp variable which is necessary
<stikonas> in any case, cleaning up make build script is good
<stikonas> and patch can now be moved later in bootstrap and built with makefile rather than kaem script
<mid-kid> Quick question, but is there any specific reason musl was picked over glibc as is used in the guix bootstrap? Also why is tcc built against musl instead of just keeping using the mes-libc?
<mid-kid> oh and what is perl used for in sysa?
<mid-kid> oh wow the perl-5.32.1 build does the thing where it mistakenly places all man pages in /.
<mid-kid> I forgot what causes that bug, but I guess it's fine if they're just deleted afterwards.
<mid-kid> So uh. What is sysc's "target"? Why is it building all the packages it is? Are they all dependencies of xbps or is it just "upgrade all recent toolchain components"?
<mid-kid> Not entirely sure why xbps is used at all when the packages it ends up packaging into the format aren't enough to make a working rootfs out of
<stikonas> mid-kid: yes, musl has far simpler build system
<stikonas> it's not enough to build most of the software
<mid-kid> well yeah but interactive bash is built after gcc is bootstrapped anyway
<stikonas> mid-kid: and also, non GPL software is a bit tricky
<stikonas> we already have this problem with heirloom-devtools
<stikonas> but that will go away once we can use gash with mes
<mid-kid> wonder why heirloom-devtools is being built at all, I've never seen that in the guix bootstrap scripts
<mid-kid> I guess that is necessary unless you use the pre-generated .y.c files
<mid-kid> not sure why that'd be an issue, musl is mit, afaik none of those licenses are incompatible.
<stikonas[m]> So non GPL stuff linked against mes is non redistributable
<mid-kid> yeah and LGPL is basically a GPL when statically linked
<mid-kid> which I guess would be an issue when using glibc as the (statically linked) libc
<stikonas[m]> If you provide object files for relinking then it's fine
<mid-kid> it isn't, the guix bootstrap builds binutils/gcc first
<mid-kid> I guess removing intermediate compiler versions would make it easier to introduce new architectures later down the line
<mid-kid> libc would be a contender for that as well
<stikonas[m]> Riscv32 hardware probably won't be capable of running full Linux system
<mid-kid> guix goes through mes, two different versions of tcc, gcc-2.95.3, gcc-4.6.4, and finally gcc-4.9. live-bootstrap uses both tcc versions, gcc-4.0.4, and ends up with gcc-4.7.4
<mid-kid> I wonder what the oldest version of gcc that can build gcc-11 is
<mid-kid> I know glibc requires a fairly recent version (I believe 5.4?)
<stikonas> in any case there will be only 1 extra step between 4.7.4 and 11
<unmatched-paren> stikonas: re pascal at M2 stage, no reason why we'd need it i can think of, it just sounds like a fun challenge :)
<unmatched-paren> btw, is there anything i should know that isn't in the dragon book (presumably because of its age some new techniques aren't in there?)
<unmatched-paren> i just realized that there's something else that's probably more important than Nim that's written in Pascal... ΤεΧ!
<unmatched-paren> but ΤεΧ is written in WEB, which is written in WEB... does the tangled output of a literate programming count as a blob?
<mid-kid> according to the DFSG, anything that's not manually written/created by a human is a blob
<mid-kid> which is why configure scripts generated from autoconf count
<unmatched-paren> it's pretty obvious if code generated from literate programs deviates from the source
<mid-kid> I mean, I've read through plenty of generated configure scripts and even partially "reversed" one.
<mid-kid> It gets fun when old projects use autoconf-1.12.2, good luck finding the source for that lol
<mid-kid> Still, better safe than sorry, if you can generate it, do that.
<unmatched-paren> WEB's source code for extracting Pascal is probably pretty trivial, let me see if i can find it
<mid-kid> Damn, I missed my chance to make a "garbage in garbage out" joke with regards to autoconf :P
<unmatched-paren> I looked at the Guix configure.ac and immediately typed `:q!<RET>` :P
<unmatched-paren> apparently there are reasons it's fine for tex, but i'm not sure about WEB
<unmatched-paren> either way, we can just write a free WEB extractor and use it on tex without ever going through WEB (if it is really necessary)
<oriansj> unmatched-paren: I don't think we deeply looked into the WEB bootstrap requirements yet (wasn't on the radar yet) but anything not written by a human is to be considered a blob and wouldn't be considered acceptable
<oriansj> the readability of a blob doesn't excuse its blob nature
<oriansj> but feel free to add what notes you do have about it to the wiki and it'll be updated as someone thinks more about it and what is needed.
<oriansj> and you are right licensing and build requirements might complicate things but worst case is someone spends time writing some code from scratch and solves it the hard way.
<unmatched-paren> stikonas: how does that work? does it convert WEB to Pascal and then Pascal to C?
<oriansj> but the first step to bootstrapping any language is to first get the facts about what is available, what is absolutely needed and where the gaps are
<oriansj> a few hours looking around can save months of work if one is lucky
<unmatched-paren> well, so far i can't find any way of bootstrapping pascal other than writing a new compiler; FPC has LOADS of non-standard features
<oriansj> but remember that sometimes reasonable assumptions like a language can be built by version-1 don't always hold and can catch one without warning
<oriansj> unmatched-paren: and by ancient hardware would it mean SIMH or qemu or dosbox emulation?
<unmatched-paren> i suspect that it was never written in anything but Pascal, and was originally bootstrapped from one of the proprietary pascals
<unmatched-paren> oriansj: well, it was originally written for DOS (I think in response to Borland dropping support for it)
<unmatched-paren> the architectures were m88k and some really old 16(?)-bit Intel arch
<unmatched-paren> we could probably run it in dosbox, the problem is compiling it in the first place
<stikonas> unmatched-paren: well, I don't know how it works, you can take a look. But I doubt that it goes via Pascal, why would it...
<unmatched-paren> anyway, building stuff in chains is tedious, especially with something THAT old
<stikonas> well, that's why we still haven't bootstrapped ghc...
<unmatched-paren> "Originally, the compiler was a 16-bit DOS executable compiled by Turbo Pascal." <- in which case it'll almost certainly have used Turbo Pascal extensions
<stikonas> rekado tried again recently but I think not successfully
<stikonas> which doesn't build on normal toolchains
<unmatched-paren> - 1997: guy gets angry at Borland for cutting support for DOS and decides to make his own compiler starting from Turbo Pascal
<unmatched-paren> - circa 1999: the compiler can compile itself, but probably does so via loads of non-standard extensions
<unmatched-paren> the problem is, it seems like to make pascal useful, you need a lot of non-standard extensions
<oriansj> unmatched-paren: just not any good enough to bootstrap FPC?
<unmatched-paren> p2c, rascal, that thing, GNU Pascal (you mentioned that you could never get it working?), ...
<unmatched-paren> gpc was following some ANSI or ISO standard 'Extended Pascal' or something
<unmatched-paren> stikonas: not possible, the Delphi OOP features are used in a lot of places
<oriansj> which are incompatible pascal dialects
* unmatched-paren afk, sorry
<oriansj> well I am guessing not every feature it supports is needed for it to be built
<oriansj> but as you like a good challenge; doing a compiler from scratch is good fun
<oriansj> So that leaves a couple paths: easy mode (just pick your favorite high level language), normal mode (pick any language that can be built in Guix without binary substitutes) or hard mode (pick any stage0-posix language for extra nerd cred)
<oriansj> although the thought of M2-Planet directly building QBE and cproc is very interesting
<unmatched-paren> oriansj: "I am guessing not every feature" a reasonable assumption, but unfortunately mostly incorrect
<oriansj> as that combo can directly build GCC 4.7.4
<oriansj> unmatched-paren: double ouch, talk about a bad bootstrapping language combo
<unmatched-paren> oriansj: my query above about typedef was the result of a failed attempt to compile QBE with m2
<unmatched-paren> i noticed that it used bitfields in the same file, does m2 support them?
<oriansj> unmatched-paren: nope it wasn't a feature needed yet
<unmatched-paren> almost every .pas file contains a `unit` declaration. this is pascal's module system.
<oriansj> unmatched-paren: well support for GAS was in the M3 work that got put on hold
<unmatched-paren> at that point, M2 would basically be a complete cc and you wouldn't need cproc at all
<unmatched-paren> but it probably suffers from the same architecture support problem as mes
<unmatched-paren> oh, yeah, fpc also has a non-standard preprocessor which is used *everywhere*
<unmatched-paren> heh, the first file I pull up contains `class` in the first type declaration
<unmatched-paren> i seem to remember that `unit` was used even in the really old fpc i found somewhere (can't remember where)
<unmatched-paren> something slightly more trivial: // line comments are also non-standard
<oriansj> chibicc also needs binutils to make binaries
<oriansj> unmatched-paren: well yes, rather easily
<unmatched-paren> oriansj: i tried replacing the enum with a bunch of #defines and an int
<oriansj> didn't need them when we had #define (or CONSTANT support)
<unmatched-paren> i guess there's something in there which M2 is mistakenly... identifying... as an identifier?
<oriansj> so there is probably a string more than 4KB in length (or block comment)
<unmatched-paren> btw, is there a clever acronym for M3-Meteoroid? ;) M2-Planet is 'PLAtform NEutral Transpiler', M2-Mesoplanet is 'Macro Expander Saving Our m2-PLANET', etc
<oriansj> no unmatched-paren M2 doesn't support #include but M2-Mesoplanet does
<oriansj> see M2-Planet is designed to run on bare metal where things like #include are meaningless and can't work
<oriansj> yes but it doesn't get revealed until after a working version is out
<unmatched-paren> another thing you should note (but not a blocker) if you decide to pursue the cproc idea further: cproc doesn't have a preprocessor yet, you need to use `mcpp`, an external program
<oriansj> assuming we expand it enough and turn off the bits for passing information to M2-Planet about original file names and line numbers
<stikonas> we also need to add an option to turn off includes in Mesoplanet
<stikonas> unmatched-paren: line numbers is just a comment
<stikonas> oriansj: needed a way to tell M2-Planet how to print meaningful error messages
<stikonas> but right now I was working on live-bootstrap
<oriansj> it'll force you to specify all the C sources with -f
<oriansj> but I can get that up in a couple minutes
<stikonas> recently we were getting a few complaints that it's not completely clear how to create live-bootstrap image (i.e. what rootfs.py does), so I'm trying to make it trivial
<oriansj> and using stage0-posix-x86 instead of stage0-posix should simplify things a great deal
<unmatched-paren> wow. the standard library contains, among many other things: wasmtime bindings, Boehm GC bindings, mySQL, SQLite, and Postgres bindings, gitlab api bindings, gtk1, gtk2, sdl, cairo... and many more things: https://paste.debian.net/1237425/