IRC channel logs

2022-01-22.log

<alMalsamo>Is Musl used anywhere in the live-bootstrap process?
<oriansj>alMalsamo: did you not look at the steps? https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst
<stikonas>alMalsamo: indeed, musl is used in most of the live-bootstrap process
<stikonas>we build it quite early with tcc
<stikonas>on the other hand, guix bootstrap does not use musl but it's not as rigorous in terms of pre-generated files
<stikonas>alMalsamo: before musl we only have meslibc
<stikonas>it's a much smaller libc
<stikonas>and also has some licensing clashes with heirloom-devtools, so heirloom-yacc and heirloom-lex binaries are not redistributable
<stikonas>(since GPL is not compatible with CDDL, same licensing issue as ZFS filesystem on Linux kernel)
<stikonas>but that issue would go away if gash would be updated to run on mes, which might happen in the future
<stikonas>gash is a shell, so we would be able to postpone bash build to a bit later
<alMalsamo>Hmm gash doesn't run on Mes? Only Guile or what?
<alMalsamo>I thought the whole point of Gash and Gash-utils was to replace C binaries in GNU Guix bootstrap on Mes
<oriansj>alMalsamo: yes the point of Gash and Gash-utils was to remove the need for the bash and core util binaries used at the root of the GNU Guix bootstrap; however you are forgetting the core of the Guix bootstrap is a rather large guile binary
<oriansj>live-bootstrap lacks that issue and uses kaem-optional (757bytes) to drive the bootstrap and later builds guile from source (which could be used as the root binary for Guix)
<alMalsamo>oriansj: live-bootstrap/sysa/stage0-posix/src/x86/kaem-minimal.hex0
<alMalsamo>Is that kaem-optional or is it a different file?
<alMalsamo>oriansj: I thought GNU Guix bootstrap uses GNU Mes Scheme interpreter now instead of Guile, why on earth did janneke code GNU Mes then?
<oriansj>alMalsamo: kaem-minimal.hex0 is the hex0 source code and kaem-optional-seed is the binary generated when you build that source
<oriansj>alMalsamo: what do you think gash, gash-utils, bootar and guix itself are running on?
<alMalsamo>Okay thanks for the clarification. unfortunately I am not entirely sure what "kaem" even is, and I don't think I am 1337 enough to decipher/audit this code taking a look at it. I am mostly familiar with reading Scheme code
<oriansj>alMalsamo: kaem is a shell
<alMalsamo>oriansj: Hmm I dunno I haven't installed GNU Guix since before GNU Mes was invented, but since GNU Mes exists now I kind of assumed that the Scheme interpreter portion of Mes was used in the bootstrap process, not Guile
<oriansj>which just reads text and creates an array and calls execve and waits for it to finish
<oriansj>alMalsamo: easy mistake to make and honestly a point I wish they were more clear about
<oriansj>if you can read C code: https://github.com/oriansj/stage0-posix/blob/master/High%20Level%20Prototypes/kaem-minimal.c
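The loop oriansj describes (read text, build an argument array, call execve, wait) can be sketched as follows. This is a minimal illustration and not the kaem source: the function name `run_script`, the whitespace-only tokenizer, and the choice to stop at the first non-zero status are assumptions made for the sketch.

```python
import os
import sys


def run_script(lines):
    """Run each non-empty line as a command and return the first non-zero wait status."""
    for line in lines:
        argv = line.split()  # assumption: split on whitespace only, no quoting rules
        if not argv:
            continue  # nothing to run on a blank line
        pid = os.fork()
        if pid == 0:
            # Child: replace this process with the requested program.
            # execve does no PATH search, so argv[0] must be a usable path.
            try:
                os.execve(argv[0], argv, os.environ)
            except OSError:
                os._exit(127)  # only reached when execve itself failed
        _, status = os.waitpid(pid, 0)  # parent: block until the child finishes
        if status != 0:
            return status  # assumption for this sketch: stop on first failure
    return 0


if __name__ == "__main__":
    # Hypothetical two-line script that uses the current interpreter as the program.
    print(run_script([sys.executable + " -c pass", ""]))
```

The value returned is the raw wait status, not the program's exit code.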
<alMalsamo>Hmm okay then what is the point of coding a separate Scheme interpreter in GNU Mes project if it is not even used in GNU Guix bootstrap process? When would this portion of GNU Mes ever get used?
<oriansj>alMalsamo: oh it is used
<oriansj>it just doesn't run everything yet
<oriansj>for example it is used to run MesCC to compile TCC
<alMalsamo>oriansj: Okay I am more familiar with Scheme than C but I will take a look at kaem-minimal.c and see if I can understand it. I already have it locally: live-bootstrap/sysa/stage0-posix/src/High\ Level\ Prototypes/kaem-minimal.c
<oriansj>alMalsamo: well if any of it is unclear, please let me know. I'm more than happy to explain anything that isn't completely clear and will be more than happy to improve its comments to aid future understanding for the next people who read it
<alMalsamo>And this is going to be a very n00b question (sorry) but I don't even understand what language kaem-minimal.hex0 is written in, the only low-language I am familiar with is assembly, what exactly *is* hex0?
<oriansj>alMalsamo: we love n00b questions
<alMalsamo>oriansj: Thanks for being open to helping I appreciate it. I am not skilled enough to create low-level code like this so I am immensely appreciative for your contributions to bootstrappable builds in general!
<oriansj>a byte is 8bits; a nibble is 4bits and can be expressed by 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
<alMalsamo>I would like to audit as much of this code as I can, since it seems like a vital way to install an OS; unfortunately most GNU+Linux users don't give a flying fuck about diverse double compilation, I dunno why not >_<
<alMalsamo>Yes I am familiar with byte vs. nibble and also hexadecimal notation
<oriansj>So by reading the characters 2 and 0 and combining them into a single byte, we get the byte value 0x20 which is the byte for the space character
<oriansj>to aid understanding we added line comment syntax
<oriansj># line comment
<oriansj>and
<oriansj>; line comment
<oriansj>so a hex0 program would read a ; or # and just throw away everything until it hits a line feed character
<oriansj>whitespace characters and everything else are ignored (except EOF, which marks that we are done)
<oriansj>So we just read a byte, check if it is EOF or 0-F or ; or #, or throw it away
<oriansj>if it is 0-F (a-f are also supported but mapped to A-F) we store it until we read another hex character
<oriansj>then we shift it 4bits and add the new hex value to get a byte that we write directly with no further changes
<oriansj>So in effect a hex0 program is like doing art via individual colored grains of sand
<oriansj>it is very tedious, boring and error prone (especially with jumps)
<oriansj>So we write as little as possible in hex0
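The hex0 rules described above (two hex digits make one byte, `#` and `;` start a comment that runs to the line feed, everything else is ignored) can be sketched in a few lines. This is an illustration of those rules only, not the hex0 seed itself; the function name `hex0` and the decision to drop a trailing unpaired nibble are assumptions of the sketch.

```python
def hex0(source: bytes) -> bytes:
    """Turn hex0 text into the raw bytes it describes."""
    out = bytearray()
    high = None  # first nibble of the byte being assembled, if any
    in_comment = False
    for ch in source:
        if in_comment:
            in_comment = ch != 0x0A  # a line feed ends the comment
            continue
        c = chr(ch)
        if c in "#;":
            in_comment = True
        elif c in "0123456789abcdefABCDEF":
            nibble = int(c, 16)  # a-f map to the same values as A-F
            if high is None:
                high = nibble  # hold it until the second hex character arrives
            else:
                out.append((high << 4) | nibble)  # shift 4 bits, add the new nibble
                high = None
        # whitespace and any other character is thrown away
    return bytes(out)


if __name__ == "__main__":
    print(hex0(b"48 65 6C 6c 6F ; greeting\n0A # newline\n"))  # b'Hello\n'
```

Hex characters inside a comment are skipped, which is why the comment text in the example does not leak into the output.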
<alMalsamo>Hmm okay but I still don't understand what *language* this is, I mean it seems instructions are encoded in hexadecimal somehow, but this is not in assembly or something? how is a hex0 program "assembled" or "converted" to a binary executable to actually be run on a CPU?
<oriansj>alMalsamo: the language is hex0, the output *IS* binary
<oriansj>we are writing in machine code directly
<oriansj>it is below assembly
<oriansj>below a linker
<oriansj>below loaders
<oriansj>it is the very bits that run directly on the bare metal
<alMalsamo>Okay damn so you have to make new hex0 code for every microarchitecture then similar to assembly language?
<alMalsamo>What turns hex0 text into binary?
<oriansj>alMalsamo: fortunately no; we only need to make a new hex0 for every unique instruction set
<oriansj>alMalsamo: well we have multiple paths to do so
<alMalsamo>Hmm but each microarchitecture usually HAS a unique instruction set, right? Unique ISA?
<oriansj>one could even hand convert and toggle it into memory if one so desired
<alMalsamo>x86 vs. ARM vs. RISC-V all have different ISAs so I assume writing in machine code would require doing everything from scratch for each ISA
<oriansj>alMalsamo: well no, see backwards binary compatibility is the most valuable part of an ISA. Aka binaries built previously will continue to run without changes
<oriansj>alMalsamo: and we do in fact redo all of the steps up until M2-Planet from scratch for each of those ISAs
<oriansj>so provided we restrict ourselves to a minimal subset, our programs should run on ALL implementations of that ISA
<alMalsamo>Hmm but backwards binary compatibility would only apply to processors implementing a single ISA, how could binaries be compatible across different microarchitectures?
<oriansj>alMalsamo: you seem to be mistaking different ISAs for microarchitectures
<oriansj>see an infinite number of different microarchitectures can implement the same ISA and run the exact same binaries without changes
<alMalsamo>Hmm maybe I am, I guess microarchitecture refers to a generation of a design WITHIN an ISA?
<oriansj>but no two different ISAs can be expected to run the exact same binaries
<oriansj>alMalsamo: that would be correct
<alMalsamo>I see stage0-posix code for RISC-V and ARM so damn you had to code all of these for each ISA from scratch? That would require intimate knowledge of each ISA's machine code...
<alMalsamo>What are you some kind of wizard?
<oriansj>alMalsamo: nope
<oriansj>just someone willing to do the work
<oriansj>it just requires one to admit they don't know and being willing to look stupid to understand what they need to reason about
<oriansj>no one here knows everything
<alMalsamo>Hmm well you don't look very stupid to me haha
<alMalsamo>I tried studying assembly x86 for example and never got very far.
<oriansj>alMalsamo: well we love to help people learn here
<alMalsamo>Low-level stuff like this hurts my brain :P
<oriansj>alMalsamo: that might be because no one took the time to help make it clear for you
<alMalsamo>Well I appreciate you using GNU GPLv3+ license, but I am not as big a fan of Github
<oriansj>alMalsamo: that is absolutely fine
<oriansj>I'm more than happy to get this all into savannah or other hosting locations as well
<oriansj>I just haven't had the time to do so
<alMalsamo>So you said that one could hand convert hex0 code and toggle it into memory, but how exactly does hex0 code get executed in the bootstrap process once I try this on my Thinkpad? I mean if hex0 is a kind of machine code representation then it doesn't need an assembler which is great, but SOMETHING has to convert it from text to an executable somehow right?
<alMalsamo>And do you actually have RISC-V32 and RISC-V64 hardware you are testing this stuff on?
<oriansj>alMalsamo: that is why we provide the kaem-optional-seed binary for direct execution and provide you multiple methods of generating the exact same binary from the hex0 source provided
<oriansj>if you want to waste a weekend, you can manually verify every single byte in the binary corresponds to the hex0 source provided.
<oriansj>alMalsamo: I actually didn't do the RISC-V porting, that was stikonas; I just helped with the mescc-tools enhancements needed.
<oriansj>and I use qemu for that testing
<oriansj>but there was someone with actual RISC-V hardware who did do testing and found a difference between the qemu and actual metal (Gabriel Wicki (I think their handle is gbrlwck))
<oriansj>alMalsamo: why do you think there are 4 files in here: https://github.com/oriansj/bootstrap-seeds/tree/master/POSIX/x86 2 hex0 sources and 2 binaries that directly correspond to that hex0 source code
<sam_>oriansj++ (it's the right attitude to have)
<sam_>not being afraid to ask random questions is really key
<alMalsamo>sam_: Hey you are from #gentoo right? You been here awhile?
<sam_>hi, yes :)
<alMalsamo>Hmm hex0-seed is fucking tiny though! What does this do?
<alMalsamo>sam_: Okay I didn't think you would be here haha
*sam_ is in a lot of places
<alMalsamo>sam_: Do you use GNU Guix or just Gentoo?
<sam_>just Gentoo. Not enough hours in the day for anything else :)
<sam_>But been tempted to try guix in foreign mode
<alMalsamo>Hmm so what brings you to #bootstrappable then?
<oriansj>alMalsamo: hex0-seed does exactly what one should expect, take hex0 files and output binaries
<sam_>it's an interesting project (and important IMO)
<alMalsamo>oriansj: Oh cool that is what I was looking for then
<sam_>bootstrappable is not just for guix
<sam_>(also, stikonas[m] kind of got me interested in modern-day bootstrapping with his OpenJDK work.)
<alMalsamo>sam_: Hmm but what other distros does it even apply to? I haven't found another distro that performs diverse double compilation
<sam_>i'm also here because it's relevant to me for e.g. rust
<oriansj>alMalsamo: here is it in C if you prefer it that way: https://github.com/oriansj/stage0-posix/blob/master/High%20Level%20Prototypes/hex0.c
<sam_>bootstrapping applies to all distros. distros may wish to e.g. use guix to get a well-known seed. but bootstrapping also applies to other languages, like Java (which needs Java to build), or Rust (which needs Rust to build)
<oriansj>the last 5 years have been very eventful for bootstrapping.
<oriansj>with BootOS, SectorForth and SectorLisp providing new possible roots; live-bootstrap providing a full path from M2-Planet to GCC+guile; GNU Mes/MesCC providing a C compiler written in scheme that is able to self-host; blynn-compiler providing a possible path to bootstrap haskell (not quite done but tempting) and many more
<alMalsamo>Hmm I haven't heard of BootOS, SectorForth and SectorLisp...
<oriansj> https://github.com/nanochess/bootOS , https://github.com/cesarblum/sectorforth.git and https://github.com/jart/sectorlisp
<alMalsamo>Hmm I read bootOS readme, how does this help? Looks like it only supports 8088 CPUs who even has that anyway?
<alMalsamo>Looks like the only apps supported are a couple games and a BASIC interpreter...
<oriansj>alMalsamo: all x86 and AMD64 processors can run 8088 binaries
<oriansj>and the key point is that it is able to write hex programs to disk and load them
<alMalsamo>Hmm so would something like BootOS remove the dependency on an i686 Linux kernel image in the live-bootstrap process or am I misunderstanding?
<oriansj>stikonas: I cracked the bug and it was me being stupid
<oriansj>alMalsamo: live-bootstrap is the steps after stage0
<oriansj>stage0-posix to be specific in its current form
<oriansj>BootOS provides an alternate root starting point for stage0 on x86 without depending on any kernel or Operating System at all
<alMalsamo>Oohh that's pretty cool actually, have you actually bootstrapped starting from BootOS then?
<oriansj>alMalsamo: anyone can make the hex0 seeds in BootOS
<oriansj>just don't type in the comments
<oriansj>we still need to sort out a proper bootstrap kernel written in M2-Planet's C subset or admit defeat and just write a couple minimal POSIX kernels in assembly for fun
<oriansj>stikonas: and I think we have a regression in M2-Mesoplanet in regards to #defines
<oriansj> https://paste.debian.net/1227979/
<oriansj>I'll dig into it tomorrow as it seems to be related to a bug in building ungz with M2-Mesoplanet
<alMalsamo>oriansj: Another n00b question heh, what exactly are M2-Planet and M2-Mesoplanet and what is the difference between the two?
<stikonas[m]>C compiler
<stikonas[m]>Difference is like cc1 and gcc
<stikonas[m]>One compiles C
<stikonas[m]>The other is a wrapper that runs other executables
<stikonas[m]>E.g. GCC runs cc1, as, ld commands
<stikonas[m]>So M2-Planet is just a C compiler
<stikonas[m]>But it outputs assembly, not binaries
<stikonas[m]>alMalsamo: also I wrote stage0-posix risc-v port without previous experience in risc-v and almost no experience in assembly programming
<stikonas[m]>It's not as hard as you think, although it is a bit tedious
<stikonas>sam_: Guix as foreign distro is useful if you want something that is not packaged in Gentoo (or any of its overlays) but is in Guix
<stikonas>(and provided that one is too lazy to package that program as ebuild)
<oriansj>stikonas: absolutely correct
<oriansj>now my turn for a dumb question; anyone know how to make GCC or Clang throw a big fat warning when in one file there is a prototype void set_env(char** envp); and it is used like so set_env(envp); and in another file where set_env is actually defined, its definition is just void set_env() {...} as passing an argument to a M2-Planet C function that doesn't expect an argument will result in an off by 1 (register size) bug and this is the
<oriansj>second time it has shown up.
<oriansj>or if that isn't possible, should I just create that functionality for M2-Planet?
<oriansj>alMalsamo: one could say, doing stage0-posix work will give you all the experience needed to become a good low level programmer.
<stikonas>strange, I thought gcc would flag argument mismatch...
<stikonas>oriansj: what if you use -Wextra
<stikonas>still no warnings?
<oriansj>stikonas: the only warnings given are: https://paste.debian.net/1228011/
<stikonas>hmm, and this includes even -Wall
<stikonas>so I guess it's not possible to get it to show warning
<stikonas>very strange...
<stikonas>anyway, segfault is gone now
<stikonas>although, I do get some other error
<stikonas>non-line number: - provided to #FILENAME
<stikonas>this is when I try to build blood-elf.c and stringify.c
<stikonas>in the meantime I've updated submodules in stage0-posix to pull in your fixes
<oriansj>well --dirty-mode should help figure out this bug
<stikonas>-E should work too...
<oriansj>ok that is an M2-Planet bug
<oriansj>ok M2-Planet fix is up
<oriansj>now to figure out the M2-Mesoplanet macro regression
<oriansj>ok #define foo 15 + /* blah */ 1 is to replace foo with just 15 + 1
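The intended result for that `#define` can be illustrated with a small sketch that removes block comments from a macro body before it is substituted. This is not M2-Mesoplanet's implementation; the helper name `macro_body` is hypothetical, and the sketch deliberately ignores string literals that happen to contain `/*`.

```python
import re

# Matches one /* ... */ comment, non-greedy so adjacent comments stay separate.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def macro_body(text: str) -> str:
    """Return the replacement text of a macro with block comments removed."""
    without_comments = _BLOCK_COMMENT.sub(" ", text)  # a comment acts like one space
    return " ".join(without_comments.split())  # collapse the leftover whitespace


if __name__ == "__main__":
    print(macro_body("15 + /* blah */ 1"))  # 15 + 1
```

Replacing the comment with a space rather than nothing keeps neighbouring tokens from being glued together, for example `a/**/b` stays `a b`.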
<stikonas>hmm, now I get empty final binary...
<stikonas>./AMD64/bin/M2-Mesoplanet -f mescc-tools/blood-elf.c -f mescc-tools/stringify.c -o blood-elf
<stikonas>but it's the same problem with gcc
<oriansj>stikonas: ok use --dirty-mode and lets work backwards
<oriansj>is the /tmp/M1-macro-* file containing valid hex2 contents
<stikonas>hmm, unknown argument
<stikonas>strange, it is in the source
<stikonas>argh, I'm running the wrong binary
<stikonas>oriansj: I don't even get to M1-macro*
<stikonas>M2-Planet-000000 is already empty
<oriansj>here is what I am getting as output: https://paste.debian.net/1228031/
<oriansj>and now I found the reason for the ungz.c compile problem for M2-Mesoplanet
<oriansj> https://paste.debian.net/1228032/
<oriansj>in short M2-Mesoplanet isn't eliminating the #if defined(__M2__) block in the output but rather just replacing the __M2__ with 42 and dumping it out (which doesn't produce the correct result in the output)
<stikonas>yeah, your output looks better
<stikonas>oh, M2libc needs update
<stikonas>hmm, no, the problem is somewhere else
<stikonas>oriansj: is #ifdef __M2__ also broken?
<stikonas>or just #if defined
<oriansj>stikonas: no I have a different fix
<oriansj>give me a minute
<oriansj>well that was a fast way to max out RAM
<oriansj>ok now __M2__ will expand to __M2__ in the output while still being defined for the macro functionality
<oriansj>and ungz.c is now able to be compiled with just: ./bin/M2-Mesoplanet -f ../mescc-tools-extra/ungz.c -o ungz
<oriansj>and patches are up
<oriansj>stikonas: it working now for you?
<oriansj>anyone?
<stikonas>one moment, checking
<stikonas>no, same problem as before, empty files
<stikonas>hmm, I'll have to investigate it...
<oriansj>is the /tmp/M2-Mesoplanet-* file there?
<oriansj>also does which M2-Planet show the correct file?
<stikonas>yes tmp/Mesoplanet file is there
<stikonas>oh, this time I might have forgotten to add M2-Planet to PATH
<oriansj>sounds like we should add a sanity check to detect if M2-Planet, M1, blood-elf and hex2 actually exist before trying to use them
<stikonas>no, it's still there
<stikonas>even after I added M2-Planet to PATH
<stikonas>output is empty
<oriansj>ok, and when you manually run the M2-Planet command?
<stikonas>but if I manually run that command that M2-Mesoplanet outputs
<stikonas>it works
<stikonas>hmm, so something wrong with spawning
<oriansj>ok
<stikonas>but this is also reproducible on gcc-compiled version
<stikonas>so can't be M2libc problem
<stikonas>strange
<oriansj>hmm
<stikonas>[pid 345935] execve("/home/andrius/repositories/bootstrap/stage0-posix/M2-Planet", ["M2-Planet", "--file", "/tmp/M2-Mesoplanet-OxPtp4", "--output", "/tmp/M2-Planet-ZMbz2v", "--architecture", "amd64", "--debug"], 0x7ffda7b58388 /* 100 vars */) = -1 EACCES (Permission denied)
<stikonas>but why...
<oriansj>ls -hal /home/andrius/repositories/bootstrap/stage0-posix/M2-Planet
<oriansj>it should be -rwxr-x---
<stikonas>oh wait, why is it even reading that directory
<stikonas>it should run binary from AMD64/bin/M2-Planet
<stikonas>oh I see
<stikonas>I've added wrong thing to PATH
<stikonas>well, actually I have both stage0-posix root and bin directories in PATH
<stikonas>and it tries to execute directory
<stikonas>ok, it works now
<stikonas>oriansj: thanks for helping me figure this out
<stikonas>it was my fault...
<stikonas>now the question is, should we start using Mesoplanet for building stuff in stage0-posix?
<stikonas>or keep current kaem scripts invoking M2-Planet/blood-elf/M1/hex2
<oriansj>stikonas: I'm thinking we change stage0-posix to use M2-Mesoplanet for everything after we have M1, blood-elf, hex2 and M2-Planet
<oriansj>however we absolutely should add some logic in cc_spawn.c to give a useful error message when execve fails like it did for you
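The failure stikonas hit (a PATH entry containing a directory named like the program, so execve returned EACCES) suggests a lookup that skips anything that is not an executable regular file and reports clearly when nothing usable is found. The sketch below shows that idea only; it is not the logic in cc_spawn.c, and the name `find_program` and its error text are hypothetical.

```python
import os


def find_program(name: str, path: str) -> str:
    """Return the first executable regular file called `name` on `path`."""
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory or ".", name)
        # isfile() rejects directories, which is what execve tripped over above
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    raise FileNotFoundError(f"{name}: no executable file found on PATH ({path})")


if __name__ == "__main__":
    try:
        print(find_program("M2-Planet", os.environ.get("PATH", "")))
    except FileNotFoundError as err:
        print(err)
```

Running such a check for each required tool before spawning would turn a silent empty output into an immediate, readable error.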
<oriansj>stikonas: also I don't think of your issue as something needing fault but rather a clear sign of something we could improve in M2-Mesoplanet
<oriansj>say make M2-Mesoplanet responsible for mescc-tools-extra builds
<oriansj>it'll certainly simplify everything
<oriansj>although you may wish to set PATH in kaem prior to its use to prevent it from picking up anything from the environment
<stikonas>well, I thought we need at least full kaem
<stikonas>hmm, or maybe not
<oriansj>well full kaem to set a PATH
<stikonas>it's just that before kaem we can't set environmental variables
<stikonas>although maybe we don't need any
<stikonas>anyway, let me first update stage0-posix to have all required submodules
<oriansj>stikonas: well spawning requires PATH
<stikonas>oh yes
<stikonas>so after kaem then
<stikonas>oh actually maybe after M2-Planet
<stikonas>right... because initial M2 binary would not be picked up by M2-Mesoplanet
<stikonas>so yes, mescc-tools-extra then
<oriansj>and I just upgraded M2-Mesoplanet to produce a meaningful error message if it is unable to execute any of the essential binaries
<oriansj>and patch is up
<stikonas>ok, let me pull that in too
<oriansj>and if possible please verify that it would catch and provide useful information if you have the same issue as you discovered.
<stikonas>sure
<oriansj>as I don't want someone else to also run into that sort of bug in M2-Mesoplanet if possible
<stikonas>ok, got that error message now
<stikonas>oriansj: one idea for optimization if it's not hard
<oriansj>good, now I found another place where we could improve the error reporting
<stikonas>right now I think include happens unconditionally
<stikonas>even if it's wrong #if branch
<stikonas>that is probably what slows M2-Mesoplanet down
<oriansj>that is true
<oriansj>and it is probably something we could optimize
<oriansj>fortunately it hits each file only once (previously it hit them multiple times and really slowed things down)
<stikonas>oh yeah, that would have been much slower
<oriansj>and I found something wrong https://paste.debian.net/1228057/
<oriansj>a spawned program returning 1; wouldn't cause that logic to be hit at all
<stikonas>oh, isn't this code from kaem?
<oriansj>yeah
<stikonas>oriansj: return code is given by (status & 0xff00) >> 8
<stikonas>strange, I thought kaem works with returning 1
<stikonas>at least I tested it with /bin/false
<oriansj>what are the bottom 8 bits for?
<stikonas>well, that's what the documentation said to do
<stikonas>let me find
<stikonas> https://git.musl-libc.org/cgit/musl/tree/include/sys/wait.h#n48
<stikonas>so probably bottom 8 bits store other info
<stikonas>e.g. whether it crashed, etc...
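The layout of the wait status on Linux, as given by the macros in the musl header linked above, can be sketched like this: the exit code sits in bits 8 to 15, the low 7 bits hold the terminating signal, and bit 7 is the core dump flag. The function name `decode_wait_status` and the dictionary shape are choices made for this sketch, and stopped or continued children are not handled.

```python
def decode_wait_status(status: int) -> dict:
    """Split a raw wait status into the fields the WEXITSTATUS-style macros expose."""
    signal = status & 0x7F  # terminating signal, 0 when the child exited normally
    return {
        "exited": signal == 0,
        "exit_code": (status & 0xFF00) >> 8,  # only meaningful when exited is True
        "signal": signal,
        "core_dumped": bool(status & 0x80),
    }


if __name__ == "__main__":
    print(decode_wait_status(0x0100))  # a program that returned 1
    print(decode_wait_status(0x008B))  # killed by signal 11 with a core dump
```

A child that returns 1 therefore yields a raw status of 256, so comparing the raw status against 1 never matches.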
<stikonas>ok, stage0-posix submodules are updated
<stikonas>so should have a working Mesoplanet now
<oriansj>ok since we aggressively fuzz M2-Planet, M1, blood-elf and hex2, they will likely never be set
<oriansj>hmm but probably could be used in output in kaem.
<oriansj>hmm I am tempted to flip kaem's behavior in regards to failing programs
<oriansj>say --non-strict instead of --strict; so that the default is to be strict and exit on failure
<oriansj>fossy: thoughts?