IRC channel logs

2023-12-28.log


<stikonas>well, I pushed my changes as it is for now...
<oriansj>stikonas: thank you
<stikonas>well, still a while till we can run non-trivial stuff with it
<stikonas>and these syscalls are easy...
<stikonas>I expect stuff like brk / fork / exec to be a bit harder
<oriansj>brk just becomes basic input validation and allocating block multiples
<oriansj>as we only do brk to allocate memory and never deallocate
<stikonas>or alternatively we switch to mmap...
<oriansj>also a valid option
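The brk approach described above (validate the request, hand out whole blocks, never give memory back) can be sketched roughly as follows. This is a minimal illustration and not code from posix-runner: the arena, `BLOCK_SIZE`, `HEAP_SIZE`, `emulated_brk` and `reserved_bytes` are all hypothetical names, and a static array stands in for memory that would really come from the firmware. The return convention mimics the raw Linux brk syscall (new break on success, unchanged break on failure).

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u                 /* hypothetical allocation granularity */
#define HEAP_SIZE  (64u * BLOCK_SIZE)    /* hypothetical fixed arena size */

static uint8_t heap[HEAP_SIZE];          /* stand-in for firmware-provided memory */
static size_t heap_break = 0;            /* current break, as an offset into heap */
static size_t heap_reserved = 0;         /* bytes reserved so far, a block multiple */

/* Returns the new break on success, or the unchanged break on failure. */
uintptr_t emulated_brk(uintptr_t addr)
{
    uintptr_t base = (uintptr_t)heap;
    uintptr_t current = base + heap_break;

    /* Input validation: refuse queries (addr == 0), shrinking, and
       anything past the end of the arena. */
    if (addr < current || addr > base + HEAP_SIZE)
        return current;

    size_t wanted = (size_t)(addr - base);

    /* Reserve whole blocks only; memory is never returned. */
    size_t rounded = (wanted + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;
    if (rounded > heap_reserved)
        heap_reserved = rounded;

    heap_break = wanted;
    return addr;
}

/* Helper for inspection: how many bytes have been reserved in total. */
size_t reserved_bytes(void)
{
    return heap_reserved;
}
```

Calling `emulated_brk(0)` returns the current break, which is how a libc typically discovers the start of its heap before growing it.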
<fossy>stikonas: just to clarify, posix-runner is pretty much a translation layer from POSIX syscalls to UEFI?
<stikonas>fossy: yeah
<fossy>i see
<fossy>that would be very useful
<stikonas>fossy: but since M2libc already pretty much does that for basic syscalls
<stikonas>it's really almost trivial
<fossy>right
<stikonas>this is syscall implementation https://git.stikonas.eu/andrius/stage0-uefi/src/branch/posix-runner/posix-runner/syscalls.c
<stikonas>hmm, I seem to have left some debug line there...
<stikonas>fossy: and that's the main part https://git.stikonas.eu/andrius/stage0-uefi/src/branch/posix-runner/posix-runner/posix-runner.c
<stikonas>this has quite a bit more assembly but still not too bad
<stikonas>it's just that C can't do stuff related to stack...
<stikonas>or those x86_64 model specific registers
<stikonas>fossy: still it's fairly simple program right now
<fossy>yeah, i see
<fossy>is fork even possible?
<stikonas>builder-hex0 did some hack
<stikonas>so probably something similar could be done too
<fossy>right, so fake-ish fork
<stikonas>well, uefi API only has spawning
<fossy>i guess as long as every process forks & waits in the original process you can fake it
<fossy>ah
<stikonas>yeah, I think so
<stikonas>fossy: so basically stage0-uefi can get us to the same place as stage0-posix, but afterwards we either need a translation layer or have to port meslibc to UEFI
<stikonas>and later tcc is even more complicated...
<stikonas>so translation layer is probably better
<fossy>i think translation layer is going to be the easier & more sustainable solution
<stikonas>indeed
<stikonas>and M2libc already has basic syscalls
<stikonas>fossy: anyway, feel free to test it
<stikonas>a few weeks ago I realized that stage0-uefi was quite broken on some machines because stack was not properly aligned
<stikonas>now that is fixed...
<stikonas>anyway, it should all be contained to ESP partition, so testing with USB dongle should be fairly safe
<stikonas>and thanks to Googulator again for suggesting this translation layer idea
<stikonas>fossy: one limitation of posix-runner right now is that only position independent elf binaries are supported
<stikonas>i.e. it can't fix symbol addresses in the elf file...
<stikonas>but we can fix this if necessary in the posix path
<stikonas>stage0-uefi already uses relative rather than absolute addressing (except for kaem-optional but that can be fixed too)
<stikonas>mescc might need fixes...
<stikonas>tcc probably supports both, just needs some argument like -fPIC or something like that
<fossy>stikonas: yeah i'll give it a try on bare metal when i get a minute to do that
<stikonas>thanks
<stikonas>basically run make there and then dd build/disk.img
<fossy>okey
<stikonas>(well, with suitable dd syntax)
<fossy>relative addressing, what's that because of? uefi reasons?
<stikonas>more like kernel doesn't fix binaries when it is loading
<stikonas>and partially UEFI reasons
<fossy>right
<stikonas>so the way this works
<stikonas>is it mallocs sizeof(ELF) bytes
<stikonas>and puts ELF file there
<stikonas>but that can be anywhere in the memory
<stikonas>and won't be in the location where base address points
<stikonas>builder-hex0 hardcodes base address and loads binary there
<fossy>ahh, that makes sense
<stikonas>which is why rickmasters had to patch mes to put it in the same address
<stikonas>but in UEFI I need to allocate memory pool first
<stikonas>and Linux does far more I guess...
<stikonas>so it can work with absolute addresses too
<stikonas>also i'm not bothering with any of the security ring stuff...
<stikonas>(besides having to fix SS register after syscall instruction, otherwise UEFI hangs)
<stikonas>it all stays in ring 0
<fossy>eh, it's bootstrap, i don't think we need to be too concerned about what ring it runs in
<stikonas>exactly...
<stikonas>mes will have further problems of course
<stikonas>not just relative addressing
<stikonas>we also need to fix tcc-mes not to crash...
<stikonas>but at least it builds these days
<stikonas>(also most of the actual UEFI->POSIX translation right now lives in https://github.com/oriansj/M2libc/blob/main/uefi/uefi.c)
<stikonas>it's a lot lower level than posix...
<stikonas>so need to emulate stuff like current working directory...
<fossy>ooh, i see
<fossy>quite low level
<stikonas>yeah, but for bootstrap we mainly need I/O
<stikonas>everything else is a nice bonus...
<stikonas>and if we don't have those, we can work around
<fossy>how far into the bootstrap do we plan to go with uefi?
<fossy>before loading a posix kernel
<stikonas>fiwix I think
<stikonas>make is tricky
<stikonas>as it does more complicated stuff with forking
<stikonas>so amd64 bootstrap till fiwix
<stikonas>and then continue on x86
<fossy>fiwix sounds reasonable
<fossy>yeah
<stikonas>also to see how annoying UEFI API is, look at hex0 C prototype https://git.stikonas.eu/andrius/stage0-uefi/src/branch/main/Development/hex0.c
<stikonas>and that doesn't even include proper error handling...
<stikonas>but in hex0 we don't bother with that in posix either
<fossy>a lot of intermediary steps for something pretty simple...
<stikonas>indeed...
<stikonas>and also PE32 headers are huge...
<stikonas>so bootstrap seeds are also significantly bigger
<stikonas>especially kaem
<stikonas>but on the other hand, kaem-optional is also optional here
<stikonas>at least if you can kick off your UEFI programs with some arguments from the boot menu
<Googulator>meanwhile, in Perl bfd land: looks like the staged build behavior is the correct one
<Googulator>bfd.h is meant to exist, and is meant to be found by Perl
<Googulator>the question is not "why does it suddenly exist when we rebuild the system from packages", but rather, "where the hell does it go when we don't"
<sam_>bfd.h is a stupid situation
<sam_>binutils cannot decide if anyone is supposed to use libiberty and friends
<sam_>see the various links in https://bugs.gentoo.org/879067
<sam_>it's such a mess
<Googulator>oh...
<Googulator>src_install() {
<Googulator>    # Remove old perl
<Googulator>    rm -rf "${PREFIX}"/lib/perl5/
<Googulator>    default
<Googulator>}
<Googulator>That seems to explain it.
<Googulator>src_install() touches the live system, removing files
<stikonas>oh yes, otherwise we just accumulate a lot of old perl files
<Googulator>Do we actually need this, or is it just cleanup?
<fossy>iirc cleanup, but that was added aaaages ago
<Googulator>More importantly, do we need it *in the install to fakeroot codepath?*
<matrix_bridge><Andrius Štikonas> yeah, just the cleanup
<Googulator>OK, so it's not because "make install" to fakeroot fails / does the wrong thing if an older perl is already installed outside fakeroot
<fossy>i don't *think* so
<Googulator>In that case, the right way to handle that would be to support overriding src_apply
<Googulator>src_install is part of the build path, not the actual package install
<Googulator>musl-1.1.24 is committing the same sin
<Googulator>touching the live system in src_install
<Googulator>coreutils-5.0 too, although this one is a bit better - it creates files on the live system, rather than deleting them
<Googulator>bash also deletes from the live system, but a comment there suggests it's justified
<Googulator>"# Needs special handling b/c is currently running - tar doesn't like this"
<Googulator>yet somehow it doesn't cause problems when installing preseeded packages
<Googulator>flex is removing lex and yacc in the same way
<oriansj>stikonas: hmmm; well perhaps it is wisest to go full POSIX instead of playing compatibility layer with UEFI after we have M2-Planet.
<stikonas>oriansj: perhaps, but then you need to write drivers
<stikonas>i.e. for display...
<stikonas>anyway, I have debugged my current problem a bit...
<stikonas>possibly because stack was not zeroed...
<stikonas>which stage0-posix assumes...
<stikonas>it's either a small fix in hex0 "push rdi" -> "push 0"
<stikonas>or we allocate a new stack for each executable
<oriansj>well display isn't essential and vga only needs us to write to the correct memory address and have a simple buffer
<stikonas>it will be hard to debug without display too
<stikonas>but maybe it's an option...
<stikonas>though we can have both...
<stikonas>I don't think UEFI compatibility layer would be hard...
<stikonas>I basically have a slightly modified hex0 already running
<stikonas>and proper fix can be done too...
<stikonas>anyway, going to bed now
<oriansj>sweet dreams
<Googulator>located the culprit: it's musl-1.1.24 pass3
<Googulator>src_install() {
<Googulator>    rm -rf "${PREFIX}/include"
<Googulator>    make PREFIX="${PREFIX}" DESTDIR="${DESTDIR}" install
<Googulator>}
<Googulator>rm -rf "${PREFIX}/include"
<Googulator>no, really
<Googulator>introduced here: https://github.com/fosslinux/live-bootstrap/commit/30ebe8ccba7b635a8bba2001deb7f0827d32cddc
<Googulator>and carried forward through several refactors, without anyone noticing
<fossy>that used to be right, i guess, because at that point all previous headers became irrelevant, but obviously not the case anymore LOL
<fossy>not sure how that didn't create much more severe problems
<Googulator>hmm, just noticed the guile build step is actually building 2 separate versions of guile
<Googulator>shouldn't that be 2 build steps then?
<Googulator>back to the musl issue: according to https://github.com/fosslinux/live-bootstrap/blob/30ebe8ccba7b635a8bba2001deb7f0827d32cddc/sysa/run.sh#L93 it appears this musl rebuild was already after binutils then
<Googulator>(binutils installs headers, which can be relevant later)
<Googulator>& at this point, that was actually the final binutils
<Googulator>the intention was presumably to get rid of meslibc-related headers, but it was already heavy handed back then
<fossy>Googulator: we don't actually build guile 3.0.7 completely; only a (relatively) small part of it, just enough to get psyntax-pp.scm out of it into guile 3.0.9
<Googulator>makes sense then
<oriansj>Googulator: thank you for your continued hard work ^_^
<Googulator>oriansj: you're welcome :)
<stikonas>oriansj: https://github.com/oriansj/bootstrap-seeds/pull/43
<stikonas>that read problem on posix-runner.efi turns out to be a minor bug in hex0
<stikonas>(which could be observed if you get file descriptor > 255)
<stikonas>(though normally that doesn't happen on linux)
<stikonas>easy to fix though, just made sure that we clear all other bits on a stack (by pushing another register)
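One way a bug of this kind could arise, going by the messages above, is a one-byte read into a stack slot that still holds a previously pushed value such as the file descriptor. This is an interpretation and not the actual hex0 code, and `read_byte_into_slot` is a hypothetical name. The read replaces only the low 8 bits of the slot, so any higher bits left over from the old value survive when the full register is popped back.

```c
#include <stdint.h>

/* Models a 64-bit stack slot that initially holds slot_init and then
   receives a single byte from a 1-byte read. Only the low 8 bits are
   replaced; the upper 56 bits keep whatever was there before. */
uint64_t read_byte_into_slot(uint64_t slot_init, uint8_t byte)
{
    uint64_t slot = slot_init;
    slot = (slot & ~(uint64_t)0xFF) | byte;
    return slot;
}
```

With a small descriptor such as 3 the stale bits are all zero and the result is just the byte that was read, which would explain why the problem does not normally show up on Linux. With a larger value such as 0x1234 the result is 0x1200 combined with the byte. Starting from a zeroed slot avoids the problem regardless of the descriptor value.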
<stikonas>well, with another almost trivial change, posix-runner can now run quite a bit more stuff: hex0, hex1, hex2, catm and M0. (cc_amd64 somehow hangs...)
<stikonas>(perhaps we don't really need them since stage0-uefi provides the same, native UEFI binaries but it's a good testcase...)