IRC channel logs

2023-12-23.log

<ekaitz>stikonas: i tested compiling with mes and the time with an if and a case is exactly the same
<ekaitz>but i'll try something longer just in case
<stikonas>ekaitz: mes might be the same, but it would be good to try mes-m2...
<stikonas>but you'll have to build it manually using new M2-Planet...
<stikonas>oh, actually just use M2-Planet
<stikonas>not mes-m2...
<stikonas>(since we haven't fixed mes.c yet)
<stikonas>perhaps only bootstrap step could be faster
<ekaitz>what is very slow is the mes execution
<ekaitz>so we might want to check other things too
<stikonas>ekaitz: well, one thing is I/O
<stikonas>mescc does it byte by byte
<stikonas>it's definitely slow, but I don't know if that's bottleneck
<ekaitz>oh! in large things it's like a 10% improvement!
<stikonas>ekaitz: oh, even 10% would still be good
<ekaitz>it's not a very big difference and I made MANY iterations
<ekaitz>in 3 seconds i can see some difference
<ekaitz>mes being a very long run might benefit more from this
<stikonas>yeah, mes can easily run for 10 minutes even on x86
<stikonas>and if we go down to 9, that would be good
<ekaitz>there are many weird if's in the eval-apply.c file
<stikonas>and callgrind showed that we spend a significant amount of time in eval-apply()
<ekaitz>i can try to make some changes but then this must be tested thoroughly
<ekaitz>i'd also like to have the other riscv issues fixed
<ekaitz>oh it's doing some struct => int conversion
<ekaitz>this is really cool stuff
<ekaitz>i learned this week you can treat structs as their first fields
<fossy>yeah because of how structs are placed in memory, it is a neat trick
<ekaitz>yeah!
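The struct trick mentioned above relies on a C guarantee: a pointer to a struct, suitably converted, points to its first member. A minimal sketch follows; the `struct cell` layout and the `first_field` helper are hypothetical illustrations, not the actual mes definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cell layout for illustration; not the real mes struct. */
struct cell {
    long type;  /* first member, so it sits at offset 0 */
    long car;
    long cdr;
};

/* Read the first member through a pointer to the whole struct.
 * C guarantees there is no padding before the first member. */
long first_field(struct cell *c)
{
    assert(c != NULL);
    return *(long *)c;
}
```

Calling `first_field` on a cell returns the same value as reading `c->type` directly. The cast only works for the first member; later members need their real offsets.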
<ekaitz>also i don't think cases can fit here very well
<stikonas>which program is doing struct => int?
<ekaitz>eval-apply
<ekaitz>i think
<stikonas>oh I see
<ekaitz>oh not really
<ekaitz>finally all are pointers...
<ekaitz>it's pretty bad dude
<ekaitz>i thought all the cell_ things were integers after all but they are not
<ekaitz>they are pointers
<ekaitz>so we need to do some magic against them to put them in a switch/case
<ekaitz>can i just go intptr_t? is this too much?
<ekaitz>symbol.c is the problem here
<ekaitz>oh stikonas the largest if block won't work
<stikonas>due to pointers?
<stikonas>sigh, that eval_apply is pretty scary...
<ekaitz>nah don't worry
<ekaitz>look it generates symbols and then uses them as values
<ekaitz>to be able to deal with them in scheme
<ekaitz>the problem it has is that it's comparing structs
<ekaitz>just checking if they are equal
<ekaitz>a == b
<ekaitz>and that's not optimal in any case
<stikonas>I'm surprised that even works..
<stikonas>especially in M2-Planet...
<stikonas>or are we comparing pointers to structs?
<stikonas>ok, pointers to structs comparison would work in M2
<ekaitz>pointers to pointers
<ekaitz>it's not the best i think
<ekaitz>the symbols are allocated and stored in memory
<ekaitz>and then we compare against them
<ekaitz>it's pretty easy to read
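To make the switch/case problem concrete: `case` labels must be integer constant expressions, so the address of a symbol allocated at run time cannot be used as a label, even after a cast to `intptr_t`. The sketch below contrasts identity comparison in an if/else chain with a switch over an integer tag stored in the symbol. All names here (`struct symbol`, `sym_id`, `cell_symbol_quote`, `dispatch_by_tag`, and so on) are assumptions for illustration, not code from mes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical integer tag stored inside each symbol. */
enum sym_id { SYM_QUOTE, SYM_IF, SYM_OTHER };

struct symbol {
    enum sym_id id;
    const char *name;
};

/* A real interpreter would allocate these at run time;
 * they are static here to keep the sketch short. */
static struct symbol quote_cell = { SYM_QUOTE, "quote" };
static struct symbol if_cell = { SYM_IF, "if" };
struct symbol *cell_symbol_quote = &quote_cell;
struct symbol *cell_symbol_if = &if_cell;

/* Identity comparison of pointers, as in an if/else chain. */
int dispatch_by_pointer(struct symbol *x)
{
    if (x == cell_symbol_quote)
        return 1;
    else if (x == cell_symbol_if)
        return 2;
    return 0;
}

/* The same dispatch as a switch over the integer tag. */
int dispatch_by_tag(struct symbol *x)
{
    assert(x != NULL);
    switch (x->id) {
    case SYM_QUOTE:
        return 1;
    case SYM_IF:
        return 2;
    default:
        return 0;
    }
}
```

Both functions give the same answer for the same symbol. The tag version costs one extra field per symbol and one memory read per dispatch, so whether it is actually faster would have to be measured.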
<ekaitz>stikonas: i have my starfive running for 1 day with no output in the terminal
<ekaitz>stuck in stack.c
<ekaitz>is this normal?
<ekaitz>because i think it's hanging
<stikonas>ekaitz: what are you building?
<stikonas>(moved to #guix-risc-v for this )
<oriansj>every percent performance gained is a net win
<muurkha>sometimes, it depends on what you pay for it!
<GoogulatorMobile>stikonas: mes can easily run for 10 minutes even on *modern* x86
<GoogulatorMobile>On older x86, I get 30-40 minutes
<stikonas>yeah, it's about 10 minutes on 8 year old laptop...
<GoogulatorMobile>And by older, I mean Core 2 / Conroe uarch
<GoogulatorMobile>I haven't even dared to try on a NetBurst
<stikonas>tomorrow I'll finally have riscv64 checksums
<stikonas>build should finish by then
<stikonas>though maybe we should wait till mes 0.26.1...
<GoogulatorMobile>ugh
<GoogulatorMobile>BTW, why is mes version in seed.kaem?
<GoogulatorMobile>Can't it be contained in mes's build script?
<GoogulatorMobile>Seed shouldn't assume it's gonna be building mes next
<GoogulatorMobile>In fact, seed should make no assumptions about the content of the manifest
<stikonas>GoogulatorMobile: that's probably because variable was shared by both mes and tcc packages
<muurkha>haha, NetBurst
<GoogulatorMobile>TCC needs to know about mes version?
<stikonas>GoogulatorMobile: yeah...
<stikonas>because mes is not fully installed into /usr
<stikonas>and it runs mescc from PREFIX...
<stikonas>from MES_PREFIX
<stikonas>and MES_PREFIX is path to mes build dir...
<stikonas>something to clean up ideally
<stikonas>but I guess installing mescc requires installing a lot of scheme files
<stikonas>anyway, that would be good to fix...
<stikonas>though we have other things that would be good to fix too
<GoogulatorMobile>muurkha: NetBurst is pretty much the beginning of what I consider viable bootstrap hardware
<GoogulatorMobile>Maybe also Coppermine & Tualatin, if you can feed it 3GiB of RAM
<GoogulatorMobile>(needs 1GiB SDRAM modules & a compatible board, which is quite rare)
<GoogulatorMobile>Earlier than that, you start to hit the emulation hazard
<GoogulatorMobile>But go too new, and it becomes harder to trust the hardware
<muurkha>GoogulatorMobile: yeah, it might be viable!
<GoogulatorMobile>I'd consider anything with "modern" Intel ME (in the sense of always-on mini-cores running secret firmware) unsuitable
<GoogulatorMobile>That means no newer than Westmere (iirc)
<GoogulatorMobile>AMD might be viable for a little longer - not sure when PSP was made mandatory
<GoogulatorMobile>For the same reason, I consider Raspberry Pi of any generation NSFB
<GoogulatorMobile>(Not Safe For Bootstrapping)
<stikonas>stage0-uefi is slowly getting fixed. And it's getting easier, hex1, hex2 and catm are now fixed. M0 next...
<stikonas>GoogulatorMobile: PSP runs on all AMD CPUs too for at least a decade
<stikonas>though I think PSP does not have any networking capability
<stikonas>raspberry pi has some open firmware too actually
<GoogulatorMobile>Yeah, open
<stikonas>though I think they only managed to bring up bare minimum
<GoogulatorMobile>But hardly bootstrappable
<stikonas>well, true
<GoogulatorMobile>Unless I missed something, VC4 arch isn't even self-hosting
<GoogulatorMobile>Only cross compilers exist
<stikonas>well, as part of bootstrapping, you write your own compilers anyway
<stikonas>anyway, I don't think it's very active, this hasn't had any commits for years https://github.com/christinaa/rpi-open-firmware/
<GoogulatorMobile>You do, but I would rather not include writing a native compiler for a largely secret and undocumented ISA that has never in its existence executed a compiler
<stikonas>well, yeah, that's no fun...
<GoogulatorMobile>VC4 would be more than powerful enough to bootstrap on, the problem is Broadcom's attitude
<stikonas>anyway, I'm not suggesting bootstrapping on raspberry pi...
<muurkha>GoogulatorMobile: Intel ME is pretty bad, yeah
<muurkha>the VideoCore IV ISA isn't secret and undocumented
<GoogulatorMobile>Intel ME is even worse than VC4 - VC4 at least lets you load your own code without signing, if you have a way of compiling for it
<muurkha>what's secret and undocumented is the source code of the firmware they run on it
<GoogulatorMobile>Isn't the ISA also only known from reverse engineering said firmware?
<Googulator>Any idea about this error? https://github.com/fosslinux/live-bootstrap/actions/runs/7299720208/job/19893080535?pr=354#step:7:10823
<Googulator>I can also reproduce this locally on Ubuntu 22.04 in WSL2
<Googulator>basically, it seems the bwrapped environment lacks the CAP_MKNOD capability - but then how did this work before the refactor?
<Googulator>(the step that fails is populate_device_nodes in both GitHub and WSL2)
<Googulator>ok, it seems to be an easy fix, as far as getting bwrap to work in general
<Googulator>2- or 3-pass bwrap for CI might be harder
<Googulator>all of the offending mknod operations were in sysb originally - bwrap/chroot just straight up skipped sysb
<Googulator>now that sysb is no more, the mknods are now attempted in bwrap, and they fail
<fossy>Googulator: don't worry about CI right now, I have plans for that
<Googulator>fossy: I read that too late - I already have it working, at least locally :)
<Googulator>and it turns out, someone else also worked on bwrap (though without CI): https://github.com/fosslinux/live-bootstrap/pull/360
<lrvick>Hey all. I am trying to make general purpose deterministic multi-party-signed container images that don't trust any single existing linux distribution. I got rust bootstrapped all the way from gcc this way, and now I am going down the other path, replacing my alpine sourced "seed" gcc with a full-source-bootstrapped gcc. This rabbithole led me here.
<lrvick>Sadly in dockerland building from actual scratch is pretty hard, so I bootstrapped "stage0-posix" by building in 3 different seed distros and confirming they all agree with each other's hashes like so: https://git.distrust.co/public/packages/src/branch/main/src/bootstrap/stage0/Dockerfile
<lrvick>My understanding is the next thing I need to do on my path to gcc is build Mes with M2-Planet.
<lrvick>My failing attempt at that is here: https://git.distrust.co/public/packages/src/branch/main/src/bootstrap/mes/Dockerfile
<lrvick>This lands me with:
<lrvick> > [build 7/7] RUN ["/M2-Planet","--debug","--architecture","amd64","-f","src/mes.c"]:
<lrvick>0.220 src/mes.c:35:ERROR in create_struct
<lrvick>0.220 Missing {
<lrvick>I am assuming the source is fine and I am just skipping some steps, or missed some docs somewhere
<GoogulatorMobile>lrvick: I suggest starting with live-bootstrap instead
<GoogulatorMobile>Gets you straight up to GCC 13 from stage0
<muurkha>lrvick: sounds like an awesome project!
<lrvick>live-bootstrap?
<lrvick>ACTION looks
<lrvick>okay, looks like maybe I can adapt the chroot path to a container
<matrix_bridge><Andrius Štikonas> lrvick: or bwrap path...
<matrix_bridge><Andrius Štikonas> bwrap is using Linux namespaces just like docker, just a bit lower level tool
<lrvick>Well, goal is that anyone can take this Dockerfile and run it from podman, buildah, docker, kaniko, or any other OCI compatible runner for maximum portability, and get the same hash
<lrvick>without having to care what their host OS is
<matrix_bridge><Andrius Štikonas> Andrius Štikonas: also you were building mes.c without any headers...
<stikonas>lrvick: yeah, but you need to adjust steps in Dockerfile...
<stikonas>and I suggest not starting with debian or alpine...
<stikonas>but start with "scratch" if you want really to bootstrap from source
<lrvick>Well I start with all three, and compare the hashes, in the stage0 one at least.
<stikonas>yeah, but if you run the right binaries, you don't need anything from host containers
<lrvick>then in the "mes" package I only use debian for downloading sources from the internet, then I pivot to a scratch container
<stikonas>yeah, you do need to download stuff... true
<lrvick>I download with the untrusted debian container, then pivot to scratch, and check the hash there with stage0 built sha256sum
<lrvick>which seems about as optimal as I can get
<stikonas>oh, I missed that...
<stikonas>yeah, ineed, it's lower down
<stikonas>s/ineed/indeed/
<stikonas>ACTION will be back in an hour or so
<lrvick>has anyone tested if the final gcc13 produced by live-bootstrap is currently deterministic? Would love hashes to compare against
<ekaitz>GoogulatorMobile & stikonas : i sent the patch that avoids double runs to bug-mes
<ekaitz>it's just deleting one line, but it's there so you can just take it and apply it to your mes download if you want to make it faster
<lrvick>Okay attempted live-bootstrap in a scratch container image using a pre-built stage0 image. live-bootstrap: https://git.distrust.co/public/packages/src/branch/fsb/src/bootstrap/live/Dockerfile stage0: https://git.distrust.co/public/packages/src/branch/fsb/src/bootstrap/stage0/Dockerfile
<lrvick>It explodes like so: https://dpaste.org/VvuSN
<lrvick>core dump in script generator
<lrvick>I barely have any idea what I am doing with this, so probably something dumb
<stikonas>lrvick: that script generator is very new, probably some bug in it but it could be caused by e.g. missing files?
<stikonas>I'll take a look at your dockerfiles
<stikonas>lrvick: ok, building podman now, will try those dockerfiles soon
<stikonas>though even in docker scratch we depend on docker binary and also kernel
<lrvick>well you depend on -any- oci capable runner of which several exist
<lrvick>and if they all agree in a reproducible build setup, then that is very confidence inspiring
<lrvick>also OCI runners have been adapted to other kernels that provide a linux system call interface, like FreeBSD.
<stikonas>well, yeah, I'm using podman both at home and at work...
<stikonas>though pure live-bootstrap can also do kernel bootstrap on BIOS systems...
<stikonas>(we haven't got UEFI bootstrap fully working yet, but there is some work in progress)
<lrvick>yeah, so like if you used podman+linux and I used docker+freebsd then we had no code overlap in our runtimes, and then if everything is deterministic from there, great. I am optimizing for maximum diversity.
<lrvick>and hey, if the builds in my easily-CI-able bootstrap container end up matching the hashes people do by hand on baremetal, the confidence keeps stacking
<stikonas>yeah, they should match if everything is right
<stikonas>and yes, more diversity is good, that's why even live-bootstrap itself supports a few different running modes
<stikonas>I actually planned to look at docker option a couple of years ago, but then doras implemented rootless bubblewrap mode, so I didn't do any work on docker...
<lrvick>mostly my work here is motivated by the overall supply chain integrity story: pretty much every container image that powers the whole internet at this point is based on alpine, which is neither signed nor reproducible, and the non-musl alternatives are worse in other ways.
<lrvick>and all my clients want to do reproducible builds of their mission critical software from a wide range of systems and get the same result, and docker is the thing everyone knows.
<lrvick>but yeah... I bootstrap all my stuff from alpine as I needed a musl gcc, so I just kicked the can further down the stack.
<lrvick>time to finish it ^_^
<stikonas>yeah, we do build musl gcc in live-bootstrap :), so you just need to port it to docker...
<stikonas>to run exactly the same steps
<lrvick>Yeah, on paper it seems like this should be a drop-in replacement for the alpine container I bootstrap the rest of my distro with.
<lrvick>other than my weird core dumps above. no clue how to debug that
<lrvick>but happy to learn if anyone gets a moment to poke it
<stikonas>yeah, I'll run it soon
<stikonas>I'll see what happens there
<stikonas>my first guess is that some file is missing?
<stikonas>but I might be wrong
<lrvick>likely. I am only exporting the bin/x86 and m2libc folders from my stage0 build container
<lrvick>so if there is anything obvious beyond those I need, lmk
<lrvick>trying to have only what is absolutely required in each stage to make it easier to audit
<stikonas>ok, stage0-posix is building
<stikonas>lrvick: what is https://codeload.github.com/fosslinux/live-bootstrap/legacy.tar.gz ?
<stikonas>in that 2nd dockerfile
<stikonas>I just get HTTP 400
<fossy>oh hello lrvick, i recall you from hashbang
<fossy>i can't promise that live-bootstrap will be a drop in replacement, but it should be very close to ^-^
<fossy>re: reproducibility, all package checksums are in steps/SHA256SUMS.pkgs; live-bootstrap should be 100% reproducible
<stikonas>well, post-bash checksums are there
<stikonas>pre-bash are in steps/*/*.checksum
<stikonas>lrvick: I seem to hit some issues earlier
<stikonas>I run DOCKER_BUILDKIT=1 SOURCE_DATE_EPOCH=1 /usr/bin/docker build -t local/live:latest --build-arg REGISTRY=local --platform linux/amd64 --progress=plain src/bootstrap/live
<stikonas>(remove --target package)
<stikonas>s/remove/removed
<stikonas>and then it runs STEP 14/24: FORCE_TIMESTAMPS=False and then fails
<stikonas>something seems wrong with cat syntax
<stikonas>I think because line endings are not escaped
<stikonas>so docker build treats RUN cat <<EOF > /rootfs/steps/bootstrap.cfg as the whole command
<lrvick>was this in docker or podman?
<lrvick>also about to pass out any moment. 6am... so may be a few hours before I can take a look.
<stikonas_>lrvick: no problem, take a look tomorrow
<stikonas_>it was with podman...
<GoogulatorMobile>early progress towards bootstrapping pine: https://github.com/fosslinux/live-bootstrap/assets/16308406/ac74c34a-8c9f-4fd4-8184-87a3a17febb4
<stikonas>you didn't start with the seed (pine cone)
<stikonas>in the meantime, I've fixed M0.efi
<stikonas>so not much left till stage0-uefi is fixed
<stikonas>just need to fix cc_amd64 and M2libc...
<stikonas>and then I probably should try to write a POSIX kernel on UEFI...
<stikonas>probably easier way forward than porting meslibc to UEFI...
<GoogulatorMobile>stikonas: and it's missing an actual bootstrap for Santa to fill :)
<stikonas>goggles-bot: before I opened that link, I thought it will be something on one of the pine64 products...
<stikonas>by the way, how come that pine asset appears under fosslinux repo?
<lrvick>stikonas: looks like you were able to build the stage0 image though?
<stikonas>lrvick: stage0 yes
<stikonas>then I tried building live docker image
<lrvick>Okay, good to know that one is at least portable. So probably just my heredoc usage
<stikonas>and podman first of all complained about command argument
<stikonas>(--target platform)
<lrvick>ah yeah, that is a buildkit specific thing
<stikonas>but even if I removed it, it still didn't work (I think cat <<EOF problem)...
<stikonas>as for your crash, I haven't reached it yet, but maybe worth checking if you have all the files that script-generator tries to fopen...
<stikonas>lrvick: looking at dockerfile, it seems that you copied it to /rootfs/steps
<stikonas>and script generator tries to open /steps
<lrvick>stikonas: okay redid it without a heredoc since I can't figure out why podman hates those
<lrvick>stikonas: I put everything in /rootfs in the "fetch" stage, but then it gets into the "build" stage on line 41, which dumps everything from /rootfs into /, once I am safely under a "from scratch"
<lrvick>I'll get a tree of the final filesystem. sec
<stikonas>oh yes, it does go to /
<stikonas>ok, now it segfaults in script generator
<lrvick>great.
<lrvick>I mean not great, but you replicated where I am
<lrvick>Here is the state of the filesystem in the final stage. injected busybox at runtime so I could use ls -Rlah : https://sprunge.us/dYeEEO
<stikonas>indeed
<stikonas>maybe FILE *env = fopen("/steps/env", "r"); ?
<stikonas>script-generator opens it but does not check if env is nullptr
<lrvick>I can cat /steps/env and it indeed has env vars in it
<stikonas>let me try to strace it...
<lrvick>docker run --volume ~/.local/bin/busybox:/bin/busybox --user 0:0 --entrypoint /bin/busybox -ti local/live:latest sh -c "/bin/busybox --install -s /bin && sh"
<lrvick>use that to get a busybox shell inside the scratch container if you like
<stikonas>strace might be more useful here...
<lrvick>ACTION figures out static strace binary
<stikonas>lrvick: no need to inject strace...
<stikonas>I'll just run strace from outside
<stikonas>it will be noisy...
<lrvick>oh ofc
<stikonas>but I only care about the last steps
<lrvick>I am not copying all of stage0 inside. only the "x86" and "M2Libc" dirs make it over. I'll test copying the kitchen sink to test, but my (likely incomplete) understanding is I only need those two dirs.
<stikonas>I would guess so (just x86 and M2libc...)
<stikonas>I don't see anything bad with openat syscalls...
<lrvick>Good to know I am not doing something super obviously dumb, though that would have led to a faster answer ^_^
<lrvick>Okay yeah, copying the whole stage0 tree over to / had no impact
<stikonas>hmm, now I ran both with -e openat,execve
<stikonas>and I don't see script generator opening anything at all
<stikonas>maybe I should do -ff to get split logs...
<stikonas>will be easier to read...
<stikonas>lrvick: yes, it is missing a file
<stikonas>open("/steps/lwext4-1.0.0-lb1/files/fiwix-file-list.txt", O_RDONLY) = -1 ENOENT
<lrvick>that is way deeper in the tree than I have interacted with
<lrvick>other files exist in that folder
<lrvick>where does that come from?
<stikonas>hmm, possibly rootfs.py creates it
<lrvick>which I skip
<lrvick>ACTION looks
<stikonas>recently fossy refactored live-bootstrap a lot
<lrvick>Yeah, looks like I made it just in time to kick the tires on the less-trodden paths post refactor
<stikonas>and it's not mentioned in the Readme (in the pythonless section)
<lrvick>the the bug might not be by brain this time
<lrvick>my*
<lrvick>in addition to suiting my own needs, this docker setup might become a useful end to end test suite...
<lrvick>no mention in the readme of making a manual bootstrap.cfg either but that one I got a useful error on so easy to work out
<stikonas>well, you can send a PR with readme updates...
<lrvick>Yeah for sure. Once a working path is confirmed :D
<lrvick>here we go
<lrvick> https://github.com/fosslinux/live-bootstrap/blob/55ad47acd764a909e0f337bce322abd346d963fa/lib/generator.py#L137
<lrvick>okay so I need to replicate that in shell. easy enough
<lrvick>also what strace incantation ended up being most useful for you there to find that
<lrvick>I assume you were just wrapping podman in strace ff?
<lrvick>never tried stracing from -outside- the container
<stikonas>lrvick: I just ran strace -ff -o logs/logs
<stikonas>on the whole docker process
<stikonas>initially I did -f but that creates just one log with everything
<stikonas>-ff creates one file per process
<stikonas>so much easier to read
<stikonas>just had to find file that corresponds to script-generator
<stikonas>my initial mistake was filtering on openat
<lrvick>I don't know that this would work with docker given the socket indirection, but not sure how podman does process forking
<stikonas>M2libc was calling open and not openat
<stikonas>processes for mas normal
<stikonas>s/for mas/fork as/
<stikonas>it's just that they are in a separate mount namespace
<stikonas>anyway, I ran "strace -o logs/logs -ff /usr/bin/docker build -t local/live:latest --build-arg REGISTRY=local --platform linux/amd64 --progress=plain src/bootstrap/live"
<stikonas>anyway, that's quite pleasant to debug compared to stage0-uefi...
<stikonas>maybe it's possible to wire in gdb there too, but it's non-trivial
<stikonas>so I was mostly debugging that with return condes
<stikonas>return codes
<lrvick>actually, isn't fiwix only relevant for the live-boot use case vs baremetal or chroot?
<lrvick>Thanks, those are good tips
<stikonas>lrvick: yeah, it's only relevant to baremetal bootstrap
<stikonas>lrvick: you could try fixing that crash in script-generator...
<lrvick>So then the actual issue is I should be skipping looking at fiwix at all with my config
<stikonas>i.e. print a useful error message rather than crash
<stikonas>when files are not found...
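A minimal sketch of the fix suggested above: check the result of `fopen` and print a useful message instead of dereferencing a null pointer later. The helper name `open_or_report` is hypothetical and is not taken from script-generator. The sketch uses standard C, so it may need adapting to the C subset that M2-Planet accepts.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open a file, and on failure say which path
 * could not be opened and why, so the caller can exit cleanly. */
FILE *open_or_report(const char *path, const char *mode)
{
    FILE *f;

    assert(path != NULL);
    f = fopen(path, mode);
    if (f == NULL) {
        /* On POSIX systems fopen sets errno on failure. */
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    }
    return f;
}
```

A caller would then test the return value and stop with a non-zero exit code when it is `NULL`, which turns a segfault into a message naming the missing file.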
<lrvick>I suck at C, but that sounds easy enough
<lrvick>Ah, I didn't want the tooling to actually try to use a chroot in the container, so my config is BARE_METAL=True, but then that tries to do fiwix and other things not relevant to me.
<lrvick>I basically need the chroot path, without actually doing a chroot pivot since the container covers that
<lrvick>so maybe creating a "container" path is the right call here.
<lrvick>going to let it try the chroot path the vanilla way and see if everything is happy, then try ripping out the actual chroot pivot with a flag
<stikonas>no, you definitely don't want BARE_METAL=true...
<stikonas>that is either for real bare metal or at the very least qemu mode
<GoogulatorMobile>stikonas: GitHub draft comment embedded image trick
<GoogulatorMobile>By writing a draft comment, embedding an image, and clicking Preview, images can be uploaded to GitHub, and they persist (at least for some time) even after you discard the draft
<GoogulatorMobile>I meant to do that on googulator/live-bootstrap, but accidentally uploaded under fosslinux/live-bootstrap instead
<GoogulatorMobile>BTW, I wasn't thinking of Pine64, but pine, the e-mail client
<stikonas>that pine was replaced by alpine some time ago...
<stikonas>(not to be confused with distro)
<GoogulatorMobile>hmm, looked more at this image upload trick, it's a lot more exploitable than I thought
<GoogulatorMobile>maybe it would be wise to report it to GitHub
<GoogulatorMobile>it's possible to upload arbitrary binaries under any public repo, not even just images
<stikonas>oriansj: you only added $ immediates to M1, not M0?
<stikonas>at least that's my reading of the code
<stikonas>in cc_amd64.M1 I should still use @