IRC channel logs
2023-12-23.log
<ekaitz>stikonas: i tested compiling with mes and the time with an if and a case is exactly the same
<ekaitz>but i'll try something longer just in case
<stikonas>ekaitz: mes might be the same, but it would be good to try mes-m2...
<stikonas>but you'll have to build it manually using new M2-Planet...
<stikonas>perhaps only bootstrap step could be faster
<ekaitz>what's very slow is the mes execution
<ekaitz>so we might want to check other things too
<stikonas>it's definitely slow, but I don't know if that's the bottleneck
<ekaitz>oh! in large things it's like a 10% improvement!
<stikonas>ekaitz: oh, even 10% would still be good
<ekaitz>it's not a very big difference and I made MANY iterations
<ekaitz>in 3 seconds i can see some difference
<ekaitz>mes being a very long run might benefit more from this
<stikonas>yeah, mes can easily run for 10 minutes even on x86
<stikonas>and if we go down to 9, that would be good
<ekaitz>there are many weird if's in the eval-apply.c file
<stikonas>and callgrind showed that we spend significant amount of time in eval-apply()
<ekaitz>i can try to make some changes but then this must be tested thoroughly
<ekaitz>i'd also like to have the other riscv issues fixed
<ekaitz>oh it's doing some struct => int conversion
<ekaitz>i learned this week you can treat structs as their first fields
<fossy>yeah because of how structs are placed in memory, it is a neat trick
<ekaitz>also i don't think cases can fit here very well
<ekaitz>i thought all the cell_ things were integers after all but they are not
<ekaitz>so we need to do some magic against them to put them in a switch/case
<ekaitz>can i just go intptr_t? is this too much?
<ekaitz>oh stikonas the largest if block won't work
<stikonas>sigh, that eval_apply is pretty scary...
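The struct and switch/case points above can be illustrated with a small C sketch. It is only an illustration under assumptions: `struct cell`, its field order, the tag values and the helper names are hypothetical and are not taken from the mes sources. It shows that a pointer to a struct can be read through a pointer to its first member, that a cast to `intptr_t` gives an integer usable for identity comparison, and that a `switch` works on an integer tag, while addresses of cells allocated at run time cannot serve as `case` labels because labels must be integer constant expressions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags; the real mes values may differ. */
enum { TPAIR = 1, TSYMBOL = 2 };

/* Hypothetical cell layout with the type tag as the first field. */
struct cell {
    long type;
    struct cell *car;
    struct cell *cdr;
};

/* C guarantees that a pointer to a struct, suitably converted,
 * points to its first member, so the tag can be read this way. */
long cell_type(struct cell *c)
{
    assert(c != NULL);
    return *(long *)c;
}

/* A pointer converted to intptr_t can be compared as an integer,
 * but its value is only known at run time. */
intptr_t cell_id(struct cell *c)
{
    return (intptr_t)c;
}

/* Symbols allocated at run time are compared by pointer identity,
 * which needs an if chain rather than case labels. */
int is_symbol(struct cell *x, struct cell *symbol)
{
    return x == symbol;
}

/* A switch is fine on the integer tag, because the labels are constants. */
const char *describe(struct cell *c)
{
    switch (cell_type(c)) {
    case TPAIR:
        return "pair";
    case TSYMBOL:
        return "symbol";
    default:
        return "unknown";
    }
}
```

With this layout, `describe` dispatches on the tag, while a comparison against a particular symbol cell still has to go through `is_symbol` or `cell_id`.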
<ekaitz>look it generates symbols and then uses them as values
<ekaitz>to be able to deal with them in scheme
<ekaitz>the problem that has is it's comparing structs
<ekaitz>and that's not optimal in any case
<stikonas>or are we comparing pointers to structs?
<stikonas>ok, pointers to structs comparison would work in M2
<ekaitz>the symbols are allocated and stored in memory
<ekaitz>and then we compare against them
<ekaitz>stikonas: i have my starfive running for 1 day with no output in the terminal
<oriansj>every percent performance gained is a net win
<muurkha>sometimes, it depends on what you pay for it!
<stikonas>yeah, it's about 10 minutes on 8 year old laptop...
<stikonas>tomorrow I'll finally have riscv64 checksums
<stikonas>though maybe we should wait till mes 0.26.1...
<GoogulatorMobile>In fact, seed should make no assumptions about the content of the manifest
<stikonas>GoogulatorMobile: that's probably because variable was shared by both mes and tcc packages
<stikonas>because mes is not fully installed into /usr
<stikonas>and MES_PREFIX is path to mes build dir...
<stikonas>but I guess installing mescc requires installing a lot of scheme files
<stikonas>though we have other things that would be good to fix too
<GoogulatorMobile>muurkha: NetBurst is pretty much the beginning of what I consider viable bootstrap hardware
<muurkha>GoogulatorMobile: yeah, it might be viable!
<GoogulatorMobile>I'd consider anything with "modern" Intel ME (in the sense of always-on mini-cores running secret firmware) unsuitable
<GoogulatorMobile>AMD might be viable for a little longer - not sure when PSP was made mandatory
<stikonas>stage0-uefi is slowly getting fixed. And it's getting easier, hex1, hex2 and catm are now fixed. M0 next...
<stikonas>GoogulatorMobile: PSP runs on all AMD CPUs too for at least a decade
<stikonas>though I think PSP does not have any networking capability
<stikonas>raspberry pi has some open firmware too actually
<stikonas>though I think they only managed to bring up bare minimum
<stikonas>well, as part of bootstrapping, you write your own compilers anyway
<GoogulatorMobile>You do, but I would rather not include writing a native compiler for a largely secret and undocumented ISA that has never in its existence executed a compiler
<GoogulatorMobile>VC4 would be more than powerful enough to bootstrap on, the problem is Broadcom's attitude
<stikonas>anyway, I'm not suggesting bootstrapping on raspberry pi...
<muurkha>GoogulatorMobile: Intel ME is pretty bad, yeah
<muurkha>the VideoCore IV ISA isn't secret and undocumented
<GoogulatorMobile>Intel ME is even worse than VC4 - VC4 at least lets you load your own code without signing, if you have a way of compiling for it
<muurkha>what's secret and undocumented is the source code of the firmware they run on it
<Googulator>I can also reproduce this locally on Ubuntu 22.04 in WSL2
<Googulator>basically, it seems the bwrapped environment lacks the CAP_MKNOD capability - but then how did this work before the refactor?
<Googulator>(the step that fails is populate_device_nodes in both GitHub and WSL2)
<Googulator>ok, it seems to be an easy fix, as far as getting bwrap to work in general
<Googulator>all of the offending mknod operations were in sysb originally - bwrap/chroot just straight up skipped sysb
<Googulator>now that sysb is no more, the mknods are now attempted in bwrap, and they fail
<fossy>Googulator: don't worry about CI right now, I have plans for that
<Googulator>fossy: I read that too late - I already have it working, at least locally :)
<lrvick>Hey all. I am trying to make general purpose deterministic multi-party-signed container images that don't trust any single existing linux distribution.
I got rust bootstrapped all the way from gcc this way, and now I am going down the other path, replacing my alpine sourced "seed" gcc with a full-source-bootstrapped gcc. This rabbithole led me here.
<lrvick>My understanding is the next thing I need to do on my path to gcc is build Mes with M2-Planet.
<lrvick> > [build 7/7] RUN ["/M2-Planet","--debug","--architecture","amd64","-f","src/mes.c"]:
<lrvick>0.220 src/mes.c:35:ERROR in create_struct
<lrvick>I am assuming the source is fine and I am just skipping some steps, or missed some docs somewhere
<muurkha>lrvick: sounds like an awesome project!
<lrvick>okay, looks like maybe I can adapt the chroot path to a container
<matrix_bridge><Andrius Štikonas> bwrap is using Linux namespaces just like docker, just a bit lower level tool
<lrvick>Well, goal is that anyone can take this Dockerfile and run it from podman, buildah, docker, kaniko, or any other OCI compatible runner for maximum portability, and get the same hash
<lrvick>without having to care what their host OS is
<matrix_bridge><Andrius Štikonas> Andrius Štikonas: also you were building mes.c without any headers...
<stikonas>Irvise: yeah, but you need to adjust steps in Dockerfile...
<stikonas>and I suggest not starting with debian or alpine...
<stikonas>but start with "scratch" if you want really to bootstrap from source
<lrvick>Well I start with all three, and compare the hashes, in the stage0 one at least.
<stikonas>yeah, but if you run the right binaries, you don't need anything from host containers
<lrvick>then in the "mes" package I only use debian for downloading sources from the internet, then I pivot to a scratch container
<stikonas>yeah, you do need to download stuff...
true
<lrvick>I download with the untrusted debian container, then pivot to scratch, and check the hash there with stage0 built sha256sum
<lrvick>which seems about as optimal as I can get
<lrvick>has anyone tested if the final gcc13 produced by live-bootstrap is currently deterministic? Would love hashes to compare against
<ekaitz>GoogulatorMobile & stikonas : i sent the patch that avoids double runs to bug-mes
<ekaitz>it's just deleting one line, but it's there so you can just take it and apply it to your mes download if you want to make it faster
<lrvick>I barely have any idea what I am doing with this, so probably something dumb
<stikonas>Irvise: that script generator is very new, probably some bug in it but it could be caused by e.g. missing files?
<stikonas>lrvick: ok, building podman now, will try those dockerfiles soon
<stikonas>though even in docker scratch we depend on docker binary and also kernel
<lrvick>well you depend on -any- oci capable runner of which several exist
<lrvick>and if they all agree in a reproducible build setup, then that is very confidence inspiring
<lrvick>also OCI runners have been adapted to other kernels that provide a linux system call interface, like FreeBSD.
<stikonas>well, yeah, I'm using podman both at home and at work...
<stikonas>though pure live-bootstrap can also do kernel bootstrap on BIOS systems...
<stikonas>(we haven't got UEFI bootstrap fully working yet, but there is some work in progress)
<lrvick>yeah, so like if you used podman+linux and I used docker+freebsd then we had no code overlap in our runtimes, and then if everything is deterministic from there, great. I am optimizing for maximum diversity.
<lrvick>and hey, if the builds in my easily-CI-able bootstrap container end up matching the hashes people do by hand on baremetal, the confidence keeps stacking
<stikonas>yeah, they should match if everything is right
<stikonas>and yes, more diversity is good, that's why even live-bootstrap itself supports a few different running modes
<stikonas>I actually planned to look at docker option a couple of years ago, but then doras implemented rootless bubblewrap mode, so I didn't do any work on docker...
<lrvick>mostly my work here is motivated by the overall supply chain integrity story: pretty much every container image that powers the whole internet at this point is based on alpine which is neither signed nor reproducible, and the non-musl alternatives are worse in other ways.
<lrvick>and all my clients want to do reproducible builds of their mission critical software from a wide range of systems and get the same result, and docker is the thing everyone knows.
<lrvick>but yeah... I bootstrap all my stuff from alpine as I needed a musl gcc, so I just kicked the can further down the stack.
<stikonas>yeah, we do build musl gcc in live-bootstrap :), so you just need to port it to docker...
<lrvick>Yeah, on paper it seems like this should be a drop-in replacement for the alpine container I bootstrap the rest of my distro with.
<lrvick>other than my weird core dumps above. no clue how to debug that
<lrvick>but happy to learn if anyone gets a moment to poke it
<stikonas>my first guess is that some file is missing?
<lrvick>likely.
I am only exporting the bin/x86 and m2libc folders from my stage0 build container
<lrvick>so if there is anything obvious beyond those I need, lmk
<lrvick>trying to have only what is absolutely required in each stage to make it easier to audit
<fossy>oh hello lrvick, i recall you from hashbang
<fossy>i can't promise that live-bootstrap will be a drop in replacement, but it should be very close to ^-^
<fossy>re: reproducibility, all package checksums are in steps/SHA256SUMS.pkgs; live-bootstrap should be 100% reproducible
<stikonas>lrvick: I seem to hit some issues earlier
<stikonas>I run DOCKER_BUILDKIT=1 SOURCE_DATE_EPOCH=1 /usr/bin/docker build -t local/live:latest --build-arg REGISTRY=local --platform linux/amd64 --progress=plain src/bootstrap/live
<stikonas>and then it runs STEP 14/24: FORCE_TIMESTAMPS=False and then fails
<stikonas>I think because line endings are not escaped
<stikonas>so docker build treats RUN cat <<EOF > /rootfs/steps/bootstrap.cfg as the whole command
<lrvick>also about to pass out any moment. 6am... so may be a few hours before I can take a look.
<stikonas>you didn't start with the seed (pine cone)
<stikonas>so not much left till stage0-uefi is fixed
<stikonas>and then I probably should try to write a POSIX kernel on UEFI...
<stikonas>probably easier way forward than porting meslibc to UEFI...
<stikonas>goggles-bot: before I opened that link, I thought it will be something on one of the pine64 products...
<stikonas>by the way, how come that pine asset appears under fosslinux repo?
<lrvick>stikonas: looks like you were able to build the stage0 image though?
<lrvick>Okay, good to know that one is at least portable. So probably just my heredoc usage
<stikonas>and podman first of all complained about command argument
<lrvick>ah yeah, that is a buildkit specific thing
<stikonas>but even if I removed it, it still didn't work (I think cat <<< EOF problem...
<stikonas>as for your crash, I haven't reached it yet, but maybe worth checking if you have all the files that script-generator tries to fopen...
<stikonas>lrvick: looking at dockerfile, it seems that you copied it to /rootfs/steps
<stikonas>and script generator tries to open /steps
<lrvick>stikonas: okay redid it without a heredoc since I can't figure out why podman hates those
<lrvick>stikonas: I put everything in /rootfs in the "fetch" stage, but then it gets into the "build" stage on line 41, which dumps everything from /rootfs into /, once I am safely under a "from scratch"
<lrvick>I'll get a tree of the final filesystem. sec
<stikonas>ok, now it segfaults in script generator
<lrvick>I mean not great, but you replicated where I am
<stikonas>maybe FILE *env = fopen("/steps/env", "r"); ?
<stikonas>script-generator opens it but does not check if env is nullptr
<lrvick>I can cat /steps/env and it indeed has env vars in it
<lrvick>docker run --volume ~/.local/bin/busybox:/bin/busybox --user 0:0 --entrypoint /bin/busybox -ti local/live:latest sh -c "/bin/busybox --install -s /bin && sh"
<lrvick>use that to get a busybox shell inside the scratch container if you like
<lrvick>ACTION figures out static strace binary
<lrvick>I am not copying all of stage0 inside. only the "x86" and "M2Libc" dirs make it over. I'll test copying the kitchen sink to test, but my (likely incomplete) understanding is I only need those two dirs.
<stikonas>I would guess so (just x86 and M2libc...)
<stikonas>I don't see anything bad with openat syscalls...
<lrvick>Good to know I am not doing something super obviously dumb, though that would have led to a faster answer ^_^
<lrvick>Okay yeah, copying the whole stage0 tree over to / had no impact
<stikonas>hmm, now I ran both with -e openat,execve
<stikonas>and I don't see script generator opening anything at all
<stikonas>maybe I should do -ff to get split logs...
<stikonas>open("/steps/lwext4-1.0.0-lb1/files/fiwix-file-list.txt", O_RDONLY) = -1 ENOENT
<lrvick>that is way deeper in the tree than I have interacted with
<lrvick>other files exist in that folder
<stikonas>recently fossy refactored live-bootstrap a lot
<lrvick>Yeah, looks like I made it just in time to kick the tires on the less-trodden paths post refactor
<stikonas>and it's not mentioned in the Readme (in the pythonless section)
<lrvick>then the bug might not be my brain this time
<lrvick>in addition to suiting my own needs, this docker setup might become a useful end to end test suite...
<lrvick>no mention in the readme of making a manual bootstrap.cfg either but that one I got a useful error on so easy to work out
<stikonas>well, you can send a PR with readme updates...
<lrvick>Yeah for sure. Once a working path is confirmed :D
<lrvick>okay so I need to replicate that in shell. easy enough
<lrvick>also what strace incantation ended up being most useful for you there to find that
<lrvick>I assume you were just wrapping podman in strace -ff?
<lrvick>never tried stracing from -outside- the container
<stikonas>lrvick: I just ran strace -ff -o logs/logs
<stikonas>initially I did -f but that creates just one log with everything
<stikonas>just had to find file that corresponds to script-generator
<stikonas>my initial mistake was filtering on openat
<lrvick>I don't know that this would work with docker given the socket indirection, but not sure how podman does process forking
<stikonas>it's just that they are in a separate mount namespace
<stikonas>anyway, I ran "strace -o logs/logs -ff /usr/bin/docker build -t local/live:latest --build-arg REGISTRY=local --platform linux/amd64 --progress=plain src/bootstrap/live"
<stikonas>anyway, that's quite pleasant to debug compared to stage0-uefi...
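The crash traced above comes down to using the result of fopen without checking it for NULL. A minimal sketch of the check, in standard hosted C, could look like the following. The function `count_lines` is a hypothetical stand-in and is not the actual script-generator code; the M2-Planet C subset and M2libc may also lack `strerror`, in which case a plainer message would be needed.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines of a file, reporting a readable
 * error instead of dereferencing a NULL FILE pointer when the file is missing. */
int count_lines(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        /* fopen failed, e.g. with ENOENT as seen in the strace output */
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
        return -1;
    }

    int lines = 0;
    int ch;
    while ((ch = fgetc(f)) != EOF) {
        if (ch == '\n') {
            lines++;
        }
    }
    fclose(f);
    return lines;
}
```

A caller can then treat a negative return value as a fatal but clearly reported error, rather than segfaulting on a missing file.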
<stikonas>maybe it's possible to wire in gdb there too, but it's non-trivial
<stikonas>so I was mostly debugging that with return codes
<lrvick>actually, isn't fiwix only relevant for the live-boot use case vs baremetal or chroot?
<stikonas>lrvick: yeah, it's only relevant to baremetal bootstrap
<stikonas>lrvick: you could try fixing that crash in script-generator...
<lrvick>So then the actual issue is I should be skipping looking at fiwix at all with my config
<stikonas>i.e. print a useful error message rather than crash
<lrvick>I suck at C, but that sounds easy enough
<lrvick>Ah, I didn't want the tooling to actually try to use a chroot in the container, so my config is BARE_METAL=True, but then that tries to do fiwix and other things not relevant to me.
<lrvick>I basically need the chroot path, without actually doing a chroot pivot since the container covers that
<lrvick>so maybe creating a "container" path is the right call here.
<lrvick>going to let it try the chroot path the vanilla way and see if everything is happy, then try ripping out the actual chroot pivot with a flag
<stikonas>no, you definitely don't want BARE_METAL=true...
<stikonas>that is either for real bare metal or at the very least qemu mode
<GoogulatorMobile>By writing a draft comment, embedding an image, and clicking Preview, images can be uploaded to GitHub, and they persist (at least for some time) even after you discard the draft
<GoogulatorMobile>I meant to do that on googulator/live-bootstrap, but accidentally uploaded under fosslinux/live-bootstrap instead
<stikonas>that pine was replaced by alpine some time ago...
<GoogulatorMobile>hmm, looked more at this image upload trick, it's a lot more exploitable than I thought
<GoogulatorMobile>it's possible to upload arbitrary binaries under any public repo, not even just images
<stikonas>oriansj: you only added $ immediates to M1, not M0?