<doras>Is there a docker/oci system image (or similar) which can be bootstrapped using a set of publicly available bootstrappable components? Something projects can rely upon to build larger systems?
<stikonas>doras: live-bootstrap, although we don't have docker mode
<stikonas>doras: there is chroot mode or qemu mode
<stikonas>(one could also try running it on real hw but I don't think you are interested in that right now)
<stikonas>well, I guess there are two separate things that one can try with docker images
<stikonas>1) run live-bootstrap and then put resulting binaries into scratch docker image
<stikonas>2) create a docker image that contains live-bootstrap's starting sources and run it in docker
<stikonas>2) might hit some docker restrictions that are not present in e.g. chroot
<doras>It would be nice if we could have the resulting image of (1) publicly available somewhere.
<stikonas>yes, that might be useful, but so far nobody looked at it. If you have some time, you can try, it shouldn't be too hard
<stikonas>one of the issues is where can we run it and store results
<stikonas>there is a rootfs.py script used to download other source tarballs, so it needs python3 and python3-requests and then you need sudo and chroot to run it
<oriansj>doras: the short answer is no one has yet had a need for a docker image; so no one made one yet that I know of but it should be rather simple to make such an image.
<doras>An OSTree repo may also be useful to have.
<doras>stikonas: regarding running it, it's most likely possible to do so using CI.
<doras>I'm more familiar with GitLab's, which can definitely run QEMU and also has a docker registry where you can store the result.
<stikonas>doras: neither of the free CIs would let you run it for as long as live-bootstrap needs
<doras>How long does a usual build require?
<stikonas>live-bootstrap needs more than 2h if you run it in qemu
<stikonas>even chroot mode is close to 2h on my laptop
<stikonas>well, it might be a bit quicker if you disable guile at the end...
<doras>We're interested in making our runtime fully bootstrappable. We currently rely on our previous images to build our newer images, with yocto (I think?) being the foreign source used by the first runtime.
<stikonas>yeah, live-bootstrap can let you start from hex0
<stikonas>right now it only goes up to GCC 4.7.4 but you can build newer GCC with it
<stikonas>especially if you are not too worried about pregenerated files in GCC source
<stikonas>yeah, I think bootstrappable.org project can be a good base for freedesktop-sdk
<stikonas>but there is some gap that has to be bridged
<stikonas>first of all hex0 to gcc bootstrap only works on x86, rest of the arches have to be cross-compiled
<stikonas>non-x86 arches only have early bootstrap sorted (e.g. hex0 to simple C compiler, either M2-Planet or mescc)
<doras>Is the x86 system we have so far enough to cross-compile other architectures?
<doras>We need aarch64 at minimum, but probably ppc64le and risc-v too.
<doras>I didn't even think about x86_64, it was too obvious.
<doras>So how would it work? Say we use live-bootstrap in a CI job along with QEMU to produce the base x86 system. What next? We'll need another layer to cross compile similar base systems for other architectures using the x86 base?
<stikonas[m]>Or add hook to live bootstrap to run more stuff that you need
<oriansj>doras: well just getting from hex0 to your runtime on just x86 is a stepping stone to future progress. Which will be made in live-bootstrap as we get more architectures working
<oriansj>cross compiling will provide a short term solution
<oriansj>a single step in the right direction is more important than doing everything perfectly in a single step.
<doras>stikonas[m]: Root is an issue for CI.
<stikonas[m]>User namespaces might be an alternative to chroot but nobody tried it yet...
<fossy>i would love to see live-bootstrap used for applications such as this but be aware it is still in some state of flux, many architectural designs are not set in stone
<fossy>doras: depending on what the dependency is that uses the previous image to build the newer image, i would run live-bootstrap in QEMU to the end, then cross compile a toolchain and basic system for any other required architectures
<fossy>btw, i'm working on "packaging" each component of live-bootstrap, so that you can jump to any arbitrary point in the bootstrap from those packages
<fossy>up until perl in sysc it just uses tar & fakeroot and past that xbps
<fossy>i'm open to changing from xbps but that's just what i am most familiar with
***ChanServ sets mode: +o janneke
<doras>So why is chroot actually required? Only to have filesystem-level isolation?
<stikonas>doras: mostly that but we also mount stuff inside chroot
<stikonas>so if you want to have non-chroot mode, some extra work would be required
<doras>So does QEMU do "more" bootstrapping in comparison to the chroot method? I'm guessing you need to bootstrap a Linux user space too.
<stikonas>doras: userspace bootstrap is the same, but qemu also at some point later (after gcc) builds linux-libre kernel and boots it
<stikonas>doras: userspace parts (non-kernel) are the same
<doras>Do you need a real /dev for the bootstrap process?
<stikonas>right now kernel is a blob for bootstrap purposes...
<stikonas>because we create a qemu disk that bootstrap process mounts
<stikonas>I guess some configure scripts fail without them
<stikonas>this might actually be more than what is necessary
<stikonas>random and urandom are probably not used... Binaries are deterministic anyway
<doras>And do these have to point at the "real" /dev of the running system?
<stikonas>guile is only deterministic in qemu mode
<stikonas>doras: they are created inside bootstrap
<doras>We actually need binaries to be deterministic/reproducible for caching purposes.
<stikonas>it's just that guile non-trivially depends on the kernel
<stikonas>in qemu it's fine since we build our own kernel before that
<stikonas>anyway, guile is just built but not used...
<stikonas>there is nothing in live-bootstrap that depends on guile
<stikonas>but GNU autogen is not really bootstrappable in no-pregenerated files sense
<stikonas>so we just ended up with guile in live-bootstrap that is not used
<stikonas>it might still be useful if one wants to run GNU Guix...
<stikonas>although these are manually created and we mostly checksummed just binaries, not text scripts
<doras>I see the checksum is skipped for guile.
<stikonas>it would be good if somebody could patch out dependency on kernel...
<stikonas>in general we try to patch all non-determinism out
<stikonas>but there is always more work than people to do that work...
<stikonas>especially considering that live-bootstrap is only one of the projects here
<stikonas>(live-bootstrap scripting is mostly written by fossy and me)
<doras>I think we can use unshare(2) to allow chroot.
<stikonas>one thing that would be nice to have but is not implemented in live-bootstrap is "packages" inside it; right now we only produce the whole filesystem image with everything in it
<stikonas>so it might not be easy to take just some binaries out of it
<stikonas>although, all binaries there are static, so often it is just that 1 binary
<doras>We use BuildStream to achieve something like this, but I bet this is more difficult before you can actually run Python :)
<stikonas>and also PREFIX variable is supported, so e.g. make install PREFIX=/staging can be used
<stikonas>at this point python should be buildable in live-bootstrap
<doras>We basically need to be able to reach modern GCC, Python, and everything else you'd expect from a minimal but modern development SDK and runtime environment.
<stikonas>well, at this stage live-bootstrap ends somewhere where you might start building linux from scratch
<stikonas>I think these packages should be buildable
<doras>Well, a previous "minimal" image called "PreBootstrap".
<stikonas>some are already built, but not everything
<stikonas>you'll probably want to start by re-building toolchain using live-bootstrap image
<stikonas>(although gcc 11 can't be directly built, have to build some earlier gcc first, maybe 10)
<stikonas>ok, we mostly have these pre-bootstrap.bst packages
<stikonas>with the exception of last 3 (python, rsync and go)
<doras>Go comes from a binary at the moment. We can ignore it for now.
<stikonas>well, go is not hard to bootstrap once you have recent gcc
<stikonas>well, that one is in principle bootstrappable (with mrustc) but also takes time
<doras>I have a strange question. I want to run live-bootstrap locally to see what I get, but it feels... risky.
<doras>Do I need to worry about running it in my personal system?
<doras>I usually build things sandboxed.
<stikonas>chroot and qemu modes should give enough sandboxing
<stikonas>by baremetal I mean take the created initramfs and boot it
<stikonas>then it might be dangerous if you reformat the wrong drive
<doras>Yes, I won't be doing that ;)
<stikonas>and we create temporary mounts for chroot mode
<stikonas>so after you exit, nothing should be modified
<doras>So where should I expect to see the build products?
<doras>We'll need a sort of "install" mode for this.
<doras>So the end result of running rootfs.py would be a directory containing the sysroot that was bootstrapped.
<stikonas>doras: right now live-bootstrap finishes and starts interactive bash
<stikonas>so build products are in those temp directories
<doras>Temp like tmpfs or just directories you create wherever rootfs.py is executed?
<doras>I wonder if we can do that without root...
<stikonas>and for "install" mode we might want to add a post-live-bootstrap hook
<stikonas>just like stage0-posix has that hook that live-bootstrap replaces to run itself on top of stage0-posix
<stikonas>after that tarballs are used (stage0-posix contains simple untar and ungz implementations)
<fossy>You can assume reasonable safety from live-bootstrap
<fossy>Chroot has a small possibility that we might overwrite a disk somehow, but I can't think how that would happen
<doras>Where can I see the build products when running in QEMU?
<doras>Does it create a disk image or something?
<stikonas>chroot mode does not create device nodes for hard drives
<stikonas>doras: there is hard drive image created in sysc stage
<stikonas>getting stuff out of qemu is indeed tricky...
<stikonas>it's probably easier now that we create disk image in sysc stage
<doras>It seems creating tmpfs mounts requires some elevated permissions.
<stikonas>doras: well, you can try without tmpfs mounts
<stikonas>oh, but right now chroot mode does recursive chroot later
<fossy>Yes, you can see the hard drive image
<stikonas>so it's not completely trivial to get rid of root
<stikonas>if running in docker as root is a possibility, you can try that
<stikonas>but we don't have any scripting yet for that
<doras>I see it also downloads some sources.
<stikonas>doras: yes, all those packages that it compiles
<doras>The Internet is not accessible during our builds, only during the initial source fetch.
<stikonas>yes, rootfs.py fetches everything after stage0-posix as tarball
<doras>The definition of "source fetch" is based on the BuildStream plugin, but generally for git repositories it means the repository itself and its submodules.
<stikonas>well, those tarballs are fetched only once and are later cached
<stikonas>maybe one can add command line argument to rootfs.py to do --fetch-only
<stikonas>oh, buildstream plugin wouldn't be able to run rootfs.py even if we add --fetch-only?
<stikonas>doras: although, that should be workaroundable I guess...
<stikonas>sources directory is cached, rootfs.py does not have to download every time
<stikonas>if files are there and have the right checksum, it will happily use them
<doras>If we were to run live-bootstrap in CI, we can't have it fetch sources "at build time".
<stikonas>well, then we somehow need to prefetch it before build time
<doras>So we'll either need to write a plugin just for this that first does "--fetch-only", or provide the required files ourselves.
<doras>The former being kind of an overkill.
<stikonas>although file list might change if you move to another commit in live-bootstrap
<doras>Do you checksum the fetched files?
<stikonas>doras: although, to be more precise, only rootfs.py checks hash of fetched files
<stikonas>the file with checksums is already copied into rootfs
<doras>You mean it's checked only when initially fetched, or?
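The caching behaviour described above (reuse a file in the sources directory when it is present and has the right checksum, fetch it otherwise) could be sketched roughly as follows. This is not rootfs.py's actual code: the function names `sha256_of` and `needs_fetch` and the file-name-to-digest mapping are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_fetch(cache_dir: Path, expected: dict[str, str]) -> list[str]:
    """List file names that are missing from the cache or fail their checksum.

    `expected` maps a file name (hypothetical layout) to its SHA-256 hex digest.
    """
    stale = []
    for name, want in expected.items():
        candidate = cache_dir / name
        if not candidate.is_file() or sha256_of(candidate) != want:
            stale.append(name)
    return stale
```

A prefetch step run while the network is still available could download only the names returned by `needs_fetch`, and the offline build could then insist that the same call returns an empty list before it starts.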
<oriansj>doras: we also build sha256sum before we even use the tars, so we could check prior to use as well
<stikonas>we build (custom) sha256sum very early in bootstrap (in stage0-posix), so we could use that but right now it's only used to check compiled binaries
<stikonas>anyway, it's probably sufficient to check outside live-bootstrap
<doras>So rootfs.py mostly prepares the environment using host tools, and when it's done we jump into bootstrap mode where we're only allowed to use what we build?
<oriansj>doras: only use what sources we have downloaded
<doras>Also, BuildStream has a built-in source cache, so it's better to do the fetching ourselves anyway. Sources can then be cached and accessible to different CI builders, etc.
<doras>So maybe we'll need a plugin after all, and some kind of manifest at live-bootstrap's level.
<stikonas>well, strictly speaking root seed is not just hex0 but also kaem-optional-seed
<stikonas>kaem-optional-seed is basically trivial shell
<stikonas>it reads list of commands from the file and runs them
<doras>BuildStream also verifies the checksum of every source it fetches, so a manifest would contain both the source location and expected checksum.
<stikonas>hmm, I wonder if manifest can be automatically created
<stikonas>but parsing them without python is probably too complicated
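As a rough illustration of the manifest idea (source location plus expected checksum), here is a minimal sketch that generates one from tarballs that have already been fetched. Nothing here reflects an existing live-bootstrap or BuildStream format: the JSON layout, the field names `url` and `sha256`, and the functions `build_manifest` and `dump_manifest` are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(sources: dict[str, Path]) -> list[dict[str, str]]:
    """Pair each source URL with the SHA-256 of its already-fetched local copy.

    `sources` maps a URL to the local path of the downloaded file.
    Entries are sorted by URL so the manifest is stable between runs.
    """
    manifest = []
    for url in sorted(sources):
        digest = hashlib.sha256(sources[url].read_bytes()).hexdigest()
        manifest.append({"url": url, "sha256": digest})
    return manifest


def dump_manifest(manifest: list[dict[str, str]]) -> str:
    """Serialise the manifest as deterministic, human-readable JSON."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Sorting the entries and the keys keeps the output byte-for-byte stable, which matters if the manifest itself is checked into a repository or cached by checksum.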