IRC channel logs

2021-12-24.log

<doras>Is there a docker/oci system image (or similar) which can be bootstrapped using a set of publicly available bootstrappable components? Something projects can rely upon to build larger systems?
<stikonas>doras: live-bootstrap, although we don't have docker mode
<stikonas>doras: there is chroot mode or qemu mode
<stikonas>(one could also try running it on real hw but I don't think you are interested in that right now)
<stikonas>doras: https://github.com/fosslinux/live-bootstrap/
<stikonas>well, I guess there are two separate things that one can try with docker images
<stikonas>1) run live-bootstrap and then put resulting binaries into scratch docker image
<stikonas>2) create a docker image that contains live-bootstrap's starting sources and run it in docker
<stikonas>the former is probably simpler
<stikonas>2. might hit some docker restrictions that are not present in e.g. chroot
<doras>It would be nice if we could have the resulting image of (1) publicly available somewhere.
<stikonas>yes, that might be useful, but so far nobody looked at it. If you have some time, you can try, it shouldn't be too hard
<stikonas>one of the issues is where can we run it and store results
<stikonas>running itself is actually not hard
<stikonas>the dependencies are fairly minimal
<stikonas>there is a rootfs.py script used to download other source tarballs, so it needs python3 and python3-requests, and then you need sudo and chroot to run it
<stikonas>qemu mode also needs qemu and cpio
<stikonas>(and pre-built 32-bit kernel)
<oriansj>doras: the short answer is no one has yet had a need for a docker image; so no one made one yet that I know of but it should be rather simple to make such an image.
<doras>I see.
<doras>An OSTree repo may also be useful to have.
<doras>stikonas: regarding running it, it's most likely possible to do so using CI.
<doras>I'm more familiar with GitLab's, which can definitely run QEMU and also has a docker registry where you can store the result.
<doras>We use it in the freedesktop-sdk project (https://gitlab.com/freedesktop-sdk/freedesktop-sdk).
<doras>You can see the source of our Docker-generating CI here: https://gitlab.com/freedesktop-sdk/infrastructure/freedesktop-sdk-docker-images
<stikonas>doras: neither of the free CI's would let you run it for as long as live-bootstrap needs
<doras>How long does a usual build require?
<stikonas>live-bootstrap needs more than 2h if you run it in qemu
<stikonas>even chroot mode is close to 2h on my laptop
<stikonas>well, it might be a bit quicker if you disable guile at the end...
<stikonas>guile takes quite a long time
<doras>We're interested in making our runtime fully bootstrappable. We currently rely on our previous images to build our newer images, with yocto (I think?) being the foreign source used by the first runtime.
<stikonas>yeah, live-bootstrap can let you start from hex0
<stikonas>right now it only goes up to GCC 4.7.4 but you can build newer GCC with it
<stikonas>especially if you are not too worried about pregenerated files in GCC source
<stikonas>yeah, I think bootstrappable.org project can be a good base for freedesktop-sdk
<stikonas>but there is some gap that has to be bridged
<stikonas>first of all hex0 to gcc bootstrap only works on x86, rest of the arches have to be cross-compiled
<stikonas>non-x86 arches only have early bootstrap sorted (e.g. hex0 to simple C compiler, either M2-Planet or mescc)
<stikonas>but nothing else goes to TCC
<doras>Is the x86 system we have so far enough to cross-compile other architectures?
<doras>We need aarch64 at minimum, but probably ppc64le and risc-v too.
<stikonas[m]>Yes, GCC on x86 can compile others
<stikonas[m]>Also you need amd64
<stikonas[m]>But it can run x86 code natively
<doras>I didn't even think about x86_64, it was too obvious.
<doras>So how would it work? Say we use live-bootstrap in a CI job along with QEMU to produce the base x86 system. What next? We'll need another layer to cross compile similar base systems for other architectures using the x86 base?
<stikonas[m]>x86_64 might be the easiest next arch to build
<stikonas[m]>Yes I think you produce base x86 image
<stikonas[m]>Then use another layer to cross-build
<stikonas[m]>Or add hook to live bootstrap to run more stuff that you need
<stikonas[m]>We should add something like https://github.com/oriansj/stage0-posix/blob/master/after.kaem to live-bootstrap
<stikonas[m]>I.e. empty shell file that is run at the end
<stikonas[m]>doras: qemu is actually not required
<stikonas[m]>Can just run in chroot if your hw is x86_64
<oriansj>doras: well just getting from hex0 to your runtime on just x86 is a stepping stone to future progress. Which will be made in live-bootstrap as we get more architectures working
<oriansj>cross compiling will provide a short term solution
<oriansj>a single step in the right direction is more important than doing everything perfectly in a single step.
<doras>stikonas[m]: Root is an issue for CI.
<stikonas[m]>Then qemu is the way to go...
<stikonas[m]>If you can get nested KVM on your worker
<stikonas[m]>Without KVM it would be too slow
<doras>KVM works.
<stikonas[m]>User namespaces might be alternative to chroot but nobody tried it yet...
<stikonas[m]>I.e. run live-bootstrap in podman
<fossy>i would love to see live-bootstrap used for applications such as this but be aware it is still in some state of flux, many architectural designs are not set in stone
<fossy>doras: depending on what the dependency is that uses the previous image to build the newer image, i would run live-bootstrap in QEMU to the end, then cross compile a toolchain and basic system for any other required architectures
<stikonas[m]>fossy: maybe worth making a release?
<fossy>stikonas[m]: hm, maybe
<stikonas[m]>At least git tag...
<stikonas[m]>We do have a working C/C++ compiler
<fossy>btw, i'm working on "packaging" each component of live-bootstrap, so that you can jump to any arbitrary point in the bootstrap from those packages
<fossy>to make development easier
<stikonas[m]>fossy: oh that's nice
<stikonas[m]>I think bauen tried it but abandoned it
<fossy>up until perl in sysc it just uses tar & fakeroot and past that xbps
<fossy>i'm open to changing from xbps but that's just what i am most familiar with
<stikonas[m]>Oh OK, it's slightly different approach
***ChanServ sets mode: +o janneke
<doras>So why is chroot actually required? Only to have filesystem-level isolation?
<stikonas>doras: mostly that but we also mount stuff inside chroot
<stikonas>e.g /tmp, /dev/, etc...
<stikonas>so if you want to have non-chroot mode, some extra work would be required
<doras>So does QEMU do "more" bootstrapping in comparison to the chroot method? I'm guessing you need to bootstrap a Linux user space too.
<stikonas>doras: userspace bootstrap is the same, but qemu also at some point later (after gcc) builds linux-libre kernel and boots it
<stikonas>chroot mode completely skips it
<stikonas>doras: userspace parts (non-kernel) are the same
<stikonas> https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst
<stikonas>71-74 are skipped in chroot mode
<doras>Do you need a real /dev for the bootstrap process?
<stikonas>right now kernel is a blob for bootstrap purposes...
<stikonas>doras: qemu mode definitely needs it
<stikonas>because we create a qemu disk that bootstrap process mounts
<doras>But for chroot?
<stikonas>let me check
<stikonas>I don't remember exactly
<stikonas>doras: we create these nodes https://github.com/fosslinux/live-bootstrap/blob/master/sysglobal/helpers.sh#L170
<stikonas>I guess some configure scripts fail without them
<stikonas>this might actually be more than what is necessary
<stikonas>but at least some are needed
<stikonas>random and urandom are probably not used... Binaries are deterministic anyway
<stikonas>well, maybe with the exception of guile
<doras>And do these have to point at the "real" /dev of the running system?
<stikonas>guile is only deterministic in qemu mode
<stikonas>doras: they are created inside bootstrap
<stikonas>so it's kind of a copy
<doras>I see.
<stikonas>we have mknode inside live-bootstrap
<stikonas>mknod
<doras>We actually need binaries to be deterministic/reproducible for caching purposes.
<stikonas>well, all binaries should be checksummed
<stikonas>it's just that guile non-trivially depends on the kernel
<stikonas>in qemu it's fine since we build our own kernel before that
<stikonas>in chroot mode it's system kernel
<stikonas>anyway, guile is just built but not used...
<stikonas>there is nothing in live-bootstrap that depends on guile
<stikonas>initially it was built so that autogen could be bootstrapped (to regenerate pre-generated https://raw.githubusercontent.com/gcc-mirror/gcc/master/Makefile.in)
<stikonas>but GNU autogen is not really bootstrappable in no-pregenerated files sense
<stikonas>so we just ended up with guile in live-bootstrap that is not used
<stikonas>it might still be useful if one wants to run GNU Guix...
<stikonas>anyway, each component in live-bootstrap should have checksum file such as https://github.com/fosslinux/live-bootstrap/blob/master/sysa/gcc-4.0.4/checksums/pass2
<stikonas>although these are manually created and we mostly checksummed just binaries, not text scripts
<stikonas>still, those should be deterministic
<doras>I see the checksum is skipped for guile.
<stikonas>doras: in chroot mode
<doras>Yes
<stikonas>it would be good if somebody could patch out dependency on kernel...
<stikonas>but in qemu mode we are fine
<stikonas>in general we try to patch all non-determinism out
<stikonas>e.g. binutils 2.14 was patched to not include timestamps https://github.com/fosslinux/live-bootstrap/blob/master/sysa/binutils-2.14/patches/deterministic_binutils.patch
<stikonas>but there is always more work than people to do that work...
<stikonas>especially considering that live-bootstrap is only one of the projects here
<stikonas>(live-bootstrap scripting is mostly written by fossy and me)
<doras>I think we can use ushare(2) to allow chroot.
<doras>unshare(2)*
<stikonas>one thing that would be nice to have but is not implemented in live-bootstrap is "packages" inside it; right now we only produce the whole filesystem image with everything in it
<stikonas>so it might not be easy to take just some binaries out of it
<stikonas>although, all binaries there are static, so often it is just that 1 binary
<doras>We use BuildStream to achieve something like this, but I bet this is more difficult before you can actually run Python :)
<stikonas>indeed
<stikonas>there are other ways but it needs work
<stikonas>we do have perl
<stikonas>and also PREFIX variable is supported, so e.g. make install PREFIX=/staging can be used
<stikonas>to create individual packages inside
<stikonas>and then maybe use stow for symlinks
<stikonas>at this point python should be buildable in live-bootstrap
<stikonas>fossy did have a quick look at it
<stikonas>need to build older python2 first...
<doras>We basically need to be able to reach modern GCC, Python, and everything else you'd expect from a minimal but modern development SDK and runtime environment.
<doras>We currently bootstrap the following using our previous release image: https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/bootstrap/bootstrap.bst
<stikonas>well, at this stage live-bootstrap ends somewhere where you might start building linux from scratch
<stikonas>I think these packages should be buildable
<doras>Well, a previous "minimal" image called "PreBootstrap".
<stikonas>some are already built, but not everything
<doras>Which contains, in addition to the above, the following: https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/pre-bootstrap.bst
<stikonas>you'll probably want to start by re-building toolchain using live-bootstrap image
<stikonas>i.e. glibc/gcc
<stikonas>(although gcc 11 can't be directly built, have to build some earlier gcc first, maybe 10)
<stikonas>ok, we mostly have these pre-bootstrap.bst packages
<stikonas>with the exception of last 3 (python, rsync and go)
<doras>Go comes from a binary at the moment. We can ignore it for now.
<stikonas>well, go is not hard to bootstrap once you have recent gcc
<stikonas>or via go 1.4...
<stikonas>(which is buildable with older gcc too)
<stikonas>doras: I guess you use rust binary too?
<doras>Yes
<stikonas>for librsvg...
<stikonas>well, that one is in principle bootstrappable (with mrustc) but also takes time
<doras>I have a strange question. I want to run live-bootstrap locally to see what I get, but it feels... risky.
<doras>Do I need to worry about running it in my personal system?
<doras>I usually build things sandboxed.
<stikonas>doras: no, it should be fine
<stikonas>unless you actually run it on baremetal
<doras>I'll try the chroot method.
<stikonas>chroot and qemu modes should give enough sandboxing
<stikonas>by baremetal I mean take the created initramfs and boot it
<stikonas>then it might be dangerous if you reformat the wrong drive
<doras>Yes, I won't be doing that ;)
<stikonas>and we create temporary mounts for chroot mode
<stikonas>so after you exit, nothing should be modified
<doras>So where should I expect to see the build products?
<doras>We'll need a sort of "install" mode for this.
<doras>So the end result of running rootfs.py would be a directory containing the sysroot that was bootstrapped.
<stikonas>doras: right now live-bootstrap finishes and starts interactive bash
<stikonas>so build products are in those temp directories
<doras>Temp like tmpfs or just directories you create wherever rootfs.py is executed?
<stikonas>rootfs.py uses actual tmpfs
<stikonas> https://github.com/fosslinux/live-bootstrap/blob/master/lib/sysgeneral.py#L40
<stikonas>mostly to not mess up your system...
<stikonas>but one can always add other modes...
<doras>I wonder if we can do that without root...
<stikonas>and for "install" mode we might want to add a post-live-bootstrap hook
<doras>I'll try.
<stikonas>just like stage0-posix has that hook that live-bootstrap replaces to run itself on top of stage0-posix
<stikonas>stage0-posix are first steps that go from hex0 to simple self-hosting subset of C compiler: https://github.com/oriansj/stage0-posix
<stikonas>it's a git submodule in live-bootstrap
<stikonas>after that tarballs are used (stage0-posix contains simple untar and ungz implementations)
<fossy>You can assume reasonable safety from live-bootstrap
<fossy>Especially QEMU mode
<fossy>Chroot has a small possibility that we might overwrite a disk somehow, but I can't think how that would happen
<stikonas>I don't see it happening either...
<doras>Where can I see the build products when running in QEMU?
<doras>Does it create a disk image or something?
<stikonas>chroot mode does not create device nodes for hard drives
<stikonas>doras: there is hard drive image created in sysc stage
<stikonas>before that it's all in RAM
<stikonas>getting stuff out of qemu is indeed tricky...
<stikonas>it's probably easier now that we create disk image in sysc stage
<doras>It seems creating tmpfs mounts requires some elevated permissions.
<stikonas>doras: well, you can try without tmpfs mounts
<stikonas>just create directories
<stikonas>oh, but right now chroot mode does recursive chroot later
<fossy>Yes, you can see the hard drive image
<stikonas>it chroots from sysa into sysc
<stikonas>so it's not completely trivial to get rid of root
<stikonas>if running in docker as root is a possibility, you can try that
<stikonas>but we don't have any scripting yet for that
<doras>I see it also downloads some sources.
<stikonas>doras: yes, all those packages that it compiles
<doras>The Internet is not accessible during our builds, only during the initial source fetch.
<stikonas>yes, rootfs.py fetches everything after stage0-posix as tarball
<doras>The definition of "source fetch" is based on the BuildStream plugin, but generally for git repositories it means the repository itself and its submodules.
<stikonas>well, those tarballs are fetched only once and are later cached
<stikonas>maybe one can add command line argument to rootfs.py to do --fetch-only
<stikonas>oh, buildstream plugin wouldn't be able to run rootfs.py even if we add --fetch-only?
<stikonas>doras: although, that should be workaroundable I guess...
<stikonas>sources directory is cached, rootfs.py does not have to download every time
<stikonas>if files are there and have the right checksum, it will happily use them
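The cache behaviour described above (reuse a tarball only if it is present and its checksum matches) could be sketched roughly as follows. This is an illustrative sketch, not the code in rootfs.py or sysgeneral.py; the names `sha256_of` and `is_cached` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_cached(path: Path, expected_sha256: str) -> bool:
    """True if the file exists and matches the expected checksum.

    Hypothetical helper: a caller would skip the download when this
    returns True and (re)fetch the tarball otherwise.
    """
    return path.is_file() and sha256_of(path) == expected_sha256.lower()
```

A pre-populated sources directory (for example one filled by a CI source-fetch step) would pass such a check, so no network access would be needed at build time.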
<doras>If we were to run live-bootstrap in CI, we can't have it fetch sources "at build time".
<stikonas>well, then we somehow need to prefetch it before build time
<doras>So we'll either need to write a plugin just for this that first does "--fetch-only", or provide the required files ourselves.
<doras>The former being kind of an overkill.
<stikonas>yes, the latter is probably simpler
<stikonas>although file list might change if you move to another commit in live-bootstrap
<stikonas>s/might/will often/
<doras>Do you checksum the fetched files?
<stikonas>doras: yes
<stikonas>here: https://github.com/fosslinux/live-bootstrap/blob/master/lib/sysgeneral.py#L47
<stikonas>doras: although, to be more precise, only rootfs.py checks hash of fetched files
<stikonas>we didn't do that inside bootstrap
<stikonas>although we could add that
<stikonas>the file with checksums is already copied into rootfs
<doras>You mean it's checked only when initially fetched, or?
<stikonas>doras: yes
<stikonas>with python that is on your host system
<oriansj>doras: we also build sha256sum before we even use the tars, so we could check prior to use as well
<stikonas>we build (custom) sha256sum very early in bootstrap (in stage0-posix), so we could use that but right now it's only used to check compiled binaries
<stikonas>anyway, it's probably sufficient to check outside live-bootstrap
<doras>So rootfs.py mostly prepares the environment using host tools, and when it's done we jump into bootstrap mode where we're only allowed to use what we build?
<oriansj>doras: only use what sources we have downloaded
<oriansj>and our root seed
<doras>I see.
<doras>Also, BuildStream has a built-in source cache, so it's better to do the fetching ourselves anyway. Sources can then be cached and accessible to different CI builders, etc.
<doras>So maybe we'll need a plugin after all, and some kind of manifest at live-bootstrap's level.
<stikonas>well, strictly speaking root seed is not just hex0 but also kaem-optional-seed
<stikonas>but yes we use just those and sources
<stikonas>kaem-optional-seed is basically trivial shell
<stikonas>it reads list of commands from the file and runs them
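As a rough illustration of "reads a list of commands from a file and runs them", here is a minimal Python sketch. It is not how kaem-optional-seed is implemented (that is a tiny hex0 program); skipping blank and `#` lines, and stopping at the first failing command, are assumptions of this sketch, and `run_command_file` is a hypothetical name.

```python
import shlex
import subprocess


def run_command_file(path):
    """Run each command listed in a file, one per line, in order.

    Assumptions of this sketch: blank lines and lines starting with '#'
    are skipped, and execution stops at the first command that fails.
    Returns the list of argument vectors that were executed.
    """
    executed = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            argv = shlex.split(line)
            # check=True raises CalledProcessError on a non-zero exit status
            subprocess.run(argv, check=True)
            executed.append(argv)
    return executed
```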
<doras>BuildStream also verifies the checksum of every source it fetches, so a manifest would contain both the source location and expected checksum.
<stikonas>hmm, I wonder if manifest can be automatically created
<stikonas>everything is in sys*.py files
<stikonas>but parsing them without python is probably too complicated
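A manifest of the kind discussed (source location plus expected checksum) might be generated along these lines. The JSON layout, the `build_manifest` name, and the idea of passing in `(url, sha256)` pairs are all assumptions for illustration; live-bootstrap does not define such a format in this log.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlparse


def build_manifest(sources):
    """Build a JSON manifest from an iterable of (url, sha256) pairs.

    Hypothetical format: each entry records the URL, the file name taken
    from the URL path, and the expected SHA-256. Entries are sorted by
    file name so the output is stable between runs.
    """
    entries = []
    for url, sha256 in sources:
        filename = PurePosixPath(urlparse(url).path).name
        entries.append({"url": url, "filename": filename, "sha256": sha256})
    entries.sort(key=lambda entry: entry["filename"])
    return json.dumps({"sources": entries}, indent=2, sort_keys=True)
```

An external fetcher (such as a BuildStream source plugin) could then read the manifest, download each URL, and verify the checksum without running rootfs.py itself.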