IRC channel logs

2025-01-23.log


<matrix_bridge><Lance Vick> we are simply trying to find the shortest path from hex0 to gcc13
<matrix_bridge><Lance Vick> kernel can/should be a separate path
<matrix_bridge><Lance Vick> imo
<aggi>ok, along the way i suspect a fiwix (or linux abi) were necessary, one way or another
<matrix_bridge><Lance Vick> I have long dropped fiwix
<matrix_bridge><Lance Vick> now trying to clean up all the other bits I don't actually need or could work around
<fossy>kernel kinda already is a separate path (KERNEL_BOOTSTRAP flag), but granted things are built that are not required for the non-kernel bootstrap
<matrix_bridge><Lance Vick> Yeah like nuking all the python builds saved us a bunch of time
<matrix_bridge><Lance Vick> looking for any other wins like that
<fossy>yep, gotcha
<matrix_bridge><Lance Vick> we ask a lot of people to reproduce our bootstrap images, and that currently takes > 1 day
<matrix_bridge><Lance Vick> so anything we can do to cut that down will encourage more people to reproduce
<aggi>ok, the assumption then is an existing linux-runtime is out of scope when concerned with arriving at any gcc
<fossy>in terms of speed, trying to build GCC 10 without autogen would be very beneficial (removes autogen and guile from the path)
<matrix_bridge><Lance Vick> For our use case it is about going from a "from scratch" hex0 container, all the way up to gcc13 x86, and then to a bunch of cross compilers, and then to native packages, which can then replace every alpine/arch/fedora/debian container in the wild and dramatically improve supply chain security.
<matrix_bridge>It is assumed people are running this from a variety of different container runtimes and kernels
<fossy>possibly, you don't even require GCC 13
<fossy>and could deal with GCC 10
<aggi>while i was thinking still a _complete_ development host with a clean bootstrapping path towards a linux2.4 with SMP was a desirable baseline for just that, bootstrapping gcc/binutils/gnu based upon that
<fossy>aggi: lrvick's stagex doesn't care about some of the things that live-bootstrap does, such as the kernel bootstrap (which is valid for their case)
<fossy>they start with a Linux host running some kind of OCI containerisation
<matrix_bridge><Lance Vick> fossy: actually that is a good point. If we can pivot from x86 gcc10 to x86-x86_64 cross gcc that would save some time
<matrix_bridge><Lance Vick> provided gcc10 can build cross gcc13
<matrix_bridge><Lance Vick> but also our goal is to get clang/llvm as early as possible as well and switch to that
<fossy>what's your (i think you call it stage1?) final toolchain?
<fossy>llvm/musl
<fossy>?
<matrix_bridge><Lance Vick> in our model: stage0 is up to M2-Planet, stage1 is up to a modern x86 c compiler, stage2 is cross compilers, and stage3 are native compilers
<matrix_bridge><Lance Vick> by stage3 we want to be clang/musl
<matrix_bridge><Lance Vick> if we can drop any steps in the path to that, amazing
<fossy>okayyy
<matrix_bridge><Lance Vick> currently stage2 is gcc13 cross compilers built from live-bootstrap x86 gcc 13
<fossy>can gcc build cross clang?
<matrix_bridge><Lance Vick> llvm/clang in theory are always cross by design, and we can build modern llvm/clang directly from gcc13
<matrix_bridge><Lance Vick> but paths earlier than that, I don't know
<fossy>hmmmmmmm
<matrix_bridge><Lance Vick> being able to have a single llvm/clang will be -much- cheaper than having to build gcc13 several times for each arch though
<matrix_bridge><Lance Vick> so if we could somehow pivot to llvm/clang in stage2 that would be a massive win
<fossy>has that been attempted yet?
<matrix_bridge><Lance Vick> nope! currently focused on removing lots of steps that cost us time
<matrix_bridge><Lance Vick> so we can experiment faster
<fossy>yeppp gotcha
<matrix_bridge><Lance Vick> we are currently trying to multiarch our distro via a native multiarch stage3 with gcc13, then build our actual clang/llvm packages in core, then build the rest of our distro.
<matrix_bridge><Lance Vick> once that works end to end as a baseline, then we can go make another aggressive pass at bootstrap to try to get clang/llvm as early as possible
<matrix_bridge><Lance Vick> maybe even in stage1
<matrix_bridge><Lance Vick> as it would be a massive timesaver
<matrix_bridge><cosinusoidally> What's the benefit of uclibc over musl? Could musl be built with just tcc + a kaem script? If musl could be built that way then in theory it could be moved before bash.
<fossy>^ this is also feasible i think
<matrix_bridge><Lance Vick> That would be a much bigger win
<matrix_bridge><Lance Vick> Also just found this: https://github.com/ZilchOS/bootstrap-from-tcc
<matrix_bridge><Lance Vick> so this goes from tcc to busybox
<matrix_bridge><Lance Vick> then to gcc4 gcc10 and then clang
<matrix_bridge><Lance Vick> he goes from tinycc to musl to busybox with a single c file run by tinycc
<matrix_bridge><Lance Vick> https://github.com/ZilchOS/bootstrap-from-tcc/blob/main/recipes/1-stage1.c
<matrix_bridge><Lance Vick> I can't tell if this is insane or amazing. both?
<matrix_bridge><Lance Vick> So in his case he has no libc, so he starts with -just- tinycc as his seed and basically shoehorns in the basic libc functions he needs into this one c file
<fossy>yeah
<matrix_bridge><Lance Vick> wild
<fossy>both insane and amazing, yes haha
<fossy>x86_64 as well, hm
<matrix_bridge><Lance Vick> but since we actually have M2Libc by this point, this could be cleaned up and made much easier to read
<fossy>bits from here could absolutely be lifted to do musl earlier
<matrix_bridge><Lance Vick> well, by this point we have mes libc
<matrix_bridge><Lance Vick> yeah that is what I am thinking
<fossy>i wouldn't keep it in C, because we have no reason to, but that might be cool
<fossy>hmmmmmm
<fossy>thanks for posting this
<matrix_bridge><Lance Vick> open source everything. It will help randos you don't know years from now with problems you never thought about
<matrix_bridge><Lance Vick> but yeah being able to pivot from mes to musl/busybox right away would be a major unlock I think
<matrix_bridge><Lance Vick> I don't think we need to keep any of this code, but someone working out the steps saves a lot of time
<fossy>well, a lot of it past that first musl/busybox bit is of limited utility (imo), as they don't care about pregened files + it's actually roughly the same minus utils
<fossy>however it would make the start wayyy cleaner
<fossy>which i am 100% for
<matrix_bridge><Lance Vick> Exactly. Can throw a ton of awkward early steps out the window
<matrix_bridge><Lance Vick> I am glad my rabbit hole researching prior art pays off every once in a while. ha ha
<fossy>the web of dependencies is complicated. there are always bits and pieces that are really helpful :D
<fossy>and you very well may have found one
<matrix_bridge><Lance Vick> bootstrapping is a fun archeology game
<matrix_bridge><Lance Vick> Red Hat employee. Huh.
<matrix_bridge><Lance Vick> Should try to lure him over here
<stikonas>fossy: what's wrong with our early steps though?
<stikonas>between tcc and musl there aren't that many steps anyway and they are fairly fast
<stikonas>we had to write some custom Makefiles there but that actually helps a lot on other arches
<stikonas>well, it's mostly because of tcc that we go there fast
<stikonas>live-bootstrap goes slowly during mes stage, then is fairly fast and gets slower once we reach GCC4
<fossy>stikonas: nothing "wrong" imo, only that it would be cleaner i think
<fossy>yes, no benefits from a speed standpoint
<fossy>s/no/minimal
<fossy>we could (probably) replace patch, gzip, tar, sed, bzip2, coreutils, bash, grep, diffutils, which and gawk; and maybe curl & dhcpcd
<fossy>for at least some portion of the bootstrap
<stikonas>well, gzip tar and bzip2 are now optional
<stikonas>since mescc-tools-extra can unpack sources
<fossy>mm, true
<fossy>would be interesting to see how long it can last
<fossy>but you make a good point that i don't think lrvick will see much of a speed increase if this were implemented
<stikonas>though if you want them to last longer, perhaps rebuilding mescc-tools-extra with tcc would be needed
<stikonas>GCC cross-compilation can eventually be eliminated by porting live-bootstrap to x86_64...
<stikonas>but at the moment tcc-mes is buggy and is not able to fully build meslibc
<stikonas>(it can build some of the files)
<stikonas>and I don't think there was any progress there for about a year or so
<fossy>mes->tcc is a very awkward step :(
<stikonas>well, it's scheme
<stikonas>we should at some point try to update nyacc...
<stikonas>but right now we use Googulator's fork with patches to regenerate some files
<matrix_bridge><Lance Vick> > between tcc and musl there aren't that many steps anyway and they are fairly fast
<matrix_bridge>I am actually optimizing for two things
<matrix_bridge>1. minimal code, as ultimately for some of my use cases I am going to have to budget/coordinate an audit of everything in the bootstrap chain. Every step we skip, even if fast, is less code malware can hide in
<matrix_bridge>2. build speed, to justify as many people as possible being able to reproduce on a very diverse range of (even low spec) hardware
<fossy>in the (idealistic, maybe infeasible) long term, i would love to do M2-Planet->(something that doesn't exist yet)->gcc 4
<fossy>and skip mes + tinycc entirely
<fossy>stikonas: yeah, the diff doesn't look huge for Googulator's fork if i am observing this correctly?
<fossy>should be somewhat straightforward?
<stikonas>yeah, I think so
<stikonas>well, ideally it's upstreamed
<stikonas>but perhaps we need to test it with newer nyacc first
<stikonas>mes should support much newer versions these days
<fossy>yes, upstreaming would be good
<stikonas>anyway, M2-Planet is also getting better
<stikonas>it might be not that far behind mescc...
<stikonas>it's just that mescc was debugged specifically to build tinycc
<fossy>mmmm
<stikonas>tinycc is not fully C compatible in the sense that e.g. it depends on int being 32-bit
<stikonas>whereas the spec does not guarantee that
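
To illustrate the point about int width: ISO C only requires `INT_MAX` to be at least 32767, so code that assumes a 32-bit `int` is relying on the platform rather than on the language. The following is a minimal sketch of how such an assumption could be checked at run time. The function names `uint_value_bits` and `int_is_at_least_32_bits` are hypothetical and are not taken from tinycc, mes, or live-bootstrap.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: count the value bits of unsigned int by shifting
 * UINT_MAX down to zero. This avoids sizeof(int) * CHAR_BIT, which would
 * also count any padding bits. */
static int uint_value_bits(void)
{
    unsigned int max = UINT_MAX;
    int bits = 0;

    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Hypothetical helper: returns 1 if int can hold every value of a
 * 32-bit two's-complement integer's positive range, 0 otherwise.
 * The constant is a long so the comparison is valid even on a
 * platform where int is only 16 bits wide. */
static int int_is_at_least_32_bits(void)
{
    return INT_MAX >= 2147483647L;
}

/* Hypothetical helper: print what this platform provides. */
static void describe_int_width(void)
{
    printf("unsigned int has %d value bits\n", uint_value_bits());
    if (int_is_at_least_32_bits())
        printf("int covers the 32-bit range on this platform\n");
    else
        printf("int is narrower than 32 bits on this platform\n");
}
```

Calling `describe_int_width()` from a `main` function reports what the host compiler provides. On the x86 targets discussed here the 32-bit assumption holds, which is presumably why it has not been a practical problem for the bootstrap.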
<homo>Googulator91 rekado time to celebrate: microhs is bootstrappable with hugs https://issues.guix.gnu.org/75778
<homo>so much better than nhc98
<matrix_bridge><Andrius Štikonas> @irc_libera_homo:stikonas.eu: nice
<aggi>"build speed, to justify as many people as possible being able to reproduce on a very diverse range of (even low spec) hardware"
<aggi>as soon as c++ enters the scene with g++ and/or llvm/clang needed it's a lost cause
<aggi>however, tinycc support can be extended far beyond a minimal package set towards one for a complete development host (with linux2.4 that is here)
<aggi>anyway, i just got molested with extortion letters from bureaucracy here, chances are i'll have to terminate my work, _again_
<aggi>anyway, another principle issue, bootstrapping of bootloader/kernel and assembler needed shouldn't be excluded with any approach
<aggi>as soon as you transition beyond linux2.4; avoiding gcc/binutils or llvm/clang early during bootstrapping becomes very complicated then
<aggi>the kexec towards fiwix is an excellent solution, and currently the only one that does work; except for the fact it introduces an additional kernel needed, regardless of how minimal that kernel was
<aggi>from a practical standpoint, another distinction was exactly when a sufficiently complete development host is available that can be booted on some real hardware with sufficient hardware support
<aggi>currently, that depends on some linux-4.x and with it several dozens of millions of lines of code for toolchain and kernel, all of which can be avoided being introduced too early in the bootstrapping chain to arrive at a bootable and sufficiently complete development host
<aggi>in theory, the kexec towards fiwix could be replaced with kexec towards linux-2.4 instantly, in practice there is no benefit in particular when fiwix cannot aim for a complete development host with sufficient hardware support, but is utilized as an intermediate bootstrapping dependency only
<aggi>fiwix is the better option for the use case of an intermediate dependency, linux-2.4 is a better option for sufficient hardware support
<aggi>however, i would argue, a bootable linux-2.4/tinycc initial development host is a preferable option over linux-4/gcc
<aggi>although this would extend the bootstrappable dependency chain itself (depending on when the job is considered done), it would greatly reduce the amount of lines of code and dependencies for a complete development host
<Googulator>https://debbugs.gnu.org/cgi/bugreport.cgi?bug=75778 that's indeed great news
<Googulator>although ;; TODO: CONF=unix-32 if CPU is 32-bit.
<Googulator>We will need that to continue the bootstrap towards GHC, since early GHC is 32-bit only.
<Googulator>GHC 4.08.x seems to be a viable target to build using mhs - 4.08 and 4.08.1 mention "GHC > 2.10 required", 4.08.2 says "GHC required, preferably 4.08x"
<Googulator>which sounds like it's Haskell 1.4 / Haskell98 code (with who-knows-what extensions, sadly)
<Googulator>best part is, 4.08.2 is where Guix's bootstrap chain currently starts, and with my WIP work, that can propagate all the way to 9.x for x86-32
<Googulator>(cross-compiling to x86-64 is TODO)
<Googulator>if we can do MicroHs -> GHC 4.08.2, that completes the chain from a basic C/C++ development environment without anything functional or lazy all the way up to modern GHC
<homo>I guess it counts as bootstrap of cabal https://issues.guix.gnu.org/75787
<fossy>Googulator: what's the status of 4.08.x -> 9.x for GHC?
<stikonas>presumably just finding a path there
<stikonas>it should be much easier once you have GHC since that's what GHC people used for development