IRC channel logs

2024-06-10.log

<fossy>stikonas: Hm
<stikonas>fossy: yeah, it does look like my commit that is breaking kernel bootstrap... But I don't understand why...
<fossy>i know i tested configurator on qemu on the final branch, so that would be strange
<fossy>ah
<stikonas>something is wrong there
<stikonas>maybe some limit is exceeded?
<stikonas>well, I pushed a draft PR for now
<stikonas>(I need to gather one more checksum first anyway)
<fossy>#471 yeah?
<stikonas>yes
<stikonas>it's combined stage0/mes/tinycc upgrade
<stikonas>stage0 because current one is broken
<stikonas>(riscv64 checksums are wrong)
<fossy>hmm, nothing stands out to me as problematic for builder-hex0. i'll give it a run and see if anything comes up
<stikonas>yeah, I only tried it once...
<stikonas>bwrap mode ran fine on x86
<stikonas>x86_64 ran further than I expected...
<stikonas>though still doesn't work in the end
<stikonas>(tcc-mes was able to build crt1.c, and some other very simple C files but not full unified-libc.c)
<aggi>seems the re-compiled tccboot did launch correctly to compile the example hello.c...
<aggi>next one... kernel
<aggi>it's a little more complicated, because different tcc/tccboot versions required different kernel-patches
<aggi>it is even more confusing, because on the buildhost i compiled kernel AoT with a more recent tcc-version than i may use for tccboot JIT
<aggi>i can't keep tccboot-compiler version and tcc in sync yet, because of build-system issues due to restructured code of tcc since 0.9.25
<aggi>ACTION needs another deep breath before loading the kernelsrc.romfs
<aggi>some progress... recompiled tccboot fails to process assembly, and it complains over some other problems when compiling kernel c sources
<aggi>however... this time, tccboot does launch, and it does start compilation... have to see why compilation/assembling _fails_ although i used the original romfs from Bellard's tccboot.iso, and i too used an early version tcc-0.9.22 to link into tccboot
<aggi>finally, i want to identify if/why the kernels i already succeed to AoT compile with tcc crashed, while bellards tccboot compiled kernel didn't
<aggi>for this, i _must_ reproduce his known-working state from 20 years ago first
<aggi>ideally, i may skip tccboot entirely, if i can track down why an AoT tcc-compiled kernel crashes
<aggi>monday morning, 3o'clock, second pot of coffee
<janneke>snuik: later tell vagrantc: oops! and thanks for testing; i'm testing a couple of patches on the version-0.26 branch right now
<snuik>Sure thing.
<janneke>snuik: botsnack
<snuik>:)
<aggi>i would say with certainty, it's not feasible to re-integrate tccboot for tcc-version any later than 0.9.24
<aggi>a test-case i fiddled together compiled, had to link libc.a for tccboot to supply the rather many new libc functions required, yet then tccboot doesn't spawn compilation
<aggi>testing tccboot with early 0.9.21..22 versions yielded problems during the parsing stage... "include recursion too deep", or "bad expression syntax" inside asm.S files at the '.align' token
<aggi>depending on what's tested first... those errors show up, with an inconclusive root-cause
<aggi>so, all this failed:
<aggi>1) AoT compilation of linux-2.4 with tcc
<aggi>2) JIT compilation of linux with tccboot
<aggi>3) updating tccboot to use a recent version of tcc
<aggi>with regards to 1) kernel is compiled and linked with both an old and recent tcc, but _crashes_ without any further option for me to find the error
<aggi>and 2) is very difficult/almost impossible to debug, to at least re-produce the known-working state from 2004 tccboot.iso
<aggi>only good news is, i got scripting and git-repos arranged, which covered _many_ test-cases to match various kernel versions and compiler version for either JIT or AoT
<aggi>including a tccboot which can compile/execute a simple example/hello.c, and booting AoT-compiled kernel-2.4.26 or kernel-2.4.37.11 which crash
<aggi>the reason i am hesitating to publish this is simple: it's a huge waste of time for almost two decades already, and many other developers seem to have failed with it
<aggi>i may try to move the setup from BUILDHOST=arm to BUILDHOST=x86, yet i am certain cross-compile was sanitized to prevent issues
<aggi>with AoT compilation i recall kernel got stuck with processing interrupts, not hitting the correct ISR i suspect
<aggi>what's confusing me too, is the fact a recent tcc-version sufficed to compile an entire userland (except libc asm parts and kernel), and this worked
<aggi>indicating recent tcc versions were somewhat stabilized, to correctly process preprocessor macros for example
<aggi>and too the bootcode.S related parts were correctly assembled by i386-tcc, since the bootloaders do execute too
<aggi>at least the ones for 32bit asm (not 16bit)
<aggi>hence it's searching for a needle in the haystack now
<matrix_bridge><cosinusoidally> aggi: I've just done a bit of googling. I assume you found https://github.com/seyko2/tccboot and https://github.com/seyko2/tinycc . This looks like it may have been a semi-successful attempt to resurrect tccboot in 2016. There's also some contemporary messages on the tcc mailing list https://lists.nongnu.org/archive/html/tinycc-devel/
<matrix_bridge><cosinusoidally> (I've not tried to run that code though)
<aggi>cosinusoidally, yes, i grabbed all projects there were and bisected through them
<aggi>seyko2 seems the most promising, which i did a little cleanup for to cover the many test cases
<aggi>the seyko2 AoT kernel compilation it is which finishes, kernel loaded, and crashes
<aggi>which is confusing, because for linux-2.4.26 the JIT/tccboot compilation was known-good year 2004 (which i can't reproduce), and the exact same sources compiled with the exact same compiler setup for AoT/tcc crashes
<aggi>hence i was hoping for re-producing tccboot to confirm what Bellard did 2004, to bisect this against AoT compilation (differences in memory layout, linking etc)
<aggi>in principle however, i would conclude, even if i could repair/reproduce any known-working state, even then i raise doubts over the quality of tcc and bootstrapping with it
<aggi>not sure if i wanted to rely upon this with any production-system, same with gcc/binutils
<matrix_bridge><cosinusoidally> tcc is high enough quality to build fiwix, binutils and gcc, and then use the version of gcc it built to then build a relatively modern linux kernel (this is what live-bootstrap does)
<matrix_bridge><cosinusoidally> tcc is totally unmaintainable by mortals though.
<aggi>tcc too can build linux-kernel, but it crashes; and fiwix kernel was run by kexec, that's the practically severe limitations still
<aggi>although i see the kernel-side at fault, because this is what they do over there: #ifdef gcc #else #error #endif
<matrix_bridge><cosinusoidally> In live-bootstrap the fiwix kernel is run from builder-hex0. It goes builder-hex0-> fiwix -> Linux . It then is able to loop back around at a later stage and build and install grub in order to boot normally into Linux
<aggi>if this system-integration path was clean and stable, i would stop at fiwix+tcc and drop linux+gcc
<aggi>and i rather avoided binutils too, and find a solution for the missing 16bit assembly support elsewhere
<aggi>the _practically_ relevant aspect with Fiwix to run on real x86-hardware is USB and ethernet support
<matrix_bridge><cosinusoidally> yep getting rid of binutils/gcc would open up the possibility of booting from source all the way up to userspace in a reasonable amount of time.
<matrix_bridge><cosinusoidally> qemu does have a gdb remote debugging mode. If you can figure out which object code file causes the crash then you can possibly replace that file with one compiled by gcc and see if that fixes the issue. If it does fix the issue then that will identify which file tcc has miscompiled.
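[Editor's note: the object-file swap cosinusoidally describes can be narrowed down by bisection instead of one file at a time. The sketch below is only an illustration and is not taken from the log. It assumes exactly one object file is miscompiled, and `boots_ok` is a hypothetical callback standing in for "rebuild with these files compiled by gcc, boot under qemu, report whether it survived". The file names are placeholders.]

```python
from typing import Callable, Sequence, Set


def find_bad_object(objects: Sequence[str],
                    boots_ok: Callable[[Set[str]], bool]) -> str:
    """Bisect for the single object file whose tcc build breaks the boot.

    boots_ok(gcc_built) is a hypothetical hook: it should return True when
    the kernel boots with the files in gcc_built compiled by gcc and every
    other file compiled by tcc. Assumes exactly one culprit exists.
    """
    candidates = list(objects)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if boots_ok(set(half)):
            # Swapping this half to gcc fixed the boot: culprit is in it.
            candidates = half
        else:
            # Still crashing: culprit is among the files left on tcc.
            candidates = candidates[len(half):]
    return candidates[0]


if __name__ == "__main__":
    # Placeholder file list; the "culprit" is picked arbitrarily to
    # simulate a boot test without building or running anything.
    objs = ["init/main.o", "kernel/sched.o", "mm/memory.o",
            "arch/i386/kernel/irq.o", "fs/exec.o"]
    culprit = "arch/i386/kernel/irq.o"
    print(find_bad_object(objs, lambda gcc_built: culprit in gcc_built))
```

[With N object files this needs about log2(N) rebuild-and-boot cycles. If more than one file is miscompiled, or the fault is in linking rather than in a single object, the single-culprit assumption fails and the result is not meaningful.]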
<aggi>i suspect it's a linking memory-section issue
<aggi>tracking this, it seems to fault when handling/enabling interrupts with sti()/__global_sti/__sti (assembly routine) called first in init/main.c
<aggi>hitting an invalid memory page instantly
<aggi>ACTION will leave for a little while