IRC channel logs

2023-12-18.log


<fossy>thank you Googulator! i'll take a look at these changes
<fossy>Googulator: regarding creating external.img only when needed, what is your vision here? (not to transition to a disk?)
<Googulator>The goal there is to only require a single disk
<Googulator>At least when no pre-download is used
<fossy>Ah, alright, that makes sense
<fossy>yes, that should work quite fine
<stikonas>fossy: while you are there, any thoughts how we can try to integrate (future) UEFI bootstrap into live-bootstrap?
<fossy>hmm
<Googulator>Especially important on bare metal, where it means not only an extra physical disk (with extra firmware and corresponding extra risks involved), but also a motherboard that's able to handle 2 disks properly in both BIOS and Linux *and* present them to Linux in the anticipated order
<stikonas>Googulator and I were thinking that UEFI bootstrap will start in UEFI, then after M2-Planet it could build something like builder-hex0 (but written in C) that would then try to execute POSIX binaries
<fossy>yeah, i did read the thoughts about that
<fossy>i think that is the best option
<Googulator>fossy: Are there any patches to sfdisk itself in the final merged version?
<stikonas>anyway, for now I'm just trying to fix-up stage0-uefi to work on my baremetal machine...
<Googulator>Because if this is the same version of sfdisk that we used pre-simplify, it's gonna crash and burn on bare metal
<Googulator>with the parameters now used
<fossy>i think the most ideal scenario is that UEFI stage is solely in stage0-uefi and then live-bootstrap wouldn't need toooo much special logic for uefi version
<fossy>which should be possible with that idea
<stikonas>yeah, needing just 1 disk is quite important feature
<fossy>I don't think there's any technical limitations to 1 disk. just not something i implemented in simplify PR for the sake of the PRs simplicity
<Googulator>> echo ";" | sfdisk "/dev/${DISK}"
<fossy>Googulator: i haven't patched sfdisk, no
<Googulator>Did you test on bare metal?
<fossy>nope.. it hasn't got any of your bare metal changes
<fossy>i wasn't expecting it to work on bare metal yet either
<Googulator>OK, so only tested on qemu/bwrap/chroot
<fossy>yep
<fossy>that PR had very very minimal functionality changes
<Googulator>for future reference, this is where I keep everything merged now: https://github.com/Googulator/live-bootstrap/tree/simplify-playground
<Googulator>I'm trying to make the individual PRs independent as much as possible, but all will get merged here so I have something to actually test
<fossy>that's a good workflow, i did something very similar for simplify branch
<fossy>is the 1GB unpartitioned fairly arbitrary? it might make more sense to calculate that on the fly
<stikonas>yeah, that helped a lot with reviews...
<stikonas>even though those smaller PRs had mostly the same content
<stikonas>but definitely easier to review
<fossy>yeah exactly stikonas
<Googulator>The 1GB is fairly arbitrary, but I do want to keep it a power of 2
<fossy>why a power of 2?
<Googulator>alignment
<Googulator>especially on SSDs
<fossy>hmm, isn't 4K alignment generally sufficient
<fossy>or do SSDs particularly like power of 2 alignment
<Googulator>4K alignment is for AF HDDs
<Googulator>For SSDs, it's a bit more complicated
<Googulator>Real SSDs with a proper controller usually don't care much about alignment, as the controller will take care of it
<fossy>i was about to say -- i'm a bit surprised SSD controllers don't do this
<Googulator>But if you're using something like a USB drive or SD card, then you want to stay on an erase block boundary
<fossy>but i guess there are bad/nonexistent controllers out there
<fossy>yep ok
<stikonas>the general recommendation is to keep partitions 1 MiB aligned
<stikonas>which I think is still valid for SSDs...
<Googulator>Erase blocks can be quite big - I've personally seen 32MiB erase blocks on an SD card
<Googulator>& anecdotally heard of 128MiB
<Googulator>Also, 1GiB leaves space after the srcfs to create a boot partition at the end, without overwriting srcfs
<Googulator>assuming a sane BIOS that doesn't require the boot partition to reside in the first 504MB or worse
<fossy>hm, as in, after the 1GiB or before the 1GiB but after the srcfs?
<Googulator>between srcfs and 1GiB
<fossy>a boot partition which does what?
<Googulator>srcfs currently ends @ 283MiB
<Googulator>holds e.g. GRUB
<fossy>oh
<fossy>i guess so
<Googulator>It can be done without one, but better to have a dedicated /boot
<fossy>i think this had something to do with your trusting trust drive - what was the benefit of keeping srcfs around?
<Googulator>It's not about that, in fact, for the trusted Flash drive, I explicitly want to make the srcfs inaccessible once Linux starts
<fossy>ok whats the benefit then?
<Googulator>Mostly debugging
<Googulator>Also, I was thinking of getting the previously erased sources in Linux again by reading srcfs
<Googulator>It's an alternative path, vs. including distfiles in the initramfs
<Googulator>But I guess it's not that important
<stikonas>well, that can always be done manually
<stikonas>probably doesn't have to be part of automation
<Googulator>What's important is to have the real on-disk file system start in a sector divisible by 8 (to avoid issues with AF drives) and to have some space to install Grub's early stages into
<Googulator>& of course, all of that needs to be done while keeping sfdisk satisfied about CHS correctness
<Googulator>(-S32 -H64 helps a lot there)
<Googulator>(and --force)
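The alignment constraints discussed above can be checked with a little arithmetic. This is only a sketch: it assumes 512-byte logical sectors, and the helper names `is_aligned` and `first_aligned_sector` are made up for illustration rather than taken from live-bootstrap. The 283MiB srcfs end, the 1GiB offset and the 32MiB erase block size are the figures quoted in the conversation.

```python
SECTOR_SIZE = 512            # bytes per logical sector (assumption)
AF_PHYSICAL_SECTOR = 4096    # 8 logical sectors, as on Advanced Format drives
MIB = 1024 * 1024


def is_aligned(start_sector: int, boundary_bytes: int) -> bool:
    """Return True if the partition's byte offset is a multiple of boundary_bytes."""
    return (start_sector * SECTOR_SIZE) % boundary_bytes == 0


def first_aligned_sector(min_offset_bytes: int, boundary_bytes: int) -> int:
    """Return the first sector at or after min_offset_bytes that lies on the boundary.

    boundary_bytes is expected to be a multiple of SECTOR_SIZE.
    """
    blocks = -(-min_offset_bytes // boundary_bytes)  # ceiling division
    return blocks * boundary_bytes // SECTOR_SIZE


one_gib_start = (1024 * MIB) // SECTOR_SIZE
print("1GiB start sector:", one_gib_start)
print("divisible by 8 sectors:", is_aligned(one_gib_start, AF_PHYSICAL_SECTOR))
print("on a 32MiB erase block:", is_aligned(one_gib_start, 32 * MIB))
print("first 32MiB-aligned sector after srcfs:",
      first_aligned_sector(283 * MIB, 32 * MIB))
```

A start offset that is a power of two at least as large as the erase block passes every smaller power-of-two check automatically, which is the appeal of keeping the offset a power of 2.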
<fossy>yeah that's fine, it's just the 1GiB padding i'm not so sure about, i think the padding will cause more confusion for end users than the debugging benefits it provides (which can always be done elseways)
<Googulator>Maybe reducing it to 512MiB could work
<Googulator>that still leaves enough space for /boot
<Googulator>also, if you make /boot FAT32, it can also be used as EFIESP
<fossy>OHH wait i was misinterpreting this a bit
<Googulator>so you bootstrap in legacy/CSM mode (for now, until stage0-uefi is ready), then install grub, and reboot using UEFI
<fossy>the partition table itself still fills the whole disk...
<Googulator>Yes, it does
<Googulator>Just leaves empty space at the beginning
<Googulator>Perfect for /boot or preserving srcfs, if you care about those
<fossy>ok in that case my actual preference would be making a partition of type 0 covering the srcfs region followed by empty space followed by the partition
<fossy>that way it's really clear to end users once the bootstrap completes what's going on
<Googulator>I would love to do that - if we had a sane tool to do so
<Googulator>like parted or modern fdisk
<fossy>mmmm
<fossy>i might try building parted but not sure how that will go
<fossy>will get back to you on that
<Googulator>already, I'm fighting fdisk's insistence on IBM PC-XT compatibility to just get things to work at all on bare metal
<Googulator>one more thing: mkfs.ext4 will need -F -F
<Googulator>otherwise nasty surprises can result when you bootstrap twice from the same physical HDD
<Googulator>don't ask me how I learned that...
<fossy>.... weird
<stikonas>modern fdisk is better suited for us
<stikonas>at least that's my experience
<stikonas>parted tries to be a bit higher level tool...
<Googulator>without -F -F, mkfs checks for an existing superblock and errors out if it finds one - it wants to play it safe and not overwrite an important FS
<Googulator>but of course, our error handling at that point is "ABORTING HARD"
<stikonas>where as sfdisk really just deals with partitioning without caring for file systems
<Googulator>& several hours of work irrecoverably lost at that point
<Googulator>>     find / -xdev -type c -or -type b -not -name "ram*" -printf "nod %p %m %U %G %y " -exec stat -c '%Hr %Lr' {} \; >> /initramfs.list
<Googulator>I'm surprised this works
<Googulator>-printf and -exec in the same command
<Googulator>linux booted with my changes (in qemu)
<Googulator>btw, another quick tip for enterprising bare-metallurgists: don't bootstrap with a GeForce GT 730
<fossy>yeah i was slightly surprised that worked but was glad to see it did
<Googulator>linux-4.9.10 doesn't like it for some reason, consistently locks up with a white-on-green screen
<fossy>weird
<Googulator>probably incomplete implementation for that generation of cards
<Googulator>or maybe some incompatibility between the much older chipset and it
<Googulator>NV3x (GeForce FX/PCX generation) works perfectly though, and luckily I had one of those at hand
<Googulator>Intel integrated also works-ish, but it won't actually get a high res console
<Googulator>limited to 640x480, 16 colors
<Googulator>probably something wrong with my kernel configuration (maybe uvesafb taking priority, and then not finding the needed userspace tools)
<Googulator>BTW, I wonder if we even need that find line
<Googulator>it's basically for /dev
<Googulator>but it doesn't sound like a sane idea to copy /dev from Fiwix to Linux
<fossy>technically no, we could just rerun populate_device_nodes
<Googulator>not even that
<Googulator>we have devtmpfs
<Googulator>mkdir /dev; mount -t devtmpfs none /dev
<fossy>are we actually using devtmpfs
<Googulator>pre-disk init, we probably use it
<Googulator>it's mounted by default in initramfs in recent-ish Linuxes
<Googulator>although it could be disabled in the current Linux kernel config
<Googulator>it's definitely enabled in mine, I used it when I was bringing up bare-metal on the WIP PR
<fossy>"If CONFIG_DEVTMPFS_MOUNT is set to y when building the kernel, the resulting kernel will automatically attempt to mount devtmpfs to /dev after mounting a root filesystem - unless the kernel is using an initramfs for the initial root filesystem"
<fossy>and we have CONFIG_DEVTMPFS_MOUNT=y
<Googulator>CONFIG_DEVTMPFS=y is what matters to us
<Googulator>because "the kernel is using an initramfs for the initial root filesystem"
<fossy>as far as i can tell CONFIG_DEVTMPFS=y just means that devtmpfs *exists*, not that it is automounted
<Googulator>but it's always automounted in an initramfs iirc, if /dev exists in it
<Googulator>this is so you don't need to include systemd in your (regular, non-bootstrap) initramfs
<fossy>oh ok, devtmpfs automount in initramfs is post- linux 4.9.10
<fossy>that explains things
<Googulator>It should be there in 4.9.10 already
<Googulator>hmm... now I'm not sure
<fossy>nope
<fossy> https://patchwork.kernel.org/project/linux-sh/patch/576AE1C5.5090909@landley.net/
<fossy>is not in 4.9.10
<stikonas>when is it in?
<stikonas>(though perhaps kernel upgrade is out of scope now...)
<fossy>very easy to port anyways
<stikonas>sigh, UEFI's are quite annoying...
<stikonas>fixed stack alignment in hex1
<stikonas>so hex1 no longer gets stuck, but now it just doesn't create any output on baremetal... (still works fine in qemu)
<Googulator>fossy: is it safe to use something like "\"" in script-generator.c?
<Googulator>Or will M2-Planet choke on that?
<Googulator>I'm trying to do something like ( SWAP_SIZE != DISK_SIZE ) in manifest predicates - I've already implemented != support, but need a way to distinguish between variable-to-constant vs variable-to-variable comparisons
<fossy>"\"" should work...
<fossy>but i'm not certain of what
<fossy>i think ive escaped quotes in m2-planet before
<fossy>s/what/that/
<Googulator>( VARIABLE1 == VARIABLE2 ) vs. ( VARIABLE == " VALUE " ) seems to be the obvious choice
<Googulator>also define: ORIG_JOBS = JOBS
<Googulator>and then define: JOBS = " 1 "
<fossy>why not just start with quote means string, no quote means variable?
<fossy>instead of spaces about the quote
<Googulator>because it's hard to ignore the closing quote
<Googulator>same reason why you can't have (VARIABLE1 && VARIABLE2)
<Googulator>only ( VARIABLE1 && VARIABLE2 )
<Googulator>you would need to use something like strncpy(target, tok->val + 1, strlen(tok->val + 1) - 1)
<Googulator>and I'm not sure M2 would like that
<Googulator>also, right now, VARIABLE == " VALUE WITH SPACE " won't work
<Googulator>because of how tokenization is done
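The space-separated quoting proposed above can be modelled with a whitespace tokenizer. The following is a hypothetical Python sketch, not the code in script-generator.c (which is C built with M2-Planet); `parse_operand` and `evaluate` are invented names, and only the `==` and `!=` operators mentioned in the discussion are handled.

```python
def parse_operand(tokens, pos):
    """Parse one operand starting at tokens[pos].

    Returns (kind, text, next_pos), where kind is "literal" or "variable".
    """
    if tokens[pos] == '"':
        # A standalone quote token opens a literal. The next token is the
        # value, and the token after that must be the closing quote.
        if pos + 2 >= len(tokens) or tokens[pos + 2] != '"':
            raise ValueError("a literal must be exactly one token between quotes")
        return "literal", tokens[pos + 1], pos + 3
    return "variable", tokens[pos], pos + 1


def evaluate(predicate, variables):
    """Evaluate '( LHS == RHS )' or '( LHS != RHS )' against a dict of variables."""
    tokens = predicate.split()
    if len(tokens) < 5 or tokens[0] != "(" or tokens[-1] != ")":
        raise ValueError("predicate must be wrapped in spaced parentheses")
    lhs_kind, lhs, pos = parse_operand(tokens, 1)
    op = tokens[pos]
    rhs_kind, rhs, pos = parse_operand(tokens, pos + 1)
    if pos != len(tokens) - 1:
        raise ValueError("unexpected trailing tokens")

    def resolve(kind, text):
        # Literals stand for themselves; anything else is looked up by name.
        return text if kind == "literal" else variables[text]

    left, right = resolve(lhs_kind, lhs), resolve(rhs_kind, rhs)
    if op == "==":
        return left == right
    if op == "!=":
        return left != right
    raise ValueError("unsupported operator: " + op)


print(evaluate('( JOBS == " 1 " )', {"JOBS": "1"}))
print(evaluate('( SWAP_SIZE != DISK_SIZE )', {"SWAP_SIZE": "0", "DISK_SIZE": "8G"}))
```

Because every token is split on whitespace, the closing quote arrives as its own token and never needs to be stripped from the value. The same splitting is why a literal containing spaces is rejected in this sketch.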
<fossy>hold up, what's the actual usecase for variable-constant comparisons?
<Googulator>it's almost all variable-constant
<fossy>oh, currently, alright
<Googulator>variable-variable is just nice to have, what I'm after is backing up a variable under another name
<Googulator>define: ORIGINAL_JOBS = JOBS
<Googulator>define: JOBS = " 1 "
<Googulator>... bootstrap to Linux ...
<Googulator>define: JOBS = ORIGINAL_JOBS
<fossy>hmm, that won't really work at the moment; defines occur globally. that will just make JOBS whatever it was originally
<Googulator>(simplified, because all of those also need to be transferred into bootstrap.cfg)
<fossy>currently a variable cannot hold two different values at different parts of the bootstrap is what i mean
<Googulator>I understand it's not scoped, but why wouldn't overdefining an existing value work?
<Googulator>I actually got something like this to work in the last iteration, just with slightly different syntax
<Googulator>I used define: VARIABLE = VALUE vs define: VARIABLE1 = $ VARIABLE2 there
<Googulator>but I now think it's cleaner to mark literals explicitly and have everything else be a name, vs. have plain text be names in one context, values in another, and having to mark when you do want a name even though it would by default be a value
<Googulator>there's actually code in script-generator.c for updating an existing variable
<fossy>yes, you can update an existing variable, but only the new value is ever used
<fossy>the only way that variables are passed through to live-bootstrap is through output_config function
<Googulator>the only trick to keep in mind is that bootstrap.cfg will initially contain the values valid at the end
<fossy>yeah exactly...
<fossy>so how does backing up the variable help?
<Googulator>right, that's what I meant by having to also use an improve step to transfer the desired value into bootstrap.cfg
<Googulator>It helps if you're using that variable for predicates
<Googulator>also in kaem
<fossy>oh okay... but in this context of JOBS, what did you do to make JOBS = 1 ever actually apply?
<Googulator>I reset JOBS to 1 as the very last step in the manifest
<Googulator>so that's what gets written to bootstrap.cfg
<Googulator>then I use an improve step immediately after setting JOBS = ORIGINAL_JOBS that appends the correct value of JOBS to bootstrap.cfg, so it gets used from then on
<fossy>okay that makes a lot more sense
<Googulator>All of this is needed to reenable bootstrapping on multiple cores
<Googulator>using kernel bootstrap
<fossy>yeah, i figured
<Googulator>fossy: the script-generator uninitialized variable bug I reported on the 5th came back to haunt me now...
<Googulator>    Directive *last;
<Googulator>that's gonna get used uninitialized if the very first directive in the manifest is a define
<Googulator>bare metal time, wish me luck :)
<matrix_bridge><Andrius Štikonas> Good luck!
<Googulator>...and after some fiddling with the memory modules, it booted
<Googulator>seems to have an issue cold-booting with 4 different sticks of RAM installed
<Googulator>boots fine with 2, then adding 2 more (while standby power is on, but the board is not running) works
<Googulator>but all 4 installed + multiple days without power = black screen
<Googulator>reminds me of my old Acer which wouldn't boot after a CMOS clear with 4GB or more installed
<Googulator>had to boot once with 3GB
<Googulator>then swapping in the original sticks would work
<Googulator>Hello,M2-mes!
<Googulator>take #2: forgot to patch memory map & ramdisk size in kexec-fiwix...
<Googulator>(the commit I just pushed to simplify-playground already has these fixed, for anyone trying at home)
<Googulator>fossy: just caught another bug that got through the big PR: https://github.com/fosslinux/live-bootstrap/commit/545bb42ca800af28086ebd1fa2c8ed46726a6f74#diff-e3cf7b4ae6ba10383ad9b192d90d49c82c4435b0b5740bf213691c699f277200R170
<Googulator>shutil.copytree expects its target directory to _not_ exist yet when called
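The `shutil.copytree` behaviour mentioned above is easy to reproduce. In this sketch the directory and file names are placeholders, not the paths used by live-bootstrap, and `copy_into_existing` is a made-up helper. It relies on the `dirs_exist_ok` parameter, which is available from Python 3.8.

```python
import os
import shutil
import tempfile


def copy_into_existing(src, dst):
    """Copy the tree at src into dst, whether or not dst already exists."""
    shutil.copytree(src, dst, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "distfiles")   # placeholder source directory
    dst = os.path.join(tmp, "external")    # placeholder target directory
    os.makedirs(src)
    os.makedirs(dst)                       # the target already exists
    with open(os.path.join(src, "example.tar.gz"), "w") as handle:
        handle.write("placeholder")

    try:
        shutil.copytree(src, dst)          # default behaviour
    except FileExistsError:
        print("copytree refuses to copy into an existing directory by default")

    copy_into_existing(src, dst)
    print(os.listdir(dst))
```

Removing the target first, or creating only its parent, would be alternatives on Python versions older than 3.8.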
<fossy>oh totally forgot about that script-generator bug Googulator
<Googulator>luckily it's an easy one
<fossy>hm, not sure when that external_sources bug was added, because i did test external-sources near the end
<Googulator>looks like it's not the only bug in bwrap either: mknod: `/dev/sda': Operation not permitted
<Googulator>or maybe my security settings are wrong?
<Googulator>meanwhile, bare metal has reached Fiwix :)
<Googulator>...and it's building Linux
<Googulator>Booted into Linux... and it failed on creating swap
<Googulator>and now I get it :)
<Googulator>created swap.sh in the wrong directory...
<Googulator>thanks to the new Bash trap feature, it's salvageable
<Googulator>(I just need to create the swap by hand, and then drop back to the script)
<fossy>i'll retest bwrap, not sure i tested it sufficiently toward the end
<Googulator>manually creating the swap worked, now building curl
<Googulator>meanwhile, pushed the missing script to simplify-playground
<Googulator>Network is working on baremetal! (Marvell NIC)
<Googulator>meanwhile: https://github.com/fosslinux/live-bootstrap/pull/356
<Googulator>Just had a chance to check up on the bare metal test system again
<Googulator>The good news: it's building gcc 13 as we speak
<Googulator>The bad news: it's building gcc 13 as we speak
<matrix_bridge><Andrius Štikonas> Meaning it is slow?
<Googulator>good, because it's the last step and it hasn't failed
<Googulator>bad, because it has been running for 13 hours now
<Googulator>This same system completed the bootstrap in 7 hours before simplify
<Googulator>meaning, we have a massive perf regression
<matrix_bridge><Andrius Štikonas> By the way, mes or looks OK, I had something similar but not rebased after fossy's merges
<matrix_bridge><Andrius Štikonas> Hmm, is it due to parallelism not working?
<Googulator>Could be, although I did try to get it to work
<matrix_bridge><Andrius Štikonas> But GCC 13 is the slowest package...
<Googulator>It is, but it's not the only place where it's visible
<matrix_bridge><Andrius Štikonas> It does full 3 stage bootstrap
<matrix_bridge><Andrius Štikonas> Guile is another slow one...
<Googulator>The regression was already well apparent during guile's BOOTSTRAP(phase0)
<Googulator>mes-0.26 got as far as the first Bash build
<Googulator>needs HAVE_RENAME defined since the new mes now supports rename()
<matrix_bridge><Andrius Štikonas> And does fiwix support it?
<Googulator>I tested in bwrap with --build-kernels, so fiwix was at the very least built
<Googulator>might need to switch from linux/rename.c to stub/rename.c though, if the real kernel doesn't support the syscall
<matrix_bridge><Andrius Štikonas> I think it does support it
<matrix_bridge><Andrius Štikonas> https://github.com/mikaku/Fiwix/blob/9560a8d51d31a925eeb84e8d32eee65365665874/kernel/syscalls.c#L233
<Googulator>oddly, each mes version prints different error messages during the tcc build
<Googulator>(but they all ultimately succeed)
<Googulator>0.24.2:
<Googulator>->type--: not a <type>: (typename "BufferedFile")
<Googulator>->type--: not a <type>: (typename "BufferedFile")
<Googulator>unexpected size:8
<Googulator>rank--: not a pointer: #<<type> type: signed size: 1 description: #f>
<Googulator>rank--: not a pointer: #<<type> type: signed size: 1 description: #f>
<Googulator>0.25:
<Googulator>->type--: not a <type>: (typename "BufferedFile")
<Googulator>->type--: not a <type>: (typename "BufferedFile")
<Googulator>rank--: not a pointer: #<<type> type: signed size: 1 description: #f>
<Googulator>rank--: not a pointer: #<<type> type: signed size: 1 description: #f>
<Googulator>0.26:
<Googulator>->type--: not a <type>: (typename "BufferedFile")
<Googulator>->type--: not a <type>: (typename "BufferedFile")
<Googulator>rank--: not a pointer: #<<type> type: signed size: 1 description: #f>
<Googulator>rank--: not a pointer: #<<type> type: signed size: 1 description: #f>
<Googulator>->type--: not a <type>: (typename "BufferedFile")
<Googulator>->type--: not a <type>: (typename "BufferedFile")
<Googulator>rank--: not a pointer: #<<type> type: signed size: 1 description: #f>
<Googulator>rank--: not a pointer: #<<type> type: signed size: 1 description: #f>
<Googulator>gcc-13.1.0: creating package.
<matrix_bridge><Andrius Štikonas> I know mes 0.25 disabled some of these false warnings
<Googulator>the "unexpected size" one is gone in 0.25
<Googulator>everything else remains
<Googulator>and in 0.26, everything is printed twice
<matrix_bridge><Andrius Štikonas> Yeah, odd
<Googulator>bare-metal bootstrap finally done!
<Googulator>has been running for almost 14 hours
<Googulator>before simplify, this machine could bootstrap in half that time
<Googulator>hmm...
<Googulator># echo $JOBS
<Googulator>4
<Googulator># echo $MAKEJOBS
<Googulator>-j1
<Googulator>that explains it
<Googulator>bwrap test of mes 0.26 got to populate_device_nodes, where it fails due to a permission issue (known on my part, not related to mes upgrade)
<Googulator>that sounds like a success
<Googulator>that's well past where mescc is discarded
<Googulator>& meslibc
<Googulator>& now I see why makejobs is wrong
<Googulator>in update_env.sh: cat > /steps/env <<- EOF
<Googulator>should be cat > /steps/env <<- 'EOF'
<Googulator>otherwise it will do the substitution when env is created, not when it's read
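The difference between the two heredoc forms can be observed by running both through a POSIX shell. This sketch drives `sh` from Python; the `MAKEJOBS=-j${JOBS}` line is an assumed stand-in for whatever update_env.sh really writes, and `render_env` is a made-up helper.

```python
import subprocess


def render_env(quoted: bool, jobs_at_write: str) -> str:
    """Return what a heredoc emits when JOBS has the given value at write time."""
    delimiter = "'EOF'" if quoted else "EOF"
    script = "\n".join([
        "JOBS=" + jobs_at_write,
        "cat <<- " + delimiter,
        "MAKEJOBS=-j${JOBS}",   # assumed content, for illustration only
        "EOF",
        "",
    ])
    result = subprocess.run(["sh", "-c", script],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Unquoted delimiter: ${JOBS} is expanded while the file is being written.
print(repr(render_env(quoted=False, jobs_at_write="1")))
# Quoted delimiter: the text is written literally and expanded when sourced.
print(repr(render_env(quoted=True, jobs_at_write="1")))
```

With the unquoted delimiter, the value in effect when the file is generated is frozen into it, which matches the `-j1` seen despite `JOBS` being 4 later on.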
<Googulator>Bootstrap started again on bare metal, with this bug fixed, and mes upgraded
<Googulator>Hopefully this does finish in 7 hours
<Googulator>...and again, because I forgot to set swap on the rootfs.py command line
<Googulator>fossy: marked https://github.com/fosslinux/live-bootstrap/pull/356 as ready for review
<matrix_bridge><Andrius Štikonas> Googulator: I guess at least x86 checksums should be updated too?
<Googulator>yes, just realized that
<Googulator>back to draft while I do that
<matrix_bridge><Andrius Štikonas> amd64 might be easy to do for mes too
<matrix_bridge><Andrius Štikonas> riscv you can probably ignore
<matrix_bridge><Andrius Štikonas> I can update that later
<matrix_bridge><Andrius Štikonas> It takes 7h or so in qemu
<matrix_bridge><Andrius Štikonas> or 7d on my SoC
<Googulator>but even worse, I forgot about pre-network-sources
<Googulator>(because the branch I'm testing on no longer has it)
<Googulator>Subprocess error 1
<Googulator>ABORTING HARD
<Googulator>+> /external/distfiles/mes-0.26.tar.gz: No such file or directory
<Googulator>I guess I'll do x86 and amd64 then
<Googulator>never heard of a situation where qemu with full emulation was faster than a real CPU... :)
<matrix_bridge><Andrius Štikonas> Googulator: it was qemu with user mode emulation...
<matrix_bridge><Andrius Štikonas> Without emulating kernel
<matrix_bridge><Andrius Štikonas> Still probably caused by ram bandwidth...
<Googulator>even then, TCG rarely beats real silicon
<Googulator>of course, it's mes, the Scheme interpreter that thinks it's llama.cpp, so memory bandwidth is indeed everything
<Googulator>in fact, it may be even worse than llama.cpp in this regard - I've never seen that saturate RAM bandwidth on just 1 thread, unlike mes
<matrix_bridge><Andrius Štikonas> Googulator: we should also recheck mes 0.26 for any new pregen files...
<Googulator>right
<matrix_bridge><Andrius Štikonas> All those new guile modules might have introduced non source stuff...
<matrix_bridge><Andrius Štikonas> fossy is especially good at finding those...
<matrix_bridge><Andrius Štikonas> I can try to look a bit too
<Googulator>amd64 actually fails with "mkdir NOT FOUND"
<Googulator>well before it even touches mes
<Googulator>it dies in seed.kaem
<matrix_bridge><Andrius Štikonas> Installed into wrong place?
<matrix_bridge><Andrius Štikonas> Hmm
<Googulator>Probably the PATH fixes were not applied to amd64
<Googulator>I remember the same issue on x86 when I was working on the make-3.82 PATH issue in the draft simplify PR
<matrix_bridge><Andrius Štikonas> It calls mkdir before it is installed to PATH
<matrix_bridge><Andrius Štikonas> https://github.com/fosslinux/live-bootstrap/blob/545bb42ca800af28086ebd1fa2c8ed46726a6f74/seed/seed.kaem#L12
<matrix_bridge><Andrius Štikonas> Hmm, PATH should have that from after.kaem...
<Googulator>wait, but then how does it work in x86?
<matrix_bridge><Andrius Štikonas> Maybe ARCH_DIR has wrong value?
<matrix_bridge><Andrius Štikonas> Here: https://github.com/fosslinux/live-bootstrap/blob/545bb42ca800af28086ebd1fa2c8ed46726a6f74/seed/after.kaem#L12C14-L12C14
<matrix_bridge><Andrius Štikonas> Should be coming from https://github.com/oriansj/stage0-posix-amd64/blob/93fbe4c08772d8df1412e2554668e24cf604088c/kaem.run#L24C5-L24C5
<Googulator>found it: PATH=${BINDIR}:/${ARCH}/bin
<Googulator>on amd64, $ARCH != $ARCH_DIR
<Googulator>on x86, they are the same
<Googulator>Ouch. Seems like the new mes uses more memory than what builder-hex0 is able to give it...
<Googulator>...or something similar
<Googulator>when tcc-mes tries to build tcc-boot0, it just prints "tcc version 0.9.26 (i386 linux)" and dies
<Googulator>(this is on bare metal)
<Googulator>worked perfectly in bubblewrap
<matrix_bridge><Andrius Štikonas> Googulator: that's after mes
<matrix_bridge><Andrius Štikonas> tcc-mes is a C program (tcc)
<matrix_bridge><Andrius Štikonas> Googulator: later in the evening I'll try to compare with my unrebased patch
<matrix_bridge><Andrius Štikonas> I think it worked there
<Googulator>I know it's after mes - but it appears that mes corrupts memory as it runs, causing tcc to subsequently fail
<Googulator>probably because it overruns the memory block builder-hex0 gives it
<Googulator>right now, I'm running amd64 mes (building tcc) in bwrap, VSZ is 1132564
<Googulator>x86 is probably less than that, but still likely exceeding builder-hex0's limit
<Googulator>tcc-mes locks up on amd64
<stikonas>Googulator: that's expected on amd64...
<stikonas>tcc-mes builds but crashes...
<stikonas>and I know at least one issue there
<stikonas>there is something wrong going on with sign vs zero extension when outputting some 32-bit constants
<stikonas>I think o(0x81234567) gets sign extended to 64-bits
<stikonas>and then probably loop never finishes...
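The two possible widenings of the constant mentioned above differ only in the upper 32 bits. This is a plain arithmetic sketch with made-up helper names; it does not model tcc's `o()` function or confirm that this is the actual cause of the crash.

```python
def zero_extend_32_to_64(value: int) -> int:
    """Widen a 32-bit value to 64 bits, filling the upper half with zeros."""
    return value & 0xFFFFFFFF


def sign_extend_32_to_64(value: int) -> int:
    """Widen a 32-bit value to 64 bits, copying bit 31 into the upper half."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


constant = 0x81234567  # bit 31 is set, so the two widenings disagree
print(hex(zero_extend_32_to_64(constant)))
print(hex(sign_extend_32_to_64(constant)))
```

Any 32-bit constant below 0x80000000 widens identically both ways, so only values with the top bit set would expose such a bug.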
<Googulator>OK, so that's not a regression then
<Googulator>hmm, x86 version of mes tops out at 566808KiB memory usage in bwrap
<Googulator>that's 554MiB
<Googulator>builder-hex0 supports up to 639MiB
<Googulator>so then why is it that tcc prints its version banner and then locks up in builder-hex0?
<Googulator>(it's not supposed to print a version banner there at all)
<stikonas>On x86?
<stikonas>I think my changes pre simplify pr ran till tcc 0.9.27
<stikonas>But they are basically same as your pr...
<stikonas>(after tcc 0.9.27 hashes were bad...)
<Googulator>Yes, x86
<stikonas>Googulator: how about https://github.com/stikonas/live-bootstrap/tree/mes-0.26
<stikonas>does this run for you?
<Googulator>testing now
<Googulator>stikonas: it works
<Googulator>meanwhile, checking out a memory dump from the failed bootstrap in qemu
<Googulator>which worked
<Googulator>WTF... Libera swallowed one of my messages
<Googulator>I wrote "/usr/bin/tcc-mes is identical to the version built in bwrap" ... which worked
<Googulator>so tcc itself is being generated correctly
<Googulator>but then something goes wrong when executing it
<Googulator> https://gist.github.com/Googulator/2e9ee8ed7db95234236b04b6fb10acc7 this is the last thing that got printed before it locked up
<Googulator>(retrieved from a memory dump of the VM)
<Googulator>that's looking like it dies here: https://github.com/Googulator/live-bootstrap/blob/mes-0.26/steps/tcc-0.9.26/pass1.kaem#L143
<[exa]>Googulator: messages starting with / may be interpreted as commands
<Googulator>I think I may have figured out what's going on..
<Googulator>all of those new /lib files I had to include for mes-0.26 to successfully build itself came from /lib/linux
<Googulator>they're probably syscall wrappers
<Googulator>including for syscalls builder-hex0 doesn't support
<Googulator>and when builder-hex0 doesn't support a syscall... it pretends to
<Googulator>returning success, but doing nothing
<stikonas>Googulator: so why does it work on my branch?
<Googulator>maybe I added more files from /lib/linux
<stikonas>oh, there are some differences
<stikonas>between my changes and yours
<stikonas>I don't have read.c
<stikonas>whereas you have mescc lib/linux/read.c
<stikonas>neither of us fully sorted the list there...
<stikonas>doesn't help with comparison
<Googulator>Retrying with the new files taken from lib/stub instead
<stikonas>Googulator: I had some issues with lib/stub
<stikonas>anyway, you have my list too
<stikonas>which will hopefully work
<stikonas>so you have something to try, bisect the differences...
<stikonas>in the meantime I need to figure out why my hex1.efi does not output anything one my machine
<stikonas>(even though it works in qemu)
<stikonas>s/one/on/
<stikonas>and even hex C prototype seems to be misbehaving...
<Googulator>The obvious differences are read, sleep and utime - testing with these removed
<stikonas>also only stuff that tcc calls can matter...
<stikonas>I don't think tcc uses utime
<stikonas>and sleep is only used in tests
<Googulator>Removing those 3 files fixed the lockup
<Googulator>in qemu at least
<fossy>if/when builder-hex0 is split into a lower and higher level kernel, it would be nice for the higher level kernel to print an error when a nonexistent syscall is given
<fossy>i will take a look for new pregen files in mes 0.26
<fossy>i'm working on binutils 2.41
<Googulator>also, it would be nice if unsupported syscalls returned failure, to at least give programs a chance to use alternate paths
<fossy>is that usual POSIX behaviour?
<fossy>i presume so
<Googulator>AFAIK it is
<Googulator>swallowing an error and feigning success certainly isn't POSIX
<fossy>yea ok, it should error
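The behaviour agreed on above, failing unsupported syscalls instead of feigning success, can be sketched as a dispatch table. This is a hypothetical Python model, not builder-hex0 code; the handler table and its single entry are invented, and the negative-errno return follows the Linux kernel convention of returning `-ENOSYS` for unimplemented syscalls.

```python
import errno


def dispatch(name, handlers, *args):
    """Call the handler for a syscall, or report that it is not implemented."""
    handler = handlers.get(name)
    if handler is None:
        # Report failure so the caller can take an alternate path,
        # instead of returning 0 and silently doing nothing.
        return -errno.ENOSYS
    return handler(*args)


# Hypothetical table with a single supported call.
handlers = {"getpid": lambda: 1}

print(dispatch("getpid", handlers))
print(dispatch("utime", handlers, "/tmp/file", None) == -errno.ENOSYS)
```

A libc wrapper would typically turn such a negative return value into `-1` with `errno` set, giving programs a chance to fall back.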
<Googulator>and now, 3 simultaneous bootstrap tests in progress (simplify-playground x86 on baremetal, mes-0.26 x86 in qemu, mes-0.26 amd64 in bwrap)
<Googulator>stikonas: the current code in my repo should be good for riscv64 checksum update
<Googulator>(pending fossy's OK w.r.t. generated files, of course)
<stikonas>yeah, I'll check those too
<stikonas>I did start looking through those scheme dirs
<stikonas>haven't finished but it looks ok
<stikonas>seems normal source...
<stikonas>well, there is that older file we found mes/module/mes/psyntax.pp but we are already removing it in the script
<fossy>Googulator: could you point me to commits that aren't in the main tree that you have found to be needed for bare metal bootstrap?
<Googulator>fossy: https://github.com/fosslinux/live-bootstrap/commit/434f5fb25255c4164a8f8855a0d86ac8484d4732 should have the bare minimum
<Googulator>well, not quite - it's built on top of the script improvements
<Googulator>but it should be easy to backport