IRC channel logs

2025-05-10.log


<oriansj>stikonas: then I think we have a bug in how M2-Mesoplanet handles conditional blocks
<stikonas>gtker, oriansj: I've now created a PR for cc_x86 to remove support for CONSTANT and unions: https://github.com/oriansj/stage0-posix-x86/pull/9
<stikonas>(I can merge it myself later, but publishing it for review)
<stikonas>and we'll have to replicate it across all cc_* variants
<aggi>made some progress with debugging a TinyCC statically compiled/linked python
<aggi>it's segfaulting in two extensions, _ssl/hashlib and xml/expat/elementtree
<aggi>removed those two extensions temporarily, and some portage utilities work now
<aggi>should be good enough, to bootstrap and/or retain python/portage self-hosting with tinycc
<aggi>portage got a few more dependencies, which i couldn't test yet with a tinycc compiled OS release, because python was blocked a very long time
<aggi>publishing this isn't trivial, because it's an all-or-nothing with regards to the forked portage-tree/musl-libc etc.
<aggi>i'll re-compile a complete distro with portage today, which should sufficiently test-cover tinycc python.static then
<aggi>then next, i can see if _all_ utilities remained self-hosting for portage, and with it a tinycc driven OS release remaining self-hosting too
<aggi>anyway, git-pushing any sources isn't problematic, but i rather do so with a domain that could host the operating system release itself
<aggi>as an alternative, i could dissect individual ebuilds with their patch-folders each, and publish those only
<aggi>depending on what's of interest to bootstrappable or tinycc-devel; but this would complicate re-producing test-cases without a release-tagged bootable ISO
<aggi>another side note: i do not hesitate to publish _everything_ to bootstrappable and tinycc developers; but i hesitate to do so for _everyone_ else
<aggi>for various reasons; including the fact i may not want to lose control over a TinyCC driven Operating System release without the opportunity to establish a project-site hosted myself, meanwhile anyone else could easily scrape and salvage hosting it while i can't
<aggi>i think a complete somewhat-POSIX operating system release fully supported by a *nix type compiler/toolchain such as tinycc is rather valuable
<aggi>too, as a baseline for any other OS/ARCH/CC variant (for example bsd/riscv/cproc), given a verification against TinyCC partially covers issues for any other such approach
<stikonas>gtker: do you want to replicate cc_x86 changes to some of the other cc_* and if so then which ones?
<stikonas>(just asking to make sure we don't have conflicts)
<matrix_bridge><gtker> I can do armv7l and aarch64 right now
<stikonas>ok, I'll start with amd64 and uefi versions then
<stikonas>at least cc_* is not too hard to modify...
<stikonas>earlier stages need more and more hex/machine code...
<stikonas>risc-v is particularly hard to work at hex0 level...
<matrix_bridge><gtker> Yeah, and it's mostly deletions and small edits, so it shouldn't be too painful
<stikonas>yeah, the previous changes were harder. Took me a while to understand why aarch version was not working (that FBRANCH vs RBRANCH thingy)
<stikonas>though I wasn't that familiar with aarch64 version before...
<matrix_bridge><gtker> I spent a few hours not being able to get it working before I gave up and hoped that it was just some toolchain specific issue that I didn't know about 😄
<matrix_bridge><gtker> Good spot on the FBRANCH/RBRANCH though
<stikonas>aarch64 hex/M1 could probably benefit from risc-v word style rework..
<stikonas>but it's quite a bit of work to rewrite it
<stikonas>would probably take a month
<matrix_bridge><gtker> Yeah, it's not something that I would be super hyped about doing
<matrix_bridge><gtker> I would probably rather write a "real" assembler in cc_* compatible C so that we could use it in M2-Planet?
<stikonas>oriansj: please merge https://github.com/oriansj/stage0-posix-armv7l/pull/4
<stikonas>gtker: something GAS compatible?
<stikonas>I think oriansj at some point spent a bit of time on it but didn't make enough progress
<stikonas> https://github.com/oriansj/M3-Meteoroid though probably not too much to harvest from here
<stikonas>it even predates M2libc...
<matrix_bridge><gtker> Possibly, I'm not sure what would be best since I haven't done much assembly work, I've mostly used it for reverse engineering x86 windows programs
<matrix_bridge><gtker> Although being compatible with some existing tooling would probably be a necessity just so we get editor support for free
<stikonas>well, I mostly didn't know assembly before except for really basic stuff but learned quite a bit while writing stage0-posix-riscv64
<matrix_bridge><gtker> Being forced to write a complex application in assembly has a tendency to force you to learn assembly 😉
<agg1>x86 16bit real-mode asm caught my interest, to re-write linux2 bootcode for as86 syntax, no time yet :(
<agg1>this could shrink the circular dependency graph surrounding binutils involved with tccboot/bootloader/kernel bootstrapping, not urgent
<matrix_bridge><gtker> How does bootstrapping a kernel even work? How are you executing ELFs without a baseline kernel that must be trusted?
<agg1>one idea was/is, to revive tccboot loader (tested it while ago against some recent libtcc1.a)
<agg1>this loader contains some real-mode asm itself, which must be process by binutils currently; as86 would shrink that circular dependency graph
<agg1>*processed
<agg1>of course, this is a different approach to bootstrappable stage0, not entirely sure yet if tccboot/linux/as86 could fit into live-bootstrap
<stikonas>on BIOS systems, builder-hex0 already gives one way of bootstrapping kernel in the traditional sense
<stikonas>on UEFI, I've started that stage0-uefi and posix-runner work but haven't done anything in the last year or so
<stikonas>I think I've got it to the stage where it could run mes
<stikonas>and for further work one needs to support loading different segments from the ELF binary
<stikonas>and we also need to implement paging
<stikonas>(at the moment posix-runner can only run fully relocatable binaries, since it loads them in the wrong virtual address)
<agg1>i'm a little worried about gcc/binutils themselves, in conjunction with those relying on autotools/autogen; at a later stage of bootstrapping
<agg1>in conjunction with the idea, relying upon i386-tcc instead, to provide a regular bootable ISO, which contains all utilities to establish binutils/gcc those at a comfortable posix shell
<agg1>the moment when bootstrappable arrived at musl-libc/tinycc, linux2 perl/autotools and python/portage can be established, _before_ gcc/binutils
<matrix_bridge><gtker> stikonas: I'll do the stage0 repo
<stikonas>thanks
<stikonas>I'm finishing stage0-uefi, then I'll do riscv versions
<agg1>this week i tested arm-tcc against musl-libc, and failed; out of curiosity, the transition towards arm-tcc isn't covered on your side either?
<agg1> https://lists.gnu.org/archive/html/tinycc-devel/2025-05/msg00009.html (not to mention any suitable kernel for arm/64 or riscv-64)
<stikonas>hmm, stage0-uefi will take a bit more time...
<stikonas>need to update submodules and add all those kaem scripting changes
<agg1>seems any other than i386 will bite; and i386 itself is hit by bootcode/firmware-related bootstrapping issues too, BIOS/real-mode stuff (not to mention modern UEFI), which would need a capable assembler one way or another, hence the idea with as86 that would imply a little smaller circular dependency and provide an initial playground for myself with a suitable use-case to keep binutils optional with linux2 bootstuff.S
<stikonas>gtker: I'm still working on stage0-uefi, so if you want, feel free to take over cc_riscv*
<stikonas>having various issues with stage0-uefi... at the moment trying to build M2-Mesoplanet
<stikonas> https://paste.debian.net/1373986/
<matrix_bridge><gtker> Alright, I'll start with RISCV32
<stikonas>something is also broken with output there, so I don't get meaningful error messages
<stikonas>though M2libc/stdio.c:472: suggests it's va_args stuff
<matrix_bridge><gtker> That's... very interesting output. It seems to be repeating the characters?
<matrix_bridge><gtker> I'm assuming it's outputting UTF-8 since there seems to be a mix of european and asian characters?
<stikonas>yeah, some garbage...
<stikonas>yeah, probably
<matrix_bridge><gtker> What does xxd output if you redirect the output into a file and xxd on it?
<stikonas>nothing obvious...
<stikonas>seems like random stuff...
<stikonas>that's the end: https://paste.debian.net/1373989/
<stikonas>the first output was already redirected since on the qemu screen it was just whitespace there
<stikonas>gtker: so those random characters did not happen if I type M2-Planet.efi command line directly into UEFI shell
<stikonas>so the order of includes needed adjusting to build M2-Mesoplanet
<matrix_bridge><gtker> Nice, so it's been fixed?
<stikonas>need to test...
<stikonas>gtker: doesn't seem to work somehow :(
<stikonas>I'll keep digging
<matrix_bridge><gtker> Alright, I'll look at riscv32 and 64
<stikonas>thanks
<stikonas>perhaps some bug in kaem too
<stikonas>all those characters were not there in direct call to M2.efi
<matrix_bridge><gtker> Maybe a rogue pointer? As far as I know we don't really dealloc anything so probably not a use after free
<stikonas>yeah, maybe...
<stikonas>anyway, not sure if I'll be able to finish it today, I'll look a bit more, but might need to come back to this problem later...
<matrix_bridge><gtker> Alright. I'll see if I can finish up the RISCVs. Do we need any others?
<stikonas>I guess with risc-v it will be all
<stikonas>except stage0-uefi and those that are waiting on oriansj to be merged
<agg1>low prio -> https://lists.gnu.org/archive/html/tinycc-devel/2025-05/msg00011.html
<agg1>^ bootstrapping python/portage and perl/autotools with tinycc
<agg1>that's all for a little while; suggestions to coordinate efforts are welcome
<stikonas>gtker: hmm, given that current stage0-uefi commit works and latest attempt at updating submodules prints garbage, I guess the easiest way would be to bisect
<stikonas>(though there are a few submodules, so need to be careful)
<stikonas>otherwise I'm lost at what went wrong
<stikonas>gtker: some progress on bisecting...
<stikonas>seems like M2-Planet PR #120
<matrix_bridge><gtker> Hmm, maybe something is clobbering registers?
<matrix_bridge><gtker> Can you narrow it down to a commit?
<stikonas>yeah, will do so
<stikonas>I just got it to the PR level
<stikonas>was doing manual submodule updates, not really bisecting
<stikonas>(since I had to update kaem scripts a few times)
<stikonas>UEFI is using somewhat other registers I think
<stikonas>cause of UEFI calling conventions
<stikonas>so it's likely that you are right
<matrix_bridge><gtker> We now use rax, rbx like always. rdi as temp, rbp as base, rsp as stack, r13 as locals, and r14 + r15 as temps
<stikonas>The registers Rax, Rcx, Rdx, R8, R9, R10, R11, and XMM0-XMM5 are volatile and are, therefore, destroyed on function calls.
<stikonas>The registers RBX, RBP, RDI, RSI, R12, R13, R14, R15, and XMM6-XMM15 are considered nonvolatile and must be saved and restored by a function that uses them.
<matrix_bridge><gtker> Might also be > A caller must always call with the stack 16-byte aligned.
<matrix_bridge><gtker> I'm not aware of us intentionally doing that?
<stikonas>I think it was in M2libc
<stikonas>let me find it
<stikonas>gtker: see https://github.com/oriansj/M2libc/blob/0247ef9b18945d29f20433a15c7b6a4729e07673/uefi/uefi.c#L402
<matrix_bridge><gtker> Ah, OK
<stikonas>I didn't do that initially
<stikonas>and it worked on my old laptop
<stikonas>but then nothing was working on my current laptop
<stikonas>so i debugged it to this stack alignment issue
<stikonas>had to update all those calls
<stikonas>easy to do in M2libc.... More work in all the earlier programs
<stikonas>especially all those hex0...
<stikonas>it could also be some code that was relying on non-standard M2-Planet C behaviour
<stikonas>but let me finish bisecting
<stikonas>hmm, seems to be 413c69f4a58ff3e294572549875195df9ccfb5a8
<stikonas>Allocate stack up front and always clean up locals
<stikonas>(and it's not M2libc part of the commit)
<matrix_bridge><gtker> Hmm, might be related to REGISTER_LOCALS then? Would be r13
<stikonas>yeah, that needs restoring apparenty
<matrix_bridge><gtker> r13 is non-volatile, so the UEFI syscalls should restore it automatically/not touch it?
<stikonas>*apparently
<stikonas>oh indeed
<matrix_bridge><gtker> It _could_ also be because we're manually subtracting from the stack instead of push/pop, which could make it unaligned?
<stikonas>hmm, the only mention of r13 is in libc-full.M1
<stikonas>where we push it before the rest of the program and pop it afterwards...
<stikonas>(since it is non-volatile)
<matrix_bridge><gtker> Could UEFI use it internally and clobber it?
<matrix_bridge><gtker> Is it only UEFI that has this problem?
<stikonas>well, if it uses it internally, it should save/restore it
<stikonas>each syscall/function is responsible to be compatible
<stikonas>I should eat now, but I can push wip commit (to reproduce this issue) out later...
<stikonas>hmm, looking at when bug appears, I think the first broken binary is the one that uses full M2libc...
<stikonas>so perhaps something is miscompiled there
<stikonas>or maybe because hex2.efi is the one printing stuff...
<stikonas>gtker: hmm write function is here https://github.com/oriansj/M2libc/blob/0247ef9b18945d29f20433a15c7b6a4729e07673/uefi/unistd.c#L222 do you see anything that would interfere with your commit?
<stikonas>pushed broken repo state into broken branch: https://git.stikonas.eu/andrius/stage0-uefi/src/branch/broken
<matrix_bridge><gtker> Can't see what that would be, other than having locals which would mean using r13
<stikonas>anyway, no clue for now
<stikonas>don't see anything different about r13 compared to r14 or r15
<matrix_bridge><gtker> Yeah, might be necessary to undo parts of the patch and see what fixes it
<stikonas>well, not the risc-v stuff (I hope) or the first if conditional
<matrix_bridge><gtker> By first if do you mean "function->locals != NULL"?
<stikonas>yeah
<stikonas>that one I tested in isolation
<stikonas>seemed fine
<stikonas>and it's just micro-optimization...
<matrix_bridge><gtker> Trying to get a qemu up and running
<stikonas>the other stuff is more intertwined, will be harder to split
<stikonas>gtker: once you get qemu running with "make qemu"
<stikonas>I find it easier to follow if I first press "F2", pick Boot Option->UEFI shell
<stikonas>then type "FS0:" and "BOOTX64"
<stikonas>and for getting file access on normal system, you could run "sudo losetup -P /dev/loop0 build/disk.img && sudo mount /dev/loop0p1 /media"
<matrix_bridge><gtker> How do I redirect the output to a file?
<matrix_bridge><gtker> I'm also not sure how I can tell that it fails? After M2-Mesoplanet finishes I'm dropping in "UEFI Interactive Shell v2.2, EDK II, UEFI v2.70"?
<stikonas>gtker: in UEFI shell just "command > output"
<stikonas>same as in BASH
<matrix_bridge><gtker> Ah wait, if I type "fs0:" and "bootx64" it fails with a lot of blank output and "Subprocess error"
<stikonas>yeah, blank output is the bad thing
<stikonas>subprocess error is just because I didn't update checksums
<stikonas>it probably still creates correct files at this stage
<stikonas>but stdout is garbage
<stikonas>not sure if that's the only bug on the way to top of the tree
<matrix_bridge><gtker> Alright, I'll try to have a look
<stikonas>but this is the first commit with problems
<stikonas>there is a way to attach gdb if we can't figure out any other way
<stikonas>but it's somewhat trickier than on posix
<stikonas>you need to insert an infinite loop at where you want to have a breakpoint
<stikonas>then attach gdb to qemu
<stikonas>that should interrupt the execution there and then you can jump into the code after the loop and step from there
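The spin-loop breakpoint described above can be sketched in plain C roughly as follows. This is only an illustration: `debugger_gate` and `wait_for_debugger` are hypothetical names, not anything from stage0-uefi or M2libc, and the sketch assumes a compiler that honours `volatile`. It also uses a flag cleared from the debugger rather than jumping past the loop, which is a variant of what is described in the log.

```c
/* Hypothetical gate variable: while it is non-zero the program spins.
 * volatile forces the compiler to re-read it on every iteration, so a
 * debugger that overwrites it in memory can release the loop. */
volatile int debugger_gate = 1;

/* Hypothetical helper: call this where a breakpoint is wanted.
 * Execution parks here until debugger_gate becomes zero. */
void wait_for_debugger(void)
{
    while (debugger_gate) {
        /* spin: attach gdb to qemu, interrupt, inspect state here */
    }
}
```

Once gdb is attached and execution is interrupted inside the loop, something like `set var debugger_gate = 0` followed by `stepi` should let the program continue past the loop, assuming the symbol is visible to gdb; without symbols, jumping to the address after the loop as described in the log is the fallback.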
<matrix_bridge><gtker> It could also be because you're modifying the stack pointer in order to align it?
<stikonas>hmm, but that should be restored after UEFI call
<stikonas>hmm
<matrix_bridge><gtker> Or maybe that doesn't make sense. Since there are no locals in the UEFI function
<stikonas>mov_rsp,[rsp+BYTE] %40 should restore the correct pre-alignment stack
<matrix_bridge><gtker> The function that is called in e.g. "__uefi_2", should that have a particular ABI?
<stikonas>and function arguments are preloaded before stack alignment
<stikonas>__uefi_2 means with 2 arguments
<matrix_bridge><gtker> Arguments are passed in registers?
<stikonas>some of them
<stikonas>first 4 are in rcx, rdx, r8, r9,
<stikonas>are we using these?
<matrix_bridge><gtker> Are they functions supplied by UEFI, or are they our functions?
<stikonas>UEFI
<matrix_bridge><gtker> OK
<matrix_bridge><gtker> We don't use those registers
<stikonas>so instead of syscalls on UEFI
<stikonas>you just call various functions that you know by pointer
<stikonas>e.g. memory allocation is then return __uefi_3(memory_type, size, pool, _system->boot_services->allocate_pool);
<matrix_bridge><gtker> Hmm, OK
<stikonas>__uefi_3 wrapper should take care of that stack alignment stuff
<stikonas>also UEFI calling convention tells to leave shadow stack space
<stikonas>even if first 4 arguments are passed in registers
<stikonas>those __uefi_ wrappers should ensure that too
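As a rough illustration of the two requirements just mentioned (16-byte stack alignment at the call and 32 bytes of shadow space), the arithmetic a wrapper has to perform can be modelled in C as below. This is a sketch under stated assumptions, not the actual `__uefi_N` code from M2libc: `prepare_call_stack`, `STACK_ALIGNMENT` and `SHADOW_SPACE` are hypothetical names, and the stack pointer is treated as a plain integer.

```c
#include <stdint.h>

/* Values from the x64 calling convention used by UEFI: the stack must be
 * 16-byte aligned at the call, and the caller reserves 32 bytes of shadow
 * space for the four register arguments (rcx, rdx, r8, r9). */
#define STACK_ALIGNMENT 16u
#define SHADOW_SPACE    32u

/* Hypothetical model: given the current stack pointer and the number of
 * arguments that do not fit in registers, return the stack pointer value
 * to use immediately before the call instruction. */
uint64_t prepare_call_stack(uint64_t rsp, unsigned stack_args)
{
    /* room for shadow space plus one 8-byte slot per stack argument */
    uint64_t needed = SHADOW_SPACE + 8u * (uint64_t)stack_args;
    uint64_t sp = rsp - needed;

    /* round down to the alignment boundary (the stack grows downwards) */
    sp &= ~(uint64_t)(STACK_ALIGNMENT - 1u);
    return sp;
}
```

A real wrapper also has to remember the original stack pointer so it can be restored after the call, which is the role the log attributes to `mov_rsp,[rsp+BYTE] %40`.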
<stikonas>anyway, I need to go today
<stikonas>we can continue tomorrow
<stikonas>(or next week...)
<matrix_bridge><gtker> Alright, see ya. I'll see if I can figure it out but I also might not have a lot of time
<matrix_bridge><Andrius Štikonas> Sure, no rush
<matrix_bridge><Andrius Štikonas> We can debug it later
<matrix_bridge><Andrius Štikonas> I just want to avoid adding too many new features till we fix this
<matrix_bridge><Andrius Štikonas> To avoid more and more problems piling up
<matrix_bridge><gtker> Yeah, definitely
<matrix_bridge><gtker> stikonas: I've found a "fix", or at least the offending code. If you remove the if in "declare_function" and readd the "emit_push" loop I get output again. I believe it has something to do with how the old way aligns to "register_size"
<matrix_bridge><gtker> Aligning the stack with "locals_depth = locals_depth + (32 - (locals_depth % 32));" inside the "if(locals_depth != 0)" on line 3902 also has an effect on the output
<matrix_bridge><gtker> stikonas: I believe I found at least one solution that seems to work for me. https://github.com/gtker/M2-Planet/tree/broken-fix It seems like UEFI doesn't like it when we just allocate the stack without touching it. Weirdly enough changing the amount allocated by a fixed amount on line ~3901 also affects the output in various ways.
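One detail of the rounding expression quoted above may be relevant: it always adds between 1 and 32 bytes, so a depth that is already a multiple of 32 still grows by a full 32. The C sketch below compares it with a conventional round-up. Both function names are hypothetical and the code only illustrates the arithmetic; it makes no claim about which behaviour M2-Planet needs or whether this explains the output differences.

```c
/* The expression as quoted in the log: adds 32 - (depth % 32), which is a
 * full 32 when depth is already a multiple of 32. */
int round_up_as_quoted(int depth)
{
    return depth + (32 - (depth % 32));
}

/* Conventional round-up to a multiple of 32: already aligned values are
 * returned unchanged. Assumes depth is non-negative. */
int round_up_to_32(int depth)
{
    return (depth + 31) / 32 * 32;
}
```

For example, a depth of 8 becomes 32 with either function, while a depth of 32 becomes 64 with the quoted expression and stays 32 with the conventional one.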