IRC channel logs
2025-05-10.log
<oriansj> stikonas: then I think we have a bug in how M2-Mesoplanet handles conditional blocks
<stikonas> (I can merge it myself later, but publishing it for review)
<stikonas> and we'll have to replicate it across all cc_* variants
<aggi> made some progress with debugging a TinyCC statically compiled/linked python
<aggi> it's segfaulting in two extensions, _ssl/hashlib and xml/expat/elementtree
<aggi> removed those two extensions temporarily, and some portage utilities work now
<aggi> should be good enough to bootstrap and/or retain python/portage self-hosting with tinycc
<aggi> portage got a few more dependencies, which i couldn't test yet with a tinycc compiled OS release, because python was blocked for a very long time
<aggi> publishing this isn't trivial, because it's an all-or-nothing with regards to the forked portage-tree/musl-libc etc.
<aggi> i'll re-compile a complete distro with portage today, which should sufficiently test-cover tinycc python.static then
<aggi> then next, i can see whether _all_ utilities remained self-hosting for portage, and with it a tinycc driven OS release remaining self-hosting too
<aggi> anyway, git-pushing any sources isn't problematic, but i'd rather do so with a domain that could host the operating system release itself
<aggi> as an alternative, i could dissect individual ebuilds with their patch-folders each, and publish those only
<aggi> depending on what's of interest to bootstrappable or tinycc-devel; but this would complicate re-producing test-cases without a release-tagged bootable ISO
<aggi> another side note: i do not hesitate to publish _everything_ to bootstrappable and tinycc developers; but i hesitate to do so for _everyone_ else
<aggi> for various reasons; including the fact i may not want to lose control over a TinyCC driven Operating System release without the opportunity to establish a project-site hosted myself, meanwhile anyone else could easily scrape and salvage hosting it while i can't
<aggi> i think a complete somewhat-POSIX operating system release fully supported by a *nix type compiler/toolchain such as tinycc is rather valuable
<aggi> too, as a baseline for any other OS/ARCH/CC variant (for example bsd/riscv/cproc), given a verification against TinyCC partially covers issues for any other such approach
<stikonas> gtker: do you want to replicate cc_x86 changes to some of the other cc_* and if so then which ones?
<stikonas> (just asking to make sure we don't have conflicts)
<stikonas> ok, I'll start with amd64 and uefi versions then
<stikonas> at least cc_* is not too hard to modify...
<stikonas> earlier stages need more and more hex/machine code...
<stikonas> risc-v is particularly hard to work at hex0 level...
<matrix_bridge><gtker> Yeah, and it's mostly deletions and small edits, so it shouldn't be too painful
<stikonas> yeah, the previous changes were harder. Took me a while to understand why aarch version was not working (that FBRANCH vs RBRANCH thingy)
<stikonas> though I wasn't that familiar with aarch64 version before...
<matrix_bridge><gtker> I spent a few hours not being able to get it working before I gave up and hoped that it was just some toolchain specific issue that I didn't know about 😄
<stikonas> aarch64 hex/M1 could probably benefit from risc-v word style rework..
<stikonas> but it's quite a bit of work to rewrite it
<matrix_bridge><gtker> Yeah, it's not something that I would be super hyped about doing
<matrix_bridge><gtker> I would probably rather write a "real" assembler in cc_* compatible C so that we could use it in M2-Planet?
<stikonas> I think oriansj at some point spent a bit of time on it but didn't make enough progress
<matrix_bridge><gtker> Possibly, I'm not sure what would be best since I haven't done much assembly work, I've mostly used it for reverse engineering x86 windows programs
<matrix_bridge><gtker> Although being compatible with some existing tooling would probably be a necessity just so we get editor support for free
<stikonas> well, I mostly didn't know assembly before except for really basic stuff but learned quite a bit while writing stage0-posix-riscv64
<matrix_bridge><gtker> Being forced to write a complex application in assembly has a tendency to force you to learn assembly 😉
<agg1> x86 16bit real-mode asm caught my interest, to re-write linux2 bootcode for as86 syntax, no time yet :(
<agg1> this could shrink the circular dependency graph surrounding binutils involved with tccboot/bootloader/kernel bootstrapping, not urgent
<matrix_bridge><gtker> How does bootstrapping a kernel even work? How are you executing ELFs without a baseline kernel that must be trusted?
<agg1> one idea was/is, to revive tccboot loader (tested it a while ago against some recent libtcc1.a)
<agg1> this loader contains some real-mode asm itself, which must be processed by binutils currently; as86 would shrink that circular dependency graph
<agg1> of course, this is a different approach to bootstrappable stage0, not entirely sure yet if tccboot/linux/as86 could fit into live-bootstrap
<stikonas> on BIOS systems, builder-hex0 already gives one way of bootstrapping kernel in the traditional sense
<stikonas> on UEFI, I've started that stage0-uefi and posix-runner work but haven't done anything in the last year or so
<stikonas> I think I've got it to the stage where it could run mes
<stikonas> and for further work one needs to support loading different segments from the ELF binary
<stikonas> (at the moment posix-runner can only run fully relocatable binaries, since it loads them at the wrong virtual address)
<agg1> i'm a little worried about gcc/binutils themselves, in conjunction with those relying on autotools/autogen; at a later stage of bootstrapping
<agg1> in conjunction with the idea, relying upon i386-tcc instead, to provide a regular bootable ISO, which contains all utilities to establish binutils/gcc those at a comfortable posix shell
<agg1> the moment when bootstrappable arrived at musl-libc/tinycc, linux2 perl/autotools and python/portage can be established, _before_ gcc/binutils
<stikonas> I'm finishing stage0-uefi, then I'll do riscv versions
<agg1> this week i tested arm-tcc against musl-libc, and failed; out of curiosity, the transition towards arm-tcc isn't covered on your side either?
<stikonas> hmm, stage0-uefi will take a bit more time...
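The remark above about loading different segments from the ELF binary can be illustrated with a small sketch. It assumes the standard ELF64 program header layout; `load_span` and the struct name are hypothetical helpers for illustration and are not taken from posix-runner. The function only computes which virtual address range the loadable segments expect to occupy, which is the information a loader would need before placing them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PT_LOAD 1 /* loadable segment type from the ELF specification */

/* ELF64 program header, laid out as in the ELF specification. */
struct elf64_phdr {
	uint32_t p_type;
	uint32_t p_flags;
	uint64_t p_offset;
	uint64_t p_vaddr;
	uint64_t p_paddr;
	uint64_t p_filesz;
	uint64_t p_memsz;
	uint64_t p_align;
};

/* Hypothetical helper: report the lowest and highest virtual address
 * covered by PT_LOAD segments and return how many such segments exist. */
size_t load_span(const struct elf64_phdr *ph, size_t count,
                 uint64_t *lo, uint64_t *hi)
{
	size_t i;
	size_t loads = 0;

	assert(ph != NULL && lo != NULL && hi != NULL);
	*lo = UINT64_MAX;
	*hi = 0;
	for (i = 0; i < count; i++) {
		if (ph[i].p_type != PT_LOAD) {
			continue; /* skip headers that are not loaded into memory */
		}
		if (ph[i].p_vaddr < *lo) {
			*lo = ph[i].p_vaddr;
		}
		if (ph[i].p_vaddr + ph[i].p_memsz > *hi) {
			*hi = ph[i].p_vaddr + ph[i].p_memsz;
		}
		loads++;
	}
	return loads;
}
```

A loader that cannot map segments at their requested `p_vaddr` has to apply the same offset to every segment, which only works for binaries that are fully relocatable.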
<stikonas> need to update submodules and add all those kaem scripting changes
<agg1> seems any other than i386 will bite; and i386 itself is hit by bootcode/firmware-related bootstrapping issues too, BIOS/real-mode stuff (not to mention modern UEFI), which would need a capable assembler one way or another, hence the idea with as86 that would imply a little smaller circular dependency and provide an initial playground for myself with a suitable use-case to keep binutils optional with linux2 bootstuff.S
<stikonas> gtker: I'm still working on stage0-uefi, so if you want, feel free to take over cc_riscv*
<stikonas> having various issues with stage0-uefi... at the moment trying to build M2-Mesoplanet
<stikonas> something is also broken with output there, so I don't get meaningful error messages
<stikonas> though M2libc/stdio.c:472: suggests it's va_args stuff
<matrix_bridge><gtker> That's... very interesting output. It seems to be repeating the characters?
<matrix_bridge><gtker> I'm assuming it's outputting UTF-8 since there seems to be a mix of european and asian characters?
<matrix_bridge><gtker> What does xxd output if you redirect the output into a file and run xxd on it?
<stikonas> the first output was already redirected since on the qemu screen it was just whitespace there
<stikonas> gtker: so those random characters did not happen if I type M2-Planet.efi command line directly into UEFI shell
<stikonas> so the order of includes needed adjusting to build M2-mesoplanet
<stikonas> all those characters were not there in direct call to M2.efi
<matrix_bridge><gtker> Maybe a rogue pointer? As far as I know we don't really dealloc anything so probably not a use after free
<stikonas> anyway, not sure if I'll be able to finish it today, I'll look a bit more, but might need to come back to this problem later...
<matrix_bridge><gtker> Alright. I'll see if I can finish up the RISCVs. Do we need any others?
<stikonas> except stage0-uefi and those that are waiting on oriansj to be merged
<agg1> ^ bootstrapping python/portage and perl/autotools with tinycc
<agg1> that's all for a little while; suggestions to coordinate efforts are welcome
<stikonas> gtker: hmm, given that current stage0-uefi commit works and latest attempt at updating submodules prints garbage, I guess the easiest way would be to bisect
<stikonas> (though there are a few submodules, so need to be careful)
<stikonas> was doing manual submodule updates, not really bisecting
<stikonas> (since I had to update kaem scripts a few times)
<stikonas> UEFI is using somewhat other registers I think
<matrix_bridge><gtker> We now use rax, rbx like always. rdi as temp, rbp as base, rsp as stack, r13 as locals, and r14 + r15 as temps
<stikonas> The registers Rax, Rcx Rdx R8, R9, R10, R11, and XMM0-XMM5 are volatile and are, therefore, destroyed on function calls.
<stikonas> The registers RBX, RBP, RDI, RSI, R12, R13, R14, R15, and XMM6-XMM15 are considered nonvolatile and must be saved and restored by a function that uses them.
<matrix_bridge><gtker> Might also be > A caller must always call with the stack 16-byte aligned.
<stikonas> but then nothing was working on my current laptop
<stikonas> so i debugged it to this stack alignment issue
<stikonas> easy to do in M2libc.... More work in all the earlier programs
<stikonas> it could also be some code that was relying on non-standard M2-Planet C behaviour
<stikonas> hmm, seems to be 413c69f4a58ff3e294572549875195df9ccfb5a8
<stikonas> Allocate stack up front and always clean up locals
<stikonas> (and it's not M2libc part of the commit)
<matrix_bridge><gtker> Hmm, might be related to REGISTER_LOCALS then? Would be r13
<matrix_bridge><gtker> r13 is non-volatile, so the UEFI syscalls should restore it automatically/not touch it?
<matrix_bridge><gtker> It _could_ also be because we're manually subtracting from the stack instead of push/pop, which could make it unaligned?
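The alignment question raised above can be checked with plain arithmetic. The following is a minimal sketch that models the stack pointer as an integer, assuming the usual x86-64 convention that the stack grows towards lower addresses; the helper names are hypothetical and do not come from M2-Planet or M2libc. It shows that subtracting a locals size that is not a multiple of 16 leaves the pointer misaligned for the next call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from M2-Planet or M2libc. */

/* Returns 1 when a stack pointer value meets the 16-byte call alignment. */
int is_call_aligned(uint64_t rsp)
{
	return (rsp % 16) == 0;
}

/* Model "sub rsp, bytes": the stack grows towards lower addresses. */
uint64_t allocate_locals(uint64_t rsp, uint64_t bytes)
{
	assert(bytes <= rsp);
	return rsp - bytes;
}

/* Round rsp down to the nearest 16-byte boundary, as a fix-up would. */
uint64_t align_down_16(uint64_t rsp)
{
	return rsp & ~(uint64_t)15;
}
```

For example, starting from an aligned value of 0x7000, allocating 24 bytes gives 0x6FE8, which is not aligned, while allocating 32 bytes gives 0x6FE0, which is.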
<stikonas> hmm, the only mention of r13 is in libc-full.M1
<stikonas> where we push it before the rest of the program and pop it afterwards...
<stikonas> well, if it uses it internally, it should save/restore it
<stikonas> each syscall/function is responsible for being compatible
<stikonas> I should eat now, but I can push wip commit (to reproduce this issue) out later...
<stikonas> hmm, looking at when bug appears, I think the first broken binary is the one that uses full M2libc...
<stikonas> so perhaps something is miscompiled there
<stikonas> or maybe because hex2.efi is the one printing stuff...
<matrix_bridge><gtker> Can't see what that would be, other than having locals which would mean using r13
<stikonas> don't see anything different about r13 compared to r14 or r15
<matrix_bridge><gtker> Yeah, might be necessary to undo parts of the patch and see what fixes it
<stikonas> well, not the risc-v stuff (I hope) or the first if conditional
<stikonas> the other stuff is more intertwined, will be harder to split
<stikonas> gtker: once you get qemu running with "make qemu"
<stikonas> I find it easier to follow if I first press "F2", pick Boot Option->UEFI shell
<stikonas> and for getting file access on normal system, you could run "sudo losetup -P /dev/loop0 build/disk.img && sudo mount /dev/loop0p1 /media"
<matrix_bridge><gtker> I'm also not sure how I can tell that it fails? After M2-Mesoplanet finishes I'm dropping in "UEFI Interactive Shell v2.2, EDK II, UEFI v2.70"?
<stikonas> gtker: in UEFI shell just "command > output"
<matrix_bridge><gtker> Ah wait, if I type "fs0:" and "bootx64" it fails with a lot of blank output and "Subprocess error"
<stikonas> subprocess error is just because I didn't update checksums
<stikonas> it probably still creates correct files at this stage
<stikonas> not sure if that's the only bug on the way to top of the tree
<stikonas> but this is the first commit with problems
<stikonas> there is a way to attach gdb if we can't figure out any other way
<stikonas> but it's somewhat trickier than on posix
<stikonas> you need to insert an infinite loop where you want to have a breakpoint
<stikonas> that should interrupt the execution there and then you can jump into the code after the loop and step from there
<matrix_bridge><gtker> It could also be because you're modifying the stack pointer in order to align it?
<stikonas> hmm, but that should be restored after UEFI call
<matrix_bridge><gtker> Or maybe that doesn't make sense. Since there are no locals in the UEFI function
<stikonas> mov_rsp,[rsp+BYTE] %40 should restore the correct pre-alignment stack
<matrix_bridge><gtker> The function that is called in e.g. "__uefi_2", should that have a particular ABI?
<stikonas> and function arguments are preloaded before stack alignment
<matrix_bridge><gtker> Are they functions supplied by UEFI, or are they our functions?
<stikonas> you just call various functions that you know by pointer
<stikonas> e.g. memory allocation is then return __uefi_3(memory_type, size, pool, _system->boot_services->allocate_pool);
<stikonas> __uefi_3 wrapper should take care of that stack alignment stuff
<stikonas> also UEFI calling convention tells to leave shadow stack space
<stikonas> even if first 4 arguments are passed in registers
<stikonas> those __uefi_ wrappers should ensure that too
<matrix_bridge><gtker> Alright, see ya. I'll see if I can figure it out but I also might not have a lot of time
<matrix_bridge><Andrius Štikonas> I just want to avoid adding too many new features till we fix this
<matrix_bridge><Andrius Štikonas> To avoid more and more problems piling up
<matrix_bridge><gtker> stikonas: I've found a "fix", or at least the offending code. If you remove the if in "declare_function" and re-add the "emit_push" loop I get output again. I believe it has something to do with how the old way aligns to "register_size"
<matrix_bridge><gtker> Aligning the stack with "locals_depth = locals_depth + (32 - (locals_depth % 32));" inside the "if(locals_depth != 0)" on line 3902 also has an effect on the output
<matrix_bridge><gtker> stikonas: I believe I found at least one solution that seems to work for me. https://github.com/gtker/M2-Planet/tree/broken-fix It seems like UEFI doesn't like it when we just allocate the stack without touching it. Weirdly enough changing the amount allocated by a fixed amount on line ~3901 also affects the output in various ways.
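The padding expression quoted above has one property worth noting when experimenting with it: it always adds between 1 and 32 bytes, so a depth that is already a multiple of 32 grows by a full 32. The sketch below compares it with a conventional round-up. Both function names are hypothetical wrappers for illustration and are not part of M2-Planet; whether the extra padding matters for the behaviour seen in the log is not established there.

```c
#include <assert.h>

/* The expression quoted in the log, wrapped in a hypothetical helper. */
int pad_as_quoted(int locals_depth)
{
	return locals_depth + (32 - (locals_depth % 32));
}

/* Conventional round-up to a power-of-two alignment;
 * values that are already aligned are returned unchanged. */
int align_up(int value, int alignment)
{
	assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
	return (value + alignment - 1) & ~(alignment - 1);
}
```

For a depth of 8 or 40 both variants agree (32 and 64 respectively); for a depth of 32 the quoted expression yields 64 while the round-up yields 32.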