IRC channel logs

2022-12-30.log

back to list of logs

<stikonas>hmm, I've started adding more UEFI support to M2-Mesoplanet. In particular tried to fix spawning but it's only partially successful
<stikonas>it can successfully runs M2-Planet but then gets stuck with M1 (so might be a bug in M1)
<stikonas>though I don't expect it to work anyway, as I haven't fixed M2libc to pull in correct #includes
<stikonas>but looping forever is not what I expected...
<oriansj>well can you reduce the M1 to the code that causes a loop and I'll fix it
<stikonas>well, first test would be to try to reproduce it in POSIX
<stikonas>otherwise it's just too hard to debug
<stikonas>which shouldn't be too hard as we M2-Planet can emit UEFI code when running on Linux
<stikonas>only then we can actually tell whether it is indeed a loop
<stikonas>it might be something like segfault
<stikonas>sicne UEFI has no memory protection, segfaults can manifest themselves in various ways
<oriansj>the stack looks wrong
<stikonas>without --architecture?
<stikonas>in the meantime I think the issue is actually not M1 getting stuck but something later, I think M1.efi worked but for some reason hex2.efi was not called
<oriansj>stikonas: I am very tempted to change the malloc/free design in a big way
<stikonas>oriansj: sure, what are you proposing?
<stikonas>that design was just something quick to be able to proceed with UEFI work
<stikonas>(which just needs add_block functionality rather than brk)
<oriansj>something a bit simpler and slightly less efficient
<stikonas>yeah, I guess that's fine
<stikonas>my first idea was to allocate big blocks and have per block "brk" pointer
<stikonas>and then free that does nothing
<stikonas>but I think then muurkha convinced me to do first-fit malloc
<oriansj>I am thinking of doing a simple struct mnode {struct mnode* next, void* block, size_t size};
<oriansj>do an insertion sort on every malloc
<oriansj>(oh and , int used)
<oriansj>on free, we just mark used to FALSE
<stikonas>well, that is simpler
<stikonas>I actually used singly linked list to track those big blocks in M2libc/UEFI too
<stikonas>(as we also have to free them on exit)
<oriansj>on malloc, we check if there is any block big enough and !used
<oriansj>if not we brk/uefi_brk a new mnode and insert it based on size
<oriansj>it'll waste memory and take a little longer but it should be relatively easy to reason about
<stikonas>yeah, I think it should work
<stikonas>so I've looked again, the only requirement for UEFI is being able to call block allocator (https://github.com/oriansj/M2libc/blob/1b8dacd2acb82e3e598af946992f11cd6bc10e8b/amd64/uefi/uefi.c#L827) with some size (doesn't have to be the same from call to call)
<oriansj>easy to do
<oriansj>stikonas: you call __init_malloc twice for UEFI?
<stikonas>I don't think so, but le me check
<stikonas>might be accidental
<stikonas>oriansj: where is the secnd call? I can only see one in https://github.com/oriansj/M2libc/blob/1b8dacd2acb82e3e598af946992f11cd6bc10e8b/amd64/uefi/uefi.c#L472
<oriansj>in __init_malloc_uefi()
<stikonas>that's the only one as far as I'm aware
<stikonas>on UEFI libc-full.M1 does not call __init_malloc
<oriansj>and in amd64/libc-full.M1
<stikonas>that's POSIX libc
<stikonas>uefi one is in amd64/uefi/libc-full.M1
<stikonas>it's quite a bit differnet
<oriansj>but I guess that is because the UEFI version doesn't have that matching call
<stikonas>indeed
<stikonas>you can't use the wonr libc-full.M1
<oriansj>true
<stikonas>calling __init_malloc twice would be the least of your worries
<stikonas>on UEFI you need to get those system and image pointers
<stikonas>rather than argc,argv
<oriansj>and I am guessing there is some reason why UEFI's libc-*.M1 doesn't have that call too?
<stikonas>well, because I added it in C file
<oriansj>ok
<stikonas>since initialization is fairly long, and can be done in C file, I thought it will be easier
<stikonas>than doing the whole thing in assembly
<oriansj>fair enough
<stikonas>only the parts that absolutely have to be done in assembly are done so
<stikonas>in particular switching to user stack
<oriansj>generally a good plan
<stikonas>you can't do that in .c file, not even in inline assembly
<oriansj>I ask because I was thinking of requiring the call to __init_malloc prior to any calls to malloc
<stikonas>that is fine
<stikonas>that's is already a requirement
<stikonas>and UEFI respects it, __init_malloc is the first thing init() function calls
<oriansj>just trying not to break UEFI too badly for you
<stikonas> https://github.com/oriansj/M2libc/blob/1b8dacd2acb82e3e598af946992f11cd6bc10e8b/amd64/uefi/uefi.c#L757
<stikonas>no, that's fine. I already had that requirement, so you are not adding anything extra at all
<oriansj>ok with #define EXIT_MALLOC_FAILED 42 ?
<oriansj>or do you feel we don't need it
<oriansj>actually I'll just return a NULL pointer and let the C code calling malloc deal with it
<stikonas>from malloc()?
<stikonas>yeah, I think normally application is supposed to deal with malloc failures
<stikonas>though perhaps we are not that good at checking that
<stikonas>"On error, these functions return NULL and set errno."
<oriansj>I'll just return NULL and we can deal with errno if needed later
<stikonas>yeah, that's fine
<stikonas>I never bothered with errno in M2libc too
<stikonas>we don't claim M2libc to be complete POSIX anyway
<oriansj>well free takes a good bit longer
<stikonas>and does it help with segfault?
<oriansj>well, I'm still testing
<oriansj>well it fixes the segfault for --version
<oriansj>but I can't accept performance this bad
<oriansj> limit mallocs to 1 << 30 fair?
<stikonas>I suppose
<stikonas>given that we only have up to 3 GiB of memory on x86
<stikonas>1 << 30 is already 1/3 of that
<stikonas>but yes, bad performance is a pity :(
<fossy>stikonas[m]: thank you :)
<stikonas[m]>fossy: no problem. I'll leave it to you to check difference
<fossy>yep, http://ttm.sh/0xd.txt
<oriansj>here is the diff if you wanted to play with it: https://paste.debian.net/1265663/
<fossy>very odd
<oriansj>but I'll be trying to speed it up before commiting it
<fossy>i have two changes which shouldn't affect checksums at all but i'm testing if it's them now
<stikonas[m]>oriansj: I'm preparing to sleep, so not right now...
<oriansj>stikonas: fair enough
<oriansj>muurkha: I'm probably reinventing the wheel with this: https://paste.debian.net/1265664/ but emacs has this sort of functionality already baked in right?
<oriansj>because it seems like such an obviously useful feature when doing macros
<oriansj>I just couldn't figure out the keybindings or primitives that would do it
<muurkha>yeah, you can push a new mark on the mark stack with C-SPC (or C-@) and pop it back off with C-u C-SPC (or C-u C-@)
<muurkha>if you're writing editing commands in Lisp though it's usually better to use save-excursion
<muurkha>or, as the set-mark documentation points out, store the location in a Lisp variable, like (let ((x (point))) ...)
<oriansj>well I use a stack variable
<oriansj>and I needed to preserve C-SPC for block selecting
<muurkha>C-SPC still works for block selecting when you do this
<muurkha>the problem is more that there are other commands that push marks on the stack you might not be expecting
<oriansj>muurkha: probably what was causing the issues; which resulting in me writting the little functions
<oriansj>probably best to keep it around, as it is highly unlikely for anyone to pick the same namespace for a variable with the same name
<oriansj>well free is probably the absolutely most expensive function
<oriansj>ok, this doesn't quite make sense to me; M1/hex2 don't ever call free
<oriansj>but once free is enabled => hang
<oriansj>ok, got that working and removed the need for __init_malloc() to even be called
<oriansj>but I'll leave a stub there, just in case stikonas has a good reason to keep it around for UEFI
<oriansj>and the slowdown is now reasonable
<oriansj>so I think I solved all of our memory allocation problems in M2libc
<oriansj>(and I made it easy to turn off free(void*) just in case)
<oriansj>>.<
<oriansj>no, that isn't quite right
<oriansj>well last = i->next; was annoying to find
<oriansj>but I think I cleared out the free(void*) bugs
<oriansj>and now M2libc should be free of malloc bugs
<oriansj>(at the cost of slighly worse performance on free calls)
<stikonas>oriansj: some bad news, you did break uefi part...
<stikonas>at least in my qemu with limitted RAM
<stikonas>but perhaps with change in how malloc allocates, we should instead fix UEFI part
<stikonas>it basically dous double tracking now
<stikonas>before malloc was allocating fairly large blocks (say 1 MiB)
<stikonas>and then allocate required number of 4K size pages
<stikonas>(and it would lose 1 page when it adds list tracking)
<stikonas>now I think _malloc_add_new might be quite small
<stikonas>so even if you request 10 bytes, it will still allocate 4K page
<stikonas>I could possibly switch to allocate_pool instead of allocate_pages
<stikonas>that is measured in bytes rather than pages
<stikonas>but in any case if we don't want to do double tracking of these small allocations, we need to add a function to stdlib.c to free all used block
<stikonas>s/block/blocks/
<stikonas>possibly something optional that cleanup() can call
<oriansj>stikonas: the solution is probably simpler
<stikonas>yes, it will be simpler
<stikonas>I just need to add _free_all() function, remove block tracking in uefi.c and switch it to allocate_pool
<oriansj>well if the smallest allocation UEFI can do is 4K, lets just make that the new floor
<oriansj>and then the logic for _malloc_uefi can be quite simple indeed
<stikonas>oriansj: no, I can do any allocation in UEFI
<stikonas>it's just a different call
<stikonas>to make it more confusing UEFI has to calls
<stikonas>AllocatePool and AllocatePages
<stikonas>probably one is more optimal than the other in some cases but I don't think we care about optimizing so much, so we can use both
<stikonas> https://edk2-docs.gitbook.io/edk-ii-uefi-driver-writer-s-guide/5_uefi_services/51_services_that_uefi_drivers_commonly_use/511_memory_allocation_services
<oriansj>well I am guessing AllocatePages is the optimal one for performance
<stikonas>indeed because it is alligned to page boundary
<stikonas>but I think it's fine to use allocate_pool
<stikonas>I used it in all early programs up to cc_amd64
<oriansj>even with 4K as the smallest allocation we don't use past 256MB
<stikonas>hmm, I wonder why though then we ran out of resources in UEFI as it stands now
<stikonas>the thing that returned out of resources was some step of building M2-Mesoplanet
<stikonas>I think M1 step
<oriansj>I'm guessing the allocatePool caps out at 200MB
<oriansj>Maximum resident set size (kbytes): 990,780 (if one counts every byte of every program running in memory at the worst case)
<oriansj>(with 4K min allocations)
<oriansj>if we do 256B min allocations: Maximum resident set size (kbytes): 72,000
<oriansj>so perhaps do the following: in _malloc_uefi; any allocation smaller than 4K do in allocatePool (we lower the allocation floor to 32bytes) and everything above 4K is done via AllocatePages
<oriansj>which should put our Max memory usage to: Maximum resident set size (kbytes): 34,556
<oriansj>(as we round up the the nearest usage << 1)
<oriansj>unbz2.c now makes it to 2 days of fuzzing without a segfault
<oriansj>I think that is stable enough
<oriansj>part of me wonders if debugging M2libc would be easier if we enabled building with GCC and standard binutils linking
<stikonas[m]>In think so
<stikonas[m]>M2-Planet binaries are hard to debug
<stikonas[m]>GCC generates much better debug info
<stikonas[m]>You can see source lines, variables, etc...
<oriansj>I guess wrapping the M2-Planetisms in #ifdef blocks and probably start getting some tests in m2libc
<stikonas[m]>We also need to build it without glibc, so need libc-full wrapper for gcc
<stikonas[m]>And inline assembly is tricky too...
<stikonas[m]>Need to write 2 copies of all syscalls
<oriansj>kind of unavoidable
<stikonas>ok, in the meantime I've pulled in new M2libc into stage0-posix to fix segfault in M2-Mesoplanet
<stikonas>oriansj: so I think we'll be able to survive with just allocatepool, no need to do allocatepages
<stikonas>I've done part of my changes
<stikonas>and with end-of-application cleanup disabled, it still runs all the way to mescc-tools-extra
<stikonas>but I still need to add free_pool calls at on cleanup
<stikonas>which get stuck right now
<stikonas>I tried to iterate over _allocated_list and _free_list and free everything there, but I guess that's not right
<stikonas>ok, looking closer at it and it's a bit problematic
<stikonas>we have n = _malloc_uefi(sizeof(struct _malloc_node)); and then n->block = _malloc_uefi(size);
<stikonas>which means so I'll have to free both of those
<stikonas>ok, maybe it won't be too problematic...
<oriansj>well yes, you have to free the blocks prior to the _malloc_nodes
<oriansj>which if you have 3 pointers struct _malloc_node* next; struct _malloc_node* current; void* block; it should be an easy iteration.
<oriansj>while(NULL != current) {next = current->next; free(current->block); free(current); current = next;}
<stikonas>oriansj: https://github.com/oriansj/M2libc/pull/38
<stikonas>yeah, I've done something like that
<stikonas>well, my first attempt was before I carefully looked at your code
<stikonas>maybe I can rename variables into current and next
<stikonas>rather than node and node2...
<stikonas>would be a bit nicer
<stikonas>or at least current and ndoe
<stikonas>ok, PR updated
<stikonas>oriansj: by the way, I've ended up using callbacks rather than doing this cleanup in uefi.c
<stikonas>it was a bit simpler as M2-Planet does not allow early declaration of structs
<stikonas>and right uefi.c file have to be included before stdlib.c since uefi.c defines __uefi__
<oriansj>merged
<oriansj>it certainly looks a good bit cleaner
<stikonas[m]>Indeed
<stikonas[m]>At least I got rid of double tracking...
<stikonas[m]>And calling init_malloc is indeed not necessary
<stikonas>oh, actually current problem in M2-Mesoplanet where it gets stuck is simply it calling unlink which is not implemented yet
<stikonas>that is probably good as it's nothing serious
<muurkha>yay
<stikonas>ok, unlink is here https://github.com/oriansj/M2libc/pull/39
<stikonas>hmm, and perhaps I should rethink uefi directory layout in M2libc
<stikonas>it's almost exclusively C code with very few arch specific bits: only memcpy(unameData->machine, "x86_64", 7); and __uefi_1, ..., __uefi_6 functions
<oriansj>perhaps it is wise to create a shared UEFI folder for that C code
<stikonas>oriansj: yeah, that's what I'm thinking
<stikonas>so arch specific bits can contain libc.M1 and PE32 header
<stikonas>and rest ist simply in M2libc/uefi/ folder
<oriansj>and a handful of assembly functions
<stikonas>those can even be #ifdef'ed in the shared folder
<stikonas>it's mostly calling convention
<oriansj>interesting
<stikonas>oh, bootstrap.c should still be in arch specific directory
<stikonas>but that's a separate case
<oriansj>a very special one indeed
<stikonas>also some more M2-Mesoplanet changes: https://github.com/oriansj/M2-Mesoplanet/pull/5
<oriansj>stikonas: M2-Mesoplanet changes merged
<stikonas>thanks
<stikonas>perhaps in the future we might want to differentiate between source and target os...
<stikonas>but this will do for now
<stikonas>oriansj: and unlink PR?
<stikonas>oh, also merged, thanks
<oriansj>and M2libc change also merged
<oriansj>stikonas: well we shouldn't care what the source/host OS is
<stikonas>oriansj: it matters a bit
<oriansj>(as far as the generated output is concerned)
<stikonas>for spawning, files might have extention
<oriansj>granted the host might have special rules for M2-Mesoplanet
<oriansj>but M2-Planet and below, it shouldn't matter
<stikonas>i.e M2-Planet might be M2-Planet.efi or M2-Planet.exe...
<stikonas>yeah, other than that it doesn't matter
<stikonas>hmm, I think I needto add a bit of sanity checking
<stikonas>you added extra checks, but I was based on older commit
<oriansj>yeah and M2-Mesoplanet definitely has to be aware of all of those differences
<oriansj>I guess I prefer the purity of environmentless C and the only environment like bits are those you can -D define=in
<stikonas>hmm, perhaps in that previous PR I should remove __init_macro_env("__uefi__", ...)
<stikonas>hmm
<stikonas>anyway, we can think about cross-compiling later
<oriansj>here is my thought
<stikonas>I'll first refactor uefi dirs
<stikonas>ok?
<oriansj>the default operating system is POSIX
<stikonas>and -D__uefi__ would turn on UEFI build
<stikonas>?
<stikonas>hmm
<stikonas>though spawning logic is also different
<oriansj>well we could just check for binaries
<stikonas>just -D__uefi__ wouldn't be enough
<stikonas>anyway, let's refactor M2libc uefi dirs first
<stikonas>as right now I can't even build any UEFI binaries with it
<stikonas>and it will be clearer what do do once we can test stuff
<oriansj>well the default exec stuff for the host can be hard coded
<oriansj>(inside of #ifdef blocks )
<oriansj>as we don't expect UEFI binaries to run on POSIX or Windows Systems or aarch64 binaries to run on RISC-V
<oriansj>so we have a #ifdef __uefi__ with the spawn logic and a #else block with the default logic
<oriansj>which one will be decided at compile time, not runtime
<oriansj>(same for target binary names)
<stikonas>yeah, that makes sense...
<oriansj>then the question of host identification becomes a non-question
<stikonas>and then what do we do with --os?
<stikonas>remove it
<stikonas>or repurpose it for target os
<oriansj>replace it with --operating-system for the target operating system
<oriansj>(you can keep -os or --os as a short hand if you want)
<oriansj>and I see no reason why target operating system should be any more special than target architecture
<oriansj>we can just use __init_macro_env and let standard C macro expansion do its magic
<oriansj>and you may wish to have name to __${NAME}__ conversions
<oriansj>so we can leverage more standard C operating system defintions
<stikonas>well, normally it's __linux__
<stikonas>hence I also used lowercase __uefi__
<oriansj>and we can do __dos__, __cpm__, __lose__, __vms__ and more if we want
<stikonas>oriansj: for now I have this https://github.com/oriansj/M2-Mesoplanet/pull/6/files
<stikonas>I'll have to build blood-elf on UEFI first...
<stikonas>if we want to build POSIX binaries on UEFI
<stikonas>but that's unavoidable
<stikonas>ok, just tested, seems to work and M2-Mesoplanet is now building POSIX binaries for mescc-tools-extra on UEFI