IRC channel logs
2025-05-11.log
<matrix_bridge><Andrius Štikonas> gtker: hmm, interesting... Not sure I understand why it doesn't like allocated but not touched stack though...
<matrix_bridge><Andrius Štikonas> in UEFI the original stack is fairly small and not sufficient to run M2-Planet, so actually all programs on startup allocate a bigger block of memory (I think 8 MiB) and mov rsp pointer to there
<matrix_bridge><Andrius Štikonas> so all that memory should in principle already belong to the application
<matrix_bridge><Andrius Štikonas> ok, the latest versions of submodules + your patch seem to fix everything
<matrix_bridge><Andrius Štikonas> though I still don't understand why it was happening...
<matrix_bridge><Andrius Štikonas> all the calls to UEFI functions (and hence printing too) should realign stack before passing control to UEFI
<agg1>isn't 32byte alignment less strict than 64byte? seems not plausible, because if something is 64byte aligned it's also 32byte aligned
<matrix_bridge><Andrius Štikonas> push rsp; push [rsp]; and_rsp, %-16 should align the stack to 16-bytes already
<agg1>it's probably not an alignment issue as such, but uefi expecting routines placed at a specific address (don't know what the call-flow of this is)
<agg1>then it could be a somewhat lucky hit or miss, sometimes it's working with this alignment, some other time it's another one
<matrix_bridge><gtker> stikonas: I really don't understand it either. I could also make it output things semi-correctly by deliberately adding additional stack space. The effects weren't really consistent and it did things that I would consider extremely weird like printing a space (or other invisible byte) before every character. I'm not sure if UEFI expects UTF-16 or something like that or what else could go wrong.
<matrix_bridge><gtker> I'm also not 100% fully convinced that it's entirely because we need to touch the stack, it might also have something to do with the values written to the stack in an unpredictable way
<matrix_bridge><gtker> The patch effectively does the same thing as pushing REGISTER_ZERO to the stack to allocate more stack space so I'm thinking that there might be an uninitialized variable or something that gets a good-enough value with the patch
<matrix_bridge><Andrius Štikonas> anyway, I don't have time to look at it today or even tomorrow...
<matrix_bridge><Andrius Štikonas> generally UEFI is much pickier about things than Linux kernel
<matrix_bridge><gtker> Alright. If we're adding my patch then we might want to be able to detect when we're compiling for UEFI so that we don't have to touch the stack for all architectures unnecessarily
<matrix_bridge><Andrius Štikonas> well, we could spend some more time to try to understand
<matrix_bridge><Andrius Štikonas> though it's not ideal to push a workaround without understanding
<matrix_bridge><Andrius Štikonas> I might connect gdb to it a bit later next week
<matrix_bridge><gtker> Yeah, although we do want UEFI to have parity with the others in regards to enums/CONSTANTs, so we might need to push it
<matrix_bridge><Andrius Štikonas> well, yeah, though short term non-parity is ok
<matrix_bridge><gtker> stikonas: I tried writing different immediate values to the stack instead of just using whatever was in register zero. Couldn't find a value that made it not work, so I'm leaning on it being just needing to touch the stack. How does paging work in UEFI? Do we get a catchable interrupt if we try to read/write wrong memory?
<matrix_bridge><cosinusoidally> I'm not too familiar with uefi, but I wonder if stack allocation is similar to windows where it grows the stack using a guard page. On win32 if you have more than a page worth of local variables you need to emit code to touch each stack page in turn to grow the stack. If you don't do that then the app may crash. https://devblogs.microsoft.com/oldnewthing/20220203-00/?p=106215 explains it in diagrams
<matrix_bridge><cosinusoidally> The scenario I have hit is when I essentially did something like foo(){ int bar[2000]; bar[0]=100;} that caused a write below the stack page and a crash. This only happened because I was trying to run code generated by the Linux tcc backend on win32. In win32 mode tcc would insert stack probes that mitigated the issue.
<matrix_bridge><gtker> cosinusoidally: That's what I think is happening. I don't really remember, but isn't UEFI very Windows/Microsoft inspired?
<matrix_bridge><gtker> It's also low level enough that there could realistically be an interrupt that we just aren't catching when it happens
<matrix_bridge><cosinusoidally> Interesting https://wiki.osdev.org/GNU-EFI also shows "-fshort-wchar" which does sound quite windows centric. As I mention I don't know much about uefi, but the bits I've seen seem weirdly windows specific (like using PE files).
<mihi>Protocols_Console_Support.html#efi-simple-text-output-protocol-outputstring is the issue - it takes a pointer to a CHAR16 null-terminated string (i.e. terminated by two null bytes), but you pass it a pointer to a single CHAR8. So unless the next three bytes are zero, it will cause unexpected behaviour.
<mihi>and probably depending on the actual value on the (uninitialized) stack, it manifests differently depending on how you touch the stack :)
<mihi>a pointer to buf+i will never work if there are any following chars. Probably the correct fix would be to allocate a 4-byte (or two-16bit-word) array and fill the first one only.
<matrix_bridge><gtker> Ah right, we're looping over the count. I assumed that we could just write the entire string at once like a normal write. I _believe_ that currently the "char c" is padded to 8 bytes with zeroes which is why it currently works
<mihi>you could use a single write in case you converted it to UTF-16 first :)
<matrix_bridge><gtker> Don't think we'll do that, it's probably easier to just print one char at a time 😄
<matrix_bridge><gtker> The "__uefi_3" below where we supply the count as a pointer and the buffer as chars would also be wrong, right?
<matrix_bridge><gtker> Right now I'm wondering how this even worked at all, unless the normal write just accepts a char buffer?
<mihi>file write accepts just raw bytes, which are stored to the file, not interpreted in any encoding. As long as whatever tool reading the file can cope with ASCII/latin1 encoded files, it should be fine.
<mihi>and I assume the tools reading the files will be M1/hex, which are designed to work with 8-bit characters anyway.
<matrix_bridge><gtker> That's what I'm thinking, but I can't find the file IO part of the UEFI docs
<matrix_bridge><gtker> Does UEFI have a concept of a file system outside of the actual storage devices?
<mihi>the concept is called EFI_SIMPLE_FILE_SYSTEM_PROTOCOL and it supports FAT32 only by default
<mihi>(in case you wonder why my links point to different versions of the UEFI specification - I just used the first result when googling for EFI_SIMPLE_FILE_SYSTEM_PROTOCOL and then clicked through to the write function)
<matrix_bridge><gtker> mihi: Do you know what the endianness of the UTF-16 is? Never really done development on Windows but I'm guessing it's just LE?
<mihi>gtker, yes it is UTF-16-LE
<mihi>otherwise the old code would have produced mojibake anyway :)
<mihi>(just like some old versions of CDex cd ripping software did in their Unicode ID3 tags)
<matrix_bridge><gtker> mihi: Just to make sure I'm not stupid: A single char followed by a null terminator would be char, 0, 0, 0?
<mihi>I can confirm that you are not stupid.
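[Editor's note] The byte layout confirmed above (char, 0, 0, 0 in UTF-16LE) can be sketched in C. This is only an illustration under assumptions: `widen_char_utf16le` and `utf16le_len` are hypothetical helper names, not functions from M2libc or the UEFI specification, and the length scan merely mimics how a CHAR16 consumer would look for its two-byte terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from M2libc): widen one 8-bit character into a
 * null-terminated UTF-16LE string. Bytes are written explicitly so the
 * layout does not depend on the host's endianness. */
static void widen_char_utf16le(unsigned char c, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = c; /* low byte of the character's code unit */
    out[1] = 0; /* high byte (zero for ASCII/latin1) */
    out[2] = 0; /* low byte of the terminator */
    out[3] = 0; /* high byte of the terminator */
}

/* Counts 16-bit code units before the terminator, the way a CHAR16
 * consumer would scan the buffer. The caller must guarantee that a
 * terminator exists inside the buffer. */
static size_t utf16le_len(const unsigned char *s)
{
    size_t n = 0;
    while ((s[2 * n] | (s[2 * n + 1] << 8)) != 0) {
        n++;
    }
    return n;
}
```

With these helpers, widening 'A' yields the bytes 41 00 00 00 and a length of one code unit. Scanning the 8-bit bytes 'A', 'B', 'C', 0, 0, 0 instead gives two code units (0x4241 and 0x0043), which is one way to picture why passing a pointer into a plain char buffer produced odd output.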
<matrix_bridge><Andrius Štikonas> gtker: UEFI boots with identity map between physical and virtual memory
<matrix_bridge><Andrius Štikonas> It is allowed to set up different paging map as long as you restore identity map before calling UEFI functions
<matrix_bridge><Andrius Štikonas> Anyway, I can't look much more right now as I'm taking flight home soon
<matrix_bridge><gtker> stikonas: No worries. I'm trying out a few things. Haven't gotten anything working other than my previous fix
<matrix_bridge><gtker> stikonas: Do you know which write functions are definitely used for UEFI? I've tried just inserting "exit(1)" into the ones I suspect were used but it doesn't do anything
<matrix_bridge><Andrius Štikonas> gdbing that write function might make sense...
<matrix_bridge><Andrius Štikonas> int write should be used for both stdout and file output
<matrix_bridge><Andrius Štikonas> bootstrap.c is used in M2-Planet --bootstrap-mode only
<matrix_bridge><gtker> Weird. If I insert "exit(1)" calls into those functions it just keeps on working...
<matrix_bridge><gtker> Even if I put an "exit(1)" inside "_write_stdout" it still keeps working...
<matrix_bridge><gtker> Tried with an exit inside main of M2-Planet and it exited correctly...
<matrix_bridge><gtker> But then it shouldn't be able to keep working and run M2-Mesoplanet and so on?
<matrix_bridge><Andrius Štikonas> There is a chance that it would work but would leak some resources
<matrix_bridge><gtker> Trying now to just call a function called "sdafsdgasdg(1)" which also just works. Maybe it's not finding the exit function?
<matrix_bridge><Andrius Štikonas> M2libc isn't but maybe you are editing the wrong copy?
<matrix_bridge><Andrius Štikonas> I guess it used to be zeroed by accident?
<matrix_bridge><gtker> I believe it's because (as mihi said) that "push" causes the entire register to be zeroed, but "store_value" depends on the size of the type
<matrix_bridge><Andrius Štikonas> (I was just quickly skimming through earlier conversation on my phone)
<matrix_bridge><gtker> It makes sense that changing the values loaded into uninitialized variables didn't affect it since the variable was initialized
<matrix_bridge><gtker> I think what made my fix work was changing "store_value(type_size->size)" to "store_value(register_size)"
<matrix_bridge><Andrius Štikonas> Now we understand this and have a proper fix
<matrix_bridge><gtker> It was a hack, but it effectively does the same thing as changing the variable to an "int" since it zeros out the entire memory. Not sure why adding more to the stack space made it work. Maybe the memory is just naturally zeroed out and we were lucky that we didn't overwrite the zeroes in exactly the right places?
<matrix_bridge><gtker> I think we'll need at least a 4 byte size since the first 2 bytes are the char and the other 2 bytes are the null terminator
<mihi>gtker: regardless what variable size you choose, I'd suggest to add a comment so that the next person debugging it knows what's happening there :)
<mihi>[<Andrius Štikonas> It is allowed to set up different paging map as long as you restore identity map before calling UEFI functions] : Any source for that (does not need to be now)?
<mihi>as far as I know, as long as you are in boot services mode, you may not touch any processor registers (like enable paging or more processor cores or changing interrupts). Once you switched to Runtime Services mode, you either need to call SetVirtualAddressMap right after ExitBootServices to declare your desired mapping, or you always need to switch back to identity mapping before calling any runtime function.
<mihi>but being able to enable paging in boot services mode would definitely make life easier for us (otherwise we'd have to cope without disk IO, keyboard, text output until we load our own drivers)
<mihi>So query EFI_GRAPHICS_OUTPUT_PROTOCOL to enable linear framebuffer mode, and also query the memory map and manage memory ourselves. And especially important, after exiting boot services you cannot return to EFI shell any longer..
<mihi>ah ok, makes sense. When you disable or wrap interrupts, the boot services cannot notice it anyway :)