IRC channel logs

2022-07-23.log


<vagrantc>janneke: lost track of this, but apparently some of mes's tests fail with gcc-12 https://bugs.debian.org/1012996
*vagrantc will also ping bug-mes
<stikonas>oriansj: there is a tool that might be able to generate TE image from PE32 images...
<stikonas>we might be able to use it to create TE and see if it boots
<stikonas> https://github.com/tianocore/edk2/blob/master/BaseTools/Source/C/GenFw/GenFw.c
<stikonas>and TE image does not run in UEFI shell after processing with this tool...
<stikonas>so I guess we should forget about it and just deal with full PE header...
<oriansj>stikonas: it is ok, just consider a cost of doing business like the bloated ELF header
<oriansj>progress is made one painful step at a time
<oriansj>I am only this far: https://paste.debian.net/1248116/ it is very slow going
<oriansj>a full reboot cycle to test a single 4byte change
<oriansj>The Timestamp supposedly seems to prefer: 50 45 00 00 # Timestamp supposedly (But seems to have to be a duplicate signature???)
<oriansj>nope looks like duplication bug in standard tooling
<oriansj>so moderate correction: https://paste.debian.net/1248119/
<oriansj>and if you want to see the full file I am slowly figuring out: https://paste.debian.net/1248120/
<oriansj>it is hex0 in hex0 just no comments as it was a dump in attempt to decode the PE bits and it does self-host on UEFI
<stikonas[m]>I thought DOS stub can be zeroed...
<stikonas[m]>hmm
<stikonas[m]>maybe not...
<oriansj>well this is a very VERY rough stage
<oriansj>just trying to figure out the ugly details and be roughly correct enough to run
<stikonas>well, I have started converting hex0.S to hex0.M1
<stikonas>but yes, we can make header smaller later if possible
<oriansj>well, once it gets to a stable state, we can more effectively make tests
<oriansj>and I am making good progress today on converting your hex0.s to hex2
<stikonas>oh you are working directly...
<stikonas>well, my .M1 prototype would still be useful later
<stikonas>we can move comments to hex2 file
<stikonas>and right now I have no means of testing that M1 file
<oriansj>yeah, unfortunately I have a bad habit of working backwards
<oriansj> https://paste.debian.net/1248139/ but this is self-hosting
<oriansj>and I haven't figured out all of the non-null bits yet
<oriansj>but I am slowly converting them to null if possible
<stikonas>well, it's fine, we'll converge on something...
<stikonas>I'll later port my changes to your hex2 file
<stikonas>I'm also looking in some places at some mini optimizations. I think in command line argument processing loop I'll swap rax and rbx registers
<stikonas>cmp bl, 0x20 is one byte longer than cmp al, 0x20
<oriansj>probably a good plan
<stikonas>now I have slightly better idea which code is shorter than when I was writing hex0.S
<stikonas>in risc-v it's somewhat simpler, everything is 32-bit long
<oriansj>well that is a micro-optimization they did to make programs smaller if you use the accumulator
<oriansj>which for many code paths was the most frequently used register by default
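The size difference discussed above comes from x86 having a dedicated short opcode for comparing the accumulator against an immediate. A minimal sketch that hand-assembles both forms; the helper names are made up for illustration and are not part of any stage0 tooling:

```python
# Hand-assembled x86-64 encodings of "cmp <reg8>, imm8".
# AL has a dedicated short opcode; any other 8-bit register needs the
# generic opcode 0x80 followed by a ModRM byte.

def cmp_al_imm8(imm):
    # CMP AL, imm8  ->  3C ib
    return bytes([0x3C, imm & 0xFF])

def cmp_r8_imm8(reg, imm):
    # CMP r/m8, imm8  ->  80 /7 ib, with ModRM = 11 111 rrr (register-direct)
    modrm = 0xC0 | (7 << 3) | reg
    return bytes([0x80, modrm, imm & 0xFF])

BL = 3  # register number of BL in the ModRM r/m field

short_form = cmp_al_imm8(0x20)      # cmp al, 0x20
long_form = cmp_r8_imm8(BL, 0x20)   # cmp bl, 0x20
print(short_form.hex(" "), "vs", long_form.hex(" "))
```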
<stikonas>ok, pushed https://git.stikonas.eu/andrius/stage0-uefi/commit/93c8c5372cefb1d34e6ef4f7c0f9c7e127bf709c
<stikonas>hmm, another thing we can optimize in hex0 is add JNE8 instead of JNE32...
<stikonas>since hex0 is about 256 bytes, we shouldn't ever need 32-bit jumps
<stikonas>ok it's actually already used in stage0-posix...
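For reference, the two JNE forms differ by four bytes: the short form is opcode 75 plus a signed 8-bit displacement, the near form is 0F 85 plus a signed 32-bit displacement. A rough sketch of choosing between them; `encode_jne` is a hypothetical helper, and the displacement is assumed to be measured from the end of the jump instruction:

```python
import struct

def encode_jne(displacement):
    """Encode JNE using the shortest form that fits the displacement.

    The displacement is relative to the end of the jump instruction, so a
    real assembler has to account for the instruction's own length.
    """
    if -128 <= displacement <= 127:
        return b"\x75" + struct.pack("<b", displacement)   # JNE rel8, 2 bytes
    return b"\x0f\x85" + struct.pack("<i", displacement)   # JNE rel32, 6 bytes

short_jump = encode_jne(-16)
near_jump = encode_jne(1000)
print(short_jump.hex(" "))
print(near_jump.hex(" "))
```

The short form only reaches targets within -128..+127 bytes of the next instruction, so it fits only where every jump target is that close.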
<stikonas>oriansj: are you working on older hex0.S?
<stikonas>although I guess it's fine...
<stikonas>as long as we can figure out header stuff
<oriansj>stikonas: I am working on the output of the clang build of your hex0.S
<oriansj>So not optimized at all and mostly just to figure out the PE details
<stikonas>yeah, but I meant it's older build of hex0.S before I optimized some asm source stuff
<stikonas>but it's fine...
<oriansj>well yeah, it isn't going to be used for anything more than learning
<oriansj>so much for completely removing the dos stub
<oriansj>but we can completely zero it out
<oriansj>doesn't look like UEFI supports even shrinking it
<oriansj>so we are gonna have 64bytes of just nulls
<oriansj>unless I wish to be pithy and put some text
<oriansj>64 ascii chars of how much this is dumb
<stikonas>hmm, yes, I think I tried only zeroing it, not removing
<stikonas>I think zeroing is fine
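A sketch of what a mostly zeroed 64-byte DOS header could look like. It assumes that the `MZ` magic at offset 0 and the `e_lfanew` field at offset 0x3C (the file offset of the `PE\0\0` signature) still have to be present; the log does not say which bytes were actually kept, and the function names are illustrative only:

```python
import struct

DOS_HEADER_SIZE = 64     # size of IMAGE_DOS_HEADER
E_LFANEW_OFFSET = 0x3C   # 32-bit file offset of the "PE\0\0" signature

def zeroed_dos_header(pe_offset=DOS_HEADER_SIZE):
    header = bytearray(DOS_HEADER_SIZE)   # start with all nulls
    header[0:2] = b"MZ"                   # e_magic (assumed to be required)
    struct.pack_into("<I", header, E_LFANEW_OFFSET, pe_offset)
    return bytes(header)

def find_pe_signature(image):
    """Follow e_lfanew the way a loader would and return the PE offset."""
    if image[0:2] != b"MZ":
        raise ValueError("missing MZ magic")
    (pe_offset,) = struct.unpack_from("<I", image, E_LFANEW_OFFSET)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    return pe_offset

image = zeroed_dos_header() + b"PE\0\0"
print(find_pe_signature(image))
```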
<oriansj>and figured out the coff section: https://paste.debian.net/1248154/
<oriansj>up next is data directories and section table
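For orientation, the COFF file header that follows the `PE\0\0` signature is a fixed 20-byte little-endian structure. The sketch below unpacks it; the sample values are illustrative and are not taken from the pastes:

```python
import struct

# Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
# NumberOfSymbols, SizeOfOptionalHeader, Characteristics
COFF_HEADER = struct.Struct("<HHIIIHH")
IMAGE_FILE_MACHINE_AMD64 = 0x8664

def parse_coff_header(data, offset=0):
    names = ("machine", "number_of_sections", "time_date_stamp",
             "pointer_to_symbol_table", "number_of_symbols",
             "size_of_optional_header", "characteristics")
    return dict(zip(names, COFF_HEADER.unpack_from(data, offset)))

# Illustrative values only: 3 sections, zero timestamp, PE32+ optional header.
sample = COFF_HEADER.pack(IMAGE_FILE_MACHINE_AMD64, 3, 0, 0, 0, 0xF0, 0x0022)
fields = parse_coff_header(sample)
print(fields)
```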
<stikonas>oriansj: any idea what File and Section alignment is?
<stikonas>yesterday when I was trying to convert PE file to TE it was complaining that File and Section alignments are not equal
<oriansj>I've got a few guesses
<oriansj>but the mChecksum bit is the big puzzle I am trying to figure out
<oriansj>lol, the checksum field isn't even looked at
<oriansj>zero the sucker
<oriansj>FileAlignment seems to be the byte offset in the file which is the program itself
<stikonas>and hex0.M1 is now ready https://git.stikonas.eu/andrius/stage0-uefi/src/branch/main/amd64/Development/hex0.M1
<oriansj>neat
<stikonas>once we have header ready, conversion to hex0 shouldn't be too hard
<stikonas>it's definitely easier than on risc-v
<oriansj>also you just helped me figure out something
<stikonas>?
<oriansj>LOADED_IMAGE_PROTOCOL
<oriansj> https://paste.debian.net/1248156/
<oriansj>it is a ways down but now I have a proper label for it
<stikonas>I suspect we can get rid of all this padding at the bottom
<oriansj>I thought so too but there are non-null bits that appear to be function and haven't figured out moving them yet
<oriansj>and trimming off the null bytes at the end currently makes it unsupported for some reason
<stikonas>hmm
<stikonas>maybe there is file size somewhere...
<stikonas>or maybe alignment is wrong...
<oriansj>well let's see what happens if I shrink the alignment
<oriansj>to say 1 byte
<oriansj>with no other changes
<oriansj>ok that didn't break anything
<oriansj>and let's see if it enables us to delete some nulls
<oriansj>nope
<oriansj>maybe file alignment?
<oriansj>well setting file alignment to 1 didn't seem to break anything
<oriansj>nope still stuck with those nulls
<oriansj>stikonas: to your previous question: SectionAlignment and FileAlignment: Both members indicate the alignment of the sections of PE in the memory and in the file, respectively. When an executable is mapped into the memory, each section of that executable starts at a virtual address which is actually the multiple of this value.
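The alignment rule quoted above amounts to rounding each section's start up to the next multiple of the alignment value. A small sketch with hypothetical sizes (0x200 and 0x1000 are common values, not ones read from the pastes):

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

headers_end = 0x19A   # hypothetical end of the headers in the file

print(hex(align_up(headers_end, 0x200)))    # with FileAlignment = 0x200
print(hex(align_up(headers_end, 0x1000)))   # with SectionAlignment = 0x1000
print(hex(align_up(headers_end, 1)))        # alignment of 1: nothing is padded
```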
<stikonas>hmm, for us I guess it doesn't matter too much where it is mapped in the memory
<stikonas>oriansj: minor issue but your LOADED_IMAGE_PROTOCOL_8 label is off by 2 bytes
<stikonas>although that whole string is just single 128-bit UUID...
<stikonas>it's just that we have to load it in two operations
<oriansj>stikonas: really?
<stikonas>well, you have 6 bytes in the first part
<stikonas>and 10 bytes in the second
<stikonas>should be 8 and 8
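To illustrate the 8 and 8 split: a GUID is 16 bytes in memory, with its first three fields stored little-endian. The sketch below uses the EFI_LOADED_IMAGE_PROTOCOL_GUID value from the UEFI specification; the variable names are illustrative and do not match the labels in the pastes:

```python
import uuid

# EFI_LOADED_IMAGE_PROTOCOL_GUID as listed in the UEFI specification.
LOADED_IMAGE_PROTOCOL = uuid.UUID("5b1b31a1-9562-11d2-8e3f-00a0c969723b")

# bytes_le gives the in-memory layout: the first three fields little-endian,
# the remaining eight bytes in order.
raw = LOADED_IMAGE_PROTOCOL.bytes_le

first_half, second_half = raw[:8], raw[8:]   # two 64-bit loads
print(first_half.hex(" "))
print(second_half.hex(" "))
```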
<oriansj>I was just matching what you had
<oriansj>but easy to fix
<oriansj>thank you for helping me spot that mistake of mine
<stikonas>well, it's not a big deal
<stikonas>since in hex0 we don't need labels...
<stikonas>they'll only be important in hex2 code
<stikonas>well, I should probably start converting hex0.M1 to hex0.hex2 (without header)
<oriansj>well sorting out the header is going slowly, sorry
<stikonas[m]>No problem, that's expected
<stikonas[m]>It's harder than what I'm doing now
<oriansj>good news, I can rearrange the section headers without anything breaking
<oriansj>let's see if I can figure out how to remove one
<oriansj>it is looking like a hard no
<stikonas>oriansj: have you updated the num sections field?
<stikonas>still it might be that we need all of them...
<oriansj>well if I turn off .reloc it'll refuse to load; if I turn off .text it'll just hang forever as there are no executable bytes and if I turn off .data it'll hang after a few instructions
<stikonas>well, we had protocol GUIDs stored in .data
<stikonas>although there shouldn't be any read restrictions...
<stikonas>hmm
<oriansj>well we could move those into .text if I knew how to make it writable
<stikonas>oh, I think that's the issue
<stikonas>your older version needs them writable
<stikonas>but in my latest version it's only used read only
<stikonas>to read GUIDs
<stikonas>rootdir, fin, fout are moved into registers
<stikonas>see https://git.stikonas.eu/andrius/stage0-uefi/src/branch/main/amd64/Development/hex0.M1#L330
<stikonas>so maybe for hex0 we don't need .data
<oriansj>well if we make .text writable, we will not need .data at all
<stikonas>that's also true...
<stikonas>but even if we can't, we can get rid of it at least in hex0
<oriansj>indeed
<muurkha>usually you can't make .text writable
<muurkha>I mean I assume youre talking about UEFI and not Linux
<muurkha>*you're
<muurkha>so I don't really know
<oriansj>you are not supposed to as it enables polymorphic code and a boatload of security issues
<oriansj>but if the executable format has read, write and execute bits you can always force it into bad idea mode
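In PE, the read, write and execute bits live in the Characteristics field of each section header. A sketch of decoding them using the flag values from the PE/COFF specification; whether a given UEFI firmware honors a writable `.text` is not something this shows:

```python
IMAGE_SCN_CNT_CODE    = 0x00000020
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ    = 0x40000000
IMAGE_SCN_MEM_WRITE   = 0x80000000

def permissions(characteristics):
    """Return an 'rwx'-style string for a section's Characteristics field."""
    return "".join(
        letter if characteristics & flag else "-"
        for letter, flag in (("r", IMAGE_SCN_MEM_READ),
                             ("w", IMAGE_SCN_MEM_WRITE),
                             ("x", IMAGE_SCN_MEM_EXECUTE)))

text_flags = IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ
writable_text_flags = text_flags | IMAGE_SCN_MEM_WRITE   # "bad idea mode"

print(hex(text_flags), permissions(text_flags))
print(hex(writable_text_flags), permissions(writable_text_flags))
```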
<muurkha>I guess I haven't actually tried this but I would sort of expect Linux to override your bits in that case
<muurkha>Linux doesn't strictly enforce W^X in general. GCC nested functions building their trampolines on an executable stack still work
<muurkha>by default, anyway
<muurkha>I spent some of last weekend disassembling the trampoline building code GCC emits for RISC-V
<muurkha>JIT compilers like V8 and HotSpot are really common these days, and they can still work with W^X, they just have to issue a system call to set a page executable after they emit code and before trying to run it
<muurkha>which they have to do anyway on RISC-V in order to ensure that all the CPU cores that can see the memory have flushed it from their instruction caches
<muurkha>(this is a change in recent versions of RISC-V; previously you just had to run a FENCE.I instruction but apparently that wasn't implemented to flush instruction caches on other cores)
<oriansj>muurkha: if you have run stage0-posix you have tried it
<oriansj>we only have a single segment where we have code and data
<oriansj>we read, write and execute in it too
<muurkha>oh, I hadn't noticed
<oriansj>it is why GRSEC kernels can't run stage0-posix binaries
<muurkha>right
<oriansj>basically instant segfault the second we write anything with a :label
<oriansj>with some cleverness we could leverage malloc and a computed label mechanism to work around that but as of right now there is no way in hex2 to break an input file into separate memory segments
<stikonas>oriansj: fixed that label position and also pushed hex0.hex2 https://git.stikonas.eu/andrius/stage0-uefi/src/branch/main/amd64/Development/hex0.hex2
<stikonas>410 bytes right now
<stikonas>but obviously this will grow once we add headers
<stikonas>quite a bit bigger than in POSIX but not too bad...
<stikonas>hopefully will be under 1 KiB in total with headers
<oriansj>well I found a mystery
<oriansj>took the working hex0.hex0
<oriansj>zero'd out the hex0 code and changed the first few bytes to setting rax to 42 and returning
<oriansj>it just hangs now
<oriansj>the .text block is still the exact same size and in the same part of the file
<oriansj>if you don't believe me: here is the hex0.hex0 that works https://paste.debian.net/1248165/ and the return 42 that doesn't https://paste.debian.net/1248166/
<oriansj>do a diff and you'll see it is just a bunch of zeroing
<stikonas>yeah, I'm doing diff...
<stikonas>strange
<stikonas>oh, that's probably
<stikonas>because file size is different?
<oriansj>nope
<stikonas>hmm...
<oriansj>wc -c will show the exact same number of bytes if you build them with hex0/1/2
<stikonas>ok...
<stikonas>hmm, hanging is often code jumping to somewhere unexpected... but I don't see yet why it happens here
<oriansj>and we know mov rax, 42; ret works in assembly
<oriansj>right?
<stikonas>yes, ret should work as long as you have deallocated all previously allocated stack
<oriansj>and we just set RAX, so nothing was allocated
<stikonas>hmm, yes, it is hanging, I can reproduce
<oriansj>and I built https://paste.debian.net/1248170/ to prove that the code should work
<oriansj>although looking at that build there is a boatload of int 3 after the return
<oriansj>which makes even less sense
<stikonas>hmm, I can make it stuck by zeroing only a few bytes
<stikonas>oriansj: this already gets stuck https://paste.debian.net/1248173/
<mihi>stikonas, oriansj: I bet your problem is the blob starting at line 626 in your pastes
<stikonas>mihi: I think even zeroing one byte after C3 makes it stuck
<stikonas>this is enough to make it stuck https://paste.debian.net/1248174/
<mihi>stikonas, you did not get my comment.
<mihi>:D
<oriansj>the '3C 00 00 00' ?
<oriansj>I don't even know what that blob is yet
<mihi>oriansj, I would suggest that you find out. Then it would make a lot more sense why changing some random bytes in the .text section will magically break loading the binary.
<mihi>hint: try to find the name of the section it is lying in.
<mihi>(of course I could just point you to the relevant part of the PE file specification, but I have the impression it would be more fun for you to find out yourself)
<stikonas[m]>Checksum?
<oriansj>or realloc
<mihi>.reloc
<mihi>(ations)
<oriansj>thank you mihi for that correction
<mihi> https://0xrick.github.io/win-internals/pe7/
<oriansj>that would do it
<oriansj>as it would be altering our instructions thinking it was data
<mihi>so I suggest you find out what an empty relocations table looks like, then your eax example should work.
<oriansj>good thing my sanity.S program should have no .reloc at all
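A sketch of why a stale `.reloc` table can corrupt replaced code. Each base relocation block is a page RVA, a block size, and 16-bit entries whose top 4 bits are the type and low 12 bits the offset within the page. The image layout, relocation target and load delta below are hypothetical and not taken from the pastes:

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, skipped by the loader
IMAGE_REL_BASED_DIR64 = 10     # add the load delta to a 64-bit value

def parse_base_relocations(reloc):
    """Yield (rva, type) for every entry in a .reloc section's raw bytes."""
    offset = 0
    while offset + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, offset)
        if block_size < 8:
            break   # malformed block or trailing padding
        end = min(offset + block_size, len(reloc))
        for entry_offset in range(offset + 8, end - 1, 2):
            (entry,) = struct.unpack_from("<H", reloc, entry_offset)
            yield page_rva + (entry & 0x0FFF), entry >> 12
        offset += block_size

def apply_relocations(image, reloc, delta):
    """Patch image (a bytearray indexed by RVA) roughly as a loader would."""
    for rva, kind in parse_base_relocations(reloc):
        if kind == IMAGE_REL_BASED_DIR64:
            (value,) = struct.unpack_from("<Q", image, rva)
            struct.pack_into("<Q", image, rva, (value + delta) & 0xFFFFFFFFFFFFFFFF)

# Hypothetical image: "mov rax, 42; ret" placed at RVA 0x1000 ...
image = bytearray(0x1020)
image[0x1000:0x1008] = bytes.fromhex("48c7c02a000000c3")

# ... while a leftover relocation still points at RVA 0x1000.
reloc = struct.pack("<IIHH", 0x1000, 12,
                    (IMAGE_REL_BASED_DIR64 << 12) | 0x000,
                    IMAGE_REL_BASED_ABSOLUTE << 12)

apply_relocations(image, reloc, delta=0x10000)
print(image[0x1000:0x1008].hex(" "))
```

With this made-up delta, the third byte changes from C0 to C1, so the instruction becomes `mov rcx, 42` and RAX is never set. Other deltas would damage the bytes differently.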
<mihi>about the R^X: EFI's AllocatePages has options to allocate code, or allocate data, but no option to have both. And reallocating the page will break when currently executing :)
<mihi>probably there are undocumented lower-level functions, but I would not expect them to be available on every implementation.
<muurkha>interesting, thanks! you mean W^X, right?
<mihi>yep, sorry :)
<muurkha>stupid, sexy UEFI
<oriansj>lol
<mihi>Also, I believe the DOS stub size may be any nonzero multiple of 8 bytes (so 8 bytes minimum), but when you shrink it, you have to change many of the section offsets.