IRC channel logs
2022-07-24.log
<oriansj>mihi: thank you for the tip, I'll have to revisit the shrinking of the DOS stub section after we get something generally usable
<oriansj>and I noticed that clang didn't make the second PE DOS stub the same size either
<stikonas>I thought the compiler would keep it the same
<oriansj>also the Characteristics part of the PE header is different too
<oriansj>and it didn't bother to set the checksum either????
<stikonas>checksum probably doesn't matter for UEFI...
<oriansj>but why did it previously set the checksum field?
<oriansj>but in fun news, the _start is kept at 0x200 so I could sync multiple fields without issue
<oriansj>I still have no idea why changing the number of sections requires one to update the Characteristics
<stikonas>oriansj: have you tried looking at what changed in characteristics?
<oriansj>well I found 4 different definitions for what the bits in characteristics mean and they don't seem to agree
<oriansj>22 20 appears to mean "Reserved", "Section contains comments or some other type of information", "Reserved", "Reserved"
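Note on decoding the bits: the Characteristics field in the COFF file header is a 16-bit little-endian bit mask, and it is a different field from the 32-bit Characteristics of each section header. The sketch below decodes the file-header field. It assumes the flag names and values from the Microsoft PE/COFF specification, and it assumes the bytes "22 20" quoted in the log are in on-disk order (so the value is 0x2022). `decode_file_characteristics` is a hypothetical helper name, not part of any tool used in the channel.

```python
import struct

# COFF file header Characteristics flags (names and values per the PE/COFF spec).
FILE_FLAGS = {
    0x0001: "IMAGE_FILE_RELOCS_STRIPPED",
    0x0002: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x0004: "IMAGE_FILE_LINE_NUMS_STRIPPED",
    0x0008: "IMAGE_FILE_LOCAL_SYMS_STRIPPED",
    0x0010: "IMAGE_FILE_AGGRESSIVE_WS_TRIM",
    0x0020: "IMAGE_FILE_LARGE_ADDRESS_AWARE",
    0x0040: "reserved",
    0x0080: "IMAGE_FILE_BYTES_REVERSED_LO",
    0x0100: "IMAGE_FILE_32BIT_MACHINE",
    0x0200: "IMAGE_FILE_DEBUG_STRIPPED",
    0x0400: "IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP",
    0x0800: "IMAGE_FILE_NET_RUN_FROM_SWAP",
    0x1000: "IMAGE_FILE_SYSTEM",
    0x2000: "IMAGE_FILE_DLL",
    0x4000: "IMAGE_FILE_UP_SYSTEM_ONLY",
    0x8000: "IMAGE_FILE_BYTES_REVERSED_HI",
}


def decode_file_characteristics(raw):
    """Decode the 2-byte little-endian Characteristics field into flag names."""
    (value,) = struct.unpack("<H", raw)
    # Walk the table from the lowest bit to the highest and keep the set ones.
    return [name for bit, name in sorted(FILE_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    # Assumption: "22 20" is the on-disk byte order, i.e. the value 0x2022.
    for name in decode_file_characteristics(bytes.fromhex("2220")):
        print(name)
```

Under those assumptions the two bytes decode to EXECUTABLE_IMAGE, LARGE_ADDRESS_AWARE and DLL. The "section contains comments or other information" wording belongs to the section-level table (0x00000200 there), which may be one reason the definitions found did not seem to agree.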
<stikonas>maybe that could be useful for all this work
<oriansj>be a good deal easier than documentation and guessing
<oriansj>guix environment --ad-hoc python-pefile --fallback
<stikonas>and then you can run pe = pefile.PE("hex0.efi") and analyze that pe struct
<stikonas>that said we'll still have to figure out how to change things without breaking UEFI
<stikonas>only one section has IMAGE_SCN_MEM_WRITE
<stikonas>yes, but I guess we can steal that flag and apply it to the .text section
<oriansj>stikonas: well we are going to figure all of that out
<oriansj>thanks for the tip, certainly makes things a good bit clearer
<stikonas>well, I just happened to stumble upon that pefile, I wasn't aware of it before
<oriansj>and hex2 -f hex0.hex0 -o hex0.efi && ./PE_INFO.py -o dump1 && meld dump dump1 lets us get a quick peek
<stikonas>well, I'll go to bed soon, but hopefully this helps you
<oriansj>hopefully I have something useful for you by the time you wake up
<oriansj>well I can cleanly delete the relocation section now, it however stalls on run (I'm guessing because it actually needed the relocations)
<oriansj>I can't believe I forgot to notice line 31
<oriansj>use: pe = pefile.PE(inputfile) instead
<oriansj>.reloc is going to be the big victory as it appears the memory address UEFI puts us in changes
<oriansj>so we are gonna have to go PC relative addressing to make this work
<fossy>stikonas[m]: oh, i forgot about that - merged
<fossy>i did look at it but didn't finish reviewing before
<fossy>also, nice work on hex0 optimizations
<oriansj>stikonas: you'll have to replace all [label] bits with [rip+label] but it'll work around things nicely
<oriansj>and we have writes to memory without hangs, no .reloc
<oriansj>(doing this in x86 and RISC-V is gonna cost a call/pop combo because no PC relative addressing support)
<muurkha>RISC-V has extensive PC-relative addressing support to avoid needing a call/pop combo
<muurkha>the B* instructions, used for conditional branches, and the JAL
instruction, used for unconditional branches and calls, are *only* PC-relative; they don't support an absolute addressing mode
<muurkha>the JALR instruction, used for returning, invoking function pointers, and indexing jump tables, does interpret its register argument as an absolute address, adding its 12-bit immediate operand to that register to get the absolute address to transfer to
<oriansj>muurkha: I am talking about loading/storing relative to the PC register
<muurkha>but loading a PC-relative base address just requires an AUIPC, same as loading and storing relative to the PC register
<oriansj>right, that stupid double instruction combo
<oriansj>I knew something annoying was needed with risc-v to do this
<muurkha>moreover you can AUIPC once to get a base address and then LW and SW relative to that base address multiple times
<muurkha>as long as you don't clobber the register you put the PC-relative base pointer into
<oriansj>muurkha: and remember to account for the diff
<oriansj>as you move away from the PC you stored in a register
<muurkha>LW and SW can index off it with a signed 12-bit offset, giving you 4 KiB of PC-relative space
<muurkha>you don't need to account for the diff in each LW and SW instruction, just the original AUIPC instruction to load the base address of your PC-relative data page
<muurkha>if you need more than 4 KiB you do need a two-instruction combo every time you want to do a PC-relative load or store.
but that's also true of absolute loads and stores on RISC-V
<muurkha>it's just that in that case the first instruction is an LUI instead of an AUIPC
<muurkha>the standard RISC-V ABI reserves register x3 for "gp", the global pointer
<muurkha>those might be especially reasonable places to store base addresses for PC-relative statically allocated data
<muurkha>if you want to locate the base address of the data somewhere that isn't a multiple of 4KiB away from the instruction after the AUIPC, you can fix it up with an ADDI
<muurkha>the AUIPC/ADDI combo is the LA pseudoinstruction provided by many RISC-V assemblers
<oriansj>well we only need a handful of things stored, so having 32 registers means we can store 28 things in registers instead of RAM
<oriansj>which might end up being the superior solution
<muurkha>certainly to the extent that you can store things in registers it's desirable to do so
<oriansj>well not all RISC-V have 32 registers... so yeah
<muurkha>I haven't actually seen an RV32E design yet (the 16-register variant)
<muurkha>but if you want more than 28 global variables, loading a global pointer gives you 1024 more 32-bit global variables or 512 64-bit ones
<oriansj>and the proposed further reduced might only have r0-r4 so only 3 actual registers
<muurkha>haven't heard of that; I haven't even seen an RV32E design
<muurkha>hmm, maybe SeRV does have an option to implement RV32E
<oriansj>mihi: we can eliminate the DOS stub entirely
<oriansj>I still haven't figured out how to move _start before 0x200 bytes into the file but I completely dropped that DOS stub
<oriansj>darn I couldn't null out the data dictionaries
<oriansj>down to 16 bytes of nulls that I haven't figured out how to remove yet
<oriansj>basically everything after line 129 is your hex program
<oriansj>only the size lines need to be adjusted for bigger programs
<oriansj>how to do: Number of whole/partial pages in hex2 seems like a problem
<stikonas[m]>yes, 454 bytes should be good enough to give us
well under 1 KiB hex0.efi
<stikonas[m]>but sure if we can get kaem.efi into another 1 KiB... I first need to optimize it a bit in stage0-posix
<stikonas>oriansj: good progress. I'm now trying to use your header but I'm still a bit confused what needs changing. I tried adjusting VirtualSize and SizeOfRawData but binary just gets stuck...
<stikonas>but I think we should be fairly close now
<stikonas>perhaps something messed up in .hex0 code
<stikonas>if I replace early bytes with exit, it works
<stikonas>so that gives me some way of debugging this
<stikonas>hmm, yes, there are some errors that I made
<stikonas>probably should later add Makefile to run production stuff (not development) in qemu...
<oriansj>as the next UEFI problem is how to adjust the page count in the binaries
<oriansj>we need this when we stop hand editing the PE header
<stikonas>well, right now I only edited size of the binaries
<stikonas>VirtualSize, SizeOfRawData and size of code
<oriansj>which is fine for now, I'm just thinking about what needs to be done for binaries built by M2-Planet and above
<stikonas>well, it will be a while till we get there...
<stikonas>I should probably optimize stage0-posix binaries a bit...
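Note on the page count: the "whole/partial pages" question comes down to two 16-bit fields in the MZ header. The sketch below computes them from the file size. It assumes the classic DOS layout (512-byte pages, bytes-on-last-page at offset 0x02, page count at offset 0x04); `mz_page_fields` and `patch_mz_header` are hypothetical helper names, not functions from hex2 or any other tool mentioned in the log.

```python
import struct

PAGE = 512  # DOS MZ pages are 512 bytes


def mz_page_fields(file_size):
    """Return (bytes_on_last_page, page_count) for a file of file_size bytes."""
    pages = (file_size + PAGE - 1) // PAGE  # whole pages plus one partial page, if any
    last = file_size % PAGE                 # 0 means the last page is completely full
    return last, pages


def patch_mz_header(image):
    """Return a copy of image with the two MZ page fields rewritten."""
    last, pages = mz_page_fields(len(image))
    patched = bytearray(image)
    # Offsets 0x02 and 0x04 follow directly after the 2-byte "MZ" magic.
    struct.pack_into("<HH", patched, 2, last, pages)
    return bytes(patched)
```

A tool that emits the final binary could call `patch_mz_header` as a last step, once the total size is known, instead of requiring the header source to be edited by hand.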
<stikonas>although different calling conventions sometimes mess things up a bit
<oriansj>well, stage0-posix binaries effectively are working UEFI binaries once they get the new PE header and we swap out the syscalls (and convert to RIP for our labels)
<stikonas>well, calling conventions are a bit different
<stikonas>so we might need to make sure everything is pushed/popped onto stack
<oriansj>but that can be contained entirely in the syscall functions
<stikonas>well, in hex0.efi I also used rdi instead of rbp in stage0-uefi
<stikonas>as I wanted to use rbp to save stack pointer
<stikonas>and posix-amd64 uses rbp because rdi is volatile
<stikonas>but once we go past bootstrap binaries, binary size matters less
<stikonas>in any case rbp->rdi only affected a few lines
<stikonas>maybe we should also try to see if header can be compressed further...
<oriansj>also we will end up having to make UEFI M2libc support, which ironically would enable UEFI binaries that are the same as POSIX binaries
<oriansj>well there is only 16 bytes I couldn't figure out how to eliminate and the optional header that UEFI seems to think isn't optional
<oriansj>well right now 85.4626% of my example is zeros
<stikonas>well, maybe we can push it to bootstrap-seeds then...
<oriansj>merged, thank you stikonas for all your hard work
<stikonas>the nice thing about UEFI is that it should be easy to automate...
<stikonas>so one can create automatic chains, build some POSIX kernel, and automatically continue to GCC
<oriansj>knowing this group, we can get there faster than expected
<oriansj>plus, I'll probably learn a lot along the way
<oriansj>after I get enough time to set up a git repo on bootstrapping.world, I'll probably upload my notes as they seem useful for new people wanting to learn
<mihi>oriansj, stikonas[m]: what problems did you run into when trying to zero CountBytesLastPage and CountPages in the MZ header?
According to PE spec, it should be valid if both are zero, it will use the size of the file in this case. I also tested with some (not digitally signed) UEFI binaries with UEFI hexedit to zero it, and on my machines they still boot.
<oriansj>well currently it is Debian 5.10.120-1
<mihi>(obviously if you want to append resources to the PE file, as in self-extracting archive files on Windows, the fields are needed to tell the loader where the image ends and the resources start)
<oriansj>mihi: I believe stikonas figured that out but no it is unlikely for us to add resources to our PE files (unless we choose to append the sources)
<mihi>oriansj, I assume that is the kernel version running on Debian 11 :)
<mihi>I am not aware of any turnkey Debian packages for setting up a git server. If you want to push via ssh, you probably want to restrict the commands the people can run (via rbash or sshd_config or similar), if you want to push via git-http-backend CGI, setting up some basic webserver config is something you probably have to do yourself.
<mihi>if all your users are fully trusted (not to use TCP forwarding or install a bitcoin miner), all you have to do is create a shared directory with correct permissions (sgid some group).
<mihi>(after installing sshd and git, obviously)
<mihi>git-shell (as login shell) or gitolite (as separate user) would be other options to restrict ssh access; note that neither of these provide ways for users to self-service (change) their SSH keys.
<oriansj>mihi: well only people here would be getting access to create repos/push changes to them
<oriansj>there isn't likely going to be much free compute for a bitcoin miner as we will be doing builds on it as well
<oriansj>as for using the server for TCP forwarding; well if they are willing to do: ssh -L 8080:localhost:9050 username@host I can install a tor service on the server and the traffic can go on its merry way
<oriansj>but if they do something other than that, there will be logs on the outbound linked to their ssh credentials... so yeah
<oriansj>doing ssh -D 8080 username@host will just get all of your web traffic logged and tagged with your username
<oriansj>because I understand people being in an environment where tor is forbidden and a private entry node sometimes is just what they need
<oriansj>and giving someone here 100GB of network traffic to help them out is a small price to pay
<oriansj>in the case of someone putting a crypto miner on the server, I'll just set their default priority so that they get compute last and they are free to leverage any unused CPU cycles
<oriansj>if the useful services and other users don't need the CPU cycles, why would anyone care if someone finds a use for them?
<oriansj>now if that server was in my home and I'd have to pay for the extra electricity, then I'd care but on a rented server or a colo-rack where electricity is included in the price; I wouldn't care
<oriansj>if the company renting the server or colo-rack had a problem with it, then I might have to take a different position to ensure the services remained available
<oriansj>good catch on the incorrect comment by the way
<stikonas[m]>I mostly just took the header that oriansj prepared and just adjusted necessary things
<stikonas[m]>hmm, yes debian doesn't seem to package git hosting stuff, there are only unofficial packages
<stikonas>oriansj: what if we configure rootless podman and run something in container?
<stikonas>that way we don't need root to administer it
<stikonas[m]>one just needs to configure subuid, subgid and install some packages as root first to get rootless podman working and after that it might just require normal unprivileged user
<stikonas>yes, i think we can zero those 2 entries in MZ header that mihi mentioned
<stikonas>but probably some parsing bug, nothing to do with UEFI
<stikonas>quite a few more PE fields can be set to 0. all characteristics, etc...
<oriansj>and guix packages are also an option, as are complex build/setups using any standard automation tool (like ansible, puppet, chef, salt, etc)
<oriansj>we could do the whole Kubernetes cluster thing if we really wanted.
<stikonas>setting up Kubernetes might be worth it for a big provider with multiple redundant hosts
<stikonas>but it's probably overkill for our use case
<stikonas>hmm, even size of code from COFF header seems to be ignored
<stikonas>but it would be good to try it on baremetal before pushing to bootstrap-seeds
<oriansj>nice, unfortunately I have no UEFI hardware (as I run on Librebooted machines personally)
<stikonas>well, my laptop has UEFI, so I can try it, but maybe tomorrow
<stikonas>though I suspect most of hardware is actually running the same tianocore stuff at higher levels (only hardware initialization stuff might differ)
<oriansj>thus far I am looking at pagure, gitile, gitea, sourcehut and Kallithea as options to set up for the git repo
<stikonas>I've only tried gitea and gitlab. gitlab was a huge resource hog...
<stikonas>gitea is significantly smaller and similar GUI (although it does not have any integrated CI)
<stikonas>it was on arm64 SoC, so RAM was fairly limited
<stikonas>but I guess disk and CPU are also more used in gitlab
<oriansj>I'm not a huge fan of Gitlab because it requires people to run JavaScript to see anything
<stikonas>well, yes... And also resource intensive like I said...
<oriansj>also we don't require an Integrated build service but it would be nice to have one available without worrying about limits
<muurkha>oriansj: agreed, but it feels like an improvement over github in that the server software is free
<muurkha>I haven't heard of pagure, gitile, and Kallithea
<stikonas>well, in the worst case we can use a separate CI...
<stikonas>yes, I haven't heard about pagure gitile or Kallithea
<muurkha>people on #riscv pointed out that the current RV32E spec is still a draft
<stikonas>yes, but even if we don't have integrated CI, I can set up Jenkins or something like that...
<muurkha>yeah, and also CI is not rocket science
<muurkha>a git post-update hook can run wget or curl or touch or whatever to kick off a regression test script
<muurkha>64% of the benefit of gitlab CI for 4% of the effort
<oriansj>muurkha: well the big problem is I only have so much time, of which I hope only a subset is needed to maintain/update/etc the service
<oriansj>paying for the servers is easy, administration however takes time
<oriansj>and unless someone wants an admin account to take up that responsibility, I am left trying to figure out the least effort option
<muurkha>gitlab CI gives you containers, a status page for the CI run (linked from the commit), yaml to configure it, and a work queue/worker pool system to do the CI runs
<oriansj>yeah, gitlab isn't an option I'm considering due to personal preferences
<muurkha>probably some of the others have similar features
<muurkha>but what I'm saying is that if you want automatic builds and test runs on commit, kicking them off from .git/hooks/post-update is not difficult
<stikonas>administration shouldn't be too bad once initial setup is done
<stikonas>even mail server setup that I'm running doesn't need much maintenance
<oriansj>fair, also public backups folder for the data make for a cheap recovery if things go wrong.
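Note on the hook idea: git runs the post-update hook on the receiving repository after a push and passes the names of the updated refs as arguments. The sketch below shows one way such a hook could start a test script in the background. The path in `TEST_SCRIPT` and the helper name `build_commands` are hypothetical, chosen only for illustration; the real test runner and its arguments would depend on the project.

```python
#!/usr/bin/env python3
"""Sketch of a post-update hook; git passes the updated ref names as arguments."""
import subprocess
import sys

TEST_SCRIPT = "/srv/ci/run-tests.sh"  # hypothetical path, adjust to the real test runner


def build_commands(refs, script=TEST_SCRIPT):
    """Return one command per updated branch; tags and other refs are skipped."""
    prefix = "refs/heads/"
    return [[script, ref[len(prefix):]] for ref in refs if ref.startswith(prefix)]


def main(argv):
    for cmd in build_commands(argv[1:]):
        # Popen returns immediately, so the push is not held up by the test run.
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

To try it, the file would be saved as `hooks/post-update` in the server-side repository and marked executable. Results then have to be published some other way (a log file or a static status page), since this gives none of the status reporting that an integrated CI provides.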
<oriansj>and I'll be doing a scripted setup that will be made available
<oriansj>I could even pay someone to do the work assuming
<stikonas>we should be able to set it up ourselves...
<oriansj>they don't ask for some amount I can't afford
<stikonas>it's probably harder to pick the hosting system
<stikonas>hmm, personally I think I like gitea UI most as it's familiar to most people (very similar to github). And it's only a single binary. Downsides are slightly harder to configure CI