IRC channel logs

2022-07-24.log

<oriansj>mihi: thank you for the tip, I'll have to revisit the shrinking of the DOS stub section after we get something generally usable
<oriansj>and I noticed that clang didn't make the second PE DOS stub the same size either
<oriansj>56bytes instead of 64bytes; so odd
<stikonas>oh, that's strange...
<stikonas>I thought compile would keep it the same
<stikonas>oriansj: you can also try fasm2
<oriansj>also the Characteristics part of the PE header is different too
<oriansj>and it didn't bother to set the checksum either????
<stikonas>checksum probably doesn't matter for UEFI...
<oriansj>but why did it previously set the checksum field?
<oriansj>but in fun news, the _start is kept at 0x200 so I could sync multiple fields without issue
<oriansj>and now the diff is even smaller
<oriansj>and managed to turn off .reloc without deleting it: https://paste.debian.net/1248181/
<oriansj>I still have no idea why changing the number of sections requires one to update the Characteristics
<stikonas>oriansj: have you tried looking at what changed in characteristics?
<oriansj>well I found 4 different definitions for what the bits in characteristics mean and they don't seem to agree
<oriansj>22 20 appears to mean Reserved & Section contains comments or some other type of information. & Reserved. & Reserved.
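A minimal sketch of decoding those two bytes, assuming "22 20" is the little-endian Characteristics field of the COFF file header (the log does not say which table was consulted, and the readings quoted above look like they came from the section-flags table instead). The bit names are the file-header flags from the PE/COFF specification; the function name is made up for illustration.

```python
# File-header Characteristics bits as listed in the PE/COFF specification.
COFF_CHARACTERISTICS = {
    0x0001: "IMAGE_FILE_RELOCS_STRIPPED",
    0x0002: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x0004: "IMAGE_FILE_LINE_NUMS_STRIPPED",
    0x0008: "IMAGE_FILE_LOCAL_SYMS_STRIPPED",
    0x0010: "IMAGE_FILE_AGGRESSIVE_WS_TRIM",
    0x0020: "IMAGE_FILE_LARGE_ADDRESS_AWARE",
    0x0080: "IMAGE_FILE_BYTES_REVERSED_LO",
    0x0100: "IMAGE_FILE_32BIT_MACHINE",
    0x0200: "IMAGE_FILE_DEBUG_STRIPPED",
    0x0400: "IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP",
    0x0800: "IMAGE_FILE_NET_RUN_FROM_SWAP",
    0x1000: "IMAGE_FILE_SYSTEM",
    0x2000: "IMAGE_FILE_DLL",
    0x4000: "IMAGE_FILE_UP_SYSTEM_ONLY",
    0x8000: "IMAGE_FILE_BYTES_REVERSED_HI",
}


def decode_coff_characteristics(raw):
    """Return the names of the flags set in a 2-byte little-endian field."""
    value = int.from_bytes(raw, "little")
    return [name for bit, name in sorted(COFF_CHARACTERISTICS.items())
            if value & bit]


# Assumption: the bytes appear in the file in the order "22 20".
flags = decode_coff_characteristics(bytes.fromhex("2220"))
print(flags)
```

Read this way the value is 0x2022, which is a combination of three file-level flags and not a set of reserved bits.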
<stikonas>there is python module pefile
<stikonas>maybe that could be useful for all this work
<stikonas> https://github.com/erocarrera/pefile/blob/master/pefile.py#L162
<oriansj>probably
<oriansj>be a good deal easier than documentation and guessing
<oriansj>guix environment --ad-hoc python-pefile --fallback
<stikonas>and then you can run pe = pefile.PE("hex0.efi") and analyze that pe struct
<stikonas>pe.dump_dict() or pe.dump_info()
<stikonas>that said we'll still have to figure out how to change things without breaking UEFI
<stikonas> https://paste.debian.net/1248183/
<stikonas>only one section has IMAGE_SCN_MEM_WRITE
<oriansj>that would be .data
<stikonas>yes, but I guess we can steal that flag and apply to .text section
<stikonas>and get rid of .data
<stikonas>unless it's impossible to remove .data
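A rough sketch of what "stealing" the write flag amounts to at the bit level. The constants are the section-flag values from the PE/COFF specification; the starting flags for .text are an assumption (a typical code section), not values taken from the pastes, and whether a given UEFI firmware honours the result is not something this snippet can show.

```python
# Section Characteristics bits from the PE/COFF specification.
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def make_writable(section_characteristics):
    """Return the section flags with IMAGE_SCN_MEM_WRITE added."""
    return section_characteristics | IMAGE_SCN_MEM_WRITE


# Assumed starting point: a conventional read/execute code section.
text_flags = IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ
writable_text = make_writable(text_flags)

# The field is stored as 4 little-endian bytes in the section header.
on_disk = writable_text.to_bytes(4, "little")
print(hex(writable_text), on_disk.hex(" "))
```

With .text marked writable, the data it needs could live in the same section, which is what would make dropping .data possible.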
<oriansj>stikonas: well we are going to figure all of that out
<stikonas>fossy: by the way, is this good to merge https://github.com/fosslinux/live-bootstrap/pull/187 ?
<oriansj>thanks for the tip, certainly makes things a good bit clearer
<stikonas>well, I just happened to stumble upon that pefile, I wasn't aware of it before
<oriansj>and I hacked together a little tool for us: https://paste.debian.net/1248187/
<oriansj>and hex2 -f hex0.hex0 -o hex0.efi && ./PE_INFO.py -o dump1 && meld dump dump1 lets us get a quick peek
<stikonas>well, I'll go to bed soon, but hopefully this helps you
<oriansj>hopefully I have something useful for you by the time you wake up
<oriansj>^_^
<oriansj>well I can cleanly delete the relocation section now, it however stalls on run (I'm guessing because it actually needed the relocations)
<oriansj>minor tweak: https://paste.debian.net/1248189/
<oriansj>I can't believe I forgot to notice line 31
<oriansj>use: pe = pefile.PE(inputfile) instead
<oriansj>.reloc is going to be the big victory as it appears the memory address UEFI puts us in changes
<oriansj>so we are gonna have to go PC relative addressing to make this work
<fossy>stikonas[m]: oh, i forgot about that - merged
<fossy>i did look at it but didn't finish reviewing before
<fossy>also, nice work on hex0 optimizations
<oriansj>stikonas: you'll have to replace all [label] bits with [rip+label] but it'll work around things nicely
<oriansj>we are down to 1024bytes now
<oriansj>and we have writes to memory without hangs, no .reloc
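For hand-written hex, the [rip+label] form needs a 32-bit displacement measured from the end of the instruction that uses it. A small sketch of that arithmetic follows; the function name and the example offsets are illustrative assumptions, not values from hex0.hex0.

```python
import struct


def rip_relative_disp32(next_instruction_offset, label_offset):
    """Encode the signed 32-bit displacement for a [rip+label] operand.

    On x86-64, RIP-relative addressing is relative to the address of the
    *next* instruction, so both offsets must be in the same address space
    (for example, both file offsets within one section).
    """
    return struct.pack("<i", label_offset - next_instruction_offset)


# Hypothetical layout: the instruction ends at 0x207, the label is at 0x300.
print(rip_relative_disp32(0x207, 0x300).hex(" "))
# A label located before the instruction gives a negative displacement.
print(rip_relative_disp32(0x10, 0x08).hex(" "))
```

Because only the distance between instruction and label is encoded, the bytes stay valid wherever UEFI loads the image, which is why no .reloc entries are needed.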
<oriansj>(doing this in x86 and RISC-V is gonna cost a call/pop combo because no PC relative addressing support)
<muurkha>RISC-V has extensive PC-relative addressing support to avoid needing a call/pop combo
<oriansj> https://paste.debian.net/1248192/
<muurkha>the B* instructions, used for conditional branches, and the JAL instruction, used for unconditional branches and calls, are *only* PC-relative; they don't support an absolute addressing mode
<muurkha>the JALR instruction, used for returning, invoking function pointers, and indexing jump tables, does interpret its register argument as an absolute address, adding its 12-bit immediate operand to that register to get the absolute address to transfer to
<oriansj>muurkha: I am talking about loading/storing relative to the PC register
<muurkha>but loading a PC-relative base address just requires an AUIPC, same as loading and storing relative to the PC register
<oriansj>right, that stupid double instruction combo
<oriansj>I knew something annoying was needed with risc-v to do this
<muurkha>moreover you can AUIPC once to get a base address and then LW and SW relative to that base address multiple times
<muurkha>a PC-relative base address I mean
<muurkha>as long as you don't clobber the register you put the PC-relative base pointer into
<oriansj>muurkha: and remember to account for the diff
<oriansj>as you move away from the PC you stored in a register
<muurkha>LW and SW can index off it with a signed 12-bit offset, giving you 4 KiB of PC-relative space
<muurkha>you don't need to account for the diff in each LW and SW instruction, just the original AUIPC instruction to load the base address of your PC-relative data page
<muurkha>if you need more than 4 KiB you do need a two-instruction combo every time you want to do a PC-relative load or store. but that's also true of absolute loads and stores on RISC-V
<muurkha>it's just that in that case the first instruction is an LUI instead of an AUIPC
<muurkha>the standard RISC-V ABI reserves register x3 for "gp", the global pointer
<muurkha>and x4 for "tp", the thread pointer
<muurkha>those might be especially reasonable places to store base addresses for PC-relative statically allocated data
<oriansj>down to 598
<muurkha>if you want to locate the base address of the data somewhere that isn't a multiple of 4KiB away from the instruction after the AUIPC, you can fix it up with an ADDI
<muurkha>the AUIPC/ADDI combo is the LA pseudoinstruction provided by many RISC-V assemblers
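A sketch of how a PC-relative offset is split between the two instructions, assuming the usual RISC-V convention: AUIPC adds a 20-bit immediate shifted left by 12 to the PC, and ADDI (or LW/SW) then adds a sign-extended 12-bit immediate. The function name is made up for illustration.

```python
def split_pc_relative(offset):
    """Split a byte offset from the AUIPC's own PC into (hi20, lo12).

    lo12 is sign-extended by the hardware, so hi20 is rounded to the
    nearest 4 KiB page (hence the +0x800) and lo12 ends up in
    -2048..2047. hi20 is returned as a signed value; mask it with
    0xFFFFF when placing it in the instruction encoding.
    """
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)
    return hi20, lo12


for offset in (0x7FF, 0x1234, 0x1800, -4):
    hi20, lo12 = split_pc_relative(offset)
    # Sanity check: the two immediates recombine to the original offset.
    assert (hi20 << 12) + lo12 == offset
    print(hex(offset), "-> auipc", hi20, ", addi", lo12)
```

Note the 0x1800 case: the low part becomes -2048 and the upper part is bumped to 2, which is the detail that is easy to get wrong by hand.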
<oriansj>well we only need a handful of things stored, so having 32 registers means we can store 28 things in registers instead of RAM
<oriansj>which might end up being the superior solution
<muurkha>certainly to the extent that you can store things in registers it's desirable to do so
<oriansj>well not all RISC-V have 32 registers... so yeah
<muurkha>all RV32I and RV64s do
<muurkha>I haven't actually seen an RV32E design yet (the 16-register variant)
<oriansj>E only supports 16 registers
<muurkha>right
<muurkha>but if you want more than 28 global variables, loading a global pointer gives you 1024 more 32-bit global variables or 512 64-bit ones
<oriansj>and the proposed further reduced variant might only have r0-r4 so only 3 actual registers
<muurkha>haven't heard of that; I haven't even seen an RV32E design
<oriansj>but that might never be ratified
<oriansj> https://paste.debian.net/1248194/ 598bytes and looks like more room to strip things remains
<muurkha>hmm, maybe SeRV does have an option to implement RV32E
<oriansj>mihi: we can eliminate the DOS stub entirely
<oriansj>I still haven't figured out how to move _start before 0x200 bytes into the file but I completely dropped that DOS stub
<oriansj>check it out: https://paste.debian.net/1248195/
<muurkha>I did find an RV32E: https://codeberg.org/tok/attocore
<oriansj>darn I couldn't null out the data dictionaries
<oriansj>534bytes
<oriansj>finally freaking moved _start
<oriansj>454bytes
<oriansj>down to 16bytes of nulls that I haven't figured out how to remove yet
<oriansj>boom: https://paste.debian.net/1248197/
<oriansj>basically everything after line 129 is your hex program
<oriansj>only the size lines need to be adjusted for bigger programs
<oriansj>how to do: Number of whole/partial pages in hex2 seems like a problem
<oriansj>but we can figure it out
<oriansj>and I'm gonna get some sleep
<muurkha>nice, you're under half a K!
<muurkha>apparently zephyr can be built to run on RV32E: https://docs.zephyrproject.org/latest/boards/riscv/index.html
<stikonas[m]>yes, 454 bytes should be good enough to give us well under 1 KiB hex0.efi
<stikonas[m]>but not sure if we can get kaem.efi into another 1 KiB... I first need to optimize it a bit in stage0-posix
<stikonas>oriansj: good progress. I'm now trying to use your header but I'm still a bit confused what needs changing. I tried adjusting VirtualSize and SizeOfRawData but binary just gets stuck...
<stikonas>but I think we should be fairly close now
<stikonas>see https://git.stikonas.eu/andrius/stage0-uefi/src/branch/main/amd64/hex0.hex0
<stikonas>perhaps something messed up in .hex0 code
<stikonas>if I replace early bytes with exit, it works
<stikonas>so that gives me some way of debugging this
<stikonas>hmm, yes, there are some errors that I made
<stikonas>and done
<stikonas>oriansj: we now have hex0.efi
<stikonas>and pushed https://git.stikonas.eu/andrius/stage0-uefi/src/branch/main/amd64
<stikonas>probably should later add Makefile to run production stuff (not development) in qemu...
<oriansj>stikonas: very nicely done
<stikonas>I'll now be away for some hours...
<stikonas>but that's a good place to take a break
<oriansj>as the next UEFI problem is how to adjust the page count in the binaries
<stikonas>787 bytes now
<stikonas>we need this when programs get bigger?
<oriansj>we need this when we stop hand editing the PE header
<stikonas>well, right now I only edited size of the binaries
<stikonas>VirtualSize SizeOfRawData and size of code
<oriansj>which is fine for now, I'm just thinking about what needs to be done for binaries built by M2-Planet and above
<stikonas>oh ok
<stikonas>well, it will be a while till we get there...
<stikonas>I should probably optimize stage0-posix binaries a bit...
<stikonas>most of the code can be reused in uefi
<stikonas>although different calling conventions sometimes mess things up a bit
<oriansj>well, stage0-posix binaries effectively are working UEFI binaries once they get the new PE header and we swap out the syscalls (and convert to RIP for our labels)
<oriansj>so yeah, much smaller task
<stikonas>well, calling conventions are a bit different
<stikonas>different registers are volatile
<stikonas>so we might need to make sure everything is pushed/popped onto stack
<oriansj>but that can be contained entirely in the syscall functions
<stikonas>yes
<stikonas>well, in hex0.efi I also used rdi instead of rbp in stage0-uefi
<stikonas>as I wanted to use rbp to save stack pointer
<stikonas>and posix-amd64 uses rbp because rdi is volatile
<stikonas>but once we go past bootstrap binaries, binary size matters less
<oriansj>stikonas: indeed
<stikonas>in any case rbp->rdi only affected a few lines
<stikonas>maybe we should also try to see if header can be compressed further...
<oriansj>also we will end up having to make UEFI M2libc support, which ironically would enable UEFI binaries that are the same as POSIX binaries
<oriansj>well there is only 16bytes I couldn't figure out how to eliminate and the optional header that UEFI seems to think isn't optional
<stikonas>and nothing more can be zeroed?
<oriansj>well right now 85.4626% of my example is zeros
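A tiny sketch of how such a figure can be measured for any binary. The log does not state the byte counts behind it; 388 zero bytes out of 454 would reproduce the quoted percentage and is used here purely as an illustrative assumption, with a synthetic buffer standing in for the real file.

```python
def zero_percentage(image):
    """Return the percentage of bytes in `image` that are 0x00."""
    return 100.0 * image.count(0) / len(image)


# Synthetic stand-in (assumption): 388 zero bytes plus 66 non-zero bytes.
example = bytes(388) + b"\x01" * 66
print(f"{zero_percentage(example):.4f}% of {len(example)} bytes are zero")
```

For a real image, the same function can be fed the contents of the file read in binary mode.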
<stikonas>well, maybe we can push it to bootstrap-seeds then...
<stikonas>let me make a PR...
<stikonas> https://github.com/oriansj/bootstrap-seeds/pull/22
<oriansj>merged, thank you stikonas for all your hard work
<stikonas>you did half of the work...
<oriansj>correction, I did half the fun ^_^
<stikonas>the nice thing about UEFI is that it should be easy to automate...
<stikonas>so one can create automatic chains, build some POSIX kernel, and automatically continue to GCC
<stikonas>eventually...
<oriansj>knowing this group, we can get there faster than expected
<oriansj>plus, I'll probably learn a lot along the way
<oriansj>after I get enough time to set up a git repo on bootstrapping.world, I'll probably upload my notes as they seem useful for new people wanting to learn
<stikonas[m]>What is are you running there?
<stikonas[m]>s/is/OS/ ?
<stikonas[m]>We can probably just install some distro packages
<mihi>oriansj, stikonas[m]: what problems did you run into when trying to zero CountBytesLastPage and CountPages in the MZ header? According to PE spec, it should be valid if both are zero, it will use the size of the file in this case. I also tested with some (not digitally signed) UEFI binaries with UEFI hexedit to zero it, and on my machines they still boot.
<oriansj>well currently it is Debian 5.10.120-1
<mihi>(obviously if you want to append resources to the PE file, as in self-extracting archive files on Windows, the fields are needed to tell the loader where the image ends and the resources start)
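If the two MZ fields ever do need real values, they follow from the file size alone. A sketch under the classic DOS header layout: pages are 512 bytes, the last-page byte count is stored at offset 2 and the page count at offset 4, both as little-endian 16-bit words. The function name is an assumption for illustration.

```python
import struct


def mz_page_fields(file_size):
    """Return the 4 bytes for CountBytesLastPage followed by CountPages.

    CountPages counts whole and partial 512-byte pages; CountBytesLastPage
    is the number of bytes used in the final page, with 0 meaning the
    final page is completely full.
    """
    count_pages = (file_size + 511) // 512
    count_bytes_last_page = file_size % 512
    return struct.pack("<HH", count_bytes_last_page, count_pages)


for size in (454, 1024):
    print(size, "->", mz_page_fields(size).hex(" "))
```

Leaving both fields zero, as described above, avoids having to compute anything when the binary grows.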
<oriansj>mihi: I believe stikonas figured that out but no it is unlikely for us to add resources to our PE files (unless we choose to append the sources)
<mihi>oriansj, I assume that is the kernel version running on Debian 11 :)
<mihi>I am not aware of any turnkey Debian packages for setting up a git server. If you want to push via ssh, you probably want to restrict the commands the people can run (via rbash or sshd_config or similar), if you want to push via git-http-backend CGI, setting up some basic webserver config is something you probably have to do yourself.
<mihi>if all your users are fully trusted (not to use TCP forwarding or install a bitcoin miner), all you have to do is create a shared directory with correct permissions (sgid some group).
<mihi>(after installing sshd and git, obviously)
<mihi>git-shell (as login shell) or gitolite (as separate user) would be other options to restrict ssh access; note that neither of these provide ways for users to self-service (change) their SSH keys.
<oriansj>mihi: well only people here would be getting access to create repos/push changes to them
<oriansj>there isn't likely going to be much free compute for a bitcoin miner as we will be doing builds on it as well
<oriansj>and I do have documented steps for that setup: https://paste.debian.net/1248236/
<oriansj>as for using the server for TCP forwarding; well if they are willing to do: ssh -L 8080:localhost:9050 username@host I can install a tor service on the server and the traffic can go on its merry way
<oriansj>but if they do something other than that, there will be logs on the outbound linked to their ssh credentials... so yeah
<oriansj>doing ssh -D 8080 username@host will just get all of your web traffic logged and tagged with your username
<oriansj>because I understand people being in an environment where tor is forbidden and a private entry node sometimes is just what they need
<oriansj>and giving someone here a 100GB of network traffic to help them out is a small price to pay
<oriansj>in the case of someone putting a crypto miner on the server, I'll just set their default priority so that they get compute last and they are free to leverage any unused CPU cycles
<oriansj>if the useful services and other users don't need the CPU cycles, why would anyone care if someone finds a use for them?
<oriansj>now if that server was in my home and I'd have to pay for the extra electricity, then I'd care but on a rented server or a colo-rack where electricity is included in the price; I wouldn't care
<oriansj>if the company renting the server or colo-rack had a problem with it, then I might have to take a different position to ensure the services remained available
<oriansj>good catch on the incorrect comment by the way
<oriansj>fixed
<stikonas[m]>mihi, I haven't tried to zero those yet, we can try
<stikonas[m]>I mostly just took the header that oriansj prepared and just adjusted necessary things
<stikonas[m]>hmm, yes debian doesn't seem to package git hosting stuff, there are only unofficial packages
<stikonas>oriansj: what if we configure rootless podman and run something in container?
<stikonas>that way we don't need root to administer it
<stikonas[m]>one just needs to configure subuid, subgid and install some packages as root first to get rootless podman working and after that it might just require normal unprivileged user
<stikonas>yes, i think we can zero those 2 entries in MZ header that mihi mentioned
<stikonas>and there is still a bug in kaem.c
<stikonas>it can only run one command
<stikonas>but probably some parsing bug, nothing to do with UEFI
<stikonas>quite a few more PE fields can be set to 0. all characteristics, etc...
<oriansj>well refinement is expected.
<oriansj>and guix packages are also an option, as are complex build/setups using any standard automation tool (like ansible, puppet, chef, salt, etc)
<oriansj>we could do the whole Kubernetes cluster thing if we really wanted.
<stikonas>not worth the hassle...
<stikonas>especially with only 1 host...
<stikonas>setting up Kubernetes might be worth for big provider with multiple redundant hosts
<stikonas>but it's probably overkill for our usecase
<stikonas>hmm, even size of code from COFF header seems to be ignored
<stikonas>oriansj: I think I've now zeroed everything possible https://git.stikonas.eu/andrius/stage0-uefi/commit/e51956a58cb0f229a82b6b76a3bc00dc648cec94
<stikonas>but it would be good to try it on baremetal before pushing to bootstrap-seeds
<oriansj>nice, unfortunately I have no UEFI hardware (As I run on Librebooted machines personally)
<stikonas>well, my laptop has UEFI, so I can try it, but maybe tomorrow
<stikonas>though I suspect most of hardware is actually running the same tianocore stuff at higher levels (only hardware initialization stuff might differ)
<oriansj>thus far I am looking at pagure, gitile, gitea, sourcehut and Kallithea as options to setup for the git repo
<stikonas>I've only tried gitea and gitlab. gitlab was a huge resource hog...
<stikonas>gitea is significantly smaller and similar GUI (although it does not have any integrated CI)
<oriansj>in terms of disk? CPU? RAM?
<stikonas>I think mostly RAM
<stikonas>it was on arm64 SoC, so RAM was fairly limited
<stikonas>but I guess disk and CPU are also more used in gitlab
<oriansj>I'm not a huge fan of Gitlab because it requires people to run JavaScript to see anything
<stikonas>well, yes... And also resource intensive like I said...
<oriansj>also we don't require an Integrated build service but it would be nice to have one available without worrying about limits
<muurkha>oriansj: agreed, but it feels like an improvement over github in that the server software is free
<muurkha>I haven't heard of pagure, gitile, and Kallithea
<stikonas>well, in the worst case we can use a separate CI...
<stikonas>yes, I haven't heard about pagure gitile or Kallithea
<muurkha>yeah, the gitlab CI stuff is nice
<muurkha>people on #riscv pointed out that the current RV32E spec is still a draft
<stikonas>yes, but even if we don't have integrated CI, I can set up Jenkins or something like that...
<muurkha>yeah, and also CI is not rocket science
<muurkha>a git post-update hook can run wget or curl or touch or whatever to kick off a regression test script
<muurkha>64% of the benefit of gitlab CI for 4% of the effort
<oriansj>muurkha: well the big problem is I only have so much time, of which I hope only a subset is needed to maintain/update/etc the service
<muurkha>of course
<oriansj>paying for the servers is easy, administration however takes time
<oriansj>and unless someone wants an admin account to take up that responsibility, I am left trying to figure out the least effort option
<muurkha>gitlab CI gives you containers, a status page for the CI run (linked from the commit), yaml to configure it, and a work queue/worker pool system to do the CI runs
<muurkha>it's definitely easier
<oriansj>yeah, gitlab isn't an option I'm considering due to personal preferences
<muurkha>probably some of the others have similar features
<oriansj>pagure-ci seems to be featureful
<oriansj>for continuous integration
<muurkha>but what I'm saying is that if you want automatic builds and test runs on commit, kicking them off from .git/hooks/post-update is not difficult
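A minimal sketch of the hook idea, assuming the hook is written in Python and saved as an executable hooks/post-update in the server-side repository. Git passes the names of the updated refs as arguments; the helper name is made up, and the demo command below is a stand-in for a real regression test script.

```python
import subprocess
import sys


def kick_off(command, updated_refs):
    """Run `command` with the updated ref names appended; return its exit code.

    This waits for the command to finish, which keeps the pushing client
    waiting too; a real setup might prefer to hand the work to a queue.
    """
    return subprocess.run([*command, *updated_refs]).returncode


# Stand-in for a test script (assumption): just echo the refs it was given.
demo_command = [sys.executable, "-c",
                "import sys; print('updated:', sys.argv[1:])"]
exit_code = kick_off(demo_command, ["refs/heads/main"])
print("exit code:", exit_code)
```

In an actual hook, the command would be the path to the test script and the refs would come from sys.argv[1:].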
<stikonas>administration shouldn't be too bad once initial setup is done
<stikonas>even mail server setup that I'm running doesn't need much maintenance
<oriansj>fair, also public backups folder for the data make for a cheap recovery if things go wrong.
<oriansj>and I'll be doing a scripted setup that will be made available
<stikonas>I did use https://github.com/geerlingguy/ansible-role-jenkins before to setup Jenkins if we have to go that route.
<oriansj>I could even pay someone to do the work assuming
<stikonas>we should be able to set it up ourselves...
<oriansj>they don't ask for some amount I can't afford
<oriansj>true
<stikonas>it's probably harder to pick the hosting system
<stikonas>than to install it
<oriansj>true
<oriansj>too many reasonable options
<stikonas>hmm, personally I think I like gitea UI most as it's familiar to most people (very similar to github). And it's only a single binary. Downsides are slightly harder to configure CI
<stikonas>we need to set up something separately