<stikonas>just deallocate required amount of stack
<stikonas>will do some other optimizations tomorrow
<stikonas>oriansj: I've spotted a couple of bytes where we can cut size of hex0...
<stikonas>though one would have to redo hex0 conversion...
<stikonas>let me see if I can spot any other optimizations
<stikonas>(oh and I've just pushed that read/write_byte optimization to stage0-uefi)
<stikonas>hmm, yes, I think I can reduce hex0 size a bit more...
<stikonas>and same read/write_byte optimization from stage0-uefi can go to stage0-posix...
<stikonas>oriansj: I now have 317 byte hex0 (built from M1 sources)
<stikonas>with just these simple optimizations, nothing super fancy
<stikonas>I'll do more tomorrow, hopefully will go under 300 bytes
<stikonas>and combined with kaem-optional, later I think we can go under 1 KiB
<Hagfish>"The longest binary number sequence memorized in one minute is 270 and was achieved by Aravind Pasupathy (India) at the Kasthuri Sreenivasan Trust in Coimbatore, India, on 3 April 2015."
<stikonas>well, I'm encoding directly from M0 to hex0 right now
<stikonas>did one pass, got the right file size, now need to recalculate jumps
<stikonas>mostly simple tricks, for example instead of loading with mov eax, 5, use push 5; pop eax
<oriansj>well, we have been using unoptimized code to ensure ease of understanding. So keep that in mind.
<stikonas>yeah I know, I always added clear comments and tried to only use simple tricks
<stikonas>well, I'll do PR soon and then you can review if it's simple enough
<stikonas>but it's quite a significant reduction in binary size
<stikonas>though a lot of it is due to being less wasteful with zeroes
<oriansj>yeah, it was a very naive and simple implementation ripe for improvement.
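Note on the push/pop trick mentioned above: replacing `mov eax, 5` with `push 5; pop eax` saves space because the mov form carries a full 4-byte immediate, while push can take a 1-byte sign-extended immediate. The following is a minimal Python sketch that just counts bytes, assuming the standard 32-bit x86 encodings (`B8 id` for `mov eax, imm32`, `6A ib` for `push imm8`, `58` for `pop eax`). The helper names are hypothetical and nothing here is taken from the stage0 sources.

```python
# Illustrative only: hand-encoded 32-bit x86 byte sequences.
# Function names are hypothetical, not part of stage0.

def mov_eax_imm32(value):
    """Encode `mov eax, imm32`: opcode B8 plus a 4-byte little-endian immediate."""
    # Only non-negative values are handled in this sketch.
    return bytes([0xB8]) + value.to_bytes(4, "little")

def push_pop_eax_imm8(value):
    """Encode `push imm8; pop eax` as 6A ib 58 (immediate is sign-extended)."""
    if not -128 <= value <= 127:
        raise ValueError("push imm8 takes a signed 8-bit immediate")
    return bytes([0x6A, value & 0xFF, 0x58])

long_form = mov_eax_imm32(5)
short_form = push_pop_eax_imm8(5)

print(long_form.hex(" ").upper(), "->", len(long_form), "bytes")
print(short_form.hex(" ").upper(), "->", len(short_form), "bytes")
print("saved:", len(long_form) - len(short_form), "bytes")
```

The sketch shows 5 bytes for the mov form against 3 bytes for the push/pop pair. The trick only applies when the constant fits in a signed 8-bit immediate, which is the case for small values such as 5.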
<stikonas>oriansj: how do you want PR, first stage0-posix, then bootstrap-seeds, then update submodule, or should I start with bootstrap-seeds?
<stikonas>if changes above seem good, I should also port them to amd64
<oriansj>still reviewing but thus far looks good
<stikonas>should I push draft PR for stage0-posix with NASM and M0 changes (but without bootstrap-seeds submodule update)?
<stikonas>although hex0 files do have M0 strings in the comments anyway
<oriansj>well if we want to be consistent, they would need to be updated too
<stikonas>oh but I first need to make PR to stage0-posix-x86
<stikonas>oh maybe I should update README too that mentions 357 bytes
<oriansj>stikonas: well yes, I granted repo commit access to everyone who put serious work into an architecture
<oriansj>as I believe those who actively work to make things better should be free to do so.
<oriansj>unfortunately GitHub doesn't support the ability to turn off force pushes which can rewrite history and delete progress
<stikonas>well, reviews are still good in any case, although you already reviewed interesting bits...
<stikonas>oriansj: I thought you can mark some branches as protected
<stikonas>oriansj: go to repo settings->branches->add branch protection rule
<stikonas>I guess only master/main branch has to be protected
***jackhill is now known as KM4MBG
***KM4MBG is now known as jackhill
<oriansj>and only against force pushes and deletes of the main/master branch
<stikonas>not a big deal, but one has to be careful with it...
<oriansj>yeah, pretty common in all assemblers to have such bugs
<oriansj>but be careful running it on some processors as it has been known to brick certain hardware
<stikonas>well, this one is purely a rasm2 bug; the ISA docs or using e.g. gcc work fine
<oriansj>well the ISA docs are wrong in several areas, such as whether nop is 2, 4 or 6 bytes in size
<oriansj>but assuming we stick to the GCC core x86 instruction set, then we should be fine as Intel and AMD seem to be generally very conservative on those encodings
<stikonas>yeah, I don't expect much/any variation amongst different CPUs
<stikonas>we are not using anything remotely uncommon
<stikonas>ok, hex0 for amd64 will be 405 -> 334 bytes
<stikonas>when I implement exactly the same optimizations
<stikonas>and then I can go back to optimizing hex0.S for uefi...
<oriansj>well decoding PE format details is less than ideal fun
<stikonas>and I guess testing is slow, since each time you need to launch qemu...
<oriansj>and deal with non-deterministic behavior in UEFI too
<oriansj>and a sick 2 year old who refuses to rest