IRC channel logs
2021-10-14.log
<stikonas>fossy: shall I just push stage0-posix update in live-bootstrap?
<stikonas>nothing else changes, no checksum change, etc...
<stikonas>(since those early checksums are done as part of stage0-posix)
<stikonas>that should bring kaem conditionals and match in
<civodul>so if you push something to the repo, it should show up on-line within an hour
<gbrlwck>i just tried to bootstrap stage0-posix on my HiFive Unmatched (riscv64) and failed executing kaem.riscv64 (Subprocess error; ABORTING HARD). compiled hex0 manually, but the binaries differ (except in length)
<stikonas>gbrlwck: and you pulled in all submodules?
<stikonas>well, riscv64 was only tested on qemu, so if something is wrong we'll have to rely on your help
<gbrlwck>yes (git clone ... --recurse-submodules)
<gbrlwck>stikonas: that's exactly why i'm here! so it all works fine on qemu?
<gbrlwck>i can also paste you the hexdumps ;)
<stikonas>Subprocess error just means that child process failed
<stikonas>either hexdumps or if you have diffoscope
<gbrlwck>oh wow, diffoscope is a *huge* package? let's see how long that takes
<gbrlwck>on my HiFive-ready ubuntu it asks to install 931 packages and up to 5GiB space ;)
<stikonas>I guess it's the command after hex0 that fails (if hex0 differs from seed)
<stikonas>doesn't matter, I can run diffoscope locally
<gbrlwck>the error makes sense: output of the very first hex0 artifact has no valid header so it doesn't execute
<gbrlwck>at first i thought it was a big-/little-endian issue, but it is not
<gbrlwck>though the second row: the quartets seem the same (it's the same hex characters) but in a scrambled order
<stikonas>I think yes, the bits are the same but somehow scrambled
<stikonas>gbrlwck: and they are always scrambled in the same way
<stikonas>hmm, this one is not even the same length...
<gbrlwck>yeah, i need some coding tasks anyhow, so this might be the adequate point to dive in :)
<stikonas>I mean I get just this from my run on teststring: "0000000 2301 6745 ab89 efcd" (this is hexdump)
<stikonas>which seemed to me strange given that your miscompiled hex0 had the same length
<gbrlwck>well, it was not "just" scrambled characters. the first line of both are really different
<stikonas>so I'm running it directly on amd64 machine
<stikonas>hmm, yeah, I think some later lines are also not simply scrambled in octets...
<stikonas>and oriansj was getting the same hashes on his qemu-user instance
<stikonas>well, in the worst case, we'll have to fire up gdb and see what happens
<stikonas>although, debugging this early code in gdb is not super easy...
<stikonas>gbrlwck: one test you can try, is to try compiling GAS prototype in GAS/hex0_riscv64.S and see if you get the same issue
<stikonas>riscv64-unknown-linux-gnu-as hex0_riscv64.S -o hex0.o; riscv64-unknown-linux-gnu-ld hex0.o -o hex0
<gbrlwck>now we get a 1.3K binary file with really similar first bytes! it starts to differ at e_ident[EI_OSABI]
<gbrlwck>i compiled the GAS version, this results in a 1.3K binary
<stikonas>well, the built binary is different (GAS compiled version will be bigger)
<stikonas>it has larger header with section tables
<gbrlwck>it is still scrambled and identical to the first bootstrapped version (riscv64/artifact/hex0)
<stikonas>ok, so if we need to debug it it will be easier than debugging that smaller hex0
<stikonas>although for debug info you need to build it with riscv64-unknown-linux-gnu-as -g hex0_riscv64.S -o hex0.o; riscv64-unknown-linux-gnu-ld hex0.o -o hex0
<gbrlwck>i did just `as` and `ld` (because i wasn't cross compiling)
<stikonas>well, full triplet would work for you too, but short version is fine
<stikonas>anyway as -g can produce bigger file with some debug info
<stikonas>so far I have no other ideas besides looking what happens in gdb
<gbrlwck>installing gdb will take a while. i'll be back :)
<stikonas>gbrlwck: if you need help using gdb also feel free to ask
<stikonas>I've also never used gdb on assembly programs until 3 months ago...
<xentrac>gdb sort of wants you to be using a high-level language
<xentrac>yeah, it definitely copes okay with assembly
<stikonas>first of all "layout asm" followed by "layout regs" can help you see assembly code and cpu registers
<xentrac>although I haven't been able to figure out how to get `finish` to work
<xentrac>I don't know how to use things like radare2 which are designed for debugging machine code
<xentrac>I tend to do `display/i $pc` and `info registers` in GDB rather than the TUI
<stikonas>there is also some trick to display memory contents
<stikonas>but it's a bit hard when you work with 64bit pointers
<xentrac>so if the register pointed to memory, p *(void**)$ would follow that pointer
<stikonas>or there are some more semantic names like zero, a0, a1, a2,... (for function calls), t0, t1, ... (for temporaries), and similar
<gbrlwck>the content at the first break is: $1 = 4395898842368
<stikonas>strange, that looks like memory address to me
<gbrlwck>it might be before, since i added the breakpoint at line 130
<stikonas>actually it shouldn't matter, that line does not change a0
<gbrlwck>so, when i continue, the next value is 0x12 (18)
<stikonas>it's not a readable letter or number in ascii encoding
<stikonas>actually, instead of p/x it might be good to run p/c $a0
<xentrac>yeah, /x is useful for values like 4395898842368
<oriansj>my first question is what happens when given 01 23 45 67 89 AB CD EF and single step in gdb with si
<oriansj>and put breakpoints on the reading and writing of bytes
<oriansj>So you should see exactly 2 reads that are correct followed by one write that is correct
<stikonas>so I have file "01 23 45 67 89 AB CD EF" (with newline at the end)
<stikonas>and sha256sum c45792734f2045a48f4db7f86189009be6824055b9f139f2d4d80b831303218e
<stikonas>gbrlwck: on risc-v read will be stored in a0 after ecall
<stikonas>value will be in the address pointed to by stack pointer, but the next line loads it into a0
<gbrlwck>so, i added a newline to my teststring file but the sums don't check out (no idea why)
<oriansj>and if the gas version has the same error, we can safely assume it isn't our instruction encoding (as gas shouldn't be encoding the wrong instructions)
<stikonas>yes, that's why this is strange, one would think kernel bug... but that's such a basic functionality
<oriansj>using gdb does it read the correct values?
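The mismatching checksums for the teststring file can be narrowed down without touching the board. The following is a minimal Python sketch, assuming the test input is exactly the string quoted in the log; it only shows that the trailing newline changes the SHA-256 digest, and it makes no claim about which digest matches the one pasted above. The names `TEST_STRING` and `sha256_hex` are made up for this illustration.

```python
import hashlib

# Test input quoted in the log; the exact bytes on disk are an assumption.
TEST_STRING = b"01 23 45 67 89 AB CD EF"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest as lowercase hex, the form sha256sum prints."""
    return hashlib.sha256(data).hexdigest()


# One byte of difference (the newline) gives an unrelated digest.
without_newline = sha256_hex(TEST_STRING)
with_newline = sha256_hex(TEST_STRING + b"\n")

print("no trailing newline:", without_newline)
print("trailing newline:   ", with_newline)
```

Comparing both printed values against the local `sha256sum` output shows which variant of the file is actually on disk; other invisible differences (for example CRLF line endings) would change the digest in the same way.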
<stikonas>you still have number of bytes read in a0
<stikonas>oh, maybe restart and also add breakpoint on 130
<stikonas>so you should get reads 0 then 1, and then 1 at the write part
<gbrlwck>the first iteration has 48 '0' on line 66 and 0 '\000' on line 130
<gbrlwck>second iteration has 49 '1' on line 66 and 32 ' ' on line 130
<stikonas>as it is binary stuff that we are writing
<oriansj>it should always do at least 2 reads before write (more if whitespace or comments)
<oriansj>the contents of the register a0 at combine should be what is written out
<stikonas>ok, something is messed up if this is what happens
<oriansj>what happens at the: bnez s4, combine
<stikonas>that is a boolean toggle to decide whether we write the combined byte or not
<stikonas>is what are the values of all registers when we start
<gbrlwck>should i stream this live via jitsi or something? might be easier?
<stikonas>I thought kernel should initialize all registers to 0
<stikonas>(except special purpose ones like stack pointer)
<stikonas>hmm, let's check initial values of registers
<stikonas>I thought kernel does that when loading binaries
<xentrac>yeah, the kernel needs to do that when loading binaries; the alternative is a covert information leak through execve()
<stikonas>ok, so at least we understand what was going wrong
<stikonas>the fewer bytes we have in seed the better
<gbrlwck>do i need a github account for that?
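The "two reads, then one write" behaviour and the effect of an uninitialized toggle can be modelled in a few lines. This is a rough Python sketch, not the real hex0 source: the parameters `toggle` and `hold` merely stand in for the s4 toggle mentioned above and for whatever register holds the first nibble, and the comment characters (`#`, `;`) and the skipping of all other non-hex characters are assumptions. `hexdump_words` imitates the default 16-bit little-endian word view of `hexdump`.

```python
import struct

HEX_DIGITS = "0123456789abcdefABCDEF"


def hex0(source: bytes, toggle: bool = False, hold: int = 0) -> bytes:
    """Model of hex0: every two hex digits read produce one byte written.

    toggle/hold are hypothetical stand-ins for the registers discussed in
    the log; passing toggle=True imitates a register that was not zeroed.
    """
    out = bytearray()
    in_comment = False
    for value in source:
        c = chr(value)
        if in_comment:
            in_comment = c != "\n"  # a comment runs to the end of the line
            continue
        if c in "#;":
            in_comment = True
            continue
        if c not in HEX_DIGITS:
            continue  # whitespace and everything else is skipped
        nibble = int(c, 16)
        if toggle:
            # second digit: combine with the held nibble and write one byte
            out.append(((hold << 4) | nibble) & 0xFF)
            toggle = False
        else:
            # first digit: only remember it
            hold = nibble
            toggle = True
    return bytes(out)


def hexdump_words(data: bytes) -> str:
    """Format data as 16-bit little-endian words, like plain hexdump."""
    count = len(data) // 2
    words = struct.unpack("<%dH" % count, data[: count * 2])
    return " ".join("%04x" % w for w in words)


TEST = b"01 23 45 67 89 AB CD EF\n"
good = hex0(TEST)               # toggle starts cleared
bad = hex0(TEST, toggle=True)   # toggle starts with a stale non-zero value

print(hexdump_words(good))  # 2301 6745 ab89 efcd
print(bad.hex(" "))         # 00 12 34 56 78 9a bc de
```

Under this model a stale toggle shifts every nibble by one position: the output keeps its length and contains mostly the same hex characters in a different arrangement, and the second byte written is 0x12, which would be consistent with the value seen in gdb. Whether the real register contents produced exactly this pattern is not something the sketch can confirm.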
<stikonas>might be the easier one, but not necessarily the only way
<stikonas>luckily, it wouldn't be too hard to fix this up, as this is at the beginning of the file
<stikonas>so don't need to redo any calculations of where to jump
<oriansj>gbrlwck: I can pull from any git repo that I can git fetch from
<oriansj>So notabug, savannah, gitlab, etc are all fine
<oriansj>stikonas: for relative jumps yes but fortunately hex0 doesn't need absolute addresses
<stikonas>well, we just need to copy paste the line from the end of the file where we also initialize s4 to 0
<stikonas>and in hex0 file we also need to recalculate file size
<stikonas>anyway, let's wait for gbrlwck to be ready
<oriansj>first fixing bootstrap-seeds and then updating stage0-posix
<oriansj>and there are probably similar bugs in hex1, hex2, M0 and cc_riscv64
<stikonas>but for fixing files it might be easier to update GAS file, then M1 file then hex2 prototype and then we can update hex0 source and commit to bootstrap-seeds
<stikonas>fortunately, they are not too hard to fix
<oriansj>best not to assume the state of a register before we set it with RISC-V
<oriansj>as there may be register side-effect differences from ecalls as well
<stikonas>hmm, risc-v elf.h file in kernel source does not have ELF_PLAT_INIT
<stikonas>so I think we can't assume that they are set to 0
<stikonas>but yes, at the very least we'll see this bug in hex1 and hex2
<oriansj>actually every architecture, including AArch64 and PowerPC64LE, zeroes on exec
<stikonas>we'll need a new release of stage0-posix at some point
<oriansj>failing to zero would result in a data leak from the kernel
<oriansj>so until they fix that, let's assume it will not be fixed and update our binaries accordingly
<stikonas>as there are some kernels out that don't do zeroing
<gbrlwck>should i add my chmod'ed kaem.riscv64 too, or is there a reason why it's not executable?
<stikonas[m]>But you'll have to update prototypes in riscv64/Development
<stikonas>gbrlwck: yeah, .S change looks alright. But can you do others too?
<stikonas>li s4, 0 translates to RD_S4 MV in .M1 language
<gbrlwck>comment it with "initialize register"?
<gbrlwck>are there any more left? sorry, lost track
<stikonas>yes, then there is hex0_riscv64.hex2 prototype
<stikonas>and finally hex0_riscv64.hex0 file itself
<stikonas>first line is comment (M1 code) and second line is the actual hex encoding
<gbrlwck>p_filesz is the file-size of the compiled file in bytes?
<gbrlwck>this will be 8b 01 00 00 00 00 00 00 ## p_filesz ?
<stikonas>sorry I'm on and off here, so might need to wait a bit for my answers
<gbrlwck>ok, so now my 1.3K assembly hex0 produces a 396 Byte hex0_bootstrapped; but this in turn produces a 392 Byte hex0_bootstrapped2
<stikonas>but that shouldn't really affect the size
<stikonas>against new source it should be producing 396 byte binary
<gbrlwck>i fixed the size(s) and now it seems to work
<gbrlwck>so should this (hex0_b or hex0_b2) become the new seed?
<stikonas>so first push this seed to bootstrap-seeds repository
<stikonas>you can also update bootstrap-seeds submodule in stage0-posix
<stikonas>(after bootstrap seeds is merged just go to bootstrap-seeds subdirectory, git switch master; git pull)
<stikonas>after that you should be able to proceed 1 step further until things break...
<stikonas>I think hex1 binary and hex2 will also need fixing
<stikonas>gbrlwck: does it build (broken) hex1 now?
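The size bytes discussed above are easy to get off by one, so here is a small Python sketch for encoding them. It assumes the field is the 8-byte little-endian `p_filesz` of an ELF64 program header and that the value to store is the size of the whole file, as asked in the log; the helper name `p_filesz_field` is invented for this example, and `bytes.hex(" ")` needs Python 3.8 or newer.

```python
import struct


def p_filesz_field(size: int) -> str:
    """Format a size as the 8 little-endian bytes of an ELF64 p_filesz field."""
    return struct.pack("<Q", size).hex(" ")


def decode_p_filesz(field: str) -> int:
    """Inverse helper: turn the spaced hex bytes back into an integer."""
    return struct.unpack("<Q", bytes.fromhex(field))[0]


print(decode_p_filesz("8b 01 00 00 00 00 00 00"))  # 395
print(p_filesz_field(392))  # 88 01 00 00 00 00 00 00
print(p_filesz_field(396))  # 8c 01 00 00 00 00 00 00
```

So the proposed `8b 01 ...` decodes to 395 bytes, while the 396-byte binary mentioned afterwards would be written as `8c 01 ...`.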
<stikonas>oh, actually hex1 has initialization at the beginning
<gbrlwck> +> ./riscv64/artifact/M0 ./riscv64/artifact/cc_riscv64.M1 ./riscv64/artifact/cc_riscv64.hex2
<gbrlwck>i'll probably continue my works tomorrow; need to eat some dinner now
<stikonas>although it does completely different thing there
<stikonas>fossy: oh, so actually we don't even have M2-Planet -> mes step on amd64
<stikonas>I think nobody tried building it yet, and there is no lib/linux/x86_64-mes-m2 directory
<stikonas>although, of course it later builds 32-bit mes
<oriansj>fossy: I still haven't had time to incorporate the meslibc functions from mes-m2 into M2libc but once I do, then all of the architectures should be able to build and run MesCC (even if MesCC doesn't support that arch yet)
<oriansj>plus I still need to fix the security issue with untar with untrusted inputs.
<oriansj>and get back to doing the armv7l port of stage0-posix
<stikonas>oriansj: can we actually take those functions? M2libc is LGPL and meslibc is GPL
<stikonas>although, M2libc does need some improvements
<stikonas>it's a bit annoying to write each function in assembly
<stikonas>I think usually libc's have a few syscall functions (one per syscall number of arguments)
<oriansj>stikonas: true; however debugging M2libc syscalls is easier than meslibc syscalls (you can just do b FUNCTION_name and boom done)
<stikonas>we had to do quite a bit of debugging for risc-v just before stage0-posix release
<oriansj>So in many ways meslibc gets to benefit from what is learned by stage0-posix and M2-Planet; it is why MesCC always gets architectures AFTER M2-Planet is up and working
<oriansj>hence why despite starting months earlier on RISC-V support, M2-Planet+stage0-posix finished getting it first
<stikonas>well, laanwj got stuck with old hex2 syntax...
<oriansj>until I had to fix it for M2-Planet+stage0-posix
<stikonas>or maybe too many bitcoin PRs to merge...
<oriansj>but now the work is done and it should be much easier for MesCC to gain RISC-V support