IRC channel logs

2021-09-25.log

<stikonas>not sure what is wrong, but when I try to create debuggable M2, it doesn't work ( 1: 0000000000000000 0x6000e80000000f NOTYPE <unknown>: 8 DEFAULT bad section index[ 32] <corrupt>)
<oriansj>stikonas: entirely possible, given that it currently only produces valid and correct ELF sections for PowerPC, PowerPC64, x86, AMD64, armv7l and AArch64. However, figuring out the differences might require a bit of trial and error.
<oriansj>unless your forgot the --64 flag for your 64bit binaries
<oriansj>^your^you^ which seems an easier mistake to make
<oriansj>theruran: we always try to upstream when we improve the bootstrappability of a program.
<oriansj>civodul would most likely be able to get you an answer theruran on the current upstreaming of the wonderful psyntax bootstrapping work of Michael Schierl (who sometimes shows up here)
<stikonas>no, I used --64 flag...
<oriansj>and the hex2 ELF header says 64bit as well for the EI_CLASS
<oriansj>(it is 1 for 32bit and 2 for 64bit)
<stikonas>yes... I just took non-debug header and added those 4 modifications that deal with sections
<stikonas>I've also found yet another bug in cc_riscv64. Forgot to use LUI for constants... With that fixed M2-Planet still doesn't run successfully... But it goes much further. I think the whole tokenization part might be working now but it's a bit hard to debug without blood-elf
<oriansj>e_shoff is %ELF_section_headers>ELF_base 00 00 00 00
<oriansj>e_shentsize is 40 00
<oriansj>e_shnum is 05 00
<oriansj>e_shstrndx is 02 00
<stikonas>yes, that's right...
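The header fields quoted above (EI_CLASS, e_shoff, e_shentsize, e_shnum, e_shstrndx) sit at fixed offsets in a 64-byte ELF64 header. The following is a minimal sketch for checking them by hand, assuming an ELF64 file; the function name `read_elf_section_fields` is made up for illustration and is not part of mescc-tools or readelf.

```python
import struct


def read_elf_section_fields(header: bytes) -> dict:
    """Extract the section-related fields from a 64-byte ELF64 header."""
    if len(header) < 64 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF64 header")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    if ei_class != 2:
        raise ValueError("this sketch only handles EI_CLASS == 2 (64-bit)")
    endian = "<" if ei_data == 1 else ">"
    # e_shoff is an 8-byte field at offset 0x28 in ELF64.
    (e_shoff,) = struct.unpack_from(endian + "Q", header, 0x28)
    # e_shentsize, e_shnum and e_shstrndx are the last three 2-byte fields.
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from(endian + "HHH", header, 0x3A)
    return {
        "ei_class": ei_class,
        "e_shoff": e_shoff,
        "e_shentsize": e_shentsize,
        "e_shnum": e_shnum,
        "e_shstrndx": e_shstrndx,
    }
```

To use it, read the first 64 bytes of a binary opened in `"rb"` mode and pass them in. For the header discussed here one would expect e_shentsize 0x40, e_shnum 5 and e_shstrndx 2.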
<oriansj>stikonas: and readelf -a isn't showing any warnings?
<stikonas>it shows, that's where that warning came from
<stikonas>( 1: 0000000000000000 0x6000e80000000f NOTYPE <unknown>: 8 DEFAULT bad section index[ 32] <corrupt>) and a lot more similar lines
<stikonas>it looks like https://paste.debian.net/1213153/
<stikonas>oriansj: oh, I think I know what's the issue
<stikonas>it's due to redefined risc-v symbols ! @ etc...
<stikonas>M0/M1 are interpreting them in risc-v way...
<oriansj>yeah that would absolutely do it.
<oriansj>fortunately it is easy to fix, the problem is that it would result in us being unable to leverage M1 to sort out the endianness of the parts
<oriansj>So we would require --little-endian and --big-endian flags for blood-elf.
<stikonas[m]>Fortunately it's C code
<oriansj>so it would technically be a breaking change for blood-elf but one which would only require a handful of kaem file fixes in our bootstrap
<stikonas[m]>So adding another flag is not too bad
<oriansj>So I'd be making it a major release for mescc-tools.
<oriansj>I never wish to do breaking user interface changes without a matching release with release notes to make it explicitly clear.
<oriansj>as small as it might be blood-elf -f foo.M1 --64 -o foo-footer.M1 to blood-elf -f foo.M1 --64 --little-endian -o foo-footer.M1
<stikonas[m]>Agreed
<oriansj>and I'll have to add a warning if --little-endian or --big-endian isn't used to help people catch the change
<oriansj>but it is unavoidable with word based architectures. So I'll get it done tonight
<oriansj>janneke: this will be a breaking change for MesCC too
<oriansj>and this will also impact M2-Planet, stikonas would you mind helping me get RISC-V support into M2-Planet?
<stikonas[m]>Well, tomorrow... Going to bed right now
<oriansj>and once RISC-V is in M2-Planet it'll be a release as well. So I guess I'm gonna have to be productive tonight
<oriansj>This is gonna be one ugly delta
<oriansj>stikonas: when you get a chance tomorrow please verify that this corrects your RISC-V blood-elf issue. And if so, I'll setup a proper release.
<oriansj>and it is finally time to generate warning messages for --LittleEndian, --BigEndian and --exec_enable
<oriansj>and changes are now up
<oriansj>and after you confirm, I'll update the release notes and do a formal release.
<oriansj>Then I'll get started on updating M2-Planet and stage0-posix to reflect the change.
***ChanServ sets mode: +o janneke
<stikonas>oriansj: blood elf works now on M2 risc-v binary that I built
<stikonas>it will let me debug it properly, because without those section names it was almost impossible to follow
<stikonas>(although, in the worst case cc_* can always be modified to spit out GAS assembly)
<oriansj>nice
<oriansj>and readelf -a now looking clean?
<stikonas>oriansj: yes
<stikonas>although, it's more important that gdb is looking alright
<stikonas>and then later we can try to port M2-Planet to risc-v
<stikonas>I guess it's mostly copying strings from cc_riscv64 although reversed order of risc-v M1 files causes some minor adjustments
<stikonas>(also loading constants and labels need AUIPC/LUI and ADDI/ADDIW)
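Loading a 32-bit constant with a LUI/ADDI pair means splitting it into an upper 20-bit part and a signed 12-bit part. Because ADDI sign-extends its immediate, the upper part has to be bumped by one whenever bit 11 of the constant is set. A minimal sketch of that arithmetic follows; `split_hi_lo` and `rebuild` are illustrative names, not functions from cc_riscv64 or M2-Planet.

```python
def split_hi_lo(value: int) -> tuple:
    """Split a signed 32-bit constant into (hi20 for LUI, lo12 for ADDI/ADDIW)."""
    if not -(1 << 31) <= value < (1 << 31):
        raise ValueError("constant does not fit in 32 bits")
    # Sign-extend the low 12 bits, since ADDI treats its immediate as signed.
    lo = ((value & 0xFFF) ^ 0x800) - 0x800
    # Whatever remains after removing lo is a multiple of 4096.
    hi = ((value - lo) >> 12) & 0xFFFFF
    return hi, lo


def rebuild(hi: int, lo: int) -> int:
    """Emulate LUI followed by ADDIW: add with 32-bit wraparound, then sign-extend."""
    raw = ((hi << 12) + lo) & 0xFFFFFFFF
    return raw - (1 << 32) if raw & 0x80000000 else raw
```

For example, `split_hi_lo(0x12345FFF)` gives an upper part of `0x12346` and a lower part of `-1`. On RV64 the wraparound in `rebuild` is what ADDIW provides; a plain ADDI would not wrap for constants near the top of the positive range such as `0x7FFFFFFF`.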
<stikonas>and I guess for all those branching statements (if/for/while/do) we need to implement indirect jumps (branch + jump)
<oriansj>well there is a nearly 1:1 mapping between cc_* and M2-Planet
<stikonas>so far in cc_riscv64 I've only implemented indirect jumps for while statement
<stikonas>(cause M2-Planet has some long while loops that are outside the range of B-type instruction)
<stikonas>yes, I saw that it's very close
<oriansj>So I could use your cc_riscv64 work as a basis for M2-Planet.
<stikonas>yeah
<stikonas>this is what I did for while loops: https://github.com/stikonas/stage0-posix/blob/cc_riscv64/riscv64/cc_riscv64.M1#L5116
<stikonas>so I think same thing will be needed for if/for/do statements (or at least if they are outside range, but that's trickier to check)
<stikonas>the simplest thing I guess is not to do any optimizations and unconditionally do indirect branches
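The range problem comes from the B-type encoding: its immediate is a signed 13-bit even offset, so a conditional branch reaches only -4096 to +4094 bytes. A common workaround is to invert the condition and hop over an unconditional JAL, which reaches about ±1 MiB. The sketch below illustrates that choice; the helper names and the output strings are illustrative pseudo-assembly, not the M1 syntax that cc_riscv64 actually emits.

```python
# Each conditional branch and the branch that tests the opposite condition.
INVERTED = {
    "beq": "bne", "bne": "beq",
    "blt": "bge", "bge": "blt",
    "bltu": "bgeu", "bgeu": "bltu",
}


def fits_b_type(offset: int) -> bool:
    """True if a byte offset is encodable in a B-type instruction."""
    return -4096 <= offset <= 4094 and offset % 2 == 0


def fits_j_type(offset: int) -> bool:
    """True if a byte offset is encodable in a JAL instruction."""
    return -(1 << 20) <= offset <= (1 << 20) - 2 and offset % 2 == 0


def emit_branch(op: str, rs1: str, rs2: str, offset: int) -> list:
    """Return one short branch, or an inverted branch that skips over a JAL."""
    if fits_b_type(offset):
        return [f"{op} {rs1}, {rs2}, {offset}"]
    # The JAL sits 4 bytes after the branch, so its own offset is 4 smaller.
    if not fits_j_type(offset - 4):
        raise ValueError("target is out of JAL range as well")
    return [f"{INVERTED[op]} {rs1}, {rs2}, 8", f"jal zero, {offset - 4}"]
```

Always emitting the two-instruction form, as suggested in the conversation, avoids the range check entirely at the cost of one extra instruction per branch.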
<stikonas>although, I'm not 100% sure yet if all the code is generated correctly
<oriansj>simplest and most inefficient will absolutely work
<stikonas>since I didn't yet get M2 binary to work
<stikonas>it does seem to work for a while, I think tokenization might be done correctly
<stikonas>so there is probably only one bug or so left...
<stikonas>oh, you would also need bootstrap.c... which I haven't pushed yet
<oriansj>well the M2-Planet tests will find most instruction bugs rather quickly; So I'll certainly have to sort it all out when doing that work.
<stikonas>although, I need to test if everything is working there correctly
<stikonas>well, let me push it to the branch anyway
<stikonas>it might be easier to test with M2-Planet tests...
<oriansj>it is easiest to figure out the M1 grammar details for cc_* in M2-Planet as it forces one to quickly deal with the common problem cases
<stikonas>well, my main issue was that debugging was not working well but that is solved now
<stikonas>earlier binaries I could debug because handwritten assembly code was easy to recognize, so I could usually tell which function I'm looking at
<stikonas>not so much with generated code... It all looks very similar
<oriansj>fair. the Big 3 (hex2, M1 and blood-elf) always really need a solid block of testing on new architectures and it is looking like RISC-V is certainly shaking things up a bit (possibly for the better)
<stikonas> https://github.com/oriansj/M2libc/pull/3
<stikonas>yeah...
<stikonas>and I'll still have to look at M0 and possibly hex2-0 bugs
<stikonas>they had some issues with cc_riscv64 generated .M1 files
<stikonas>right now I'm using C versions of those tools for testing
<oriansj>merged
<oriansj>well it is best to hunt for bugs in known good code
<stikonas>true...
<oriansj>So the optimal order would have been mescc-tools -> M2-Planet -> cc_* -> M0 -> hex2.hex1 -> hex1.hex0 -> hex0.hex0
<oriansj>but the reverse is certainly more fun
<oriansj>and a good way to learn assembly programming and low level debugging skills
<stikonas>yeah, I didn't know enough assembly programming to write cc_* initially
<stikonas>so I had to start from small stuff
<oriansj>and it looks like you are getting proficient with RISC-V assembly
<stikonas>well, in some subset. It's probably not the most optimal code
<stikonas>but we don't require that for bootstrapping
<stikonas>and especially cc_* and M2-Planet's output is less efficient on risc-v compared to other arches
<oriansj>optimization in assembly is a different skill set than just straight assembly programming which takes considerably longer to learn.
<stikonas>pushing/popping everything onto stack takes considerably more instructions
<oriansj>but much easier to reason about
<stikonas>yes, so good enough for first compilers
<stikonas>especially one written in assembly
<stikonas>word based instructions might be a mess to support here in stage0-posix but one advantage of those is that M1 code looks really similar to GAS code
<oriansj>well it looks like you are doing quite an excellent job with getting RISC-V into stage0-posix
<stikonas>well, you helped a lot with mescc-tools part
<oriansj>It is certainly the hardest instruction set we have yet to gain support for.
<stikonas>for riscv we had to work on mescc-tools/hex1 in parallel
<stikonas>cause it's hard to tell what mescc-tools might need
<stikonas>until you actually use those things
<stikonas>well, the first one x86 was much harder to get
<oriansj>We had to do major rethinking to work around its word based encoding and it will probably serve as a good roadmap for other architectures that map to words better than bytes
<stikonas>but out of those new arches yes, riscv was probably the trickiest
<oriansj>AArch64 would probably better map to words than bytes but it had the advantage of being reasonable when working with bytes
<oriansj>and we hacked around to shove it into that box
<oriansj>but with RISC-V I just couldn't possibly find a way to get it to fit in the byte box
<stikonas>yeah, it's quite messy at instruction encoding level
<oriansj>so thank you stikonas for pushing me to finally find a working solution for word based instruction sets. It will likely be quite useful as we add more architectures.
<stikonas>and hex1 is almost as large as hex2 for some other arches
<stikonas>what else might be using words?
<stikonas>well, AArch64 is already done, so maybe not worth redoing it
<oriansj>RISC-V, MIPS, SPARC, AArch64 and PowerPC are the big ones
<stikonas>well, yes, MIPS is quite similar to risc-v
<oriansj>Itanic too but someone else will have to deal with VLIW bootstrapping; because seriously fuck that noise.
<stikonas>Itanium?
<oriansj>yeah it doesn't map to bytes or words
<stikonas>well, Itanium and MIPS are both deprecated now
<stikonas>probably not worth bootstrapping on those
<stikonas>MIPS company switched to risc-v
<stikonas>and Itanium is also not making new chips anymore
<oriansj>stikonas: well any hardware one wants to support can be supported if someone is willing to do the work.
<stikonas>well, yes, if someone is willing but somebody must already have old hardware...
<oriansj>or bought it on ebay
<stikonas>or just do it for fun in qemu
<oriansj>DEC Alpha might be fun now that we have word based encoding
<stikonas>well, the easiest target is probably riscv32 now
<stikonas>that can be done quite cheaply
<stikonas>maybe somebody else here wants to do it too
<oriansj>well let us finish riscv64 first
<stikonas>well, yeah
<stikonas>there is no point of copying riscv64 files into riscv32 files and ending up with the same bugs
<oriansj>other fun new architectures that become possible with word based encoding: Lisp chips
<oriansj>small-talk cpus and FORTH cpus (if one is willing to sort out non-byte behavior; which I really do not)
<oriansj>I suspect bootstrapping has an infinite work space in all directions. (up stack, down stack and sideways [new ports])
<oriansj>but it is a lot of fun to know how everything works down to the smallest detail
<stikonas>hmm, I think something is still messed up with function calls in cc_riscv64. fputs("Test\n", stderr); before program() in M2-Planet prints it fine but first statement in the function prints gibberish
<stikonas>it's a bit surprising that other functions seem to work
<stikonas>something is peculiar about program...
<stikonas>ok, it's actually initialize_types(); function before program() where something bad happens
<stikonas>oriansj: I now think cc_riscv64 problem might actually be string loading. Somehow auipc/addi pair does not correctly load strings...
<stikonas>not sure why...
<stikonas>either I misunderstand somehow how to use it (more likely) or something is messed up in hex2
<stikonas>ok, it's something weird going on with weird strings
<stikonas>if I manually move my string "CONSTANT" before :STRING_get_token_12 ' 27 22 0' then loading it works fine, if after it's just some corrupted data
<stikonas>oh, and the answer is a single 0
<stikonas>and M2 is starting to work with this fixed...
<stikonas>at least on simple programs...
<stikonas>probably will work on other things too...
***ChanServ sets mode: +o janneke_
***janneke_ is now known as janneke
<mihi>theruran: See https://www.freelists.org/post/bootstrappable/Can-Guile-be-bootstrapped-from-source-without-psyntaxppscm,13 for latest status of upstreaming psyntax-bootstrap. TL;DR: If you have sold your soul to RMS you may be able to help getting it upstream :)
<mihi>tbh I don't care enough about jumping through copyright assignment hoops/politics/shenanigans for getting such a niche thing upstreamed.
<oriansj>stikonas: you need to output < after strings to pad to alignment otherwise the strings that follow will not be on aligned addresses.
<stikonas>oriansj: well, in that case it was even worse
<stikonas>only one 0 meant that everything after it was shifted by 4 bits
<stikonas>so was completely messed up
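Two separate effects are being described here: strings need padding so that whatever follows stays word-aligned, and a lone hex digit leaves the stream half a byte out of step. The sketch below is a toy model of both, assuming a reader that simply pairs hex digits in stream order; `pad_string` and `pack_nibbles` are illustrative names and do not reproduce how hex2 or M1 are implemented.

```python
def pad_string(text: str, alignment: int = 4) -> bytes:
    """NUL-terminate a string and pad it with zero bytes up to the alignment."""
    data = text.encode("ascii") + b"\x00"
    remainder = len(data) % alignment
    if remainder:
        data += b"\x00" * (alignment - remainder)
    return data


def pack_nibbles(hex_text: str) -> bytes:
    """Pair hex digits in stream order, ignoring whitespace; drop a dangling digit."""
    digits = [c for c in hex_text if not c.isspace()]
    pairs = ["".join(digits[i:i + 2]) for i in range(0, len(digits) - 1, 2)]
    return bytes(int(pair, 16) for pair in pairs)
```

With this model, `pack_nibbles("27 22 00 41 42")` keeps `41 42` intact, while `pack_nibbles("27 22 0 41 42")` pairs the lone `0` with the following `4`, so every later byte is shifted by 4 bits.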
<stikonas>anyway, after that fix things seem much better
<stikonas>so we can maybe merge it
<stikonas>there are some issues with hex2 and M0 though
<stikonas>M0 seems to have some issue with malloc pointer, I suspect what happens is that brk does not allocate memory
<stikonas>and code does not have check for brk failure
<stikonas>and that causes pointer value to be wrong which crashes on dereferencing
<oriansj>brk should never fail unless you run out of RAM
<stikonas>and hex2 seems to have some issue with I instructions, which messed up a few bytes in final binary
<stikonas>hmm
<oriansj>it can allocate only page sized blocks at a time which could result in a failed allocation too but that has only been noted on BSD systems
<stikonas> https://github.com/oriansj/stage0-posix/pull/46
<stikonas>well, maybe qemu messes things up...
<stikonas>well, this PR just builds up to cc_riscv64, in order to see bug you should build M2
<stikonas>M0 hold M2.M1 step crashes fairly frequently for me
<oriansj>merged
<stikonas>if you have risc-v qemu, maybe you can try building M2 and see if you also get crashes?
<stikonas> https://paste.debian.net/1213236/
<oriansj>in gdb bt can help you find where the crash comes from
<stikonas>well, it comes from reverse list function
<stikonas>and like I said, the value in $s4 looks very strange
<stikonas>it's not memory address
<stikonas>but it should be brk pointer
<oriansj>reverse list can't crash if implemented correctly, as it is iterative (no stack usage) and stops when it hits a null, unless brk is returning non-null values in memory, in which case you'll need to do zeroing on allocation
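The argument above rests on the shape of an iterative list reversal: a single loop, no recursion, terminating at the first null link. A minimal sketch of that shape follows, with `Node` and `reverse_list` as illustrative names rather than the M0 assembly routine itself.

```python
class Node:
    """One cell of a singly linked list."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def reverse_list(head):
    """Reverse a singly linked list in place and return the new head."""
    prev = None
    while head is not None:   # the only exit is a null link
        following = head.next
        head.next = prev      # point this cell back at the previous one
        prev = head
        head = following
    return prev
```

The loop is only safe if every unused `next` link really is null. In the assembly version that corresponds to freshly allocated memory being zeroed; a garbage link would be followed as if it were a valid cell.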
<oriansj>stikonas: looking into it now
<oriansj>sin shows your M0 is introducing forbidden chars \xA0 \xE3 \x03
<oriansj>STRING_weird_4
<stikonas>sorry, was a bit disconnected, due to router configuration problems..
<stikonas>I can see what you wrote in the logs...
<oriansj>it truncated the last 12 bytes and replaced it with garbage
<oriansj>So your raw string support in M0 has a fixed size issue
<stikonas>hmm
<stikonas>ok, let's see the code
<stikonas>hmm, it's doing srli a0, a0, 2; addi a0, a0, 1; slli a0, a0, 3
<oriansj>well it isn't the long lines
<oriansj>and looking at STRING_weird_4 in M2.M1, it isn't raw string but string literal
<oriansj>and it isn't happening by a single string literal by itself
<stikonas>but it's also not happening every time...
<stikonas>so there must be some other randomness coming from somewhere
<stikonas>is it because we don't erase the allocated memory?
<oriansj>ok I have a minimal test
<stikonas>oh, that's good
<stikonas>should be easier to debug
<oriansj> https://paste.debian.net/1213242/
<stikonas>truncated after 73?
<stikonas>with two
<stikonas>``
<stikonas>AMD64/M0 crashes on this file
<stikonas>and so does the x86 version
<stikonas>AArch version surprisingly works
<stikonas>they crash in In_Set function...
<stikonas>no, that's not In_Set...
<stikonas>it's File_Print
<oriansj>it's a line in cc_strings.c that every version has successfully built
<oriansj>so it isn't the single line
<stikonas>yes, I understand...
<stikonas>but when you just extracted those two strings, amd64 version of M0 crashes...
<stikonas>while riscv64 prints some garbage
<oriansj>cc_riscv64 also outputs it differently than all the other architectures: https://paste.debian.net/1213247/
<oriansj>so nice bug find stikonas
<stikonas>oh, but that is weird string, isn't it
<stikonas>so it should be in single quotes '
<oriansj>technically not a weird string as it can't be mistaken by M0/M1 as something other than a raw string.
<oriansj>that being said I'm going to debug AMD64 to try to figure it out more
<stikonas>oh ok
<stikonas>so there is probably a bug in cc_riscv64 weird string detection function that exposed this M0 bug on other arches
<oriansj>so double bug find. nice
<stikonas>ok, I think I might see the bug in weird: function
<stikonas>let me test
<stikonas>actually no...
<stikonas>I think it's fine
<stikonas>AMD64 version compares In_Set result to 1 and riscv64 to 0
<stikonas>but that should be fine (one is jump not equal and the other is jump if equal)
<stikonas>oh, but in the next piece of code riscv version is wrong
<stikonas>let me check then
<oriansj>found it