IRC channel logs

<stikonas>hmm, almost done with assembly version of kaem-minimal for riscv64 but now can't find syscall number for fork, strange...

<stikonas>oh, there is no fork on risc-v

<stikonas>hmm, there must be something else to use...

<stikonas>ok, there is clone

<oriansj>A call to fork() is equivalent to a call to clone(2) specifying flags as just

<oriansj> SIGCHLD.

<stikonas>yeah, it seems there are more syscalls that are missing

<stikonas>I guess all "obsolete" ones are not added

<stikonas>it makes sense for new software... But might be a bit painful during bootstrap for older software (depending on what libc does)

<oriansj>I guess one could say RISC-V doesn't care about bootstrapping or legacy software or assembly writers or compiler writers

<xentrac>one could say that, but one could say anything

<oriansj>xentrac: fair

<xentrac>much as one could say that Russia is in my teacup, or that molten sealing-wax is the best gift for winning the hearts of foreign potentates

<oriansj>well Russian Map Teacups are a thing and I can't make claims about the hearts of foreign potentates but RISC-V is less bootstrap friendly than AArch64 which is less bootstrap friendly than ARMv7l

<oriansj>For all of the design sins of x86, bootstrapping it is not one of them.

<xentrac>Indeed!

<xentrac>but I think writing an assembler for RISC-V is something like 2% of the effort of writing an assembler for amd64

<oriansj>xentrac: well depends entirely on what you mean by an assembler. For example M0+hex2 function like an assembler and linker and it appears the task of writing those for RISC-V is more than triple the effort needed for AMD64

<fossy>oriansj: i think that risc-v cares about being simple. which means less syscalls and not implementing legacy stuff

<fossy>which I 100% think is not a bad thing

<fossy>it does make bootstrapping harder, but almost everything easier, lol

<fossy>simple hardware ~= hard software :P

<oriansj>fossy: simple for hardware implementation, you are absolutely right. Which is probably something they cared about. The Funky immediate formats and the like do reflect those sorts of engineering choices. Bootstrapping just was not something they selected for.

<siraben>What version of GCC does live-bootstrap go up to? 4.0.4?

<stikonas[m]>siraben: 4.7.4

<stikonas[m]>There isn't much problem going further

<stikonas[m]>Prob just build latest binutils first

<siraben>huh looks like Nix already has busybox sh built in, so we won't need to bootstrap the shell

<siraben>nevertheless I found the place where the bootstrap tarball is generated

***Server sets mode: +cnt

***ChanServ sets mode: +o oriansj

<oriansj>well atleast at this point the upstream blynn-compiler is buildable by live-bootstrap so once it gets to the stage able to bootstrap GHC, we can leverage it without changes.

<siraben>oriansj: ah, did you test blynn-compiler with that GCC already?

<siraben>Ok, so we don't need to necessarily make it work from M2-Planet

***Noisytoot_ is now known as Noisytoot

<siraben>Why do things like kaem use int instead of size_t when indexing into arrays and such?

<siraben> https://github.com/oriansj/kaem/blob/master/functions/match.c

<siraben>isn't that undefined behavior?

<stikonas>leftover from before M2-Planet supported size_t?

<GeDaMo>size_t is unsigned

<stikonas>well, as long as you use just positive int's it will work with int too

<GeDaMo>I'm just pointing that out because the snippet that siraben linked sets i to -1

<oriansj>siraben: very short version: is when we first wrote kaem we were sticking to cc_x86 types https://github.com/oriansj/stage0-posix/blob/master/x86/cc_x86.M1#L4757 which as you can see don't include size_t

<siraben>I mention this because I've been attempting to formally verify some of the functions (good exercises) but the indexing is tripping me up all the time

<siraben>or at least according to the encoding of the C spec used in the proof assistant

<GeDaMo>Should that match function check if a or b are NULL?

<siraben>well assume they are not null and it's fine (stronger precondition)

<siraben>what's the spec say on overflow behavior of unsigned, it truncates, right?

<oriansj>siraben: we can always make the C pieces more formally correct without much effort.

<siraben>this is just a minor side effort because I've been trying to find some more "realistic" C programs to verify for practice

<siraben>it's definitely overkill for bootstrappable

<oriansj>GeDaMo: would you want to dereference a null pointer in a compare?

<GeDaMo>"Unsigned integer arithmetic is always performed modulo 2n where n is the number of bits in that particular integer. E.g. for unsigned int, adding one to UINT_MAX gives 0, and subtracting one from 0 gives UINT_MAX." https://en.cppreference.com/w/c/language/operator_arithmetic

<GeDaMo>oriansj: I would not :P

<siraben>GeDaMo: right, that is extremely useful

<siraben>otherwise the use of int means I have to assert that i is always less than Int.max_signed, which may not always be true

<siraben>as the website says signed overflow is undefined

<stikonas>well, I guess because for signed integers it depends on how sign is encoded

<stikonas>siraben: but here it's not as important whether spec says it's defined or not

<stikonas>cause M2-Planet can define it if it wants

<stikonas>(not sure how it deals with overflows)

<oriansj>siraben: well the behavior given an array of length less than 2^(n-1) will behave the same as an int as an unsigned (size_t)

<siraben>that's true, I can just add those preconditions to my formal spec

<oriansj>So assuming int32 and array of say 4K elements overflow should never be reached and negative numbers could be used to detect underflow. (or using unsigned underflow would be detected asn overflow)

<oriansj>stikonas: The short version of how M2-Planet deals with under and overflow is simply what the hardware behavior is by default.

<siraben>could M2-Planet be adjusted to have size_t?

<stikonas>siraben: M2-Planet has size_t

<stikonas>it didn't have before

<siraben>Oh, so kaem could be using size_t there instead?

<stikonas>yes, but nobody updated that piece of code

<oriansj>siraben: https://github.com/oriansj/M2-Planet/blob/master/cc_types.c#L111

<siraben>between do..while and while..do, which one is compiled more efficiently?

<stikonas>I guess most compilers leave signed integer overflow to hardware default, hence it is "undefined" as different hardware might have difference conventions

<siraben>compiled by M2-Planet*

<stikonas>even if it's not efficient, that wouldn't matter too much, kaem is not the bottleneck

<siraben>Also, what's the command used to automatically format C code in the bootstrap files?

<siraben>stikonas: yeah, but just out of curiosity, because I wonder why it's written a bit more awkwardly with do while

<oriansj>siraben: astyle -A1tSxejz2fpUxV file.c

<siraben>oriansj: thanks

<oriansj>there are no such things as while .. do loops in C but there are while loops and do while loops and for loops. Of which do while loops are at most 2 instructions shorter.

<siraben>sorry not while..do, but do..while vs. just while

<oriansj>as it just outputs a label and a conditional jump at the end

<oriansj>while loops have a label, conditional jump, the code block, an unconditional jump and a label

<siraben>Right.

<oriansj>So in terms of efficiency it is best to use the one that best matches the work rather than trying to shave off a single jump

<oriansj>for loops are the most inefficient and I know why C dropped the loop {} primitive despite it being trivial to implement. (a unconditional jump at the end and 2 labels)

<xentrac>indexing into arrays with ints is not undefined behavior :)

<xentrac>I mean, it depends on which int, but not in general

<siraben>xentrac: misspoke then, it turns out the obligation to assert that the int does not overflow makes verification harder

<siraben>especially if it comes from traversing an unbounded amount of bytes from an input pointer

<siraben>sample programs: https://github.com/PrincetonUniversity/VST/tree/master/progs64

<siraben>which I think are pretty realistic and comparable to files we have

<xentrac>you can compile while(a) b; as jmp end\@; start\@: K(b); end\@: K(a); jnz start\@

<xentrac>this is only one more instruction than a do-while loop rather than 2

<xentrac>usually I think you need to prove bounds on your loop counter anyway if you want to verify the loop?

<xentrac>I mean if you do things inside the loop like array indexing or arithmetic?

<siraben>yeah I have to show in particular an invariant holds on the loop

<xentrac>oriansj: M0+hex2 don't know most of the amd64 mnemonics

<siraben>and that this invariant holds on loop entry and on the ith iteration

<siraben>and all array accesses must be in bounds

<siraben>but anyway, i'm still relatively new to applying verification to C and the level of detail is a bit annoying (but necessary) compared to idealized imperative languages given for learning purposes

<xentrac>you're way ahead of me tho

<xentrac>we could, you know, use an idealized imperative language if we want to rewrite stuff in it

<siraben>xentrac: it's a lot of effort to do that, instead, the C spec has already been cleaned up and formalized and a compiler targeting x86, ARM, RISC-V has been verified to respect the semantics

<siraben>I used this book https://softwarefoundations.cis.upenn.edu/vc-current/toc.html

<xentrac>well, a lot of effort will be required in any case

<xentrac>also do you mean CompCert? CompCert may be verified but it isn't free software

<siraben>I realize :(

<siraben>Well non-commercial use is allowed, right?

<xentrac>I mean, neither was seL4 for the first several years, but we may not get so lucky this time around

<xentrac>that's my understanding, yes

<xentrac>(and CompCert is already older than seL4)

<xentrac>there were a lot of programs I used pretty often in the early days of Linux that have pretty much died because of being for non-commercial use only

<siraben>I'd wager it's possible to create a language suitable for bootstrapping while having a small and cleab semantics

<xentrac>the Seyon terminal emulator, xv, ncftp

<siraben>clean*

<siraben>I see

<xentrac>admittedly there is less need nowadays for interacting with dialup modems or ftp sites

<siraben>bbl, night all

<xentrac>night!

<oriansj>xentrac: well you are partially right but not entirely. As M0+hex2 knows 0% of AMD64's mnemonics. But that is why we use DEFINE statements and that is more than enough to bootstrap everything. RISC-V needs considerably more than that.

<xentrac>yeah, it needs the bits to be shuffled like some perverted card trick

<oriansj>we just leverage the fact that Immediates are byte aligned, contiguous and a fixed number of bytes in size. RISC-V however gets complicated because the documentation like ARMv7 is wrong AND has a messy bit ordering.

<xentrac>oh interesting, what documentation problems have you run into?

<oriansj>xentrac: little endian instructions but big endian left to right in docs

<oriansj>so docs show [ab][cd][ef][01] but reality is [01][ef]cd][ab]

<stikonas>kaem-minimal for riscv64 GAS prototype is getting closer, need to fix some minor (it segfaults when executable is not found) issues but almost works

<oriansj>stikonas: nice, keep up the good work ^_^

<stikonas>most of the problems were where I assumed things instead of reading manpages...

<xentrac>oriansj: that's explained on p. 9 of §1.5 in "RISC-V Unprivileged ISA V20191213", subsection "Expanded Instruction-Length Encoding"

<xentrac>> Instructions are stored in memory as a sequence of 16-bit little-endian parcels regardless of memory system endianness. Parcels forming one instruction are stored at increasing halfword addresses, with the lowest-addressed parcel holding the lowest-numbered bits in the instruction specification.

<xentrac>(p.27/238)

<xentrac>> We have to fix the order in which instruction parcels are stored in memory, independent of memory system endianness, to ensure that the length-encoding bits always appear first in halfword address order. This allows the length of a variable-length instruction to be quickly determined by the instruction-fetch unit by examining only the first few bits of the first 16-bit instruction parcel.

<xentrac>> We further make the parcels little-endian to decouple the instruction encoding from the memory system endianness altogether.

<xentrac>> Once we had decided to fix on a little-endian instruction encoding, this naturally led to placing the length-encoding bits in the LSB positions of the instruction format to avoid breaking up opcode fields.

<xentrac>I thought that was pretty clear!

<oriansj>xentrac: yes the words are clear with context but the pictures of the instruction formats do not match the words.

<xentrac>they seem to match to me; the length-encoding bits are in the LSB positions

<xentrac>and the bits are numbered with (for the 32-bit format) bit 31 at the left, which clearly can't be a length-encoding bit, and bit 0 at the right, which is part of the opcode field

<xentrac>I don't see any inconsistency at all

<oriansj>[func7 7][]rs2 5][rs1 5][funct3 3][rd 5][opcode 7] is what R format looks like in the diagrams. [rd 1][opcode 7], [rs1 1][funct3 3][rd 4],[rs2 4][rs1 4], [funct7 7][rs2 1] look very different to me

<xentrac>well sure, if you read it as four bytes you have to put them in the right byte order to get the right 32-bit word

<xentrac>there you've put them in big-endian order, but the right byte order is little-endian order, which gives you

<xentrac>[funct7 7][rs2 1], [rs2 4][rs1 4], [rs1 1][funct3 3][rd 4], [rd 1][opcode 7]

<xentrac>which is the same

<xentrac>I mean it's the same problem you have with a 32-bit integer in memory on i386 or amd64

<xentrac>0x804800 is represented as the sequence of bytes 0x00, 0x48, 0x80, 0x00

<oriansj>xentrac: hex2 and M? are big endian languages that optionally output pointers and immediates in little endian order.

<oriansj>they read a byte and output that byte

<oriansj>So DEFINE foo aabb will write the byte aa and then the byte bb

<oriansj>we are not writing out words but individual bytes

<oriansj>which is why for i386 and AMD64, we just did DEFINE ADDI8_to_RAX 4883C0 and ADDI8_to_RAX !-1 and it just works

<fossy>urgh the linux kernel minimum requirements docs are wrong

<oriansj>or DEFINE CALLI32 E8 and CALLI32 %foo and it worked

<fossy>our version binutils is meant to work but it most def dosent

<oriansj>fossy: nice find

<xentrac>oriansj: yeah, makes sense. not sure that necessarily means risc-v is harder to bootstrap, but it's definitely different from the byte orientation of the Datapoint 2200

<fossy>im wondering if its cause its an LTS (4.9.x series) but it changed and noone updated it

<fossy>so im testing older of that

<fossy>if that dosent work ill move to 4.4.x

<xentrac>so tools that work well for bootstrapping one may not work well for bootstrapping the other

<oriansj>xentrac: well the word orientation means we might not be able to do a reasonable hex1 and might have to skip to doing a full hex2 in hex0 (that is going to really suck)

<oriansj>xentrac: these tools worked fine for 5 other architectures

<xentrac>I wonder if there's a simpler way to do RISC-V

<oriansj>but you are right, there *MIGHT* be a more optimal bootstrapping path for RISC-V but thus far no one has figured it out yet.

<oriansj>I'm certain stikonas would love it if someone else saved them the work of having to hand convert RISC-V instructions into hex

<oriansj>hence why when there was a working solution for improving RISC-V's hex2, I incorporated it in a way that would be as easy as possible to implement in assembly.

IRC channel logs

2021-07-30.log