IRC channel logs

2021-11-12.log

<oriansj>literally 50 lines if you want to get fancy: https://github.com/oriansj/Slow_Lisp/blob/master/lisp_read.c#L56
<stikonas>well, lisp is basically designed to have a very simple syntax
<oriansj>well s-expressions wasn't so much designed as stumbled upon while trying to figure out how to implement M-expressions
<oriansj>but yeah, if implementation simplicity and ease of porting was the goal for C, s-expressions would have been much easier to do in assembly
<oriansj> https://github.com/oriansj/stage0/blob/master/stage2/lisp.s#L124 like 60 lines of assembly total
<oriansj>oh and your M2-Planet enhancement has been merged.
<oriansj>and we found a crash
<stikonas>hmm, that's quick...
<stikonas>oriansj: could it be that head->prev->s crashes?
<stikonas>e.g. if you give input like += 1
<stikonas>oriansj: yes, that crashes
<stikonas>oriansj: just put "+=" as input file
<oriansj>yes that is exactly the case found
<oriansj>I'm thinking of putting the fix in insert_token
<stikonas>yeah, I have a working fix
<stikonas>probably the same as yours
<oriansj>a simple require is all that it requires (unless you want a full line error message)
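A minimal C sketch of the failure mode discussed above, assuming a doubly linked token list. The struct layout and the helper name `previous_text` are hypothetical and are not taken from M2-Planet; the fix mentioned in the chat uses a `require` check instead of returning NULL.

```c
#include <stddef.h>

/* Hypothetical token node; the field names mirror the head->prev->s
   expression from the chat, but the layout is an assumption. */
struct token {
    struct token *prev;
    struct token *next;
    char *s;
};

/* Return the text of the previous token, or NULL when there is none.
   Dereferencing head->prev without such a check is presumably what
   crashes when "+=" is the very first token of the input. */
char *previous_text(struct token *head)
{
    if (head == NULL || head->prev == NULL) {
        return NULL;
    }
    return head->prev->s;
}
```

A compiler would more likely abort with an error message at this point than carry on with NULL, which is what the `require` approach amounts to.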
<stikonas> https://github.com/stikonas/M2-Planet/commit/ba51f3dcc8807a7e3d9e68686732a3c28de87d20
<oriansj>but yeah, I'm finishing up the commit locally
<stikonas>yeah, I also had require
<oriansj>I guess that works too; I'll merge yours
<stikonas> https://github.com/oriansj/M2-Planet/pull/32
<stikonas>and I'm updating stage0-posix
<stikonas>and it's here https://github.com/oriansj/stage0-posix/pull/65
<stikonas>something went wrong with github PR title...
<stikonas>but the commit inside seems to be the correct one
<oriansj>and merged
<oriansj>and looks like fuzzing on the new fix, is looking good.
<stikonas>ok, that's good
<stikonas>well, that crash was easy to spot once you said there is a crash
<oriansj>well yes, the more frequently we fuzz, the less places for possible bugs to creep in
<avalenn>Hello. Where should I start to read stage0 and stage1 source code?
<avalenn>In a pure "I am one of those 70% of developers able to understand it so I will try" way.
<riv>wb avalenn
<riv>start at stage0
<avalenn>s/able to/that should be able to/
<oriansj>avalenn: well first which architecture would you like to use for trust: x86, AMD64, RISC-V (64bit), AArch64 or knight?
<oriansj>for x86, AMD64, AArch64 or RISC-V you'll want to start in their respective folder in here: https://github.com/oriansj/stage0-posix but for the knight bare metal bootstrap: https://github.com/oriansj/stage0
<avalenn>I will focus on RISC-V for now.
<avalenn>What is knight?
<oriansj>A CPU ISA
<oriansj>the roots of trust can be found here: https://github.com/oriansj/bootstrap-seeds (or replaced by any functional equivalent)
<oriansj>general descriptions can be found here: https://github.com/oriansj/talk-notes with the actual bootstrap path being expressed here: https://github.com/oriansj/talk-notes/blob/master/live-bootstrap.dot (or if you don't wish to generate and want just the pretty picture: https://github.com/oriansj/talk-notes/blob/master/live-bootstrap.pdf )
<oriansj>You can find details about knight here: https://github.com/oriansj/stage0/tree/master/Knight%20Reference
<oriansj>and even submit suggested improvements to the ISA if you so desire to do so.
<avalenn>Is it an ISA designed for this project?
<avalenn>Thank you for the links. I will try that.
<oriansj>avalenn: nope, it existed long ago but never became popular; so its patents are expired
<oriansj>and could be implemented out of individual logic gates by a motivated individual
<avalenn>Ok. I could not find any pointer to Knight after a quick internet search. Thus my question.
<stikonas>oriansj: I've realized that there is an issue with those new assignment operators. If we want to fix it, probably the whole thing needs rewriting
<stikonas>but I think it wouldn't affect building mes
<stikonas>it doesn't support expressions on the left hand side...
<stikonas>so stuff like a->b += 1;
<stikonas>avalenn: I think the company that made Knight folded up before Internet became a thing
<stikonas>hence can't easily find any references there
<stikonas>avalenn: and if you are focussing on risc-v, it's me who wrote most of the early bootstrap code for risc-v so feel free to ask questions
<stikonas>risc-v bootstrap is done in stage0-posix but anything after that (e.g. mes/mescc) is still WIP
<stikonas>well, mes-m2 actually builds, but mescc has not yet been ported
<stikonas>I think gbrlwck is trying to port mescc to risc-v
<avalenn>stikonas: I am not sure I really will take the time to read all, depends a lot on my other activities. But if I have questions I now know where to ask them ;-)
<gbrlwck>stikonas: yes, i am (still)..
<gbrlwck>that being said: i get mescc to emit riscv64 M1 code, but ELF header and stuff is missing. i have implemented longjmp and setjmp (not sure if i did that correctly)... https://github.com/gbrlwck/mes-m2/
<gbrlwck>any ideas?
<stikonas>gbrlwck: have you added a copy of ELF header to mes-m2?
<stikonas>e.g. x86 version is in ./lib/linux/x86-mes/
<gbrlwck>i think that's it!
<muurkha>stikonas: arguably internet existed
<stikonas[m]>Well, yes, but web didn't
<stikonas[m]>And when people say searched on the internet they mean searched on the web
<muurkha>I searched on archie
<gbrlwck>muurkha: archie?
<gbrlwck>does elf64-0exit-42.hex2 _have_ to be aligned?
<stikonas>probably not
<stikonas>other arches don't seem to do that
<stikonas>risc-v is fixed width instruction set architecture anyway
<stikonas>so everything is always aligned to 4 bytes
<stikonas>well, at least .text part is aligned. .data parts might not be aligned
<ekaitz>stikonas: is riscv port ready?
<ekaitz>i've been reading all the steps and it looks like you made a lot of work!
<stikonas>ekaitz: well, stage0-posix port is done
<ekaitz>wow!
<ekaitz>amazing
<stikonas>ekaitz: and we can even build mes-m2 but only on real HW
<stikonas>does not work that well in qemu-user emulation
<ekaitz>man this is awesome
<stikonas>(due to us using a lot of brk space that qemu does not like)
<stikonas>ekaitz: and mescc port is in progress (but not by me)
<ekaitz>oh really cool
<stikonas>but after that it's more complicated...
<ekaitz>i've been out for a couple of months and all this happened!
<stikonas>mescc->tcc only works on x86
<stikonas>ekaitz: well, now I moved on to improving M2-Planet
<stikonas>to actually make mes-m2 unnecessary
<stikonas>and to build upstream mes
<ekaitz>oh great
<ekaitz>if everything goes as expected I'll work on Mes in the near future
<ekaitz>so we'll see if we can make tcc and everything work
<stikonas>things that got added over last month are #ifdef, #ifndef, #undef, #error, #if VARIABLE, double arrays of pointers, e.g. argv[i][j] (but not int [i][j])
<stikonas>then also variable dereferencing, i.e. *p
<stikonas>and last change is partial support for compound assignments, e.g. a += 1;
<gbrlwck>ekaitz: i have no ETA, but i'm currently working on MEScc to emit riscv64 M1
<stikonas>gbrlwck: it would be good if you eventually submit your port to upstream mes too
<ekaitz>great!
<stikonas>(as opposed to mes-m2 fork)
<gbrlwck>stikonas: that's the plan :P but first i need to get stuff to work
<ekaitz>next is to make TCC work on riscv too, right?
<stikonas>yeah, upstream mes is mostly the same stuff, but need to adjust their build system
<ekaitz>there's a lot of work to do around there
<stikonas>ekaitz: well, TCC is trickier...
<ekaitz>ofc
<stikonas>we can only build older patched version of TCC
<stikonas>so 1) need to backport it from newer TCC
<stikonas>2) fix older TCC to work on more than just x86
<stikonas>this is the fork of TCC that we were building https://gitlab.com/janneke/tinycc/-/tree/mes-0.23.24
<ekaitz>in any case that is later building some gcc that we'd need to backport too, right?
<stikonas>yeah, that might also be complicated
<stikonas>also binutils
<stikonas>(in particular GAS)
<stikonas>the newest binutils that we could build was 2.14
<stikonas>(could build early in the bootstrap)
<ekaitz>well we'll see if I can help with any of this
<gbrlwck>ekaitz: of course you can
<gbrlwck>maybe s/if/how/
<ekaitz>yeah it could be
<ekaitz>i'm capable of the best and the worst so we'll see hahah
<stikonas>and there is also riscv32 that is not done
<ekaitz>it shouldn't be very difficult
<stikonas>well, getting elf header would be a start
<stikonas>and you already have experience in that. After all you wrote riscv64 header
<stikonas>and yes, a lot of stuff can be taken from riscv64
<stikonas>just need to adjust to smaller register size
<ekaitz>i basically copied the x86 one and changed two fields
<ekaitz>with the info I got from the wikipedia page
<ekaitz>:)
<ekaitz>i'll give it a shot these days and see if I can make hex0 for riscv32
<stikonas>probably just need to change ld->lw after elf-header is ready
<stikonas>well, and immediates that ld loads
<ekaitz>yes and from that i can dig a little bit on the next steps of the process and check the changes they will need
<muurkha>gbrlwck: archie was an internet search engine that predated the web
<muurkha>it searched ftp sites
<stikonas>that's still from 1990...
<stikonas>I think knight is from seventies
<muurkha>yeah
<muurkha>and it's arguable whether the internet was even an internet then
<gbrlwck>hmmm... i get "mescc: file not found: "x86-mes/libmescc.a"".. how do i generate that?
<gbrlwck>and "riscv64-mes/libmescc.a"
<gbrlwck>nvm
<gbrlwck>does "Target label FUNCTION___init_io is not valid" mean it's missing some libc?
<stikonas>gbrlwck: init_io seems something from M2libc
<stikonas>gbrlwck: mes libc does not use that if I remember correctly
<stikonas>so not sure why something calls that function
<stikonas>grepping init_io does not find anything in mes
<stikonas>it is used in M2libc's libc-full.m1
<gbrlwck>it appears in my crt1.o (M1) which i've just compiled
<stikonas>compiled from which source?
<stikonas>hmm, live-bootstrap builds it from lib/linux/x86-mes-mescc/crt1.c
<stikonas>did you write an equivalent file for riscv?
<gbrlwck>i compiled mes-m2/lib/linux/riscv64-mes-m2/crt1.M1 (using stage0-posix's M1) and copied it to lib/riscv64-mes/crt1.o
<stikonas>yeah, that wouldn't work
<stikonas>that crt1.M1 file is for M2-Planet
<stikonas>don't use anything from -m2 folders
<gbrlwck>aha!
<gbrlwck>k
<stikonas>everything in -m2 is for m2-Planet
<stikonas>and in -gcc for gcc and -mescc for mescc
<gbrlwck>i see
<stikonas>calling convention, etc might be different
<stikonas>look at crt1.c file in lib/linux/$arch-mescc directories
<stikonas>it's all inline assembly though, despite .c extension
<gbrlwck>there's none yet for riscv64 (in mes-m2). but there's work from laanwj, it just needs adaption to our (new) riscv M1
<gbrlwck>would "ld_____%t0,0(%t5)" be the same as "RD_T0 RS1_T5 LD"?
<stikonas>yes
<gbrlwck>thanks!
<stikonas>in GAS syntax it is "ld t0, 0(t5)"
<stikonas>so that define kind of mimics that syntax
<gbrlwck>and the 0(t5) part means 0+ the value in t5?
<stikonas>yes, load from the address 0 + t5
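As a rough C analogy for the addressing mode (not how the bootstrap code is written), `ld t0, 0(t5)` reads a 64-bit value from the address obtained by adding the immediate offset to the base register. The function name `load_doubleword` is made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Rough analogy for "ld rd, offset(rs1)": read 8 bytes at base + offset.
   memcpy avoids alignment and aliasing problems in portable C. */
uint64_t load_doubleword(const unsigned char *base, int64_t offset)
{
    uint64_t value;
    memcpy(&value, base + offset, sizeof(value));
    return value;
}
```

With an offset of 0 this is simply a read through the pointer held in the base register.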
<stikonas>and you can always compare gas vs m1 syntax using Development prototypes that we have
<stikonas>most programs have .S version written in GAS and also .M1 version
<gbrlwck>unfortunately i think they're missing for laanwj's work in upstream MES
<gbrlwck>what does the last part here: "li_____%t1,$i32 &environ" mean ($i32 &environ)?
<stikonas>gbrlwck I think that's load immediate value from that label
<stikonas>in the new risc-v M1 syntax I was just issuing AUIPC and ADDI calls
<oriansj>stikonas: looking at the operators further; one can't properly implement them in the preprocessor at all.
<oriansj>now the tokenization improvement works great.
<oriansj>The problem is you'll want it in the cc_core.c chain with + and / operators
<stikonas[m]>Yeah, I'm looking at implementing them properly
<oriansj>and adding support for ++ and -- would probably fit too
<stikonas[m]>I do have prefix inc/decrements working if you want to review
<oriansj>I'll take a look
<stikonas[m]>Prefix ++ works not too bad in unary-expr
<stikonas>so stuff like *++p doesn't work but ++*p works
<oriansj>well stuff like *++p and ++*p wouldn't work anyways in M2-Planet unless p was a char
<stikonas>oriansj: https://github.com/oriansj/M2-Planet/pull/33
<stikonas>and it's not that useful generally
<stikonas>even in gcc
<stikonas>++*p is used significantly more often
<stikonas>*++p probably only makes sense in gcc if you are dealing with some double array of pointers
<stikonas>oriansj: as for assignment operators, I thought maybe expression() is the right place
<muurkha>no, *++p is "the next item in the array p points into"
<stikonas>but I haven't got it working yet
<stikonas>muurkha: in normal C
<stikonas>M2-Planet has problems with pointer arithmetic
<muurkha>doesn't gcc compile normal C though?
<stikonas>gcc yes
<stikonas>well, yes, occasionally *++p might be useful but it is very rare
<muurkha>the thing I was disagreeing with was "*++p probably only makes sense in gcc if", did you mean "M2-Planet" instead of "gcc" in that sentence?
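For reference, a small sketch in standard C of the two forms being compared. The helper names are hypothetical, and this shows ordinary C semantics only, not what M2-Planet currently accepts.

```c
/* ++*p increments the value that p points at; p itself does not move. */
int increment_pointee(int *p)
{
    return ++*p;
}

/* *++p first advances p to the next element, then reads that element.
   The caller must guarantee that a next element exists. */
int read_next(int **pp)
{
    int *p = *pp;
    int value = *++p;
    *pp = p;
    return value;
}
```

The second helper takes a pointer to the pointer only so that the caller can observe that the pointer moved.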
<oriansj>and merged
<stikonas>I'm not even sure if mescc supports *++p
<muurkha>I don't think so
<oriansj>muurkha: some Proper C constructs seem like a bad idea to support because they tend to be abused in ways they shouldn't be used.
<muurkha>oriansj: sure
<stikonas>muurkha: yeah, I think you are right but still that makes code harder to read
<muurkha>people sometimes disagree on which ones
<stikonas>well, with pointer arithmetic sometimes you can write a very short code
<oriansj>also stikonas; should we look at what it would take for M2-Planet to compile the version of TCC that MesCC compiles?
<stikonas>oriansj: hmm, at the very least it will require switch/case
<stikonas>and meslibc
<muurkha>I think "the next item in the array p points into" is a useful thing to have syntax for; it can result in code being less error-prone rather than more so
<stikonas>other than that, I think not too much on top of what mes requires
<stikonas>isn't p[1] clearer?
<muurkha>no
<muurkha>there are a lot of algorithms that are natural to write in terms of iterating over sequences in that way
<stikonas>oriansj: I was trying to get some diff first to see what we need for mes...
<stikonas>but it's unfinished
<muurkha>if you want to write it as p[i] you need to separately maintain p and i, and you have to be careful not to use i to index the wrong thing, which is something C's type system can't help you with
<muurkha>you're introducing a machine-level concept of integers instead of working with the problem-domain concept of iterators
<stikonas>oriansj: https://paste.debian.net/1219344/
<muurkha>so indexing arrays explicitly instead of using pointer arithmetic requires you to program at a lower level, which is more error-prone
<stikonas>it's mostly meslibc and i haven't looked in mes/src/* that much
<muurkha>(sometimes it's unavoidable, good luck trying to write binary search in terms of pointer arithmetic, but often it's not)
<stikonas>well, true, it might be useful in some cases
<muurkha>there are people who are more comfortable with explicit numerical indexing, and if they are your intended audience, that's how you should write your code
<stikonas>I think I was mostly looking "useful" in terms of what is used in mes or tcc
<muurkha>so they can understand it
<stikonas>oriansj: oh and also at least support for . operator to get struct members
<muurkha>but being able to use pointers as iterators is a small but significant simplicity advantage
<stikonas>oriansj: so at the very least for mes we need struct members . then function-type defines in the preprocessor, postfix increments, casting
<oriansj>muurkha: disagree in terms of implementation but I might grant you in regards to use
<stikonas>oriansj: and I guess pointer arithmetic (1 vs sizeof)
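The "1 vs sizeof" remark can be illustrated in standard C, where adding 1 to a pointer advances it by the size of the pointed-to type rather than by one byte. This sketch shows only the standard C behaviour; how M2-Planet scales pointer additions is not spelled out in this log beyond the hint above. The function name is hypothetical.

```c
#include <stddef.h>

/* Number of bytes between p and p + 1 for an int pointer in standard C. */
size_t int_pointer_step(void)
{
    int cells[2] = {0, 0};
    int *p = cells;
    return (size_t)((char *)(p + 1) - (char *)p);
}
```

A compiler that always added a literal 1 would only match this result for char pointers.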
<stikonas>some of them we can try to simplify in mes rather than improving m2-planet...
<oriansj>that is a good bit larger of a diff
<stikonas>larger of a diff than what?
<stikonas>this doesn't even cover actual mes.c files
<oriansj>adding support for struct members via . and function-type defines
<stikonas>janneke really did quite a bit of porting/simplification in mes-m2
<stikonas>oh those
<stikonas>yeah, these are more complicated things
<stikonas>function type-defines are at least fully in cc_macro
<stikonas>and no need to deal with cc_core
<oriansj>and pointer arithmetic is gonna be a major change
<stikonas>so fairly independent
<stikonas>well, maybe we need to keep #ifdef __M2__ for mes then
<oriansj>yep just replace them with FUNCTION statements
<muurkha>oriansj: oh, yes, pointer arithmetic is more complicated to implement than pointers without arithmetic, for sure
<muurkha>not saying it's necessarily worthwhile
<stikonas>oriansj: oh you want to automatically generate functions?
<stikonas>for function-like defines?
<muurkha>just saying I don't agree with the criticism that *using* it necessarily results in less-readable or more bug-prone code
<stikonas>hmm, that might work but I'm not yet sure if that's the easiest option
<stikonas>maybe it is...
<stikonas>we mostly need to support things like #define TYPE(x) g_cells[x].type
<stikonas>so if we create function TYPE which returns g_cells[x].type that might work but then the question is what is the type of x?
<stikonas>defines being type-independent make it hard
<stikonas>mes probably uses only ints...
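A sketch of the trade-off just described, assuming int indices. The `struct cell` layout and the name `type_of` are hypothetical; only the `TYPE(x)` macro is quoted from the chat.

```c
/* Hypothetical cell layout; only the .type member matters for this sketch. */
struct cell {
    int type;
    int car;
    int cdr;
};

struct cell g_cells[4];

/* Function-like macro as quoted in the chat: pure text substitution,
   so it works for any integer-like index and is also usable as an lvalue. */
#define TYPE(x) g_cells[x].type

/* Function replacement: the parameter type has to be fixed (int here),
   and the result can no longer be assigned to. */
int type_of(int x)
{
    return g_cells[x].type;
}
```

Any code that writes through the macro, such as `TYPE(i) = value;`, would need a separate setter if the macro were turned into a function.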
<oriansj>well fully proper macro expansion would solve that
<oriansj>a bit of work but nothing that couldn't be done in a weekend with luck
<stikonas>maybe more than a weekend...
<stikonas>but it's just text replacement
<stikonas>so shouldn't be impossible to do
<stikonas>anyway, I should probably try to properly fix assignment operators
<oriansj>plus there is code you could steal from kaem to assist
<stikonas>even though what we have might already work for mes.c
<stikonas>but tcc would definitely need them fixed
<oriansj>the problem is there are too many possible paths forward, all of which will provide a benefit
<stikonas>exactly, and I'm not sure either what to do
<stikonas>well, postfix operators might be the next thing to do
<stikonas>but those might be a bit quirkier
<stikonas>we first use value there and only then increment
<stikonas>but yes, after that I'm not sure. Indeed too many paths
<oriansj>or just do expansion in cc_macro to ++ => + 1
<stikonas>but that wouldn't work, would it?
<oriansj>why not?
<stikonas>first of all preprocessor can't tell the difference between pre and postincrement
<oriansj>well it can look at the tokens before and after
<stikonas>and then what would happen at *nextchar++
<stikonas>this is the value at next char and only then we increase nextchar by 1
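A short C sketch of why a textual rewrite falls short here. The helper names are hypothetical, and `naive_rewrite` is only one possible reading of what expanding `++` into `+ 1` would produce for `*nextchar++`.

```c
/* *p++ yields the character p currently points at and only afterwards
   advances p by one. */
char read_and_advance(const char **nextchar)
{
    const char *p = *nextchar;
    char c = *p++;
    *nextchar = p;
    return c;
}

/* A naive textual rewrite of "++" into "+ 1" gives *p + 1:
   the character value plus one, with the pointer left where it was. */
int naive_rewrite(const char *p)
{
    return *p + 1;
}
```

The two results differ both in the value produced and in whether the pointer is updated.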
<oriansj>hmmm
<stikonas>it's probably like assignment, would work in the very simple cases in macro preprocessor
<stikonas>but those more complicated ones fail
<stikonas>except that in this case everybody uses more complicated construct
<stikonas>mes and tcc (and everywhere else) are full of things like *p++
<stikonas>hmm, although, we'll have another problem with *p++
<stikonas>pointer arithmetic is not working...
<stikonas>so perhaps pointer arithmetic needs sorting out
<stikonas>but that's a bit scary
<oriansj>very much so
<oriansj>as was even basic preprocessor support until yt added cc_macro.c
<stikonas>in any case M2-Planet is more capable now than what we had before yt added preprocessor
<oriansj>to add proper pointer arithmetic would require changing the state machine in an ugly way that would break things for a little while
<stikonas>it might be that I already did most of the easy stuff over the last few weeks
<stikonas>and rest will be harder
<oriansj>while depends upon how you look at it, in terms of mental complexity, no it isn't harder.
<stikonas>oh another thing that mes or tcc might need is logical AND and logical OR
<oriansj>But if in terms of number of lines of code needing to be changed, yeah it is gonna grow
<stikonas>right now M2-Planet uses bitwise AND/OR instead
<oriansj>as it was the fastest and simplest to implement
<stikonas>which works for 0 and 1 so gets us running in 95% of the cases
<oriansj>as we don't build a proper AST
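A small C sketch of where the bitwise substitute and the logical operator agree and where they part ways. The function names are made up for illustration.

```c
/* Bitwise AND used in place of logical AND: agrees with && only when both
   operands are exactly 0 or 1. */
int and_bitwise(int a, int b)
{
    return a & b;
}

/* Proper logical AND: any non-zero operand counts as true. */
int and_logical(int a, int b)
{
    return a && b;
}
```

For operands 1 and 2 the logical form gives 1 while the bitwise form gives 0. The bitwise form also evaluates both operands, so a guard like `p && *p` loses its short-circuit protection.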
<stikonas>mescc does build it, doesn't it?
<oriansj>it uses Nyacc to do that
<stikonas>yeah, that's what I thought too, when I briefly looked at nyacc
<stikonas>but I'm not too familiar with scheme
<oriansj>fair
<stikonas>so if we don't want to support pointer arithmetic, then we just try to implement preprocessor functions, which are independent of this
<stikonas>and patch remaining mes to avoid stuff that uses them
<stikonas>hmm...
<stikonas>it would still be a smaller patch than original mes-m2 diff
<stikonas>so might be upstreamable
<stikonas>oriansj: oh, and found another issue with prefix increments...
<stikonas>without = increment is not stored...
<stikonas>that's a bit bad...
<stikonas>maybe we shouldn't have merged it yet
<stikonas>anyway, over this weekend I'll try to fix this and assignments
<oriansj>stikonas: here is a thought.
<oriansj>what if we just stick to the minimal subset of C functionality required in M2-Planet for building mes-m2, mescc-tools and mescc-tools-extra
<stikonas>maybe.. We'll have to maintain mes-m2 then
<oriansj>for a little bit
<stikonas>although, I would still like to fix those two things (assignments and prefix)
<oriansj>stikonas: you are free to work on anything you think would be fun to do
<stikonas>well, this also needs some input from janneke
<stikonas>to know what his plans are for future mes
<oriansj>true
<stikonas>somewhat annoyingly we do support quite large subset of C
<stikonas>just a tiny bit too small
<oriansj>what will make you laugh is MesCC can't compile M2-Planet
<stikonas>oh
<stikonas>what feature it doesn't like?
<oriansj>don't remember exactly but there is even a commit in mes-m2 about it
<oriansj>mes.c never was designed in terms of what M2-Planet supported, only what MesCC could compile and get good performance out of
<oriansj>as the 9 hour compile times with a segfault at the end were brutal earlier in MesCC's development
<oriansj>so considerable effort was spent on speeding that up
<oriansj>M2-Planet was entirely an after thought, as janneke was originally going to do Mes.c in hex0
<oriansj>you can see that in his fosdem 2017 talk
<oriansj>(link in my talk notes)
<stikonas>hmm, yeah, mes.c in hex0 would be brutal too
<oriansj>even the use of mescc-tools was a very slow and painful transition
<stikonas>well, M2-Planet is in principle quite nicely done
<stikonas>and simplified version is portable to M1 assembly via (cc_*)
<oriansj>and the follow up M3 hasn't gotten much love because it is hard to focus and program with a screaming toddler demanding playtime with me
<stikonas>remind me what was M3 about?
<oriansj>think binutils compatible linker and assembler and a compiler able to directly build TCC
<stikonas>oh, so 3 programs, not just one
<oriansj>and binutils compatibility to make live-bootstrap getting muslibc much easier and sooner
<oriansj>it is still a bit from getting done: https://github.com/oriansj/M3-Meteoroid
<oriansj>but it has the major linker bits for x86 done
<stikonas>hmm, actually now I'm looking at pre increment, it actually seems to be best done at preprocessor level
<stikonas>opposite to assignments
<stikonas>i.e. replace pre-increment with a = a + 1
<stikonas>hmm
<stikonas>anyway, I'll probably think a bit more tomorrow
<oriansj>sounds like a good plan
<stikonas>since stuff like ++(a+1) is not a legal expression
<stikonas>hmm, or maybe the whole thing is more complicated...
<stikonas>well, in the worst case I'll just revert my last two merges...
<oriansj>well the weeds of C can get really really ugly
<stikonas>yeah, it's surprising how much easier small subset of C is
<stikonas>although, maybe it wasn't that easy until it was written
<oriansj>and M2-Planet solves a hard enough problem (being buildable by a Compiler written in assembly) without having to add more on top
<oriansj>honestly I'd strip features from M2-Planet if it allowed cc_* to become simpler
<stikonas>well, that's true
<stikonas>but M2-Planet uses fairly restricted subset of C for itself
<oriansj>well it has a few extras like '\n' and the like which could be dropped from cc_* with only moderate extra complexity
<oriansj>but the pretty in_set lines were too nice to work around over a dozen lines of assembly