IRC channel logs

2023-12-29.log


<oriansj>stikonas: merged
<stikonas>thanks
<stikonas>luckily fixed hex0 code had exactly the same length and positions of all jumps :)
<stikonas>oh, actually cc_amd64 also runs fine, just slowish on the VM... but good enough on baremetal
<oriansj>well, the real question is mescc able to build tcc on it
<stikonas>indeed...
<stikonas>well, I'm slowly adding more syscalls (mostly trivial additions, but still need to test them)
<stikonas>I guess the main difficulty will be fork/waitpid/execve/wait4
<stikonas>well, M2 seems to run now too
<Googulator>speaking of M2 - what was M2-Moon going to be?
<Googulator>I've seen a few references to it, but we don't have that today (instead, we have M2-Mesoplanet, which is not mentioned in those early documents)
<oriansj>Googulator: M2-Moon was going to be a scheme compiler written in assembly
<oriansj>turns out C is much easier to write a compiler for
<muurkha>really? that surprises me. but I haven't written a C compiler, only a Scheme compiler (if you can call it Scheme without call/cc)
<muurkha>I had the impression that most of ur-scheme was actually assembly written in Scheme syntax, you know, stuff like (jnz destlabel) (comment "now, test its magic number") (cmp (const magic) (indirect tos)) (jnz destlabel)
<muurkha>but a random sampling of its code is 25% assembly, 15% comments and blank lines, and 60% other stuff
<muurkha>I was going to guess that M2-Moon was an implementation of a subset of Lua :)
<muurkha>Ur-Scheme is 1553 lines of (its own subset of) Scheme, and it took me about three weeks. I think it might take me a week today; I'd probably include structs and, to simplify closures, allocate all activation records on the heap, which would probably require me to actually implement the GC. I'm not sure how much harder it would be to write it in assembly, given that it's evidently not relying on garbage collection to run in an acceptable amount of RAM
<oriansj>muurkha: yeah your Ur-Scheme is excellent work; unfortunately it wasn't powerful enough to run/compile mescc (the real goal of M2-Moon)
<oriansj>heck Edmund GRIMLEY EVANS's cc500.c is only 500 lines of C code and is self hosting (no need for separate assembler/linker) and has 70% of the language features of cc_*
<oriansj>turns out once C has functions, ints, chars, structs, arrays, goto, return and asm(); everything else is a nice extra. (one could cut down that list but it'll result in harder to follow code)
<muurkha>yeah, Ur-Scheme only implements enough of Scheme to compile itself
<muurkha>it implements functions, ints (somewhat!), chars, conses, closures, strings, conditionals, and global and local variables. I didn't think about supporting inline assembly!
<muurkha>I think C needs loops, conditionals, and variables, doesn't it?
<muurkha>Scheme with inline assembly, or inline machine code, is an interesting idea
<oriansj>nope, you can approximate loops with gotos and return
<oriansj>and asm can be used to do conditionals, variables and basic operations (like +, ->, etc)
<oriansj>but yes, variables and if statements do make things much more clean
<muurkha>oh, okay!
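
[Editor's sketch, not from the log: one way to read oriansj's point about approximating loops with goto and return. The function name sum_to is hypothetical. It still uses a plain `if` for the conditional, which oriansj notes makes things cleaner; a stricter subset would push that into asm().]

```c
/* Sum the integers 1..n without for/while/do:
 * the loop is spelled out with a label, goto and return. */
int sum_to(int n)
{
    int total = 0;
    int i = 1;
loop:
    if (i > n) goto done;   /* loop exit test */
    total = total + i;      /* loop body */
    i = i + 1;
    goto loop;              /* jump back to the top */
done:
    return total;
}
```

Calling sum_to(10) walks the label ten times before falling through to the return.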
<muurkha>I did regret not having structs, as I said
<oriansj>well structs in scheme are not an easy thing to add; it took me a couple days to get the basic form when I was doing a scheme interpreter and janneke's mes.c doesn't even have that either.
<stikonas>rickmasters: was builder-hex0 per-process memory limit chosen due to mes usage?
<stikonas>(670,793,728 bytes)
<stikonas>I guess I should give at least 1 GiB on amd64...
<stikonas>since in order to emulate brk I need to pre-allocate sufficiently big block of memory
<stikonas>later allocations might not be continuous
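
[Editor's sketch, not from the log: a minimal illustration of emulating brk on top of one pre-allocated contiguous block, as stikonas describes. The names emu_brk, arena and ARENA_SIZE are hypothetical, and the 1 MiB size is only for the demo; the discussion above is about sizes on the order of 1 GiB. This is not the builder-hex0 or UEFI kernel code. It assumes the Linux raw syscall convention: return the new break on success, and the unchanged current break on failure or when queried with 0.]

```c
#include <stddef.h>
#include <stdint.h>

/* Demo-sized arena (assumption); a real kernel would reserve far more. */
#define ARENA_SIZE ((size_t)1 << 20)

static unsigned char arena[ARENA_SIZE]; /* the one contiguous block */
static size_t cur_break = 0;            /* break as an offset into arena */

/* Emulated brk: move the break only if the request stays inside the arena. */
uintptr_t emu_brk(uintptr_t addr)
{
    uintptr_t base = (uintptr_t)arena;
    if (addr >= base && addr - base <= ARENA_SIZE) {
        cur_break = (size_t)(addr - base);
    }
    /* Out-of-range requests (including 0) just report the current break. */
    return base + cur_break;
}
```

Because the block is reserved up front, every successful request stays contiguous with the previous one, which is the property a later, separate allocation could not guarantee.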
<rickmasters>stikonas: yes, mes is the process that uses the most memory
<rickmasters>stikonas: I can't help but think there is some design flaw in mes / mescc that causes it to consume so much memory and time.
<stikonas>rickmasters: probably just a lot of indirections
<stikonas>I haven't looked at it that deeply, but ekaitz says it's a very simple scheme that then implements more complex one
<stikonas>but yes, it's not that good with memory, and needs high memory bandwidth too, not just amount
<rickmasters>stikonas: yeah I haven't looked closely. I've debugged a couple problems in mes and it was really hard to understand.
<rickmasters>stikonas: I was hoping it would be replaced but if not maybe someday I'll have time to look into it.
<stikonas>well, on x86 there might be an alternative path...
<stikonas>but really just 32-bit x86
<stikonas>there are just too many old tcc versions in that alternative path
<rickmasters>stikonas: re: mes, my suspicion would be a lot of copy-by-value going on combined with recursion and an increasing image size down the stack.
<stikonas>could be...
<rickmasters>stikonas: yeah, and I'm pretty sure you're with me in saying "tcc sucks" in terms of code quality.
<stikonas>yeah, tcc is really hard to read...
<stikonas>even when you know C
<muurkha>oriansj: what do you mean about getting the basic form? structs in Scheme can be pretty simple
<ekaitz>mes is pretty slow but i think that comes from different reasons at the same time
<ekaitz>scheme is by default very memory-dependent
<ekaitz>there's a lot of indirection
<ekaitz>the basic structure of scheme is a linked list...
<ekaitz>when running mescc we are building a very large tree of the C code
<ekaitz>allocating it and traversing are probably slow operations
<ekaitz>mes' code is really easy to read otoh, which may give us chance for improvements
<ekaitz>(i don't want to call them optimizations)
<stikonas>well, mescc code, or mes code?
<muurkha>I don't think "building large linked data structures" is all that different from most compilers
<stikonas>it probably depends on how much experience you have with scheme...
<muurkha>structs would help with that, a bit
<rickmasters>I wonder if there is a difficulty with hash tables that makes symbol management hard...
<muurkha>also, aside from the design of the program written in Scheme, it's possible to implement Scheme in ways that have more or less pointer chasing. car and cdr unavoidably have to chase pointers, but things like global and local variable access don't have to
<muurkha>and IIRC in MES they do?
<Mikaku>rickmasters: I'm running a complete package build with your PR64, so far so good
<rickmasters>Mikaku: thanks!
<rickmasters>As an update to the group, Mikaku is currently testing a PR to support building Fiwix with tcc.
<muurkha>yay!
<stikonas>do you mean upstream Fiwix?
<stikonas>(up to know it was your fork?)
<rickmasters>Yes, this is upstreaming. After that, there is only one change left to upstream into Fiwix which is kexec of linux.
<stikonas>yeah, nice progress!
<muurkha>*up to now?
<ekaitz>stikonas: we might be able to improve mes to make mescc go faster
<stikonas>all that fiwix work will also help me with 64-bit bootstrap too
<rickmasters>stikonas: I'm hoping we can pull from Mikaku's code base directly soon, which will simplify the workflow for improvements greatly.
<stikonas>indeed
<rickmasters>Frankly, I'm hoping to extract myself from the process although I still want to help where I can.
<stikonas>though I'm still not sure how to integrate UEFI bootstrap with live-bootstrap
<stikonas>well, you did a great job already
<rickmasters>thanks. It's all fun and games until you try to upstream into someone else's project. That's a lot more work.
<rickmasters>And Mikaku's feedback led to better results. Like he rewrote the large ram drive support. My version was a hack job I was happy to get rid of.
<rickmasters>stikonas: I've been following some of your plans on UEFI and your approach is roughly how I thought it should go...
<rickmasters>stikonas: basically, read in the files using UEFI primitives and then run a 64 bit port of builder-hex0 written in "C"-ish code
<matrix_bridge><Andrius Štikonas> Yeah, though debugging is hard...
<matrix_bridge><Andrius Štikonas> But at least code is simple...
<matrix_bridge><Andrius Štikonas> I'm not sure why, but often inserting extra fputs for debug messes things up
<rickmasters>back later
<stikonas>hmm, probably kaem-optional won't be compatible with uefi...
<stikonas>it again makes assumptions that are not true in UEFI (but unlike hex0, it would be harder to fix)
<stikonas>anyway, that shouldn't be a big deal as we can just build full kaem...
<stikonas>(could probably be fixed on the kernel side if we want by creating our own fd table)
<oriansj>muurkha: can be simple and to do in an easy to debug way are two different things.
<oriansj>also mes.c has vector support
<oriansj>which are pretty fast for what you would need; (not that you would need more than 64 items in any of the lists you would need to evaluate a function)
<oriansj>You would have one list for the globals; one list for the known types; one list for the arguments passed on the stack; one list for the local variables and then the ast you are walking to generate the assembly (M2-Planet just has it as a double linked list)
<muurkha>yeah, you can implement structs on top of vectors pretty easily
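
[Editor's sketch, not from the log: muurkha's idea of layering structs on vectors, shown in C with a flat array of cells standing in for a Scheme vector. Slot 0 holds a type tag and the remaining slots hold the fields. All names here (cell, POINT_TAG, make_point, is_point, point_x, point_y) are hypothetical and are not taken from mes.c or Ur-Scheme.]

```c
#include <stdlib.h>

typedef long cell;                 /* one vector slot */

enum { POINT_TAG = 1 };            /* type tag stored in slot 0 */
enum { POINT_X = 1, POINT_Y = 2, POINT_SLOTS = 3 };

/* Constructor: allocate a 3-slot vector and fill in tag and fields. */
cell *make_point(cell x, cell y)
{
    cell *v = malloc(POINT_SLOTS * sizeof(cell));
    if (v == NULL) return NULL;
    v[0] = POINT_TAG;
    v[POINT_X] = x;
    v[POINT_Y] = y;
    return v;
}

/* Type predicate: a vector is a point if its tag slot says so. */
int is_point(const cell *v) { return v != NULL && v[0] == POINT_TAG; }

/* Field accessors are just fixed-index vector reads. */
cell point_x(const cell *v) { return v[POINT_X]; }
cell point_y(const cell *v) { return v[POINT_Y]; }
```

The constructor, predicate and accessors are the whole "struct"; everything else is ordinary vector indexing.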