IRC channel logs

2023-10-29.log

<jcowan>Is there a statement anywhere of what C dialect mescc accepts?
<stikonas>jcowan: not really...
<stikonas>probably most C89 stuff
<stikonas>some C99 too
<stikonas>jcowan: though what do you want to achieve?
<jcowan>I'm interested in small C dialects
<stikonas>well, mescc generally supports slightly more than M2-Planet
<stikonas>and M2-Planet supports if/else statements, for/do/while loops, asm blocks (with its own M1 asm syntax), goto, unions, structs, arrays
<stikonas>I think mescc also supports switch/case on top of that
<stikonas>also increment/decrement operators
<stikonas>possibly a few other things
<stikonas>though mescc is orders of magnitude slower than M2-Planet
<jcowan>Hopefully it's faster if you run it on a more performant Scheme
<stikonas>yeah, a bit faster if you run it with guile
<stikonas>but if you are in bootstrapping environment where you have just built mes, then it is fairly slow
<jcowan>Unsurprising
<jcowan>not that Guile is a very fast Scheme either
<stikonas>yes, but still faster than mes
<jcowan>Gambit would probably be much more performant than Guile
<stikonas>and when combined with some non-performant hardware, e.g. risc-v, it can take a week to build tcc...
<jcowan>a week!
<stikonas>yeah...
<stikonas>risc-v is slow
<stikonas>we probably output not very optimized code too
<jcowan>Sure
<stikonas>(kind of following x86 ideas, so not really most optimal for riscv)
<jcowan>mmm
<stikonas>e.g. heavy stack use rather than registers
<stikonas>on x86 it's 20 minutes though
<jcowan>x86 is a horrible use of silicon
<jcowan>what cpus are supported right now?
<stikonas>well, full bootstrap works only on x86
<stikonas>on riscv we can now get from hex0 to tcc
<stikonas>so mescc works there
<stikonas>on amd64 we can start with hex0, get to mes, rebuild mes with mescc, then build the very first tcc binary (but it is non-functional and crashes)
<stikonas>there might be some support for arm/aarch64, but you might have to just between 32/64 bits
<stikonas>so most likely also incomplete
<stikonas>jcowan: some people in my work do design x86 chips :), though not me...
<jcowan>You know the definition of Windows 95?
<stikonas>?
<stikonas>bloated?
<stikonas>(for those days...)
<jcowan>"A 32-bit shell on top of a 16-bit operating system designed for an 8-bit computer based on a 4-bit chip, designed by a two-bit company that doesn’t care one bit about its users."
<pabs3>btw, if anyone knows of other resources related to generated files; essays about source, or tools to detect generated files etc, let me know
<oriansj>pabs3: well generated files took a very hard turn for the worse when it comes to detection due to the wide availability of Large Language Models which produce output that looks human written.
<pabs3>yeah, that is a huge problem
<oriansj>but then again people claiming generated crap as source code isn't a new problem and even GNU programs distribute generated files in their "source" tarballs. So that half of the bootstrapping fight will be a much harder fight.
<stikonas>yeah, but LLM makes it much harder to identify those
<stikonas>and then you start getting semi-generated files
<stikonas>i.e. there was some source code that was pregenerated, possibly with some bugs and then human edits i
<stikonas>s/i/it/
<stikonas>oriansj: so I think ekaitz and I are at the point where there isn't much more to do from hex0 to bootstrappable tcc on riscv64, so we'll need to think about releasing
<stikonas>do you think you'll have a bit of time to create stage0-posix tarballs?
<stikonas>(and tags)
<oriansj>of course
<stikonas>it will probably be easier for janneke to test everything if we get stage0 out first
<stikonas>though I think new stage0-posix also needs mes 0.25 (i.e. it's not compatible with earlier ones)
<stikonas>at least this time we have a nice changelog at https://github.com/oriansj/stage0-posix/blob/master/CHANGELOG.org
<muurkha>man, it's been 28 years since Windows 95
<muurkha>RISC-V probably isn't inherently slow, but all the current implementations of it are
<muurkha>stikonas: when you say "on x86" do you mean "on i386"? sometimes people use "x86" to mean "amd64" or {amd64, i386} or {amd64, i386, 8086}
<stikonas>also our stage0 and mes compilers produce non-optimized code
<stikonas>muurkha: no, not i386... amd64
<muurkha>hmm, but you said "full bootstrap works only on x86" and then explained how it doesn't work on amd64: "on amd64 we can ... [build the] very first tcc binary (but it is non-functional and crashes)"
<stikonas>well, yeah, I was not completely consistent with x86
<muurkha>it's probably the case that RISC-V depends more on compiler optimization than amd64; the larger register set is sort of like a level -1 data cache, explicitly managed by the compiler
<muurkha>so in "works only on x86" you meant i386?
<stikonas>sometimes people refer to both 32bit and 64-bit when they say x86
<stikonas>bootstrap is only completed on i386
<stikonas>but again, 64-bit CPUs can still run 32-bit code for now
<muurkha>sure
<muurkha>thanks for unconfusing me!
<artemist>There are some ARM cores which only support AArch64 (mostly in very new phones or Apple's ARM machines) but that's mostly unrelated, you still end up with a ton of 32 bit code on Windows and it will be supported for the foreseeable future
<stikonas>on the other hand 32-bit code on Linux will be seriously broken in 15 years or so
<stikonas>I've tried running current bootstrap chain with clock moved forward and various things do break (in particular build systems)
<nektro>is the reason for why mes is slow well known/understood?
<jcowan>mescc is not optimized and does not produce optimized code
<jcowan>if you compile it with an optimizing Scheme->C compiler like Gambit and then compile the output with gcc/clang you will probably get a huge speedup
<jcowan>s/Gambit/& or Chicken
<oriansj>nektro: M2-Planet produces very naive binaries and the lack of switch support means that mes.c needs to do a bunch of branching on every single s-expression
<oriansj>so if you wanted to speed up mes.c a good bit we would need to add switch/case support to M2-Planet
<stikonas>nektro: I tried running callgrind on mes/mescc, it was spending 25% of time in eval_apply function
<oriansj>(which is a giant if else if else block)
<stikonas>yeah...
<stikonas>still, it's unlikely that it would be massively faster
<stikonas>maybe a bit faster
<stikonas>and interpreters are slow in general
<stikonas>we can't really compare mes and gambit since gambit just compiles it to C...
<stikonas>and presumably something only gcc can deal with
<oriansj>well going from if/else to switch would reduce the number of conditional jumps from 1-20 to just 1
<oriansj>I remember when janneke went from switch to if/else even self-hosted it became a good bit slower.
<stikonas>well, even self-hosted can't help much if code is using if/else, and it has to in order to be able to be bootstrapped with M2-Planet
<stikonas>not sure how hard it would be to implement that
<stikonas>perhaps not too hard
<stikonas>but there are various corner cases
<stikonas>like break, nested switch, etc...
<oriansj>indeed, hence why it hasn't been implemented yet
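
The if/else versus switch point above can be sketched in C. This is only an illustration: the tag names and handler numbers are hypothetical and are not the real tags or structure of eval_apply in mes. The cascade needs one comparison per tag it has to skip, while a switch over dense values leaves the compiler free (but not obliged) to emit a bounds check plus a single indexed jump.

```c
/* Hypothetical type tags for illustration; these are NOT the real mes tags. */
enum { T_CHAR, T_NUMBER, T_PAIR, T_SYMBOL, T_STRING };

/* if/else cascade: the comparison count grows with the tag's position.
   *cmp receives the number of comparisons performed. */
int handler_chain(int tag, int *cmp)
{
    *cmp = 1; if (tag == T_CHAR)   return 100;
    *cmp = 2; if (tag == T_NUMBER) return 101;
    *cmp = 3; if (tag == T_PAIR)   return 102;
    *cmp = 4; if (tag == T_SYMBOL) return 103;
    *cmp = 5; if (tag == T_STRING) return 104;
    return -1; /* unknown tag */
}

/* switch: same mapping; a compiler may lower the dense cases to a
   jump table, which is what removes the per-case comparisons. */
int handler_switch(int tag)
{
    switch (tag) {
    case T_CHAR:   return 100;
    case T_NUMBER: return 101;
    case T_PAIR:   return 102;
    case T_SYMBOL: return 103;
    case T_STRING: return 104;
    default:       return -1; /* unknown tag */
    }
}
```

Both functions return the same handler number for every tag; only the cascade's cost depends on where the tag sits in the list.
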
<jcowan>stikonas: Gambit supports at least gcc, clang, and tcc
<jcowan>(also js and python)
<stikonas>yeah, but in terms of bootstrapping, once you have tcc, scheme is less important
<stikonas>tcc can build basically everything
<jcowan> well, anything written in c89
<mihi>just a side note: When your biggest problem is a huge if/else cascade on singleton pointers and your compiler cannot do switch/case or function pointers, but has goto (which is even used in eval_apply), the method known from last century was to give each singleton member (i.e. "symbol" in mes' case) a hand-crafted integer value (ordinal) and do range comparisons like "if (ordinal > 32) {if (ordinal > 24) goto x1;
<mihi>else goto x2;} else if (ordinal > 16) goto x3; else goto x4;" That way you get from O(n) worst case comparisons to somewhere near O(log n).
<mihi>But still, it only makes sense if that is the true bottleneck. If the bottleneck is that eval_apply is called in sequential loops where some more sensible lookup would make it faster, you'd have to optimize that first.
<mihi>and in Lisp/Scheme there is lots of linked list processing :)
<mihi>(that *24* should have been *48* in the code above)
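
A minimal C sketch of the range-comparison idea mihi describes, using the corrected threshold (48) and only if/else plus goto. It assumes ordinals in the range 0..63 split into four buckets; the bucket ids X1..X4 are hypothetical stand-ins for whatever code the labels would run in a real interpreter.

```c
/* Hypothetical bucket ids; in an interpreter each label would hold
   the handler code for that range of ordinals. */
enum { X1 = 1, X2 = 2, X3 = 3, X4 = 4 };

/* Binary range dispatch: every ordinal is classified with exactly two
   comparisons, instead of up to three for a linear cascade over the
   same four buckets. Assumes 0 <= ordinal < 64. */
int dispatch_tree(int ordinal)
{
    int result;

    if (ordinal > 32) {
        if (ordinal > 48) goto x1;
        else goto x2;
    } else if (ordinal > 16) goto x3;
    else goto x4;

x1: result = X1; goto done;   /* 49..63 */
x2: result = X2; goto done;   /* 33..48 */
x3: result = X3; goto done;   /* 17..32 */
x4: result = X4; goto done;   /*  0..16 */
done:
    return result;
}
```

With n buckets the tree depth grows roughly as log2(n), which is the O(n) to O(log n) improvement mentioned above; whether it helps in practice depends on the dispatch really being the bottleneck.
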
<muurkha>stikonas: not all 32-bit code on Linux, just the code that cares about time
<muurkha>like, libjpeg won't care
<stikonas>yes, which is why I specifically mentioned build systems
<stikonas>so live-bootstrap did fail at some point when trying to run it in 2040 or so...
<stikonas>not immediately, but somewhere between tcc and gcc...
<stikonas>once more complicated build systems really kick in... Earlier steps with kaem or handcrafted make files usually work fine
<muurkha>if you want reproducible bootstrapping you should probably do it with a fake system clock
<muurkha>because otherwise the system clock is an input into the build process that's different every time you run it
<stikonas>well, other clock bugs are fixed in live-bootstrap
<stikonas>there were initially a few bugs where the year was stored in documentation, etc...
<stikonas>but 2038 problem is a bit more complicated...
<muurkha>is it?
<stikonas>yeah, cause clocks wrap to 1900 or so...
<muurkha>not if you use a fake system clock
<stikonas>and e.g. make can get confused which file is newer and which is not
<stikonas>well, if you fake it completely to some fixed value
<stikonas>then yes, that's a workaround
<stikonas>and in modern linux systems you could use namespaces for that
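
The 2038 wraparound and its effect on a make-style "which file is newer" check can be sketched in C. This is an illustration under stated assumptions, not code from live-bootstrap or make: the function names are hypothetical, and timestamps are modelled as plain second counts. A signed 32-bit second counter holds at most 2147483647 (2038-01-19 03:14:07 UTC); one second later it wraps to -2147483648, which corresponds to 1901-12-13.

```c
#include <stdint.h>

/* Return the value a signed 32-bit time_t would hold for the given
   64-bit second count (two's complement wraparound, written so that
   it does not rely on undefined signed overflow). */
int64_t wrap_to_time32(int64_t seconds)
{
    uint32_t low = (uint32_t)(uint64_t)seconds;      /* keep low 32 bits */
    if (low >= UINT32_C(0x80000000))
        return (int64_t)low - INT64_C(0x100000000);  /* negative half */
    return (int64_t)low;
}

/* make-style freshness check (hypothetical helper): rebuild when the
   target's modification time is older than the source's. */
int needs_rebuild(int64_t target_mtime, int64_t source_mtime)
{
    return target_mtime < source_mtime;
}
```

For example, a source written at 2147483647 and a target built ten seconds later compare correctly with 64-bit times, but once both are passed through `wrap_to_time32` the target appears to date from 1901 and `needs_rebuild` reports it as stale.
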