IRC channel logs

<jcowan>Is there a statement anywhere of what C dialect mescc accepts?

<stikonas>jcowan: not really...

<stikonas>probably most C89 stuff

<stikonas>jcowan: though what do you want to achieve?

<jcowan>I'm interested in small C dialects

<stikonas>well, mescc generally supports slightly more than M2-Planet

<stikonas>and M2-Planet supports if/else statements, for/do/while loops, asm blocks (with its own M1 asm syntax), goto, unions, structs, arrays

<stikonas>I think mescc also supports switch/case on top of that

<stikonas>also increment/decrement operators

<stikonas>possibly a few other things

<stikonas>though mescc is orders of magnitude slower than M2-Planet

<jcowan>Hopefully it's faster if you run it on a more performant Scheme

<stikonas>yeah, a bit faster if you run it with guile

<stikonas>but if you are in bootstrapping environment where you have just built mes, then it is fairly slow

<jcowan>Unsurprising

<jcowan>not that Guile is a very fast Scheme either

<stikonas>yes, but still faster than mes

<jcowan>Gambit would probably be much more performant than Guile

<stikonas>and when combined with some non-performant hardware, e.g. risc-v, it can take a week to build tcc...

<jcowan>a week!

<stikonas>yeah...

<stikonas>risc-v is slow

<stikonas>we probably output not very optimized code too

<jcowan>Sure

<stikonas>(kind of following x86 ideas, so not really most optimal for riscv)

<jcowan>mmm

<stikonas>e.g. heavy stack use rather than registers

<stikonas>on x86 it's 20 minutes though

<jcowan>x86 is a horrible use of silicon

<jcowan>what cpus are supported right now?

<stikonas>well, full bootstrap works only on x86

<stikonas>on riscv we can now get from hex0 to tcc

<stikonas>so mescc works there

<stikonas>on amd64 we can start with hex0, get to mes, rebuild mes with mescc, then build the very first tcc binary (but it is non-functional and crashes)

<stikonas>there might be some support for arm/aarch64, but you might have to just between 32/64 bits

<stikonas>so most likely also incomplete

<stikonas>jcowan: some people in my work do design x86 chips :), though not me...

<jcowan>You know the definition of Windows 95?

<stikonas>?

<stikonas>bloated?

<stikonas>(for those days...)

<jcowan>"A 32-bit shell on top of a 16-bit operating system designed for an 8-bit computer based on a 4-bit chip, designed by a two-bit company that doesn’t care one bit about its users."

<pabs3>btw, if anyone knows of other resources related to generated files; essays about source, or tools to detect generated files etc, let me know

<oriansj>pabs3: well generated files took a very hard turn for the worse when it comes to detection due to the wide availability of Large Language Models which produce output that looks human written.

<pabs3>yeah, that is a huge problem

<oriansj>but then again people claiming generated crap as source code isn't a new problem and even GNU programs distribute generated files in their "source" tarballs. So that half of the bootstrapping fight will be a much harder fight.

<stikonas>yeah, but LLM makes it much harder to identify those

<stikonas>and then you start getting semi-generated files

<stikonas>i.e. there was some source code that was pregenerated, possibly with some bugs and then human edits i

<stikonas>s/i/it/

<stikonas>oriansj: so I think ekaitz and I are at the point where there isn't much more to do from hex0 to bootstrappable tcc on riscv64, so we'll need to think about releasing

<stikonas>do you think you'll have a bit of time to create stage0-posix tarballs?

<stikonas>(and tags)

<oriansj>of course

<stikonas>it will probably be easier for janneke to test everything if we get stage0 out first

<stikonas>though I think new stage0-posix also needs mes 0.25 (i.e. it's not compatible with earlier ones)

<stikonas>at least this time we have a nice changelog at https://github.com/oriansj/stage0-posix/blob/master/CHANGELOG.org

<muurkha>man, it's been 28 years since Windows 95

<muurkha>RISC-V probably isn't inherently slow, but all the current implementations of it are

<muurkha>stikonas: when you say "on x86" do you mean "on i386"? sometimes people use "x86" to mean "amd64" or {amd64, i386} or {amd64, i386, 8086}

<stikonas>also our stage0 and mes compilers produce non-optimized code

<stikonas>muurkha: no, not i386... amd64

<muurkha>hmm, but you said "full bootstrap works only on x86" and then explained how it doesn't work on amd64: "on amd64 we can ... [build the] very first tcc binary (but it is non-functional and crashes)"

<stikonas>well, yeah, I was not completely consistent with x86

<muurkha>it's probably the case that RISC-V depends more on compiler optimization than amd64; the larger register set is sort of like a level -1 data cache, explicitly managed by the compiler

<muurkha>so in "works only on x86" you meant i386?

<stikonas>sometimes people refer to both 32bit and 64-bit when they say x86

<stikonas>bootstrap is only completed on i386

<stikonas>but again, 64-bit CPUs can still run 32-bit code for now

<muurkha>sure

<muurkha>thanks for unconfusing me!

<artemist>There are some ARM cores which only support AArch64 (mostly in very new phones or Apple's ARM machines) but that's mostly unrelated, you still end up with a ton of 32 bit code on Windows and it will be supported for the foreseeable future

<stikonas>on the other hand 32-bit code on Linux will be seriously broken in 15 years or so

<stikonas>I've tried running current bootstrap chain with clock moved forward and various things do break (in particular build systems)

<nektro>is the reason for why mes is slow well known/understood?

<jcowan>mescc is not optimized and does not produce optimized code

<jcowan>if you compile it with an optimizing Scheme->C compiler like Gambit and then compile the output with gcc/clang you will probably get a huge speedup

<jcowan>s/Gambit/& or Chicken

<oriansj>nektro: M2-Planet produces very naive binaries and the lack of switch support means that mes.c needs to do a bunch of branching on every single s-expression

<oriansj>so if you wanted to speed up mes.c a good bit we would need to add switch/case support to M2-Planet

<stikonas>nektro: I tried running callgrind on mes/mescc, it was spending 25% of time in eval_apply function

<oriansj>(which is a giant if else if else block)

<stikonas>yeah...

<stikonas>still, it's unlikely that it would be massively faster

<stikonas>maybe a bit faster

<stikonas>and interpreters are slow in general

<stikonas>we can't really compare mes and gambit since gambit just compiles it to C...

<stikonas>and presumably something only gcc can deal with

<oriansj>well going from if/else to switch would reduce the number of conditional jumps from 1-20 to just 1

<oriansj>I remember when janneke went from switch to if/else even self-hosted it became a good bit slower.
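A minimal sketch of the if/else versus switch point above, not taken from mes.c. The tag names and return values are hypothetical placeholders. The cascade tests each tag in turn, so the last tag costs as many comparisons as there are branches. With dense case values, a compiler that supports switch can (but is not required to) lower it to one bounds check plus an indexed jump.

```c
#include <assert.h>

/* Hypothetical type tags; these are not the real mes cell types. */
#define TAG_NUMBER 0
#define TAG_PAIR   1
#define TAG_SYMBOL 2
#define TAG_STRING 3

/* if/else cascade: reaching TAG_STRING takes four comparisons. */
int handle_cascade(int tag)
{
    if (tag == TAG_NUMBER) return 10;
    else if (tag == TAG_PAIR) return 20;
    else if (tag == TAG_SYMBOL) return 30;
    else if (tag == TAG_STRING) return 40;
    return -1;
}

/* switch: same behaviour; dense cases allow a jump-table lowering. */
int handle_switch(int tag)
{
    switch (tag) {
    case TAG_NUMBER: return 10;
    case TAG_PAIR:   return 20;
    case TAG_SYMBOL: return 30;
    case TAG_STRING: return 40;
    default:         return -1;
    }
}
```

Both functions return the same value for every tag; only the shape of the generated branching can differ, and that depends on the compiler.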

<stikonas>well, even self-hosted can't help much if code is using if/else, and it has to in order to be able to be bootstrapped with M2-Planet

<stikonas>not sure how hard it would be to implement that

<stikonas>perhaps not too hard

<stikonas>but there are various corner cases

<stikonas>like break, nested switch, etc...

<oriansj>indeed, hence why it hasn't been implemented yet

<jcowan>stikonas: Gambit supports at least gcc, clang, and tcc

<jcowan>(also js and python)

<stikonas>yeah, but in terms of bootstrapping, once you have tcc, scheme is less important

<stikonas>tcc can build basically everything

<jcowan> well, anything written in c89

<mihi>just a side note: When your biggest problem is a huge if/else cascade on singleton pointers and your compiler cannot do switch/case or function pointers, but has goto (which is even used in eval_apply), the method known from last century was to give each singleton member (i.e. "symbol" in mes' case) a hand-crafted integer value (ordinal) and do range comparisons like "if (ordinal > 32) {if (ordinal > 24) goto x1;

<mihi>else goto x2;} else if (ordinal > 16) goto x3; else goto x4;" That way you get from O(n) worst case comparisons to somewhere near O(log n).

<mihi>But still, it only makes sense if that is the true bottleneck. If the bottleneck is that eval_apply is called in sequential loops where some more sensible lookup would make it faster, you'd have to optimize that first.

<mihi>and in Lisp/Scheme there is lots of linked list processing :)

<mihi>(that *24* should have been *48* in the code above)
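A sketch of the range-comparison dispatch mihi describes, using the corrected threshold of 48. The handler numbers 1 to 4 and the comparison counter are illustrative assumptions, not anything from mes. It uses only if/else and goto. The linear cascade needs up to three comparisons for four handlers, while the tree always needs two, and the gap widens as handlers are added.

```c
#include <assert.h>

/* Linear cascade: tests ranges one after another (worst case 3 tests). */
int dispatch_linear(int ordinal, int *comparisons)
{
    *comparisons = 1;
    if (ordinal > 48) return 1;
    *comparisons = 2;
    if (ordinal > 32) return 2;
    *comparisons = 3;
    if (ordinal > 16) return 3;
    return 4;
}

/* Range comparisons arranged as a binary tree, with goto targets x1..x4. */
int dispatch_tree(int ordinal, int *comparisons)
{
    int handler;
    *comparisons = 2; /* every path below performs exactly two tests */
    if (ordinal > 32) {
        if (ordinal > 48) goto x1;
        else goto x2;
    } else if (ordinal > 16) goto x3;
    else goto x4;
x1: handler = 1; goto done;
x2: handler = 2; goto done;
x3: handler = 3; goto done;
x4: handler = 4;
done:
    return handler;
}
```

Both functions pick the same handler for every ordinal; only the number of comparisons differs.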

<muurkha>stikonas: not all 32-bit code on Linux, just the code that cares about time

<muurkha>like, libjpeg won't care

<stikonas>yes, which is why I specifically mentioned build systems

<stikonas>so live-bootstrap did fail at some point when trying to run it in 2040 or so...

<stikonas>not immediately, but somewhere between tcc and gcc...

<stikonas>once more complicated build systems really kick in... Earlier steps with kaem or handcrafted make files usually work fine

<muurkha>if you want reproducible bootstrapping you should probably do it with a fake system clock

<muurkha>because otherwise the system clock is an input into the build process that's different every time you run it

<stikonas>well, other clock bugs are fixed in live-bootstrap

<stikonas>there were initially a few bugs where the year was stored in documentation, etc...

<stikonas>but 2038 problem is a bit more complicated...

<muurkha>is it?

<stikonas>yeah, cause clocks wrap to 1900 or so...

<muurkha>not if you use a fake system clock

<stikonas>and e.g. make can get confused which file is newer and which is not
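A small sketch of why a wrapped 32-bit clock can confuse a make-style freshness check. The function names are hypothetical, and the comparison is a simplification of what a real build tool does. It assumes timestamps are stored as signed 32-bit seconds since 1970, in which case one second past 2147483647 (January 2038) becomes -2147483648 (December 1901).

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a 64-bit second count to what a signed 32-bit time field would hold. */
int32_t to_time32(int64_t seconds)
{
    uint32_t low = (uint32_t)seconds; /* modulo 2^32, well defined for unsigned */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)((int64_t)low - 4294967296LL); /* upper half maps to negatives */
}

/* Simplified make-style rule: rebuild when the target is older than its source. */
int target_looks_stale(int32_t source_mtime, int32_t target_mtime)
{
    return target_mtime < source_mtime;
}
```

For example, a source written at 2147483600 and a target built 100 seconds later at 2147483700 straddle the wrap: the target's stored time becomes negative, so it looks older than its source even though it was just built.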

<stikonas>well, if you fake it completely to some fixed value

<stikonas>then yes, that's a workaround

<stikonas>and in modern linux systems you could use namespaces for that
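A related, application-level sketch: instead of faking the system clock, a tool can accept a fixed timestamp and fall back to the wall clock only when none is given. This is a different technique from the one discussed above. It follows the SOURCE_DATE_EPOCH convention from the Reproducible Builds project; the function name and its fallback policy are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/* Return the fixed epoch if it parses as a non-negative integer, else wall_clock. */
time_t build_timestamp(const char *fixed_epoch, time_t wall_clock)
{
    char *end;
    long long value;

    if (fixed_epoch == NULL || *fixed_epoch == '\0')
        return wall_clock;
    errno = 0;
    value = strtoll(fixed_epoch, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0)
        return wall_clock; /* malformed input: fall back (an assumed policy) */
    return (time_t)value;
}
```

A caller would typically pass `getenv("SOURCE_DATE_EPOCH")` and `time(NULL)`, so that two runs with the same variable set embed the same timestamp.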

2023-10-29.log