IRC channel logs
2023-10-29.log
<jcowan> Is there a statement anywhere of what C dialect mescc accepts?
<stikonas> jcowan: though what do you want to achieve?
<jcowan> I'm interested in small C dialects
<stikonas> well, mescc generally supports slightly more than M2-Planet
<stikonas> and M2-Planet supports if/else statements, for/do/while loops, asm blocks (with its own M1 asm syntax), goto, unions, structs, arrays
<stikonas> I think mescc also supports switch/case on top of that
<stikonas> though mescc is orders of magnitude slower than M2-Planet
<jcowan> Hopefully it's faster if you run it on a more performant Scheme
<stikonas> yeah, a bit faster if you run it with guile
<stikonas> but if you are in bootstrapping environment where you have just built mes, then it is fairly slow
<jcowan> not that Guile is a very fast Scheme either
<jcowan> Gambit would probably be much more performant than Guile
<stikonas> and when combined with some non-performant hardware, e.g. risc-v, it can take a week to build tcc...
<stikonas> we probably output not very optimized code too
<stikonas> (kind of following x86 ideas, so not really most optimal for riscv)
<stikonas> e.g. heavy stack use rather than registers
<jcowan> x86 is a horrible use of silicon
<jcowan> what cpus are supported right now?
<stikonas> on riscv we can now get from hex0 to tcc
<stikonas> on amd64 we can start with hex0, get to mes, rebuild mes with mescc, then built very first tcc binary (but it is non-functional and crashes)
<stikonas> there might be some support for arm/aarch64, but you might have to just between 32/64 bits
<stikonas> jcowan: some people in my work do design x86 chips :), though not me...
<jcowan> You know the definition of Windows 95?
<jcowan> "A 32-bit shell on top of a 16-bit operating system designed for an 8-bit computer based on a 4-bit chip, designed by a two-bit company that doesn’t care one bit about its users."
<pabs3> btw, if anyone knows of other resources related to generated files; essays about source, or tools to detect generated files etc, let me know
<oriansj> pabs3: well generated files took a very hard turn for the worse when it comes to detection due to the wide availability of Large Language Models which produce output that looks human written.
<pabs3> yeah, that is a huge problem
<oriansj> but then again people claiming generated crap as source code isn't a new problem and even GNU programs distribute generated files in their "source" tarballs. So that half of the bootstrapping fight will be a much harder fight.
<stikonas> yeah, but LLM makes it much harder to identify those
<stikonas> and then you start getting semi-generated files
<stikonas> i.e. there was some source code that was pregenerated, possibly with some bugs and then human edits it
<stikonas> oriansj: so I think ekaitz and I are at the point where there isn't much more to do from hex0 to bootstrappable tcc on riscv64, so we'll need to think about releasing
<stikonas> do you think you'll have a bit of time to create stage0-posix tarballs?
<stikonas> it will probably be easier for janneke to test everything if we get stage0 out first
<stikonas> though I think new stage0-posix also needs mes 0.25 (i.e. it's not compatible with earlier ones)
<muurkha> man, it's been 28 years since Windows 95
<muurkha> RISC-V probably isn't inherently slow, but all the current implementations of it are
<muurkha> stikonas: when you say "on x86" do you mean "on i386"? sometimes people use "x86" to mean "amd64" or {amd64, i386} or {amd64, i386, 8086}
<stikonas> also our stage0 and mes compilers produce non-optimized code
<muurkha> hmm, but you said "full bootstrap works only on x86" and then explained how it doesn't work on amd64: "on amd64 we can ... [build the] very first tcc binary (but it is non-functional and crashes)"
<stikonas> well, yeah, I was not completely consistent with x86
<muurkha> it's probably the case that RISC-V depends more on compiler optimization than amd64; the larger register set is sort of like a level -1 data cache, explicitly managed by the compiler
<muurkha> so in "works only on x86" you meant i386?
<stikonas> sometimes people refer to both 32-bit and 64-bit when they say x86
<stikonas> but again, 64-bit CPUs can still run 32-bit code for now
<artemist> There are some ARM cores which only support AArch64 (mostly in very new phones or Apple's ARM machines) but that's mostly unrelated, you still end up with a ton of 32 bit code on Windows and it will be supported for the foreseeable future
<stikonas> on the other hand 32-bit code on Linux will be seriously broken in 15 years or so
<stikonas> I've tried running current bootstrap chain with clock moved forward and various things do break (in particular build systems)
<nektro> is the reason for why mes is slow well known/understood?
<jcowan> mescc is not optimized and does not produce optimized code
<jcowan> if you compile it with an optimizing Scheme->C compiler like Gambit and then compile the output with gcc/clang you will probably get a huge speedup
<oriansj> nektro: M2-Planet produces very naive binaries and the lack of switch support means that mes.c needs to do a bunch of branching on every single s-expression
<oriansj> so if you wanted to speed up mes.c a good bit we would need to add switch/case support to M2-Planet
<stikonas> nektro: I tried running callgrind on mes/mescc, it was spending 25% of time in eval_apply function
<oriansj> (which is a giant if else if else block)
<stikonas> still, it's unlikely that it would be massively faster
<stikonas> we can't really compare mes and gambit since gambit just compiles it to C...
<stikonas> and presumably something only gcc can deal with
<oriansj> well going from if/else to switch would reduce the number of conditional jumps from 1-20 to just 1
<oriansj> I remember when janneke went from switch to if/else even self-hosted it became a good bit slower.
<stikonas> well, even self-hosted can't help much if code is using if/else, and it has to in order to be able to be bootstrapped with M2-Planet
<stikonas> not sure how hard it would be to implement that
<oriansj> indeed, hence why it hasn't been implemented yet
<jcowan> stikonas: Gambit supports at least gcc, clang, and tcc
<stikonas> yeah, but in terms of bootstrapping, once you have tcc, scheme is less important
<mihi> just a side note: When your biggest problem is a huge if/else cascade on singleton pointers and your compiler cannot do switch/case or function pointers, but has goto (which is even used in eval_apply), the method known from last century was to give each singleton member (i.e. "symbol" in mes' case) a hand-crafted integer value (ordinal) and do range comparisons like "if (ordinal > 32) {if (ordinal > 24) goto x1;
<mihi> else goto x2;} else if (ordinal > 16) goto x3; else goto x4;" That way you get from O(n) worst case comparisons to somewhere near O(log n).
<mihi> But still, it only makes sense if that is the true bottleneck. If the bottleneck is that eval_apply is called in sequential loops where some more sensible lookup would make it faster, you'd have to optimize that first.
<mihi> and in Lisp/Scheme there is lots of linked list processing :)
<mihi> (that *24* should have been *48* in the code above)
<muurkha> stikonas: not all 32-bit code on Linux, just the code that cares about time
<stikonas> yes, which is why I specifically mentioned build systems
<stikonas> so live-bootstrap did fail at some point when trying to run it in 2040 or so...
<stikonas> not immediately, but somewhere between tcc and gcc...
<stikonas> once more complicated build systems really kick in... Earlier steps with kaem or handcrafted make files usually work fine
<muurkha> if you want reproducible bootstrapping you should probably do it with a fake system clock
<muurkha> because otherwise the system clock is an input into the build process that's different every time you run it
<stikonas> well, other clock bugs are fixed in live-bootstrap
<stikonas> there were initially a few bugs where year was stored in documentation, etc...
<stikonas> but 2038 problem is a bit more complicated...
<stikonas> yeah, cause clocks wrap to 1900 or so...
<stikonas> and e.g. make can get confused which file is newer and which is not
<stikonas> well, if you fake it completely to some fixed value
<stikonas> and in modern linux systems you could use namespaces for that
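Editor's note on the 2038 remarks above: a minimal C sketch of why a newer file can look older once timestamps pass the signed 32-bit limit. It assumes the timestamp is stored in a signed 32-bit field; `wrap_to_32bit` and `is_newer` are hypothetical names for illustration and are not taken from make, live-bootstrap, or any libc.

```c
#include <stdint.h>

/* Hypothetical helper: store a 64-bit count of seconds since the Unix epoch
 * in a signed 32-bit field, keeping only the low 32 bits. The arithmetic
 * avoids relying on implementation-defined narrowing conversions. */
int32_t wrap_to_32bit(int64_t seconds)
{
    uint32_t low = (uint32_t)seconds;            /* modulo 2^32, well defined */
    if (low <= (uint32_t)INT32_MAX) {
        return (int32_t)low;                     /* still fits, no wrap */
    }
    /* Upper half of the unsigned range maps onto the negative values. */
    return (int32_t)(low - 0x80000000u) + INT32_MIN;
}

/* Hypothetical stand-in for a build tool's "is a newer than b" check
 * on 32-bit modification times. */
int is_newer(int32_t a, int32_t b)
{
    return a > b;
}
```

With these definitions, 2147483647 (2038-01-19 03:14:07 UTC) is the last value that survives unchanged. One second later the stored value becomes INT32_MIN, which corresponds to December 1901, so a file written ten seconds after the limit compares as older than one written just before it.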
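Editor's note on mihi's range-comparison suggestion earlier in the log: a rough C sketch that contrasts a linear if/else cascade with a goto-based range tree, using only if/else and goto. Everything here is an assumption made for illustration: eight ordinals 0..7, the handler return values 100..107, the labels h0..h7, and the comparison counter. None of it is taken from mes.c or eval_apply.

```c
#include <assert.h>

/* Counts comparisons so the two strategies can be compared. */
int comparisons;

int eq(int a, int b) { comparisons++; return a == b; }
int gt(int a, int b) { comparisons++; return a > b; }

/* Linear cascade: up to 7 comparisons for 8 ordinals. */
int dispatch_linear(int ordinal)
{
    assert(ordinal >= 0 && ordinal < 8);
    comparisons = 0;
    if (eq(ordinal, 0)) return 100;
    else if (eq(ordinal, 1)) return 101;
    else if (eq(ordinal, 2)) return 102;
    else if (eq(ordinal, 3)) return 103;
    else if (eq(ordinal, 4)) return 104;
    else if (eq(ordinal, 5)) return 105;
    else if (eq(ordinal, 6)) return 106;
    else return 107;
}

/* Range tree with goto: always 3 comparisons for 8 ordinals. */
int dispatch_tree(int ordinal)
{
    assert(ordinal >= 0 && ordinal < 8);
    comparisons = 0;
    if (gt(ordinal, 3)) {
        if (gt(ordinal, 5)) { if (gt(ordinal, 6)) goto h7; else goto h6; }
        else { if (gt(ordinal, 4)) goto h5; else goto h4; }
    } else {
        if (gt(ordinal, 1)) { if (gt(ordinal, 2)) goto h3; else goto h2; }
        else { if (gt(ordinal, 0)) goto h1; else goto h0; }
    }
h0: return 100;
h1: return 101;
h2: return 102;
h3: return 103;
h4: return 104;
h5: return 105;
h6: return 106;
h7: return 107;
}
```

Both functions return the same handler value for every ordinal. For ordinal 7 the cascade needs 7 comparisons while the tree needs 3. As mihi notes, this only pays off if the cascade really is the bottleneck.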