IRC channel logs
2024-07-01.log
<ArneBab>rlb: where can I get your branch? I can benchmark against a specific hash if needed.
<rlb>It's definitely *not* going to satisfy any desires to avoid C code. I did want/try to write at least some of it in scheme during the conversion, but couldn't, because the relevant srfi is currently too deeply embedded to allow a split module (afaict -- planned to revisit that later).
<rlb>But I suspect some of the core we'll still want in C until the compiler comes further along, or until we have something like prescheme for the lower level.
<ArneBab>rlb: thank you! At 20% possible improvement for some tests, missing some of the 3% of 3.0.10 might not make that much of a difference.
<rlb>certainly -- of course we don't have many data points yet, so thanks for any testing you manage.
<ArneBab>rlb: I expect that we’ll always need some C for the parts that are simply more efficient in C. Guile Hoot rewrites parts in wasm as an alternate backend, and it does need wasm, too ⇒ similar to C in the regular backend. Having more in Scheme is great, but only where that’s actually similarly efficient.
<ArneBab>keep in mind, though, that these benchmarks are by ecraven; many of them ecraven imported from Larceny, so only a small part of this is actually done by me.
<rlb>Well, I was wildly imagining a built-in pre-scheme that compiles to C, just so I don't have to write C directly and/or, more importantly, can use real macros instead of the preprocessor.
<ArneBab>Wouldn’t pre-scheme then be written in C?
<rlb>No idea, but if that's well encapsulated, then maybe "great"?
<ArneBab>it might be able to bootstrap and then only *emit* C, but I’m not sure.
<rlb>And I have no idea if that's a goal currently, but naively it seems appealing.
<ArneBab>Even Scheme compilers like Stalin have their slowdowns.
<ArneBab>So I think it doesn’t hurt to have some performance-critical parts in C — so we don’t have to call out to other C in our Scheme programs :-)
<ArneBab>I’m testing with ad48d90d12a747489d74a0d7ee7aa53d12639a64 now
<rlb>ACTION is also fairly fond of ice-9, and would probably keep it (plus or minus potential further doc improvements), but will defer to others.
<rlb>I could just add the reference to the startup message (not opposed to additional Vonnegut publicity) :)
<ArneBab>rlb: I get compile errors in ad48d90d12a747489d74a0d7ee7aa53d12639a64; don’t know why
<rlb>May not be related to the i386 cps crash, but I think we have at least one more bug (cf. ffb95239aacf86d8dc622a438bdaacfac4a66efc) in utf8 string hashing: compared to the wide algorithm, the utf8 code unconditionally adds "a" to the finish (likely another overrun?), even if there's no "length", but that still doesn't get to the "right" value on i386. I'll add a patch to my proposed branch and keep looking.
<rlb>Hmm, if you can paste the error somewhere, happy to take a look.
<rlb>I think we should start automatically testing main against at least one 32-bit arch (and watching for failures).
<rlb>(Probably want at least 4 archs to match the 4 bytecode variants.)
<ArneBab>that’s strange — I see successful results in intermediary output but it’s missing at the end.
<rlb>Got it (wrt utf8 hashing). I'll add the patch to the ones I've proposed later.
<rlb>ArneBab: oh, I forgot to mention that you'll need to delete all your existing .go files (i.e. in your ~/.cache, etc.) because the utf8 branch (has to) change the physical string layout. Of course that'll be handled transparently in any actual release, because we'll be changing the Y version too.
<rlb>(Maybe that's what you were seeing.)
<rlb>Oh, and of course "make clean" in the guile tree if you'd already built something else there.
<ArneBab>rlb: ah, that might explain it, yes. I ran make clean, but didn’t delete all .go files.
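[Editor's note: rlb's advice above — delete stale compiled .go files after a change to the physical string layout — can be sketched as a small shell cleanup. The usual cache location (~/.cache/guile) is stated in the log; to stay side-effect free, this demo operates on a scratch directory rather than a real cache.]

```shell
#!/usr/bin/env bash
# Sketch of clearing stale compiled .go files, per rlb's advice above.
# A scratch directory stands in for the real ~/.cache/guile tree.
cache=$(mktemp -d)
mkdir -p "$cache/ccache/3.0"
touch "$cache/ccache/3.0/boot-9.go" "$cache/ccache/3.0/ice-9.go" "$cache/notes.txt"

# -delete avoids the quoting pitfalls of piping filenames to `xargs rm`.
find "$cache" -name '*.go' -delete

remaining=$(find "$cache" -name '*.go' | wc -l)   # 0 after cleanup
kept=$(find "$cache" -name '*.txt' | wc -l)       # unrelated files untouched
rm -rf "$cache"
```

On a real setup the equivalent would be running the same `find … -name '*.go' -delete` over the Guile cache directory, plus `make clean` in the build tree.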
<haugh>I'm looking at the definition of define-values in boot-9. In the (_ (var0 ... varn) expr) clause, why does the call-with-values consumer wrap `list` with a lambda? Why not just use `list`, like we do in the (_ var expr) clause?
<haugh>Btw, this is not an efficiency gripe; I just want to make sure I'm not missing some functionality
<haugh>Oh, maybe it's a constraint, to throw an error if we get the wrong number of values
<rlb>dsmith: yeah, that's why I was thinking of it :) But I didn't know exactly what was planned for guile's existing "backend" on that front, e.g. are we likely to be able to use it as an intermediary in front of the C compiler for relevant bits in libguile?
<rlb>ArneBab: you can just delete the related subtree in ~/.cache for the cache, of course. I have a helper cmd for that here, since I was having to do it so often (rebasing back and forth across that border) when working on the branch :)
<rlb>Hmm, as yet I see no easy way to create a test for the scm_i_utf8_string_hash overrun bug, because you can't call it directly from a test program, so you can't easily give it a buffer with a guaranteed non-null byte in the overflow. If I hack that test into hash.c, I can see the problem, but I'm guessing that any use via scm_symbol_hash() is presenting it with a buffer that's null-terminated, and the null doesn't affect the hash.
<rlb>Hmm, or maybe it'll only "work" on 32-bit for reasons.
<rlb>I'll double-check the "high level" test there.
<ArneBab>rlb: now it works — after adding find . -iname '*.go' | xargs rm to the workflow ☺
<rlb>If "make check" passes, I suspect you're in pretty good shape. We have a good bit of testing, and the branch adds more -- though we still don't cover all the stuff we might want to (as always).
<ArneBab>rlb: I see a minimal speedup over 3.0.10 (0.4%), but that’s in the margin of error ⇒ more tests are running.
<ArneBab>I had planned to run 10 runs over the night, but I failed the bash variable-expansion test, so it was only a single one: export ITERATIONS=10 ; for i in {1..$ITERATIONS}; do echo $i; done only does one run — with '{1..10}'. Now fixed that to use seq …
<ArneBab>However there will be more noise with those added iterations, because I’ll have higher system load.
<rlb>Good to know -- any idea offhand if the tests you're running are mostly ascii, and/or string or non-string-operation heavy?
<rlb>I'd suspect the branch might be slower for mostly ascii.
<rlb>well, latin-1-ish -- ascii should be similar to the existing code.
<ober>it's getting long in the tooth (2022)
<ecraven>just updating the machine where I run them, it's been too long :D
<ArneBab>ecraven: I use your benchmark for a version comparison in the script I linked above, and it’s been pretty useful so far. Thank you!
<ecraven>it's mostly not mine, but from the larceny folks
<ArneBab>ecraven: and did you see that wingo also used a version of your benchmarks to check the performance of Hoot?
<ecraven>I didn't ;) I haven't had the time to follow the Scheme world as closely as I'd like :-/
<ArneBab>ecraven: well: you made it easier to use, and you made it well known (which *is* a feat in itself)
<ecraven>I've been meaning to write the scaffolding to run the tests via Scheme code forever, but ..
<ArneBab>ecraven: also your results are really interesting
<ecraven>I'd like to capture more metrics, like CPU time, memory, ...
<ecraven>I probably have better hardware now, I'll see whether I can get it running there at some point
<ecraven>but as a first step, I'll try to run it as-is, just to get a current version up.
<ArneBab>isn’t "user" (from time) the CPU time?
<ArneBab>maybe that’s a case where incremental improvements may win over the big cleanup?
<ecraven>sorry, we have CPU time, of course.
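[Editor's note: the pitfall ArneBab describes above is that bash performs brace expansion *before* variable expansion, so `{1..$ITERATIONS}` never becomes a numeric range. A minimal sketch of the broken and fixed loops:]

```shell
#!/usr/bin/env bash
ITERATIONS=3

# Broken: brace expansion runs first and finds no valid range (the $ is
# still unexpanded), so the loop sees the single literal word '{1..3}'
# after parameter expansion and runs exactly once.
broken=$(for i in {1..$ITERATIONS}; do echo "$i"; done)

# Fixed: seq expands at run time, so the loop really runs $ITERATIONS times.
fixed=$(for i in $(seq 1 "$ITERATIONS"); do echo "$i"; done)
```

An alternative that avoids spawning seq is bash's arithmetic for-loop, `for ((i=1; i<=ITERATIONS; i++))`, which also evaluates the variable at run time.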
<ecraven>but mean/max memory, for example, would be interesting, I think
<ecraven>also, something about multiple cores, not entirely clear on that yet
<ArneBab>yes: AFAIK user is the time of all CPUs added up (at least that’s how it looks when real time is 10s and user time is 2min).
<ecraven>right now I only measure the wall-clock time, I think, because it's based on `current-jiffies', IIRC
<ArneBab>For diviter (I just saw that) real time on Guile is 4 seconds and user time 16 seconds. Seems like heavy GC load.
<ecraven>there's so many interesting things one *could* do :D I also need to study the original Gabriel book...
<ecraven>ArneBab: thanks, I'll have a look at that!
<ecraven>Haha: "Thirdly, GNU time has a bug. It reports 4x the actual memory usage"
<ecraven>thanks, this is one interesting page!
<ecraven>hm.. taking memory into account will make comparison even harder than it is already :P
<ArneBab>ecraven: to use the time program instead of the shell's own time you can use \time
<ArneBab>\time evades aliases and the shell's time keyword
<ecraven>that is one hard program name to search for :P
<ArneBab>rlb: -O3 discarding everything "because you don’t need it" is fun ☺
<rlb>One interesting thing to me is whether it can accurately include the entire process subtree; I've wanted something like that for basic cpu/memory/io, to be able to assess the overall "cost" of some operations for a multi-process server like postgres or apache, or...
<ecraven>doesn't compile on current arch, though :-/
<ArneBab>ecraven: does something prevent you from using GNU time?
<ecraven>yes, I don't want to measure startup
<ecraven>also, I seem to have "misplaced" the web-page-generating code ..
<ecraven>which wouldn't be so bad, I should rewrite that anyway :P
<ArneBab>that code does produce a really good website!
<ecraven>it's incredibly ugly though..
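[Editor's note: why the backslash in `\time` works: in bash, `time` is a reserved word recognized by the parser itself, before any PATH lookup. Quoting any character of the name (`\time`, or using `command time`) defeats keyword and alias recognition, so the external binary — GNU time on most GNU/Linux systems — runs instead. A small sketch (the GNU time format string at the end is illustrative):]

```shell
#!/usr/bin/env bash
# `time` never reaches PATH lookup on its own: the bash parser claims it.
kind=$(bash -c 'type -t time')    # prints "keyword", not "file"

# Escaping the name defeats that recognition, forcing a PATH lookup.
# With GNU time installed, its format options then become available, e.g.:
#   \time -f '%e s elapsed, %M kB max resident' guile -c '(display 42)'
echo "$kind"
```

The same trick works for bypassing aliases generally, which is why `\ls` is a common idiom for getting an unaliased `ls`.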
<ecraven>the site isn't so good either
<ArneBab>as a user, I disagree with the website being not so good
<ArneBab>also, rewriting the site sounds a lot like a big project in its own right
<ArneBab>it’s not like I could tell you what to spend your time on, and maybe I should just not give people advice (spending 10 years on getting Wisp into Guile does not speak of good prioritization), but still I’d like to see the new results :-)
<ArneBab>ecraven: to avoid setup, you’d need to first measure an empty run, right?
<ecraven>no, the scheme code runs `current-jiffies' before it starts the test and after.. which is fine for wall-clock time
<ecraven>this is all not as solid and robust as it should be :-/
<ArneBab>so you don’t actually get the data from `time`?
<ArneBab>ah: (define current-jiffy get-internal-real-time)
<ecraven>yeah, depending on the amount of r7rs support, I fake it :D
<ecraven>I'm really not the best person to do this, as I don't know all the implementations that well :-/
<ecraven>there's probably too much and even incorrect or at least slow shimming there :-/
<ArneBab>To me it looks like you’re one of the few who actually knows some details on that many implementations
<ArneBab>ecraven: is there a chance that you can find the website code again?
<ecraven>I never wanted to commit it, because it was so ugly :)
<ecraven>I can recreate it if necessary; I have the html to see *what* I generated
<ecraven>it's mostly just some simple data wrangling to get the numbers, and then svg generation
<ArneBab>that’s an argument for release-early-release-often-don’t-worry-what-they-may-think ☺
<ArneBab>(whoever "they" may be varies by person ☺)
<ecraven>can I drop guile2.2 from the benchmarks?
<ecraven>or are there still reasons to use it?
<ArneBab>I personally think that while it’s interesting for Guile devs, Guile 3 has been out for 4 years now, so it shouldn’t be a problem to drop it.
<rlb>Not sure how much you should weight my opinion, but I'd say yes (you can drop it).
<ecraven>ok, found some version of it, probably the latest ;D
<ArneBab>What may be interesting is adding Guile Hoot — that’s Guile running in a web browser via wasm. But I can’t yet tell you how to benchmark it practically (wingo?), so maybe that’s for later :-)
<ecraven>I've never done much with wasm, but there are WebAssembly "runners" outside of browsers, I think?
<ecraven>but then, that wouldn't benchmark whatever you get in the browser
<ArneBab>ecraven: yes — wasm is pretty fast and can leverage the garbage collector in the browser, so that could be pretty efficient.
<rlb>wingo: (bisect) found a commit that might be the one that broke 32-bit compilation: d579848cb5d65440af5afd9c8968628665554c22 I'll look at it later and see if it means anything to me.