<apteryx>amz3: I didn't know much about Smalltalk; reading on it, it is pretty impressive!
***heroux_ is now known as heroux
<apteryx>is there no function provided by guile out-of-the-box to rename a directory?
***Server sets mode: +nt
***Server sets mode: +nt
***unCork is now known as Cork
<lloda>otoh it takes a very long time to compile an empty file
<lloda>any other number is way faster
***rekado_ is now known as rekado
<wingo>doing a simple GUILE_JIT_COUNTER_THRESHOLD=1000 GUILE_JIT_LOG_LEVEL=1 meta/guile
<wingo>84% of which are compiled at their entry point
<wingo>and the rest tier up from loops
<civodul>so it's possible that a loop within a function is compiled, but not the whole function, right?
<wingo>a function has a single optimization counter
<wingo>that counter is incremented on function entry and on loop iterations
<wingo>so either could trigger compilation
<wingo>when a function is jit-compiled, the whole thing is compiled, even if it was a loop iteration that triggered the compilation
<wingo>if it was a loop iteration, the function will tier up into the middle of the compiled code, instead of the beginning
<civodul>so the whole function is compiled, but the trigger might be a loop entry
<wingo>indeed! the default in the lightning branch is no jit compilation
<wingo>so you can just kick off a compile to have it around
<wingo>make sure to remove all .go files in that dir tho
<wingo>the binary format was changing without .go abi bumps
<wingo>then when you want to jit-compile, you can do it via the temporarily-available %jit-compile function
<wingo>or you can set GUILE_JIT_COUNTER_THRESHOLD=NNNN
<wingo>where NNNN is the counter threshold above which a function gets compiled
<wingo>set GUILE_JIT_LOG_LEVEL=N to see debug output
<civodul>what fraction of bytecode instructions can be compiled?
<wingo>even aborts, continuations, reinstating, etc
<civodul>so it's where we can start playing with it and see what happens?
<wingo>it's starting to get there, yes. for small examples :)
<wingo>if you set GUILE_JIT_COUNTER_THRESHOLD=0 then *everything* gets compiled
<wingo>as compiling a function takes longer than running it
<civodul>at least you can see if it crashes ;-)
<wingo>i meant to say, i haven't yet gotten the whole test suite to run through. there were some stalls, and i just fixed an important crasher yesterday
<wingo>but it's definitely at the point where if you have a test case, you can see what will happen
<wingo>not time to try it out on whole apps tho
<wingo>also, there seem to be a couple of places where the interpreter is noticeably slower than 2.2 -- like more than 20 or 30% slower. it's related to changes in the bytecode and such, most of them are fixable i think, but if you find a case like that, please lmk so i can have a poke
<wingo>it's somewhat expected to have some cases in the interpreter be slower than 2.2, as 3.0 has generally more instructions than 2.2, because the instructions do less work
<wingo>but it shouldn't be significant, and the JIT should always more than make up the difference
<wingo>fwiw i am seeing startup time slightly less with 3.0 than 2.2 so that's good anyway
<wingo>keeping startup time fast is something i'm keeping an eye on
<civodul>slower interpreter could mean longer bootstrap times tho ;-)
<wingo>by interpreter i mean vm-engine.c
<wingo>i haven't tested the relative speed of eval.scm
<wingo>but i would imagine that JIT-compiled eval.scm should handily beat 2.2's eval.scm
<wingo>right, i think we're going to have to start calling that the evaluator!
<wingo>more parts, more names, more confusion :)
<civodul>srfi-14.c:2061:42: warning: '%06x' directive writing between 6 and 8 bytes into a region of size 7 [-Wformat-overflow=]
<civodul>wingo: messages like "jit: vcode: start=0xe30e60,+6 entry=+0" mean that we jumped to jitted code?
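The tier-up rule wingo describes above (one counter per function, bumped on entry and on each loop iteration, with the whole function compiled once the counter goes above the threshold) can be sketched as a toy model. This is a hedged illustration in Python, not Guile's implementation: the class name `ToyFunction`, the increment of 1 per event, and the `trigger` field are all assumptions made for the sketch.

```python
# Toy model of the tier-up counter described in the log.
# Assumptions (not from Guile's source): every event adds 1 to the counter,
# and the names ToyFunction / trigger are hypothetical.

class ToyFunction:
    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold
        self.counter = 0        # a single counter for the whole function
        self.compiled = False   # once True, the whole function is "compiled"
        self.trigger = None     # "entry" or "loop": what pushed it over

    def _bump(self, site):
        if self.compiled:
            return
        self.counter += 1
        # Compile when the counter goes above the threshold.
        if self.counter > self.threshold:
            self.compiled = True
            self.trigger = site

    def call(self, loop_iterations=0):
        self._bump("entry")                 # counted on function entry
        for _ in range(loop_iterations):
            self._bump("loop")              # counted on each loop iteration


# A function called many times, with no loop inside.
hot = ToyFunction("hot", threshold=1000)
for _ in range(1001):
    hot.call()

# A function called once, containing a long-running loop.
loopy = ToyFunction("loopy", threshold=1000)
loopy.call(loop_iterations=5000)

# Threshold 0: the very first entry triggers compilation.
eager = ToyFunction("eager", threshold=0)
eager.call()

print(hot.trigger, loopy.trigger, eager.trigger)
```

In this model `hot` is compiled because of a call and `loopy` because of a loop iteration, yet in both cases the unit of compilation is the whole function, which matches the distinction drawn in the conversation.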
<wingo>civodul: no that means that jit code was emitted
<wingo>usually the code will be entered also right after the message
<wingo>but to see all entries and exits, set log level to 3
<wingo>functions that are called a lot are called a lot :)
<civodul>at the REPL we just see too many of them
<wingo>yeah, better to run at log level 1 generally, if you are interested in when jit happens
<wingo>otherwise pipe stderr somewhere
<wingo>"entry=+0" indicates that it was a function call that triggered JIT
<wingo>nonzero indicates that it was a loop
<civodul>if i do: GUILE_JIT_LOG_LEVEL=1 GUILE_JIT_COUNTER_THRESHOLD=1000 ./meta/guile -c '(define (fib i)(if (<= i 1) 1 (+ i (fib (- i 1))))) (fib 42)'
<civodul>presumably only the last few lines correspond to my procedure, right?
<wingo>civodul: probably, check the vcode compared to the addresses from ,x fib
<wingo>civodul: i think in that case actually `fib` is interpreted
<wingo>civodul: for fib, on lightning, without jit: using the evaluator, (fib 35) takes 10.9s. Using the compiler, 1.05s.
<wingo>with jit, using the evaluator, (fib 35) takes 6.47s, and with the compiler, 0.61s.
<wingo>by way of comparison, with 2.2 and the evaluator, 9.2s, and with the compiler, 0.88s.
<wingo>still some headroom in lightning i think tho
<wingo>if "fib" itself is letrec-bound instead of bound at the top level, 2.2's compiler is 0.78s, lightning's compiler is 0.92s, lightning compiler+jit is 0.51s
<wingo>incidentally i don't know how ecraven's benchmark got 40s for this test
<wingo>he did (fib 40) but that result is 8s for me with 2.2, or 10s lightning, or 5.2s lightning+jit
<ecraven>I should rerun everything sometime soon
<wingo>ah sorry didn't mean to neg your machine, i was thinking you were talking about mine :)
<wingo>anyway for 2.2 i would expect something a bit better, but who knows
<wingo>hopefully we can get a 3.0 prerel out soon that will be on par with racket :)
<wingo>ecraven: i was wrong! your tests run fib 40 5 times
<ecraven>I'll do a new run over the next few days, there were some new releases
<civodul>wingo: re fib, interesting! though i guess fib conses a lot (bignums) so it may not be a great benchmark
<civodul>it's interesting to see that letrec binding helps this much already in 2.2
<wingo>the letrec binding means it can use call-label instead of call
<wingo>and no closure needed either
<wingo>(fib 40) isn't a bignum fwiw
<wingo>so it's "just" call overhead
<wingo>chez's results on those benchmarks are offensive
<wingo>there is no way we can win those benchmarks unless we get some sort of top-level inlining story
<wingo>it's clear that the mbrot results are only possible having proved that many variables are inexact reals
<wingo>and you can't do that without inlining
<civodul>wingo: IMO we should definitely work towards intra-module inlining
<civodul>that needs a transition path: could be issuing warnings for occurrences of "@@" in user code, adding 'declare' statements to explicitly turn inlining on and off, etc.
<wingo>we'd need more of an idea about what a module is of course :P
<wingo>i.e. when do you know that you've seen all of a module
<civodul>i've come to love R6RS's 'library' form
<wingo>the good news from taking a look at the r7rs benchmarks is that the jit reduces run-time on those benchmarks by 40-60% generally, relative to 2.2
<wingo>the bad news is that there's still a way to go before we eat chez's lunch :)
<wingo>e.g. the peval benchmark for me goes from 20s -> 11.5s
<wingo>that's with GUILE_JIT_COUNTER_THRESHOLD=1000
<wingo>if i could golf that reduction down to the 60-80% range that would make me happy
<wingo>GUILE_JIT_COUNTER_THRESHOLD=1000 GUILD=/opt/guile/bin/guild GUILE=/opt/guile/bin/guile ./bench guile peval
<wingo>with jit we are still generally 3x slower than racket. probably a lot related to inlining, some related to code generation
<wingo>civodul: intra-module contification would also be nice
<civodul>wingo: 40%-60% is good news, indeed!
<civodul>how does Guile 3 compare to 2.2 without JIT?
<wingo>civodul: see my notes from 13:07
*wingo pastes to not have to scroll up
<wingo><wingo> civodul: for fib, on lightning, without jit: using the evaluator, (fib 35) takes 10.9s. Using the compiler, 1.05s.
<wingo><wingo> with jit, using the evaluator, (fib 35) takes 6.47s, and with the compiler, 0.61s.
<wingo><wingo> by way of comparison, with 2.2 and the evaluator, 9.2s, and with the compiler, 0.88s.
<wingo>that's fairly representative. will see if i can golf the interpreted perf a bit
<civodul>i wasn't sure if this was representative
<wingo>there are some bits of low-hanging fruit to investigate
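To put the quoted timings side by side, the following Python sketch computes each configuration's run-time change relative to 2.2. The numbers are copied from wingo's messages in the log; the table layout and the helper name `change_vs_22` are assumptions introduced only for this illustration.

```python
# Timings for (fib 35) as quoted in the log, in seconds.
# The dictionary layout and helper name are hypothetical; the numbers are not.
timings = {
    ("2.2", "evaluator"): 9.2,
    ("2.2", "compiler"): 0.88,
    ("lightning", "evaluator"): 10.9,
    ("lightning", "compiler"): 1.05,
    ("lightning+jit", "evaluator"): 6.47,
    ("lightning+jit", "compiler"): 0.61,
}

def change_vs_22(config, mode):
    """Percent change in run time relative to 2.2 (negative means faster)."""
    base = timings[("2.2", mode)]
    return 100.0 * (timings[(config, mode)] - base) / base

for (config, mode), secs in timings.items():
    if config != "2.2":
        pct = change_vs_22(config, mode)
        print(f"{config:14s} {mode:10s} {secs:6.2f}s  {pct:+6.1f}% vs 2.2")

# The peval figure quoted in the log: 20s -> 11.5s.
peval_change = 100.0 * (11.5 - 20.0) / 20.0
print(f"peval                     {peval_change:+6.1f}% vs 2.2")
```

On these figures the lightning branch without JIT comes out a little under 20% slower than 2.2 for this test, the JIT brings it to roughly 30% faster, and the peval run is a 42.5% reduction.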