IRC channel logs



<apteryx>amz3: I didn't know much about Smalltalk; reading on it, it is pretty impressive!
***heroux_ is now known as heroux
<apteryx>is there no function provided by guile out-of-the-box to rename a directory?
***Server sets mode: +nt
***unCork is now known as Cork
<lloda>wingo: morning
<lloda>the segfault is gone
<lloda>otoh it takes a very long time to compile an empty file
<lloda>it's rather strange
<lloda>with THRESHOLD=0
<lloda>any other number is way faster
***rekado_ is now known as rekado
<wingo>doing a simple GUILE_JIT_COUNTER_THRESHOLD=1000 GUILE_JIT_LOG_LEVEL=1 meta/guile
<wingo>i see 89 functions compiled
<wingo>84% of which are compiled at their entry point
<wingo>and the rest tier up from loops
<civodul>hey wingo
<civodul>so it's possible that a loop within a function is compiled, but not the whole function, right?
<wingo>a function has a single optimization counter
<wingo>that counter is incremented on function entry and on loop iterations
<wingo>so either could trigger compilation
<wingo>when a function is jit-compiled, the whole thing is compiled, even if it was a loop iteration that triggered the compilation
<wingo>if it was a loop iteration, the function will tier up into the middle of the compiled code, instead of the beginning
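The tiering scheme wingo describes can be modeled in a few lines. This is an illustrative Python sketch, not Guile's actual implementation: one counter per function, bumped on entry and on each loop back-edge; crossing the threshold compiles the whole function, and execution resumes at whatever offset was hot.

```python
# Toy model of Guile's JIT tiering as described above (names and the
# THRESHOLD value are assumptions for illustration, not Guile's code).
THRESHOLD = 1000

class Function:
    def __init__(self, name):
        self.name = name
        self.counter = 0          # single optimization counter per function
        self.compiled = False
        self.entered_at = None    # bytecode offset where jit code was entered

    def bump(self, at_offset):
        """Called on function entry (at_offset=0) and loop back-edges (>0)."""
        self.counter += 1
        if not self.compiled and self.counter >= THRESHOLD:
            self.compiled = True
            # Whole function is compiled, but we tier up at the current point:
            self.entered_at = at_offset

f = Function("fib")
for _ in range(THRESHOLD):
    f.bump(at_offset=6)   # pretend a loop back-edge at offset +6 is hot
assert f.compiled and f.entered_at == 6
```

The key point the model captures: compilation is always whole-function, but entry into the compiled code may be mid-function when a loop triggered it.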
<civodul>so the whole function is compiled, but the trigger might be a loop entry
<civodul>i should really give it a try
<wingo>indeed! the default in the lightning branch is no jit compilation
<wingo>so you can just kick off a compile to have it around
<wingo>make sure to remove all .go files in that dir tho
<wingo>the binary format was changing without .go abi bumps
<wingo>then when you want to jit-compile, you can do it via the temporarily-available %jit-compile function
<wingo>where NNNN (as in GUILE_JIT_COUNTER_THRESHOLD=NNNN) is the counter threshold above which a function gets compiled
<wingo>set GUILE_JIT_LOG_LEVEL=N to see debug output
<wingo>for N in {1,2,3}
<civodul>i see
<civodul>what fraction of bytecode instructions can be compiled?
<wingo>civodul: all of them!
<wingo>even aborts, continuations, reinstating, etc
<wingo>neat, right? :)
<civodul>crazy even! :-)
<civodul>so it's where we can start playing with it and see what happens?
<wingo>it's starting to get there, yes. for small examples :)
<wingo>if you set GUILE_JIT_COUNTER_THRESHOLD=0 then *everything* gets compiled
<wingo>takes a while of course
<wingo>as compiling a function takes longer than running it
<civodul>at least you can see if it crashes ;-)
<wingo>yes :)
<wingo>i meant to say, i haven't yet gotten the whole test suite to run through. there were some stalls, and i just fixed an important crasher yesterday
<wingo>but it's definitely at the point where if you have a test case, you can see what will happen
<wingo>not time to try it out on whole apps tho
<wingo>also, there seem to be a couple of places where the interpreter is noticeably slower than 2.2 -- like more than 20 or 30% slower. it's related to changes in the bytecode and such, most of them are fixable i think, but if you find a case like that, please lmk so i can have a poke
<wingo>it's somewhat expected to have some cases in the interpreter be slower than 2.2, as 3.0 has generally more instructions than 2.2, because the instructions do less work
<wingo>but it shouldn't be significant, and the JIT should always more than make up the difference
<wingo>fwiw i am seeing startup time slightly less with 3.0 than 2.2 so that's good anyway
<wingo>keeping startup time fast is something i'm keeping an eye on
<civodul>slower interpreter could mean longer bootstrap times tho ;-)
<wingo>mmm, not sure
<wingo>by interpreter i mean vm-engine.c
<wingo>i haven't tested the relative speed of eval.scm
<wingo>but i would imagine that JIT-compiled eval.scm should handily beat 2.2's eval.scm
<wingo>haven't tested yet tho :)
<civodul>oh, i was thinking of eval.scm
<wingo>right, i think we're going to have to start calling that the evaluator!
<wingo>more parts, more names, more confusion :)
<civodul>gcc 8 shows interesting warnings:
<civodul>srfi-14.c:2061:42: warning: '%06x' directive writing between 6 and 8 bytes into a region of size 7 [-Wformat-overflow=]
<civodul>a bit worrying
<civodul>anyway it's building :-)
<civodul>wingo: messages like "jit: vcode: start=0xe30e60,+6 entry=+0" mean that we jumped to jitted code?
<wingo>civodul: no that means that jit code was emitted
<wingo>usually the code will be entered also right after the message
<wingo>but to see all entries and exits, set log level to 3
<civodul>that's more verbose indeed :-)
<wingo>functions that are called a lot are called a lot :)
<civodul>at the REPL we just see too many of them
<wingo>yeah, better to run at log level 1 generally, if you are interested in when jit happens
<wingo>otherwise pipe stderr somewhere
<wingo>"entry=+0" indicates that it was a function call that triggered JIT
<wingo>nonzero indicates that it was a loop
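The "call vs loop" distinction in the log lines can be pulled out mechanically. A hedged sketch, assuming the log format matches the single `jit: vcode: start=0xe30e60,+6 entry=+0` example quoted above:

```python
import re

# Classify what triggered compilation from a "jit: vcode: ..." log line.
# The regex is an assumption based on the one example in the discussion.
LINE = re.compile(r"start=(0x[0-9a-f]+),\+(\d+)\s+entry=\+(\d+)")

def trigger(line):
    m = LINE.search(line)
    if m is None:
        return None
    entry = int(m.group(3))
    # entry=+0 -> compiled at function entry (a call triggered it);
    # nonzero  -> tiered up from inside a loop.
    return "call" if entry == 0 else "loop"

print(trigger("jit: vcode: start=0xe30e60,+6 entry=+0"))   # call
```

Piping stderr through a filter like this is one way to summarize a level-1 log run without scrolling through every entry.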
<civodul>ok, nice
<civodul>if i do: GUILE_JIT_LOG_LEVEL=1 GUILE_JIT_COUNTER_THRESHOLD=1000 ./meta/guile -c '(define (fib i)(if (<= i 1) 1 (+ i (fib (- i 1))))) (fib 42)'
<civodul>presumably only the last few lines correspond to my procedure, right?
<wingo>civodul: probably, check the vcode compared to the addresses from ,x fib
<wingo>civodul: i think in that case actually `fib` is interpreted
<wingo>also that's not fib :)
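wingo's aside is correct: the Scheme snippet in civodul's command does `(+ i (fib (- i 1)))`, a single self-call that just sums i down to 1, not the doubly recursive Fibonacci recurrence. A Python transliteration of both makes the difference concrete:

```python
def not_fib(i):
    """What civodul's command actually defines: i + (i-1) + ... + 1."""
    return 1 if i <= 1 else i + not_fib(i - 1)

def fib(i):
    """The usual doubly recursive fib, with fib(0) = fib(1) = 1."""
    return 1 if i <= 1 else fib(i - 1) + fib(i - 2)

print(not_fib(5), fib(5))  # 15 8
```

`not_fib` is linear in i, so it is a much lighter benchmark than real fib, which is exponential in the number of calls.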
<wingo>civodul: for fib, on lightning, without jit: using the evaluator, (fib 35) takes 10.9s. Using the compiler, 1.05s.
<wingo>with jit, using the evaluator, (fib 35) takes 6.47s, and with the compiler, 0.61s.
<wingo>by way of comparison, with 2.2 and the evaluator, 9.2s, and with the compiler, 0.88s.
<wingo>still some headroom in lightning i think tho
<wingo>if "fib" itself is letrec-bound instead of bound at the top level, 2.2's compiler is 0.78s, lightning's compiler is 0.92s, lightning compiler+jit is 0.51s
<wingo>so, broadly similar results
<wingo>incidentally i don't know how ecraven's benchmark got 40s for this test
<wingo>he did (fib 40) but that result is 8s for me with 2.2, or 10s lightning, or 5.2s lightning+jit
<ecraven>I should rerun everything sometime soon
<ecraven>it's a fast machine, maybe?
<wingo>not really
<ecraven>ah, 40, not 4.. not sure
<wingo>ah sorry didn't mean to neg your machine, i was thinking you were talking about mine :)
<wingo>anyway for 2.2 i would expect something a bit better, but who knows
<wingo>hopefully we can get a 3.0 prerel out soon that will be on par with racket :)
<wingo>ecraven: i was wrong! your tests run fib 40 5 times
<wingo>so the numbers were right
<wingo>irritatingly :)
<ecraven>ok, that's good
<ecraven>I'll do a new run over the next few days, there were some new releases
<civodul>wingo: re fib, interesting! though i guess fib conses a lot (bignums) so it may not be a great benchmark
<civodul>in case someone had a doubt ;-)
<civodul>it's interesting to see that letrec binding helps this much already in 2.2
<wingo>the letrec binding means it can use call-label instead of call
<wingo>and no closure needed either
<wingo>(fib 40) isn't a bignum fwiw
<wingo>so it's "just" call overhead
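wingo's point that `(fib 40)` isn't a bignum is easy to sanity-check: with the `fib(0) = fib(1) = 1` convention used in this discussion, fib(40) is about 1.7e8, far below the roughly 62-bit fixnum range on a 64-bit build (the exact tag-bit count is a detail of Guile's representation and assumed here, not quoted from the source):

```python
def fib(n):
    """Iterative fib with fib(0) = fib(1) = 1, matching the snippet above."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return b

print(fib(40))                 # 165580141
print(fib(40) < 2 ** 61)       # True: comfortably a fixnum on 64-bit
```

So the benchmark exercises call overhead and fixnum arithmetic only; no bignum allocation is involved, which is what makes the call-label/letrec difference visible.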
<civodul>oh right
<wingo>chez's results on those benchmarks are offensive
<cmaloney>offensive in a good way?
<wingo>disgustingly good
<wingo>there is no way we can win those benchmarks unless we get some sort of top-level inlining story
<wingo>maybe civodul will let me do it one day
<wingo>it's clear that the mbrot results are only possible having proved that many variables are inexact reals
<wingo>and you can't do that without inlining
<civodul>wingo: IMO we should definitely work towards intra-module inlining
<civodul>that needs a transition path: could be issuing warnings for occurrences of "@@" in user code, adding 'declare' statements to explicitly turn inlining on and off, etc.
*wingo nod
<wingo>we'd need more of an idea about what a module is of course :P
<wingo>i.e. when do you know that you've seen all of a module
<civodul>oh yes, that's terrible
<civodul>i've come to love R6RS's 'library' form
<wingo>the good news from taking a look at the r7rs benchmarks is that the jit reduces run-time on those benchmarks by 40-60% generally, relative to 2.2
<wingo>the bad news is that there's still a way to go before we eat chez's lunch :)
<wingo>e.g. the peval benchmark for me goes from 20s -> 11.5s
<wingo>that's with GUILE_JIT_COUNTER_THRESHOLD=1000
<wingo>if i could golf that reduction down to the 60-80% range that would make me happy
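A quick arithmetic check on the numbers quoted above: the peval benchmark going from 20s to 11.5s is a ~42% reduction in run-time, at the low end of the 40-60% range wingo mentions, which is why he'd like to golf it toward 60-80%:

```python
def reduction(before, after):
    """Percentage reduction in run-time from `before` seconds to `after`."""
    return (before - after) / before * 100

print(round(reduction(20.0, 11.5), 1))  # 42.5
```

Note that a 60-80% reduction would mean 2.5x-5x faster, so the remaining gap to Chez/Racket-class performance is substantial even at the top of the current range.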
<wingo>to run on your own, clone
<wingo>and run via
<wingo>GUILE_JIT_COUNTER_THRESHOLD=1000 GUILD=/opt/guile/bin/guild GUILE=/opt/guile/bin/guile ./bench guile peval
<wingo>with jit we are still generally 3x slower than racket. probably a lot related to inlining, some related to code generation
<wingo>civodul: intra-module contification would also be nice
<civodul>wingo: 40%-60% is good news, indeed!
<civodul>how does Guile 3 compare to 2.2 without JIT?
<wingo>civodul: see my notes from 13:07
*wingo pastes to not have to scroll up
<wingo><wingo> civodul: for fib, on lightning, without jit: using the evaluator, (fib 35) takes 10.9s. Using the compiler, 1.05s.
<wingo><wingo> with jit, using the evaluator, (fib 35) takes 6.47s, and with the compiler, 0.61s.
<wingo><wingo> by way of comparison, with 2.2 and the evaluator, 9.2s, and with the compiler, 0.88s.
<wingo>that's fairly representative. will see if i can golf the interpreted perf a bit
<wingo>i.e. compiler without jit
<civodul>oh right
<civodul>i wasn't sure if this was representative
<wingo>there are some bits of low-hanging fruit to investigate
<wingo>will need to profile