<lloda> I've tried a couple of libraries of mine that are public and a repo of private code but I don't have anything very useful to report :-(
<lloda> lightning seems about 2x or 3x slower compared to stable-2.2 in some cases, without jit
<lloda> when I tried jit on large functions, the speed was about the same
<lloda> I haven't had time to try & isolate a good small benchmark
<lloda> but a ton of what I do relies on array-ref or goes directly to C, so it makes sense that the jit wouldn't be able to help
<wingo> that's interesting! counterintuitive as well...
<wingo> you compiled lightning with -O2 I guess?
<wingo> when you tested the jit version, did you also jit-compile all the (most-called) subrs that your functions called? maybe your functions also had inner loops that guile compiled as other functions
<wingo> regarding the slowdown vs stable-2.2, i would expect some due to master/lightning having more bytecodes in general
<wingo> for me array-sum is slower without jit (1.6s vs 1.4s) but when jitted (and importantly, when array-length and array-ref are jitted) it's a little faster (1.2s)
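The array-sum test itself isn't quoted in the log; a minimal reconstruction along these lines (element type, vector size, and loop shape are my assumptions, not the actual test code) gives an idea of what is being timed:

```scheme
;; Hypothetical reconstruction of the array-sum microbenchmark
;; discussed here; the real test isn't shown in the log.
(use-modules (srfi srfi-4))   ; uniform numeric vectors (f32vector etc.)

(define (array-sum a)
  (let ((n (array-length a)))
    (let loop ((i 0) (sum 0.0))
      (if (< i n)
          (loop (+ i 1) (+ sum (array-ref a i)))
          sum))))

(array-sum (make-f32vector 1000000 1.0))
```

A loop like this leans entirely on array-length and array-ref (or f32vector-ref), which is why jit-compiling those accessors is what moves the numbers in this conversation.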
<lloda> wingo: my results on that test are stable-2.2 ~ 0.19 to 0.26, lightning no jit 0.66 to 0.73, lightning jit 0.66 to 0.72 :-/
<wingo> lloda: for your array-sum test you mean?
<wingo> how big of an f32vector are you using?
<lloda> I'm using f32vector-ref in the loop instead of array-ref actually
<lloda> have meetings :-/ but I'll be around
<wingo> for guile 2.2.3 i still get like 1.2s here
<wingo> could be that CSE patch that mark weaver found, applied in 2.2.4 but i thought it wasn't needed on master
<wingo> certainly if 2.2 manages to unbox but master doesn't, that's a bug
<wingo> that would certainly account for the difference
<wingo> indeed, that seems to be what's happening
<wingo> hum, that's not it, i think i was testing with the wrong version
<wingo> lloda: my results on that test are stable-2.2 0.44s, lightning no jit .77-.88s, lightning jit .54s
<wingo> in stable-2.2, the body of the loop is 15 instructions
<wingo> in lightning, it's 25 instructions, and has two callouts to intrinsics (add/immediate and lsh/immediate), which in 2.2 are dedicated opcodes
<wingo> the lightning branch also has 3 side exits for error conditions that are explicit control flow, which stable-2.2 doesn't have
<wingo> inter-instruction state is kept on the stack. i guess the additional instructions cause more stack traffic. probably the icache footprint is a little higher too, though who knows; array-sum compiles to a little more than 2k of code, which probably should be slimmed a little
<wingo> still, icache misses probably aren't the thing.
<lloda> those numbers are similar to what I get
<lloda> I did compile both lightning & stable-2.2 with -O2. I used gcc 8.2 for lightning and 8.1 for stable
<lloda> in the larger test I did I only jit-compiled a big function that defined small functions inside (in a let). I'll try to jit-compile the small functions instead
<outtabwz> What do i put in my ~/.guile to make the auto-compiler silent except for errors?
<outtabwz> Alternatively, how do I completely disable auto-compilation in ~/.guile?
<lloda> I don't know how to disable the auto-compilation messages. One way is to make sure everything is compiled first
<lloda> ,a compile or ,a load also reveal some variables that you can try
<outtabwz> I thought maybe ,option interp #t but it seems that only applies to the REPL, and not scripts invoked from the shell :(
<outtabwz> Putting (setenv "GUILE_AUTO_COMPILE" "0") into ~/.guile doesn't prevent auto-compile of scripts invoked from the shell either :(
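The (setenv ...) attempt fails for scripts because ~/.guile is only loaded by the interactive REPL, never by scripts run from the shell; for scripts the variable has to be set in the invoking environment (GUILE_AUTO_COMPILE=0 guile script.scm), or the scripts compiled ahead of time. Collecting the suggestions that come up in this discussion, a possible ~/.guile sketch (whether it fully silences a given Guile version is not guaranteed; see the bug report below):

```scheme
;; Sketch of a ~/.guile fragment for quieter interactive sessions.
;; ~/.guile is only read by the REPL, so none of this affects scripts
;; run from the shell; for those, set GUILE_AUTO_COMPILE=0 in the
;; environment, or compile ahead of time with `guild compile`.
(set! %load-should-auto-compile #f)
;; or, to keep auto-compilation but discard its chatter:
;; (current-warning-port (%make-void-port "w"))
```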
<wingo> outtabwz: (set! %load-should-auto-compile #f)
<wingo> but better if possible to simply compile the files ahead of time
<stis> wingo: procedure-properties is awfully slow in 2.2!
<wingo> stis: probably because all the data is stored statically in the elf in a side table; it has to grovel through the DWARF for that data
<wingo> lloda: we should be able to really speed up array-length btw
<wingo> i just stepped through it, it does a bunch of useless crap
<stis> ok, if it's not improving I will move over to hash tables.
<wingo> stis: i think that's a reasonable thing to do
<stis> wingo: what's your thought on optimizing the family of the kind (call-with-values expr (case-lambda ((x) x) (x x)))
<stis> e.g. have a speed version for 1 value return
<stis> I use this to simulate python assignment semantics regarding multiple value returns
<stis> but it really is slow, 1 million operations per second, which is an overhead for me because assignments are everywhere
<stis> it should be able to optimize (call-with-values (lambda () (+ number number)) ...) to (+ number number)
<wingo> how would you distinguish returning '(1 2) from returning (values 1 2)?
<wingo> stis: do you need to support 0-valued returns?
<stis> well, get better speed for 1 value return; multiple values are turned into a tuple, and I guess that this construct can be done quite effectively, especially for the common one value return. Of course zero becomes the empty list
<stis> which I did not think of, so to avoid that perhaps use (case-lambda ((x) x) ((x . l) (cons x l)))
<stis> in python x=... return 1,2 => x=(1,2), in guile I would expect x=1
<OrangeShark> stis: a multi value return in python is just a tuple, isn't it?
<wingo> lloda: embarrassingly, returns from interpreter -> jit weren't actually working
<wingo> it was staying in the interpreter
<stis> OrangeShark: yep but scheme uses multiple value return and I want to interoperate
<wingo> lloda: hence the 0 speedup until all leaf functions are compiled!
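The device stis describes, plus a rest-argument variant that stays cheap in the common one-value case, can be sketched as procedures taking a thunk (the names wrap and wrap-rest are mine, not from the log):

```scheme
;; stis's device as a thunk-taking procedure: one value passes through,
;; several values are collected into a list (a Python-style "tuple").
(define (wrap thunk)
  (call-with-values thunk
    (case-lambda
      ((x) x)      ; the common one-value case
      (xs xs))))   ; zero or more values become a list

;; Rest-argument variant: fast for one value; multiple values are
;; detected by testing whether x* is null.  (It rejects zero-valued
;; returns, which may or may not matter.)
(define (wrap-rest thunk)
  (call-with-values thunk
    (lambda (x . x*)
      (if (null? x*) x (cons x x*)))))

(wrap (lambda () 42))             ; => 42
(wrap (lambda () (values 1 2)))   ; => (1 2)
```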
<stis> it's possible now, but a bit too inefficient for my taste, especially since it's an optimization away from good bytecode
<stis> maybe I bite the bullet and just implement it as tuples
<amz3> stis: yes, that's what I was going to say.
<amz3> stis: you want to implement a fast python on top of the guile vm?
<stis> first of all, I want good interoperability and a nice scheme interface. Then speed.
<stis> now it's 50x slower than cython and I could perhaps improve that by a factor of 10 if I could get this device working smoothly
<stis> (set! x (wrap expr)), wrap is the device and is (lambda (x) (call-with-values (lambda () x) (case-lambda ((x) x) (x x))))
<stis> Currently it will make a closure and a lambda and then call a primitive with those values. Would be nice with some better optimization rules for this case
<amz3> why do you compare with cython in particular? do you have a python library in mind that you would like to be able to use?
<stis> I compiled the random module with cython and ran one of the tests there just to see how it performs
<stis> note the random.py module
<wingo> i still don't know how you plan to distinguish returning '(1 2) from (values 1 2)
<stis> you don't, it's a projection of the two concepts: you return either (values 1 2) or (list 1 2); both are identical in python but not in scheme, so the expectation of the pythonista would not be broken
<stis> still, if you in python return values, they would in scheme land be distinguished as not a tuple
<wingo> lloda: interestingly, inlining the fast path of "<?" in the VM gets a good perf boost... i wonder if we will have to go back to inlining fast paths for a number of things (both in interpreter and jit)
<wingo> stis: in that case i think your best speed bet is (call-with-values foo (lambda (x . x*) ...))
<wingo> it will be fast in the 1-value case, and for multiple values you can test if x* is null
<stis> and compilation is also faster
<lloda> for array-length I think we only need to check the type; if it's an actual array we don't need the handle, and if it's a xxxxvector I don't think we need the handle either, if all the types put the length in the same place
<lloda> but I think all of that is eventually going to Scheme, so...
<wingo> btw i found a large part of the issue
<wingo> for that array-sum test, previous numbers were:
<wingo> stable-2.2 0.44s, lightning no jit .77-.88s, lightning jit .54s
<wingo> numbers now lightning no jit 0.60s, lightning jit 0.32s
<wingo> was a problem in the intrinsics, they were missing a fast path
<lloda> will the jit handle a case-lambda, or should I jit-compile each of the cases?
<lloda> my timings for the array-sum tests are now stable-2.2 0.18, lightning no jit 0.39, lightning jit 0.11
<wingo> lloda: it will handle the case-lambda
<wingo> 0.39 vs 0.18 is not nice, should try to improve that; it is possible though that interpreted throughput can be less than 2.2
<wingo> yay, prompts and aborts compile well
<outtabwz> wingo: Still getting the "auto-compilation" notice even with (set! %load-should-autocompile #f) in ~/.guile
<stis> my little wishlist: let/ec compiles to just gotos in case of (let/ec ret (if p? (ret 1)) (ret 2))
<wingo> stis: need a compiler pass to do that, to contify prompts
<wingo> condition is that the prompt tag is fresh and that it doesn't escape the procedure it's in
<outtabwz> wingo: guile (GNU Guile) 2.2.4 built from source yesterday
<wingo> outtabwz: i think that's a bug then, please send a mail to bug-guile@gnu.org
<wingo> alternately of course you can rebind the current warning port, if you just want the messages to go away
<wingo> (current-warning-port (%make-void-port "w"))
<outtabwz> wingo: I don't want to disable all warnings.
<outtabwz> I just don't want the verbose chatter every time I update my program.
<rekado> hmm, just found out that mailutils won't help me process a multipart message.
<rekado> it can get me the parts and tell me that this is a multipart message, but I think I could do this without mailutils.