IRC channel logs

<ijp>sneek: later tell wingo if by some chance you haven't already, check out "secrets of the glasgow haskell compiler inliner"

<sneek>Will do.

<ijp>morning

<mark_weaver>ijp: thanks for pushing the peval fix. feel free to push your ,br fix also.

<mark_weaver>although actually, I'm about to push some things, so maybe wait a few minutes :)

<ijp>hmm, I thought I had, but it looks like I didn't

<mark_weaver>ijp: okay, I pushed my patches. feel free to push your ,br fix.

<zRecursive>noon

<ijp>okay pushed

<mark_weaver>thanks!

<ijp>wow, someone is trying to get geiser to work with scsh

<ijp> https://github.com/rmloveland/geiser-scsh

<mark_weaver>ijp: would you like to close the bug also? http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15691

<ijp>I will do

<civodul>Hello Guilers!

<ijp>morning

<nalaginrut>heya

<mark_weaver>moin

<ArneBab>moin

<mark_weaver>The 'r7rs-wip' branch now has a fairly complete implementation of R7RS for stable-2.0, which passes the R7RS test suite from chibi scheme. Apart from lack of docs and tests, the main remaining issue is that the modified 'write' which supports SRFI-38 is much slower.

<civodul>neat!

<civodul>post a summary

<civodul>and benchmarks :-)

<mark_weaver>heh, well, printing (iota 1000000) is about 10 times slower on my machine. most of that time is spent in GC.

<mark_weaver>anyway, I'd rather wait until it's really ready. also, I'm deleting and recreating this branch periodically, so beware.

<civodul>due to a hash table or something?

<civodul>ok

<mark_weaver>yeah, it's mostly the hash table.

<civodul>ok

<civodul>anyway that's great news

<mark_weaver>I think I'll end up writing a miniature hash table implementation that does no allocation in the common case.

<mark_weaver>not ideal, but I can't think of a better way.

<mark_weaver>anyway, time for me to sleep. happy hacking, all!

*mark_weaver --> zzz

<taylanub>Hrm, if I import binding `foo', then (define bar (let ((priv-foo foo)) (lambda () (use priv-foo)))), then my `priv-foo' won't see changes in the top-level `foo', and will be better optimizable, right ?

<taylanub>If so, this is essentially "static linking"! (Of a single binding.)

<taylanub>But no, a module's body surely isn't executed at compile-time .. does that `let' take effect the first time the module is loaded, perhaps ?

*taylanub experiments

<civodul>just stumbled upon this network sniffer by rixed et al.: https://github.com/rixed/junkie

<civodul>seems to use a fair amount of Guile

<jemarch>hi

<ArneBab>hi jemarch

<ArneBab>Is there a way to define a variable as uint32, so bitshift works as in C?

<ArneBab>I implemented the tiny encryption scheme in guile scheme, but my implementation will be horribly slow, because the original algorithm uses properties of bitshift on uint32.

<ArneBab>references: C-code: http://de.wikipedia.org/wiki/Tiny_Encryption_Algorithm#Referenzcode

<ArneBab>my reimplementation: https://bitbucket.org/ArneBab/wisp/src/958b9510288c50bef32efacf5d81f0bdfd8720af/examples/tinyenc.w?at=default

<ArneBab>essentially what I need to do is to run (modulo number 2**32) every time I do anything which could increase the number.

<ArneBab>(need to do→currently do)

<ArneBab>and that feels horrible…

<taylanub>The fixnum API would be the right thing here, IIRC Guile has no fixnums yet ?

<taylanub>Oh, we have it in (rnrs arithmetic fixnums (6)), but no idea if it uses optimized stuff.

<civodul>it doesn't

<civodul>but just do bitshift & mask, no?

<taylanub>And so I just realized that procedures could generally be optimized at creation-time (run-time) to specialize for the types of objects which were unknown at compile-time ..

<taylanub>Not sure if that falls under JIT already, I'm thinking of the simple case where given (define (make-proc foo) (lambda () (use foo))), (make-proc <compile-time-constant>) can be optimized to return a procedure that's specialized on the type of foo, omitting type-checks and all, whereas (make-proc variable) will generally return an unspecialized procedure.

<taylanub>A procedure is generally called more often than it's created, so I wonder if it would be worthwhile to always optimize at procedure-creation (no heuristics and stuff like it's usually in JIT engines AFAIK).

<taylanub>Well leaving aside vague terms like JIT, it certainly involves run-time code generation/modification.

<taylanub>So I think it would be best after all to have a specialized "static import" that imports an immutable copy of a binding from the current bindings of a module, instead of implementing this complex run-time optimization idea.

<ArneBab>civodul: nice - thanks!

<ArneBab>taylanub, civodul: sorry for answering late

<ArneBab>taylanub: thanks!

<ArneBab>taylanub, civodul: how do I create a fixnum?

<ArneBab>scm_t_uint32?

<ArneBab>(no, that’s for C)

*ArneBab is reading the docs right now

<ArneBab>the fixnums seem to be 64bit … (fixnum-width) → 61

<mark_weaver>62 bits when you include the sign bit.

<mark_weaver>but that's only on 64-bit platforms.

<ArneBab>ah, yes

<ArneBab>mark_weaver: what I search for is a way to tell scheme to treat a given number as uint32 (because an algorithm requires that)

<ArneBab>calling modulo all the time eats cycles :)

<ArneBab>besides: the manual page on elisp and nil¹, states that the page Compilation² tells me how to activate the warning in case of equality comparisons with #f, '() or nil, but on that page I do not find it. ¹: http://www.gnu.org/software/guile/manual/html_node/Nil.html#Nil ²: http://www.gnu.org/software/guile/manual/html_node/Compilation.html#Compilation

<mark_weaver>scheme doesn't have that, nor does guile.

<mark_weaver>(the uint32 thing)

<ArneBab>ok, so I’m currently limited to my workaround

<mark_weaver>'remainder' might be faster, and if the arguments is always non-negative then it's the same as modulo.

<mark_weaver>I think remainder might be a VM op, and modulo not.

<mark_weaver>no, nevermin

<mark_weaver>modulo has a VM op too.

<ArneBab>the cost I currently pay is that I have to call (modulo number (integer-expt 2 32)) on every math operation

<mark_weaver>if you're doing exponentation, then see 'modulo-expt', which is *much* faster than exponentiating and then doing modulo.

<mark_weaver>well, I hope you're precomputing the (integer-expt 2 32), right?

<mark_weaver>and preferably it's a lexical variable.

*ArneBab hides in very dark shadows… (re precomputing)

<mark_weaver>also, depending on the operations you're doing, you probably don't *need* to do the modulo after every operation. just often enough to make sure that the intermediate numbers don't get too big.

<mark_weaver>well, I'll tell you right now that precomputing 2^32 will be a *lot* faster.

<mark_weaver>that's a non-trivial operation.

<ArneBab>that definitely, yes…

<ArneBab>(likely the biggest performance leak right now - I did not actually profile the function: I wanted to see whether I can make the code more “natural” to the algorithm

<civodul>fixnums are a performance hack

<mark_weaver>anyway, if you need speed for something like uint32 math, then you should use C for that, as much as I try to avoid C these days.

<ArneBab>I just want to implement a C-based algorithm as natually as possible

<ArneBab>naturally

<ArneBab>and the algorithm uses the properties of uint32, so I need to emulate that: http://en.wikipedia.org/wiki/Tiny_Encryption_Algorithm#Reference_code

<mark_weaver>rather than precomputing, perhaps you should just put in the numeric constant itself, written as #x100000000

<ArneBab>that looks better, yes

<ArneBab>#b1000. righth?

<ArneBab>#b100000

<ArneBab>ah, no, sorry…

<ArneBab>(mixed it up)

<mark_weaver>You can compute the entire subexpression ((v1<<4) + k0) ^ (v1 + sum) ^ ((v1>>5) + k1) before taking the modulo.

<mark_weaver>it will always be the same, and much faster, than taking the modulo after every suboperation.

<mark_weaver>ditto for the other subexpressions that look like that.

<ArneBab>hm, yes, because the xor of the upper bits is irrelevant

<mark_weaver>(logand x #xFFFFFFFF) might be faster, and is the same as (modulo x #x100000000) if x is non-negative.

<ArneBab>that then actually operates on the bits…

<mark_weaver>also, don't use mutation. write a proper loop in scheme style. mutable variables are slower to access.

<mark_weaver>rather than taking modulo/logand after computing the subexpression ((v1<<4) + k0) ^ (v1 + sum) ^ ((v1>>5) + k1), instead, compute the sum of that plus the variable on the left, before taking modulo/logand.

<mark_weaver>v0 and v1 are the only things that really need to be kept within range before chopping off the top bits.

<mark_weaver>(the reason the top bits matter for v0 and v1 are because they are right-shifted)

<mark_weaver>s/are because/is because/

<ArneBab>mark_weaver: this is what I currently do: https://bitbucket.org/ArneBab/wisp/src/tip/examples/tinyenc.w

<ArneBab>yes

<ArneBab>uhm, reload that URL - I just pushed

<ArneBab>I think I also have to throw away the upper bits of the result of the subexpression, because that gets substracted from v0 or v1.

<ArneBab>If I don’t, I might get stray 0-values.

<mark_weaver>I don't think so.

<civodul>oh, (let ((x (values))) x) is optimized to (values), but it shouldn't

<mark_weaver>in general, in any expression involving only addition, subtraction, and multiplication (which includes left shift) you can postpone the modulo until after the expression.

<mark_weaver>right shifts are the only problem operation here.

<ArneBab>I just did a performance test of before implementing your optimizations and after it, and they improve the speed by more than factor 3

<mark_weaver>btw, that's true for any modulus, not just powers of two.

<ArneBab>I have a right-shift in both expressions… but I only substract them when decrypting, not when encrypting, so I can get rid of one more modulus for encryption.

<mark_weaver>you should probably use 'define-inlinable' for those helper routines like uint32 and v0change/v1change.

<mark_weaver>you can do the same for decryption.

<mark_weaver>(x-y) mod N is the same as (x mod N) - (y mod N)

<mark_weaver>ditto for + and *

<mark_weaver>you only need to take modulo before updating v0 and v1.

<mark_weaver>those are the only variables that need to be kept in range.

<ArneBab>I see it, yes…

<mark_weaver>anyway, I have to go afk for a while. happy hacking!

<ArneBab>thanks!

<ArneBab>mark_weaver: could I ask you something about wisp in a few hours?

<mark_weaver>you still have (integer-expt 2 32) in the loop.

<ArneBab>missed that - thanks!

<mark_weaver>also, it would be good to get rid of all top-level variable references within the loop.

<mark_weaver>anyway, ttyl!

<mark_weaver>(yes, you can ask later)

<ArneBab>thanks!

<ArneBab>getting rid of the toplevel variables in the loop did not change much.

<ArneBab>all in all, the optimizations bring a factor of 5

<ArneBab>the C-code still runs 100x faster, though.

<ArneBab>I tried ,profile, but it always replies “no samples recorded”.

<ArneBab>To reproduce: $ guile → ,pr (+ 1 2)

<davexunit>ArneBab: too quick of an operation to sample?

<ArneBab>it takes 0.0005s or so…

<ArneBab>davexunit: ah, yepp: If I just execute the code 10000 times, it works.

<ArneBab>thanks!

<davexunit>and I learned about the ,pr command!

<ArneBab>:)

<ArneBab>,help profile

<ArneBab>(for more)

<davexunit>yeah I did a busy loop and cranked up the loops until it could gather reasonable data: ,pr (let loop ((i 0)) (if (< i 10000000) (loop (1+ i)) i))

<davexunit>guile never ceases to impress me.

<ArneBab>it’s pretty impressive to have the profiler available like that

<davexunit>yeah. maybe racket or some other scheme can tout better tools, but compared to most other languages guile blows them away in terms of interaction.

<ArneBab>In python this would be python -m profile somescript.py

<davexunit>can you profile from the python repl, though?

<davexunit>of course other languages have some great profiling, too, but I just find things to be more pleasant about being able to do so many different things right from the REPL.

<ArneBab>yepp, you can: http://docs.python.org/3/library/profile.html

<davexunit>that sentence didn't come out right, but you get it.

<ArneBab>but it’s not just ,pr (something)

<davexunit>oh, cool.

<davexunit>I'm a python fan, too.

<davexunit>I just prefer the land of lisp.

<ArneBab>it’s import cProfile; cProfile.run("1 + 1")

<ArneBab>I really like both, too.

<ArneBab>1/5th of my runtime are just calls to logxor, so there’s a clear upper limit to how much faster I could get this ☺

<mark_weaver>civodul: any idea why hydra.nixos.org hasn't been doing evaluations of guile-2-0 since November?

<mark_weaver>ah, it says "last checked 2014-01-09 ..., with errors!"

<mark_weaver>civodul: could you take a look?

<civodul>mark_weaver: see http://hydra.nixos.org/jobset/gnu/guile-2-0#tabs-errors ; i'll report it

<mark_weaver>thanks!

<mark_weaver>civodul: on the guile-master job, what do you make of the "attempt to call something which is not a function but a set" errors?

<mark_weaver> http://hydra.nixos.org/jobset/gnu/guile-master

<civodul>mark_weaver: oh, lemme check

<civodul>see, error reporting is a difficult part of PL implementations...

<ArneBab>mark_weaver: on wisp: it can now parse macros and I added bootstrapping via autotools and a minimal testsuite (just checking the output against preconverted files). Do you see things it is still missing to correctly convert all scheme syntax?

<ArneBab>…damn, gone…

<mark_weaver>civodul: you said that long ago, you and wingo had a disagreement about whether #vu8(1 2 3) should be equivalent to #u8(1 2 3). Do you remember which side of that argument you took?

<civodul>my side :-)

<mark_weaver>the reason I ask is that I need to talk to whoever thought they should be distinct.

<civodul>lemme see if i can find it

<mark_weaver>so I'm wondering if I need to try to contact wingo or not.

<mark_weaver>sneek: seen wingo?

<sneek>I last saw wingo on Dec 13 at 12:09 pm UTC, saying: that is a test of the stack :) test on loops and the difference is much more.

<civodul>mark_weaver: http://lists.gnu.org/archive/html/guile-devel/2009-07/msg00076.html

<mark_weaver>civodul: thanks!

<civodul>a bit of time without meeting him, indeed

<civodul>i think there was more discussion at another point in time, but i can't find it

<mark_weaver>I think I found this thread before, but couldn't easily find the disagreement you mentioned. I'll look more closely this time.

<civodul>yeah

<mark_weaver>civodul: having looked more closely, I still can't find the disagreement you mentioned in that thread.

<civodul>maybe it was just on IRC?

<mark_weaver>Andy mentions the need to do type dispatch on SRFI-4 vectors though, e.g. distinguishing u8vector?, s8vector?, etc.

<mark_weaver>do you remember which position you took?

<civodul>"the one thing that I'm not fond of is the switch from disjoint SRFI-4 types to polymorphic types"

<civodul>i was more in favor of keeping types disjoint

<civodul>whereas Andy advocated something more flexible

<civodul>and i think we ended up with something in the middle

<mark_weaver>okay

<civodul>so you can view any SRFI-4 vector as a bytevector

<civodul>but you cannot take an u8vector as an f64vector, say

<civodul>i think

<mark_weaver>yeah, I can see the rationale for distinguishing most of those element types. I just don't see why #vu8(...) should be different than #u8(...)

<civodul>different in what sense?

<mark_weaver>well, the real problem I have is that #vu8(1 2 3) is not 'equal?' to #u8(1 2 3).

<mark_weaver>it's a problem because R6RS specifies the #vu8(...) syntax, and R7RS specifies the #u8(...) syntax.

<mark_weaver>most of the bytevector procedures we have create #vu8(...), which causes problems with R7RS.

<mark_weaver>I guess I'll have to wait until I can talk to wingo about this.

<mark_weaver>I have a patch that hacks 'equal?' to consider them equal, but I'm not sure it's the best fix. it seems better to just unify those two element types.

<civodul>ok

<civodul>yeah

<civodul>it would make sense

<civodul>perhaps you could post that and Cc Andy, and wish him a happy new year

<civodul>:-)

<mark_weaver>yes, I think that's the next step. thanks for the info :)

<ijp>sneek is probably full of messages

<dsmith-work>sneek is a good boy

IRC channel logs

2014-01-09.log