IRC channel logs

2016-06-23.log

<ArneBab_>wingo: I want to use Guile to check whether a file contains null entries.
<ArneBab_>because if there are null bytes in there (\x00 → #\null), then the file was corrupted
<ArneBab_>checking eof-object? is pretty slow
<ArneBab_>,profile (let loop ((i 10000000)) (when (< 0 i) (eof-object? the-eof-object)(loop (1- i))))
<ArneBab_>;; ^ 6 seconds
<ArneBab_>,profile (let loop ((i 10000000)) (when (< 0 i) (equal? the-eof-object the-eof-object)(loop (1- i))))
<ArneBab_>;; ^ 1 second
<wingo>ArneBab_: then set it to ISO-8859-1 and read-delimited with a delimiter of #\nul
<wingo>i guess you don't need the contents tho...
<ArneBab_>wingo: no, I just need to know whether there is #\null in the file
<wingo>anyway for something like that i would use bytevectors because there's no need to decode any character
<wingo>you're literally looking for a byte that is 0
<ArneBab_>something like bytevector-index bv 0
<wingo>so do a loop of get-bytevector-n! and then looping over the bv's bytes :)
<wingo>fastest thing i can think of :)
<ArneBab_>I already used (array-for-each nullbyte?! bv)
<ArneBab_>is there a faster way to map a function onto a bytevector?
<wingo>that's going to be slow :) you want to make sure it's inlined
<ArneBab_>(map) didn’t work
<ArneBab_>how can I do that?
<wingo>(let lp ((n 0)) (and (< n (bytevector-length bv)) (not (= (bytevector-u8-ref bv n) 0)) (lp (1+ n))))
<wingo>anyway something like that
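
The inlined loop suggested above can be written out as a small procedure. This is only a sketch, assuming Guile with `(rnrs bytevectors)`; the name `bytevector-zero-index` is made up for illustration. It returns the index of the first zero byte rather than a plain boolean, so that "found a zero" and "reached the end" give different results.

```scheme
(use-modules (rnrs bytevectors))

;; Return the index of the first zero byte in BV, or #f if there is none.
;; `bytevector-zero-index` is a hypothetical name, not a Guile built-in.
(define (bytevector-zero-index bv)
  (let ((len (bytevector-length bv)))
    (let lp ((n 0))
      (and (< n len)
           (if (= (bytevector-u8-ref bv n) 0)
               n                  ; found a zero byte: report where
               (lp (1+ n)))))))   ; otherwise keep scanning

(display (bytevector-zero-index #vu8(65 66 0 67))) ; prints 2
(newline)
(display (bytevector-zero-index #vu8(65 66 67)))   ; prints #f
(newline)
```

Because only `#f` is false in Scheme, an index of 0 still counts as a hit when the result is used as a condition.
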
<ArneBab_>I’ll try that
<ArneBab_>wingo: do you know whether eof-object? has to be slow for good reasons or whether it’s just an implementation issue?
<ArneBab_>when I just use ,time, there’s a factor of 10 between (equal? the-eof-object char) and (eof-object? char)
<wingo>impl issue
<ArneBab_>ok, thanks!
<ArneBab_>that got me down to 6 seconds
<ArneBab_>I’m missing the mark by only about factor 12 now
<ArneBab_>I can read the data into a bytevector in 0.5 seconds, but I need 5 seconds to check whether there’s a 0 in the bytevector
<ArneBab_>(even using (and ...))
<wingo>what mark are you missing :)
<wingo>?
<ArneBab_>the mark of beating my Python script :)
<wingo>how does the python script work?
<ArneBab_>if "\0" in f.read()
<wingo>haha, nice
<wingo>paste your guile script?
<ArneBab_> https://bpaste.net/show/13058c79259c
<wingo>you could of course read the whole string as latin-1 then use srfi-13 string-index
<ArneBab_>does latin-1 include null?
<wingo>yes
<wingo>in latin-1, codepoints 0-255 map to bytes 0-255
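
The Latin-1 approach can be sketched roughly as follows, assuming Guile's `(ice-9 rdelim)` `read-string` and the core `string-index`. The procedure names `port-has-nul?` and `file-has-nul?` are made up for this sketch. Since ISO-8859-1 maps every byte to exactly one character, a zero byte shows up as `#\nul` in the decoded string.

```scheme
(use-modules (ice-9 rdelim)         ; read-string
             (ice-9 binary-ports))  ; open-bytevector-input-port (demo only)

;; Decode everything left in PORT as ISO-8859-1 and look for #\nul.
;; Hypothetical helper; note that it reads the whole input into memory.
(define (port-has-nul? port)
  (set-port-encoding! port "ISO-8859-1")
  (and (string-index (read-string port) #\nul) #t))

;; Same check for a file name.
(define (file-has-nul? path)
  (call-with-input-file path port-has-nul?))

(display (port-has-nul? (open-bytevector-input-port #vu8(65 0 66))))   ; prints #t
(newline)
(display (port-has-nul? (open-bytevector-input-port #vu8(65 200 66)))) ; prints #f
(newline)
```

Most of the work here happens inside primitives implemented in C, at the price of holding the whole decoded string in memory.
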
<wingo>how big is your file again?
<ArneBab_>currently each file is about 150 MiB
<ArneBab_>but that’s not guaranteed…
<wingo>well that means you are doing each byte check in 40ns, i suspect that is approaching the limit of a bytecode byte-by-byte search, dunno without looking at the assembly
<ArneBab_>I wonder how python gets it done that quickly…
<wingo>for me i would read in the bytevector in smaller chunks (2-4KB) and iterate over those chunks, but i dunno
<wingo>python uses C :)
<wingo>there you have an instant factor of 5 or so
<wingo>maybe more because you get a nice tight loop
<ArneBab_>it might just have a fastpath for `x in string`
<ArneBab_>But I’ll try latin1 and srfi-13
<wingo>that's probably the best bet in guile
<wingo>i hate coding when you pick and choose the primitives implemented in c :P
<wingo>> ,time (string-index (make-string 150000000 #\q) #\z)
<wingo>$5 = #f
<wingo>;; 1.301566s real time, 1.307142s run time. 0.733638s spent in GC.
<ArneBab_>so it could be able to go down to ~1s
<wingo>civodul: quick q on http://debbugs.gnu.org/19540. in relativize-path we are basically looking to see if any element of %load-path is a prefix of a file's canonicalized path, and if so, strip off the prefix. it seems that we were just missing a step to try to canonicalize the element of %load-path. any objections if i push such a fix to master? (with tests)
<wingo>i think we could probably push to 2.0 as well, dunno
<wingo>ArneBab_: i assume you're on 2.1.3?
<ArneBab_>I have 2.0 and a 2.1
<wingo>the srfi-13 solution will work equally well on both, but the bytevector-u8-ref loop will be much better on 2.1
<wingo>civodul: (the fix i propose is trying to canonicalize the %load-path element inside scm_i_relativize_path, not actually changing %load-path)
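
The prefix-stripping idea described for relativize-path can be illustrated at the Scheme level. This is not Guile's actual code (the real logic lives in C, in scm_i_relativize_path); it is a simplified sketch in which `relativize-path*` and its `canonicalize` argument are hypothetical. In real use one would presumably pass Guile's `canonicalize-path`, which requires the directories to exist, so the demo defaults to the identity procedure.

```scheme
(use-modules (srfi srfi-1))   ; any

;; CANON-PATH is an already canonicalized file name.  Each LOAD-PATH
;; element is canonicalized before the prefix test, which is the step
;; the proposed fix adds.  The load path itself is left unchanged.
;; Simplification: elements are assumed to have no trailing slash.
(define* (relativize-path* canon-path load-path
                           #:optional (canonicalize (lambda (dir) dir)))
  (or (any (lambda (dir)
             (let ((prefix (string-append (canonicalize dir) "/")))
               (and (string-prefix? prefix canon-path)
                    (string-drop canon-path (string-length prefix)))))
           load-path)
      canon-path))   ; no element matched: return the path unchanged

(display (relativize-path* "/usr/share/guile/2.0/ice-9/boot-9.scm"
                           '("/opt/site" "/usr/share/guile/2.0")))
(newline)   ; prints ice-9/boot-9.scm
```
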
<wingo> http://debbugs.gnu.org/rrd/guile.html
<ArneBab_>(and ...) is ~factor 5 faster than read-string + string-index, already with 2.0
<wingo>civodul: i will push to master but we can revert if it's the wrong thing :)
<civodul>heh
<wingo>i am trying out "optimistic concurrency" :)
<wingo>feel very free to revert anything! no hard feelings
<civodul>it's been a year and a half since i asked for feedback, and now i cannot page it in a 5 minutes
<civodul>s/a/in/
<wingo>yeah understood, i sent a mail as well so you can process at your leisure
<civodul>cool :-)
<civodul>thanks for addressing it!
<wingo>the other option i think would be to send mails first and then like build consensus or something
<wingo>i can do that, i just expect that the result will be fewer bugs fixed, due to time constraints and such; dunno
<wingo>let me know please if you think a different strategy would be better for guile
<civodul>as you write building a community takes time and lots of communication
<civodul>i'll refrain from making concrete suggestions here, tho
<wingo>why refrain? :)
<wingo>anyway i am always interested in your thoughts. sorry for snapping at you the other day :/
<civodul>yeah
<lloda>are doc/maint/guile.texi and doc/guile-api.alist maintained by hand?
<lloda>comments say snarfed but the line numbers don't match
<wingo>i have no idea what is going on with those things
<wingo>i think they are vestigial
<lloda>so can we just remove them?
<wingo>probably yes, unless they contain valuable lore from our elders
<wingo>i think i finally fixed that smob mark/finalizer race
<wingo>gc is hard!
<lloda>having unused cruft around makes it harder to navigate the source. I would remove doc/maint at least, seems to be more ruins than lore
<wingo>lloda: sgtm
<wingo>feel free to push a removal
<lloda>directly to master?
***rubdos_ is now known as rubdos
<wingo>lloda: sure
<lloda>done
<ArneBab_>wingo: with 2.1.3 I get down to 2.8s
<wingo>ArneBab_: that's pretty good for scheme then, a factor of 6 off from C
<ArneBab_>assuming that adding guile-*/meta to PATH suffices
<ArneBab_>(to get the full performance)
<ArneBab_>wingo: here’s the comparison between guile version: http://paste.lisp.org/display/319031
<wingo>neat
<ArneBab_>with cold disk buffers (and only that matters for the usecase, though the development box has about a factor 10 faster disks) the Python version needs 11s.
<ArneBab_>with hot buffers Python is down to 0.28s
<ArneBab_>Guile is tested with hot disk buffers
<ArneBab_>(I emptied the disk buffers by simply running a script which uses so much memory that it is killed by the kernel…)
<ArneBab_>(this is now for exactly the same task)
<wingo>for best speed, repeatedly read into a 4 KB buffer instead of reading the whole thing.
<ArneBab_>I’ll try that
<wingo>lloda: tx. fwiw next time remember the GNU-style changelog :)
<ArneBab_>wingo: does get-bytevector-some automatically choose the ideal buffer size?
<wingo>yes but it creates a new bytevector each time
<wingo>better make a buffer and fill with get-bytevector-n!
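
The reusable-buffer approach can be sketched like this, assuming Guile's `(ice-9 binary-ports)` and `(rnrs bytevectors)`. The name `port-contains-zero-byte?` and the 4096-byte buffer size are choices made for this sketch, not something taken from the pasted scripts.

```scheme
(use-modules (rnrs bytevectors)
             (ice-9 binary-ports))

;; Fill one fixed 4 KiB buffer over and over with get-bytevector-n!
;; and scan only the bytes that were actually read.  Returns #t as
;; soon as a zero byte is seen, #f when the port is exhausted.
(define (port-contains-zero-byte? port)
  (let ((buf (make-bytevector 4096)))
    (let next-chunk ()
      (let ((count (get-bytevector-n! port buf 0 (bytevector-length buf))))
        (and (not (eof-object? count))
             (or (let lp ((n 0))
                   (and (< n count)
                        (or (= (bytevector-u8-ref buf n) 0)
                            (lp (1+ n)))))
                 (next-chunk)))))))

(display (port-contains-zero-byte?
          (open-bytevector-input-port #vu8(1 2 3 0))))  ; prints #t
(newline)
(display (port-contains-zero-byte?
          (open-bytevector-input-port #vu8(1 2 3))))    ; prints #f
(newline)
```

For a file, something like `(call-with-input-file path port-contains-zero-byte? #:binary #t)` should work. Unlike reading everything at once, this stops at the first zero byte and keeps memory use constant.
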
<lloda>drat, sorry
<ArneBab_>wingo: get-bytevector-all is ~15% faster than iterative get-bytevector-n 4096
<wingo>you are using get-bytevector-n! right?
<wingo>with the bang
<ArneBab_>wingo: this is the code: http://paste.lisp.org/display/319031#1
<ArneBab_>wingo: I’m using the one with a bang, yes
<ArneBab_>the version here reads 4MiB, but the 4KiB version had roughly the same speed
<ArneBab_>each command is run twice to avoid counting the compile time
<GreySunshine>Hello all, I am Vasanth
<GreySunshine>I am finding it difficult to debug the code that I've written, please help. link: http://pastebin.com/Xx7KLHFF
<GreySunshine>I use guile
<wingo>ooooooohhhhhh my god
<wingo>holy crap
<wingo>who wants to play spot the bug? :)
<wingo>come on i need a taker
<wingo>maybe i can get civodul to play
<wingo>lloda perhaps?
<wingo> http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/conv-integer.i.c#n108 <- this is the implementation of scm_from_int64. spot the memory leak :)
<wingo>no one? :)
<davexunit>I'm still waking up :)
<wingo>it took me a long time :)
<wingo> http://git.savannah.gnu.org/cgit/guile.git/commit/
<wingo>er
<wingo>well it's that commit right now
<wingo> http://git.savannah.gnu.org/cgit/guile.git/commit/?id=2c8ea5a008959ffba629694942d75887dc14a869
<efraim>"Fix a big in which"
<ArneBab_>GreySunshine: what should your code do and what does it do instead?
<ArneBab_>wingo: why did scm_double_cell leak memory?
<ArneBab_>wingo: (the original link points to the fixed file)
<GreySunshine>ArneBab_: my code should return a pair, instead it gives me error messages
<GreySunshine>* display a pair
<ArneBab_>could you include the error messages?
<wingo>ArneBab_: in 2.0 we switched to bdw-gc
<GreySunshine>Backtrace:
<GreySunshine>In ice-9/boot-9.scm:
<GreySunshine> 157: 8 [catch #t #<catch-closure 559f24fcfa60> ...]
<GreySunshine>In unknown file:
<GreySunshine> ?: 7 [apply-smob/1 #<catch-closure 559f24fcfa60>]
<GreySunshine>In ice-9/boot-9.scm:
<GreySunshine> 63: 6 [call-with-prompt prompt0 ...]
<GreySunshine>In ice-9/eval.scm:
<GreySunshine> 432: 5 [eval # #]
<GreySunshine>In ice-9/boot-9.scm:
<GreySunshine>2401: 4 [save-module-excursion #<procedure 559f24feba00 at ice-9/boot-9.scm:4045:3 ()>]
<GreySunshine>4052: 3 [#<procedure 559f24feba00 at ice-9/boot-9.scm:4045:3 ()>]
<GreySunshine>In unknown file:
<GreySunshine> ?: 2 [load-compiled/vm "/home/nightfury/.cache/guile/ccache/2.0-LE-8-2.0/home/nightfury/Code/generic.scm.go"]
<wingo>ArneBab_: which requires attaching a finalizer to free auxiliary memory
<wingo>the double cell gets collected just fine
<wingo>the problem is the mpz memory
<ArneBab_>GreySunshine: for the next time please copy the error message into a pastebin (like your code)
<wingo>the code in numbers.c was fixed to attach a finalizer
<wingo>but we forgot to update the conv-i-integer files
<ArneBab_>that sounds like a nasty bug to find… thank you for fixing it!
<GreySunshine>yes I will I'm sorry!
<davexunit>just don't use pastebin.com
<davexunit>paste.lisp.org is preferable
<ArneBab_>davexunit: what’s the problem with pastebin?
<davexunit>ArneBab_: they block Tor users
<ArneBab_>did they say why?
<davexunit>I don't know the reason, I just know that they do.
<davexunit>and several people here browse with Tor and won't be able to view someone's pastebin.com entry
<davexunit>paste.lisp.org is better for lisp stuff, anyway
<ArneBab_>ok — I could think of many valid reasons to block Tor users from posting something (though not for blocking them from reading)
<davexunit>I've never seen a good reason to block Tor
<ArneBab_>GreySunshine: do I understand it correctly that what you create as complex number a is essentially a list '(complex rect 2 2)?
<GreySunshine>its a nested pair
<GreySunshine>looks like this: (complex rect 2 . 2)
<ArneBab_>GreySunshine: this is what gets the procedure to run: (get (cons 'add (type a)))
<ArneBab_>it returns ((add . complex) . #<procedure +complex (obj-x obj-y)>)
<GreySunshine>yes, that is accurate
***heroux_ is now known as heroux
<GreySunshine>I get it, all I need is the procedure but I've used the whole thing
<ArneBab_>yepp
<GreySunshine>It compiled successfully, but the output is wrong
<GreySunshine>I found the error it was in +c procedure, thanks so much
<ArneBab_>GreySunshine: nice!
<ArneBab_>GreySunshine: happy hacking!
***heroux_ is now known as heroux
<dsmith-work>Morning Greetings, Guilers
<wingo>dsmith-work: check that interesting bug above with scm_from_int64 :)
<dsmith-work>wingo: Ya!
<dsmith-work>wingo: So has that leak been in there since moving to libgmp or since moving to libgc ?
<wingo>since moving to libgc
<wingo>but only on 32-bit systems :/
<wingo>i think
<wingo>yeah
<wingo>or more precisely, only on systems with 32-bit longs
<dsmith-work>Nice find
<galex-713>wingo: will one day bootstrapping be quicker? :/ or at least not eating 200% cpu all the time?
<wingo>nope :) use a tarball or some source for prebuilt binaries if you want better speed
<wingo>some things might be faster when we do native compilation, but the compiler will also have to do more, so i dunno
<galex-713>wingo: where do we find prebuilt binaries yet?
<linas>wingo -- I just received a thread-cancellation bug report for guile (guile crashes) and am wondering if I should bother reporting to you/guile team
<linas>I'm not sure it's worth fixing; it involves SMOBs so the test case isn't easy.
<linas>I told the reporter that thread cancellation sucks and don't do that.
<wingo>galex-713: in the tarball, prebuild/
<wingo>cancellation!
<wingo>holy jeepers
<wingo>i dunno, if you have a test case, sure, otherwise i dunno
<galex-713>wingo: ah ok
<lloda>wingo: i'll give a shot to the mac thing
<lloda>last time it happened it filled the disk and caused a panic
<lloda>I don't make check anymore on the mac
<wingo>wow
<wingo>let me know if you need some ideas
<lloda>i'll keep you posted
<lloda>and thx
<wingo>thanks to you :)
<ArneBab_>wingo: I’d like to contribute this: http://paste.lisp.org/display/319046
<ArneBab_>bbad
<davexunit>dave kastrup strikes again
<ijp>been a while
<daviid>wingo: just posted my reply to your email wrt bug#19459
<wingo>daviid: great
<wingo>would you mind mailing your NEWS as a patch?
<wingo>i want to edit but i don't want to take your credit
<daviid>wingo: no I can send patch, but maybe it needs updates? I just started to work on this now, it took me some time to answer you :), I can pull, update the NEWS and send patch
<daviid>wingo: oh, if it's only that, please edit it no roblem
<daviid>no problem at all
<daviid>wingo: yes, please go ahead, i don't need the credit :), it will be more efficient and I can work on something else, like GI ...
<wingo>daviid: re: NEWS i insist, just commit it and mail plz :)
<daviid>wingo: ok, will do now, give me some min
<daviid>wingo: email sent
<wingo>tx
<wingo>daviid: i also replied to you on that bug
<daviid>ok