<ArneBab_>wingo: I want to use Guile to check whether a file contains null bytes.
<ArneBab_>because if there are null bytes in there (\x00 → #\null), then the file was corrupted
<ArneBab_>,profile (let loop ((i 10000000)) (when (< 0 i) (eof-object? the-eof-object)(loop (1- i))))
<ArneBab_>,profile (let loop ((i 10000000)) (when (< 0 i) (equal? the-eof-object the-eof-object)(loop (1- i))))
<wingo>ArneBab_: then set it to ISO-8859-1 and read-delimited with a delimiter of #\nul
<wingo>i guess you don't need the contents tho...
<ArneBab_>wingo: no, I just need to know whether there is #\null in the file
<wingo>anyway for something like that i would use bytevectors because there's no need to decode any character
<wingo>you're literally looking for a byte that is 0
<wingo>so do a loop of get-bytevector-n! and then loop over the bv's bytes :)
<wingo>fastest thing i can think of :)
<ArneBab_>I already used (array-for-each nullbyte?! bv)
<ArneBab_>is there a faster way to map a function onto a bytevector?
<wingo>that's going to be slow :) you want to make sure it's inlined
<wingo>(let lp ((n 0)) (and (< n (bytevector-length bv)) (not (= (bytevector-u8-ref bv n) 0)) (lp (1+ n))))
<ArneBab_>wingo: do you know whether eof-object? has to be slow for good reasons or whether it’s just an implementation issue?
<ArneBab_>when I just use ,time, there’s a factor of 10 between (equal? the-eof-object char) and (eof-object? char)
<ArneBab_>I’m missing the mark by only about a factor of 12 now
<ArneBab_>I can read the data into a bytevector in 0.5 seconds, but I need 5 seconds to check whether there’s a 0 in the bytevector
<wingo>what mark are you missing :)
<wingo>how does the python script work?
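The inlined byte loop wingo describes can be sketched roughly as follows. This is only an illustration, not code from the conversation: `bytevector-has-null?` is a hypothetical helper name, and the loop is arranged to return #t when a zero byte is found, whereas the `and`-only one-liner in the chat ends in #f either way.

```scheme
;; Sketch: scan a bytevector for a zero byte with a plain named-let loop,
;; so the compiler sees bytevector-u8-ref directly instead of a mapped closure.
(use-modules (rnrs bytevectors))

;; `bytevector-has-null?` is a hypothetical name, not a Guile built-in.
(define (bytevector-has-null? bv)
  (let ((len (bytevector-length bv)))
    (let lp ((n 0))
      (and (< n len)                           ; ran off the end: no zero byte
           (or (= (bytevector-u8-ref bv n) 0)  ; found one: stop early with #t
               (lp (1+ n)))))))                ; otherwise keep scanning
```

For example, `(bytevector-has-null? (u8-list->bytevector '(1 2 0 3)))` evaluates to #t under this sketch.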
<wingo>you could of course read the whole string as latin-1 then use srfi-13 string-index
<wingo>in latin-1, codepoints 0-255 map to bytes 0-255
<wingo>well that means you are doing each byte check in 40ns, i suspect that is approaching the limit of a bytecode byte-by-byte search, dunno without looking at the assembly
<ArneBab_>I wonder how python gets it done that quickly…
<wingo>for me i would read in the bytevector in smaller chunks (2-4KB) and iterate over those chunks, but i dunno
<wingo>there you have an instant factor of 5 or so
<wingo>maybe more because you get a nice tight loop
<ArneBab_>it might just have a fastpath for `x in string`
<wingo>that's probably the best bet in guile
<wingo>i hate coding when you pick and choose the primitives implemented in c :P
<wingo>> ,time (string-index (make-string 150000000 #\q) #\z)
<wingo>;; 1.301566s real time, 1.307142s run time. 0.733638s spent in GC.
<wingo>civodul: quick q on http://debbugs.gnu.org/19540. in relativize-path we are basically looking to see if any element of %load-path is a prefix of a file's canonicalized path, and if so, strip off the prefix. it seems that we were just missing a step to try to canonicalize the element of %load-path. any objections if i push such a fix to master? (with tests)
<wingo>i think we could probably push to 2.0 as well, dunno
<wingo>ArneBab_: i assume you're on 2.1.3?
<wingo>the srfi-13 solution will work equally well on both, but the bytevector-u8-ref loop will be much better on 2.1
<wingo>civodul: (the fix i propose is trying to canonicalize the %load-path element inside scm_i_relativize_path, not actually changing %load-path)
<ArneBab_>(and ...) is ~factor 5 faster than read-string + string-index, already with 2.0
<wingo>civodul: i will push to master but we can revert if it's the wrong thing :)
<wingo>i am trying out "optimistic concurrency" :)
<wingo>feel very free to revert anything! no hard feelings
<civodul>it's been a year and a half since i asked for feedback, and now i cannot page it in in 5 minutes
<wingo>yeah understood, i sent a mail as well so you can process at your leisure
<wingo>the other option i think would be to send mails first and then like build consensus or something
<wingo>i can do that, i just expect that the result will be fewer bugs fixed, due to time constraints and such; dunno
<wingo>let me know please if you think a different strategy would be better for guile
<civodul>as you write, building a community takes time and lots of communication
<civodul>i'll refrain from making concrete suggestions here, tho
<wingo>anyway i am always interested in your thoughts. sorry for snapping at you the other day :/
<lloda>are doc/maint/guile.texi and doc/guile-api.alist maintained by hand?
<lloda>comments say snarfed but the line numbers don't match
<wingo>i have no idea what is going on with those things
<wingo>probably yes, unless they contain valuable lore from our elders
<wingo>i think i finally fixed that smob mark/finalizer race
<lloda>having unused cruft around makes it harder to navigate the source. I would remove doc/maint at least, seems to be more ruins than lore
***rubdos_ is now known as rubdos
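The chunked approach wingo suggests (read a few KB at a time into one reused buffer, then run a tight loop over each chunk) might look roughly like this. It is a sketch under assumptions: `port-has-null?` and `file-has-null?` are hypothetical names, and the 4096-byte default is just the size mentioned in the chat, not a measured optimum.

```scheme
;; Sketch: look for a zero byte while reading a binary port in fixed-size chunks.
(use-modules (rnrs bytevectors)
             (ice-9 binary-ports))

;; `port-has-null?` is a hypothetical helper, not part of Guile.
(define* (port-has-null? port #:optional (chunk-size 4096))
  (let ((buf (make-bytevector chunk-size)))   ; one buffer, reused for every read
    (let next-chunk ()
      ;; get-bytevector-n! returns the number of bytes read, or the EOF object.
      (let ((count (get-bytevector-n! port buf 0 chunk-size)))
        (and (not (eof-object? count))
             (or (let lp ((n 0))
                   ;; only the first `count` bytes are fresh data
                   (and (< n count)
                        (or (= (bytevector-u8-ref buf n) 0)
                            (lp (1+ n)))))
                 (next-chunk)))))))

;; `file-has-null?` is likewise hypothetical; it opens the file in binary mode.
(define (file-has-null? path)
  (let* ((port (open-file path "rb"))
         (result (port-has-null? port)))
    (close-port port)
    result))
```

The scan stops at the first zero byte, so a corrupted file is usually reported without reading it to the end.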
<wingo>ArneBab_: that's pretty good for scheme then, a factor of 6 off from C
<ArneBab_>assuming that adding guile-*/meta to PATH suffices
<ArneBab_>with cold disk buffers (and only that matters for the use case, though the development box has about a factor 10 faster disks) the Python version needs 11s.
<ArneBab_>with hot buffers Python is down to 0.28s
<ArneBab_>(I emptied the disk buffers by simply running a script which uses so much memory that it is killed by the kernel…)
<wingo>for best speed, repeatedly read into a 4 KB buffer instead of reading the whole thing.
<wingo>lloda: tx. fwiw next time remember the GNU-style changelog :)
<ArneBab_>wingo: does get-bytevector-some automatically choose the ideal buffer size?
<wingo>yes but it creates a new bytevector each time
<wingo>better make a buffer and fill with get-bytevector-n!
<ArneBab_>wingo: get-bytevector-all is ~15% faster than iterative get-bytevector-n 4096
<wingo>you are using get-bytevector-n! right?
<ArneBab_>wingo: I’m using the one with a bang, yes
<ArneBab_>the version here reads 4MiB, but the 4KiB version had roughly the same speed
<ArneBab_>each command is run twice to avoid counting the compile time
<wingo>who wants to play spot the bug? :)
<wingo>maybe i can get civodul to play
<wingo>well it's that commit right now
<ArneBab_>GreySunshine: what should your code do and what does it do instead?
<ArneBab_>wingo: why did scm_double_cell leak memory?
<ArneBab_>wingo: (the original link points to the fixed file)
<GreySunshine>ArneBab_: my code should return a pair, instead it gives me error messages
<wingo>ArneBab_: in 2.0 we switched to bdw-gc
<GreySunshine>2401: 4 [save-module-excursion #<procedure 559f24feba00 at ice-9/boot-9.scm:4045:3 ()>]
<GreySunshine>4052: 3 [#<procedure 559f24feba00 at ice-9/boot-9.scm:4045:3 ()>]
<GreySunshine> ?: 2 [load-compiled/vm "/home/nightfury/.cache/guile/ccache/2.0-LE-8-2.0/home/nightfury/Code/generic.scm.go"]
<wingo>ArneBab_: which requires attaching a finalizer to free auxiliary memory
<wingo>the double cell gets collected just fine
<wingo>the problem is the mpz memory
<ArneBab_>GreySunshine: for the next time please copy the error message into a pastebin (like your code)
<wingo>the code in numbers.c was fixed to attach a finalizer
<wingo>but we forgot to update the conv-i-integer files
<ArneBab_>that sounds like a nasty bug to find… thank you for fixing it!
<ArneBab_>davexunit: what’s the problem with pastebin?
<davexunit>I don't know the reason, I just know that they do.
<davexunit>and several people here browse with Tor and won't be able to view someone's pastebin.com entry
<davexunit>paste.lisp.org is better for lisp stuff, anyway
<ArneBab_>ok, I could think of many valid reasons to block Tor users from posting something (though not for blocking them from reading)
<ArneBab_>GreySunshine: do I understand it correctly that what you create as complex number a is essentially a list '(complex rect 2 2)?
<ArneBab_>GreySunshine: this is what gets the procedure to run: (get (cons 'add (type a)))
<ArneBab_>it returns ((add . complex) . #<procedure +complex (obj-x obj-y)>)
***heroux_ is now known as heroux
<GreySunshine>I get it, all I need is the procedure but I've used the whole thing
<GreySunshine>I found the error it was in +c procedure, thanks so much
***heroux_ is now known as heroux
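The lookup problem discussed above (a table lookup that hands back the whole `((add . complex) . procedure)` entry when only the procedure is wanted) can be illustrated with a small reconstruction. GreySunshine's actual code is not shown in the log, so everything here is an assumption: the table layout, the bodies of `type`, `get` and `+complex`, and the helper `get-proc` are all hypothetical.

```scheme
;; Hypothetical reconstruction of a tagged-data dispatch table.
;; A complex number is assumed to be a list like '(complex rect 2 2).
(define (type obj) (car obj))

(define (+complex obj-x obj-y)
  (list 'complex 'rect
        (+ (caddr obj-x) (caddr obj-y))      ; real parts
        (+ (cadddr obj-x) (cadddr obj-y))))  ; imaginary parts

;; Each entry is ((operation . type) . procedure).
(define table
  (list (cons (cons 'add 'complex) +complex)))

;; `get` returns the whole entry, as in the chat output above.
(define (get key)
  (assoc key table))

;; `get-proc` (hypothetical) extracts just the procedure with cdr.
(define (get-proc key)
  (let ((entry (get key)))
    (and entry (cdr entry))))

(define a '(complex rect 2 2))
```

With these definitions, `((get-proc (cons 'add (type a))) a a)` applies the procedure, while calling the result of `get` directly would fail because it is a pair rather than a procedure.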
<wingo>dsmith-work: check that interesting bug above with scm_from_int64 :)
<dsmith-work>wingo: So has that leak been in there since moving to libgmp or since moving to libgc ?
<wingo>but only on 32-bit systems :/
<wingo>or more precisely, only on systems with 32-bit longs
<galex-713>wingo: will one day bootstrapping be quicker? :/ or at least not eating 200% cpu all the time?
<wingo>nope :) use a tarball or some source for prebuilt binaries if you want better speed
<wingo>some things might be faster when we do native compilation, but the compiler will also have to do more, so i dunno
<galex-713>wingo: where do we find prebuilt binaries yet?
<linas>wingo -- I just received a thread-cancellation bug report for guile (guile crashes) and am wondering if I should bother reporting to you/guile team
<linas>I'm not sure it's worth fixing; it involves SMOBs so the test case isn't easy.
<linas>I told the reporter that thread cancellation sucks and don't do that.
<wingo>galex-713: in the tarball, prebuild/
<wingo>i dunno, if you have a test case, sure, otherwise i dunno
<lloda>wingo: i'll give a shot to the mac thing
<lloda>last time it happened it filled the disk and caused a panic
<lloda>I don't make check anymore on the mac
<wingo>let me know if you need some ideas
<daviid>wingo: just posted my reply to your email wrt bug#19459
<wingo>would you mind mailing your NEWS as a patch?
<wingo>i want to edit but i don't want to take your credit
<daviid>wingo: no I can send patch, but maybe it needs updates? I just started to work on this now, it took me some time to answer you :), I can pull, update the NEWS and send patch
<daviid>wingo: oh, if it's only that, please edit it no problem
<daviid>wingo: yes, please go ahead, i don't need the credit :), it will be more efficient and I can work on something else, like GI ...
<wingo>daviid: re: NEWS i insist, just commit it and mail plz :)
<daviid>wingo: ok, will do now, give me some min
<wingo>daviid: i also replied to you on that bug