IRC channel logs

2013-08-20.log

***tolk` is now known as tolk
<mark_weaver>ugh, another correction. I wrote "macros are actually stored in the module as procedures", and that's not true. They are actually SMOBs.
<mark_weaver>I seem to be getting a bit sloppy. time to be more careful.
<nalaginrut>morning guilers~
<ijp>what's the best way to normalise newlines in a string?
<ijp>maybe a regexp replace is cleanest
<nalaginrut>what's normalize? '\n' to '\r\n'?
<ijp>hmm, I wonder if it makes sense to export unreserved-chars from (web uri)
<ijp>nalaginrut: yes, plain '\n' and plain '\r' to '\r\n'
<nalaginrut>so you need read-string/delimiter too
<ijp>why?
<ijp>I have a string already
<mark_weaver>I'd probably use regexps.
<mark_weaver>out of curiosity, why do you need this?
<ijp>I'm fixing up a form-encoding procedure, and you're supposed to normalize newlines
<mark_weaver>where are the input strings coming from?
<nalaginrut>oh, what if users uploaded a file edited under Windows? Will it replace all "\r" with "\r\n"?
<ijp>only those not followed by a \n
<mark_weaver>the reason I ask is that I wonder if working R6RS transcoders would solve your problem more nicely, if we had them. (It's on my TODO)
<ijp>I'm not sure it would look any cleaner in code
<mark_weaver>but I wonder why you're not sure what line ending type is in the input string.
<mark_weaver>unfortunately, our regexp API is quite gross IMO. it really needs to be replaced.
<mark_weaver>(another thing on my TODO :)
<ijp>very gross
<nalaginrut>yes, here's my 2 cents
<ijp>I'd use irregex, but it's not worth it for one procedure
<nalaginrut>I was the first victim ;-(
<mark_weaver>where are the input strings coming from?
<nalaginrut>my situation is users upload binary files, and I have to parse string-boundary from the bytevectors
<nalaginrut>for text files, it works perfectly
<ijp>not I/O but from alists of (possibly) arbitrary strings
<ijp>it's basically a wrap around http-get
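The regexp approach discussed above could look roughly like this. It is a minimal sketch, assuming Guile's (ice-9 regex) module; the name normalize-newlines is hypothetical and is not taken from ijp's code. Each existing "\r\n", bare "\r", or bare "\n" is rewritten to "\r\n", so text that is already normalised passes through unchanged.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper, not ijp's actual procedure.
;; "\r\n" is listed first so an existing CRLF pair is consumed as one
;; match and is not expanded into two line endings.
(define (normalize-newlines str)
  (regexp-substitute/global #f "\r\n|\r|\n" str
                            'pre "\r\n" 'post))

;; (normalize-newlines "a\nb\rc\r\nd") returns "a\r\nb\r\nc\r\nd"
```

Passing #f as the port makes regexp-substitute/global return a string instead of writing to a port.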
<wingo> http://gallium.inria.fr/blog/we-need-a-representative-benchmark-suite/
<nalaginrut>I always want this stuff
<nalaginrut>for guile
<wingo>our inliner is actually quite good, seems to me
<wingo>what we are lacking is the predictable native compiler
<nalaginrut>yup
<civodul>Hello Guilers
<wingo>moin :)
<add^_>Hm running (fib 4000000) was a bit on the heavy side. Killed the guile process.
<add^_>I *may* need a more efficient fibonacci algorithm ;-)
<taylanub>I've recently read this old paper from PLT Scheme that tells how horrible conservative GC was for them. I wonder about the reason ? They seemed to have huge memory leaks when e.g. running DrRacket inside DrRacket, and also significant leaking over time in long-running processes like a web server; do we have any similar issues in Guile ? Has anyone been running a big Guile process for months or so ?
<taylanub> http://www.cs.utah.edu/plt/publications/ismm09-rwrf.pdf source
<wingo>i run a guile web server, i have to restart it every few months
<taylanub>Oh :(
<taylanub>Well every few months isn't *that* bad.
<wingo>nope, it's fine
<wingo>i'm not sure what the plt folks' issue was with conservative gc
<wingo>it shouldn't be that bad
<wingo>and they never found out
<civodul>yeah i'm skeptical
<civodul>i think Guile-Emacs will be a good testbed ;-)
<taylanub>The paper only vaguely talks about specific issues: "[...] conservative GC can trigger unbounded memory use due to linked lists [Boehm 2002] that manage threads and continuations; this problem is usually due to liveness imprecision [Hirzel et al. 2002], rather than type imprecision." (any typos are mine)
<wingo>somewhat FUDdy, it seems to me...
<taylanub>wingo: Is that server on 32b or 64b ? (64b has significantly less issues with conservative GC, right?)
<wingo>this one is 64-bit
<wingo>anyway, i have not had any concrete report of conservative gc causing problems for guile
<wingo>only fud
<wingo>so please stop spreading it :)
<taylanub>Nah, was just wondering. :)
<civodul>yep, and there are many users of libgc beside Guile
<Chaos`Eternal>just curious, can a GC be mathematically proven?
<wingo>sure, just choose an appropriate set of axioms :)
<civodul>:-)
<janneke>vague paper talks quickly lose their charm once a patch is produced that proves it all wrong
<wingo>well, racket's 3m gc works for them, no argument there
<nalaginrut>wingo: I planned to run my designed web-site with Guile, "restart it every few months" huh?!
<wingo>what's the question? :)
<wingo>if you ask almost any java shop, they restart their apps every night...
<nalaginrut>well, it's a kind of site never interrupt..
<nalaginrut>OK, anyway, I have to try a beta site first
<nalaginrut>but I've tested the anti-pressure ability of Artanis, it's nice
<nalaginrut>since it uses inner server, I think it's coo
<nalaginrut>cool
<nalaginrut>but I've no idea about the mem consumption for long time running
<Chaos`Eternal>i heard only windows machines require daily reboot...
<nalaginrut>dunno, maybe use incremental mode of bdwGC is better?
<Chaos`Eternal>high pressure, and one week, record the memory footprint
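Recording the footprint over a long run could be sketched as below. This assumes the alist returned by Guile's gc-stats contains a heap-size entry, as it does in the 2.0 series; the helper names heap-size-bytes and log-heap-size are hypothetical. It only reports what the collector sees, so it cannot by itself tell fragmentation apart from an application-level leak.

```scheme
;; Hypothetical helpers; the 'heap-size key is assumed from Guile 2.0's gc-stats.
(define (heap-size-bytes)
  (assq-ref (gc-stats) 'heap-size))

;; Write one timestamped sample to PORT. Call this periodically,
;; for example once per request batch or from a timer thread.
(define (log-heap-size port)
  (format port "~a heap-size=~a\n" (current-time) (heap-size-bytes)))
```

Calling (log-heap-size (current-output-port)) prints one sample; comparing samples taken under steady load over days shows whether the heap keeps growing.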
<wingo>the issue is fragmentation; you can't do much about that with a conservative gc
<civodul>the problem with the web server needing restart may not be related to the fact that we use a conservative GC
<civodul>it could be an app-level leak, or it could be a libguile bug, most likely
<civodul>but such things are hard to track...
<wingo>yep
<wingo>we need better tooling there
<nalaginrut>is the GC replaceable? I mean use another GC with the original GC-interface
<nalaginrut>at least an alternative
<wingo>with enough resources, everything can be changed ;-)
<janneke>with enough time, resources are unlimited ;-)
<wingo>after the revolution, yes ;)
<nalaginrut>I think the proper name to conservative-GC is mem-leak-in-long-time-running-GC
<wingo>i think you are misunderstanding how it works
<wingo>there is no reason to think that will always or even usually be the case
<wingo>and until someone brings an actual case in which conservative gc causes mem leaks, i will call it all FUD
<wingo>:)
<taylanub>nalaginrut: If you want a long descriptive name, then the most correct would be scan-all-locations-and-treat-everything-as-a-potential-pointer-GC, and we shortly just call that "conservative" to save breath and finger-strain. :P
<nalaginrut>;-D
<wingo>taylanub: that's not even correct
<wingo>it doesn't scan all locations
<taylanub>I should've known better. :P
<nalaginrut>I think it misses some blocks
<wingo>only the stack, statically allocated data, and thread-local data
<wingo>everything else is +/- precise
<wingo>and some regions aren't scanned at all
<nalaginrut>and in long time running, the missed blocks *maybe* increase
<wingo>!
<wingo>the thing about *conservative* is that it never misses a block
<wingo>it never misses live data
<wingo>it can treat dead data is live but that only happens occasionally -- and if you don't understand those mechanisms you probably shouldn't worry about it at all
<wingo>*as live
<wingo>e.g. the blacklists
<nalaginrut>OK, I should get it clear
<weinholt>does the rtl compiler do any liveness analysis? i suppose it could assist the conservative GC by clearing out dead values
<wingo>weinholt: it does some when allocating variables to slots, but it doesn't explicitly clear old values
<wingo>i suppose it could do so, perhaps in a particular mode
<wingo>i think it would generally be bad for perf to do so
<weinholt>i see
<weinholt>the unix-haters handbook had some words on conservative GC :)
<wingo>hehe :)
<civodul>there's VM_ENABLE_PRECISE_STACK_GC_SCAN and VM_ENABLE_STACK_NULLING in vm.c, for the worried ;-)
<Chaos`Eternal>but i found a workaround against the mem-leak: let your web-server exec itself periodically...
<wingo>civodul: yes, we could add liveness maps somehow to the ELF images and have the GC use them when marking the stack
<wingo>exec is the ultimate tail call
<nalaginrut>well, seems the author claims it could leak in short term
<nalaginrut>maybe it's fine in long term, anyway, I don't want to dig into GC at this time
<Chaos`Eternal>in fact, if we can make current-continuation persistent, we can re-install that continuation after exec
<dsmith>Heya
<wingo>moo
<dsmith>sneek is now on a 4 hour reexec schedule instead of 1 hour.
<dsmith>(been that way since I moved to the rpi)
<wingo>boo
<wingo>oh
<wingo>that's better, right?
<dsmith>Yes
<wingo>he's running guile stable-2.0?
<dsmith>Yes
<wingo>and experiencing memory leaks, as i understand it
<dsmith>sneek, version?
<sneek>Sneeky bot running on Guile version 2.0.9 using bobot++ 2.3.0-darcs
<wingo>where is bobot++ source code?
<dsmith>I suspect the C++ bobot code
<dsmith>Hmm.
<dsmith>It's under darcs now..
<dsmith>unknown_lamer, Where is the bobot++ code again?
<dsmith>Here I think http://darcs.unknownlamer.org/bobot++
<Arne`>In python I can use 3*"a" to get "aaa". Is there an equivalent in guile?
<taylanub>Arne`: Did you check the manual ?
<civodul>(make-string 3 #\a)
*civodul finds it amazing that there's special syntax for that in Python
<taylanub>civodul: Well, what about n*"ab" ?
<taylanub>`xsubstring' does it though.
<wingo>ffs
<civodul>ah ah!
<wingo>hehe xsubstring is a neat hack :)
<taylanub>The interface is kinda weird; I suppose it's some funny highly-efficient implementation or so ?
<wingo>fooled you :)
<taylanub>:\
*civodul discovers xsubstring
<civodul>very weird, indeed
<taylanub>So, why does it exist ? :P
<wingo>questions, questions!
<civodul>designed for the taylanubs and Arnes of the world
<taylanub>:D
<mario-goulart>There's also make-kmp-restart-vector if you are looking for weird stuff
*taylanub can't find that in the manual
<civodul>mario-goulart: not in Guile it seems :-)
<mario-goulart>Doesn't guile provide srfi-13?
<wingo>mario-goulart: yes but i don't think that interface is part of srfi-13
<wingo>though it is part of the ref. impl.
<mario-goulart>wingo: it's in the "Low-level procedures" section. Maybe (hopefully :-)) it is not intended to be used by "end" users.
<Arne`>taylanub: I did not know how to search for that…
<Arne`>(google did not help…)
*Arne` discovers xsubstring, too. Exactly what I need! Thanks!
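For reference, the two answers above can be sketched as follows. Note that make-string takes a character fill, not a string, so it only covers the single-character case; xsubstring handles n*"ab". The wrapper names string-repeat and string-repeat/concat are hypothetical, and both assume a non-empty string s.

```scheme
;; Single character repeated: the fill argument is a char.
;; (make-string 3 #\a) returns "aaa"

;; Hypothetical wrapper: repeat S exactly N times by letting xsubstring
;; wrap around the end of S until N * (string-length S) chars are produced.
(define (string-repeat s n)
  (xsubstring s 0 (* n (string-length s))))

;; The same result without xsubstring, by concatenating N copies.
(define (string-repeat/concat s n)
  (string-concatenate (make-list n s)))

;; (string-repeat "ab" 3) returns "ababab"
```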
<taylanub>Arne`: It's highly advisable to learn Emacs's Info mode (C-h i). :)
<taylanub>C-h i d m guile RET m api ret ...
<taylanub>RET*
<Arne`>taylanub: … d m guile ← I had always used C-s… thanks!
*Arne` should really learn it…
*Arne` is no longer used to having info available, because most python modules leave it out :(
*davexunit also needs to learn to use info within emacs
<taylanub>[ and ] for prev/next (e.g. from 5.1 to 5.1.1), p and n for prev/next of the same hierarchy-level (e.g. from 5.1 to 5.2), u for hierarchy-up (e.g. 5.1 to 5), l and r for history back/forward, m to visit menu entry, t for top-level of current manual, d for the root manual-list, i for index-search; and I think that's enough 90% of the time. :P
***alexei___ is now known as amgarchIn9
<mark_weaver>sneek: later tell add^_ (fib 4000000) has 835951 decimal digits. To compute such large fibonacci numbers, you need to use a matrix exponentiation method. I have Scheme code that can efficiently compute it in less than 0.4 seconds in Guile: https://lists.gnu.org/archive/html/guile-devel/2011-01/txtGaSbapq3P2.txt
<sneek>Got it.
<mark_weaver>sneek: later tell add^_ computing it is fast, but printing out the number takes a *lot* longer.
<sneek>Got it.
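To illustrate why a logarithmic method makes (fib 4000000) feasible, here is a minimal sketch using fast doubling, which follows from squaring the 2x2 Fibonacci matrix. It is not mark_weaver's linked code; the names fib-pair and fib are chosen here for illustration. It relies on the identities F(2k) = F(k) * (2*F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2, so only about log2(n) recursive steps are needed.

```scheme
;; Return two values: F(n) and F(n+1), using fast doubling.
(define (fib-pair n)
  (if (= n 0)
      (values 0 1)
      (call-with-values
          (lambda () (fib-pair (quotient n 2)))
        (lambda (a b)                          ; a = F(k), b = F(k+1)
          (let ((c (* a (- (* 2 b) a)))        ; F(2k)
                (d (+ (* a a) (* b b))))       ; F(2k+1)
            (if (even? n)
                (values c d)
                (values d (+ c d))))))))

(define (fib n)
  (call-with-values (lambda () (fib-pair n))
    (lambda (a b) a)))

;; (fib 10) returns 55
```

As noted in the channel, converting a result of this size to decimal for printing can dominate the run time, so timing should measure the computation separately from the output.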
<wingo>mark_weaver: just pushed a change
<wingo>compile-rtl now produces elf images directly
<wingo>one upshot of this is that you can guild compile -t rtl foo.scm
<mark_weaver>nice! :)
<wingo>and that produces a loadable valid .go file :)
<wingo>it works with your test.scm
<mark_weaver>sweet :)
<wingo>though you have to change from assemble-program to load-thunk-from-memory
<wingo>using that test harness anyway
<mark_weaver>so what does (compile x #:to 'rtl) return?
<wingo>a bytevector
<wingo>which is an elf image
<mark_weaver>ah, okay.
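One way to sanity-check that a bytevector is an ELF image is to look at its first four bytes, which in any ELF file are 0x7f followed by the ASCII letters E, L, F. The sketch below assumes only the (rnrs bytevectors) module; the predicate name elf-image? is hypothetical and is not part of Guile's compiler interface.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical predicate: true if BV starts with the ELF magic number.
(define (elf-image? bv)
  (and (bytevector? bv)
       (>= (bytevector-length bv) 4)
       (= (bytevector-u8-ref bv 0) #x7f)
       (= (bytevector-u8-ref bv 1) (char->integer #\E))
       (= (bytevector-u8-ref bv 2) (char->integer #\L))
       (= (bytevector-u8-ref bv 3) (char->integer #\F))))
```

This checks the magic number only; it says nothing about whether the rest of the image is well-formed or loadable.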
<wingo>in some ways it's not as nice as before
<wingo>we need to improve our tooling, to disassemble elf images
<mark_weaver>I suppose it might be nice to have another step in there, corresponding to the old 'assembly.
<wingo>but it's faster and seems to hang together re: runtime compilation and to-file compilation
<mark_weaver>but I don't care that much.
<wingo>the "assembly" step was very slow fwiw
<mark_weaver>true
<wingo>bypassing it is a feature in many ways...
<mark_weaver>and actually, it wasn't nice to read the generated assembly. I usually prefer to read the disassembled version instead.
<wingo>yep
*mark_weaver looks at the branch
<mark_weaver>I agree that this is better.
<wingo>cool, glad you like it!
<mark_weaver>being able to generate .go files is a great improvement!
<wingo>yeah, it's neat :)
<wingo>and they have the right permissions so they get shared appropriately, etc
<mark_weaver>so now, if we don't mind the lack of debugging at present, in theory we could compile many core modules using RTL.
<wingo>and you can readelf -a the .go file
<wingo>yes, except there are some things that don't work yet like prompts
<wingo>you saw the ifdefs in the vm-engine.c i think
<mark_weaver>how much needs to be done to enable prompts, do you think? is it just a matter of debugging existing code, or are there non-trivial chunks missing?
<wingo>if 0, rather
<wingo>i don't know tbh
<wingo>i think it's close
<wingo>i thought it through
<wingo>but whether my solution matches the needs of the problem or not, i don't know
<wingo>and there are some unimplemented bits
<mark_weaver>*nod* I might try poking at it a bit
<wingo>that would be super
<mark_weaver>do you remember any specifics about what is unimplemented?
<wingo>can i give you an intro?
<wingo>yes let me explain
<mark_weaver>please :)
<wingo>so the idea with prompts is that you have ($continue k ($prompt escape? tag handler))
<wingo>the body of the prompt is in k
<wingo>escape? is #t or #f
<wingo>tag is a lexical
<wingo>as in tree-il
<wingo>and handler is a continuation
<mark_weaver>a $kargs I assume?
<wingo>$prompt pushes a prompt onto the dynamic stack, associated with the label of handler
<wingo>no :)
<wingo>it continues to the body of the prompt
<wingo>for an escape-only prompt, the body will evaluate the body
<wingo>so one more thing
<wingo>in tree-il, if the prompt is escape-only, the body is an expression
<wingo>otherwise it is a thunk
<wingo>the reason being that for prompts that we might need to capture a continuation, we can only do so on a frame boundary
<wingo>so that forces the body to be in a thunk
<wingo>the same restriction is in tree-il and the stack vm fwiw
<wingo>for escape-only prompts you just need to register the handler location and proceed
<wingo>the compile-cps pass converts the two cases differently
<mark_weaver>*nod*
<wingo>for escape-only prompts, the body continuation, labelled k, will do the body, ktrunc to a rest list, kargs that rest list, primcall to pop-prompt, then apply values to the captured rest list
<wingo>that's a bit dense :)
<wingo>but it's equivalent to
<wingo>(call-with-values (lambda () BODY) (lambda vals (pop-prompt!) (apply values vals)))
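The shape of that expression can be tried at the REPL with a stand-in for pop-prompt!. In the sketch below, pop-prompt! is a hypothetical stub that merely counts calls (the real operation is a VM primitive, not a user-level procedure), and run-body is a name invented here. It shows that every value returned by the body is collected into a rest list, the pop runs once, and the same values are returned again.

```scheme
;; Hypothetical stand-in for the VM's pop-prompt primitive.
(define popped 0)
(define (pop-prompt!)
  (set! popped (+ popped 1)))

;; Run BODY-THUNK, pop the (pretend) prompt, then return all of its values.
(define (run-body body-thunk)
  (call-with-values body-thunk
    (lambda vals
      (pop-prompt!)
      (apply values vals))))

;; (call-with-values (lambda () (run-body (lambda () (values 1 2 3)))) list)
;; returns (1 2 3), and popped is then 1
```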
<wingo>with the hope being that the optimizer can prove something about the return arity of BODY so we can elide the list construction
<wingo>but the point being is that $prompt pushes a prompt
<wingo>it doesn't pop it
<wingo>it relies on the continuation to pop it later
<wingo>this property is produced by compile-cps
<wingo>this is fine because there are only two ways the prompt can exit -- normally or abnormally
<wingo>in the normal case we explicitly pop-prompt!
<wingo>in the abnormal case the runtime machinery handles the dynamic stack explicitly
<wingo>popping it down to the level it was at when the handler was registered
<wingo>so, for non-escape-only prompts, it's all the same
<wingo>except the body continuation is expected to call the thunk
<wingo>because we know there can be no abort in the prologue before the thunk call, or in the epilogue where we pop the prompt before returning the values again
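At the user level, the normal and abnormal exits described above can be observed with call-with-prompt and abort-to-prompt. This is a minimal sketch of the source-level behaviour only, not of the CPS or VM internals; the tag and the procedure name run-with-prompt are hypothetical. The last example re-enters the captured continuation, which is the case that requires the body to be a thunk.

```scheme
(use-modules (ice-9 control))

(define tag (make-prompt-tag "demo"))   ; hypothetical tag for this sketch

;; Install a prompt around THUNK. On abort, ignore the continuation K
;; and return the abort arguments, i.e. an escape-only use.
(define (run-with-prompt thunk)
  (call-with-prompt tag
    thunk
    (lambda (k . args) (cons 'aborted args))))

;; Normal exit: the body returns and its value is the result.
;; (run-with-prompt (lambda () (+ 1 2))) returns 3

;; Abnormal exit: control jumps to the handler; the pending (+ 1 ...) is dropped.
;; (run-with-prompt (lambda () (+ 1 (abort-to-prompt tag 'oops 42))))
;; returns (aborted oops 42)

;; Non-escape-only use: the handler resumes the captured continuation.
(define (resume-doubled)
  (call-with-prompt tag
    (lambda () (+ 1 (abort-to-prompt tag 20)))
    (lambda (k x) (k (* x 2)))))
;; (resume-doubled) returns 41
```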
<wingo>that's a lot of words :)
<mark_weaver>I need a few minutes to take this all in :)
<wingo>ok :)
<wingo>i think the cps machinery hangs together
<mark_weaver>what does it mean to "ktrunc to a rest list"
<mark_weaver>?
<wingo>continue to a continuation that is a ($ktrunc () rest k*)
<mark_weaver>ah, okay.
<mark_weaver>so in the absence of optimization, the ktrunc will translate into code that conses up the list from the stack?
<wingo>yes
<mark_weaver>I've never looked at the dynamic runtime handling for prompts, but it's past time for me to do so :)
<wingo>it doesn't currently cons :0
<wingo>i think probably (lambda vals (pop-prompt!) (apply values vals)) could be optimized so as to never cons
<mark_weaver>you mean that part of 'ktrunc' compilation is not yet implemented?
<wingo>no, i mean that in the stack vm it doesn't cons
<wingo>that part of ktrunc compilation is implemented, though there may be bugs of course
<mark_weaver>okay, let me read what you wrote again. give me a few minutes please :)
<wingo>no problem :)
<wingo>i spent a loooooong time stuck, puzzling out how to do this
<wingo>an entire plane flight to asia
<mark_weaver>good use of a flight! :)
<mark_weaver>okay, I think I understand everything you wrote. thanks! :)
<mark_weaver>I'll poke at it soon :)
<wingo>excellent, happy hacking :)
<wingo>i think right now it is ifdeffed out because things were different in a previous version of the rtl vm, but now they are much more similar to the stack vm so it shouldn't be too bad
<mark_weaver>thanks for squeezing in a bit more hacking in your busy week. the .go files stuff is exciting!
<wingo>right, so the continuation of "i think the cps machinery hangs together" is that it's mostly vm stuff, though surely there is some cps stuff too
<wingo>np. -> z now, night :)
<mark_weaver>sleep well!