IRC channel logs

2021-07-15.log

<dsmith>dsmith-work: Poke
<apteryx>is there a way to parallelize fold in Guile?
<apteryx>another question! Is there a way to get the current-output-port terminal's width?
<daviid>apteryx: not exactly parallel fold, but see if you can use par-map instead
<daviid>note that it's only 'worth it' if proc is a long/costly operation ... as par-map itself has a cost ...
<apteryx>I see, thank you!
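A minimal sketch of daviid's suggestion, assuming the per-element work can be separated from the combining step: run the costly part through par-map from (ice-9 threads), then combine the results with an ordinary fold. The names slow-square and par-map-then-fold are hypothetical, made up for illustration.

```scheme
(use-modules (ice-9 threads)   ; par-map
             (srfi srfi-1))    ; fold

;; Hypothetical stand-in for a long/costly per-element operation.
(define (slow-square x)
  (usleep 1000)
  (* x x))

;; Apply PROC to every element in parallel, then combine the
;; results sequentially with an ordinary fold.
(define (par-map-then-fold kons knil proc lst)
  (fold kons knil (par-map proc lst)))

(define total (par-map-then-fold + 0 slow-square (iota 100)))
```

As noted above, this is only likely to pay off when proc is expensive, since par-map has its own overhead.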
<apteryx>the algorithm records the lengths of words in a list of list
<apteryx>it's used for presenting tabulated data neatly
<apteryx>list of list are the table rows
<apteryx>list of lists*
<apteryx>it currently looks like this: https://paste.debian.net/1204411/
<apteryx>here with profiling data (not just for that bit, but the whole application, but it's definitely on the hot path, raising total execution time from about 2 to 12 s) https://paste.debian.net/1204412/
<apteryx>hmm, perhaps I'm reading that profiling data wrong, it's not clear that it's on the hot path from it.
***sneek_ is now known as sneek
<apteryx>Yet another question; how can I create a string that repeats "~a\t" for N times?
<apteryx>I came up with this: https://paste.debian.net/1204414/, although it feels a far cry from Python's number_of_columns * "~a\t"
<lampilelo>why do you need this format string? can't you write a function that will create a string from real data?
<flatwhatson>what about: (string-join (map display ls) "\t")
<flatwhatson>otherwise: (apply string-append (make-list n "~a\t"))
***sneek_ is now known as sneek
<chrislck>apteryx: you'll probably want to upgrade from map/make-list/length eventually as far as possible. consider (map fn1 (map fn2 lst)) gets slow if lst is large -- it creates an intermediate list. better: (map (compose fn1 fn2) lst)
<apteryx>sneek: later tell lampilo the format string is used to pretty print table columns so they all appear aligned
<sneek>Got it.
<apteryx>flatwhatson: (apply string-append (make-list n "~a\t")) that's much better than what I had, thank you.
<apteryx>chrislck: the list of rows needs to be scanned at least twice; once to discover the maximum column widths, and a second time to print the rows
<apteryx>I've tried moving the drop-right into the fold kons, but it slowed by 1 sec in my tests. Strange.
<apteryx>I'm starting to think that simple, \t separated yet jagged columns are not that bad. It takes less screen real estate and works reliably with 'cut'.
<chrislck>(string-concatenate (make-list n "~a\t"))
<apteryx>even better!
<apteryx>thanks :-)
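For reference, a small sketch of how the string-concatenate suggestion might be used to build and apply a per-row format string. The helper name column-format and the sample row are assumptions for illustration, not taken from the pastes.

```scheme
(use-modules (ice-9 format))

;; Hypothetical helper: one "~a\t" directive per column.
(define (column-format n)
  (string-concatenate (make-list n "~a\t")))

(define fmt (column-format 3))

;; Apply the generated format string to one row of data.
(define line (apply format #f fmt '("name" "version" "license")))
```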
<chrislck>tip: building a long string via string-append and string-concatenate isn't that clever -- it allocates numerous strings, and needs to allocate a big string at the end. check the source for string-replace-substring in guile sources to see what wingo recommends :)
<apteryx>OK!
<chrislck>hint: (with-output-to-string (lambda () (let lp ((idx 100)) (unless (zero? idx) (display "a") (lp (1- idx)))))) *may* be surprisingly faster than (string-concatenate (make-list 100 "a"))...
<apteryx>interesting
<chrislck>ok maybe not: with-output-to-string with 4x10^6 elements does 1.813090s real time, 1.811567s run time. 0.000000s spent in GC.
<dsmith-work>Thursday Greetings, Guilers
<chrislck>whereas string-concatenate scores: 0.509411s real time, 0.562900s run time. 0.349726s spent in GC.
<chrislck>see, much more gc
<dsmith-work>apteryx: Looks like you are using format? Have you looked at ~{ ~} for iteration?
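A short sketch of the ~{ ~} iteration directive mentioned here, assuming (ice-9 format) is loaded (simple-format does not handle it). The sample row is made up for illustration.

```scheme
(use-modules (ice-9 format))

(define row '("guile" "3.0" "LGPL"))

;; ~{ ... ~} repeats its body for every element of the list argument,
;; so no per-column format string has to be generated.
(define line (format #f "~{~a\t~}~%" row))

;; ~^ stops the iteration when no elements remain, which avoids
;; the trailing separator.
(define line-no-trailing-tab (format #f "~{~a~^\t~}~%" row))
```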
<RhodiumToad>how much of that GCing is attributable to make-list?
<apteryx>I've simplified it a bit, it now looks like: https://paste.debian.net/1204496/
<apteryx>rows is a very long list (17000 entries about) of few items (4 columns)
<apteryx>the only costly operations (due to the large 'rows' list) should be: (map (cut drop-right <> 1) rows) as well as the fold, and finally the formatting of each row (the full list is iterated thrice). Interestingly moving the drop-right inside the fold doesn't improve things.
<RhodiumToad>thrice?
<RhodiumToad>oh I see
<RhodiumToad>that (map (cut drop-right ...) rows) seems like useless overhead
<RhodiumToad>why not just calculate all the column widths and leave the last one off?
<RhodiumToad>this function is doing a lot of consing
<apteryx>RhodiumToad: I tried computing column-widths for all columns (including the last one), and simply excluding the last column at the time of computing column-formats; this caused about 1.7 s extra to be used. Any optimization I try seems to make things worse, eh.
<apteryx>perhaps what I have is already close to the true cost of using format on a 17000 something list and can't be made much faster.
<RhodiumToad>bets?
<apteryx>hehe
<apteryx>I'd be happy to be proven otherwise :-)
<RhodiumToad>working on it
*apteryx is thrilled
<RhodiumToad>I assume the return value is not interesting?
<apteryx>it isn't!
<apteryx>(I'm now using for-each)
<RhodiumToad>the final (map) in your version is incorrect anyway
<RhodiumToad>interesting. for-each to repeatedly call format for each row is faster than using a ~:{...~} specifier within format itself
<apteryx>that's surprising
<apteryx>what was incorrect in the last map of my version? I fail to see it.
<RhodiumToad>should have been for-each or map-in-order, since you're relying on the side effect of (format) and not its result
<RhodiumToad>map doesn't guarantee order of evaluation
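To illustrate the point, a minimal sketch using for-each when only the side effect matters. The rows and the print-row helper are hypothetical; output is captured in a string here only so the result is easy to inspect.

```scheme
(define rows '(("a" "1") ("b" "2") ("c" "3")))

;; Hypothetical printer: only its side effect matters, not its result.
(define (print-row row)
  (display (car row))
  (display "\t")
  (display (cadr row))
  (newline))

;; for-each calls print-row on the elements in list order and
;; discards the results, which is what printing needs.
(define out
  (with-output-to-string
    (lambda () (for-each print-row rows))))
```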
<RhodiumToad>do you care what happens if there are no columns?
<apteryx>I don't, as there wouldn't be anything to print
<daviid>if you only use "~a\t", you can use simple-format (and see if it is faster too, I don't know)
<apteryx>it doesn't support padding I think (as in ~46a\t)
<apteryx>which in the manual is called the minwidth parameter
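A rough sketch of the two-pass approach discussed in this log, using the minwidth parameter of ~a from (ice-9 format). This is not the code from the pastes; column-widths, row-format and the sample rows are assumptions for illustration, and unlike apteryx's version it pads every column, including the last.

```scheme
(use-modules (ice-9 format)
             (srfi srfi-1))   ; fold

(define rows '(("guile" "3.0.7")
               ("emacs" "27.2")
               ("gcc-toolchain" "11.1.0")))

;; Pass 1: maximum string length seen in every column.
(define (column-widths rows)
  (fold (lambda (row widths)
          (map max (map string-length row) widths))
        (map (const 0) (car rows))
        rows))

;; Build a format string such as "~13a\t~6a\t~%" from the widths.
(define (row-format widths)
  (string-append
   (string-concatenate
    (map (lambda (w) (format #f "~~~da\t" w)) widths))
   "~%"))

;; Pass 2: print every row with the padded format string.
(define (print-table rows)
  (let ((fmt (row-format (column-widths rows))))
    (for-each (lambda (row) (apply format #t fmt row)) rows)))
```

Calling (print-table rows) prints the aligned table to the current output port.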
<RhodiumToad>meh. it looks like optimization is mostly moot since all the time is spent in the final (format)
<ss2>Hello!
*dsmith-work waves
<RhodiumToad>yeah. the actual (format #t ...) takes something like 95% of the runtime
<ss2>I just noticed, that when I copy a number from geiser to calc, Emacs will enter a ‘Text is read-only’ state.
<ss2>hang on, might try this again with emacs -q. :)
<ss2>yeah, it is the same there. Is this a bug?
<apteryx>RhodiumToad: ah! Thanks for checking :-)
<apteryx>perhaps an interesting read, related: https://bug-guile.gnu.narkive.com/Hd5JeB63/bug-12033-format-should-be-faster
<apteryx>Ludovic's findings there say (ice-9 format) is an order of magnitude slower than simple-format
<apteryx>and simple-format was fixed at that time to become 15% faster than 'display'.
<dsmith-work>I wonder if the recent optimizations for case/cond will improve (ice-9 format). Probably not.
<apteryx>not totally fair because I'm not starting the whole application and simply reading the data from a file before printing it, but with Python it takes ~3 s instead of ~6 s.
<apteryx>using the following script: https://paste.debian.net/1204513/
<apteryx>doing '(setvbuf (current-output-port) 'block)' reduces the time to under 5 s with Guile :-)
<apteryx>or close to 5 s at least
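A minimal sketch of the buffering change described above, assuming a Guile version where setvbuf takes a symbol as its mode (2.2 or later). The loop is only a placeholder for the real printing code.

```scheme
;; Switch the current output port to block buffering so that output is
;; flushed in large chunks rather than more frequently.
(setvbuf (current-output-port) 'block)

;; Placeholder for the real row-printing loop.
(for-each (lambda (i)
            (display i)
            (newline))
          (iota 5))

;; Make sure anything still sitting in the buffer is written out.
(force-output (current-output-port))
```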
<apteryx>I think it must be close to Python, factoring in the startup time of Guix itself.
<dsmith-work>apteryx: Guix or Guile?
<rlb>apteryx: if you know exactly what you want, and it isn't hard, I suppose you could just build the strings you want and use put-string, etc. -- see if that's faster.
<rlb>...and if you know it's all ascii, could even do something more primitive and use put-bytevector, though I hope there's not too much difference there.
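One way rlb's put-string suggestion might look, bypassing format entirely. This is only a sketch: pad-cell and write-row are hypothetical helpers, and whether it is actually faster would have to be measured.

```scheme
(use-modules (ice-9 textual-ports))   ; put-string

;; Hypothetical helper: pad STR with spaces up to WIDTH, never truncating.
(define (pad-cell str width)
  (if (< (string-length str) width)
      (string-pad-right str width)
      str))

;; Write one row to PORT, each cell padded to its column width and
;; followed by a tab.
(define (write-row port row widths)
  (for-each (lambda (cell width)
              (put-string port (pad-cell cell width))
              (put-string port "\t"))
            row widths)
  (put-string port "\n"))

(define sample
  (call-with-output-string
    (lambda (port) (write-row port '("ab" "c") '(4 2)))))
```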
<rlb>Ideally, though (or also), we'll eventually improve format.
<ArneBab>wingo: did you see the GC + Java finalization question in the mailing list?
<daviid>apteryx: can you paste your guile script 'of the above python' as well