IRC channel logs

<dsmith>dsmith-work: Poke

<apteryx>is there a way to parallelize fold in Guile?

<apteryx>another question! Is there a way to get the current-output-port terminal's width?

<daviid>apteryx: not exactly parallel fold, but see if you can use par-map instead

<daviid>note that it's only 'worth it' if proc is a long/costly operation ... as par-map itself has a cost ...

<apteryx>I see, thank you!
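
A minimal sketch of daviid's suggestion, assuming the per-element work can be split from the combining step: run the costly part with par-map from (ice-9 threads), then fold the results sequentially. The procedure slow-square is a hypothetical stand-in for an expensive operation; as noted above, this only pays off when each call is costly.

```scheme
(use-modules (ice-9 threads)   ; par-map
             (srfi srfi-1))    ; fold

;; Hypothetical stand-in for a long/costly operation.
(define (slow-square x)
  (usleep 1000)
  (* x x))

;; The expensive part runs in parallel; the cheap combining step
;; (here, +) is still a plain sequential fold over the results.
(define total
  (fold + 0 (par-map slow-square (iota 10))))

(display total)
(newline)
```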

<apteryx>the algorithm records the lengths of words in a list of list

<apteryx>it's used for presenting tabulated data neatly

<apteryx>list of list are the table rows

<apteryx>list of lists*

<apteryx>it currently looks like this: https://paste.debian.net/1204411/

<apteryx>here with profiling data (not just for that bit, but the whole application, but it's definitely on the hot path, raising total execution time from about 2 to 12 s) https://paste.debian.net/1204412/

<apteryx>hmm, perhaps I'm reading that profiling data wrong, it's not clear that it's on the hot path from it.

***sneek_ is now known as sneek

<apteryx>Yet another question; how can I create a string that repeats "~a\t" for N times?

<apteryx>I came up with this: https://paste.debian.net/1204414/, although it feels a far cry from Python's number_of_columns * "~a\t"

<lampilelo>why do you need this format string? can't you write a function that will create a string from real data?

<flatwhatson>what about: (string-join (map display ls) "\t")

<flatwhatson>otherwise: (apply string-append (make-list n "~a\t"))

***sneek_ is now known as sneek

<chrislck>apteryx: you'll probably want to upgrade from map/make-list/length eventually as far as possible. consider (map fn1 (map fn2 lst)) gets slow if lst is large -- it creates intermediate list. better: (map (compose fn1 fn2) lst)

<apteryx>sneek: later tell lampilo the format string is used to pretty print table columns so they all appear aligned

<sneek>Got it.

<apteryx>flatwhatson: (apply string-append (make-list n "~a\t")) that's much better than what I had, thank you.

<apteryx>chrislck: the list of rows needs to be scanned at least twice; once to discover the maximum column widths, and a second time to print the rows

<apteryx>I've tried moving the drop-right into the fold kons, but it slowed by 1 sec in my tests. Strange.

<apteryx>I'm starting to think that simple, \t separated yet jagged columns are not that bad. It takes less screen real estate and works reliably with 'cut'.

<chrislck>(string-concatenate (make-list n "~a\t"))

<apteryx>even better!

<apteryx>thanks :-)
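
The two approaches suggested above can be sketched as follows. The helper names make-row-format and row->string are hypothetical. Note that display returns an unspecified value rather than a string, so the string-join variant here converts each item with format instead.

```scheme
;; Build a format string that repeats "~a\t" once per column.
(define (make-row-format n)
  (string-concatenate (make-list n "~a\t")))

;; Alternative: build the row directly from the data, without an
;; intermediate format string.
(define (row->string items)
  (string-join (map (lambda (x) (format #f "~a" x)) items) "\t"))

(display (apply format #f (make-row-format 3) '("a" "b" "c")))
(newline)
(display (row->string '("a" 1 "c")))
(newline)
```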

<chrislck>tip: building a long string via string-append and string-concatenate isn't that clever -- it allocates numerous strings, and needs to allocate a big string at the end. check the source for string-replace-substring in guile sources to see what wingo recommends :)

<apteryx>OK!

<chrislck>hint: (with-output-to-string (lambda () (let lp ((idx 100)) (unless (zero? idx) (display "a") (lp (1- idx)))))) *may* be surprisingly faster than (string-concatenate (make-list 100 "a"))...

<apteryx>interesting

<chrislck>ok maybe not: with-output-to-string with 4x10^6 elements does 1.813090s real time, 1.811567s run time. 0.000000s spent in GC.

<dsmith-work>Thursday Greetings, Guilers

<chrislck>whereas string-concatenate scores: 0.509411s real time, 0.562900s run time. 0.349726s spent in GC.

<chrislck>see, much more gc

<dsmith-work>apteryx: Looks like you are using format? Have you looked at ~{ ~} for iteration?
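
A small sketch of the ~{ ~} iteration directive dsmith-work mentions, assuming (ice-9 format) is loaded (the core format does not support it). The sample list is made up for illustration.

```scheme
(use-modules (ice-9 format))

;; ~{ ... ~} repeats its body for each element of the list argument.
(define with-trailing-tab
  (format #f "~{~a\t~}" '("a" "b" "c")))

;; ~^ ends the iteration early when no elements remain, which avoids
;; the trailing separator.
(define without-trailing-tab
  (format #f "~{~a~^\t~}" '("a" "b" "c")))

(write with-trailing-tab)
(newline)
(write without-trailing-tab)
(newline)
```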

<RhodiumToad>how much of that GCing is attributable to make-list?

<apteryx>I've simplified it a bit, it now looks like: https://paste.debian.net/1204496/

<apteryx>rows is a very long list (17000 entries about) of few items (4 columns)

<apteryx>the only costly operations (due to the large 'rows' list) should be: (map (cut drop-right <> 1) rows) as well as the fold, and finally the formatting of each row (the full list is iterated thrice). Interestingly moving the drop-right inside the fold doesn't improve things.

<RhodiumToad>thrice?

<RhodiumToad>oh I see

<RhodiumToad>that (map (cut drop-right ...) rows) seems like useless overhead

<RhodiumToad>why not just calculate all the column widths and leave the last one off?

<RhodiumToad>this function is doing a lot of consing

<apteryx>RhodiumToad: I tried computing column-widths for all columns (including the last one), and simply excluding the last column at the time of computing column-formats; this caused about 1.7 s extra to be used. Any optimization I try seems to make things worse, eh.

<apteryx>perhaps what I have is already close to the true cost of using format on a 17000 something list and can't be made much faster.

<RhodiumToad>bets?

<apteryx>hehe

<apteryx>I'd be happy to be proven otherwise :-)

<RhodiumToad>working on it

*apteryx is thrilled

<RhodiumToad>I assume the return value is not interesting?

<apteryx>it isn't!

<apteryx>(I'm now using for-each)

<RhodiumToad>the final (map) in your version is incorrect anyway

<RhodiumToad>interesting. for-each to repeatedly call format for each row is faster than using a ~:{...~} specifier within format itself

<apteryx>that's surprising

<apteryx>what was incorrect in the last map of my version? I fail to see it.

<RhodiumToad>should have been for-each or map-in-order, since you're relying on the side effect of (format) and not its result

<RhodiumToad>map doesn't guarantee order of evaluation
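
To illustrate RhodiumToad's point: the Scheme standard leaves unspecified the order in which map applies its procedure, while for-each is guaranteed to go left to right. A minimal sketch with made-up rows:

```scheme
(define rows '(("foo" 1) ("bar" 2)))

;; for-each visits the rows in order and discards the results,
;; which is what printing for side effect needs.
(define (print-rows rows)
  (for-each (lambda (row)
              (apply format #t "~a\t~a\n" row))
            rows))

(print-rows rows)
```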

<RhodiumToad>do you care what happens if there are no columns?

<apteryx>I don't, as there wouldn't be anything to print

<daviid>if you only use "~a\t", you can use simple-format (and see if it is faster too, I don't know)

<apteryx>it doesn't support padding I think (as in ~46a\t)

<apteryx>which in the manual is called the minwidth parameter
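
A rough sketch of the two-pass approach apteryx describes: one pass to find the maximum width of each column, a second to print each row with a minwidth parameter. This is not the code from the pastes. The names column-widths, row-format, and print-table and the sample rows are assumptions, and unlike the original it pads every column, including the last.

```scheme
(use-modules (ice-9 format)   ; needed for the minwidth parameter (~5a)
             (srfi srfi-1))   ; fold

(define rows '(("name" "version") ("guile" "3.0.7") ("emacs" "27.2")))

;; Pass 1: the maximum string length seen in each column.
(define (column-widths rows)
  (fold (lambda (row widths)
          (map max (map string-length row) widths))
        (map (const 0) (car rows))
        rows))

;; Turn widths such as (5 7) into a format string such as "~5a ~7a\n".
(define (row-format widths)
  (string-append
   (string-join (map (lambda (w)
                       (string-append "~" (number->string w) "a"))
                     widths)
                " ")
   "\n"))

;; Pass 2: print every row with the shared format string.
(define (print-table rows)
  (let ((fmt (row-format (column-widths rows))))
    (for-each (lambda (row) (apply format #t fmt row)) rows)))

(print-table rows)
```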

<RhodiumToad>meh. it looks like optimization is mostly moot since all the time is spent in the final (format)

<ss2>Hello!

*dsmith-work waves

<RhodiumToad>yeah. the actual (format #t ...) takes something like 95% of the runtime

<ss2>I just noticed, that when I copy a number from geiser to calc, Emacs will enter a ‘Text is read-only’ state.

<ss2>hangon, might try this again with emacs -q. :)

<ss2>yeah, it is the same there. Is this a bug?

<apteryx>RhodiumToad: ah! Thanks for checking :-)

<apteryx>perhaps an interesting read, related: https://bug-guile.gnu.narkive.com/Hd5JeB63/bug-12033-format-should-be-faster

<apteryx>Ludovic's findings there say (ice-9 format) is an order of magnitude slower than simple-format

<apteryx>and simple-format was fixed at that time to become 15% faster than 'display'.

<dsmith-work>I wonder if the recent optimizations for case/cond will improve (ice-9 format). Probably not.

<apteryx>not totally fair because I'm not starting the whole application and simply reading the data from a file before printing it, but with Python it takes ~3 s instead of ~6 s.

<apteryx>using the following script: https://paste.debian.net/1204513/

<apteryx>doing '(setvbuf (current-output-port) 'block)' reduces the time to under 5 s with Guile :-)

<apteryx>or close to 5 s at least
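
A minimal sketch of the buffering change apteryx mentions, assuming a Guile version where setvbuf accepts the symbols 'none, 'line, and 'block. The procedure print-rows and its data are hypothetical.

```scheme
(define (print-rows rows)
  (for-each (lambda (row)
              (format #t "~a\t~a\n" (car row) (cadr row)))
            rows))

;; Switch standard output to block buffering so that many small
;; writes are batched instead of being flushed line by line.
(setvbuf (current-output-port) 'block)

(print-rows '(("guile" "3.0.7") ("emacs" "27.2")))

;; Flush explicitly once the output is complete.
(force-output (current-output-port))
```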

<apteryx>I think it must be close to Python, factoring in the startup time of Guix itself.

<dsmith-work>apteryx: Guix or Guile?

<rlb>apteryx: if you know exactly what you want, and it isn't hard, I suppose you could just build the strings you want and use put-string, etc. -- see if that's faster.

<rlb>...and if you know it's all ascii, could even do something more primitive and use put-bytevector, though I hope there's not too much difference there.

<rlb>Ideally, though (or also), we'll eventually improve format.
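
One way rlb's suggestion could look, as a sketch only: pad each cell with string-pad-right and write it with put-string from (ice-9 textual-ports), bypassing format entirely. The procedure put-row is a hypothetical name, and whether this is actually faster would need measuring. Note that string-pad-right truncates cells longer than the given width.

```scheme
(use-modules (ice-9 textual-ports))   ; put-string, put-char

;; Write one row to PORT, padding each cell to its column width
;; and following it with a single space.
(define (put-row port row widths)
  (for-each (lambda (cell width)
              (put-string port (string-pad-right cell width))
              (put-char port #\space))
            row widths)
  (newline port))

(put-row (current-output-port) '("guile" "3.0.7") '(5 7))
```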

<ArneBab>wingo: did you see the GC + Java finalization question in the mailing list?

<daviid>apteryx: can you paste your guile script 'of the above python' as well

2021-07-15.log