IRC channel logs

2022-10-18.log


<PurpleSym>I’m nearly done with the automated CRAN channel, and performance with an additional 17000 packages is pretty brutal: `guix pull` takes five minutes to build the channel’s derivation, `guix search` for a nonexistent package goes up from 0.6s to 1s, and `guix show` is up from 0.1s to 0.13s.
<PurpleSym>Hopefully the `guix pull` performance hit can be avoided via substitutes.
<civodul>woow
<PurpleSym>Here’s the channel: https://github.com/guix-science/guix-cran
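For context, using a channel like this one boils down to a short entry in `~/.config/guix/channels.scm`. A minimal sketch, assuming the channel is named `guix-cran` (check the repository’s README for the authoritative snippet):

```scheme
;; ~/.config/guix/channels.scm -- sketch only; the channel name
;; 'guix-cran is an assumption, the URL is from the chat above.
(cons (channel
        (name 'guix-cran)
        (url "https://github.com/guix-science/guix-cran"))
      %default-channels)
```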
<civodul>19M under guix-cran/packages :-)
<civodul>it doesn't look bad
<civodul>you could add the machinery to generate it in that repo?
<PurpleSym>I was thinking about using a separate repo for the import script so the automated and manual histories don’t mix.
<PurpleSym>But I’m standing on giants’ shoulders here. Most of the heavy lifting is done by the CRAN importer; I’m just adding the glue to tie things together.
<PurpleSym>And this is the script: https://github.com/guix-science/guix-cran-scripts
<civodul>neat
<civodul>did you set up machinery to update the channel automatically?
<civodul>if you want i can have it built on guix.bordeaux.inria.fr
<PurpleSym>No automated updates yet, no. That’s up next on my list.
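One plausible way to wire up such updates would be a GNU mcron job around the import script; everything below (the schedule, path, and script name) is a placeholder, not anything that exists in guix-cran-scripts:

```scheme
;; Hypothetical mcron job: regenerate the channel nightly and push it.
;; "update.sh" is a made-up name; the real entry point may differ.
(job "5 3 * * *"              ;every day at 03:05
     "cd /srv/guix-cran-scripts && ./update.sh && git push")
```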
<PurpleSym>Sure, if you have the capacity to build 17k packages, go ahead. I’m building them locally right now, because I’m curious how many succeed. I’ll also build them on substitutes.guix.psychnotebook.org, because that’s where I need them.
<civodul>ok
<civodul>i'll look into setting that up then
<civodul>i'm curious to see how it fares :-)
<PurpleSym>Thanks Ludo!
<PurpleSym>Hopefully this entire endeavor will silence those who claim Guix does not have enough packages 😉
<efraim>PurpleSym: is it better if the packages are split haphazardly across different modules? I found I had terrible performance when I tried to make a channel with ~8000 packages in one file
<PurpleSym>efraim: I tried a single file, which was slightly slower (maybe module compilation is accidentally quadratic somewhere?) and one file per package, which caused Guile to implode. So, this is kind of the middle ground.
<PurpleSym>(And single file means: 17k packages in a single file.)
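To make the “middle ground” concrete, here is a sketch of what one generated module might look like; the module name, grouping, and package are illustrative only (real generated modules carry full CRAN origins instead of `(source #f)`):

```scheme
;; Illustrative layout: a medium-sized batch of generated packages per
;; module, rather than one 17k-package file or 17k one-package files.
(define-module (guix-cran packages a)
  #:use-module (guix packages)
  #:use-module (guix build-system r)
  #:use-module ((guix licenses) #:prefix license:))

(define-public r-example
  (package
    (name "r-example")          ;made up for this sketch
    (version "0.0.0")
    (source #f)
    (build-system r-build-system)
    (home-page "https://cran.r-project.org/")
    (synopsis "Placeholder package")
    (description "Placeholder used to illustrate the module layout.")
    (license license:gpl3)))
```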
<zimoun>PurpleSym: awesome!
<zimoun>I am curious about the performance.
<zimoun>Interesting: yesterday evening I was playing with `perf timechart guix search foo`
<zimoun>thanks civodul for the advice about this tool (initially for guix-daemon)
<civodul>zimoun: yup, it can be hard to make sense of the output, but it can be super helpful
<rekado>GWL now hashes input files instead of hashing metadata
<zimoun>rekado: how is the performance on large FASTQ input files?
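Roughly, the idea is to key a cached result by the hash of the input files’ contents rather than by metadata such as names and timestamps. A minimal sketch using Guile-Gcrypt, not GWL’s actual implementation (`input-cache-key` is a made-up helper; `file-sha256`, `sha256`, and `bytevector->base16-string` are real exports of `(gcrypt hash)` and `(gcrypt base16)`):

```scheme
(use-modules (gcrypt hash)
             (gcrypt base16)
             (rnrs bytevectors))

(define (input-cache-key files)
  "Return a hex string that changes whenever any of FILES' contents do."
  (bytevector->base16-string
   (sha256
    (string->utf8
     (string-join
      (map (compose bytevector->base16-string file-sha256)
           (sort files string<?)))))))
```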
<zimoun>civodul: how can I know if a tarball is covered by Disarchive?
<zimoun>the old trilinos tarball disappeared (but it’s still on ci.guix)
<zimoun>if ci.guix removes it, then I would like to know if it is covered by SWH.
<civodul>zimoun: "guix lint -c archival PKG" should do that
<civodul>if it's silent, you're fine
<civodul>we should add other tools to fiddle with Disarchive, SWH, and all that
<rekado>zimoun: I haven’t tried it, but if it’s a problem we could use a different hash algorithm.
<rekado>the bigger problem so far is a lack of Guix caching.
<rekado>for the same version of Guix and the same workflow I don’t want to recompute all scripts again and again
<zimoun>rekado: on the other hand, `guix hash -S nar -H sha256 -f base64` takes ~27s on one folder of ~11GB containing many files and directories.
<zimoun>sha1 is obviously much faster. :-)
<zimoun>So it is probably not a practical concern compared to the time it takes to run a process. :-)