<PurpleSym>rekado: We’re thinking about how to provide the entire CRAN repository to our Guix users. One idea was to just build an automated Guix channel, which imports the entire CRAN on a regular basis. Any thoughts?
***civodul` is now known as civodul
<civodul>PurpleSym: i have a dream (might be a nightmare actually) of importing things on the fly
<civodul>though rekado always reminds me that importers are imperfect and that this would fail in many cases
<civodul>i guess it would work for pure R packages, which is already something
<PurpleSym>Sure, that's another option, civodul. What are the pros and cons of each approach? I guess with a channel we could use the time machine to travel.
<PurpleSym>(And it's okay if it works 90% of the time.)
<civodul>the generated channel is probably easier to implement
<civodul>like there's pretty much no additional development to be done
<civodul>it's just less fancy, less "elegant"
<PurpleSym>Correct, guix import would do the heavy lifting either way.
<PurpleSym>I'm slightly worried about having multiple versions of the same package in different sources (proper+channel).
<rekado>PurpleSym: I’m doing this kind of thing with guix.install
<rekado>it uses guix import when a package isn’t available yet, adds a user module ~/.Rguix to the GUIX_PACKAGE_PATH, and then installs it from there.
<zimoun>PurpleSym: many CRAN and Bioconductor packages are already in Guix proper.
<zimoun>A massive import could work for CRAN because the metadata is more or less clean and so ‘guix import cran’ works well, IMHO.
<zimoun>rekado wrote a tiny script for listing the missing Bioconductor packages, then it is easy to feed them to ‘guix import cran -a bioconductor’.
<rekado>yes, CRAN import works pretty well, which is why I was confident enough to publish guix.install()
<rekado>I’m having a problem with the texlive-fonts-map hook: it doesn’t run even though texlive-base exists in the profile
<rekado>I’d love to see all CRAN packages in Guix
<rekado>it’s not a lot of work, but it’s tedious.
<rekado>it also makes bulk updates a little more involved, but that’s my problem
<rekado>I have the same wish for texlive: getting all those packages into Guix would be lovely
<rekado>and automatically check them all for completeness
<rekado>I think I found a bug relating to profile hooks when substituting derivations
<rekado>still need to gather details but the effect is real: one profile builds the font maps hook, the other does not.
<rekado>the manifests of these profiles are almost exactly the same (only difference is provenance info), but the output of ‘guix gc -R’ on the profile derivations is vastly different
<PurpleSym>rekado: I’ve seen guix.install, but it does not fit our workflow, I believe. (We don’t install packages using R, but using Guix itself via a manifest.scm.)
<PurpleSym>It’d be nice if we could just have all CRAN packages in Guix, but the descriptions are probably not good enough for an automated import, right?
<PurpleSym>Thus my idea to “circumvent” this quality control via a separate opt-in channel.
<rekado>or we could improve our tools to make them better automatically
<rekado>a lot of CRAN descriptions use incomplete sentences; we can detect and fix that.
<rekado>some packages come with bundled JS that needs fixing; we can detect that.
<rekado>the CRAN importer is exceptionally good, in my opinion; it wouldn’t take much to make it good enough to reduce the necessary adjustment work to zero.
<PurpleSym>Let me build a list of CRAN packages tomorrow and then we’ll see how far we are in terms of coverage.
<rekado>there are thousands of CRAN packages; I think we have about 1.5k R packages in total.
<rekado>when things are less busy for me I’d like to continue packaging it all.
<rekado>upgrades are a lot of work, though. (See r-dt.)
<PurpleSym>We have 2075 packages using r-build-system and there are 18730 packages on CRAN. Still a long way to go.
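The two figures PurpleSym quotes can be turned into a rough coverage estimate. The sketch below is only illustrative arithmetic: it assumes every one of the 2075 r-build-system packages comes from CRAN, which overstates coverage because some of them are Bioconductor packages. The function name `coverage` is made up for this example.

```python
def coverage(packaged, total):
    """Return the fraction of `total` upstream packages already packaged."""
    if total <= 0:
        raise ValueError("total must be positive")
    return packaged / total


# Figures quoted in the conversation above.
packaged = 2075      # packages using r-build-system (upper bound for CRAN ones)
cran_total = 18730   # packages on CRAN

share = coverage(packaged, cran_total)
missing = cran_total - packaged  # lower bound on what is still missing

print(f"at most {share:.1%} covered, at least {missing} packages missing")
```

Under that assumption, roughly one CRAN package in nine is already available, which matches the "long way to go" remark.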
<rekado>I think last I tried I gave up because of performance problems when stuffing them all in cran.scm :)
<PurpleSym>Regarding packaging every single package on CRAN: There’s also the question whether this is desirable at all. If we automate most of this there’ll be no sanity checks regarding the package’s contents at all. No-one is going to look at the diffs when updating, etc.
<rekado>or rather: what kind of sanity checks cannot be automated?
<PurpleSym>The Python world had all sorts of weird credential stealing malware in PyPI. Not sure how bad CRAN is in that regard.
<PurpleSym>So, simply put: Do we trust these repositories?
<rekado>CRAN has reviews on every package update AFAIK
<rekado>Bioconductor has pretty lax license declarations, which is reason enough to be vigilant
<zimoun>I think that if we roughly double the total number of packages in Guix (today 20k, all CRAN 18k), the performance of “guix pull” will be drastically slower. The same goes for “guix time-machine”.
<zimoun>I have never timed it, but at best the performance (time) is linear with the number of packages.
<zimoun>And the constant (slope) is already really poor with some hardware.
<zimoun>Therefore, I am all for trying it!
<zimoun>It would allow us to spot some issues with scaling up.
<rekado>hah, I did not expect this conclusion after the “therefore” :)
<zimoun>I cannot offer to help with this massive import (proper or channel), but I offer to benchmark the result. :-)
<rekado>I’ll dust off my importer script and see if we can get something that could be stuffed into a channel.
<PurpleSym>For benchmarking, a synthetic channel with controllable variables (like number of packages, packages per file, …) would be more meaningful.
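One way to picture the synthetic channel PurpleSym suggests is a small generator with the two knobs mentioned, number of packages and packages per file. The following is a minimal sketch, not something anyone in the conversation wrote: the module names (`synthetic part-N`), the package names (`synth-N`) and the placeholder fields are all hypothetical, and the emitted Scheme skeleton is an untested assumption about what a minimal definition needs. The definitions are only meant to be loaded, not built.

```python
import math

# Hypothetical skeleton of a placeholder package; never meant to be built.
PACKAGE_TEMPLATE = """(define-public synth-{i}
  (package
    (name "synth-{i}")
    (version "0")
    (source #f)
    (build-system trivial-build-system)
    (synopsis "Synthetic package {i}")
    (description "Placeholder package used only for benchmarking.")
    (home-page "https://example.org")
    (license #f)))
"""

MODULE_HEADER = """(define-module (synthetic part-{m})
  #:use-module (guix packages)
  #:use-module (guix build-system trivial))
"""


def synthetic_channel(n_packages, packages_per_file):
    """Return a dict mapping relative file names to Scheme module source."""
    if n_packages < 0 or packages_per_file < 1:
        raise ValueError("need n_packages >= 0 and packages_per_file >= 1")
    files = {}
    n_files = math.ceil(n_packages / packages_per_file)
    for m in range(n_files):
        start = m * packages_per_file
        stop = min(start + packages_per_file, n_packages)
        body = "\n".join(PACKAGE_TEMPLATE.format(i=i) for i in range(start, stop))
        files[f"synthetic/part-{m}.scm"] = MODULE_HEADER.format(m=m) + "\n" + body
    return files


# Example: 10 packages spread over files of at most 4 packages each.
files = synthetic_channel(10, 4)
print(sorted(files))
```

Writing the returned strings to disk and varying the two parameters would give a series of channels of different shapes to time "guix pull" against.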
<efraim>I was thinking of adding a simple checker to the gnu-build-system on core-updates that just searches the unpacked source for instances of *.min.js and spits out a warning when it sees one
<efraim>similar to the one in the python-build-system and cythonized code
<efraim>^^ in relation to minified JavaScript in CRAN packages
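The check efraim describes amounts to walking the unpacked source tree and warning about every file whose name ends in .min.js. The sketch below shows that idea in Python purely for illustration; a real build-system phase would be written in Guile, and the function names and warning text here are invented for the example.

```python
import os


def find_minified_js(source_dir):
    """Return sorted paths, relative to source_dir, of files ending in .min.js."""
    hits = []
    for root, _dirs, names in os.walk(source_dir):
        for name in names:
            if name.endswith(".min.js"):
                full = os.path.join(root, name)
                hits.append(os.path.relpath(full, source_dir))
    return sorted(hits)


def warn_minified_js(source_dir):
    """Print one warning per minified file found and return the list of hits."""
    hits = find_minified_js(source_dir)
    for path in hits:
        print(f"warning: possibly minified JavaScript: {path}")
    return hits
```

Matching on the file name alone misses minified code shipped under other names, so this would only catch the obvious cases.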