IRC channel logs

2022-07-06.log

<zimoun>hi!
<efraim>hi!
<rekado_>I’d like to make a new GWL release this week.
<rekado_>just a maintenance release, no big new features
<rekado_>if you have some bug that you’d like to see fixed before the release, please write to gwl-devel@gnu.org
<zimoun>rekado_: about teams, you are asking people to register but some do not have commit access to do so. Like me. ;-)
<rekado_>I prefer to have these people send patches.
<rekado_>I don’t want to add people all by myself.
<rekado_>a patch sent from their email address shows explicit consent
<rekado_>I don’t want to have to deal with people contesting their addition to that file in the future.
<zimoun>I understand, it makes sense. So I will do. :-)
<rekado_>thanks!
<rekado_>a secondary reason to ask for people to make the change by themselves (for those with commit access) is that I want to avoid the impression of “being in charge”. This is a collaborative effort — especially how to organize —, and I don’t want to turn into some sort of manager :)
<zimoun>yeah, I agree.
<rekado_>zimoun: I’m updating mumi again. Fixed a bug with the msgid search…
<zimoun>rekado_, cool! Ah I get «Resource not found: http://localhost:1234/msgid/HAv_FWZRa_dCADY» so I thought the feature was not fully deployed yet. Is it?
<rekado_>reconfiguring now
*rekado_ packages torchvision now
*zimoun says rekado_ knows how to keep busy. ;-)
<rekado_>deployed it but it doesn’t find issues with message id
<rekado_>maybe it didn’t index the data files correctly
<rekado_>gotta check this later
<rekado_>oh my… look at this: https://github.com/sdparekh/zUMIs
<rekado_>“zUMIs now comes with its own miniconda environment, so you do not need to deal with dependencies or installations”
<rekado_>but there’s no conda environment declaration… and then you see zUMIs-miniconda.partaa, zUMIs-miniconda.partab … down to zUMIs-miniconda.partah
<rekado_>each file (but the last) is about 100MB in size.
<rekado_>this *is* the conda environment
<zimoun>all from “2 years ago”; the environment is fixed, no? ;-)
<rekado_>I noped right out of there.
<rekado_>in cases like this I exercise my right to say ‘no’
<zimoun>zUMIs-miniconda.partaa: bzip2 compressed data, block size = 900k
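Presumably those parts are just meant to be reassembled and unpacked, roughly along these lines (assuming they form a single split bzip2 tarball; the actual zUMIs setup may do this differently):

    cat zUMIs-miniconda.parta? | tar xjf -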
<zimoun>well, I hope that RNAseq pipeline is not used in production
<rekado_>I’m surprised to see how quickly the mumi indexer runs on berlin.
<rekado_>suspicious
<rekado_>looks like it prints all progress updates first … and then takes its sweet time to update the xapian database
<rekado_>oh well
<kennyballou>how might I take a GWL workflow to a guix-less cluster? Let's say I've defined a workflow using GWL but want/need to run it on a set of machines which don't have Guix? Is my only option as of now to export relocatable packs of the environment and transfer the result? Furthermore, I would also need to develop separate SBATCH scripts (assuming SLURM)? I think I have the answer, but I'm curious if there's any other cool features I'm
<kennyballou>not aware of
<rekado_>kennyballou: yes, we currently don’t have ‘guix workflow pack’
<rekado_>we’ve got a lot of parts that are needed to implement it
<rekado_>but someone needs to tie them all together
<kennyballou>rekado_: could you point them out? I probably can't justify it yet, but I would be curious to see if I could in some down cycles, so to speak
<rekado_>haven’t looked yet
<rekado_>but much of it is in ‘guix pack’
<kennyballou>okay, thank you
<rekado_>and ‘guix workflow run --prepare’, which builds all the workflow scripts
<rekado_>some months ago I changed the way these scripts work, so they should not depend on Guix at submission time
<rekado_>(they take arguments that the GWL or anyone else could provide)
<kennyballou>you mean the generated scm files from `--prepare`?
<rekado_>yes
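Pending a real ‘guix workflow pack’, a manual approximation of the above might look roughly like this (the package names and workflow file name are placeholders, and the exact GWL invocation may differ):

    guix pack -RR --format=tarball samtools bwa    # relocatable pack of the workflow's tools
    guix workflow run --prepare my-workflow.w      # generate the job scripts without running them
    # copy the pack and the generated scripts to the cluster, then submit (e.g. with sbatch)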
<rekado_>zimoun: figured out why the msgid stuff doesn’t work. It’s the query string parser.
<rekado_>zimoun: it chops up the message id (separating it at non-word characters) instead of taking it verbatim
*rekado_ updates python-pytorch
<rekado_>currently fails somewhere around 1031/2273 because xnn needs to be updated
<zimoun>rekado_: about GWL and pack, do you have in mind one big pack containing all the tools, or multiple packs (one per process, say)?
<zimoun>Is the --prepare output a driving script calling all the process scripts?
<zimoun>rekado_: thanks for working on mumi.
<rekado_>zimoun: the /msgid route and the msgid: search should now work.
<rekado_>I tried with a few msgids and it worked fine
<rekado_>could be that some msgids won’t work because of URL encoding problems
<rekado_>if you find one please let me know and I’ll try to fix them
<rekado_>re GWL pack: I don’t know yet.
<rekado_>I’m not sure what would be best.
<rekado_>it would certainly be convenient to have a big pack containing all tools and scripts.
<rekado_>like “guix pack” but with a little extra to run the workflow
<rekado_>zimoun: example for /msgid route: https://issues.guix.gnu.org/msgid/20220531160916.21508-1-ludo@gnu.org
<rekado_>leads to https://issues.guix.gnu.org/issue/55499#msgid-44a52748e5d5011b74a081cbf502f3495955b691
<rekado_>in the xapian database we store not the msgids directly but only their hashes in base16 format
<rekado_>the hashes are used as HTML anchors, because they are predictable and don’t contain invalid characters.
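For illustration, if the anchor were simply the SHA-1 of the raw message id rendered in base16 (consistent with the 40-hex-character anchor above, though this is an assumption about mumi's exact hashing), it could be reproduced with:

    printf '%s' '20220531160916.21508-1-ludo@gnu.org' | sha1sum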
<zimoun>it seems to work, https://issues.guix.gnu.org/msgid/20220706180537.406140-1-zimon.toutoune@gmail.com
<zimoun>cool!
<zimoun>the redirection is really helpful!
<zimoun>re GWL pack: on some clusters, they run one pack per tool; in Snakemake terms, one container per rule.
<zimoun>This way, it is possible to easily change a tool (version, compilation option, etc.) without rebuilding a big container.
<zimoun>docker-compose could help there, but I do not know about singularity-compose
<rekado_>having more packs can be useful, but one problem is that you end up with a lot of duplication
<rekado_>currently the GWL still assumes that a shared file system is used
<rekado_>that’s not necessarily the case, especially not in the use case that containers are supposed to address: execution on rented VMs (aka cloud)
<zimoun>I don't see if or how the orchestration script is exported with --prepare. Because when running GWL on top of Guix, it is somehow Guix that orchestrates the processes, right?
<rekado_>don’t know.
<rekado_>it’s the GWL’s main that orchestrates things
<rekado_>the idea of GWL pack is that one doesn’t need the Guix daemon or a /gnu/store
<rekado_>but one would still need the GWL itself – or a part of it – to run the workflow
<rekado_>to submit it with DRMAA, etc
<zimoun>About the /gnu/store, I see: an archive with the closure, whatever the format (tar, Docker, Singularity, etc.)
<zimoun>About GWL, I don't see how it could work without Guix and the guix-daemon.
<zimoun>All the processes can be exported to a script and a pack can provide the tools.
<zimoun>But that still leaves the process orchestrator, which is somehow Guix.
<rekado_>the orchestrator happens to use Guix as a library, but it doesn’t need a Guix installation at runtime.
<rekado_>(this would only work by bypassing re-computation of derivations with a cache)
<rekado_>I want to build a cache anyway to avoid recomputation of job scripts when nothing has changed (same version of Guix channels, same workflow files)
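As an illustrative sketch of such a cache key (the workflow file name is a placeholder): hashing the pinned channel description together with the workflow file gives a key that changes only when either changes:

    guix describe --format=channels > channels.scm
    cat channels.scm my-workflow.w | sha256sum    # key under which the generated job scripts could be cached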