IRC channel logs

2022-07-06.log

<zimoun>hi!
<efraim>hi!
<rekado_>I’d like to make a new GWL release this week.
<rekado_>just a maintenance release, no big new features
<rekado_>if you have some bug that you’d like to see fixed before the release, please write to gwl-devel@gnu.org
<zimoun>rekado_: about teams, you are asking people to register but some do not have commit access to do so. Like me. ;-)
<rekado_>I prefer to have these people send patches.
<rekado_>I don’t want to add people all by myself.
<rekado_>a patch sent from their email address shows explicit consent
<rekado_>I don’t want to have to deal with people contesting their addition to that file in the future.
<zimoun>I understand, it makes sense. So I will do. :-)
<rekado_>thanks!
<rekado_>a secondary reason to ask for people to make the change by themselves (for those with commit access) is that I want to avoid the impression of “being in charge”. This is a collaborative effort — especially how to organize —, and I don’t want to turn into some sort of manager :)
<zimoun>yeah, I agree.
<rekado_>zimoun: I’m updating mumi again. Fixed a bug with the msgid search…
<zimoun>rekado_, cool! Ah I get «Resource not found: http://localhost:1234/msgid/HAv_FWZRa_dCADY» so I thought the feature was not fully deployed yet. Is it?
<rekado_>reconfiguring now
*rekado_ packages torchvision now
*zimoun says rekado_ knows how to keep busy. ;-)
<rekado_>deployed it but it doesn’t find issues with message id
<rekado_>maybe it didn’t index the data files correctly
<rekado_>gotta check this later
<rekado_>oh my… look at this: https://github.com/sdparekh/zUMIs
<rekado_>“zUMIs now comes with its own miniconda environment, so you do not need to deal with dependencies or installations”
<rekado_>but there’s no conda environment declaration… and then you see zUMIs-miniconda.partaa, zUMIs-miniconda.partab … down to zUMIs-miniconda.partah
<rekado_>each file (but the last) is about 100MB in size.
<rekado_>this *is* the conda environment
<zimoun>all from “2 years ago”; the environment is fixed, no? ;-)
<rekado_>I noped right out of there.
<rekado_>in cases like this I exercise my right to say ‘no’
<zimoun>zUMIs-miniconda.partaa: bzip2 compressed data, block size = 900k
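Presumably those parts are just meant to be reassembled and unpacked, roughly along these lines (assuming they form a single split bzip2 tarball; the actual zUMIs setup may do this differently):

    cat zUMIs-miniconda.parta? | tar xjf -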
<zimoun>well, I hope that RNAseq pipeline is not used in production
<rekado_>I’m surprised to see how quickly the mumi indexer runs on berlin.
<rekado_>suspicious
<rekado_>looks like it prints all progress updates first … and then takes its sweet time to update the xapian database
<rekado_>oh well
<kennyballou>how might I take a GWL workflow to a guix-less cluster? Let's say I've defined a workflow using GWL but want/need to run it on a set of machines which don't have Guix? Is my only option as of now to export relocatable packs of the environment and transfer the result? Furthermore, I would also need to develop separate SBATCH scripts (assuming SLURM)? I think I have the answer, but I'm curious if there's any other cool features I'm
<kennyballou>not aware of
<rekado_>kennyballou: yes, we currently don’t have ‘guix workflow pack’
<rekado_>we’ve got a lot of parts that are needed to implement it
<rekado_>but someone needs to tie them all together
<kennyballou>rekado_: could you point them out? I probably can't justify it yet, but I would be curious to see if I could in some down cycles, so to speak
<rekado_>haven’t looked yet
<rekado_>but much of it is in ‘guix pack’
<kennyballou>okay, thank you
<rekado_>and ‘guix workflow run --prepare’, which builds all the workflow scripts
<rekado_>some months ago I changed the way these scripts work, so they should not depend on Guix at submission time
<rekado_>(they take arguments that the GWL or anyone else could provide)
<kennyballou>you mean the generated scm files from `--prepare`?
<rekado_>yes
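Pending a real ‘guix workflow pack’, a manual approximation of the above might look roughly like this (the package names and workflow file name are placeholders, and the exact GWL invocation may differ):

    guix pack -RR --format=tarball samtools bwa    # relocatable pack of the workflow's tools
    guix workflow run --prepare my-workflow.w      # generate the job scripts without running them
    # copy the pack and the generated scripts to the cluster, then submit (e.g. with sbatch)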
<rekado_>zimoun: figured out why the msgid stuff doesn’t work. It’s the query string parser.
<rekado_>zimoun: it chops up the message id (separating it at non-word characters) instead of taking it verbatim
*rekado_ updates python-pytorch
<rekado_>currently fails somewhere around 1031/2273 because xnn needs to be updated
<zimoun>rekado_: about GWL and pack, do you have in mind one big pack containing all the tools, or multiple packs (one per process, say)?
<zimoun>Is the --prepare output a driving script calling all the process scripts?
<zimoun>rekado_: thanks for working on mumi.
<rekado_>zimoun: the /msgid route and the msgid: search should now work.
<rekado_>I tried with a few msgids and it worked fine
<rekado_>could be that some msgids won’t work because of URL encoding problems
<rekado_>if you find one please let me know and I’ll try to fix them
<rekado_>re GWL pack: I don’t know yet.
<rekado_>I’m not sure what would be best.
<rekado_>it would certainly be convenient to have a big pack containing all tools and scripts.
<rekado_>like “guix pack” but with a little extra to run the workflow
<rekado_>zimoun: example for /msgid route: https://issues.guix.gnu.org/msgid/20220531160916.21508-1-ludo@gnu.org
<rekado_>leads to https://issues.guix.gnu.org/issue/55499#msgid-44a52748e5d5011b74a081cbf502f3495955b691
<rekado_>in the xapian database we store not the msgids directly but only their hashes in base16 format
<rekado_>the hashes are used as HTML anchors, because they are predictable and don’t contain invalid characters.
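For illustration, if the anchor were simply the SHA-1 of the raw message id rendered in base16 (consistent with the 40-hex-character anchor above, though this is an assumption about mumi's exact hashing), it could be reproduced with:

    printf '%s' '20220531160916.21508-1-ludo@gnu.org' | sha1sum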
<zimoun>it seems to work, https://issues.guix.gnu.org/msgid/20220706180537.406140-1-zimon.toutoune@gmail.com
<zimoun>cool!
<zimoun>the redirection is really helpful!
<zimoun>re GWL pack: on some clusters, they run one pack per tool; in Snakemake terms, one container per rule.
<zimoun>This way, it is possible to easily change a tool (version, compilation option, etc.) without rebuilding a big container.
<zimoun>docker-compose could help there, but I do not know about singularity-compose
<rekado_>having more packs can be useful, but one problem is that you end up with a lot of duplication
<rekado_>currently the GWL still assumes that a shared file system is used
<rekado_>that’s not necessarily the case, especially not in the use case that containers are supposed to address: execution on rented VMs (aka cloud)
<zimoun>I don't see if or how the orchestration script is exported with --prepare. Because when running GWL on top of Guix, it is somehow Guix that orchestrates the processes, right?
<rekado_>don’t know.
<rekado_>it’s the GWL’s main that orchestrates things
<rekado_>the idea of GWL pack is that one doesn’t need the Guix daemon or a /gnu/store
<rekado_>but one would still need the GWL itself – or a part of it – to run the workflow
<rekado_>to submit it with DRMAA, etc
<zimoun>About the /gnu/store, I see: an archive with the closure, whatever the format (tar, Docker, Singularity, etc.)
<zimoun>About GWL, I don't see how it could work without Guix and the guix-daemon.
<zimoun>All the processes can be exported to a script and a pack can provide the tools.
<zimoun>But that still leaves the process orchestrator, which is somehow Guix.
<rekado_>the orchestrator happens to use Guix as a library, but it doesn’t need a Guix installation at runtime.
<rekado_>(this would only work by bypassing re-computation of derivations with a cache)
<rekado_>I want to build a cache anyway to avoid recomputation of job scripts when nothing has changed (same version of Guix channels, same workflow files)
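As an illustrative sketch of such a cache key (the workflow file name is a placeholder): hashing the pinned channel description together with the workflow file gives a key that changes only when either changes:

    guix describe --format=channels > channels.scm
    cat channels.scm my-workflow.w | sha256sum    # key under which the generated job scripts could be cached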