IRC channel logs

2023-10-26.log

<zimoun_>rekado: about “guix gc”, I often do “guix gc -D $(guix gc --list-dead | grep tensorflow)”
<zimoun_>What seems to be missing is some --dry-run option, maybe
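[Editor's sketch: one possible way to get a dry run for the command above, written as a small Python wrapper. It only uses "guix gc --list-dead" and "guix gc -D", both of which appear in the log. The function names, the "paths" parameter and the "dry_run" flag are hypothetical and are not part of Guix. As noted right below, "guix gc" may remove more than what is asked for, so this previews only the explicit deletions.]

```python
import re
import shlex
import subprocess


def list_dead_paths():
    """Return the store items printed by `guix gc --list-dead` (needs Guix installed)."""
    result = subprocess.run(
        ["guix", "gc", "--list-dead"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


def select_paths(paths, pattern):
    """Keep only the paths matching the regular expression, like `grep` would."""
    return [p for p in paths if re.search(pattern, p)]


def delete_command(paths):
    """Build the `guix gc -D` command line for the given store items."""
    return ["guix", "gc", "-D", *paths]


def gc_matching(pattern, dry_run=True, paths=None):
    """Delete dead store items matching PATTERN, or just print the command.

    PATHS is a hypothetical hook for supplying the candidate list directly;
    when it is None the list comes from `guix gc --list-dead`.
    """
    candidates = select_paths(list_dead_paths() if paths is None else paths, pattern)
    if not candidates:
        return []
    command = delete_command(candidates)
    if dry_run:
        # Dry run: show what would be executed, change nothing.
        print(shlex.join(command))
    else:
        subprocess.run(command, check=True)
    return candidates
```

[Calling gc_matching("tensorflow") prints the command that would run; passing dry_run=False actually runs it.]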
<rekado>zimoun_: “guix gc” will not only do what you ask it to do
<rekado>it will also *always* delete bad symlinks
<rekado>I’ll find the relevant location in the code in a few mins
<rekado>zimoun_: do you know protocols.io?
<rekado>“Although Guix has some exciting functionality, there remains a relative lack of step-by-step guides and tutorials, illustrated by the complete absence of published Guix protocols in protocols.io”
<rekado>from here: https://academic.oup.com/bib/article/24/6/bbad375/7326135
<rekado>re guix gc: actually, there is more than one place in nix/libstore/gc.cc where files are removed indiscriminately
<rekado>when guix-daemon runs on an isolated system (without read-access to users’ home directories or other cluster file systems) then “guix gc” will always remove more links than it should
<rekado>it will thus always consider more things as garbage than it should
<zimoun_>aah I do not know about “without read-access to users’ home directories”
<zimoun_>Well, as you said, something is missing then. :-)
<zimoun_>rekado: about protocols.io, no I did not know before this article.
<zimoun_>I know Pierre Poulain, an SWH ambassador; Pierre works at the same univ. as me, and we meet here and there. :-)
<zimoun_>Well, if you want to know more details about the paper, have a look at the repo https://github.com/markziemann/5pillars
<rekado>for me this is the biggest open issue with a shared Guix installation on HPC
<rekado>because it requires cooperation from users to not ever create profiles in locations that are inaccessible to the daemon’s server.
<rekado>this is a deployment detail that I would rather not burden my users with.
<rekado>re protocols.io: I always find it weird when papers refer to arbitrary commercial platforms as a signifier of *anything*
<rekado>“they are not on discord, so the project is hostile to new users”
<rekado>“there is no official image on dockerhub”
<rekado>is protocols.io so popular among a wide variety of researchers that the absence of Guix on this platform can really mean anything?
<zimoun_>I do not know. I was not aware about protocols.io so it is hard for me to judge if it is popular. ;-)
<zimoun_>rekado: have you noticed that emacs-ess is broken since Emacs 29? Because of emacs-julia-mode (I am working on it :-))
<rekado>here’s an example of a protocol mentioning Conda: https://www.protocols.io/view/installation-instructions-for-rna-seq-analysis-usi-n2bvjye9nvk5/v1
<zimoun_>I mean, not many Emacs users at MDC?
<rekado>it doesn’t look like it would be worth putting a Guix-based recipe there.
<rekado>zimoun_: sadly, nobody here uses Emacs with ESS
<rekado>they all use RStudio, almost exclusively
<rekado>(many don’t seem to know that you don’t actually need RStudio to use R.)
<zimoun_>Ahah! I see.
<rekado>I think this is the only instance of unlink that really hurts me: https://git.savannah.gnu.org/cgit/guix.git/tree/nix/libstore/gc.cc#n295
<civodul>how can unlink hurt you?
<rekado>here’s an example: https://elephly.net/paste/1698323184.html
<rekado>this is a chunk of a listing of /var/guix/gcroots/auto/ on the server that hosts the guix-daemon
<rekado>on that server /home and /fast don’t exist
<civodul>warranty void :-)
<civodul>i think in https://guix.gnu.org/cookbook/en/html_node/Installing-Guix-on-a-Cluster.html we recommend mounting /home on the head node
<civodul>which is arguably not great, but that’s the only thing that can guarantee that GC roots are visible
<rekado>if they *did* exist the daemon would not be able to read them anyway because root has no power on remote file systems
<civodul>no_root_squash?
<rekado>no
<civodul>currently one has to do that for GC roots to be visible, and i agree it kinda sucks
<civodul>(something to discuss during the panel?)
<rekado>that’s on purpose to prevent an exploited system anywhere on the network from reading user data
<civodul>yeah
<rekado>if I could protect these locations and treat them as if they were alive and valid roots then I could at least still use “guix gc” and clean up garbage that we all agree to be garbage.
<civodul>one way to improve on that would be to have ‘guix shell’ and ‘time-machine’ store GC roots under /var/guix instead of $HOME/.cache
<rekado>but the current situation is that “guix gc” deletes these auto-roots and then removes whatever they may have pointed to
<civodul>yes, that’s a real problem
<civodul>i think fixing shell and time-machine would help, because that would probably cover most use cases
<rekado>it would help with the $HOME/.cache links, but can’t fix the /fast/home links I’ve got (which is a cluster file system only available on cluster nodes)
<civodul>then maybe we could think of something fancier, like a hook so that ‘guix gc --list-live’ can query other machines
<rekado>it seems wrong to me that “readTempRoots” deletes things (because I believe its name), but I must admit that I haven’t fully understood what that procedure does exactly
<efraim>Best I can think of right now is using root to list all of the guix profiles on a machine and send that back to the machine holding all the derivations
<rekado>many of our directories are only mounted on demand
<rekado>so I’d have to impersonate all users to get these locations to be mounted
<civodul>rekado: how about a hook whereby ‘guix gc --list-roots’ and/or guix-daemon would, say, concatenate a user-provided list to the roots?
<civodul>like, somehow, you’d gather root targets on compute node, write it in a file, and send that file to the head node
<rekado>not sure if I can do this in practice but perhaps it would be enough to have a way to tell the daemon to ignore links that are under a certain prefix
<civodul>yeah, let’s think of a convenient way you could tell the daemon on the head node about GC roots registered on compute nodes
<civodul>it wouldn’t be hard to accommodate that
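[Editor's sketch: an illustration of the "ignore links under a certain prefix" idea discussed above, as a standalone Python script that only inspects symlinks. It is not how guix-daemon or nix/libstore/gc.cc works. The function names and the three labels ("live", "protected", "dangling") are hypothetical, and the prefixes are whatever the admin considers unreachable from the head node, for example /home or /fast.]

```python
import os
from pathlib import Path


def is_under(path, prefix):
    """True when PATH equals PREFIX or lies below it (purely lexical check)."""
    path = os.path.normpath(path)
    prefix = os.path.normpath(prefix)
    return path == prefix or path.startswith(prefix.rstrip(os.sep) + os.sep)


def classify_roots(gcroots_dir, protected_prefixes):
    """Map each symlink in GCROOTS_DIR to 'protected', 'live' or 'dangling'.

    A link whose target is under one of PROTECTED_PREFIXES is reported as
    'protected' without touching the file system, since the machine running
    this may be unable to see that location at all.
    """
    result = {}
    for entry in sorted(Path(gcroots_dir).iterdir()):
        if not entry.is_symlink():
            continue
        # Relative targets are interpreted relative to the directory itself;
        # os.path.join leaves absolute targets unchanged.
        target = os.path.join(gcroots_dir, os.readlink(entry))
        if any(is_under(target, prefix) for prefix in protected_prefixes):
            result[entry.name] = "protected"
        elif os.path.exists(target):
            result[entry.name] = "live"
        else:
            result[entry.name] = "dangling"
    return result
```

[With something like classify_roots("/var/guix/gcroots/auto", ["/home", "/fast"]), only the links reported as "dangling" would be candidates for removal. Links pointing into the unreachable file systems would be left alone.]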
<rekado>I think it’s time to extend Snakemake with support for Guix
<rekado>Snakemake has first-class support for Conda. We could submit a pull request to add first-class support for Guix, assuming that Guix is available wherever the workflow runs.
<rekado>there’s a feature request for this: https://github.com/snakemake/snakemake/issues/139
<alxsim>+1 for this
<rekado>ACTION just started but is a Python noob
<alxsim>so there are apparently plugin features pending that might make things easier to integrate, maybe? https://github.com/snakemake/snakemake/pull/2453#issuecomment-1748238026
<rekado>thanks for the pointer
<rekado>I’ve found almost all places that need changing
<rekado>here’s my WIP: https://github.com/BIMSBbioinfo/snakemake/commit/dc73a936163276c2fd576e74a3dc6e21d0db99c7
<alxsim>rekado: I'd love to help but don't have a lot of time at the moment, but happy to test stuff later next month
<civodul>oh, Guix support in Snakemake, neat