***tschwing_ is now known as tschwinge
<rekado_>the guix-daemon is so slow when creating a new profile symlink tree over NFS.
<rekado_>I'm considering just moving /gnu back to the local machine and exporting that as a read-only filesystem over NFS, so that at least the write operations are fast again.
<rekado_>this is probably the last big obstacle to letting users manage their profiles regularly via the management node.
<rekado_>e.g. 10+ minutes to install vim into a profile.
<rekado_>the daemon is issuing a lot of lstat, chmod, link, readlink, rename calls when building a new profile version; over NFS this seems to take forever with the number of files that are involved.
<civodul>rekado_: the lstat, chmod, readlink are all done when building the derivation itself
<civodul>can't you have guix-daemon access the store as a local file system
<civodul>and then have the other machines mount the store over NFS?
<rekado_>the management interface of the NFS server says that we get at most 400 operations per second, and around 200 ops/s when building the new profile version.
<civodul>where one operation = one syscall, basically, right?
<rekado_>yes, moving /gnu back to the local disks is what I'm going to do next.
<civodul>yeah i think the daemon definitely needs fast access
<rekado_>I wanted to use our central storage for all guix stuff, but it's just too slow.
<rekado_>"build processes" --- I'm only using substitutes.
<rekado_>it's link traversal and renaming of links that eats up most of the time.
<civodul>yes, profiles are derivations as well, there's a build process
<rekado_>I don't know what I'm talking about, though. I'm just parroting whatever my sysadmin colleague says :)
<mark_weaver>rekado_: profile generation used to be a lot slower, and at some point I optimized it as well as I could without radical changes.
<mark_weaver>the first idea for a radical change that comes to mind is to store in the sqlite database the complete list of files and directories in each store item.
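To put rekado_'s numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the roughly 200 ops/s rate is sustained for the whole 10 minutes and that one NFS operation corresponds to about one syscall, as civodul suggests; the function names are made up for illustration.

```python
def operations_in(minutes: int, ops_per_second: int) -> int:
    """Number of operations completed at a steady rate (assumed constant)."""
    return minutes * 60 * ops_per_second


def minutes_needed(operations: int, ops_per_second: int) -> float:
    """Time in minutes to complete the given number of operations."""
    return operations / ops_per_second / 60


# 10 minutes at about 200 ops/s, the figures quoted in the log.
ops = operations_in(10, 200)
print(ops)  # 120000

# Even at the server's stated ceiling of 400 ops/s the same work takes minutes.
print(minutes_needed(ops, 400))  # 5.0
```

Under these assumptions a single profile build amounts to something on the order of a hundred thousand operations, which is why the per-operation latency of NFS dominates.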
<davexunit>eventually we'll have to start making some of these radical changes I suppose.
<davexunit>we've already begun to diverge from nix in other ways.
<mark_weaver>the deduplication machinery as currently implemented also costs a lot of lstat calls.
<mark_weaver>in theory it would be possible to move that into the sqlite database as well.
<mark_weaver>unfortunately, with NFS, you are paying a lot of overhead in order to handle the worst case of a mutable filesystem, when in fact /gnu/store is mostly immutable and could be implemented much more efficiently.
<rekado_>I'm annoyed by rpm, yum, and Fedora. I really want to install GuixSD on my workstation in the office.
<rekado_>I'll need to check what software still needs to be packaged to be able to join the domain and do LDAP authentication.
<mark_weaver>I'm currently working on cogl, on the road to clutter and totem.
<mark_weaver>I have a package that builds when tests are disabled, but _all_ the tests fail with an error that probably means the library is currently not functional: "Failed to create a CoglContext: The OpenGL version could not be determined"
<davexunit>mark_weaver: does it need xvfb or something?
<davexunit>probably needed in order to get an opengl context
<mark_weaver>if xvfb were needed, I guess I'd expect it to fail at compile time instead of runtime, no?
<davexunit>not needed to compile, but needed for the test suite
<mark_weaver>I guess I should post the recipe I currently have in case someone else more knowledgeable of this stuff wants to poke at it.
<davexunit>there's other packages that have this requirement, such as guile-sdl
<davexunit>for the most part. need to do something similar for guix.
<mark_weaver>davexunit: alas, adding xorg-server to native-inputs didn't change anything.
<davexunit>mark_weaver: and you also ran Xvfb before running the test suite?
<davexunit>no promises, but that might do the trick *if* X is really the only issue here
*mark_weaver copies from guile-sdl package
<mark_weaver>lots of errors like this are being printed in between the failures: "_FontTransOpen: Unable to Parse address ${prefix}/share/fonts/X11/misc/"
<mark_weaver>I guess I need to pass better xorg.conf or something
<mark_weaver>well, I have to put this down for a while and go afk.
<davexunit>mark_weaver: I'm pretty sure I got a lot of those warnings with guile-sdl's test suite but they were benign
***tschwing_ is now known as tschwinge
<davexunit>civodul: do you think that we ought to mount /sys/fs/cgroup by default? just noticing now that this isn't the case.
<rekado_>so, I want to regularly rsync a local /gnu directory to /gnu_remote (the slow NFS share). As long as these two directories are the same with a couple of minutes difference that's a usable workaround.
<rekado_>I just don't know what rsync flags to pass.
<rekado_>"rsync -aHr --delete --delete-before"; not sure about whether to add -k/-K for dir symlinks.
<rekado_>this works but it still takes minutes. I wonder if I could combine inotify with rsync to speed things up.
<civodul>rekado_: i wonder if NFS has worthy optimizations when it's mounted read-only
<davexunit>civodul: cool, I'll come up with a patch when I have the chance.
<mark_wea`>civodul: I've noticed that very often, hydra offloads all of its mips jobs to hydra-slave0, leaving librenote with nothing to do. very strange.
<mark_wea`>load on machine 'hydra-slave0.gnu.org' is 3.86 (normalized: 1.93)
<mark_wea`>load on machine 'librenote.netris.org' is 0.01 (normalized: 0.005)
<mark_wea`>and then it decides to use hydra-slave0. why?
<civodul>mark_wea`: could be broken logic in 'guix offload'
<mark_wea`>so now we have hydra-slave0 compiling both gcc-4.9 and 5.1, and librenote is just sitting idle.
<civodul>'guix offload' must be sorting in reverse order or something
<civodul>mark_wea`: i'm happy to have another eyeball on the choose-build-machine procedure, if you have time
***jchmrt_ is now known as jchmrt
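For rekado_'s rsync mirror idea above, a minimal Python sketch that only assembles the command line may help. The paths /gnu and /gnu_remote and the flags come from the log; the helper name `mirror_command` is hypothetical. Since `-a` already implies `-r`, the extra `-r` is left out here, and `-H` is kept so that hard links survive the copy. Whether `-k` or `-K` is appropriate for directory symlinks is left open, as in the discussion.

```python
import shlex


def mirror_command(source="/gnu", dest="/gnu_remote"):
    """Build (but do not run) an rsync command mirroring source into dest."""
    return [
        "rsync",
        "-a",               # archive mode; recursion is already included
        "-H",               # preserve hard links between files
        "--delete",         # remove files on the receiver that are gone locally
        "--delete-before",  # do the deletions before the transfer starts
        source.rstrip("/") + "/",  # trailing slash: copy the directory contents
        dest.rstrip("/") + "/",
    ]


print(shlex.join(mirror_command()))
# rsync -a -H --delete --delete-before /gnu/ /gnu_remote/
```

The list form can be handed to `subprocess.run` unchanged; this sketch stops short of executing anything.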
<efraim>rekado_: you could mount the NFS mount async
<mark_wea`>civodul: 'sort' might not cope well with the fact that the 'machine-power-factor' may return different values for a given machine during the sorting.
<mark_wea`>well, I guess it shouldn't matter in this case
<mark_wea`>but it would be good to avoid sampling a machine's load more than once during the sorting.
<mark_wea`>well, I'll have to look at this more closely.
<mark_wea`>hmm, I lost my connection to freenode about an hour ago, but mark_weaver is still here.
<mark_wea`>the procedure returned by 'undecorate' should return a boolean, not an element.
<mark_wea`>so it should just return (pred machine1 machine2)
<mark_wea`>but also, this should be improved to only sample the load of each machine just once.
<civodul>mark_wea`: re undecorate, indeed, good catch!
<mark_wea`>civodul: I posted both patches to guix-devel
<mark_wea`>(one to fix the bug, another to memoize 'machine-load')
<mark_wea`>I'm a bit concerned about the fact that freenode thinks mark_weaver is still connected, but on my end I lost that connection over 1.5 hours ago
<mark_wea`>I wonder if there's someone I can talk to about this.
<mark_wea`>makes me wonder if someone hijacked my connection
<davexunit>I think you do something like: /msg NickServ ghost <username> <password>
<civodul>mark_wea`: just replied regarding offload, thanks
<mark_weaver>davexunit: hmm, it told me that mark_weaver was not online.
<mark_weaver>even though we never got notification of mark_weaver disconnecting, and so my client thought mark_weaver was still online.
<mark_weaver>civodul: okay, I pushed the sort fix. how best to deploy it on hydra? should I "make install" from a newly built guix from git, or just copy offload.{scm,go} into place?
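The two offload fixes mark_weaver describes above (a comparison procedure that returns a boolean, and sampling each machine's load only once) can be illustrated with a small Python sketch. This is not the code from 'guix offload'; `sort_by_predicate`, `choose_machine`, and `sample_load` are hypothetical names, and the load figures are the normalized values quoted in the log.

```python
from functools import cmp_to_key


def sort_by_predicate(items, less):
    """Sort using a Scheme-style 'less than' predicate that returns a boolean."""
    def compare(a, b):
        if less(a, b):
            return -1
        if less(b, a):
            return 1
        return 0
    return sorted(items, key=cmp_to_key(compare))


def choose_machine(machines, sample_load):
    """Return the least-loaded machine, sampling each load exactly once."""
    # Decorate: pair every machine with a single load sample.
    decorated = [(sample_load(m), m) for m in machines]
    # Compare the cached samples; the predicate yields True/False, never an
    # element (in Scheme, returning an element would always count as true).
    ordered = sort_by_predicate(decorated, lambda a, b: a[0] < b[0])
    # Undecorate: drop the cached load and keep the machine.
    return ordered[0][1] if ordered else None


# Normalized loads as reported in the log.
loads = {"hydra-slave0.gnu.org": 1.93, "librenote.netris.org": 0.005}
print(choose_machine(list(loads), loads.get))  # librenote.netris.org
```

Caching the sample before sorting also sidesteps the concern that a load measurement could change between two comparisons of the same machine.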
<mark_weaver>my guess of the appropriate configure flags is: --localstatedir=/nix/var --disable-daemon
<civodul>mark_weaver: right, i used --localstatedir=/nix/var --with-store-dir=/gnu/store
<civodul>you could update the daemon while you're at it
<mark_weaver>oh, right, so I should *not* pass --disable-daemon. and I don't need --with-store-dir either now.
<mark_weaver>civodul: given that I already have a very recent guix built on hydra with --localstatedir=/nix/var --disable-daemon, do you think it's safe to just ./configure --localstatedir=/nix/var && make, or should I make clean?
<mark_weaver>I made clean from a few commits ago with --disable-daemon
<mark_weaver>bah: configure: error: C++ compiler 'g++' does not support the C++11 standard
<mark_weaver>I guess we'll have to build this within a pure guix environment on hydra.
<mark_weaver>in the meantime, is it safe for me to 'make install' with --disable-daemon? will it leave the existing installed daemon alone and everything will continue to work?
<civodul>i was wondering about the consequences if we built in an environment
<mark_weaver>our new binary install method may become more important now :-/
<mark_weaver>oh, if we use the existing guix package in guix, then it will have the wrong localstatedir
<mark_weaver>would there be consequences to having the guile modules in a different place?
<mark_weaver>maybe it would be sufficient to make symlinks for guix and guix-daemon from /usr/local/bin?
<civodul>guix-register also needs to be visible
<civodul>so basically, we could (1) install Guix in a profile, (2) symlink guix and guix-{daemon,register} in /usr/local, (3) symlink /var/guix
<mark_weaver>civodul: if I run "./pre-inst-env guix build guix" from my git checkout on hydra, will hydra still show the build logs for it when master is later evaluated?
<civodul>mark_weaver: i think it won't show it, but i can't remember why
***y is now known as init
<davexunit>civodul: it seems that on other distros it's systemd that creates the cgroup hierarchy. do you think it's still okay for us to do during boot time?
<davexunit>I was thinking immediately after mounting /sys in mount-essential-file-systems
<davexunit>cgroups aren't "essential", so I'll write a mount-cgroups procedure that's run after mount-essential-file-systems