IRC channel logs



<rekado>we’re looking for freelancers to help out with development of our PiGx system; lots of Guile, and if you want AWS and web things.
<rekado>if you know someone who is looking for either a student job or a limited freelancer gig, please feel free to direct them to me
<kir0ul>A few comments maybe of interest: :)
<civodul>kir0ul: thanks for sharing!
<civodul>the comments are positive
<civodul>(i was going to write "surprisingly positive", but after all, why should i find it surprising?)
<rekado>surprising perhaps because I would have expected the HN-typical responses
<rekado>“Now next target for Guix: python-tensorflow.” — does tensorflow-lite count…?
<rekado>zimoun: I just tried to build r-shiny and it doesn’t fail.
<rekado>efraim pushed the patch you referenced, so I guess this is done now?
<rekado>FWIW whenever I touch an R package that uses minification I have started switching to esbuild.
<rekado>I had to use esbuild for a couple of packages that included JavaScript source code that could not be minified with the old uglify.
<rekado>maybe we should move minification to the r-build-system (when JS files are present)
<rekado>civodul: have we “liked” todd’s comment on purpose…?
<civodul>rekado: i think i did and then wondered why i did
<civodul>he's making fun of us, right?
<efraim>oh, here's where the ding was from
<efraim>I got asked to look at python-kaleido, turns out it uses repo, gn, docker and the chromium sources to build, and is python only by chance
<rekado>civodul: yes, it’s rather condescending.
<rekado>his response thread also begins by accusing *someone* of “religious” fervor.
<rekado>I’m very tired of this inflationary use of “religious”.
<rekado>his initial response reads to me like “these Guix folks have something nice, sure, but they are pretty full of themselves and you shouldn’t take them serious”
<civodul>rekado: yes; initially i hadn't seen what it was replying to so i wasn't sure how to interpret it, but your interpretation sounds right
<civodul>where does it mention "religious"?
<rekado>this is the preamble to a thread that defends supporting proprietary software blobs, and positioning Spack as the “reasonable” choice between the “extremist” positions. (My words, not quotes.)
<rekado>this framing as Guix inhabiting an extreme position due to “religious” tenets bothers me a lot, and it’s a common tactical move
<rekado>it’s also the anti-GNU tactic
<civodul>now this is getting interesting
<civodul>there's a lot of crap in there, such as "Reading all the source is impractical and intractable"
<civodul>rekado: would you like to reply?
<civodul>i was going to go for a walk :-)
<rekado>me too :)
<civodul>ah ha!
<rekado>I don’t think it’s terribly urgent to respond :)
<rekado>we can go on our respective walks and work out a response later
<civodul>sounds good!
<rekado>perhaps it’s best to discuss the points raised here
<rekado>(to avoid another volley of back and forth online arguments…)
<civodul>the points he raises?
<civodul>i think that although those views are more popular now, they're losing ground
<civodul>both from a reproducible science viewpoint and from a security viewpoint
<civodul>it's becoming a harder sell to say "look, we have higher priorities"
<zimoun>civodul, rekado: They score a point with “guix build guix --no-grafts --check”. Guix is not reproducible.
<zimoun>civodul, rekado: it is hard to rationally debate because there is a fundamental philosophical disagreement about what Scientific Knowledge means. Their thread is saying: let's do engineering work. For sure but that’s not a scientific method. Their view is for HPC what cooking is for chemistry. That’s fine. It does not mean there is a good cooking against bad chemistry; or vice-versa. It is fundamentally
<zimoun>different. This is the question: if scientific method is not based on total transparency, how can we collectively verify that there is no mistake and thus the resulting knowledge is indeed grounded on universal facts?
<zimoun>civodul, rekado: last, I would like to know from tgamblin what is their solution about <> if results are *not* the same. How can I audit a possible cause if all is not transparent.
<zimoun>Last, as I said when reviewing the blog entry, concerns about security is another space and I do not fully buy the Guix argument because 1. the bootstrap is run on binary linux kernel, so still chicken-egg problem and 2. hardware is easier to compromise by vendor than software, IMHO.
<zimoun>rekado: perhaps, a philosophical disagreement is maybe “religious”. ;-)
<civodul>zimoun, if you ever read this: i agree, this is about verifiability, not reproducibility
<civodul>if i were not atheist, i'd be religious about verifiability rather than reproducibility :-)
<civodul>re security, your argument and that of Todd reminds me of:
<drakonis>rekado: i'm a student :V
<drakonis>i'd take a student job any time as an excuse to improve my guile chops
<civodul>it amounts to: why bother with this improvement since there's still that other problem
<drakonis>zimoun goes on and off here
<drakonis>we need sneek here
<drakonis>although i shouldnt be taking on more tasks than i can handle
<zimoun>civodul: about security, I will rehash it. Somehow, my point is: against what do you protect? What is the surface? All this dance about bootstrap and reproducibility is against various attacks but this surface is not closed yet because of kernel and hardware. For sure, bootstrap and repro help. On the one hand. :-)
<zimoun>On the other hand, what is the surface to introduce a backdoor to free software? CVEs or mistakes by packagers speak for themselves. Well, I am simply pragmatic here, I guess. ;-)
<zimoun>I agree that security and scientific stuff are the 2 faces of the same coin.
<zimoun>However, in the scientific case, I have something (e.g. a different plot) that triggers the audit.
<zimoun>In the security case, the attack hides as much as possible the something that could trigger the audit.
<Noisytoot>you missed: <civodul> zimoun, if you ever read this: i agree, this is about verifiability, not reproducibility
<Noisytoot><civodul> if i were not atheist, i'd be religious about verifiability rather than reproducibility :-)
<Noisytoot><civodul> re security, your argument and that of Todd reminds me of:
<zimoun>After this digression, I apply the well-known proverb: Speak does not cook the rice. :-)
<zimoun>I am late for a lot of “urgent” tasks (reply to rekado, efraim, review julia patches,…)
<civodul>zimoun: in both cases (science and security), i'd say you have to do as much as possible
<civodul>it's not black and white
<civodul>any improvement is worth taking
<rekado>zimoun: replying to me is *never* urgent :)
<rekado>(lest I be held to the same standard)
<rekado>drakonis: I’m not sure of the details of what “student job” really means at the MDC, but you’re welcome to send me a quick email ( and I’ll send you some info about the project and whatever I manage to figure out about the employment conditions.
<rekado>zimoun: “religious” is often used to mean “unreasonable” or “extreme”. Especially when the context is probably one of the most secular endeavors in human history. Nah, this is intentionally disparaging. Not cool.
<rekado>re — I think different HPC users simply have different requirements here.
<rekado>some of our bioinfo software randomizes things, so you cannot possibly get the same results on subsequent runs
<rekado>but this is *not* an argument for ignoring what software we put into the experiment.
<civodul>yes, right
<civodul>i'd have 5 points to reply:
<civodul>0. It is fine for tools and projects to have different goals, as long as it is clear to users. Calling the other views “religious” is uncalled for.
<civodul>1. We can’t build a secure software supply chain without reproducible builds—they give us verifiability.
<civodul>2. We can’t have reproducible builds unless we control the whole software stack.
<civodul>3. Does reproducible science really need bit-for-bit reproducibility, or is a laxer form of reproducibility sufficient? Bit-for-bit reproducibility is the only form that enables practical verification.
<civodul>4. Slightly off-topic because the post is not so much about Guix: Guix *is* used in #HPC and achieves performance portability (for MPI, for maths) through engineering means described at
<civodul>(done :-))
*rekado noticed some typos
<rekado>good responses
<zimoun>civodul: mouais :-) For sure, fixing all the holes is the thing to do. But I miss how fixing the tiny mouse hole would really help to protect my house when at the same time some windows are still open. Anyway!
<civodul>i think the strategy here was to depict Guix as idealist and impractical, and conversely to depict Spack as pragmatic, practical, and the only way to achieve performance
<civodul>this is what needs debunking
<rekado>point 3 will likely be attacked
<civodul>actually we should clarify that we're just advocating that for software deployments
<civodul>numerical results, say, are a different beast
<civodul>zimoun: it ain't black and white, comrade! we're improving things where we can, but sure, there are other important problems to tackle, no argument here
<zimoun>rekado: Aside <>, provide an answer back to a reply when I asked something mean to me respect your time. :-)
<civodul>i can't parse this sentence
<rekado>I will also fetch myself a drink :)
<zimoun>civodul: drink is important :-)
<zimoun>«2. We can’t have reproducible builds unless we control the whole software stack.» To me, here “reproducible builds” is too strong. We can’t verify build unless we control the whole software stack. IMHO.
<rekado>oh, but this binary was downloaded from the nvidia website, and it has a version number. I control the whole software stack.
<zimoun>what does it version number?
*rekado drinks
<zimoun>build foo@A using bar@1 is it the same than build foo@A using bar@2?
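One way to make this question concrete is to derive a build's identity from its inputs as well as its own name and version. The following is only a toy sketch under that assumption: `build_id` is a hypothetical helper, and the hashing scheme is illustrative, not the actual format used by Guix, Nix or Spack.

```python
import hashlib
from typing import Iterable


def build_id(name: str, version: str, inputs: Iterable[str] = ()) -> str:
    """Hypothetical identifier: hash of name@version plus the ids of all inputs."""
    h = hashlib.sha256()
    h.update(f"{name}@{version}".encode())
    # Sort the input ids so the result does not depend on listing order.
    for dep_id in sorted(inputs):
        h.update(b"\0" + dep_id.encode())
    return h.hexdigest()


bar1 = build_id("bar", "1")
bar2 = build_id("bar", "2")

# Same foo@A, different bar: the identifiers differ.
foo_with_bar1 = build_id("foo", "A", [bar1])
foo_with_bar2 = build_id("foo", "A", [bar2])
print(foo_with_bar1 == foo_with_bar2)
```

Under this scheme a version number alone does not identify a build; the whole input graph does.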
*zimoun drinks too
<zimoun>civodul: «reproducible science» is a tautology ;-)
<drakonis>its time for a drinking game
<rekado>from the perspective of someone who defends the use of proprietary software (not even “source available”) the binary itself is all you need to achieve reproducibility
<drakonis>controlling the whole stack by downloading opaque binaries is weird
<drakonis>sure it saves time and all that
<rekado>“after all, I cannot build the hardware myself, so we always have something that is not built from scratch”
<drakonis>wait until he hears about fpgas
<rekado>ugh, I just parsed this as “fp gas”…
<rekado>(functional programming … gas…?)
<rekado>it’s easy to disagree with the argument, but it does succeed at presenting spack as the more “reasoned”/“middle of the way” approach
<rekado>just by looking at the number of likes the intro tweet got I think this kind of reasoning is appealing to a lot of people
<zimoun>civodul: HPC cluster has to be treated as any other instrument to measure data. Do you ask that the microscope or genomic sequencer produce bit-to-bit data? And they are both a full stack of closed stuff.
<rekado>… oh, just 12 likes. Oh well.
<civodul>in HPC circles, Spack's more popular than our hippy project tho
<civodul>zimoun: i'm not a microscope maker, i work on software deployment :-)
<zimoun>treat cluster as any other experimental instrument helps to point the knot of the philosophical disagreement.
<civodul>yes, i see what you mean
<civodul>i think it's a fallacious argument tho
<civodul>or at least we should clarify that we're talking specifically about software deployment
<rekado>after all, software is something we actually *can* control really well.
<rekado>it’s almost negligent not to do it then.
<civodul>right, it's very different from chemistry in that regard
<civodul>(i suppose?)
<civodul>i think we/i did use the chemistry metaphor the other way around in the past
<rekado>at a fundamental level it’s not so different
<civodul>funny, you can convey anything with metaphors :-)
<rekado>todd speaks of results, i think, which is not what we talk about
<zimoun>it is not easy, nor impossible to define “science” and “knowledge“. It is easy to mix with engineering and technology.
<rekado>obviously with chemistry experiments you have to be really careful about what goes into your pipettes
<rekado>inputs must be controlled carefully
<rekado>with computer-based experiments one of the most important inputs is the bits we use.
<zimoun>Cooking is reproducible and provides knowledge. Why is it not considered “scientific”?
<dstolfa>i wonder how anyone using proprietary software to reproduce results of science can be so sure that the black box doesn't just get very hot, and then printf some numbers that it fetched from some database that it initially submitted the results to :)
<rekado>that’s what ML is
<todd_>you guys know we can read this stuff right?
<rekado>;) but also :-|
<civodul>hi todd_!
<civodul>we do :-)
<drakonis>how long were you watching
<todd_>I'm always watching guix-hpc :)
<civodul>todd_: see, you sparked a philosophical discussion late at night!
<todd_>I know! So exciting
<zimoun>civodul: yes, point #3 will be attacked. Even by me I think. :-)
<civodul>zimoun: yes, i withdraw point #3 and refine it to software deployment :-)
<civodul>todd_: perhaps you wanted to comment so we can reduce the number of round trips? ;-)
<todd_>So hey I think I agree with nearly all of your points. Not advocating for some sort of enlightened centrism here. We should do all we can to improve reproducibility. Which is why we're trying to make proprietary stuff as reproducible as possible *too*.
<rekado>todd_: we gotta comb our hair before making our case in front of the twitter crowd :)
<drakonis>the twitter crowd will judge you with the eyes of a god
<todd_>How do you propose we use GPUs, at least before there is an open runtime for them? You'll find it rather hard to do well w/o libcudart.
<rekado>ah, that’s where the “religious” comes from ;)
<civodul>ah, see!
<drakonis>cuda is a pain, eh?
<todd_>also note that of the forthcoming US exascale machines, we're using the GPUs with a more open software stack. It was a big deal for selection. Even AMD's stack comes with both completely open and proprietary pieces though.
<civodul>todd_: there's OpenCL, there's ROCm, but really, that's not what the article is about
<drakonis>there's a ongoing development regarding gpu runtimes
<drakonis>besides rocm and opencl, there's intel's thing as well
<civodul>though i'd certainly argue that building atop CUDA is a problem
<drakonis>and there's a cuda compiler being written in llvm that isn't proprietary
<rekado>drakonis: the point is though: practically speaking there is nothing but CUDA.
<todd_>If we didn't build atop CUDA, turnaround time for our physicists would be a lot slower, and the AMD software stack is currently not pretty.
<drakonis>yeah and that sucks
<civodul>the article really is about what we put in a package and what implications it has
<drakonis>i'm not happy about it
<civodul>i was shocked that the PyPI package is just that: a bunch of opaque binaries someone uploaded
<todd_>yeah PyPI is that. CondaForge is not -- I'm actually pretty impressed w/what they've been able to do with public CI services
<civodul>automation is great like i wrote, but it only gives you so much
<civodul>i mean, we know we can do better
<civodul>all the projects at went through that process
<todd_>I would claim it gives you a lot more than verification right now. Since I don't see anyone actually scaling repro build verification.
<zimoun>civodul, rekado: no the microscope analogy is *not* fallacious. Everything looks like a nail when you have a hammer. :-) Microscope is exactly the same as HPC cluster: you can control very *well* everything. Applying your arguments to HPC cluster, you are requiring that full transparent microscope.
<todd_>but that's not to say you couldn't use it as a verification mechanism one day.
<todd_>this is the tweet y'all should be focusing on:
<civodul>zimoun: i'm only applying my arguments to software deployment, the only thing i can really talk about :-)
<rekado>zimoun: FWIW here at the MDC microscopes are like sequencers; the real fun begins with software (which in the case of our microscopy lab is all written by the phd students).
<drakonis>todd_: you should look outside of the hpc crowd for this
<civodul>todd_: we can tell about cases where non-verifiability has proved very problematic though
<drakonis>the nixos folks are the other side of the coin here and they are likely to have some examples
<civodul>the bitcoin-mining flatpaks, the Zoom spyware, etc. etc.
<todd_>I know there are repro build efforts outside HPC. If the claim is that verification provides security, my question is how. Verifying that a binary maps to source only gives you security if the source is trustworthy.
<civodul>of course, but then there's collective and individual control: i as an individual can't tell whether this particular source is doing what it claims to do
<civodul>but collectively, we can achieve that
<civodul>that's what free software is about
<drakonis>the nixos folks have builds that have pinned results
<todd_>collectively, you would need trusted verifiers.
<civodul>now, that doesn't mean we're all safe just because we have repro builds
<todd_>how do you trust them?
<todd_>just b/c they like free software doesn't make me trust random people on the internet.
<civodul>but at least, we know that it's a necessary step--why not do it, then?
<drakonis>so if the resulting build doesn't match what's expected, it fails
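The pinned-result check described here can be sketched as a digest comparison. This is a minimal illustration only: `digest` and `check_build` are hypothetical names, the byte strings stand in for real build outputs, and actual tools hash whole store items rather than a single blob.

```python
import hashlib


def digest(output: bytes) -> str:
    """Return the SHA-256 hex digest of a build output."""
    return hashlib.sha256(output).hexdigest()


def check_build(expected_digest: str, rebuilt_output: bytes) -> bool:
    """True only when the rebuilt output is bit-for-bit identical to the pinned one."""
    return digest(rebuilt_output) == expected_digest


# Placeholder bytes standing in for the output of a first build.
first_build = b"example build output"
pinned = digest(first_build)

print(check_build(pinned, first_build))            # identical rebuild
print(check_build(pinned, first_build + b"\x00"))  # one extra byte
```

A single differing byte changes the digest, which is why the comparison says nothing about how far apart two builds are, only whether they are identical.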
<todd_>I say do it!
<zimoun>civodul: this is the net of the issue when one thinks about producing scientific knowledge. How to deal with opaque tools in the chain? Do we consider these opaque tools as variability? As we do for almost any experiment?
<todd_>but in the meantime we do trust some vendors. It's the best we can do. We are not, as you say, "doomed". I think we do ok.
<drakonis>hyperbole is fun for articles
<civodul>todd_: we're doing it!
<todd_>yes and we love you guys for it :)
<civodul>thank you :-)
<dstolfa>we're doomed in general if physicists are to be trusted!
<civodul>i was afraid you were getting religious about *not* doing it
<todd_>I am not -- if you read the tweets
<zimoun>rekado, well there is a lot of opaque box in microscope. And I am not talking about Illumina sequencers. :-)
<todd_>tweets are not the greatest medium
<drakonis>social media is the worst medium
<civodul>also, i didn't write that the tools are "doomed", period
<civodul>i wrote: "these tools are doomed to be not only unsafe but also opaque"
<todd_>bitwise reproducible builds are great. If we can get to a point where we can build down to libc for spack, we will have them. IMO there are going to be places where you have to trust a binary for quite a while.
<civodul>perhaps "bound to" would be more appropriate
<todd_>and we should be able to repro those as best we can.
<civodul>not being a native speaker means lots of surprises :-)
<civodul>(also "these tools" was about the binary distros, pip & CONDA here)
<todd_>sure -- though we also rely on binaries in Spack and let people do "impure" builds.
<civodul>but i gather that it's the one sentence that triggered the reaction
<civodul>Spack is still primarily "source-based" though, so to me that's quite different
<todd_>My argument is that there is a lot you can do for reproducibility even if you cannot adopt a system that requires you to build down to libc.
<todd_>(and also that there are things below that -- and at some level, you trust something)
<civodul>yes, but software builds is what we work on--not CPUs, not hard disks, etc. :-)
<todd_>and yeah it sucks that CUDA is proprietary -- but we and a lot of sites use it, and at least for now we're not going to stop and sacrifice a ton of performance. But we would like our builds to be reproducible too.
<civodul>mind you, we have colleagues using it too, and with Guix
<civodul>and that's ok, but that doesn't mean we should not collectively question that
<civodul>from many different angles actually, not just "reproducible science"
<todd_>you could argue that though we use CUDA, we put several hundred million toward questioning that by supporting the competition
<civodul>who's "we"?
<civodul>ah, good, that's nice
<todd_>we picked an AMD machine (two of them actually)
<todd_>frontier and el capitan
<civodul>neat, i hope it can contribute to changing the status quo
<todd_>it wasn't the only reason... but openness was a positive
*civodul nods
<drakonis>oh, i remember what's intel's compute runtime, its called oneapi
<todd_>yeah it's getting more open too
<drakonis>there's also intel's arc gpus, its interesting.
<civodul>todd_: anyway, i hope you'll concede that we're not religious but just holding different views
<civodul>and that the condescending tone was uncalled for
<civodul>the upside is that i now have a better understanding of our divergence, different priorities, and perhaps things that weren't clear in the post
<civodul>i guess we can have an argument on twitter and still be friends :-)
<todd_>I think we agree we should do as much as we can. Not faulting you guys for doing bitwise, but faulting other ecosystems for not doing it and ignoring the decisions they had to make also comes off as a bit condescending.
<todd_>I could argue that many of the Guix posts come across as rather sanctimonious -- if you read them, it comes off as "if it's not bitwise it's not worth doing"
<civodul>for software builds, there's a rather large consensus that "if it's not bitwise, it's not worth it"
<civodul>again, this goes way beyond our tiny community
<civodul>but also: this is just for software builds
<civodul>and perhaps that was a source of misunderstandings
<civodul>i do take note that the posts come off as condescending and sanctimonious
<dstolfa>bitwise reproducible builds might not matter as much for reproducing results in say, biotech, but they do matter for performance evaluation of systems (quite a bit...)
<dstolfa>it ultimately depends on what you're doing, but if i pass in the same flags on the same system and same hardware, i better get the same result...
<rekado>todd_: it doesn’t appear in advocacy posts obviously, but I’m dealing with conda users on a daily basis.
<dstolfa>at the very least it should be reproducible within that one system, at least IMO
<rekado>and from my perspective anything that makes users’ lives easier is worth doing
<todd_>I mean you guys aren't the only ones trying to get around conda :)
<todd_>I do think condaforge has done a lot of good.
<todd_>but I don't like it that conda itself is basically RPM in a home dir
<todd_>and there's a separation bt/w source builds and binaries.
<rekado>(lots of bioconda users here at the institute; they also try to do better)
<todd_>conda-build is a thing, but it's a different and not very seamlessly integrated thing.
<rekado>(they = bioconda)
<todd_>it would be very awesome to see a system set up to do bitwise verification of binaries... regularly. And to do it in a distributed way. I am curious if you all have ideas for how to do anonymous verification effectively. Is there a scenario where you do not have to trust the verifier?
<zimoun>todd_, civodul: from my understanding, Guix introspects the tools themselves and try hard to apply the scientific method (i.e., transparency) to this introspection. Spack deals with cluster as any other experimental instrument, then treat the variability as any other scientific experiment deals with variability. Does it make sense?
<todd_>if you could do that, I think you could really crowdsource the verification.
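A common starting point for crowdsourced verification is to accept a digest only when enough independent builders report the same one. The sketch below assumes exactly that and nothing more: `agreed_digest`, the builder names and the quorum value are all hypothetical, and the sketch does not address the harder question raised here, namely how to stop one party from posing as many anonymous builders.

```python
from collections import Counter
from typing import Dict, Optional


def agreed_digest(reports: Dict[str, str], quorum: int) -> Optional[str]:
    """Return the digest reported by at least `quorum` builders, else None.

    `reports` maps a builder name to the digest that builder obtained.
    """
    if not reports:
        return None
    # most_common(1) yields the digest with the highest number of reports.
    digest, votes = Counter(reports.values()).most_common(1)[0]
    return digest if votes >= quorum else None


# Hypothetical reports from three independent builders.
reports = {"builder-a": "d1", "builder-b": "d1", "builder-c": "d2"}
print(agreed_digest(reports, quorum=2))
print(agreed_digest(reports, quorum=3))
```

With this approach trust shifts from a single verifier to the assumption that the builders are independent of each other.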
<todd_>@zimoun: Spack has recipes just like guix. They just do not go all the way down to libc right now; we tag the builds with the OS used and the compilers. If you build everything with Spack, you can introspect all the builds above those things -- if you want.
<todd_>you can also say that, e.g., mvapich is "external" and you can use the system one.
<civodul>zimoun: it makes sense to me; now i understand the analogy
<todd_>in which case you're trusting that.
<todd_>and any variability due to mvapich is going to be a black box
<todd_>so yeah I think it mostly makes sense.
<todd_>The other thing that we do that guix and nix do not is that the builds are parameterized and there is actually a solver to reason about things you may not have built before.
<drakonis>hmm, parametrized builds were in the TODO list a while back
<drakonis>nix has parametrized builds in a very limited manner and only for very specific packages
<civodul>todd_: yes, and i really like Spack's support for customization
<drakonis>not even the sorts you'd want to use such a feature for
<civodul>it was an inspiration for "package transformation options"
<zimoun>todd_: thanks for explaining. And I remember discussing solver in FOSDEM corridor (I guess it was you :-))
<todd_>I've been to fosdem twice in person :)
<todd_>Is guix doing a solve to reason about whether the transformations are valid?
<todd_>or is that on the user?
<zimoun>todd_: it has recently been discussed. Nothing is done; left as an exercise to the user. I mean the user can try to apply a transformation which does not make sense and Guix is silent.
<todd_>that's pretty cool, even still.
<civodul>yeah, we'd need to define what "valid" means
<todd_>yeah is there a way to express that in guix?
<todd_>or nix?
<zimoun>civodul: looking for an example for the recent discovery <> (thanks BTW :-)) I see: <>
<todd_>Oh nice. how do you get on lwn?
<todd_>apparently we need a better blog. Guix has a very nice blog.
<drakonis>todd_: for nix? i don't think so, it goes against their purely functional package management principles
<rekado>…it does?
<rekado>i wouldn’t know how to define validity
<todd_>@rekado: would you not argue that the package definitions in the guix mainline are valid?
<rekado>do you mean … like … tested and known to work?
<todd_>I'd say that's stronger. But they would also be valid.
<rekado>with Guix we have this one configuration of the graph; parameterization would give us a few more that are assumed to be working.
<rekado>but with the package transforms anything goes really
<rekado>there are too many permutations that we probably won’t ever build ourselves.
<rekado>e.g. you can tell Guix to build a package with different configure flags or different inputs, or you can tell it to replace any occurrence of one package with a different one
<rekado>they are all “valid” for a meaningless definition of validity (they are all package objects), but we can’t know if they make sense.
<rekado>Guix does not know, for example, that the “gcc-toolchain” package is a compiler toolchain and cannot reasonably be replaced with, say, wget.
<todd_>Yeah I guess I should revise my statement. "valid" means it satisfies all the rules, "sound" means that the rules were actually true and indicated that the build would succeed
<todd_>and "tested and working" is something you need to try for yourself
*rekado nods
<rekado>but also
*rekado nods off
<rekado>gotta go to bed
*rekado waves
<todd_>Spack knows that certain things are compilers and that certain packages (MPIs for instance) can be replaced with others.
<todd_>it doesn't know whether they'll work but it lets the user tell it.
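The difference between "any package object can replace any other" and "only packages filling the same role can" can be illustrated with a toy lookup table. Everything here is an assumption made for illustration: the `PROVIDES` table, its role names and `can_replace` are hypothetical and do not reflect how Spack or Guix actually represent this information.

```python
from typing import Dict, Set

# Hypothetical table of the roles each package can fill.
PROVIDES: Dict[str, Set[str]] = {
    "gcc-toolchain": {"compiler"},
    "clang-toolchain": {"compiler"},
    "openmpi": {"mpi"},
    "mvapich": {"mpi"},
    "wget": set(),
}


def can_replace(old: str, new: str) -> bool:
    """True when `new` fills every role that `old` is known to fill.

    A package with no declared role is never considered replaceable, so
    the check is "valid" in the rule-satisfying sense only; it says
    nothing about whether the resulting build would actually work.
    """
    roles = PROVIDES.get(old, set())
    return bool(roles) and roles <= PROVIDES.get(new, set())


print(can_replace("gcc-toolchain", "clang-toolchain"))
print(can_replace("gcc-toolchain", "wget"))
print(can_replace("openmpi", "mvapich"))
```

Passing this check corresponds to "valid" as defined above; whether the rules are "sound", and whether the result is "tested and working", still has to be established separately.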
<todd_>I think I will get some work done as well.
<todd_>nice talking to you all
<dstolfa>todd_: thanks! hope you have a nice work day!
<zimoun>todd_: Guix does not have a list of “alternatives”
<zimoun>thanks for the discussion :-)
<zimoun>civodul: procrastinating on Bellard webpage and now PyTorch is in Guix, we could improve “guix search” by using <> … and make PyTorch a dependency of Guix, ahah! ;-)