IRC channel logs

2021-12-08.log


<zimoun>hi!
<zimoun>Reproducibility paper about cancer: https://elifesciences.org/articles/67995
<zimoun>«the data needed to compute effect sizes and conduct power analyses was publicly accessible for just 4 of 193 experiments»
<zimoun>«Moreover, despite contacting the authors of the original papers, we were unable to obtain these data for 68% of the experiments.»
<zimoun>«none of the 193 experiments were described in sufficient detail in the original paper to enable us to design protocols to repeat the experiments»
<zimoun>«not at all helpful (or did not respond to us) for 32% of experiments»
<efraim>I understand the hedging but not responding is also not helpful
*efraim thought it was going to be about cancer being reproducible
<zimoun>Well, another paper showing Science is somehow broken.
<zimoun>Similar to this (computer science field) http://repeatability.cs.arizona.edu/
<civodul>terrible
<zimoun>yes, terrible! The good point is that people around me are starting to take an interest in what I am doing, and to see why it is not just an abstract issue or a waste of time. :-)
<civodul>heh yes, that's very concrete
<civodul>though often we're stuck at the stage of data availability, which is even before one might start wondering about software deployment
<zimoun>yeah, from my experience with cloudy people about “computing”, Guix makes it possible to point out the dependency issues (upstreams that disappeared, versions, etc.). Then it's easy to show that data has exactly the same problem. That's where data management and friends come in. :-)
<rekado_>I don't mean to be defeatist, but my experience with cloud people has always been that they embrace denial.
<rekado_>not that they don't see a problem
<rekado_>but they refuse to acknowledge this as a systemic problem
<rekado_>"just archive your data in a Docker image!"
<rekado_>"just archive your software as a Docker image and you'll always have the same programs"
<rekado_>"of course, you should use an apt proxy!"
<rekado_>there's a surprising amount of goal post shifting
<civodul>"embrace denial" :-)
<rekado_>and I don't mean that this is malice
<rekado_>but an underestimation of the complexity of management that needs to happen for things to be reproducible
<rekado_>the very fact that protocols can be conjured up to work around certain problems does not mean that it's a workable solution
<rekado_>that's the equivalent of a Turing tarpit, I guess.
<rekado_>too rarely is the usability of reproducibility acknowledged
<civodul>well, those Docker images are easily "usable" strictly speaking
<civodul>it's just that they're opaque and unverifiable
<civodul>the software equivalent of a magic potion
<rekado_>if all you want is rerun the exact same thing (ignoring the kernel and CPU) then yes: a Docker image is usable.
<civodul>yeah, the value of being able to experiment is underestimated
<civodul>which is ironic, given that you'd think "fiddling with things" is commonplace in scientific workflows
<rekado_>I wonder why VMs aren't more popular in science.
<rekado_>I get the appeal of Docker for deployment of services because you can mix and match.
<rekado_>but if all you want is archival of your environment --- why not a VM?
<civodul>Docker and VMs are interchangeable in this context
<rekado_>is it that VMs were always just a little bit too inconvenient?
<civodul>Docker just has a more convenient interface
<civodul>yeah i think so
<civodul>there's no standard way to build a VM or to run it
<rekado_>yet Docker is hailed as the greatest invention since sliced bread and VMs are looked upon with derision.
<civodul>whereas Docker provides all that with a consistent interface
<civodul>heh, go figure :-)
<rekado_>oh, but there's a standard way to run a VM with qemu. Or a standard way to run a VM with vmware, or with virtualbox, etc.
<rekado_>it was *much* later that different container things could run foreign container images.
<rekado_>even today people run Docker images with Docker, Singularity images with Singularity, etc
<rekado_>I can't believe that existing VM solutions were *that* much poorer than Docker / Singularity / etc.
<rekado_>perhaps a big difference is how closely Docker and the Dockerfile are connected.
<rekado_>there was much more choice in automatically building VMs, which may have been a problem.
<civodul>yes i think in terms of UI and providing all the tools at once (building the image, running the image), Docker is much better than the jungle of DIY VM tools
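The contrast in interfaces discussed above can be sketched roughly as follows (image and file names are placeholders for illustration, not from the conversation; the QEMU options are one plausible invocation among many):

```shell
# Docker: one consistent CLI covers both building and running
# ("my-analysis" is a hypothetical image name).
docker build -t my-analysis .     # build from a Dockerfile in the current directory
docker run --rm my-analysis      # run it; the same commands work everywhere Docker does

# VMs: each hypervisor has its own invocation. A QEMU run might look like
# ("analysis.qcow2" is an illustrative disk image name):
qemu-system-x86_64 \
  -m 2048 \
  -drive file=analysis.qcow2,if=virtio \
  -nic user
# ...while VirtualBox, VMware, etc. each expect a different incantation,
# and building the disk image in the first place is a separate DIY exercise.
```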
<civodul>i mean, if it weren't for "guix system image", i wouldn't know how to build a VM image :-)
<civodul>it's just too tedious
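For reference, the "guix system image" approach mentioned above boils down to one command over a declarative config (here "config.scm" stands in for your own operating-system declaration; the exact image types available depend on your Guix version):

```shell
# Build a QEMU-ready disk image from a declarative system configuration.
# Guix prints the path of the resulting image under /gnu/store.
guix system image --image-type=qemu config.scm
```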
<zimoun>I used an analogy about Docker for my applied-maths friends: we have a result (a theorem), and a Docker image only lets us verify examples, no more. The proof is what is inside the Docker image. But how can we audit a binary?
<zimoun>Applied to Fermat’s Last Theorem, it would be as if Wiles said: I did it, here is the Docker image. First, it would be impossible to discover the hole in the proof, so impossible to fix it. Second, impossible to adapt the proof and eventually prove the full Taniyama-Shimura-Weil conjecture. (I know nothing about these maths, I just read the popular-science account by Simon Singh once :-))
<zimoun>Maths is probably not science though.
<zimoun>The key point of full transparency is audit (convincing peers) and reuse (building something new to tackle other problems).
<zimoun>Cloudy people (the ones around me) do not deny that. Instead, they are individualists pushed along by the broken review system.
<zimoun>And lost proofs also happen in maths: https://www.quantamagazine.org/new-math-book-rescues-landmark-topology-proof-20210909/