Catonano
2018-04-27 07:55:18 UTC
Hello Ricardo & all!
I’m happy to announce that the group I’m working with has released a
Reproducible genomics analysis pipelines with GNU Guix
https://www.biorxiv.org/content/early/2018/04/11/298653
We built a collection of bioinformatics pipelines and packaged them with
GNU Guix, and then looked at the degree to which the software achieves
bit-reproducibility (spoiler: ~98%), analysed sources of non-determinism
(e.g. time stamps), discussed experimental reproducibility at runtime
(e.g. random number generators, kernel+glibc interface, etc) and
commented on the idea of using “containers” (or application bundles)
instead.

Very impressive piece of work! I think it’s important to stress that
reproducible builds is a crucial foundation for reproducible
computational experiments, and this paper does a great job at this.
Also nice that you show you can have these bit-reproducible pipelines
formalized in Guix *and* produce a ready-to-use “container image.”
Hopefully we can soon address the remaining sources of non-determinism
shown in Table 3 (I think you already addressed some of them in the
meantime, didn’t you?).
The bit I’m less comfortable with is Autotools. I do understand how it
helps capture configure-time dependencies, and how it generally helps
people package and use the software; I think it’s one of the best tools
for the job. However it’s also hard to learn and, whether it’s
justified or not, it’s considered “scary.”
Given the intended audience, I wonder how we could provide a simpler
path to achieve the same goal. It could be a set of Autoconf macros
leading to high-level “configure.ac” files without any line of shell
code, or it could be Guix interpreting a top-level .scm or JSON file,
both of which would ideally be easier to write for bioinformaticians.
What are your thoughts on this?

I have explored the possibility to create a guile based tool for building
guile packages
That could be easily expanded to support non guile based projects
It could reproduce all the tests that Autoconf-generated scripts perform
I thought it would be nice to reproduce the record format used in Guildhall
It seems to me that the packages defined in pkg-list.scm files are not using
the standard srfi-9 syntax
I also noticed that the guix/records.scm file
So I embarked on the process of learning how the guile macros are used there
It was a tumble, substantially
I couldn't read the code, too much and too articulated
The guile manual was not very helpful
So I took this article
https://www.cs.indiana.edu/~dyb/pubs/tr356.pdf
to a print shop to have it printed and be able to read it peacefully
I read it, so now I know at least what the whole thing is about
I made this little (< 7 Mb) footage of me expanding a guix defined macro
for creating a record
http://catonano.v22018025836661967.nicesrv.de/resources/videos/expanding.flv
Too bad, it expands the thing completely, so I end up with some code
creating and manipulating structures, rather than srfi-9 records
I remembered about a macroexpand-1 macro doing only one pass of expansion but
that's in Clojure !
I remembered incorrectly !
So this is where I'm at and where I've been
A stumble in yak shaving, substantially.
I could give up on macros and use srfi-9 records for now.
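For reference, this is roughly what the plain srfi-9 route could look like. It is only a sketch: the record name <pkg> and its fields are made up for illustration and are not taken from Guildhall's pkg-list.scm format or from guix/records.scm.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical package record; the type name and field names are
;; assumptions for illustration only.
(define-record-type <pkg>
  (make-pkg name version dependencies)
  pkg?
  (name pkg-name)
  (version pkg-version)
  (dependencies pkg-dependencies))

;; Dependencies are written as Guile module names here, which is
;; also just an assumption about how such a tool might describe them.
(define example
  (make-pkg "my-guile-lib" "0.1" '((srfi srfi-9) (ice-9 match))))

(display (pkg-name example))
(newline)
```

Unlike the Guix record macro, this gives no default field values or keyword-style construction, but it expands to something small enough to read.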
The next step would be to test if some guile deps are available (calling
the line I already have briefly discussed here)
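A minimal sketch of such a dependency check, assuming that "available" simply means "the module can be resolved in the current load path". The helper name module-available? is hypothetical, and this is not necessarily the line mentioned above.

```scheme
;; Return #t if the Guile module named by MODULE-NAME (a list of
;; symbols) can be loaded, #f otherwise.  resolve-interface raises an
;; error when no code is found for the module, and
;; false-if-exception turns that error into #f.
(define (module-available? module-name)
  (and (false-if-exception (resolve-interface module-name)) #t))

;; Report on a few modules; (no such module) is a deliberately
;; nonexistent name used to exercise the failing branch.
(for-each
 (lambda (name)
   (format #t "~s: ~a~%" name
           (if (module-available? name) "found" "missing")))
 '((srfi srfi-9) (ice-9 match) (no such module)))
```

This only checks that the module loads; it says nothing about versions, which an Autoconf-style test would probably also want to cover.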
Should anyone be working on a guile based build tool, I'd love them to make
their efforts known to the community, maybe publish a repo somewhere
Anyway, kudos on this, thank you!
Ludo’.