Source: transdecoder
Version: 2.0.1+dfsg-2
Severity: wishlist
Tags: patch
User: reproducible-bui...@lists.alioth.debian.org
Usertags: randomness
X-Debbugs-Cc: reproducible-bui...@lists.alioth.debian.org

Hi,

While working on the "reproducible builds" effort [1], we have noticed
that transdecoder could not be built reproducibly.

One of the sample files (sample_data/transcripts.fasta) is generated by a
Perl script (cufflinks_gtf_genome_to_cdna_fasta.pl) that loops over the
keys of a hash. Perl does not guarantee a stable order for hash keys
between runs, so the output differs from one build to the next.

The attached patch fixes this by setting the environment variable
PERL_HASH_SEED to 0 before calling the script, so that the hash keys are
traversed in the same order on every run.

Once applied, transdecoder can be built reproducibly in our current
experimental framework.

[1]: https://wiki.debian.org/ReproducibleBuilds

Regards,
--
Dhole

diff -Nru transdecoder-2.0.1+dfsg/debian/changelog transdecoder-2.0.1+dfsg/debian/changelog
--- transdecoder-2.0.1+dfsg/debian/changelog 2015-12-29 00:54:39.000000000 +0100
+++ transdecoder-2.0.1+dfsg/debian/changelog 2016-04-22 20:42:48.000000000 +0200
@@ -1,3 +1,11 @@
+transdecoder (2.0.1+dfsg-2.1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Set a hash seed for Perl before generating sample_data/transcripts.fasta
+    to get a reproducible output.
+
+ -- Eduard Sanou <dh...@openmailbox.org>  Fri, 22 Apr 2016 20:42:02 +0200
+
 transdecoder (2.0.1+dfsg-2) unstable; urgency=medium
 
   [ Andreas Tille ]
diff -Nru transdecoder-2.0.1+dfsg/debian/patches/reproducible-sample_data.patch transdecoder-2.0.1+dfsg/debian/patches/reproducible-sample_data.patch
--- transdecoder-2.0.1+dfsg/debian/patches/reproducible-sample_data.patch 1970-01-01 01:00:00.000000000 +0100
+++ transdecoder-2.0.1+dfsg/debian/patches/reproducible-sample_data.patch 2016-04-22 20:43:56.000000000 +0200
@@ -0,0 +1,16 @@
+Description: Make the sample data reproducible
+ Set a fixed hash seed for Perl before generating sample_data/transcripts.fasta
+ to get reproducible results
+Author: Eduard Sanou <dh...@openmailbox.org>
+
+--- transdecoder-2.0.1+dfsg.orig/sample_data/runMe.sh
++++ transdecoder-2.0.1+dfsg/sample_data/runMe.sh
+@@ -26,7 +26,7 @@ fi
+ ../util/cufflinks_gtf_to_alignment_gff3.pl transcripts.gtf > transcripts.gff3
+ 
+ ## generate transcripts fasta file
+-../util/cufflinks_gtf_genome_to_cdna_fasta.pl transcripts.gtf test.genome.fasta > transcripts.fasta
++PERL_HASH_SEED=0 ../util/cufflinks_gtf_genome_to_cdna_fasta.pl transcripts.gtf test.genome.fasta > transcripts.fasta
+ 
+ ## Extract the long ORFs
+ ../TransDecoder.LongOrfs -t transcripts.fasta
diff -Nru transdecoder-2.0.1+dfsg/debian/patches/series transdecoder-2.0.1+dfsg/debian/patches/series
--- transdecoder-2.0.1+dfsg/debian/patches/series 2015-12-29 00:54:39.000000000 +0100
+++ transdecoder-2.0.1+dfsg/debian/patches/series 2016-04-22 20:43:08.000000000 +0200
@@ -1,3 +1,4 @@
 cd-hit-est-rename
 fix-whatis
 reproducible-build
+reproducible-sample_data.patch
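
For anyone who wants to check the effect of the runMe.sh change in isolation, here is a minimal sketch (not part of the patch). It prints the keys of a small, made-up Perl hash twice with PERL_HASH_SEED=0 and compares the two runs. The hash contents and the names script, run1 and run2 are hypothetical, and it assumes perl is available on PATH.

```shell
#!/bin/sh
# Hypothetical stand-in for the loop over hash keys in
# cufflinks_gtf_genome_to_cdna_fasta.pl; the keys below are made up.
script='my %h = map { $_ => 1 } qw(g1 g2 g3 g4 g5 g6 g7 g8); print join(",", keys %h), "\n";'

# Same prefix as in the patched runMe.sh: pin the hash seed for this one command.
run1=$(PERL_HASH_SEED=0 perl -e "$script")
run2=$(PERL_HASH_SEED=0 perl -e "$script")

# With a fixed seed, both invocations should list the keys in the same order.
if [ "$run1" = "$run2" ]; then
    echo "reproducible"
else
    echo "differs"
fi
```

Dropping the PERL_HASH_SEED=0 prefix on a recent Perl (5.18 or later) will usually make the two runs print the keys in different orders. Sorting the keys inside the script (sort keys %h) would be another way to get stable output, but that means changing the upstream code rather than only the call in runMe.sh.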