Source: rdfind
Version: 1.3.4-2
Severity: wishlist
Tags: patch
User: reproducible-bui...@lists.alioth.debian.org
Usertags: toolchain fileordering
X-Debbugs-Cc: reproducible-bui...@lists.alioth.debian.org

Hi!

While working on the "reproducible builds" effort [1], we have noticed
that the behaviour of rdfind depends on the filesystem order.
When it finds duplicates and replaces them with symlinks, the file
that was last created or modified (?) is used as the symlink
destination, which is not deterministic.

The attached patch sorts the file list by filename before any
operation is performed on it. This results in deterministic
behaviour, so that packages using rdfind can be built reproducibly.
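
To illustrate the idea outside of rdfind, here is a minimal sketch of
sorting a file list by name so that the result no longer depends on the
order in which the entries were discovered. Entry, compare_on_filename
and sorted_by_name are hypothetical names made up for this example;
only the comparison itself mirrors what the patch adds to Fileinfo.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for rdfind's Fileinfo; only the name matters here.
struct Entry {
  std::string name;
};

// Same ordering as the patch's compareonfilename: plain std::string
// comparison, which does not depend on the locale.
static bool compare_on_filename(const Entry &a, const Entry &b) {
  return a.name.compare(b.name) < 0;
}

// Returns a sorted copy of the list. Whatever order the filesystem
// handed the entries over in, the output order is the same.
static std::vector<Entry> sorted_by_name(std::vector<Entry> files) {
  std::sort(files.begin(), files.end(), compare_on_filename);
  // Post-condition: the list is now ordered by filename.
  assert(std::is_sorted(files.begin(), files.end(), compare_on_filename));
  return files;
}
```

Two directory listings that contain the same paths in a different
order should come out identical after sorted_by_name, which is the
property a reproducible build needs before any "which duplicate do we
keep" decision is made.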

Regards,
 Reiner

[1]: https://wiki.debian.org/ReproducibleBuilds

diff --git a/debian/changelog b/debian/changelog
index 997d44b..a5365b8 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,10 @@
+rdfind (1.3.4-2.0~reproducible1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Added patch to sort files before doing any operation.
+
+ -- Reiner Herrmann <rei...@reiner-h.de>  Sun, 16 Aug 2015 19:00:44 +0200
+
 rdfind (1.3.4-2) unstable; urgency=low
 
   [ Helmut Grohne ]
diff --git a/debian/patches/reproducible_build.patch b/debian/patches/reproducible_build.patch
new file mode 100644
index 0000000..8871120
--- /dev/null
+++ b/debian/patches/reproducible_build.patch
@@ -0,0 +1,41 @@
+Author: Reiner Herrmann <rei...@reiner-h.de>
+Description: sort the filelist when it is complete to get reproducible behaviour
+
+Index: rdfind-1.3.4/Fileinfo.hh
+===================================================================
+--- rdfind-1.3.4.orig/Fileinfo.hh
++++ rdfind-1.3.4/Fileinfo.hh
+@@ -189,6 +189,10 @@ public:
+   static bool compareondepth(const Fileinfo &a, const Fileinfo &b)
+   {return (a.depth() < b.depth());}
+ 
++  //returns true if a has lower filename than b
++  static bool compareonfilename(const Fileinfo &a, const Fileinfo &b)
++  {return (a.name().compare(b.name()) < 0);}
++
+   //fills with bytes from the file. if lasttype is supplied,
+   //it is used to see if the file needs to be read again - useful if
+   //file is shorter than the length of the bytes field.
+@@ -235,6 +239,10 @@ public:
+   static bool equaldepth(const Fileinfo &a, const Fileinfo &b)
+   {return (a.depth()==b.depth());}
+ 
++  //returns true if filenames are equal
++  static bool equalfilename(const Fileinfo &a, const Fileinfo &b)
++  {return (a.name()==b.name());}
++
+   //returns true if file is a regular file. call readfileinfo first!
+   bool isRegularFile() {return m_info.is_file;}
+ 
+Index: rdfind-1.3.4/rdfind.cc
+===================================================================
+--- rdfind-1.3.4.orig/rdfind.cc
++++ rdfind-1.3.4/rdfind.cc
+@@ -349,6 +349,7 @@ int main(int narg, char *argv[])
+   cout<<dryruntext<<"Now have "<<filelist1.size()<<" files in total."<<endl;
+   
+   
++  gswd.sortlist(&Fileinfo::compareonfilename,&Fileinfo::equalfilename);
+ 
+   //mark files with a unique number
+   gswd.markitems();
diff --git a/debian/patches/series b/debian/patches/series
index e69de29..b2026fe 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -0,0 +1 @@
+reproducible_build.patch
