On Sun, Mar 04, 2012 at 12:31:16AM +0100, Timo Weingärtner wrote:
> Package: wnpp
> Severity: wishlist
> X-Debbugs-CC: debian-de...@lists.debian.org
>
> Package name: hadori
> Version: 0.2
> Upstream Author: Timo Weingärtner <t...@tiwe.de>
> URL: https://github.com/tiwe-de/hadori
> License: GPL3+
> Description: Hardlinks identical files
>  This might look like yet another hardlinking tool, but it is the only one
>  which only memorizes one filename per inode. That results in less memory
>  consumption and faster execution compared to its alternatives. Therefore
>  (and because all the other names are already taken) it's called
>  "HArdlinking DOne RIght".
>  .
>  Advantages over other hardlinking tools:
>   * predictability: arguments are scanned in order, each first version is kept
>   * much lower CPU and memory consumption
>   * hashing option: speedup on many equal-sized, mostly identical files
>
> The initial comparison was with hardlink, which got OOM killed with a hundred
> backups of my home directory. Last night I compared it to duff and rdfind
> which would have happily linked files with different st_mtime and st_mode.

You might want to try hardlink 0.2~rc1. In any case, I don't think we need
yet another such tool in the archive. If you want that algorithm, we can
implement it in hardlink 0.2 in probably about 10 lines. I had that locally
and it works, so if you want it, we can add it and avoid the need for one
more hack in that space.

hardlink 0.2 is written in C, and uses a binary tree to map (dev_t, off_t)
to a struct file which contains the stat information plus the name for
linking. It requires two allocations per file: one for the struct file with
the filename, and one for the node in the tree (well, actually we only need
the node for the first file with a specific (dev_t, off_t) tuple). A node
has 3 pointers.

-- 
Julian Andres Klode - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.
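
For illustration, the data structure described above might look roughly like
the sketch below. This is not the actual hardlink 0.2 source: the field
layout, the choice of which three pointers a node holds (left, right, file
list), and the function names tree_insert, tree_find, tree_nodes and
tree_free are all assumptions made for the example. Only the key fields are
stored here, whereas the real struct file is said to hold the full stat
information.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical layout: one allocation holds the key fields and the name. */
struct file {
    dev_t dev;
    off_t size;
    struct file *next;  /* other files sharing the same (dev, size) key */
    char name[];        /* flexible array member, stored inline */
};

/* Assumed node layout with three pointers: two children and a file list. */
struct node {
    struct node *left;
    struct node *right;
    struct file *files;
};

static int key_cmp(dev_t d1, off_t s1, dev_t d2, off_t s2)
{
    if (d1 != d2)
        return d1 < d2 ? -1 : 1;
    if (s1 != s2)
        return s1 < s2 ? -1 : 1;
    return 0;
}

/* Insert a file; a tree node is allocated only for the first file per key. */
struct file *tree_insert(struct node **root, dev_t dev, off_t size,
                         const char *name)
{
    assert(root != NULL && name != NULL);

    size_t len = strlen(name) + 1;
    struct file *f = malloc(sizeof *f + len);   /* allocation 1: the file */
    if (f == NULL)
        return NULL;
    f->dev = dev;
    f->size = size;
    f->next = NULL;
    memcpy(f->name, name, len);

    struct node **link = root;
    while (*link != NULL) {
        struct file *head = (*link)->files;
        int c = key_cmp(dev, size, head->dev, head->size);
        if (c == 0) {               /* key already known: reuse the node */
            f->next = head;
            (*link)->files = f;
            return f;
        }
        link = c < 0 ? &(*link)->left : &(*link)->right;
    }

    struct node *n = malloc(sizeof *n);         /* allocation 2: the node */
    if (n == NULL) {
        free(f);
        return NULL;
    }
    n->left = n->right = NULL;
    n->files = f;
    *link = n;
    return f;
}

/* Return the list of files stored under (dev, size), or NULL if none. */
struct file *tree_find(const struct node *root, dev_t dev, off_t size)
{
    while (root != NULL) {
        int c = key_cmp(dev, size, root->files->dev, root->files->size);
        if (c == 0)
            return root->files;
        root = c < 0 ? root->left : root->right;
    }
    return NULL;
}

/* Count tree nodes, i.e. the number of distinct keys seen so far. */
size_t tree_nodes(const struct node *root)
{
    return root ? 1 + tree_nodes(root->left) + tree_nodes(root->right) : 0;
}

void tree_free(struct node *root)
{
    if (root == NULL)
        return;
    tree_free(root->left);
    tree_free(root->right);
    for (struct file *f = root->files, *next; f != NULL; f = next) {
        next = f->next;
        free(f);
    }
    free(root);
}
```

In this sketch the tree is unbalanced and new files are prepended to the
per-key list; a caller would start from a NULL root and call tree_insert
once per scanned file, then compare only the files that tree_find returns
for the same key.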