>>>>> Ivan Shmakov <i...@gray.siamics.net> writes: […]
> The news is that both the disassembly (e2dis) and reassembly (imrt)
> tools are now working (but read below for a caution) and available
> from their public Git repository [1] at Gitorious!

> [1] https://gitorious.org/e2dis/e2dis-devel

[…]

> Unfortunately, the performance of the image reassembly tool (imrt) is
> extremely poor for the filesystems of more than a few MiB's size.

Long story short: the changes I've made over the past month have made imrt significantly faster. I didn't do much testing, but it looks like an order of magnitude jump!

> (And it seems that there may be subtle bugs, too.)

The bug I was referring to is that the version of libgcrypt I use apparently doesn't support as many as 30 or 40 digest objects existing at the same time. With the digest removal logic re-done properly (8f56056d), it doesn't seem like a big issue anymore.

> As with jigdo-file(1), imrt doesn't rely on filenames, and instead
> “guesses” the output chunks the files passed to it correspond to by
> comparing the hashes (SHA-1 and SHA256 as of a726267a.) However,
> such a comparison is currently implemented in a straightforward yet
> suboptimal (as in: totally dumb) way, leading to the problem.

It was improved considerably in the commits from 2acc4706 to b5009c14 (roughly.)

Then, I switched to using prepared statements extensively (a51bf977, 5d2f278c), thus reducing the time to complete a simple 64 MiB test image reassembly by roughly 20%.

Finally, I've implemented the “cue sheet” support (abd326b5, 64743751.) Now, e2dis recurses over the filesystem's directories and records the filenames for all the “chunks” whose digests are recorded. Conversely, imrt uses this table to narrow the comparison of the files being processed to only those digests. If that fails, it still falls back to doing a full search. For the test image, this change reduces the time by some 75% more!
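To illustrate the “cue sheet” idea, here is a minimal sketch in Python. It is not the actual imrt code: the SQLite schema, the table and column names, and the helpers make_index, match and locate are all hypothetical, and it assumes (the post doesn't say so) that a chunk is matched by hashing the first `length` bytes of a file with SHA-1. The `?` placeholders are where Python's sqlite3 module uses parametrized (prepared) statements.

```python
import hashlib
import sqlite3

# Hypothetical schema; the real e2dis/imrt tables are not described here.
SCHEMA = """
CREATE TABLE chunk (digest TEXT PRIMARY KEY, offset INTEGER, length INTEGER);
CREATE TABLE cue (filename TEXT, digest TEXT REFERENCES chunk (digest));
"""

def sha1_hex(data):
    return hashlib.sha1(data).hexdigest()

def make_index(chunks, cues):
    """Build an in-memory index from (digest, offset, length) chunk rows
    and (filename, digest) cue rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO chunk VALUES (?, ?, ?)", chunks)
    db.executemany("INSERT INTO cue VALUES (?, ?)", cues)
    return db

def match(rows, data):
    """Return (offset, length) of the first candidate row whose digest
    equals the SHA-1 of the first `length` bytes of data, else None."""
    for digest, offset, length in rows:
        if length <= len(data) and sha1_hex(data[:length]) == digest:
            return offset, length
    return None

def locate(db, filename, data):
    """Try the cue sheet first; fall back to a full search over all chunks."""
    # Narrow search: only the digests recorded for this filename.
    narrow = db.execute(
        "SELECT c.digest, c.offset, c.length FROM cue q"
        " JOIN chunk c ON c.digest = q.digest WHERE q.filename = ?",
        (filename,)).fetchall()
    hit = match(narrow, data)
    if hit is not None:
        return hit, "cue"
    # Fallback: compare against every recorded chunk.
    full = db.execute("SELECT digest, offset, length FROM chunk").fetchall()
    hit = match(full, data)
    if hit is not None:
        return hit, "full"
    return None, "none"
```

With such a table, a file whose name is in the cue sheet is checked against a handful of digests only, while a renamed file still gets found by the slower full search.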
As for the missing parts: there's still virtually no command line interface, and I hope to fix that within a month or so, making a proper release shortly after. There is no documentation yet either, nor any handy tools to maintain the databases created. In particular, while the format allows a single database to hold indices for several images (say, images for different platforms), there are no tools to either “split” such a database or “join” a few together.

[…]

-- 
FSF associate member #7257