Jonathan Oxer <[EMAIL PROTECTED]> writes:

> On Thu, 2003-11-06 at 10:50, Martin Pitt wrote:
>
> > But isn't rsync supposed to do this? I don't know exactly how
> > efficiently it detects and compresses binary differences, but it
> > definitely does it and not too bad. With rsync, you get both the easy
> > management of complete debs and the bandwidth-saving of binary diffs.
> > The only problem is that apt does not support rsync IIRC, but this
> > could be solved by separately download the new debs into apt's cache
> > with a script using rsync.
>
> Actually IIRC there was some work to make Apt support rsync. Goswin
> Brederlow was talking about adding it in Jan 2001. And a lot of mirrors
> actually are set up to support rsync.
>
> The problem is that most mirrors are loaded up pretty hard already, and
> if everyone started using rsync they'd probably melt.
>
> So it's a tradeoff, bandwidth vs CPU. At the moment CPU seems to be the
> factor for mirror admins.

rsync has two problems for this:

1. gzip streams are pretty much unique. A one-character change in the
   deb will create a completely different gzip stream (after that
   character). The --rsyncable option for gzip tries to flush the gzip
   dictionary at certain points so that rsync can catch on again.

2. rsync puts a huge CPU and I/O load on the servers. If every user
   used rsync, the servers would break down.

Several people, me included, have made rsync retrievers for apt with
various features, but due to the two problems above none of them ever
got picked up by the apt maintainer. In short, rsync support is not
wanted.

>
> Cheers :-)
>
> Jonathan Oxer

Way back (somewhere in 2001 IIRC) I suggested implementing cnysr (rsync
backwards), which is an rsync with reversed roles: the client, not the
server, does the block matching. The checksum files for the server can
then be precalculated and stored alongside the debs (2% mirror increase
with 1K block size, less for bigger blocks) or calculated (and cached)
on demand.

Since no calculation needs to be done on the server side, any HTTP/1.1
server has all the features (the Range header) needed for cnysr. This
means that any HTTP Debian mirror could be used directly, without any
changes apart from the client.

I also did some tests on using checksums of the uncompressed data along
with checksums for the compressed data, and a more complex algorithm to
simulate "rsyncing" the uncompressed data while only serving the
compressed files on the server. That works even better than the
--rsyncable patch to gzip, but it takes a lot of round trips to the
server and back (which takes time) and a larger checksum file (2-4%
mirror increase).

MfG
        Goswin
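
The reversed-role idea above can be sketched roughly as follows. This is only an illustration, not the code of any of the retrievers mentioned: all function names are made up, MD5 stands in for the strong checksum, the weak checksum is an rsync-style rolling sum, and the client is assumed to learn the length of the new file from the checksum file. If one assumes 4 bytes of weak plus 16 bytes of strong checksum per block, a 1K block size costs 20/1024, or about 2%, which matches the figure above.

```python
import hashlib


def weak(block):
    """rsync-style weak checksum, returned as the pair (a, b)."""
    n = len(block)
    a = sum(block) & 0xFFFF
    b = sum((n - i) * x for i, x in enumerate(block)) & 0xFFFF
    return a, b


def make_checksums(data, bs):
    """Mirror side, done once per file: one (weak, strong) pair per block."""
    sums = []
    for off in range(0, len(data), bs):
        blk = data[off:off + bs]
        a, b = weak(blk)
        sums.append(((b << 16) | a, hashlib.md5(blk).digest()))
    return sums


def find_local_blocks(old, sums, bs):
    """Client side: map block index in the new file -> offset in the old file."""
    by_weak = {}
    for idx, (w, strong) in enumerate(sums):
        by_weak.setdefault(w, []).append((idx, strong))
    found = {}
    if len(old) < bs:
        return found
    a, b = weak(old[:bs])
    pos = 0
    while True:
        cands = by_weak.get((b << 16) | a)
        if cands:
            # Weak hit: confirm with the strong checksum before trusting it.
            digest = hashlib.md5(old[pos:pos + bs]).digest()
            for idx, strong in cands:
                if strong == digest:
                    found.setdefault(idx, pos)
        if pos + bs >= len(old):
            break
        # Slide the window one byte to the right.
        out_byte, in_byte = old[pos], old[pos + bs]
        a = (a - out_byte + in_byte) & 0xFFFF
        b = (b - bs * out_byte + a) & 0xFFFF
        pos += 1
    return found


def missing_ranges(found, n_blocks, bs, new_len):
    """Merge the blocks not found locally into inclusive byte ranges."""
    ranges = []
    for idx in range(n_blocks):
        if idx in found:
            continue
        start = idx * bs
        end = min(start + bs, new_len) - 1
        if ranges and ranges[-1][1] + 1 == start:
            ranges[-1] = (ranges[-1][0], end)
        else:
            ranges.append((start, end))
    return ranges


def range_header(ranges):
    """Value for an HTTP/1.1 Range request header."""
    return "bytes=" + ",".join("%d-%d" % r for r in ranges)


def rebuild(old, found, ranges, fetch, bs, new_len):
    """Assemble the new file from local blocks plus the fetched ranges.

    fetch(start, end) is a stand-in for an HTTP Range GET and must
    return the bytes start..end (inclusive) of the new file.
    """
    out = bytearray(new_len)
    for idx, pos in found.items():
        out[idx * bs:idx * bs + bs] = old[pos:pos + bs]
    for start, end in ranges:
        out[start:end + 1] = fetch(start, end)
    return bytes(out)
```

The mirror only ever serves two static things here, the checksum file and byte ranges of the deb, so all the scanning cost lands on the client. A short final block is never matched in this sketch and is simply fetched.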