On Thursday 21 March 2013, Niels Thykier wrote:
> On 2013-03-20 18:30, Stefan Fritsch wrote:
> > That would be the perfect solution. Unfortunately, it would also
> > mean that apt's pdiff implementation would need to be rewritten
> > because it is so inefficient. [...]
>
> I spoke with David Kalnischkies (DonKult) and he told me that (part
> of) the reason why it is slow is that it makes no assumption about
> pdiffs. It is my understanding (of the code) that apt-file just
> blindly downloads all ("new") patches and applies them in one go.

I was under the impression that the Index file tells you exactly which
patches are necessary. But due to the lack of any formal specification
(at least at the time I wrote diffindex-* in apt-file), maybe I was
wrong.

> Allegedly, reprepro can merge pdiffs so not all of them need to be
> applied and (understandably) the APT maintainers do not want that
> to break.

This seems very broken to me. Merging the diffs on the server side has
little benefit. You still need exactly the same number of diffs on the
server, but each diff gets larger and there is more change among the
diffs, so the efficiency of caching proxies goes down. With keep-alive
connections and pipelining, downloading a few dozen files is not that
big a problem.

And there are some implementations (at least apt-file's and the
security tracker's) that depend on the pdiffs being incremental in
order to be faster than apt by at least one order of magnitude. So if
the archive ever used diff merging, those implementations would
break.

> The solution is probably to extend the pdiff format
> (e.g. like the suggestion in [1]), so the client side can see
> exactly which patches are needed (instead of having to do them one
> at a time).
> To this end, I have been making a bit of noise in #d-ftp;
> hopefully I will have news here soon.

I think apt should still be changed to assume incremental diffs unless
the Index file is of a new format. That would bring the benefit even
for old-style archives. Merging diffs on the server does not give
comparable benefit.

> David reminded me that the APT side of things already had a GSoC
> last year[2]. The code has not been merged yet but at least a
> proof-of-concept branch is there. Assuming that can be used, we
> are probably very close to making apt-file's update/purge commands
> obsolete.

Nice. But the pdiff problem still needs to be solved. You don't want to
slow down apt-file update by a factor of 10 or more.

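To make the "incremental diffs" assumption above concrete, here is a
minimal sketch of how a client could pick the patches it needs. It is
not apt's or apt-file's actual code. It assumes a simplified,
hypothetical model of the Index file: an ordered history of
(hash of the file before the patch, patch name) pairs, oldest first,
plus the hash of the current file. The function name select_patches
and the sample hashes and patch names are made up for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical, simplified model of a pdiff Index (an assumption, not a
# formal specification): each history entry pairs the hash of the file
# *before* a patch with the name of that patch, oldest entry first.
History = List[Tuple[str, str]]


def select_patches(history: History, local_hash: str,
                   current_hash: str) -> Optional[List[str]]:
    """Return the patch names to apply, in order.

    Returns [] if the local file is already current, and None if the
    local hash is not in the history (a full download is needed).
    """
    if local_hash == current_hash:
        return []
    for i, (old_hash, _name) in enumerate(history):
        if old_hash == local_hash:
            # Incremental assumption: every later patch applies on top
            # of the result of the previous one.
            return [name for _hash, name in history[i:]]
    return None


if __name__ == "__main__":
    history = [
        ("aaa", "2013-03-18-1"),
        ("bbb", "2013-03-19-1"),
        ("ccc", "2013-03-20-1"),
    ]
    print(select_patches(history, "bbb", "ddd"))
```

With incremental diffs the client only needs the tail of the history
starting at its own hash, and the selected patches can be fetched over
one pipelined connection. If the server merged diffs, this selection
rule would no longer hold.
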
> As I understood Nick, he was not interested in maintaining
> the current Perl variant of apt-file, but he would be interested
> in rewriting (and maintaining said rewrite of) apt-file. He was
> certain he could improve the search speed of apt-file while doing
> so. Given the results of his apt-show-versions rewrite I am
> looking forward to that rewrite with great anticipation. :)
>
> What I propose we do is that I take over the maintenance of the
> current apt-file. I will focus on making apt-file update/purge
> obsolete.

Sure. It's in collab-maint. Just commit away. But don't remove Thijs
or Enrico; they still want to stay co-maintainers.

Cheers,
Stefan