I should probably be more specific: for a long time, I haven't wanted caching in pkg_add, because sometimes mirrors get out of sync, and I was worried a bad cache would be worse than no cache.

Also: when to generate it. Also: how to store it. At some point we had sqlite in base, but it didn't last because there was no client. AND pkg_add was an issue, because it's perl, so to use sqlite in perl you also need DBI, which is *huge*.

At some point during the last 10 months, I gave a script to generate a cache package to sthen and naddy, so you guys could see index-0.tgz for a while.... but I didn't really have time to hack on it, so naddy@ and sthen@ gave it up. Having it as a separate package meant a lot more glue.

Then I figured: let's make it a flavor of quirks. But coping with two flavors was complicated. So it became an extra file in quirks... which means that quirks had to build LAST (because, contrary to sqlports/pkglocatedb, we want quirks to reflect *built packages*, not possible ports). This was surprisingly easy to do.

I did a few tryouts which were hilariously bogus, but the speed-up was tremendous!(*) Not opening a new connection for each new update-info *is* lots faster (the alternative would be to use HTTP 1.1 or later to keep the connection open... this may happen at some point, but it's way more complicated).

I was expecting the call-out to locate(1) to be really slow... that's not the case. It can make the fan hot (dixit landry@) but it's not that bad.

Next part was ironing out some details: the cache is directly related to ONE single repository location (which is okay, because most scenarios go to one single repository), and it was possible to run locate beforehand on the "assumed" list of packages to update.

So this is about the state of the current code. It still needs better checks to deal with catastrophic failures (if you get a cache that does NOT match the repository at all, some earlier bogus code resulted in pkg_add -u not updating anything, which is not okay). (Note that in reality, most scenarios where the cache is wrong/incomplete will just end up in it not being used, so it is not that bad.)
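To make the failure mode concrete, here is a rough sketch of the "a bad cache must not mean no updates" rule. It is Python rather than the perl in pkg_add, and it is not the actual code: the names pick_updates, cache, repo_names and slow_lookup are all made up for illustration, and the real cache format is not shown here. The idea is only that a cache that does not match the repository gets ignored, and the slow per-package lookup is used instead.

```python
def pick_updates(installed, cache, repo_names, slow_lookup):
    """Return {stem: candidate} for packages that have an update.

    installed:   {stem: installed package name}        (hypothetical shape)
    cache:       {stem: package name the cache claims}  (hypothetical shape)
    repo_names:  set of package names actually present in the repository
    slow_lookup: callable(stem) -> package name or None, the uncached path
    """
    # Only trust the cache if it is non-empty and every name it mentions
    # really exists in the repository. Otherwise ignore it completely.
    cache_ok = bool(cache) and all(name in repo_names for name in cache.values())

    updates = {}
    for stem, current in installed.items():
        if cache_ok and stem in cache:
            candidate = cache[stem]
        else:
            # Wrong or incomplete cache: fall back to the slow path instead
            # of concluding that there is nothing to update.
            candidate = slow_lookup(stem)
        if candidate is not None and candidate != current:
            updates[stem] = candidate
    return updates
```

With a cache that matches the repository, slow_lookup is never called; with a cache from some other repository, every package goes through slow_lookup and the result is the same as with no cache at all.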
As far as actual testing goes: unless I instrument it a lot, there's not much one can do apart from reading the code and checking that it does what it should, that it is actually faster, and that the update is still correct.

(*) tremendous, as in, really WOW. I wasn't expecting such a difference, and I wrote the patch. Typical update scenarios where about everything is up to date go down from tens of minutes to tens of seconds; it's THAT drastic.

The current code isn't active yet... it's easy to find the line that prevents it from running. I'm still fussing over error-handling, but it's getting there.

-- Marc