On Thu, Apr 11, 2002 at 10:40:31PM -0700, Robert Tiberius Johnson wrote:
> On Wed, 2002-04-10 at 02:28, Anthony Towns wrote: 
> > I'd suggest your formula would be better off being:
> >     bandwidthcost = sum( x = 1..30, prob(x) * cost(x) / x )
> I think it depends on what you're measuring.  I can think of two ways to
> measure the "goodness" of these schemes (there are certainly others): 
> 
> 1. What is the average bandwidth required at the server? 
> 2. What is the average bandwidth required at the client? 

I don't think the bandwidth at the server is a major issue to anyone,
although obviously improvements there are a Good Thing.

Personally, I think "amount of time spent waiting for apt-get update
to finish" is the important measure (well, "apt-get update; apt-get
dist-upgrade" is important too, but I don't think we've seen any feasible
ideas for improving the latter).

> prob2(i)=(prob1(i)/i)*norm, 
> 
> where norm is a normalization factor so the probabilities sum to 1. 
> I've been looking at question 2, and you're suggesting that I look at
> question 1, except you forgot the normalization factor.  I think this is
> what you mean.  Please correct me if I've misunderstood. 

No, I'm not. I'm saying that "the amount of time spent waiting for
apt-get update" needs to count every apt-get update you run, not just
the first. So, if over a period of a week, I run it seven times, and you
run it once, I wait seven times as long as you do, so it's seven times
more important to speed things up for me, than for you.
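
To make the weighting concrete, here is a rough Python sketch of the
formula quoted at the top, sum(x = 1..30, prob(x) * cost(x) / x). The
names daily_cost, flat_cost, daily_only and weekly_only, and the 1200K
figure, are made up for illustration; nothing here is measured data.

```python
def daily_cost(prob, cost, max_days=30):
    """Expected cost per day: sum over update intervals x of prob(x) * cost(x) / x."""
    return sum(prob(x) * cost(x) / x for x in range(1, max_days + 1))


def flat_cost(x):
    # Hypothetical: every update fetches the same 1200K, however long the gap.
    return 1200.0


def daily_only(x):
    # A user who always updates every day.
    return 1.0 if x == 1 else 0.0


def weekly_only(x):
    # A user who always updates every seven days.
    return 1.0 if x == 7 else 0.0


daily_user = daily_cost(daily_only, flat_cost)
weekly_user = daily_cost(weekly_only, flat_cost)

# Ratio of per-day cost between the two users.
print(daily_user / weekly_user)
```

With a flat per-update cost, the daily user pays about seven times as
much per day as the weekly user, which is the point of dividing by x.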

> Anyway, here are the results you asked for.  I'm NOT including the
> normalization factor for easier comparison with your numbers.  My diff
> numbers are a little different from yours mainly because I charge 1K of
> overhead for each file request. 

Merging, and reordering by decreasing estimated bandwidth. The ones marked
with *'s aren't worth considering because there's another method that both
requires less bandwidth and takes up less diskspace. The ones without
stars are thus ordered by increasing diskspace and decreasing bandwidth.

> days/
> bsize dspace          ebwidth
> -------------------------------

Having the "ebwidth" of the current situation (everyone downloads the
entire Packages file) for comparison would be helpful.

> 1     12.000K         342.00K [diff]
> 20    312.50K *       173.70K [cksum/rsync]
> 2     24.000K *       171.20K [diff]
> 3     36.000K *       95.900K [diff]
> 40    156.30K *       89.300K [cksum/rsync]
> 60    104.20K *       62.200K [cksum/rsync]
> 4     48.000K *       58.500K [diff]
> 80    78.100K *       49.300K [cksum/rsync]
> 100   62.500K *       42.200K [cksum/rsync]
> 5     60.000K *       38.800K [diff]
> 120   52.100K *       37.900K [cksum/rsync]
> 400   15.600K         37.700K [cksum/rsync]
> 380   16.400K         36.800K [cksum/rsync]
> 360   17.400K         35.900K [cksum/rsync]
> 140   44.600K *       35.300K [cksum/rsync]
> 340   18.400K         35.100K [cksum/rsync]
> 320   19.500K         34.300K [cksum/rsync]
> 300   20.800K *       33.600K [cksum/rsync]
> 160   39.100K *       33.600K [cksum/rsync]
> 280   22.300K         33.000K [cksum/rsync]
> 180   34.700K *       32.700K [cksum/rsync]
> 260   24.000K         32.500K [cksum/rsync]
> 240   26.000K         32.200K [cksum/rsync]
> 200   31.300K *       32.200K [cksum/rsync]
> 220   28.400K         32.100K [cksum/rsync]
> 6     72.000K         27.900K [diff]
> 7     84.000K         21.800K [diff]
> 8     96.000K         18.200K [diff]
> 9     108.00K         16.100K [diff]
> 10    120.00K         14.900K [diff]
> 11    132.00K         14.100K [diff]
> 12    144.00K         13.700K [diff]
> 13    156.00K         13.400K [diff]
> 14    168.00K         13.300K [diff]
> 15    180.00K         13.100K [diff]
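
The starring rule described above the table can be sketched in Python
as follows. This only uses a handful of the quoted rows, and the
helper name "dominated" is mine; it's an illustration of the rule, not
the script that produced the table.

```python
# (label, dspace in K, ebwidth in K); a subset of the rows quoted above
rows = [
    ("1 diff", 12.0, 342.0),
    ("20 cksum/rsync", 312.5, 173.7),
    ("2 diff", 24.0, 171.2),
    ("5 diff", 60.0, 38.8),
    ("400 cksum/rsync", 15.6, 37.7),
    ("6 diff", 72.0, 27.9),
    ("15 diff", 180.0, 13.1),
]


def dominated(row, others):
    """True if some other row is no worse on both axes and better on at least one."""
    _, d, b = row
    return any(
        od <= d and ob <= b and (od < d or ob < b)
        for _, od, ob in others
    )


# Keep only the rows worth considering, ordered by increasing diskspace.
keep = sorted((r for r in rows if not dominated(r, rows)), key=lambda r: r[1])

for label, d, b in keep:
    print(f"{label:16s} {d:8.1f}K {b:8.1f}K")
```

Sorting the survivors by diskspace automatically gives decreasing
bandwidth, since any row that used more of both would have been
dropped.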

180k is roughly 10% of the size of the corresponding Packages.gz, so
it is relatively trivial. Since we'll probably do it at the same time as
dropping the uncompressed Packages file (sid/main/i386 alone is 6MB),
this is pretty negligible.

Cheers,
aj

-- 
Anthony Towns <[EMAIL PROTECTED]> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. GPG signed mail preferred.

     ``BAM! Science triumphs again!'' 
                    -- http://www.angryflower.com/vegeta.gif
