Hi,

[Sorry for breaking the thread; my POP3 provider stopped working.]
[Please Cc: me! <[EMAIL PROTECTED]>. Sorry! ;-)]

1. RFDiscussion on big Packages.gz

1.1. Some statistics

% grep-dctrl -P -sPackage,Priority,Installed-Size,Version,Depends,Provides,Conflicts,Filename,Size,MD5sum -r '.*' ftp.jp.debian.org_debian_dists_unstable_main_binary-i386_Packages | gzip -9 > test.pkg.gz
% gzip -9 ftp.jp.debian.org_debian_dists_unstable_main_binary-i386_Packages
% ls -alF *.gz
-rw-r--r-- 1 zw zw 1157494 Jan 7 21:20 ftp.jp.debian.org_debian_dists_unstable_main_binary-i386_Packages.gz
-rw-r--r-- 1 zw zw  341407 Jan 7 21:23 test.pkg.gz
%

This approach is simple, straightforward, and almost compatible. Yet it would let Debian accept 10K more packages with little loss. Worth consideration, IMHO.

Better still if `Description:' etc. could go into a separate gzipped file alongside the Debian package.

1.2. Little math

Suppose:

1) Site A gets K hits of `apt-get update' per day. With each day that passes, M extra hits are added, as Debian grows more popular.
2) N new packages come into Debian every day. After `gzip -9', each contributes 206 bytes to the old-format package index file and 61 bytes to the new-format one. The current number of packages is P.
3) X is the number of days passed (the X axis).
4) B is the byte size of the data flow for `apt-get update' on that day.

On the server side (for the client side, K = 1 and M = 0):

B = (K + M*X) * (P + N*X) * 206    for the old-format package index
B = (K + M*X) * (P + N*X) * 61     for the new-format package index

[It's still an X^2 function either way, so in theory it's not a big deal. ;-)]
[If only we could eliminate the need for the package index. That is possible.]

For K = 500, P = 6000, X = 0, the server-side B is:

[EMAIL PROTECTED] ~/tmp % echo $((6000*500*206))
618000000
[EMAIL PROTECTED] ~/tmp % echo $((6000*500*61))
183000000
[EMAIL PROTECTED] ~/tmp %

[Though caches could help servers a great deal in such cases.]

2. Comparison with the DIFF and RSYNC methods of APT

2.1. They need server support. (More than a change of directory layout and client tool.)
2.2. If you don't update for a long time, DIFF won't help. RSYNC helps less.

3. Additional benefits

Separating changelog.Debian and `Description:' etc. out into a meta-info file could help users:

1) reduce the bandwidth consumed
2) make their upgrade decisions more easily.

-- 
echo <<EOF |cpp - -|egrep -v '(^#|^$)'
/* =|=X ++
 * /\+_ p7 <[EMAIL PROTECTED]> */
EOF
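
P.S. For trying other values in the formula from 1.2, here is a minimal sketch as a shell function. The name `daily_bytes' and the sample values M = 5, N = 10, X = 365 are made up for illustration; only K = 500, P = 6000 and the 206/61 bytes-per-package figures come from the message above. It assumes a shell with 64-bit arithmetic.

```shell
# Hypothetical helper: server-side bytes per day for `apt-get update',
# following B = (K + M*X) * (P + N*X) * bytes_per_package from 1.2.
# Arguments: $1=K  $2=M  $3=P  $4=N  $5=X  $6=bytes per package
daily_bytes() {
    echo $(( ($1 + $2 * $5) * ($3 + $4 * $5) * $6 ))
}

# Day 0 with the figures from the message (K = 500, P = 6000).
daily_bytes 500 0 6000 0 0 206    # old format, prints 618000000
daily_bytes 500 0 6000 0 0 61     # new format, prints 183000000

# One year later with assumed growth (M = 5, N = 10 are illustrative only).
daily_bytes 500 5 6000 10 365 206
daily_bytes 500 5 6000 10 365 61
```

Whatever K, M, P, N and X are, the two formats differ only in the per-package constant, so the old format always costs 206/61 (about 3.4) times the traffic of the new one.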