I'm interested in package popularity. I'm aware of popcon (https://popcon.debian.org/), but I'm more interested in actual downloads.
Do the Debian mirrors track unique downloads (e.g. by hashed IP address), and if not, why not? I can understand the privacy argument, but arguably package downloads aren't particularly revealing, and the data could be aggregated daily to limit exposure.

Boyuan Yang pointed out on the debian-www list that the "repository mirrors" often use third-party CDNs. I assume these use DNS-based load balancing. That means any single request log could be geographically biased, but it might still add interesting data. Another bias would be people using a VPN: they'd only be counted once per exit node (so you'd have some IPs downloading an extreme number of packages).

Parsing the request logs could be fairly trivial:

1. reduce to unique (ip, package) pairs every 24h
2. sum by package
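
To make the two steps above concrete, here is a rough Python sketch. It is only an illustration under assumptions: I'm guessing the mirror writes Apache/nginx-style Common Log Format lines, and the function name `count_unique_downloads` and the `salt` parameter are made up for this example. The package name is taken from the `<package>_<version>_<arch>.deb` filename in the request path.

```python
import hashlib
import re
from collections import Counter
from datetime import datetime

# ASSUMPTION: Common Log Format prefix, e.g.
# 192.0.2.1 - - [01/Jan/2024:10:00:00 +0000] "GET /path HTTP/1.1" 200 1234
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "GET (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)
# Binary packages are named <package>_<version>_<arch>.deb
DEB_RE = re.compile(r'/(?P<pkg>[^/_]+)_[^/_]+_[^/_]+\.deb$')


def count_unique_downloads(lines, salt=b"hypothetical-secret-salt"):
    """Return a Counter of package -> number of unique (day, hashed ip) pairs."""
    seen = set()
    for line in lines:
        m = LOG_RE.match(line)
        if not m or m["status"] != "200":
            continue  # skip malformed lines and unsuccessful requests
        deb = DEB_RE.search(m["path"])
        if not deb:
            continue  # not a .deb download (e.g. Release or Packages files)
        day = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z").date()
        hashed_ip = hashlib.sha256(salt + m["ip"].encode()).hexdigest()
        # Step 1: reduce to unique (ip, package) pairs per 24h window
        seen.add((day, hashed_ip, deb["pkg"]))
    # Step 2: sum by package
    return Counter(pkg for _day, _ip, pkg in seen)
```

Calling `count_unique_downloads(open("access.log"))` (the filename is a placeholder) would give per-package totals. One caveat on the hashing: the IPv4 space is small enough to brute-force, so the hash only protects anyone if the salt is kept secret and thrown away after each aggregation period.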