> The web-scraper algorithm only finds the earliest version of each
> Ubuntu package.
>
> I hope there's a better way to get this information, instead of
> page-scraping, but don't know it to suggest it.

It's still a web-scraper, but the attached patch makes it find all
Ubuntu versions of a package.

Steve

commit cac56bacf9abfbaca6834c90e47591335443a581
Author: Steve Cotton <st...@s.cotton.clara.co.uk>
Date:   Tue Dec 30 15:41:50 2008 +0000

    Show all Ubuntu versions

diff --git a/whohas b/whohas
index 0de8f47..7815980 100755
--- a/whohas
+++ b/whohas
@@ -279,19 +279,25 @@ sub ubuntu {
 	my @groups;
 	my $now = 0;
 	for (my $i = 0; $i<@lines; $i++) {
-		if ($lines[$i] =~ /<h3>Package /) {
-			my $name = (split /h3>Package |<\/h3>/, $lines[$i])[1];
-			push @names, $name;
-			my @parts = split /href\=\"|\"\>|<\/a\>/, $lines[$i+3];
-			$parts[4] =~ s/ \(|\)://g;
-			push @groups, $parts[4];
-			push @repos, "";
-			push @urls, $base.$parts[2];
-			push @sizes, "";
-			push @dates, "";
-			@parts = split />|: /, $lines[$i+6];
-			push @versions, $parts[1];
-			$i += 11;
+		(my $name) = ($lines[$i] =~ /<h3>Package (.+)<\/h3>/);
+		if ($name) {
+			$i += 3;
+			# There are now one or more 8-line blocks that are approximately
+			# $lines[$i]   <li class="intrepid"><a class="resultlink" href="/intrepid/dpkg">intrepid</a> (base):
+			# $lines[$i+3] <br>1.14.20ubuntu6: amd64 i386
+			while ($lines[$i] =~ /class="resultlink"/) {
+				push @names, $name;
+				my @parts = split /href\=\"|\"\>|<\/a\>/, $lines[$i];
+				$parts[4] =~ s/ \(|\)://g;
+				push @groups, $parts[4];
+				push @repos, "";
+				push @urls, $base.$parts[2];
+				push @sizes, "";
+				push @dates, "";
+				@parts = split />|: /, $lines[$i+3];
+				push @versions, $parts[1];
+				$i += 8;
+			}
 		}
 	}
 	for (my $i = 0; $i < @repos; $i++) {
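
For anyone who wants to try the new loop outside whohas, here is a rough standalone sketch of the same idea: after a "Package" heading, keep consuming 8-line blocks for as long as the current line carries class="resultlink". The helper names parse_ubuntu_results and release_block, the base URL, the filler lines and the "hardy" version string are all made up for illustration. Only the two line shapes quoted in the patch comment come from the real page, and the sketch adds bounds checks that the patch itself does not have.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of whohas: returns one hash per
# (package, release) pair found in the already-split result page.
sub parse_ubuntu_results {
    my ($base, @lines) = @_;
    my @results;
    my $i = 0;
    while ($i < @lines) {
        my ($name) = $lines[$i] =~ /<h3>Package (.+)<\/h3>/;
        if (!defined $name) { $i++; next; }
        $i += 3;    # first release block starts three lines below the heading
        # Consume every 8-line release block, not just the first one.
        while ($i < @lines && $lines[$i] =~ /class="resultlink"/) {
            my @parts = split /href="|">|<\/a>/, $lines[$i];
            my $group = $parts[4] // '';
            $group =~ s/ \(|\)://g;            # " (base):" becomes "base"
            my @vparts = split />|: /, ($lines[$i + 3] // '');
            push @results, {
                name    => $name,
                group   => $group,
                url     => $base . ($parts[2] // ''),
                version => $vparts[1] // '',
            };
            $i += 8;
        }
    }
    return @results;
}

# Hypothetical sample builder: one 8-line block. Only the first and
# fourth lines follow the real layout; the rest are empty filler.
sub release_block {
    my ($release, $section, $version) = @_;
    return (
        qq{<li class="$release"><a class="resultlink" href="/$release/dpkg">$release</a> ($section):},
        '', '',
        "<br>$version: amd64 i386",
        '', '', '', '',
    );
}

my @sample = (
    '<h3>Package dpkg</h3>', '', '',
    release_block('hardy',    'base', '1.14.16-example'),   # invented version
    release_block('intrepid', 'base', '1.14.20ubuntu6'),
    '</ul>',
);

# The base URL here is an assumption; whohas sets its own $base.
for my $r (parse_ubuntu_results('http://packages.ubuntu.com', @sample)) {
    print join("\t", @$r{qw(name version group url)}), "\n";
}
```

With the old single "if", only the first block under each heading would have been recorded. The inner while loop is the whole change.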