This was supposed to be a question:

> And, most popular in the world, per dominant culture in
> each country, per religious majority, per language culture . . .

Dennis Gearon

Signature Warning
----------------
EARTH has a Right To Life, otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php

--- On Thu, 9/16/10, Dennis Gearon <gear...@sbcglobal.net> wrote:

> From: Dennis Gearon <gear...@sbcglobal.net>
> Subject: Re: getting a list of top page-ranked webpages
> To: solr-user@lucene.apache.org, i...@upright.net
> Date: Thursday, September 16, 2010, 11:28 PM
>
> There's a great web page somewhere that shows the popularity
> as the subway map of Tokyo.
>
> Dennis Gearon
>
> --- On Thu, 9/16/10, Ian Upright <i...@upright.net> wrote:
>
> > From: Ian Upright <i...@upright.net>
> > Subject: getting a list of top page-ranked webpages
> > To: solr-user@lucene.apache.org
> > Date: Thursday, September 16, 2010, 2:44 PM
> >
> > Hi, this question is a little off topic, but I thought since so many
> > people on this list are probably experts in this field, someone may know.
> >
> > I'm experimenting with my own semantic-based search engine, but I want
> > to test it with a large corpus of web pages. Ideally I would like to
> > have a list of the top 10M or top 100M page-ranked URLs in the world.
> >
> > Short of using Nutch to crawl the entire web and build this page-rank,
> > are there any other ways? What other ways or resources might be
> > available for me to get this (smaller) corpus of top webpages?
> >
> > Thanks, Ian