A slightly different route to take, but one that should help you test and
refine a semantic parser, is Wikipedia. They make their entire corpus
available for download, or any subset you define. The whole thing is roughly
14 terabytes, but you can get much smaller sets.
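
In case it helps, here is a rough sketch of how you might stream pages out of one of those dumps to feed a parser. It assumes the dump is the usual MediaWiki XML export with <page>, <title> and <text> elements; the sample data and the namespace version in it are made up for illustration, and iter_pages is just a name I picked, not part of any library.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_stream):
    """Yield (title, text) pairs one page at a time from an XML export stream."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        # Release this page's children so memory use stays small on big dumps.
        elem.clear()


# Hypothetical two-page sample standing in for a real dump file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Apache Solr</title>
    <revision><text>Solr is a search platform.</text></revision>
  </page>
  <page>
    <title>Semantic parsing</title>
    <revision><text>Mapping text to meaning.</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
for title, text in pages:
    print(title, "->", text)
```

For a real download you would pass an open file instead of the sample, for example the object returned by bz2.open(path, "rb") if the dump is bz2-compressed. Streaming with iterparse is the point here, since a full dump is far too big to load in one go.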
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/getting-a-list-of-top-page-ranked-webpages-tp1515311p1516649.html
Sent from the Solr - User mailing list archive at Nabble.com.
