Re: wikipedia and teaching kids search engines

Andrzej Bialecki Wed, 24 Mar 2010 10:53:54 -0700

On 2010-03-24 16:15, Markus Jelsma wrote:

A bit off-topic, but how about having Nutch grab some content and index it
in Solr?

The problem is not with collecting and submitting the documents, the problem is with parsing the Wikimedia markup embedded in XML. WikipediaTokenizer from Lucene contrib/ is a quick and perhaps acceptable solution ...
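
To illustrate the two layers involved (the XML envelope and the wiki markup inside it), here is a rough Python sketch. It is not WikipediaTokenizer and not a real Wikimedia parser: the sample document, the helper names `strip_markup` and `extract_pages`, and the two regexes are assumptions made for illustration. It only handles `[[links]]` and bold/italic quotes, and the sample omits the XML namespace that real dumps carry.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified export; real dumps declare an XML
# namespace on <mediawiki> and contain far richer markup.
SAMPLE = """<mediawiki>
  <page>
    <title>Search engine</title>
    <revision>
      <text>A '''search engine''' indexes [[World Wide Web|web]] pages. See [[Apache Solr]].</text>
    </revision>
  </page>
</mediawiki>"""


def strip_markup(wikitext):
    """Crudely reduce wiki markup to plain text (links and quotes only)."""
    # [[target|label]] -> label, [[target]] -> target
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", wikitext)
    # Drop the runs of apostrophes used for bold and italic.
    text = re.sub(r"'{2,}", "", text)
    return text


def extract_pages(xml_string):
    """Return (title, plain_text) pairs for every <page> element."""
    root = ET.fromstring(xml_string)
    pages = []
    for page in root.iter("page"):
        title = page.findtext("title")
        body = page.findtext("revision/text") or ""
        pages.append((title, strip_markup(body)))
    return pages


if __name__ == "__main__":
    for title, text in extract_pages(SAMPLE):
        print(title, "->", text)
```

Templates, tables, references and nested constructs are left untouched by this sketch, which is exactly where the hard part of the parsing lies.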


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
