I note that there is a full download option available; that might be easier
than crawling.
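
If you go the download route, something along these lines might do as a
starting point. It is only a sketch under several assumptions: the download
unpacks to a directory of .htm/.html files, your schema has "id", "title" and
"text" fields, and your Solr version accepts a JSON array of documents on its
update handler (older releases may want a different body or the XML format).
The class and function names are made up for illustration, and the update URL
is whatever your own installation uses.

```python
import json
import os
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the <title> and the visible text of one HTML page."""

    SKIP = {"script", "style"}  # content of these tags is not indexed

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def html_to_doc(doc_id, html):
    """Turn one HTML page into a dict using the assumed field names."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {
        "id": doc_id,
        "title": " ".join("".join(parser.title_parts).split()),
        "text": " ".join(parser.text_parts),
    }


def iter_docs(root):
    """Walk the unpacked download and yield one document per HTML file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith((".htm", ".html")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                html = fh.read()
            # Use the path relative to the root as the unique key.
            doc_id = os.path.relpath(path, root).replace(os.sep, "/")
            yield html_to_doc(doc_id, html)


def post_docs(docs, update_url):
    """POST a batch of documents as a JSON array to the given update URL."""
    body = json.dumps(list(docs)).encode("utf-8")
    request = urllib.request.Request(
        update_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

You would then call post_docs(iter_docs("/path/to/unpacked/download"), url)
with the update URL of your own Solr instance, probably in smaller batches,
and commit afterwards.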

François

On Sep 4, 2011, at 9:56 AM, Markus Jelsma wrote:

> Hi,
> 
> Solr is a search engine, not a crawler. You can use Apache Nutch to crawl your
> site and have it indexed in Solr.
> 
> Cheers,
> 
>> Hi,
>> 
>> I am new to Solr/Lucene, and have some problems trying to figure out the
>> best way to perform indexing. I think I understand the general principles,
>> but have some trouble translating this to my specific goal, which is the
>> following:
>> 
>> I want to use Solr as a search engine based on general (English) keywords,
>> that has indexed Wikipedia for Schools
>> (http://www.soschildrensvillages.org.uk/charity-news/archive/2008/10/2008-wikipedia-for-schools).
>> 
>> I initially thought that it would be sufficient to add the root document
>> (index.html) to Solr, after which everything would be automagically
>> indexed, but this does not seem to work. I have also tried to use
>> URLDataSource in data-config.xml, but there I get a bit confused by the
>> settings.
>> 
>> Could anyone help me understand how I can achieve my goal?
>> 
>> Thanks
>> 
>> Kees
