Hi list,

I am using the Nutch-1.3 branch, which I checked out today, to crawl a couple of
URLs in local mode. I have been using Solr 1.4.1 within my web app, but I am
running into some problems during the indexing stages. Three commands are sent
to Solr:
echo "----- SolrIndex (Step 4 of $steps) -----"
$NUTCH_HOME/bin/nutch solrindex http://localhost:8080/wombra/data crawl/crawldb crawl/linkdb crawl/segments/*

echo "----- SolrDedup (Step 5 of $steps) -----"
$NUTCH_HOME/bin/nutch solrdedup http://localhost:8080/wombra/data

echo "----- SolrClean (Step 6 of $steps) -----"
$NUTCH_HOME/bin/nutch solrclean crawl/crawldb http://localhost:8080/wombra/data
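
To make a failing step easier to spot, the three steps above could be run through a small wrapper. This is only a sketch: run_step is a hypothetical helper, not part of Nutch, and it assumes bin/nutch returns a non-zero exit status when a job fails, which I have not verified.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Nutch): print a step banner, run the
# given command, and report a non-zero exit status on stderr.
run_step() {
  label=$1
  shift
  echo "----- $label -----"
  "$@"
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "$label failed with exit status $status" >&2
  fi
  return "$status"
}
```

A call would then look like: run_step "SolrDedup (Step 5 of $steps)" "$NUTCH_HOME/bin/nutch" solrdedup http://localhost:8080/wombra/data || exit 1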

The solrindex command is failing with SolrException: Not Found.
solrdedup appears to be working fine, and the same could be said for solrclean.
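
Since the failing request in the log is sent to the base URL with /update appended, one check that may help is to ask the container directly what HTTP status that path returns. The sketch below assumes curl is installed; solr_update_url and probe_update_handler are hypothetical helper names used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build the update URL from a base Solr URL,
# dropping one trailing slash if present.
solr_update_url() {
  base=${1%/}
  printf '%s/update\n' "$base"
}

# Hypothetical helper: request the update URL and print only the
# HTTP status code (requires curl and a running server).
probe_update_handler() {
  curl -s -o /dev/null -w '%{http_code}\n' "$(solr_update_url "$1")"
}
```

Running probe_update_handler http://localhost:8080/wombra/data and getting 404 back would match the Not Found in the log; any other status would suggest the path is at least mapped to something.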

I have been monitoring threads on the Nutch list, but thought I would have a
crack at the Solr list for any suggestions on how I can solve the errors I am
seeing in my log output.

Thank you

Lewis

Here is my hadoop.log output:

2011-04-18 11:27:05,480 INFO  solr.SolrIndexer - SolrIndexer: starting at 2011-04-18 11:27:05
2011-04-18 11:27:05,562 INFO  indexer.IndexerMapReduce - IndexerMapReduce: crawldb: crawl/crawldb
2011-04-18 11:27:05,562 INFO  indexer.IndexerMapReduce - IndexerMapReduce: linkdb: crawl/linkdb
2011-04-18 11:27:05,562 INFO  indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20110418111549
2011-04-18 11:27:05,656 INFO  indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20110418111603
...
some more
...
2011-04-18 11:27:09,966 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2011-04-18 11:27:09,966 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: content dest: content
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: site dest: site
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: title dest: title
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: host dest: host
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: segment dest: segment
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: boost dest: boost
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: digest dest: digest
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: tstamp dest: tstamp
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: url dest: id
2011-04-18 11:27:10,021 INFO  solr.SolrMappingReader - source: url dest: url
2011-04-18 11:27:10,394 WARN  mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Not Found

Not Found

request: http://localhost:8080/wombra/data/update?wt=javabin&version=1
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
        at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:75)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-04-18 11:27:11,033 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
2011-04-18 11:27:11,869 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2011-04-18 11:27:11
2011-04-18 11:27:11,870 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8080/wombra/data
2011-04-18 11:27:13,048 INFO  solr.SolrClean - SolrClean: starting at 2011-04-18 11:27:13
2011-04-18 11:27:13,888 INFO  solr.SolrClean - SolrClean: deleting 5 documents
2011-04-18 11:27:13,992 WARN  mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Not Found

Not Found

request: http://localhost:8080/wombra/data/update?wt=javabin&version=1
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
        at org.apache.nutch.indexer.solr.SolrClean$SolrDeleter.close(SolrClean.java:115)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:473)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
