I'm running 64-bit Ubuntu 11.04, Nutch 1.2, Solr 3.3 (downloaded, not
built) and Tomcat 6, following this link (among others):
http://wiki.apache.org/nutch/RunningNutchAndSolr

I have added the Nutch schema and can access/view this schema via the
admin page. Nutch also works, as I can perform successful searches.

When I execute the following:

>> ./bin/nutch solrindex http://localhost:8080/solr/core0 crawl/crawldb crawl/linkdb crawl/segments/*

I (eventually) get an I/O error.

The above command creates the following files in
/var/lib/tomcat6/solr/core0/data/index/:

-------------------------------
544 -rw-r--r-- 1 tomcat6 tomcat6 557056 2011-07-13 11:09 _1.fdt
  0 -rw-r--r-- 1 tomcat6 tomcat6      0 2011-07-13 11:00 _1.fdx
  4 -rw-r--r-- 1 tomcat6 tomcat6     32 2011-07-13 10:59 segments_2
  4 -rw-r--r-- 1 tomcat6 tomcat6     20 2011-07-13 10:59 segments.gen
  0 -rw-r--r-- 1 tomcat6 tomcat6      0 2011-07-13 11:00 write.lock
-------------------------------

but hadoop.log reports the following error:

---------------------------
2011-07-13 11:09:47,665 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2011-07-13 11:09:47,666 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: content dest: content
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: site dest: site
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: title dest: title
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: host dest: host
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: segment dest: segment
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: boost dest: boost
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: digest dest: digest
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: tstamp dest: tstamp
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: url dest: id
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: url dest: url
2011-07-13 11:09:49,272 WARN  mapred.LocalJobRunner - job_local_0001
java.lang.RuntimeException: Invalid version or the data in not in 'javabin' format
        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
        at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:466)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
        at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:64)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
        at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:440)
        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:159)
        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-07-13 11:09:49,611 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
-----------------------------------------------------------------------------------------------------------------------------------------------
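In case it helps narrow this down, a minimal standalone SolrJ check along
the following lines might show whether the failure is specific to the
javabin (binary) response format that the stack trace above goes through.
This is only a sketch: the class name and output strings are made up, and
it assumes the SolrJ jars bundled with Nutch 1.2 are on the classpath and
that the core URL is the same one passed to solrindex above.

-------------------------------
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;

public class SolrJCheck {
    public static void main(String[] args) throws Exception {
        // Same core URL as in the solrindex command above.
        String url = "http://localhost:8080/solr/core0";

        // 1) Default response parser (javabin/binary) -- the same code path
        //    that fails in the stack trace above.
        CommonsHttpSolrServer binary = new CommonsHttpSolrServer(url);
        try {
            long hits = binary.query(new SolrQuery("*:*")).getResults().getNumFound();
            System.out.println("javabin query ok, numFound=" + hits);
        } catch (Exception e) {
            System.out.println("javabin query failed: " + e);
        }

        // 2) The same query, but parsing the response as XML instead.
        CommonsHttpSolrServer xml = new CommonsHttpSolrServer(url);
        xml.setParser(new XMLResponseParser());
        long hits = xml.query(new SolrQuery("*:*")).getResults().getNumFound();
        System.out.println("xml query ok, numFound=" + hits);
    }
}
-------------------------------

If the javabin query throws the same "Invalid version" error while the XML
one works, that would suggest a javabin version mismatch between the SolrJ
client shipped with Nutch 1.2 and the Solr 3.3 server, rather than a
problem with the crawl data itself.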

I'd appreciate any help with this.

Thanks,

Leo


