Greetings all,
We are using Solr to index MARC records to create a better, more
user-friendly library catalog here at the University of Virginia. To do
this I have written a program starting from the VuFind Java importer
written by Wayne Graham (from the College of William & Mary). After
I made extensive changes to accommodate our different indexing scheme,
the program was working great.
Subsequently I upgraded to Lucene version 2.3.1 to get the faster
indexing performance, and although it is faster, it also seems somewhat
unreliable. Half a dozen times I have completely wiped out the
existing index and started afresh, building a new index. Each time, at
some point in the indexing run, I receive this error message:
Exception in thread "Thread-10" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3099)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
This occurs a dozen or so times within the set of MARC records it is
processing, and when the next segment starts processing I receive this
message:
Loading properties from import.properties
Apr 16, 2008 7:03:51 AM org.apache.solr.core.Config setInstanceDir
INFO: Solr home set to '/usr/local/projects/solr/'
Apr 16, 2008 7:03:51 AM org.apache.solr.core.SolrConfig initConfig
INFO: Loaded SolrConfig: solrconfig.xml
Apr 16, 2008 7:03:52 AM org.apache.solr.core.SolrCore <init>
INFO: Opening new SolrCore at /usr/local/projects/solr/, dataDir=/usr/local/projects/solr/data
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: Reading Solr Schema
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: Schema name=solr_int
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: default search field is text
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: query parser default operator is AND
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: unique key field: id
Apr 16, 2008 7:03:52 AM org.apache.solr.core.SolrCore parseListener
INFO: Searching for listeners: //[EMAIL PROTECTED]"firstSearcher"]
Apr 16, 2008 7:03:52 AM org.apache.solr.core.SolrCore parseListener
INFO: Searching for listeners: //[EMAIL PROTECTED]"newSearcher"]
Apr 16, 2008 7:03:52 AM org.apache.solr.core.SolrCore initWriters
INFO: adding queryResponseWriter xslt=org.apache.solr.request.XSLTResponseWriter
Apr 16, 2008 7:03:52 AM org.apache.solr.request.XSLTResponseWriter init
INFO: xsltCacheLifetimeSeconds=5
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: standard=solr.StandardRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: partitioned=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: dismax=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: catalog=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: music=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: semester_at_sea=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding lazy requestHandler: spellchecker=solr.SpellCheckerRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: /update=solr.XmlUpdateRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding lazy requestHandler: /update/csv=solr.CSVRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: /admin/luke=org.apache.solr.handler.admin.LukeRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: /admin/system=org.apache.solr.handler.admin.SystemInfoHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: /admin/plugins=org.apache.solr.handler.admin.PluginInfoHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: /admin/threads=org.apache.solr.handler.admin.ThreadDumpHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: /admin/properties=org.apache.solr.handler.admin.PropertiesRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: /debug/dump=solr.DumpRequestHandler
Exception in thread "main" java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:433)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:216)
    at MarcImporter.<init>(Unknown Source)
    at MarcImporter.main(Unknown Source)
Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197)
    at org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:55)
    at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
    at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:87)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:424)
    ... 3 more
Does anybody have suggestions on how to track this problem down?
Thanks in advance,
Robert Haschart