Mike,

You are right it does sound exactly like that situation. The java version is:

java version "1.6.0_05"
Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
Java HotSpot(TM) Server VM (build 10.0-b19, mixed mode)

Which seems to be the same as the one giving the other poster problems.

I tried running with the -Xbatch flag on for the indexing runs, but after 23 chunks of Marc records, the exact same problem occurred.

I'm going to try downgrading java versions to see whether that helps.

-Robert Haschart


Michael McCandless wrote:


Which exact version of the JRE are you using? Can you try running java with -Xbatch (forces up front compilation)?

Your situation sounds very similar to this one:

  http://lucene.markmail.org/message/awkkunr7j24nh4qj

Mike

On Apr 17, 2008, at 10:57 AM, Robert Haschart wrote:

Greetings all,

We are using Solr to index Marc records to create a better, more user friendly library catalog here at the University of Virginia. To do this I have written a program starting from the VuFind Java importer written by Wayne Graham (from the College of William & Mary). After making extensive changes to accomodate our different indexing scheme the program was working great. Subsequently I upgraded to lucene version 2.3.1 to realize the faster indexing performance, and although it is faster, is also seems somewhat unreliable. A half a dozen times I have completely wiped out the existing index and started afresh, building an new index. Each time at some point in the indexing run I receive the Error message:


Exception in thread "Thread-10" org.apache.lucene.index.MergePolicy $MergeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000 at org.apache.lucene.index.ConcurrentMergeScheduler $MergeThread.run(ConcurrentMergeScheduler.java:271) Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000 at org.apache.lucene.index.SegmentReader.initialize (SegmentReader.java:313) at org.apache.lucene.index.SegmentReader.get (SegmentReader.java:262) at org.apache.lucene.index.SegmentReader.get (SegmentReader.java:221) at org.apache.lucene.index.IndexWriter.mergeMiddle (IndexWriter.java:3099) at org.apache.lucene.index.IndexWriter.merge (IndexWriter.java:2834) at org.apache.lucene.index.ConcurrentMergeScheduler $MergeThread.run(ConcurrentMergeScheduler.java:240)


This occurs a dozen or so times within the set of Marc records it is processing, and when the next segment starts processing I receive this message:


Loading properties from import.properties
Apr 16, 2008 7:03:51 AM org.apache.solr.core.Config setInstanceDir
INFO: Solr home set to '/usr/local/projects/solr/'
Apr 16, 2008 7:03:51 AM org.apache.solr.core.SolrConfig initConfig
INFO: Loaded SolrConfig: solrconfig.xml
Apr 16, 2008 7:03:52 AM org.apache.solr.core.SolrCore <init>
INFO: Opening new SolrCore at /usr/local/projects/solr/, dataDir=/ usr/local/projects/solr/data
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: Reading Solr Schema
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: Schema name=solr_int
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: default search field is text
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: query parser default operator is AND
Apr 16, 2008 7:03:52 AM org.apache.solr.schema.IndexSchema readConfig
INFO: unique key field: id
Apr 16, 2008 7:03:52 AM org.apache.solr.core.SolrCore parseListener
INFO: Searching for listeners: //[EMAIL PROTECTED]"firstSearcher"]
Apr 16, 2008 7:03:52 AM org.apache.solr.core.SolrCore parseListener
INFO: Searching for listeners: //[EMAIL PROTECTED]"newSearcher"]
Apr 16, 2008 7:03:52 AM org.apache.solr.core.SolrCore initWriters
INFO: adding queryResponseWriter xslt=org.apache.solr.request.XSLTResponseWriter
Apr 16, 2008 7:03:52 AM org.apache.solr.request.XSLTResponseWriter  init
INFO: xsltCacheLifetimeSeconds=5
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: standard=solr.StandardRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: partitioned=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: dismax=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: catalog=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: music=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: semester_at_sea=solr.DisMaxRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig INFO: adding lazy requestHandler: spellchecker=solr.SpellCheckerRequestHandler Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: /update=solr.XmlUpdateRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding lazy requestHandler: /update/csv=solr.CSVRequestHandler
Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig INFO: adding requestHandler: /admin/ luke=org.apache.solr.handler.admin.LukeRequestHandler Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig INFO: adding requestHandler: /admin/ system=org.apache.solr.handler.admin.SystemInfoHandler Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig INFO: adding requestHandler: /admin/ plugins=org.apache.solr.handler.admin.PluginInfoHandler Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig INFO: adding requestHandler: /admin/ threads=org.apache.solr.handler.admin.ThreadDumpHandler Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig INFO: adding requestHandler: /admin/ properties=org.apache.solr.handler.admin.PropertiesRequestHandler Apr 16, 2008 7:03:52 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding requestHandler: /debug/dump=solr.DumpRequestHandler
Exception in thread "main" java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment
_3gh: fieldsReader shows 15999 but segmentInfo shows 16000
       at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:433)
       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:216)
       at MarcImporter.<init>(Unknown Source)
       at MarcImporter.main(Unknown Source)
Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentIn
fo shows 16000
at org.apache.lucene.index.SegmentReader.initialize (SegmentReader.java:313) at org.apache.lucene.index.SegmentReader.get (SegmentReader.java:262) at org.apache.lucene.index.SegmentReader.get (SegmentReader.java:197) at org.apache.lucene.index.MultiSegmentReader.<init> (MultiSegmentReader.java:55) at org.apache.lucene.index.DirectoryIndexReader$1.doBody (DirectoryIndexReader.java:75) at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run (SegmentInfos.java:636) at org.apache.lucene.index.DirectoryIndexReader.open (DirectoryIndexReader.java:63) at org.apache.lucene.index.IndexReader.open(IndexReader.java: 209) at org.apache.lucene.index.IndexReader.open(IndexReader.java: 173) at org.apache.solr.search.SolrIndexSearcher.<init> (SolrIndexSearcher.java:87)
       at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:424)
       ... 3 more

Does anybody have suggestions as to how to track this problem down?

Thanks in Advance,

Robert Haschart





Reply via email to