Hi Otis,

Thanks. There is no NFS anymore; all index files are local. We migrated to the new Solr 1.4 Replication in order to avoid all the NFS stale exceptions.

Thanks,
Osborn

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Friday, January 15, 2010 12:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Index Courruption after replication by new Solr 1.4 Replication

This is not a direct answer to your question, but can you avoid NFS? My first guess would be that NFS somehow causes this problem. If you check the ML archives for: NFS lock, you will see what I mean.

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

----- Original Message ----
> From: Osborn Chan <oc...@shutterfly.com>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Fri, January 15, 2010 3:23:21 PM
> Subject: Index Courruption after replication by new Solr 1.4 Replication
>
> Hi all,
>
> I recently migrated from Solr 1.2 with NFS mounting to the new Solr 1.4
> Replication feature with multicore support. The following exceptions appear
> in catalina.log from time to time, including some EOF exceptions, which lead
> me to believe the slave index files are corrupted after replication from the
> index server.
> I have the following configuration with Solr 1.4; please correct me if it is
> configured incorrectly.
>
> (The index files are not corrupted on the master server, but they are
> corrupted on the slave servers. Usually only one of the slave servers is
> corrupted with the EOF exception, not all of them.)
>
> 1 Master Server: (Index Server)
>     - 8 indexes with multicore configuration.
>     - All indexes are configured to "replicateAfter" optimize only.
>     - Index sizes vary. The smallest index is only 2.5 MB. The biggest index
>       is ~100 MB.
>     - There are infrequent optimize calls to the indexes (an optimize call
>       every ~30 mins to 6 hours, depending on the index).
>     - There are many commit calls to all indexes. (But there is no concurrent
>       commit and optimize for all indexes.)
>     - "commitReserveDuration" is not configured in ReplicationHandler; it
>       uses the default value.
>
> 4 Slave Servers (Search Server)
>     - 8 indexes with multicore configuration.
>     - All indexes are configured to poll every ~15 minutes.
>     - All update handler configuration is removed in solrconfig-slave.xml
>       (solrconfig.xml) in order to prevent add/commit/optimize calls.
>     - (Search Slave Servers are only responsible for search operation.)
>     - removed.
>     - removed.
>     - class="solr.BinaryUpdateRequestHandler" /> removed.
>
> A) FileNotFoundException
>
> INFO: Total time taken for download : 1 secs
> Jan 15, 2010 10:34:16 AM org.apache.solr.handler.ReplicationHandler doFetch
> SEVERE: SnapPull failed
> org.apache.solr.common.SolrException: Index fetch failed :
>     at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
>     at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
>     at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
>     at java.lang.Thread.run(Thread.java:595)
> Caused by: java.io.FileNotFoundException: File does not exist /slaveIndexData/publicGalleryTagDef/index.20100115103415/_al.fdx
>     at org.apache.solr.common.util.FileUtils.sync(FileUtils.java:55)
>     at org.apache.solr.handler.SnapPuller$FileFetcher$1.run(SnapPuller.java:911)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:123)
>     ... 3 more
> Jan 15, 2010 10:34:17 AM org.apache.solr.core.SolrCore execute
> INFO: [publicGalleryPostMaster] webapp=/multicore path=/select params={wt=javabin&rows=10&start=0&sort=createTime_dt+desc&q=%2B(profileId_s:/community/sfly/publicprofile/0AcM27Nw3aNWLi4)+%2Bstate_s:A&version=1} hits=1 status=0 QTime=1
>
> B) LockReleaseFailedException
>
> SEVERE: SnapPull failed
> org.apache.solr.common.SolrException: Index fetch failed :
>     at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
>     at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
>     at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
>     at java.lang.Thread.run(Thread.java:595)
> Caused by: org.apache.lucene.store.LockReleaseFailedException: failed to delete /slaveIndexData/publicGalleryTagDefAggregate/index/lucene-fb30bdbbdc6927666873dd616884ba29-write.lock
>     at org.apache.lucene.store.NativeFSLock.release(NativeFSLockFactory.java:298)
>     at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:2225)
>     at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2153)
>     at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2117)
>     at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:229)
>     at org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:181)
>     at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:409)
>     at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:467)
>     at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:319)
>     ... 11 more
> Jan 15, 2010 12:21:18 AM org.apache.solr.handler.SnapPuller fetchLatestIndex
> INFO: Slave in sync with master.
>
> C) EOF Exception
>
> INFO: [publicGalleryPostMaster] webapp=/multicore path=/select params={wt=javabin&rows=1&start=0&sort=createTime_dt+desc&q=%2B(profileId_s:/community/sfly/publicprofile/0AbOWLNszaOWTiw)+%2B(lastBookmarked_dt:[2010-01-08T08:49:38.271Z+TO+2010-01-15T08:49:38.271Z]+lastCommented_dt:[2010-01-08T08:49:38.271Z+TO+2010-01-15T08:49:38.271Z])+%2Bstate_s:A&version=1} hits=0 status=0 QTime=2
> Jan 15, 2010 12:49:42 AM org.apache.solr.common.SolrException log
> SEVERE: java.io.IOException: read past EOF
>     at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:151)
>     at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>     at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
>     at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:64)
>     at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:129)
>     at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:160)
>     at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
>     at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
>     at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:975)
>     at org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:627)
>     at org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:308)
>     at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java:147)
>     at org.apache.lucene.search.Similarity.idfExplain(Similarity.java:833)
>     at org.apache.lucene.search.PhraseQuery$PhraseWeight.<init>(PhraseQuery.java:122)
>     at org.apache.lucene.search.PhraseQuery.createWeight(PhraseQuery.java:250)
>     at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:184)
>     at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:415)
>     at org.apache.lucene.search.Query.weight(Query.java:99)
>     at org.apache.lucene.search.Searcher.createWeight(Searcher.java:230)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:171)
>     at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
>     at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
>     at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
>     at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
>     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:541)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
>     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
>     at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
>     at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
>     at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
>     at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
>     at java.lang.Thread.run(Thread.java:595)
>
> Thanks a lot!
>
> Osborn
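
For anyone following the thread, the master and slave settings described in the quoted message would look roughly like this in each core's solrconfig.xml. This is only a sketch, not the poster's actual files: the host name, port, and core name in masterUrl are hypothetical placeholders (the /multicore webapp path is taken from the logs), and commitReserveDuration is written out with its documented default of 00:00:10 although the original configuration leaves it unset.

```xml
<!-- Master (index server) core: replicate only after an optimize. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">optimize</str>
    <!-- Unset in the original setup; 00:00:10 is the default value. -->
    <str name="commitReserveDuration">00:00:10</str>
  </lst>
</requestHandler>

<!-- Slave (search server) core, in its own solrconfig.xml: poll every 15 minutes. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- Hypothetical host, port, and core name; substitute the real ones. -->
    <str name="masterUrl">http://master-host:8080/multicore/core0/replication</str>
    <str name="pollInterval">00:15:00</str>
  </lst>
</requestHandler>
```

The two handlers belong in different files, one on the master and one on each slave; they are shown together only for comparison. Both durations use the HH:mm:ss format.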