RE: Solr Replication: How to restore data from last snapshot
What happens if there are multiple cores? Thanks.

-----Original Message-----
From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble Paul
Sent: Friday, November 06, 2009 10:49 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Replication: How to restore data from last snapshot

If it is a single core, you will have to restart the master.

On Sat, Nov 7, 2009 at 1:55 AM, Osborn Chan wrote:
> Thanks. But I have the following use cases:
>
> 1) The master index is corrupted, but it hasn't replicated to the slave servers.
>    - In this case, I only need to restore the last snapshot.
> 2) The master index is corrupted, and it has replicated to the slave servers.
>    - In this case, I need to restore the last snapshot and make sure the
>      slave servers replicate the restored index from the index server as well.
>
> Assuming both cases are in a production environment and I cannot shut down the
> master and slave servers: is there a REST API call or anything else I can use
> without running Linux commands manually and restarting?
>
> Thanks,
>
> Osborn
>
> -----Original Message-----
> From: Matthew Runo [mailto:matthew.r...@gmail.com]
> Sent: Friday, November 06, 2009 12:20 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Replication: How to restore data from last snapshot
>
> If your master index is corrupt and it hasn't been replicated out, you
> should be able to shut down the server and remove the corrupted index
> files. Then copy the replicated index back onto the master and start
> everything back up.
>
> As far as I know, the indexes on the replicated slaves are exactly
> what you'd have on the master, so this method should work.
>
> --Matthew Runo
>
> On Fri, Nov 6, 2009 at 11:41 AM, Osborn Chan wrote:
>> Hi,
>>
>> I have set up the Solr ReplicationHandler for index replication to the slaves.
>> Does anyone know how to restore a corrupted index from a snapshot on the master,
>> and force replication of the restored index to the slaves?
>>
>> Thanks,
>>
>> Osborn
>>
--
Noble Paul | Principal Engineer | AOL | http://aol.com
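For case 2, once the master's index has been restored, the slaves do not have to wait for the next poll: the Solr 1.4 ReplicationHandler accepts a fetchindex command that tells a slave to pull from the master immediately (the master side, as Noble notes, still requires a restart for a single core). A minimal SolrJ sketch, assuming a hypothetical slave core URL and the standard /replication handler path from solrconfig.xml:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.QueryRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class ForceSnapPull {
        public static void main(String[] args) throws Exception {
            // Hypothetical slave core URL -- replace with each real slave core.
            SolrServer slave = new CommonsHttpSolrServer("http://slave1:8080/multicore/core0");

            // Ask the slave's ReplicationHandler to pull the restored index
            // from the master right away instead of waiting for the next poll.
            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("command", "fetchindex");

            QueryRequest req = new QueryRequest(params);
            req.setPath("/replication");   // the replication handler registered in solrconfig.xml
            req.process(slave);            // equivalent to GET .../replication?command=fetchindex
        }
    }

The same request can also be issued with plain HTTP against each slave core as .../replication?command=fetchindex.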
EOF IOException Query
Hi all,

I got the following exception from Solr, but the index is still searchable (at least it is searchable for the query "*:*"). I am just wondering what the root cause is.

Thanks,

Osborn

INFO: [publicGalleryPostMaster] webapp=/multicore path=/select params={wt=javabin&rows=12&start=0&sort=/gallery/1/postlist/1Rank_i+desc&q=%2B(communityList_s_m:/gallery/1/postlist/1)+%2Bstate_s:A&version=1} status=500 QTime=3
Jan 11, 2010 12:23:01 PM org.apache.solr.common.SolrException log
SEVERE: java.io.IOException: read past EOF
	at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:151)
	at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
	at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
	at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:112)
	at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:712)
	at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:208)
	at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:676)
	at org.apache.lucene.search.FieldComparator$StringOrdValComparator.setNextReader(FieldComparator.java:667)
	at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:94)
	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:245)
	at org.apache.lucene.search.Searcher.search(Searcher.java:171)
	at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
	at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
	at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
	at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
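One way to narrow down the root cause of a read-past-EOF like this (it is thrown while the FieldCache is being built for the sort field) is to run Lucene's CheckIndex against the core's index directory and see whether the segment files themselves are damaged. A minimal sketch, assuming Lucene 2.9 (the version shipped with Solr 1.4) and a hypothetical index path:

    import java.io.File;
    import org.apache.lucene.index.CheckIndex;
    import org.apache.lucene.store.FSDirectory;

    public class VerifyIndex {
        public static void main(String[] args) throws Exception {
            // Hypothetical path -- point this at the core's data/index directory.
            FSDirectory dir = FSDirectory.open(new File("/slaveIndexData/publicGalleryPostMaster/index"));
            CheckIndex checker = new CheckIndex(dir);
            checker.setInfoStream(System.out);           // print per-segment diagnostics
            CheckIndex.Status status = checker.checkIndex();
            System.out.println(status.clean ? "Index is clean" : "Index has problems");
            dir.close();
        }
    }

If CheckIndex reports the index as clean, the EOF is more likely a transient read against files that were being replaced; if it reports broken segments, the index really is corrupted.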
Index Corruption after replication by new Solr 1.4 Replication
Hi all,

I recently migrated to the new Solr 1.4 replication feature with multicore support, from Solr 1.2 with NFS mounting. The following exceptions appear in catalina.log from time to time, and there are some EOF exceptions which lead me to believe the slave index files are corrupted after replication from the index server. I have the following configuration with Solr 1.4; please correct me if it is configured incorrectly.

(The index files are not corrupted on the master server, but they are corrupted on the slave servers. Usually only one of the slave servers is corrupted with an EOF exception, not all of them.)

1 Master Server (Index Server):
- 8 indexes with a multicore configuration.
- All indexes are configured to "replicateAfter" optimize only.
- The index sizes vary: the smallest index is only 2.5 MB, the biggest is ~100 MB.
- Optimize calls are infrequent (an optimize call every ~30 minutes to 6 hours, depending on the index).
- There are many commit calls to all indexes (but no concurrent commit and optimize calls for any index).
- "commitReserveDuration" is not configured in the ReplicationHandler - using default values.

4 Slave Servers (Search Servers):
- 8 indexes with a multicore configuration.
- All indexes are configured to poll every ~15 minutes.
- All update handler configuration is removed in solrconfig-slave.xml (solrconfig.xml) in order to prevent add/commit/optimize calls.
- (The search slave servers are only responsible for search operations.)
- <updateHandler ...> removed.
- <requestHandler name="/update" ...> removed.
- <requestHandler ... class="solr.BinaryUpdateRequestHandler" /> removed.

A) FileNotFoundException

INFO: Total time taken for download : 1 secs
Jan 15, 2010 10:34:16 AM org.apache.solr.handler.ReplicationHandler doFetch
SEVERE: SnapPull failed
org.apache.solr.common.SolrException: Index fetch failed :
	at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
	at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
	at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
	at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
	at java.lang.Thread.run(Thread.java:595)
Caused by: java.io.FileNotFoundException: File does not exist /slaveIndexData/publicGalleryTagDef/index.20100115103415/_al.fdx
	at org.apache.solr.common.util.FileUtils.sync(FileUtils.java:55)
	at org.apache.solr.handler.SnapPuller$FileFetcher$1.run(SnapPuller.java:911)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
	at java.util.concurrent.FutureTask.run(FutureTask.java:123)
	... 3 more
Jan 15, 2010 10:34:17 AM org.apache.solr.core.SolrCore execute
INFO: [publicGalleryPostMaster] webapp=/multicore path=/select params={wt=javabin&rows=10&start=0&sort=createTime_dt+desc&q=%2B(profileId_s:/community/sfly/publicprofile/0AcM27Nw3aNWLi4)+%2Bstate_s:A&version=1} hits=1 status=0 QTime=1

B) LockReleaseFailedException

SEVERE: SnapPull failed
org.apache.solr.common.SolrException: Index fetch failed :
	at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
	at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
	at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
	at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
	at java.util.concurrent.ThreadPoolExecutor$W
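Since usually only one slave at a time ends up corrupted, it can also help to compare each slave's index version and generation with the master's right after a failed SnapPull. The ReplicationHandler's details command reports this; the sketch below is only an illustration, with a hypothetical slave URL, and it simply dumps the raw response rather than assuming a particular response structure:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.QueryRequest;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class ReplicationStatus {
        public static void main(String[] args) throws Exception {
            // Hypothetical slave core -- repeat for each of the 4 slaves / 8 cores.
            SolrServer slave = new CommonsHttpSolrServer("http://slave1:8080/multicore/publicGalleryTagDef");

            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("command", "details");   // same as .../replication?command=details

            QueryRequest req = new QueryRequest(params);
            req.setPath("/replication");
            QueryResponse rsp = req.process(slave);

            // Dump the whole response; the "details" section includes the slave's
            // index version/generation and the master it is replicating from.
            System.out.println(rsp.getResponse());
        }
    }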
RE: Index Corruption after replication by new Solr 1.4 Replication
Hi Otis,

Thanks. There is no NFS anymore; all index files are local. We migrated to the new Solr 1.4 replication precisely in order to avoid all the NFS Stale exceptions.

Thanks,

Osborn

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Friday, January 15, 2010 12:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Index Corruption after replication by new Solr 1.4 Replication

This is not a direct answer to your question, but can you avoid NFS? My first guess would be that NFS somehow causes this problem. If you check the ML archives for: NFS lock, you will see what I mean.

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

----- Original Message -----
> From: Osborn Chan
> To: "solr-user@lucene.apache.org"
> Sent: Fri, January 15, 2010 3:23:21 PM
> Subject: Index Corruption after replication by new Solr 1.4 Replication
>
> [...]
RE: Index Corruption after replication by new Solr 1.4 Replication
Hi All,

I found out there is a file corruption issue when using both EmbeddedSolrServer and Solr 1.4 Java-based replication together on a slave server. On my slave server, I have 2 webapps in a single Tomcat instance:

1) A "multicore" webapp with the slave configuration.
2) A custom webapp that uses EmbeddedSolrServer to query the Solr index data.

Both webapps were set up according to the instructions on the Solr wiki. However, there is a multi-threading issue which causes index file corruption. The following is the root cause:

EmbeddedSolrServer requires a CoreContainer object as a parameter. During the creation of the CoreContainer object, the process loads the slave Solr configuration, which silently creates an extra ReplicationHandler (SnapPuller) in the background. But a ReplicationHandler (SnapPuller) has already been created by the "multicore" webapp because of the same slave configuration. As a result, two threads replicate files at the same time, which corrupts the index with various IOExceptions.

After I replaced the use of EmbeddedSolrServer with CommonsHttpSolrServer (and stopped creating a CoreContainer object on the slave server), Solr 1.4 Java-based replication worked perfectly, without any file corruption.

In order to use EmbeddedSolrServer on a slave server, I think we need a way to create a CoreContainer object from the slave configuration without creating an extra thread to replicate files. Should I file a bug?

Thanks,

Osborn

-----Original Message-----
From: Osborn Chan [mailto:oc...@shutterfly.com]
Sent: Friday, January 15, 2010 12:35 PM
To: solr-user@lucene.apache.org
Subject: RE: Index Corruption after replication by new Solr 1.4 Replication

[...]
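For reference, the change described above is roughly the following: stop initializing a CoreContainer (which loads the slave solrconfig.xml and starts its own SnapPuller) inside the custom webapp, and query the already-running slave core over HTTP instead. This is a sketch only; the core name and URL are placeholders, not the actual setup:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    // import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
    // import org.apache.solr.core.CoreContainer;

    public class SlaveClientFactory {

        // Problematic on a slave: initializing a CoreContainer loads the slave
        // solrconfig.xml and silently starts a second ReplicationHandler/SnapPuller,
        // so two threads end up pulling index files at the same time.
        //
        // CoreContainer container = new CoreContainer.Initializer().initialize();
        // SolrServer solr = new EmbeddedSolrServer(container, "publicGalleryPostMaster");

        // Safer: query the slave core that the "multicore" webapp already serves.
        public static SolrServer createClient() throws Exception {
            // Hypothetical URL for the slave core running in the same Tomcat instance.
            return new CommonsHttpSolrServer("http://localhost:8080/multicore/publicGalleryPostMaster");
        }
    }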