Hi Erick,

Thank you very much for your reply. I disabled all client-side commits and configured commits in solrconfig.xml instead.
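To be concrete about "no client commit": the indexing client now only adds documents and never calls commit. Roughly along these lines (a minimal sketch, not my exact code; the Tweet type, collection name and batch size are illustrative):

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class TweetIndexer {

    // Hypothetical stand-in for whatever the Flume sink hands over.
    interface Tweet {
        String getId();
        String getText();
    }

    // Adds documents in batches and never commits; visibility is left
    // entirely to the autoCommit/autoSoftCommit settings below.
    static void index(CloudSolrServer server, Iterable<Tweet> tweets) throws Exception {
        server.setDefaultCollection("tweets"); // illustrative collection name
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (Tweet tweet : tweets) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", tweet.getId());
            doc.addField("text", tweet.getText());
            batch.add(doc);
            if (batch.size() == 500) {
                server.add(batch); // note: no server.commit() anywhere
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            server.add(batch);
        }
    }
}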
The commit settings in solrconfig.xml are now:

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:300000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
</autoSoftCommit>

The picture changed for the better: no more index corruption, no more endless replication attempts, and, 16 hours after start-up with more than 142k tweets downloaded, all shards and replicas are still "active".

One problem remains, though. While auto committing, Solr logs the following stack trace:

00:00:40,383 ERROR [org.apache.solr.update.CommitTracker] (commitScheduler-25-thread-1) auto commit error...:org.apache.solr.common.SolrException: *Error opening new searcher*
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1550)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1662)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:603)
        at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
*Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: _1.nvm*
        at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:252)
        at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:238)
        at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
        at java.util.TimSort.sort(TimSort.java:203)
        at java.util.TimSort.sort(TimSort.java:173)
        at java.util.Arrays.sort(Arrays.java:659)
        at java.util.Collections.sort(Collections.java:217)
        at org.apache.lucene.index.TieredMergePolicy.findMerges(TieredMergePolicy.java:286)
        at org.apache.lucene.index.IndexWriter.updatePendingMerges(IndexWriter.java:2017)
        at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1986)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:407)
        at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:287)
        at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:272)
        at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1461)
        ... 10 more
*Caused by: java.io.FileNotFoundException: _1.nvm*
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:260)
        at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:177)
        at org.apache.lucene.index.SegmentCommitInfo.sizeInBytes(SegmentCommitInfo.java:141)
        at org.apache.lucene.index.MergePolicy.size(MergePolicy.java:513)
        at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:242)
        ... 24 more

The file "_1.nvm" existed at some point. It was deleted during an auto commit, but a reference to it apparently lingers somewhere, as if it were still queued for deletion. I believe the consequence is that in the SolrCloud Admin UI -> Core Admin -> Stats, the "Current" status is off for all shards' replica number 3.
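For what it's worth, the same flag can be read programmatically; a sketch using SolrJ's CoreAdmin STATUS request (the node URL and core name are placeholders for one of my replicas):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;
import org.apache.solr.client.solrj.response.CoreAdminResponse;
import org.apache.solr.common.util.NamedList;

public class CheckCurrent {
    public static void main(String[] args) throws Exception {
        // Placeholders: point at the node and core of one suspect replica.
        HttpSolrServer node = new HttpSolrServer("http://localhost:8983/solr");
        String core = "collection1_shard1_replica3";

        CoreAdminResponse status = CoreAdminRequest.getStatus(core, node);
        NamedList<?> index = (NamedList<?>) status.getCoreStatus(core).get("index");

        // "current" is what the Core Admin UI shows as Current: false means
        // the open searcher no longer matches the index on disk.
        System.out.println("current = " + index.get("current"));
        node.shutdown();
    }
}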
If I understand correctly, this means that changes to the index are not becoming visible. Once again I tried to find possible reasons for this situation, but none of the threads I found seems to reflect my case.

My lock type is set to:

<lockType>${solr.lock.type:single}</lockType>

This is due to a lock wait timeout error with both "native" and "simple" when trying to create the collection through the Collections API. There is a thread discussing this issue:
http://lucene.472066.n3.nabble.com/unable-to-load-core-after-cluster-restart-td4098731.html

The only thing is that "single" should only be used if "there is no possibility of another process trying to modify the index", and I cannot guarantee that. Could that be the cause of the FileNotFoundException?

Thanks once again for your help.

Regards,
Bruno.

2014-11-08 18:36 GMT-02:00 Erick Erickson <erickerick...@gmail.com>:
> First, for tweets, committing every 500 docs is much too frequent,
> especially from the client and super-especially if you have multiple
> clients running. I'd recommend you just configure solrconfig this way
> as a place to start and do NOT commit from any clients:
> 1> a hard commit (openSearcher=false) every minute (or maybe 5 minutes)
> 2> a soft commit every minute
>
> The latter governs how long it'll be between when a doc is indexed and
> when it can be searched.
>
> Here's a long post about how all this works:
> https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> As far as the rest, it's definitely a puzzle. If it continues, a complete
> stack trace would be a good thing to start with.
>
> Best,
> Erick
>
> On Sat, Nov 8, 2014 at 9:47 AM, Bruno Osiek <baos...@gmail.com> wrote:
> > Hi,
> >
> > I am a newbie SolrCloud enthusiast. My goal is to implement an
> > infrastructure to enable text analysis (clustering, classification,
> > information extraction, sentiment analysis, etc.).
> >
> > My development environment consists of one machine: quad-core processor,
> > 16GB RAM and a 1TB HD.
> >
> > I have started by implementing Apache Flume with Twitter as source and
> > SolrCloud (within JBoss AS 7) as sink, using ZooKeeper (5 servers) to
> > upload configuration and manage the cluster.
> >
> > The pseudo-distributed cluster consists of one collection with three
> > shards, each with three replicas.
> >
> > Everything runs smoothly for a while. After 50,000 tweets committed
> > (actually CloudSolrServer commits every batch of 500 documents),
> > SolrCloud randomly starts logging exceptions: Lucene file not found,
> > IndexWriter cannot be opened, replication unsuccessful and the like.
> > Recovery starts, without success, until the replica goes down.
> >
> > I have tried different Solr versions (4.10.2, 4.9.1 and lastly 4.8.1)
> > with the same results.
> >
> > I have looked everywhere for help before writing this email. My guess
> > right now is that the problem lies with the SolrCloud and ZooKeeper
> > connection, although I haven't seen any such exception.
> >
> > Any reference or help will be welcomed.
> >
> > Cheers,
> > B.