Glad to hear that! Thanks for closing this out.

Best,
Erick
On Sun, Nov 9, 2014 at 4:55 PM, Bruno Osiek <baos...@gmail.com> wrote:
> Erick,
>
> Once again, thank you very much for your attention.
>
> Now my pseudo-distributed SolrCloud is configured with no inconsistency. An
> additional problem was starting JBoss with "solr.data.dir" set to a path
> not expected by Solr (actually it was not even underneath the solr.home
> directory).
>
> This thread (
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3ccao8xr5zv8o-s6zn7ypaxpzpourqjknbsm59mbe6h3dpfykg...@mail.gmail.com%3E)
> explains the inconsistency.
>
> I found no need to change the Solr data directory. After commenting out
> this property in JBoss' standalone.xml and setting
> "<lockType>${solr.lock.type:native}</lockType>", everything started to
> work properly.
>
> Regards,
> Bruno
>
>
> 2014-11-09 14:35 GMT-02:00 Erick Erickson <erickerick...@gmail.com>:
>
>> OK, we're _definitely_ in the speculative realm here, so don't think
>> I know more than I do ;)...
>>
>> The next thing I'd try is to go back to "native" as the lock type, on the
>> theory that the lock type wasn't your problem; it was the too-frequent
>> commits.
>>
>> bq: This file "_1.nvm" once existed. Was deleted during one auto commit,
>> but remains somewhere in a queue for deletion
>>
>> Assuming Unix, this is entirely expected. Searchers have all the files
>> open. Commits do background merges, which may delete segments. So the
>> current searcher may have the file open even though it's been "merged
>> away". When the searcher closes, the file will actually, truly disappear.
>>
>> It's more complicated on Windows, but eventually that's what happens.
>>
>> Anyway, keep us posted. If this continues to occur, please open a new
>> thread; that might catch the eye of people who are deep into Lucene file
>> locking...
>>
>> Best,
>> Erick
>>
>> On Sun, Nov 9, 2014 at 6:45 AM, Bruno Osiek <baos...@gmail.com> wrote:
>> > Hi Erick,
>> >
>> > Thank you very much for your reply.
>> > I disabled client commits while setting commits in solrconfig.xml as
>> > follows:
>> >
>> > <autoCommit>
>> >   <maxTime>${solr.autoCommit.maxTime:300000}</maxTime>
>> >   <openSearcher>false</openSearcher>
>> > </autoCommit>
>> >
>> > <autoSoftCommit>
>> >   <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
>> > </autoSoftCommit>
>> >
>> > The picture has changed for the better: no more index corruption or
>> > endless replication attempts, and up till now, 16 hours since start-up
>> > and more than 142k tweets downloaded, all shards and replicas are
>> > "active".
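>> >
>> > For reference, the indexing client is now roughly along these lines
>> > (the ZooKeeper address, collection name and field values are
>> > placeholders), with no explicit commit() calls:
>> >
>> > import org.apache.solr.client.solrj.SolrServerException;
>> > import org.apache.solr.client.solrj.impl.CloudSolrServer;
>> > import org.apache.solr.common.SolrInputDocument;
>> >
>> > import java.io.IOException;
>> >
>> > public class TweetIndexer {
>> >     public static void main(String[] args)
>> >             throws IOException, SolrServerException {
>> >         // ZooKeeper ensemble and collection name are placeholders
>> >         CloudSolrServer server =
>> >                 new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
>> >         server.setDefaultCollection("tweets");
>> >
>> >         SolrInputDocument doc = new SolrInputDocument();
>> >         doc.addField("id", "tweet-12345");          // illustrative values
>> >         doc.addField("text", "sample tweet text");
>> >         server.add(doc);   // no commit() here; hard/soft commits are
>> >                            // driven entirely by solrconfig.xml
>> >
>> >         server.shutdown();
>> >     }
>> > }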
>> >
>> > One problem remains, though. While auto-committing, Solr logs the
>> > following stack trace:
>> >
>> > 00:00:40,383 ERROR [org.apache.solr.update.CommitTracker]
>> > (commitScheduler-25-thread-1) auto commit
>> > error...:org.apache.solr.common.SolrException: *Error opening new searcher*
>> > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1550)
>> > at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1662)
>> > at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:603)
>> > at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
>> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>> > at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>> > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> > at java.lang.Thread.run(Thread.java:745)
>> > *Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: _1.nvm*
>> > at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:252)
>> > at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:238)
>> > at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
>> > at java.util.TimSort.sort(TimSort.java:203)
>> > at java.util.TimSort.sort(TimSort.java:173)
>> > at java.util.Arrays.sort(Arrays.java:659)
>> > at java.util.Collections.sort(Collections.java:217)
>> > at org.apache.lucene.index.TieredMergePolicy.findMerges(TieredMergePolicy.java:286)
>> > at org.apache.lucene.index.IndexWriter.updatePendingMerges(IndexWriter.java:2017)
>> > at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1986)
>> > at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:407)
>> > at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:287)
>> > at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:272)
>> > at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
>> > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1461)
>> > ... 10 more
>> > *Caused by: java.io.FileNotFoundException: _1.nvm*
>> > at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:260)
>> > at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:177)
>> > at org.apache.lucene.index.SegmentCommitInfo.sizeInBytes(SegmentCommitInfo.java:141)
>> > at org.apache.lucene.index.MergePolicy.size(MergePolicy.java:513)
>> > at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:242)
>> > ... 24 more
>> >
>> > This file "_1.nvm" once existed. It was deleted during one auto commit,
>> > but remains somewhere in a queue for deletion. I believe the consequence
>> > is that in the SolrCloud Admin UI -> Core Admin -> Stats, the "Current"
>> > status is off for all shards' replica number 3. If I understand
>> > correctly, this means that changes to the index are not becoming visible.
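>> >
>> > One way to check whether a given replica actually sees the latest
>> > updates (the core URL below is illustrative) is to query each core
>> > directly with distrib=false and compare numFound across replicas, e.g.:
>> >
>> > import org.apache.solr.client.solrj.SolrQuery;
>> > import org.apache.solr.client.solrj.SolrServerException;
>> > import org.apache.solr.client.solrj.impl.HttpSolrServer;
>> >
>> > public class ReplicaDocCount {
>> >     public static void main(String[] args) throws SolrServerException {
>> >         // point straight at a single replica core; URL is a placeholder
>> >         HttpSolrServer core = new HttpSolrServer(
>> >                 "http://localhost:8080/solr/collection1_shard1_replica3");
>> >         SolrQuery q = new SolrQuery("*:*");
>> >         q.setRows(0);
>> >         q.set("distrib", "false"); // count only this core's own index
>> >         long numFound = core.query(q).getResults().getNumFound();
>> >         System.out.println("docs visible on this replica: " + numFound);
>> >         core.shutdown();
>> >     }
>> > }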
>> >
>> > Once again I tried to find possible reasons for that situation, but
>> > none of the threads I found seems to reflect my case.
>> >
>> > My lock type is set to: <lockType>${solr.lock.type:single}</lockType>.
>> > This is due to a lock.wait timeout error with both "native" and "simple"
>> > when trying to create the collection using the Collections API. There is
>> > a thread discussing this issue:
>> >
>> > http://lucene.472066.n3.nabble.com/unable-to-load-core-after-cluster-restart-td4098731.html
>> >
>> > The only thing is that "single" should only be used if "there is no
>> > possibility of another process trying to modify the index", and I
>> > cannot guarantee that. Could that be the cause of the file-not-found
>> > exception?
>> >
>> > Thanks once again for your help.
>> >
>> > Regards,
>> > Bruno.
>> >
>> >
>> > 2014-11-08 18:36 GMT-02:00 Erick Erickson <erickerick...@gmail.com>:
>> >
>> >> First: for tweets, committing every 500 docs is much too frequent,
>> >> especially from the client and super-especially if you have multiple
>> >> clients running. I'd recommend you just configure solrconfig this way
>> >> as a place to start and do NOT commit from any clients:
>> >> 1> a hard commit (openSearcher=false) every minute (or maybe 5 minutes)
>> >> 2> a soft commit every minute
>> >>
>> >> The latter governs how long it'll be between when a doc is indexed and
>> >> when it can be searched.
>> >>
>> >> Here's a long post about how all this works:
>> >>
>> >> https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>> >>
>> >> As far as the rest, it's a puzzle, definitely. If it continues, a
>> >> complete stack trace would be a good thing to start with.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Sat, Nov 8, 2014 at 9:47 AM, Bruno Osiek <baos...@gmail.com> wrote:
>> >> > Hi,
>> >> >
>> >> > I am a newbie SolrCloud enthusiast. My goal is to implement an
>> >> > infrastructure to enable text analysis (clustering, classification,
>> >> > information extraction, sentiment analysis, etc.).
>> >> >
>> >> > My development environment consists of one machine: quad-core
>> >> > processor, 16GB RAM and 1TB HD.
>> >> >
>> >> > I have started implementing Apache Flume with Twitter as source and
>> >> > SolrCloud (within JBoss AS 7) as sink, using ZooKeeper (5 servers) to
>> >> > upload the configuration and manage the cluster.
>> >> >
>> >> > The pseudo-distributed cluster consists of one collection with three
>> >> > shards, each with three replicas.
>> >> >
>> >> > Everything runs smoothly for a while. After 50,000 tweets committed
>> >> > (actually CloudSolrServer commits every batch consisting of 500
>> >> > documents), SolrCloud randomly starts logging exceptions: Lucene file
>> >> > not found, IndexWriter cannot be opened, replication unsuccessful and
>> >> > the like. Recovery starts with no success until the replica goes down.
>> >> >
>> >> > I have tried different Solr versions (4.10.2, 4.9.1 and lastly 4.8.1)
>> >> > with the same results.
>> >> >
>> >> > I have looked everywhere for help before writing this email. My guess
>> >> > right now is that the problem lies with the connection between
>> >> > SolrCloud and ZooKeeper, although I haven't seen any such exception.
>> >> >
>> >> > Any reference or help will be welcomed.
>> >> >
>> >> > Cheers,
>> >> > B.
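>> >> >
>> >> > For reference, a one-collection, three-shard, three-replica layout on
>> >> > a single node is typically created with a Collections API call along
>> >> > these lines (host, port and names are illustrative; maxShardsPerNode
>> >> > has to be raised so that all nine cores fit on the one node):
>> >> >
>> >> > http://localhost:8080/solr/admin/collections?action=CREATE&name=tweets&numShards=3&replicationFactor=3&maxShardsPerNode=9&collection.configName=myconf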