OK, we're _definitely_ in the speculative realm here, so don't think
I know more than I do ;)...

The next thing I'd try is to go back to "native" as the lock type, on the
theory that the lock type was never your problem; the too-frequent
commits were.
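
For reference, that'd look something like this in the <indexConfig> section
of solrconfig.xml, keeping the system-property indirection you already have
(only the default changes from "single" to "native"):

     <indexConfig>
       <lockType>${solr.lock.type:native}</lockType>
     </indexConfig>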

bq: This file "_1.nvm" once existed. It was deleted during one auto commit, but
remains somewhere in a queue for deletion

Assuming Unix, this is entirely expected. Searchers hold all the index files
open. Commits can trigger background merges, which may delete segments. So the
current searcher may still have a file open even though it's been "merged
away"; on Unix the file isn't truly removed until the last process holding it
open (here, the searcher) closes it.

It's more complicated on Windows, but eventually the same thing happens.

Anyway, keep us posted. If this continues to occur, please open a new thread;
that might catch the eye of people who are deep into Lucene file locking...

Best,
Erick

On Sun, Nov 9, 2014 at 6:45 AM, Bruno Osiek <baos...@gmail.com> wrote:
> Hi Erick,
>
> Thank you very much for your reply.
> I disabled client-side commits and configured commits in solrconfig.xml as follows:
>
>      <autoCommit>
>        <maxTime>${solr.autoCommit.maxTime:300000}</maxTime>
>        <openSearcher>false</openSearcher>
>      </autoCommit>
>
>      <autoSoftCommit>
>        <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
>      </autoSoftCommit>
>
> The picture changed for the better: no more index corruption or endless
> replication attempts, and now, 16 hours since start-up with more than
> 142k tweets downloaded, all shards and replicas are "active".
>
> One problem remains though. While auto-committing, Solr logs the following
> stack trace:
>
> 00:00:40,383 ERROR [org.apache.solr.update.CommitTracker] (commitScheduler-25-thread-1) auto commit error...:org.apache.solr.common.SolrException: *Error opening new searcher*
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1550)
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1662)
> at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:603)
> at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> *Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: _1.nvm*
> at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:252)
> at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:238)
> at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
> at java.util.TimSort.sort(TimSort.java:203)
> at java.util.TimSort.sort(TimSort.java:173)
> at java.util.Arrays.sort(Arrays.java:659)
> at java.util.Collections.sort(Collections.java:217)
> at org.apache.lucene.index.TieredMergePolicy.findMerges(TieredMergePolicy.java:286)
> at org.apache.lucene.index.IndexWriter.updatePendingMerges(IndexWriter.java:2017)
> at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1986)
> at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:407)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:287)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:272)
> at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1461)
> ... 10 more
> *Caused by: java.io.FileNotFoundException: _1.nvm*
> at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:260)
> at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:177)
> at org.apache.lucene.index.SegmentCommitInfo.sizeInBytes(SegmentCommitInfo.java:141)
> at org.apache.lucene.index.MergePolicy.size(MergePolicy.java:513)
> at org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:242)
> ... 24 more
>
> This file "_1.nvm" once existed. Was deleted during one auto commit , but
> remains somewhere in a queue for deletion. I believe the consequence is
> that at SolrCloud Admin UI -> Core Admin -> Stats, the "Current" status is
> off for all shards' replica number 3. If I understand correctly this means
> that changes to the index are not becoming visible.
>
> Once again I tried to find possible reasons for this situation, but none of
> the threads I found seems to reflect my case.
>
> My lock type is set to <lockType>${solr.lock.type:single}</lockType>. This
> is due to a lock.wait timeout error with both "native" and "simple" when
> trying to create a collection using the Collections API. There is a thread
> discussing this issue:
>
> http://lucene.472066.n3.nabble.com/unable-to-load-core-after-cluster-restart-td4098731.html
>
> The only thing is that "single" should only be used when "there is no
> possibility of another process trying to modify the index", and I cannot
> guarantee that. Could that be the cause of the FileNotFoundException?
>
> Thanks once again for your help.
>
> Regards,
> Bruno.
>
>
>
> 2014-11-08 18:36 GMT-02:00 Erick Erickson <erickerick...@gmail.com>:
>
>> First, for tweets, committing every 500 docs is much too frequent.
>> Especially from the client and super-especially if you have multiple
>> clients running. I'd recommend you just configure solrconfig this way
>> as a place to start and do NOT commit from any clients.
>> 1> a hard commit (openSearcher=false) every minute (or maybe 5 minutes)
>> 2> a soft commit every minute
>>
>> The latter governs how long it'll be between when a doc is indexed and
>> when it can be searched.
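>>
>> As a sketch, that translates to something like this in solrconfig.xml
>> (300000 ms = 5 minutes for the hard commit, 60000 ms = 1 minute for the
>> soft commit; tune the intervals to your needs):
>>
>>      <autoCommit>
>>        <maxTime>300000</maxTime>
>>        <openSearcher>false</openSearcher>
>>      </autoCommit>
>>
>>      <autoSoftCommit>
>>        <maxTime>60000</maxTime>
>>      </autoSoftCommit>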
>>
>> Here's a long post about how all this works:
>>
>> https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>
>>
>> As for the rest, it's definitely a puzzle. If it continues, a complete stack
>> trace would be a good thing to start with.
>>
>> Best,
>> Erick
>>
>> On Sat, Nov 8, 2014 at 9:47 AM, Bruno Osiek <baos...@gmail.com> wrote:
>> > Hi,
>> >
>> > I am a newbie SolrCloud enthusiast. My goal is to implement an
>> > infrastructure to enable text analysis (clustering, classification,
>> > information extraction, sentiment analysis, etc.).
>> >
>> > My development environment consists of one machine, quad-core processor,
>> > 16GB RAM and 1TB HD.
>> >
>> > I have set up Apache Flume with Twitter as the source and SolrCloud
>> > (within JBoss AS 7) as the sink, using ZooKeeper (5 servers) to upload
>> > configuration and manage the cluster.
>> >
>> > The pseudo-distributed cluster consists of one collection with three
>> > shards, each with three replicas.
>> >
>> > Everything runs smoothly for a while. But after about 50,000 tweets have
>> > been committed (CloudSolrServer actually commits every batch of 500
>> > documents), SolrCloud randomly starts logging exceptions: Lucene file not
>> > found, IndexWriter cannot be opened, replication unsuccessful, and the
>> > like. Recovery starts, without success, until the replica goes down.
>> >
>> > I have tried different Solr versions (4.10.2, 4.9.1 and lastly 4.8.1) with
>> > the same results.
>> >
>> > I have looked everywhere for help before writing this email. My guess right
>> > now is that the problem lies with the SolrCloud-ZooKeeper connection,
>> > although I haven't seen any such exception.
>> >
>> > Any reference or help will be welcomed.
>> >
>> > Cheers,
>> > B.
>>
