Ah, we're also seeing Solr look up a nonexistent directory:

2012-10-30 16:32:26,578 ERROR [handler.admin.CoreAdminHandler] - [http-8080-exec-2] - : IO error while trying to get the size of the Directory:org.apache.lucene.store.NoSuchDirectoryException: directory '/opt/solr/cores/shard_a/data/index' does not exist
        at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:220)
        at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:243)
        at org.apache.lucene.store.NRTCachingDirectory.listAll(NRTCachingDirectory.java:132)
        at org.apache.solr.core.DirectoryFactory.sizeOfDirectory(DirectoryFactory.java:146)

Instead of data/index it should be looking for data/index.20121030152324761/, 
which actually does exist.
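For reference, the timestamped directory is the one named by the index.properties file that replication writes into the core's data dir. A minimal sketch of that resolution logic (the helper name and layout assumptions are mine, not Solr's actual code):

```python
import os

def resolve_index_dir(data_dir):
    """Return the index directory a core should open: the directory named
    by index.properties if present, else the default data/index."""
    props_path = os.path.join(data_dir, "index.properties")
    if os.path.isfile(props_path):
        with open(props_path) as f:
            for line in f:
                line = line.strip()
                # Replication writes e.g. "index=index.20121030152324761"
                if line.startswith("index="):
                    return os.path.join(data_dir, line.split("=", 1)[1])
    return os.path.join(data_dir, "index")
```

So the error above suggests CoreAdminHandler is sizing the plain data/index path without consulting index.properties first.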

-----Original message-----
> From:Markus Jelsma <markus.jel...@openindex.io>
> Sent: Tue 30-Oct-2012 17:30
> To: solr-user@lucene.apache.org
> Subject: trunk is unable to replicate between nodes (Unable to download ... completely)
> 
> Hi,
> 
> We're testing again with today's trunk, using the new Lucene 4.1 format by 
> default. When nodes are not restarted things are fairly stable, but 
> restarting nodes leads to a lot of mayhem. It seems we can get the cluster 
> back up and running by clearing ZK and restarting everything (another issue), 
> but replication becomes impossible for some nodes, leading to a continuous 
> state of failing recovery.
> 
> Here are some excerpts from the logs:
> 
> 2012-10-30 16:12:39,674 ERROR [solr.servlet.SolrDispatchFilter] - [http-8080-exec-5] - : null:java.lang.IndexOutOfBoundsException
>         at java.nio.Buffer.checkBounds(Buffer.java:530)
>         at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:218)
>         at org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:91)
>         at org.apache.solr.handler.ReplicationHandler$DirectoryFileStream.write(ReplicationHandler.java:1065)
>         at org.apache.solr.handler.ReplicationHandler$3.write(ReplicationHandler.java:932)
> 
> 
> 2012-10-30 16:10:32,220 ERROR [solr.handler.ReplicationHandler] - [RecoveryThread] - : SnapPull failed :org.apache.solr.common.SolrException: Unable to download _x.fdt completely. Downloaded 13631488!=13843504
>         at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1237)
>         at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1118)
>         at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:716)
>         at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:387)
>         at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:273)
>         at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:152)
>         at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:407)
> 
> 2012-10-30 16:12:51,061 WARN [solr.handler.ReplicationHandler] - [http-8080-exec-3] - : Exception while writing response for params: file=_p_Lucene41_0.doc&command=filecontent&checksum=true&generation=6&qt=/replication&wt=filestream
> java.io.EOFException: read past EOF: MMapIndexInput(path="/opt/solr/cores/openindex_h/data/index.20121030152234973/_p_Lucene41_0.doc")
>         at org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:100)
>         at org.apache.solr.handler.ReplicationHandler$DirectoryFileStream.write(ReplicationHandler.java:1065)
>         at org.apache.solr.handler.ReplicationHandler$3.write(ReplicationHandler.java:932)
> 
> 
> Needless to say, I'm puzzled, so I'm wondering if anyone has seen this before 
> or has some hints that might help dig further.
> 
> Thanks,
> Markus
> 
