Likely some of the trunk work around allowing any Directory impl to replicate. JIRA pls :)
- Mark

On Oct 30, 2012, at 12:29 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:

> Hi,
>
> We're testing again with today's trunk and using the new Lucene 4.1 format by
> default. When nodes are not restarted things are kind of stable, but
> restarting nodes leads to a lot of mayhem. It seems we can get the cluster
> back up and running by clearing ZK and restarting everything (another issue),
> but replication becomes impossible for some nodes, leading to a continuous
> state of failing recovery etc.
>
> Here are some excerpts from the logs:
>
> 2012-10-30 16:12:39,674 ERROR [solr.servlet.SolrDispatchFilter] - [http-8080-exec-5] - : null:java.lang.IndexOutOfBoundsException
>         at java.nio.Buffer.checkBounds(Buffer.java:530)
>         at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:218)
>         at org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:91)
>         at org.apache.solr.handler.ReplicationHandler$DirectoryFileStream.write(ReplicationHandler.java:1065)
>         at org.apache.solr.handler.ReplicationHandler$3.write(ReplicationHandler.java:932)
>
> 2012-10-30 16:10:32,220 ERROR [solr.handler.ReplicationHandler] - [RecoveryThread] - : SnapPull failed :org.apache.solr.common.SolrException: Unable to download _x.fdt completely. Downloaded 13631488!=13843504
>         at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1237)
>         at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1118)
>         at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:716)
>         at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:387)
>         at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:273)
>         at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:152)
>         at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:407)
>
> 2012-10-30 16:12:51,061 WARN [solr.handler.ReplicationHandler] - [http-8080-exec-3] - : Exception while writing response for params: file=_p_Lucene41_0.doc&command=filecontent&checksum=true&generation=6&qt=/replication&wt=filestream
> java.io.EOFException: read past EOF: MMapIndexInput(path="/opt/solr/cores/openindex_h/data/index.20121030152234973/_p_Lucene41_0.doc")
>         at org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:100)
>         at org.apache.solr.handler.ReplicationHandler$DirectoryFileStream.write(ReplicationHandler.java:1065)
>         at org.apache.solr.handler.ReplicationHandler$3.write(ReplicationHandler.java:932)
>
> Needless to say I'm puzzled, so I'm wondering if anyone has seen this before
> or has some hints that might help dig further.
>
> Thanks,
> Markus