We are continuously getting this exception during replication from master to slave. Our index size is 9.27 GB and we are trying to replicate a slave from scratch. It's a different file each time; sometimes we get to 60% replication before it fails and sometimes only 10%. We have never managed a successful replication.

30 Oct 2013 18:38:52,884 [explicit-fetchindex-cmd] ERROR ReplicationHandler - SnapPull failed :org.apache.solr.common.SolrException: Unable to download _aa7_Lucene41_0.tim completely. Downloaded 0!=1054090
    at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1244)
    at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1124)
    at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:719)
    at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:397)
    at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:317)
    at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:218)

I read in some thread that there was a related bug in Solr 4.1, but we are using Solr 4.3 and also tried with 4.5.1. It seems that DirectoryFileFetcher sometimes cannot download a file: the file arrives on the slave with size zero. We are running in a test environment where bandwidth is high.

This is the master setup:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">commit</str>
        <str name="replicateAfter">startup</str>
        <str name="confFiles">stopwords.txt,spellings.txt,synonyms.txt,protwords.txt,elevate.xml,currency.xml</str>
        <str name="commitReserveDuration">00:00:50</str>
      </lst>
    </requestHandler>

And the slave setup:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://solr-master.saltdev.sealdoc.com:8081/solr-master</str>
        <str name="httpConnTimeout">150000</str>
        <str name="httpReadTimeout">300000</str>
      </lst>
    </requestHandler>
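
Since a different file fails on each attempt, it may help to tally the failures across several runs before drawing conclusions. The following is only a rough sketch, assuming the slave log contains messages in exactly the form shown above ("Unable to download ... completely. Downloaded X!=Y"). The function names `parse_failures` and `summarize` are made up for this illustration and are not part of Solr.

```python
import re
from collections import Counter

# Matches the message format seen in the log excerpt above:
#   "Unable to download <file> completely. Downloaded <got>!=<expected>"
FAILURE_RE = re.compile(
    r"Unable to download (?P<name>\S+) completely\. "
    r"Downloaded (?P<got>\d+)!=(?P<expected>\d+)"
)


def parse_failures(log_text):
    """Return a list of (file_name, bytes_downloaded, bytes_expected) tuples."""
    return [
        (m.group("name"), int(m.group("got")), int(m.group("expected")))
        for m in FAILURE_RE.finditer(log_text)
    ]


def summarize(failures):
    """Count failures per file extension and count zero-byte downloads."""
    by_ext = Counter(name.rsplit(".", 1)[-1] for name, _, _ in failures)
    zero_byte = sum(1 for _, got, _ in failures if got == 0)
    return by_ext, zero_byte


# Sample input copied from the log line in this post.
SAMPLE = (
    "30 Oct 2013 18:38:52,884 [explicit-fetchindex-cmd] ERROR ReplicationHandler - "
    "SnapPull failed :org.apache.solr.common.SolrException: Unable to download "
    "_aa7_Lucene41_0.tim completely. Downloaded 0!=1054090 at "
    "org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1244)"
)

if __name__ == "__main__":
    failures = parse_failures(SAMPLE)
    by_ext, zero_byte = summarize(failures)
    print(failures, dict(by_ext), zero_byte)
```

Run against the full slave log (read the file into a string and pass it to `parse_failures`), this would show whether the failures are always zero-byte downloads or sometimes partial ones, and whether particular file types are affected more often.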