Hey all,

We have one master and two slaves (three servers in total) set up and running 
replication across multiple cores.

However, replication behaves erratically and often fails when left to run 
automatically via polling. More often than not, a replication fails after the 
slave has finished pulling down the segment files, because it cannot find a 
particular file, giving errors such as:

Oct 25, 2011 10:00:17 AM org.apache.solr.handler.SnapPuller copyAFile
SEVERE: Unable to move index file from: D:\web\solr\collection\data\index.20111025100000\_3u.tii to: D:\web\solr\Collection\data\index\_3u.tiiTrying to do a copy

SEVERE: Unable to copy index file from: D:\web\solr\collection\data\index.20111025100000\_3s.fdt to: D:\web\solr\Collection\data\index\_3s.fdt
java.io.FileNotFoundException: D:\web\solr\collection\data\index.20111025100000\_3s.fdt (The system cannot find the file specified)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(Unknown Source)
    at org.apache.solr.common.util.FileUtils.copyFile(FileUtils.java:47)
    at org.apache.solr.handler.SnapPuller.copyAFile(SnapPuller.java:585)
    at org.apache.solr.handler.SnapPuller.copyIndexFiles(SnapPuller.java:621)
    at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:317)
    at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:267)
    at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source)
    at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

I checked the master, and the files named in these errors do exist there.

Both slave machines are configured identically, with the same replication 
settings and a 60-minute poll interval. We are using Solr 3.1.

Could it be because both slave machines are trying to pull down files at the 
same time, so that one holds a lock on a file and the other skips it?

Note: If I manually force replication on each slave, one at a time, the 
replication always seems to work fine.
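
A manual, one-at-a-time run like this could be scripted roughly as follows. 
This is only a sketch: the host names, port, core name and pause length are 
placeholders rather than values from our setup, and it assumes the 
ReplicationHandler is reachable at /solr/<core>/replication and accepts the 
fetchindex command over HTTP. As far as I understand, fetchindex returns before 
the fetch has finished, so the fixed pause is just a crude way to keep the two 
slaves from overlapping.

```python
import time
from urllib.request import urlopen


def fetchindex_url(base_url, core):
    """Build the URL that asks one slave core to fetch the index from its master."""
    return "{}/{}/replication?command=fetchindex".format(base_url.rstrip("/"), core)


def replicate_one_at_a_time(slaves, core, pause_seconds=0,
                            opener=urlopen, sleep=time.sleep):
    """Trigger fetchindex on each slave in turn, pausing after each one.

    `opener` and `sleep` are injectable so the function can be exercised
    without a running Solr instance. Returns the list of URLs requested.
    """
    triggered = []
    for base_url in slaves:
        url = fetchindex_url(base_url, core)
        response = opener(url)
        try:
            response.read()  # drain the short status response
        finally:
            response.close()
        triggered.append(url)
        # Crude spacing between slaves; not a real completion check.
        sleep(pause_seconds)
    return triggered


# Example call (placeholder hosts and core name):
# replicate_one_at_a_time(
#     ["http://slave1:8983/solr", "http://slave2:8983/solr"],
#     core="collection",
#     pause_seconds=300,
# )
```

The same loop would have to be repeated for each core that replicates.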

Is there an obvious explanation, or any oddity I should be aware of, that might 
cause this?

Thanks,
Rob