Hi again,
After digging further through the Solr and Lucene code and instrumenting it 
with timing statements, I can see three main culprits when replicating data 
from 3 cores at around the same time:

Lucene IndexWriter's getFieldNumberMap
I have often seen this take around 30 seconds. Even worse, the hit is taken 
twice (once before the index download and once afterwards), adding about 60 
seconds in total to the replication. The first core is barely affected by 
this, since its entire replication process takes only about 10-14s.

StandardDirectoryReader.open
The times vary greatly here but seem to get worse with each successive core. 
For example: first core 0.5s, second core 45s, third core 1m18s. Interestingly 
enough, the first core is about 10x larger than the other two, yet it opens 
about 100x faster.

The call sequence newReaderCreator.call() + new SolrIndexSearcher + newHolder
I haven't pinpointed the slow part more exactly here, but this sequence also 
takes upwards of 30 seconds. It may be affected somewhat by cache warming, but 
I didn't see much difference going from about 8 cache warming queries down to 
1 in solrconfig.xml. Going to 0 cache warming queries seems risky; the one 
query I kept requests the set of default facets we use for the given core.
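
Per-phase timings like the ones above could be collected with a small wrapper 
around each suspect call. The following is only a minimal sketch under stated 
assumptions: PhaseTimer and the phase names are hypothetical and not part of 
Solr or Lucene, and Thread.sleep stands in for the real calls being measured.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not a Solr/Lucene class): sums wall-clock time per named phase. */
public class PhaseTimer {
    // Insertion-ordered so the report lists phases in the order they first ran.
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    /** Runs the task, adds its elapsed time to the named phase, and returns its result. */
    public <T> T time(String phase, Callable<T> task) throws Exception {
        long start = System.nanoTime();
        try {
            return task.call();
        } finally {
            // Recorded even if the task throws, so failed attempts are still counted.
            record(phase, System.nanoTime() - start);
        }
    }

    private synchronized void record(String phase, long nanos) {
        nanosByPhase.merge(phase, nanos, Long::sum);
    }

    /** Total milliseconds recorded for a phase, or 0 if the phase never ran. */
    public synchronized long millis(String phase) {
        return TimeUnit.NANOSECONDS.toMillis(nanosByPhase.getOrDefault(phase, 0L));
    }

    /** One line per phase: "name: N ms". */
    public synchronized String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : nanosByPhase.entrySet()) {
            sb.append(e.getKey()).append(": ")
              .append(TimeUnit.NANOSECONDS.toMillis(e.getValue())).append(" ms\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        PhaseTimer timer = new PhaseTimer();
        // Placeholder work; in real instrumentation the lambda would wrap the call under test.
        timer.time("core1/openReader", () -> { Thread.sleep(20); return null; });
        timer.time("core2/openReader", () -> { Thread.sleep(40); return null; });
        System.out.print(timer.report());
    }
}
```

Keying the phases by core name keeps the numbers for the three cores separate, 
which matters here because the slowdown appears to depend on the order in which 
the cores replicate.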

Kind regards,

Marcus
