I just set up a SolrCloud instance with two Solr nodes and a separate machine
running ZooKeeper.

I’ve imported 200M records from a SQL Server database, and those records are 
split nicely between the 2 nodes.  Everything seems ok.

I ran the data import from the admin UI.  It took just under 8 hours, which I 
guess is fine.  Partway through the import I checked which machines were 
connected to the SQL Server machine.  It turned out that only the node I had 
started the import on was actually connected to my database server.

Is that the expected behavior?  Is there any way to have all nodes of a 
SolrCloud index pull from the database during indexing?  Would that speed up 
indexing?  Maybe this isn’t a bottleneck I should be worried about.

Thanks,
-Colin
