> The data is pulled from the MSSQL database.
> I think the bottleneck for indexing in SOLR.
> Is it possible to further boost by kettle?
I don't know what Kettle is or what its capabilities are. Can you run more than one instance of Kettle at the same time, each one retrieving part of the database? You could divide the DB by WHERE clause, row limit, mod value on a hash, etc.

Sending updates in parallel is generally the way to get good indexing performance out of Solr. If I were doing this with the DataImportHandler, I would define more than one handler in solrconfig.xml, each with its own config file.

Thanks,
Shawn
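
P.S. To make the "mod value" split concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the integer key column name `id`, the worker count, and the helper names are all hypothetical, and the in-memory list stands in for the real MSSQL query and the real Solr update, neither of which is shown.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_clauses(num_workers, key_column="id"):
    """Build one WHERE clause per worker (hypothetical integer key column).

    Together the clauses select every row exactly once.
    """
    return [f"{key_column} % {num_workers} = {i}" for i in range(num_workers)]


def index_partition(worker, num_workers, rows):
    # Stand-in for "SELECT ... WHERE id % N = worker" followed by an
    # update request to Solr; here we only filter an in-memory list.
    return [r for r in rows if r % num_workers == worker]


def run_parallel(rows, num_workers=4):
    # Each worker handles its own slice at the same time as the others.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [
            pool.submit(index_partition, w, num_workers, rows)
            for w in range(num_workers)
        ]
        return [f.result() for f in futures]


if __name__ == "__main__":
    for clause in partition_clauses(3):
        print(clause)
    print(run_parallel(list(range(10)), num_workers=3))
```

Each clause would go into a separate import job (a separate Kettle instance, or a separate handler config), so that the jobs never overlap and no row is skipped.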