Have you read:

http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr

In short, there are only guidelines (see links), no definitive answers. If you have followed the guidelines for improving indexing speed on a single box and, after testing various settings, indexing is still too slow, you may want to test this scenario:

1. Index to several boxes/shards (using round robin or something similar).
2. Copy all the created indexes to one box.
3. Use IndexWriter.addIndexes to merge the indexes.

Doing steps 1-3 on SSDs will of course boost performance a lot as well (on large indexes, because small ones may fit in the disk cache entirely).

Hope that helps a bit,
Geert-Jan

2010/7/18 kenf_nc <ken.fos...@realestate.com>

> No one has done performance analysis? Or has a link to anywhere where it's
> been done?
>
> Basically, the fastest way to get documents into Solr. So many options
> available, what's the fastest:
> 1) file import (xml, csv) vs DIH vs POSTing
> 2) number of concurrent clients, 1 vs 10 vs 100 ... is there a
>    diminishing returns number?
>
> I have 16 million small (8 to 10 fields, no large text fields) docs that
> get updated monthly and 2.5 million largish (20 to 30 fields, a couple of
> html text fields) that get updated monthly. It currently takes about 20
> hours to do a full import. I would like to cut that down as much as
> possible.
>
> Thanks,
> Ken
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/indexing-best-practices-tp973274p976313.html
> Sent from the Solr - User mailing list archive at Nabble.com.
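
As a rough illustration of the round-robin distribution mentioned in step 1 of the reply, here is a minimal Java sketch. It only shows how documents could be dealt out to shards in turn; it does not call any Lucene or Solr API, and the class name RoundRobinSharder, the method partition, and the sample document ids are all hypothetical names made up for this example. In a real setup each resulting bucket would be sent to its own indexing box.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene or Solr.
final class RoundRobinSharder {

    // Deals documents out in turn: the document at position i goes to shard (i % shardCount).
    static <T> List<List<T>> partition(List<T> docs, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        List<List<T>> shards = new ArrayList<>();
        for (int s = 0; s < shardCount; s++) {
            shards.add(new ArrayList<>());
        }
        for (int i = 0; i < docs.size(); i++) {
            shards.get(i % shardCount).add(docs.get(i));
        }
        return shards;
    }

    public static void main(String[] args) {
        // Sample ids are placeholders; real input would be the documents to index.
        List<String> docIds = List.of("doc1", "doc2", "doc3", "doc4", "doc5");
        List<List<String>> shards = partition(docIds, 2);
        for (int s = 0; s < shards.size(); s++) {
            System.out.println("shard " + s + ": " + shards.get(s));
        }
    }
}
```

Round robin keeps the shards within one document of each other in size, which matters here because the merge in step 3 cannot start until the slowest box has finished indexing.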