Since I began using the 2010-05-18 nightly, I've been experiencing indexing slowdowns that I didn't see with Solr 1.4.
I'm seeing indexing slow down roughly every 7 million records. I'm indexing about 28 million in total. These records are batched into CSV files of 1 million rows each, which are loaded with stream.file. Solr happily chugs away at the first 7 million at around 50 s per million. It then consistently takes around 20 minutes to index the 7m-8m batch, after which it returns to around 50 s per million until it reaches the 14m-15m batch, which again takes around 20 minutes, and so on.

There are essentially no differences in configuration between my 1.4 setup and the nightly. I've played around with mergeFactor and other params to no avail. I've also hooked up YourKit to Jetty, but haven't seen anything obvious in the results. That said, my Java-fu is not so strong, so I may be missing something.

Can anyone suggest where I might start looking for answers? I have a YourKit snapshot if anyone would care to see it.

Thanks,
Mark