On 6/19/2013 10:38 AM, Shawn Heisey wrote:
Looking at the numDocs for each segment, here's what I think is happening:

The autoCommit kicks in after the first 25000 docs (25002 to be
precise), but the RAM buffer isn't emptied. The next 3339 documents get
indexed, at which point the RAM buffer fills up, so it flushes another
segment.  Then it does another 21674 docs to approximately reach 25000
for autoCommit, which forces another segment flush, but without emptying
the buffer.  Lather, rinse, repeat.

I seem to be wrong about it being strictly related to ramBufferSizeMB. Today I bumped the buffer up to 256MB, restarted Solr, and started another full-import.

If I were completely right about the buffer interaction, the larger buffer should have produced a few roughly equal-sized segments before a small one appeared. It didn't change anything - there are still two segments per autocommit, one of around 3000 docs and another that brings the total to about 25000.
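
As a sanity check on the numbers, here is a minimal Python sketch of the arithmetic. The 25000 threshold is an approximation taken from the observed segments, not a value read from the actual solrconfig.xml, and the function names are hypothetical.

```python
# Approximate autoCommit doc threshold, inferred from the observed segments
# (an assumption; the real solrconfig.xml value is not shown in this thread).
AUTOCOMMIT_DOCS = 25000

def pair_total(small_segment_docs, large_segment_docs):
    """Total docs across the two segments written per autocommit cycle."""
    return small_segment_docs + large_segment_docs

def overshoot(total_docs, threshold=AUTOCOMMIT_DOCS):
    """How far a cycle's doc count lands past the assumed threshold."""
    return total_docs - threshold

# numDocs from the earlier message: a small segment, then a larger one.
total = pair_total(3339, 21674)
print(total, overshoot(total))  # 25013 13
print(overshoot(25002))         # first segment alone: 2
```

The pair adds up to nearly the same count as the first segment, which is consistent with the commit being driven by document count. It does not explain why the small segment shows up in each cycle.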

There's still something weird going on, but now I know that I don't completely understand it. I hope someone can shed some light.

Thanks,
Shawn
