Just to bring closure to this one: we were slurping data from the
wrong DB (hardly even a desktop-class machine)...

Solr did not cough on 41 Mio records @ 34k updates/sec, single-threaded.
Great!
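
For anyone chasing a similar drop: a rough way to tell whether the source or the indexer is the slow side is to time the two stages separately. Below is a minimal Python sketch of that idea. StageTimer and its method names are made up for illustration and are not part of Solr or of any client library; the "sink" is whatever callable sends a record to the index.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


class StageTimer:
    """Accumulates wall-clock seconds spent waiting on the source and on the sink.

    Hypothetical helper for illustration only.
    """

    def __init__(self) -> None:
        self.source_seconds = 0.0
        self.sink_seconds = 0.0
        self.count = 0

    def timed_source(self, records: Iterable[T]) -> Iterator[T]:
        """Yield records while measuring how long each fetch from the source takes."""
        iterator = iter(records)
        while True:
            start = time.perf_counter()
            try:
                record = next(iterator)
            except StopIteration:
                self.source_seconds += time.perf_counter() - start
                return
            self.source_seconds += time.perf_counter() - start
            yield record

    def run(self, records: Iterable[T], sink: Callable[[T], object]) -> None:
        """Push every record into the sink, timing the sink calls separately."""
        for record in self.timed_source(records):
            start = time.perf_counter()
            sink(record)
            self.sink_seconds += time.perf_counter() - start
            self.count += 1

    def bottleneck(self) -> str:
        """Name the stage that consumed more wall-clock time."""
        return "source" if self.source_seconds > self.sink_seconds else "sink"
```

If bottleneck() reports "source", the indexer is mostly idle and waiting for data, which is what happened in our case. At tens of thousands of records per second the per-record timer calls add some overhead of their own, so timing whole batches would probably be the better choice in a real run.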



On Sat, Sep 24, 2011 at 9:18 PM, eks dev <eks...@yahoo.co.uk> wrote:
> just looking for hints where to look for...
>
> We were testing the single-threaded ingest rate on Solr, trunk version, on
> an atypical collection (a lot of small documents), and we noticed
> something we are not able to explain.
>
> Setup:
> We use defaults for index settings, Windows 64-bit, JDK 7 U2, on SSD,
> a machine with enough memory and 8 cores. The schema has 5 stored fields,
> 4 of them indexed, no positions, no norms.
> Average net document size (optimized index size / number of documents)
> is around 100 bytes.
>
> On a test with 40 Mio documents:
> - we had an update ingest rate on the first 4.4 Mio documents of an incredible
> 34k records / second...
> - then it dropped suddenly to 20k records per second, and this rate
> remained stable (variance 1k) until...
> - we hit 13 Mio, where the ingest rate dropped again really hard, from one
> instant to the next, to 10k records per second.
>
> It stayed there until we reached the end @ 40 Mio (slowly declining to
> ca. 9k, but the run was not long enough to see a trend).
>
> Nothing unusual was happening with JVM memory (sawtooth, 200-450M, fully
> regular). CPU in turn was following the ingest rate trend, indicating
> that we were waiting on something. No searches, no commits, nothing.
>
> autoCommit was turned off. Updates were streaming directly from the database.
>
> -----
> I did not expect something like this, knowing Lucene merges in the
> background. Also, such sudden drops in ingest rate indicate
> that we are not leaking anything (the drop would have been
> much more gradual). It is probably some cache, but why two really significant
> drops? 34k/sec to 20k and then to 10k... We would love to keep it @ 34
> k/second :)
>
> I am not really acquainted with the new MergePolicy and flushing
> settings, but I suspect this is something there we could tweak.
>
> Could it be that Windows is somehow, hmm, quirky with the Solr default
> directory on win64/jvm (I think it is MMAP by default)... We did not
> saturate IO with such small documents, I guess; it is just a couple
> of gigabytes over 1-2 hours.
>
> All in all, it works well, but are such hard drops in update ingest
> rate normal?
>
> Thanks,
> eks.
>
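
As a back-of-envelope check on the figures quoted above, the Python sketch below adds up the time spent in each rate phase and the net index size. It only uses numbers from the message (phase boundaries at 4.4 Mio and 13 Mio documents, rates of 34k, 20k and 10k records/sec, about 100 bytes per document). The function name is invented for illustration, and the byte count is the net size of the optimized index, so it ignores any extra writes caused by merging.

```python
def phase_seconds(phases):
    """Sum the wall-clock seconds for a list of (documents, docs_per_second) phases."""
    return sum(docs / rate for docs, rate in phases)


TOTAL_DOCS = 40_000_000
AVG_DOC_BYTES = 100  # net size per document, as reported in the message

# Phase boundaries and rates as observed in the test run.
observed_phases = [
    (4_400_000, 34_000),                # first 4.4 Mio documents
    (13_000_000 - 4_400_000, 20_000),   # up to 13 Mio documents
    (TOTAL_DOCS - 13_000_000, 10_000),  # remainder of the run
]

elapsed = phase_seconds(observed_phases)             # seconds with the drops
ideal = phase_seconds([(TOTAL_DOCS, 34_000)])        # seconds at a flat 34k/sec
index_bytes = TOTAL_DOCS * AVG_DOC_BYTES             # net index size in bytes
avg_write_rate = index_bytes / elapsed               # net bytes written per second

print(f"observed run: {elapsed / 60:.1f} min")
print(f"flat 34k/sec: {ideal / 60:.1f} min")
print(f"net index:    {index_bytes / 1e9:.1f} GB")
print(f"net write:    {avg_write_rate / 1e6:.2f} MB/s")
```

With these inputs the run comes out at a bit under an hour, against roughly 20 minutes at a flat 34k/sec, and the net write rate is on the order of 1 MB/s. That is in line with the guess that IO was nowhere near saturated.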
