Just to bring closure on this one: we were slurping data from the wrong DB (hardly a desktop-class machine)...
Solr did not cough on 41M records @ 34k updates/sec, single threaded. Great!

On Sat, Sep 24, 2011 at 9:18 PM, eks dev <eks...@yahoo.co.uk> wrote:
> just looking for hints where to look...
>
> We were testing single-threaded ingest rate on solr, trunk version, on
> an atypical collection (a lot of small documents), and we noticed
> something we are not able to explain.
>
> Setup:
> We use defaults for index settings, windows 64 bit, jdk 7 U2, on SSD,
> machine with enough memory and 8 cores. Schema has 5 stored fields,
> 4 of them indexed, no positions, no norms.
> Average net document size (optimized index size / number of documents)
> is around 100 bytes.
>
> On a test with 40M documents:
> - we had an update ingest rate on the first 4.4M documents @ an incredible
> 34k records / second...
> - then it dropped suddenly to 20k records per second, and this rate
> remained stable (variance 1k) until...
> - we hit 13M, where the ingest rate dropped again really hard, from one
> instant to the next, to 10k records per second.
>
> It stayed there until we reached the end @ 40M (slightly reducing, to
> ca. 9k, but this is not long enough to see a trend).
>
> Nothing unusual happening with jvm memory (saw-tooth 200-450M, fully
> regular). CPU in turn was following the ingest rate trend, indicating
> that we were waiting on something. No searches, no commits, nothing.
>
> autoCommit was turned off. Updates were streaming directly from the database.
>
> -----
> I did not expect something like this, knowing lucene merges in the
> background. Also, having such sudden drops in ingest rate is
> indicative that we are not leaking something (the drop would have been
> much more gradual). It could be some caches, but why two really significant
> drops? 34k/sec to 20k and then to 10k... We would love to keep it @ 34
> k/second :)
>
> I am not really acquainted with the new MergePolicy and flushing
> settings, but I suspect there is something there we could tweak.
>
> Could it be that windows is somehow, hmm, quirky with the solr default
> directory on win64/jvm (I think it is MMAP by default)... We did not
> saturate IO with such small documents I guess, it is just a couple
> of GB over 1-2 hours.
>
> All in all, it works well, but is having such hard update ingest rate
> drops normal?
>
> Thanks,
> eks.
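
For anyone reading this thread later, here is a rough back-of-the-envelope sketch of what the reported numbers add up to. It uses only the figures quoted above (phase boundaries at 4.4M and 13M documents, rates of 34k, 20k and 10k records/sec, about 100 bytes net per document) and ignores the late drift to ca. 9k. The names `PHASES`, `AVG_DOC_BYTES` and `summarize` are made up for this illustration, and the MB/s figure is based on the net (optimized) index size only, so real disk traffic including merges would be higher by some unknown factor.

```python
# Back-of-the-envelope check of the ingest figures quoted in the thread.
# Each phase is (documents ingested during the phase, records per second).
PHASES = [
    (4_400_000, 34_000),                # first 4.4M documents
    (13_000_000 - 4_400_000, 20_000),   # from 4.4M up to 13M
    (40_000_000 - 13_000_000, 10_000),  # from 13M up to 40M (drift to ~9k ignored)
]

# Average net document size reported in the thread (assumed exact here).
AVG_DOC_BYTES = 100


def summarize(phases, avg_doc_bytes):
    """Return totals implied by a list of (docs, docs_per_second) phases."""
    total_docs = sum(docs for docs, _ in phases)
    total_seconds = sum(docs / rate for docs, rate in phases)
    return {
        "total_docs": total_docs,
        "total_seconds": total_seconds,
        # Effective rate over the whole run, in records per second.
        "avg_rate": total_docs / total_seconds,
        # Net index bytes per second, NOT actual disk writes (merges rewrite data).
        "net_mb_per_sec": total_docs * avg_doc_bytes / total_seconds / 1e6,
    }


if __name__ == "__main__":
    s = summarize(PHASES, AVG_DOC_BYTES)
    print(f"total run time : {s['total_seconds'] / 60:.1f} min")
    print(f"effective rate : {s['avg_rate']:.0f} records/sec")
    print(f"net index rate : {s['net_mb_per_sec']:.2f} MB/sec")
```

Under these assumptions the run takes a bit under an hour, with an effective rate of roughly 12k records/sec. The net index growth works out to a little over 1 MB/sec, which is at least consistent with the guess that IO was not saturated on an SSD.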