Aha! See, it was the DB after all! ;) Thanks for following up, I was curious.
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

>________________________________
>From: eks dev <eks...@yahoo.co.uk>
>To: solr-user <solr-user@lucene.apache.org>
>Sent: Monday, September 26, 2011 10:21 AM
>Subject: Re: Update ingest rate drops suddenly
>
>Just to bring closure on this one, we were slurping data from the
>wrong DB (hardly a desktop-class machine)...
>
>Solr did not cough on 41Mio records @34k updates / sec., single threaded.
>Great!
>
>
>On Sat, Sep 24, 2011 at 9:18 PM, eks dev <eks...@yahoo.co.uk> wrote:
>> just looking for hints where to look for...
>>
>> We were testing single threaded ingest rate on solr, trunk version on
>> atypical collection (a lot of small documents), and we noticed
>> something we are not able to explain.
>>
>> Setup:
>> We use defaults for index settings, windows 64 bit, jdk 7 U2. on SSD,
>> machine with enough memory and 8 cores. Schema has 5 stored fields,
>> 4 of them indexed no positions no norms.
>> Average net document size (optimized index size / number of documents)
>> is around 100 bytes.
>>
>> On a test with 40 Mio documents:
>> - we had update ingest rate on first 4,4Mio documents @ incredible
>> 34k records / second...
>> - then it dropped, suddenly to 20k records per second and this rate
>> remained stable (variance 1k) until...
>> - we hit 13Mio, where ingest rate dropped again really hard, from one
>> instant in time to another to 10k records per second.
>>
>> it stayed there until we reached the end @40Mio (slightly reducing, to
>> ca 9k, but this is not long enough to see trend).
>>
>> Nothing unusual happening with jvm memory (sawtooth 200-450M fully
>> regular). CPU in turn was following the ingest rate trend, indicating
>> that we were waiting on something. No searches, no commits, nothing.
>>
>> autoCommit was turned off. Updates were streaming directly from the database.
>>
>> -----
>> I did not expect something like this, knowing lucene merges in
>> background. Also, having such sudden drops in ingest rate is
>> indicative that we are not leaking something. (drop would have been
>> much more gradual). It is some caches, but why two really significant
>> drops? 33k/sec to 20k and then to 10k... We would love to keep it @34
>> k/second :)
>>
>> I am not really acquainted with the new MergePolicy and flushing
>> settings, but I suspect this is something there we could tweak.
>>
>> Could it be windows is somehow, hmm, quirky with solr default
>> directory on win64/jvm (I think it is MMAP by default)... We did not
>> saturate IO with such small documents I guess, it is just a couple
>> of Gig over 1-2 hours.
>>
>> All in all, it works well, but is having such hard update ingest rate
>> drops normal?
>>
>> Thanks,
>> eks.
>>
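
For anyone hitting the same symptom (CPU falling along with the ingest rate, which suggests the indexer is waiting on something), one way to tell the source from Solr is to time the two sides of the loop separately. Below is a minimal sketch, not what was actually run in this thread. The class name IngestTimer and its methods are hypothetical; the source is assumed to be any Iterator over records (for example a wrapper around a DB cursor) and the sink is whatever call hands a document to Solr.

```java
import java.util.Iterator;
import java.util.function.Consumer;

/**
 * Hypothetical helper: splits single-threaded ingest time into
 * "time spent fetching from the source" and "time spent in the sink".
 */
final class IngestTimer {
    private long sourceNanos; // time inside hasNext()/next(), e.g. the DB read
    private long sinkNanos;   // time inside the sink, e.g. the Solr add call
    private long records;

    /** Drains the source into the sink, timing each side per record. */
    <T> void run(Iterator<T> source, Consumer<T> sink) {
        while (true) {
            long t0 = System.nanoTime();
            if (!source.hasNext()) {
                sourceNanos += System.nanoTime() - t0;
                break;
            }
            T record = source.next();
            long t1 = System.nanoTime();
            sink.accept(record);
            long t2 = System.nanoTime();
            sourceNanos += t1 - t0;
            sinkNanos += t2 - t1;
            records++;
        }
    }

    long records() {
        return records;
    }

    /** Fraction of measured time spent waiting on the source, in [0, 1]. */
    double sourceShare() {
        long total = sourceNanos + sinkNanos;
        return total == 0 ? 0.0 : (double) sourceNanos / total;
    }

    /** Overall records per second across both sides. */
    double recordsPerSecond() {
        long total = sourceNanos + sinkNanos;
        return total == 0 ? 0.0 : records * 1e9 / total;
    }
}
```

If sourceShare() climbs towards 1.0 at the points where the rate drops, the time is going into reading the source rather than into indexing, which is how this thread turned out.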