Sorry for the typo in my previous message. The corrected figure:
Increase = 2,297,231 - 1,786,552 = 510,679, or about 500,000 (average)
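
A quick way to re-check these numbers, as a minimal Python sketch. The helper `ratio` and the variable names are made up for illustration; the counts are the ones reported in this thread (docsPending of about 7,000,000 before the latest commit, and 101,000,000 adds against numDocs = 783,714 after the first 30 hours). The 1,786,552 baseline is assumed to be the numDocs value before the latest upload.

```python
def ratio(total_adds: int, surviving_docs: int) -> float:
    """Documents added per document that ends up counted in numDocs."""
    return total_adds / surviving_docs


# Latest upload (figures from this thread).
num_docs_before = 1_786_552  # assumed: numDocs before the latest upload
num_docs_after = 2_297_231   # numDocs after the latest commit
docs_pending = 7_000_000     # approximate docsPending before that commit

increase = num_docs_after - num_docs_before
print(f"increase: {increase:,}")                                   # 510,679
print(f"latest rate: {ratio(docs_pending, increase):.1f}:1")       # 13.7:1

# First 30 hours (figures from the original message).
initial_adds = 101_000_000
initial_num_docs = 783_714
print(f"initial rate: {ratio(initial_adds, initial_num_docs):.1f}:1")  # 128.9:1

# The same ratio with both inputs rounded (100,000,000 : 800,000).
print(f"initial rate, rounded inputs: {ratio(100_000_000, 800_000):.0f}:1")  # 125:1
```

With the exact counts the two rates come out at roughly 14:1 and 129:1, so the gap between the first 30 hours and the latest upload is about an order of magnitude either way.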
RATE (non-unique-id:unique-id) = 7,000,000 : 500,000 = 14:1 but 125:1 (initial 30 hours) was very strange...


Funtick wrote:
>
> UPDATE:
>
> After few more minutes (after previous commit):
> docsPending: about 7,000,000
>
> After commit:
> numDocs: 2,297,231
>
> Increase = 2,297,231 - 1,281,851 = 1,000,000 (average)
>
> So that I have 7 docs with same ID in average.
>
> Having 100,000,000 and then dropping below 1,000,000 is strange; it is a
> bug somewhere... need to investigate ramBufferSize and MergePolicy,
> including SOLR uniqueId implementation...
>
>
> Funtick wrote:
>>
>> After running an application which heavily uses MD5 HEX-representation as
>> <uniqueKey> for SOLR v.1.4-dev-trunk:
>>
>> 1. After 30 hours:
>> 101,000,000 documents added
>>
>> 2. Commit:
>> numDocs = 783,714
>> maxDoc = 3,975,393
>>
>> 3. Upload new docs to SOLR during 1 hour(!!!!!!!), then commit, then
>> optimize:
>> numDocs=1,281,851
>> maxDocs=1,281,851
>>
>> It looks _extremely_ strange that within an hour I have such a huge
>> increase with same 'average' document set...
>>
>> I am suspecting something goes wrong with Lucene buffer flush / index
>> merge OR SOLR - Unique ID handling...
>>
>> According to my own estimates, I should have about 10,000,000 new
>> documents now... I had 0.5 millions within an hour, and 0.8 mlns within a
>> day; same 'random' documents.
>>
>> This morning index size was about 4Gb, then suddenly dropped below 0.5
>> Gb. Why? I haven't issued any "commit"...
>>
>> I am using ramBufferMB=8192
>>
>

--
View this message in context: http://www.nabble.com/SOLR-%3CuniqueKey%3E---extremely-strange-behavior%21-Documents-disappeared...-tp25017728p25018263.html
Sent from the Solr - User mailing list archive at Nabble.com.