: Optimize solved it. Thanks Jason. I am surprised, though: why does Solr do this?

this gets into some complicated discussions about the underlying Lucene 
index format, which is discussed at a very low level in the Lucene docs...

        http://lucene.apache.org/java/2_3_2/fileformats.html

...but at a slightly higher level the issue comes from the basic nature of 
an inverted index.  even though you have a uniqueKey, and are "replacing" 
an existing document, there is no easy way to reclaim the space used by 
the previous version of the document in realtime -- instead a single bit 
records that the old version was deleted, and the new version is added to 
the end.

the space used by those deleted docs is reclaimed when "segments" get 
"merged".  All segments are merged into one compact segment when you do an 
optimize -- but an optimize isn't actually necessary to ensure that the 
deleted docs are *eventually* purged: as documents are added, incremental 
merges are constantly taking place.  How often they take place (as a 
function of docs added) can be controlled with various settings in 
solrconfig.xml (mergeFactor, for example).

That is the root of why you can see an index grow even though you only 
"replace" existing docs without adding new docs ... it will grow and then 
it will shrink again once merging happens.
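
To make the grow-then-shrink behavior concrete, here is a rough toy model 
in Python.  It is only a sketch of the idea described above, not Lucene's 
real data structures or merge policy: the Segment and ToyIndex classes and 
their methods are made up for illustration, and every add creates its own 
tiny segment just to keep the example short.

```python
# Toy model only: NOT Lucene's real index format or merge policy.
# Segment, ToyIndex and their methods are hypothetical names.

class Segment:
    def __init__(self, docs):
        # docs is a list of (unique_key, body) pairs; one deleted flag per doc.
        self.docs = list(docs)
        self.deleted = [False] * len(self.docs)

    def live_docs(self):
        # Docs whose deleted flag has not been set.
        return [d for d, dead in zip(self.docs, self.deleted) if not dead]


class ToyIndex:
    def __init__(self):
        self.segments = []

    def add(self, key, body):
        # "Replace": flag any older version as deleted, then append the new
        # version at the end. Nothing is reclaimed at this point.
        for seg in self.segments:
            for i, (k, _) in enumerate(seg.docs):
                if k == key:
                    seg.deleted[i] = True
        self.segments.append(Segment([(key, body)]))

    def stored_docs(self):
        # Everything still taking up space, deleted or not.
        return sum(len(seg.docs) for seg in self.segments)

    def merge_all(self):
        # Like an optimize: copy only live docs into one new segment.
        live = [d for seg in self.segments for d in seg.live_docs()]
        self.segments = [Segment(live)]


index = ToyIndex()
for key in ("a", "b", "c"):
    index.add(key, "v1")
size_before = index.stored_docs()

for key in ("a", "b", "c"):
    index.add(key, "v2")          # replace every doc, add no new keys
size_grown = index.stored_docs()

index.merge_all()
size_after = index.stored_docs()

print(size_before, size_grown, size_after)   # prints: 3 6 3
```

The stored size doubles after the "replace" pass even though the number of 
distinct docs never changes, and drops back once the segments are merged.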

On a slightly related topic: if you really want to explicitly force some 
segment merging, but a full optimize takes longer than you are willing to 
wait, there is a new option in Solr 1.3 to support partial 
optimization...

  <optimize maxSegments="5" />
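
If you send update commands from a script, a minimal Python sketch for 
building that message might look like the following.  The URL is an 
assumption (the usual example-server default) and build_optimize_request 
is a hypothetical helper name; adjust both for your own setup.  The 
request is only constructed here, not sent.

```python
import urllib.request

# Assumption: default example-server URL; change host, port and path as needed.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_optimize_request(max_segments, url=SOLR_UPDATE_URL):
    # Hypothetical helper: wraps the partial-optimize XML message in a POST.
    body = '<optimize maxSegments="%d" />' % max_segments
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


req = build_optimize_request(5)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```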


-Hoss
