On 8/5/2014 7:31 AM, Jako de Wet wrote:
> Thanks for the insight. Why the size increase when not specifying the clean
> parameter then? The PK for the documents remain the same throughout the
> whole import process.
>
> Should a full optimize combine all the results into one and decrease the
> physical size of the core?
When you delete all documents, none of the original segments contain any undeleted documents, so Lucene knows it can remove those segments completely, even when no merging takes place. I don't know exactly which situations trigger that automatic removal, but Lucene is smart enough to know that it can do it.

If you simply rely on uniqueKey replacement, the space taken up by deleted documents cannot be recovered automatically, because those segments still contain good documents. Only a merge can recover the space, and only an optimize can guarantee that the segment containing any specific document will be merged.

Thanks,
Shawn
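
As a rough illustration of the difference described above, here is a toy Python model. It is only a sketch: the `Segment` class and the helper functions are hypothetical names made up for this example, not Lucene's or Solr's real data structures or API, and "size" is simply a count of stored documents, live or deleted. It assumes an index of eight documents spread over two segments.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for a segment: maps doc id -> True if the doc is still live."""
    docs: dict = field(default_factory=dict)

    def live_count(self):
        return sum(1 for live in self.docs.values() if live)

    def size(self):
        # Deleted documents keep occupying space until the segment goes away.
        return len(self.docs)


def reindex(segments, doc_ids):
    """Re-add documents by uniqueKey: mark old copies deleted, write a new segment."""
    for seg in segments:
        for doc_id in doc_ids:
            if doc_id in seg.docs:
                seg.docs[doc_id] = False
    segments.append(Segment({doc_id: True for doc_id in doc_ids}))
    return segments


def drop_empty_segments(segments):
    """Segments with no live documents can be removed without any merge."""
    return [seg for seg in segments if seg.live_count() > 0]


def merge_all(segments):
    """Optimize-style merge: copy only the live documents into one new segment."""
    merged = Segment()
    for seg in segments:
        for doc_id, live in seg.docs.items():
            if live:
                merged.docs[doc_id] = True
    return [merged]


def index_size(segments):
    return sum(seg.size() for seg in segments)


def build_index():
    # Assumed starting point: docs 0-3 in one segment, docs 4-7 in another.
    return [
        Segment({i: True for i in range(0, 4)}),
        Segment({i: True for i in range(4, 8)}),
    ]


# Case 1: every old document is deleted, so the old segments can simply be dropped.
clean = drop_empty_segments(reindex(build_index(), range(8)))
clean_size = index_size(clean)

# Case 2: only some documents are replaced, so the old segments still hold live docs.
partial = drop_empty_segments(reindex(build_index(), [0, 1, 4, 5]))
partial_size = index_size(partial)

# Only a merge that rewrites those segments gets the space back.
optimized_size = index_size(merge_all(partial))
```

In this model `clean_size` is 8, `partial_size` is 12 (the four replaced copies are still sitting in the old segments), and `optimized_size` is back to 8 after the merge.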