[ disclaimer: this worked for me, ymmv ... ] I just battled this. It turns out that incrementally optimizing using the maxSegments attribute was the most efficient solution for me, in particular when you are actually running out of disk space.
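The actual optimize call inside the loop below is just a request to the update handler; a minimal sketch, assuming a stock Solr on localhost:8983 and a core named collection1 (swap in your own host and core):

    curl "http://localhost:8983/solr/collection1/update?optimize=true&maxSegments=$i"

The same handler also accepts commit=true&expungeDeletes=true, which is the alternative I compare against further down.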
#!/bin/bash
# n-segments I started with
high=400
# n-segments I want to optimize down to
low=300

for i in $(seq $high -10 $low); do
    # your optimize call with maxSegments=$i goes here
    sleep 2
done

I was able to shrink my +3TB index by about 300GB by optimizing from 400
segments down to 300 (10 at a time). It optimized out the .del files for
those segments that had one and, the best part, because you are only
rewriting 10 segments per loop iteration, the disk space footprint stays
tolerable ... at least compared to a commit with expungeDeletes=true or,
of course, an optimize without maxSegments, which basically rewrites the
entire index.

NOTE: it wreaks havoc on the system, so expect search slowdowns, and it's
best not to index while this is going on either.

David

On Sun, 2015-01-11 at 06:46 -0700, ig01 wrote:
> Hi,
>
> It's not an option for us, all the documents in our index have the same
> deletion probability.
> Is there any other solution to perform an optimization in order to reduce
> the index size?
>
> Thanks in advance.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Frequent-deletions-tp4176689p4178720.html
> Sent from the Solr - User mailing list archive at Nabble.com.