Dear All,

We have a high percentage of deleted docs that never go away, because several huge, old segments do not naturally merge with anything else. Our use case is constant reindexing of the same data: ~100 GB, 12,000,000 real records, 20,000,000 total records in the index, which is ~80% deletes.
We plan to deal with this by tuning the mergeFactor, reclaimDeletesWeight and maxSegmentSizeMB settings to suit our reindexing rate and data size. To do that with our eyes open, we want to see a picture similar to http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html, with one column per segment showing its size and % of deletes. The plan is to expose SegmentInfos via the /admin/luke handler and draw the column bars in the Solr admin UI. Is there an easier way to achieve that? Even in raw Luke we didn't find this data. We'd be happy to contribute the changes back to Solr afterwards.

Thank you,
Alexey Kozhemiakin
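
P.S. To make the kind of picture we have in mind more concrete, here is a rough text-only sketch. It is not tied to any Lucene or Solr API: SegmentStat and SegmentBars are hypothetical names made up for illustration, and the per-segment size, document count and delete count are assumed to have been read from SegmentInfos by some other code. Each row is one segment, the bar length is proportional to segment size, and the "x" part of the bar is the share taken by deleted docs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical holder for per-segment numbers (assumed to come from SegmentInfos). */
final class SegmentStat {
    final String name;
    final long sizeBytes;
    final int maxDoc;   // all docs in the segment, including deleted ones
    final int delCount; // deleted docs in the segment

    SegmentStat(String name, long sizeBytes, int maxDoc, int delCount) {
        this.name = name;
        this.sizeBytes = sizeBytes;
        this.maxDoc = maxDoc;
        this.delCount = delCount;
    }

    /** Share of deleted docs in this segment, in percent. */
    double deletedPercent() {
        return maxDoc == 0 ? 0.0 : 100.0 * delCount / maxDoc;
    }
}

/** Hypothetical renderer: one text bar per segment. */
final class SegmentBars {
    private static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    /** Bar length scales with segment size; 'x' marks the deleted share, '#' the live share. */
    static List<String> render(List<SegmentStat> segments, int width) {
        long largest = 1; // avoids division by zero for an empty or all-zero list
        for (SegmentStat s : segments) {
            largest = Math.max(largest, s.sizeBytes);
        }
        List<String> rows = new ArrayList<>();
        for (SegmentStat s : segments) {
            int barLen = (int) Math.round((double) s.sizeBytes * width / largest);
            int delLen = (int) Math.round(barLen * s.deletedPercent() / 100.0);
            String bar = repeat('x', delLen) + repeat('#', barLen - delLen);
            rows.add(String.format(Locale.ROOT, "%-6s %-" + width + "s %8.1f MB %5.1f%% deleted",
                    s.name, bar, s.sizeBytes / (1024.0 * 1024.0), s.deletedPercent()));
        }
        return rows;
    }
}
```

Calling SegmentBars.render(stats, 40) and printing the returned rows gives one line per segment, so a huge old segment that is mostly deletes shows up as a long bar that is mostly "x". The real thing in the Solr admin would of course draw proper column bars instead of text.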