*Background:*

- Our use case is to use Solr as a massive FIFO queue.
- Document additions and updates happen continuously.
- Documents are being added at a sustained rate of 50 - 100 documents per second.
- About 50% of these documents are updates to existing docs, indexed using atomic updates: the original doc is thus deleted and re-added.
- There is a separate purge operation running every four hours that deletes the oldest docs, if required based on a number of unrelated configuration parameters.
- At some time in the past, a manual force merge / optimize with maxSegments=2 was run to troubleshoot high disk i/o and remove "too many segments" as a potential variable. Currently, the largest fdts are 74G and 43G. There are 47 total segments; the largest of the others are all around 2G.
- Merge policies are all at Solr 4 defaults. Index size is currently ~50M maxDocs, ~35M numDocs, 276GB.

*Issue:*

The background purge operation is deleting docs on schedule, but the disk space is not being recovered.

*Presumptions:*

I presume, but have not confirmed (how?), that the 15M deleted documents are predominantly in the two large segments. Because they are largely in the two large segments, and those large segments still have (some/many) live documents, the segment backing files are not deleted.

*Questions:*

- When will those segments get merged and the space recovered? Does it happen when _all_ the documents in those segments are deleted? When some percentage of the segment is filled with deleted documents?
- Is there a way to do it right now vs. just waiting?
- In some cases, the purge delete conditional is _just_ free disk space: when index > free space, delete oldest. Those setups are now in scenarios where index >> free space, and getting worse. How does low disk space affect the above two questions?
- Is there a way for me to determine stats on a per-segment basis? For example, how many deleted documents are in a particular segment?
- On the flip side, can I determine in what segment a particular document is located?
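For reference, here is a minimal sketch of the arithmetic behind the 15M deleted-document figure above. The function name is hypothetical, and the inputs are only the core-level maxDocs and numDocs quoted in the background, so this says nothing about how the deletes are spread across individual segments.

```python
def deleted_stats(max_docs: int, num_docs: int) -> tuple[int, float]:
    """Return (deleted doc count, deleted fraction of maxDocs).

    maxDocs counts live plus deleted-but-not-yet-merged docs;
    numDocs counts live docs only, so the difference is the deletes.
    """
    deleted = max_docs - num_docs
    return deleted, deleted / max_docs


# Approximate core-level figures quoted in the background section.
deleted, ratio = deleted_stats(50_000_000, 35_000_000)
print(f"{deleted:,} deleted docs, {ratio:.0%} of maxDocs")
```

With the approximate numbers above, roughly 30% of the index is deleted documents still occupying disk.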
Thank you,
Scott

--
Scott Lundgren
Director of Engineering
Carbon Black, Inc.
(210) 204-0483 | scott.lundg...@carbonblack.com