On 1/7/2014 7:48 AM, Steven Bower wrote:
> I was looking at the code for getIndexSize() on the ReplicationHandler to
> get at the size of the index on disk. From what I can tell, because this
> does directory.listAll() to get all the files in the directory, the size on
> disk includes not only what is searchable at the moment but potentially
> also files that are being created by background merges/etc.. I am wondering
> if there is an API that would give me the size of the "currently
> searchable" index files (doubt this exists, but maybe)..
>
> If not what is the most appropriate way to get a list of the segments/files
> that are currently in use by the active searcher such that I could then ask
> the directory implementation for the size of all those files?
>
> For a more complete picture of what I'm trying to accomplish, I am looking
> at building a quota/monitoring component that will trigger when index size
> on disk gets above a certain size. I don't want to trigger if index is
> doing a merge and ephemerally uses disk for that process. If anyone has any
> suggestions/recommendations here too I'd be interested..

Dredging up a VERY old thread here. As I was replying to your most recent
query, I was looking through my email archive for your previous messages,
and this one caught my eye, especially because it never got a reply. It
must have escaped my notice last year.

This is a very good idea. I imagine that the active searcher object knows,
directly or indirectly, exactly which files are in use for that searcher,
so I think it should be relatively easy for it to retrieve a list. The
index size code should then be able to return both the active index size
and the total directory size.

I've been putting a little bit of work in to get the index size code moved
out of the replication handler so that it is available even if replication
is completely disabled, but my free time has been limited. I don't recall
the issue number(s) for that work.

Thanks,
Shawn
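
P.S. To make the idea concrete, here is a rough sketch of the two numbers side by side. It is only an illustration, not existing Solr code: the class and method names (IndexSizeSketch, totalSize, activeSize, overQuota) are made up, and it reads sizes straight from the filesystem rather than through the Directory implementation. It assumes you already have the list of file names the active searcher is using. In Lucene I believe that list could come from the reader's commit point (IndexCommit.getFileNames()), but I have not tested that path.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Collection;

// Hypothetical helper: names and structure are illustrative only.
final class IndexSizeSketch {

    // Size of every regular file in the index directory. This is roughly
    // what a listAll()-based calculation sees, so it includes files that
    // background merges are still writing.
    static long totalSize(Path indexDir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                try {
                    if (Files.isRegularFile(file)) {
                        total += Files.size(file);
                    }
                } catch (NoSuchFileException e) {
                    // File was deleted between listing and sizing; skip it.
                }
            }
        }
        return total;
    }

    // Size of only the files named in activeFiles. The caller is assumed
    // to supply the file names referenced by the active searcher.
    static long activeSize(Path indexDir, Collection<String> activeFiles)
            throws IOException {
        long total = 0;
        for (String name : activeFiles) {
            try {
                total += Files.size(indexDir.resolve(name));
            } catch (NoSuchFileException e) {
                // Name no longer exists on disk; skip it.
            }
        }
        return total;
    }

    // Quota check that ignores transient merge output: true only when the
    // active (searchable) files alone exceed the quota.
    static boolean overQuota(Path indexDir, Collection<String> activeFiles,
                             long quotaBytes) throws IOException {
        return activeSize(indexDir, activeFiles) > quotaBytes;
    }
}
```

A monitoring component along the lines you describe could call overQuota() for the trigger and still report totalSize() alongside it, so the temporary disk usage during a merge stays visible without raising an alert.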