Deepika0510 opened a new issue, #12395: URL: https://github.com/apache/lucene/issues/12395
### Description

There is an opportunity to improve the functionality and performance of the existing Disk Usage API by re-implementing it. Currently, the best tool we have is based on a custom Codec that separates storage by field: to get the statistics, we read an existing index and write it out again with the custom codec, using `addIndexes` and a force-merge. This is time-consuming and inefficient, so it tends not to get done.

Instead, we could estimate the storage used by each field by iterating over its structures (inverted index, doc values, stored fields, etc.) and tracking the number of bytes read. Because this only enumerates the existing index, it would not require force-merging all the data through `addIndexes`, and it would not intrude on the codec APIs.
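To make the byte-tracking idea concrete, here is a minimal, self-contained sketch in plain Java. It does not use any Lucene classes: `FieldDiskUsageSketch`, `CountingInputStream`, and `estimate` are hypothetical names invented for illustration, and a byte array with a made-up per-field layout stands in for real index structures. It only shows the accounting pattern, that is, wrapping the input in a counter and attributing the bytes read while enumerating a field to that field.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch only; none of these names are Lucene APIs. */
class FieldDiskUsageSketch {

    /** Counts every byte returned by read() calls on the wrapped stream (skip() is not counted). */
    static final class CountingInputStream extends FilterInputStream {
        private long bytesRead;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) {
                bytesRead++;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                bytesRead += n;
            }
            return n;
        }

        long bytesRead() {
            return bytesRead;
        }
    }

    /**
     * Enumerates each field in order and attributes to it the bytes read while doing so.
     * fieldLengths is an assumed stand-in for "iterate this field's structures".
     */
    static Map<String, Long> estimate(CountingInputStream in, Map<String, Integer> fieldLengths) {
        Map<String, Long> usage = new LinkedHashMap<>();
        byte[] buffer = new byte[4096];
        try {
            for (Map.Entry<String, Integer> field : fieldLengths.entrySet()) {
                long before = in.bytesRead();
                int remaining = field.getValue();
                while (remaining > 0) {
                    int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
                    if (n < 0) {
                        break; // ran out of data before the declared length
                    }
                    remaining -= n;
                }
                // The delta of the counter is this field's estimated usage.
                usage.merge(field.getKey(), in.bytesRead() - before, Long::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return usage;
    }

    public static void main(String[] args) {
        Map<String, Integer> layout = new LinkedHashMap<>();
        layout.put("title", 5);
        layout.put("body", 11);
        CountingInputStream in =
            new CountingInputStream(new ByteArrayInputStream(new byte[16]));
        System.out.println(estimate(in, layout)); // prints {title=5, body=11}
    }
}
```

In a real implementation the counter would presumably sit underneath the index readers rather than around a raw stream, and the per-field loop would walk terms, postings, doc values, and stored fields; how that maps onto Lucene's classes is left open here.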