dungba88 commented on issue #12543: URL: https://github.com/apache/lucene/issues/12543#issuecomment-1782469977
I pushed a new revision with support for DataOutput and FileChannel.

When using DataOutput, if suffix sharing is enabled one also needs to pass a RandomAccessInput for reading; otherwise it can be left null. So one can pass an IndexOutput, and the RandomAccessInput can be created from an IndexInput. When using FileChannel, one only needs to pass the FileChannel, since it already allows reading and writing at the same time. The FileChannel implementation is only there to demonstrate feasibility.

Some things I'd like to discuss:

- Should we write the rootNode + numBytes at the end of the FST instead of the front? We only know them after constructing the FST, and we can't prepend to a DataOutput (that's costly). Otherwise we would need to save the metadata separately from the main body. That's why I added a new method `saveMetadata()`.
- Should we move to a value-based LRU cache? It has pros and cons:
  - Pros: We make NodeHash completely independent of the FST. That would allow suffix sharing without a RandomAccessInput, and thus without needing the IndexOutput and IndexInput to be open at the same time. Also, reading from RAM is much faster than reading from disk.
  - Cons: It requires more RAM than the address-based cache. For a truly minimal FST it would require as much RAM as the entire FST, or more.
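
To make the first question concrete, here is a minimal sketch of what "metadata at the end" could look like with plain `java.io` streams. It is not the code in the revision and not Lucene's on-disk format: the class `FstTrailerSketch`, the `Metadata` holder, the `TRAILER_BYTES` constant and the fixed-width `long` encoding are all assumptions made for illustration. The point is only that an append-only output can stream the body first and write rootNode + numBytes once they are known, and a reader can find them at a fixed offset from the end.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical illustration only; names and layout are not taken from Lucene.
final class FstTrailerSketch {

    // Assumed layout: two big-endian longs (rootNode, numBytes) after the body.
    static final int TRAILER_BYTES = 2 * Long.BYTES;

    static final class Metadata {
        final long rootNode;
        final long numBytes;

        Metadata(long rootNode, long numBytes) {
            this.rootNode = rootNode;
            this.numBytes = numBytes;
        }
    }

    // Streams the body first, then appends the metadata, so nothing has to be
    // prepended or rewritten once construction is finished.
    static byte[] writeBodyThenTrailer(byte[] body, long rootNode) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.write(body);            // FST body, written append-only
            out.writeLong(rootNode);    // only known after construction
            out.writeLong(body.length); // numBytes of the body
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Reads the metadata back from a fixed offset relative to the end.
    static Metadata readTrailer(byte[] file) {
        if (file.length < TRAILER_BYTES) {
            throw new IllegalArgumentException("file too short for trailer");
        }
        // ByteBuffer defaults to big-endian, matching DataOutputStream.writeLong.
        ByteBuffer buf = ByteBuffer.wrap(file, file.length - TRAILER_BYTES, TRAILER_BYTES);
        long rootNode = buf.getLong();
        long numBytes = buf.getLong();
        return new Metadata(rootNode, numBytes);
    }
}
```

With this layout a reader needs the total length up front in order to locate the trailer, which is the trade-off against saving the metadata separately from the main body.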