dungba88 commented on issue #12543:
URL: https://github.com/apache/lucene/issues/12543#issuecomment-1782469977

   I put up a new revision that supports both DataOutput and FileChannel.
   
   When using DataOutput with suffix sharing enabled, one also needs to pass a 
RandomAccessInput for reading; otherwise it can be left null. For example, one 
can pass an IndexOutput, and the RandomAccessInput can be created from an 
IndexInput.
   
   When using FileChannel, one only needs to pass the FileChannel, since it 
already allows reading and writing at the same time. This FileChannel 
implementation is only meant to demonstrate feasibility.
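   
   To illustrate the FileChannel point, here is a minimal JDK-only sketch (not 
the code from the revision) of one channel opened for both READ and WRITE, 
appending blocks and reading an earlier block back by absolute position. The 
class and method names (`ChannelReadWriteSketch`, `append`, `readAt`, `demo`) 
are hypothetical.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical sketch: a single FileChannel used for writing and reading at once.
final class ChannelReadWriteSketch {

  /** Appends data at the current end of the file and returns the offset it starts at. */
  static long append(FileChannel ch, byte[] data) throws IOException {
    long start = ch.size();
    ByteBuffer buf = ByteBuffer.wrap(data);
    long pos = start;
    while (buf.hasRemaining()) {
      pos += ch.write(buf, pos); // positional write, may be partial
    }
    return start;
  }

  /** Reads len bytes starting at offset without moving the channel's own position. */
  static byte[] readAt(FileChannel ch, long offset, int len) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(len);
    long pos = offset;
    while (buf.hasRemaining()) {
      int n = ch.read(buf, pos); // positional read, may be partial
      if (n < 0) {
        throw new EOFException("read past end of file");
      }
      pos += n;
    }
    return buf.array();
  }

  /** Writes two blocks, then reads the first one back while the channel is still writable. */
  static byte[] demo() {
    try {
      Path tmp = Files.createTempFile("fst-sketch", ".bin");
      try (FileChannel ch =
          FileChannel.open(tmp, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
        long first = append(ch, new byte[] {1, 2, 3});
        append(ch, new byte[] {4, 5});
        return readAt(ch, first, 3);
      } finally {
        Files.deleteIfExists(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo()));
  }
}
```

   With the DataOutput approach the same read-back would go through a separate 
RandomAccessInput, which is why that variant needs the extra parameter.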
   
   Some things I'd like to discuss:
   - Should we write rootNode + numBytes at the end of the FST instead of the 
front? They are only known after the FST has been constructed, and we can't 
prepend to a DataOutput (that's costly). Otherwise we would need to save the 
metadata separately from the main body, which is why I added a new method 
`saveMetadata()`.
   - Should we move to a value-based LRU cache? It has pros and cons:
     - Pros: It makes NodeHash completely independent of the FST. It would 
allow suffix sharing without the need for a RandomAccessInput, and thus without 
the need for the IndexOutput & IndexInput to be open at the same time. Also, 
accessing from RAM is much faster than accessing from disk.
     - Cons: It requires more RAM than the address-based cache. For a truly 
minimal FST it would require as much RAM as (or more than) the entire FST.
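
   On the first point above, a rough sketch of the "metadata at the end" 
layout: the body is streamed out first, then rootNode and numBytes are appended 
as a fixed-size trailer that a reader locates from the end of the file. This is 
an illustration under assumptions, not the actual FST format: the class name, 
the two-long trailer layout, and treating the last node written as the root are 
all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch: body first, then a fixed-size trailer with the metadata.
final class TrailerMetadataSketch {

  /** Assumed trailer layout: rootNode (long) followed by numBytes (long). */
  static final int TRAILER_BYTES = 2 * Long.BYTES;

  /** Streams the nodes out in order, then appends the metadata known only at the end. */
  static byte[] writeWithTrailer(byte[][] nodes) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      long rootNode = -1;
      for (byte[] node : nodes) {
        rootNode = out.size(); // address of the last node written stands in for the root
        out.write(node);
      }
      long numBytes = out.size(); // size of the body, excluding the trailer
      out.writeLong(rootNode);
      out.writeLong(numBytes);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Reads {rootNode, numBytes} back from the last TRAILER_BYTES of the file. */
  static long[] readTrailer(byte[] file) {
    ByteBuffer buf = ByteBuffer.wrap(file, file.length - TRAILER_BYTES, TRAILER_BYTES);
    return new long[] {buf.getLong(), buf.getLong()};
  }

  public static void main(String[] args) {
    byte[] file = writeWithTrailer(new byte[][] {{1, 2, 3}, {4, 5}});
    System.out.println(Arrays.toString(readTrailer(file)));
  }
}
```

   The writer never has to seek back or buffer the body, which is the property 
that matters for a forward-only DataOutput.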

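   On the value-based cache question, a small sketch of the idea: the cache 
keeps a copy of each node's bytes as the key, so equality checks never have to 
read the FST back from disk. This is not NodeHash's actual implementation. It 
assumes an access-ordered `LinkedHashMap` as the LRU, and the names 
(`ValueBasedNodeCacheSketch`, `addOrGet`) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an LRU keyed by node content instead of node address.
final class ValueBasedNodeCacheSketch {

  private final LinkedHashMap<ByteBuffer, Long> cache;

  ValueBasedNodeCacheSketch(final int maxEntries) {
    // accessOrder=true makes iteration order least-recently-used first.
    this.cache =
        new LinkedHashMap<ByteBuffer, Long>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<ByteBuffer, Long> eldest) {
            return size() > maxEntries;
          }
        };
  }

  /**
   * Returns the address of an identical cached node if there is one (suffix shared);
   * otherwise records newAddress for these bytes and returns it.
   */
  long addOrGet(byte[] nodeBytes, long newAddress) {
    // ByteBuffer compares and hashes by content; the clone is the extra RAM cost.
    ByteBuffer key = ByteBuffer.wrap(nodeBytes.clone());
    Long existing = cache.get(key);
    if (existing != null) {
      return existing;
    }
    cache.put(key, newAddress);
    return newAddress;
  }

  int size() {
    return cache.size();
  }

  public static void main(String[] args) {
    ValueBasedNodeCacheSketch cache = new ValueBasedNodeCacheSketch(2);
    System.out.println(cache.addOrGet(new byte[] {1, 2}, 10L));
    System.out.println(cache.addOrGet(new byte[] {1, 2}, 20L));
  }
}
```

   The cloned key is where the extra RAM goes: an address-based cache would 
store only the long and re-read the node bytes from the FST to compare.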

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

