dungba88 commented on issue #12714:
URL: https://github.com/apache/lucene/issues/12714#issuecomment-1786608097

   I think that makes sense.
   
   I attempted to implement the byte-copying approach (not optimized yet, 
and there are still many non-optimal byte reads/writes).
   
   With the same FST as above, the cache takes 513KB, versus 150KB with the 
address-based approach, which is in line with the roughly 3x figure reported by Mike.
   
   There are some quirks I found while implementing:
   - As the ByteBlockPool seems to be merely one very long byte array (split 
into multiple byte arrays), we still need to record and map the real FST 
address to the offset of the copied bytes (unless there's already a tracking 
mechanism that I'm unaware of). Maybe we can use an additional PagedBytesWriter?
   - As FST operations act on the real, absolute address, I created a new 
layer of `ReverseBytesReader` which does the mapping automatically.
   - The implementation first copies the node bytes from BytesStore into a new 
temporary byte[], and then writes this byte[] into the primary table's 
ByteBlockPool. We could instead copy directly from BytesStore into ByteBlockPool.
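
To make the address-mapping quirk concrete, here is a minimal plain-Java sketch. It does not use Lucene's ByteBlockPool, BytesStore or `ReverseBytesReader`; the class `CopiedNodeCacheSketch`, its methods, and the idea of keying the map by a node's first FST address are all assumptions made for illustration. It copies node bytes back to back into one growable array, records FST address to copied offset, and exposes a reader that accepts absolute FST addresses and reads backwards.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch only: a plain array stands in for the pool of copied node bytes. */
final class CopiedNodeCacheSketch {
  // All copied nodes, stored back to back.
  private byte[] pool = new byte[16];
  private int used = 0;
  // FST address of a node's first byte -> {offset in pool, node length}.
  private final Map<Long, int[]> spans = new HashMap<>();

  /** Copies `length` bytes starting at `nodeStart` straight into the pool (no temporary array). */
  void copyNode(byte[] fstBytes, long nodeStart, int length) {
    if (used + length > pool.length) {
      pool = Arrays.copyOf(pool, Math.max(pool.length * 2, used + length));
    }
    System.arraycopy(fstBytes, (int) nodeStart, pool, used, length);
    spans.put(nodeStart, new int[] {used, length});
    used += length;
  }

  /** Returns a reader positioned at the last byte of the cached node. */
  MappedReverseReader readerFor(long nodeStart) {
    int[] span = spans.get(nodeStart);
    if (span == null) {
      throw new IllegalArgumentException("node not cached: " + nodeStart);
    }
    return new MappedReverseReader(nodeStart, span[0], span[1]);
  }

  /** Translates absolute FST addresses to pool offsets and reads towards lower addresses. */
  final class MappedReverseReader {
    private final long nodeStart;
    private final int poolOffset;
    private final int length;
    private long position;

    private MappedReverseReader(long nodeStart, int poolOffset, int length) {
      this.nodeStart = nodeStart;
      this.poolOffset = poolOffset;
      this.length = length;
      this.position = nodeStart + length - 1;
    }

    void setPosition(long fstAddress) {
      if (fstAddress < nodeStart || fstAddress >= nodeStart + length) {
        throw new IllegalArgumentException("address outside cached node: " + fstAddress);
      }
      position = fstAddress;
    }

    byte readByte() {
      if (position < nodeStart) {
        throw new IllegalStateException("read past the start of the cached node");
      }
      byte b = pool[poolOffset + (int) (position - nodeStart)];
      position--; // reverse reading: move to the previous FST address
      return b;
    }
  }

  public static void main(String[] args) {
    byte[] fstBytes = {10, 11, 12, 13, 14, 15, 16, 17};
    CopiedNodeCacheSketch cache = new CopiedNodeCacheSketch();
    cache.copyNode(fstBytes, 5, 3); // FST addresses 5..7 land at pool offsets 0..2
    cache.copyNode(fstBytes, 1, 2); // FST addresses 1..2 land at pool offsets 3..4
    MappedReverseReader reader = cache.readerFor(1);
    reader.setPosition(2); // caller keeps using the absolute FST address
    System.out.println(reader.readByte()); // prints 12
  }
}
```

The extra map is the cost this sketch pays for copying: callers keep using FST addresses, and only the reader knows where the bytes actually live.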
   
   I could put up a draft and gradually improve it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

