iverase commented on PR #12460:
URL: https://github.com/apache/lucene/pull/12460#issuecomment-1715224914

   > I'm contemplating not introducing a new DataInputDocValues class, and 
instead have a dataInput() method on BinaryDocValues
   
   I think this approach defeats one of the main purposes of this change, which 
is to avoid allocating a byte array when reading doc values. I don't think we 
want BinaryDocValues to do that lazily:
   
   ```
   when one of the doc values is big, on the order of a few megabytes, it can 
cause issues with small heaps (or even big heaps if 
   big enough). This is due to the allocation of a big byte array upfront, which 
can be considered a humongous allocation by the G1 
   garbage collector, and it can cause heap issues under high load.
   ```
   
   For my own use case, getting a DataInput is not enough, as I need random 
access via get/set position, similar to what I do now with 
ByteArrayDataInput.
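   
   To illustrate the kind of access pattern I mean, here is a minimal sketch using only the JDK. `PositionedInput` is a hypothetical name for illustration, not a class from Lucene or from this PR; it wraps a `ByteBuffer` (which may be off-heap) and exposes get/set position in the same style as ByteArrayDataInput, so a reader can jump around without first copying the value into a `byte[]`:
   
```java
import java.nio.ByteBuffer;

/**
 * Hypothetical sketch, not a Lucene class: a positioned reader over a
 * ByteBuffer, mirroring the getPosition/setPosition style of
 * ByteArrayDataInput without requiring an on-heap byte[].
 */
final class PositionedInput {
  private final ByteBuffer buffer;

  PositionedInput(ByteBuffer buffer) {
    // duplicate() shares the underlying bytes but gives this reader
    // its own independent position.
    this.buffer = buffer.duplicate();
  }

  int getPosition() {
    return buffer.position();
  }

  void setPosition(int pos) {
    buffer.position(pos);
  }

  byte readByte() {
    return buffer.get();
  }

  int readInt() {
    // Reads 4 bytes in big-endian order (the ByteBuffer default).
    return buffer.getInt();
  }
}

public class PositionedInputDemo {
  public static void main(String[] args) {
    // A direct buffer stands in for data that does not live in a heap byte[].
    ByteBuffer data = ByteBuffer.allocateDirect(16);
    data.putInt(0, 7);
    data.putInt(4, 11);
    data.putInt(8, 13);
    data.putInt(12, 17);

    PositionedInput in = new PositionedInput(data);
    in.setPosition(8);            // jump forward to the third int
    int third = in.readInt();
    int after = in.getPosition(); // position advanced past the int just read
    in.setPosition(0);            // jump back to the start
    int first = in.readInt();

    System.out.println(third + " " + after + " " + first); // prints "13 12 7"
  }
}
```
   
   A purely sequential DataInput would not allow the jump back to position 0 in this sketch, which is the limitation I am concerned about.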

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

