iverase commented on PR #12460: URL: https://github.com/apache/lucene/pull/12460#issuecomment-1715224914
> I'm contemplating not introducing a new DataInputDocValues class, and instead have a dataInput() method on BinaryDocValues

I think this approach defeats one of the main purposes of this change, which is to avoid allocating a byte array when reading doc values. I don't think we want BinaryDocValues to do that lazily:

```
when one of the doc values is big, in the order of few megabytes, it can cause issues with small heaps (or even big heaps if big enough). This is due to the allocation of a big byte array upfront, that can be considered humongous allocations by the G1 garbage collector and it can cause heap issues under high load.
```

In my own use case, getting a DataInput is not enough, as I need random access via get/set position, similar to what I am doing now via ByteArrayDataInput.
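
For illustration only, the random-access requirement described above can be sketched roughly as follows. This is a minimal sketch, not the API proposed in the PR: `PositionalInput` and `BufferPositionalInput` are hypothetical names that exist neither in Lucene nor in this change, and a direct `ByteBuffer` stands in for whatever off-heap storage backs the doc value. The point is that the reader keeps a view over the stored bytes and supports get/set position, so no per-value `byte[]` has to be allocated.

```java
import java.nio.ByteBuffer;

/** Hypothetical positional reader; NOT a Lucene interface. */
interface PositionalInput {
  byte readByte();

  long getPosition();

  void setPosition(long pos);

  long length();
}

/** Reads from a view over the stored bytes instead of copying them into a byte[]. */
final class BufferPositionalInput implements PositionalInput {
  private final ByteBuffer view;

  BufferPositionalInput(ByteBuffer stored) {
    // slice() shares the underlying memory; no bytes are copied here.
    this.view = stored.slice();
  }

  @Override
  public byte readByte() {
    // Relative get: returns the byte at the current position and advances by one.
    return view.get();
  }

  @Override
  public long getPosition() {
    return view.position();
  }

  @Override
  public void setPosition(long pos) {
    // Throws IllegalArgumentException if pos is outside [0, length()].
    view.position(Math.toIntExact(pos));
  }

  @Override
  public long length() {
    return view.limit();
  }
}

public class DocValueInputSketch {
  public static void main(String[] args) {
    // Assumption: a direct buffer stands in for an off-heap doc value.
    ByteBuffer stored = ByteBuffer.allocateDirect(8);
    for (int i = 0; i < 8; i++) {
      stored.put(i, (byte) (i * 10)); // absolute put, position is unchanged
    }

    PositionalInput in = new BufferPositionalInput(stored);
    in.setPosition(5); // jump forward without reading the preceding bytes
    System.out.println("byte at 5: " + in.readByte());
    in.setPosition(1); // jump backwards, as ByteArrayDataInput allows
    System.out.println("byte at 1: " + in.readByte());
    System.out.println("position now: " + in.getPosition());
  }
}
```

A plain sequential `DataInput` would cover `readByte()` but not the backwards `setPosition(1)` call, which is the gap the comment points out.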