asfimport opened a new issue, #389: URL: https://github.com/apache/arrow-java/issues/389
Following the discussion on https://github.com/apache/arrow/pull/9187.

Proposed API in BaseVariableWidthVector.java:

```java
/**
 * Get the potential buffer size for a particular number of records and density.
 * @param valueCount desired number of elements in the vector
 * @param density average number of bytes per variable width element
 * @return estimated size of underlying buffers if the vector holds
 *         a given number of elements
 */
public int getBufferSizeFor(final int valueCount, double density)
```

The current `getBufferSizeFor(int valueCount)` for BaseVariableWidthVector requires that the validity and offset buffers have already been allocated for at least the given `valueCount`. If the aim of this method is to estimate memory usage for a value count, it is not very useful, because it can only report sizes for value counts less than or equal to what the vector currently has allocated.

A better approach for approximating memory usage is to take a density argument along with the value count. The buffer estimate then does not require the validity and offset buffers to have any allocation. This is also in line with `setInitialCapacity(int valueCount, double density)`.

NOTE: this API should also be added to BaseLargeVariableWidthVector and possibly BaseRepeatedValueVector(Large) as well.

**Reporter**: [Bryan Cutler](https://issues.apache.org/jira/browse/ARROW-11739) / @BryanCutler

<sub>**Note**: *This issue was originally created as [ARROW-11739](https://issues.apache.org/jira/browse/ARROW-11739). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>
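
For illustration, a density-based estimate could look roughly like the standalone sketch below. It is not the Arrow implementation: the class and method names are hypothetical, and it assumes one validity bit per element, `valueCount + 1` four-byte offsets, and `valueCount * density` data bytes, with no allocator rounding or padding. It returns `long` to sidestep overflow, whereas the proposed method returns `int`.

```java
/** Hypothetical helper, not part of Arrow: estimates buffer sizes without any allocation. */
final class DensityBufferEstimate {
    // Assumed width of one offset entry for a (non-large) variable-width vector.
    static final int OFFSET_WIDTH = 4;

    private DensityBufferEstimate() {}

    /**
     * Estimate the combined size in bytes of the validity, offset, and data buffers.
     *
     * @param valueCount desired number of elements
     * @param density average number of bytes per element
     */
    static long estimateBufferSize(int valueCount, double density) {
        if (valueCount < 0 || density < 0) {
            throw new IllegalArgumentException("valueCount and density must be non-negative");
        }
        if (valueCount == 0) {
            return 0L;
        }
        // One validity bit per element, rounded up to whole bytes.
        long validityBytes = (valueCount + 7L) / 8L;
        // One more offset entry than there are elements.
        long offsetBytes = (valueCount + 1L) * OFFSET_WIDTH;
        // Data bytes derived from the average element size, not from existing offsets.
        long dataBytes = (long) Math.ceil(valueCount * density);
        return validityBytes + offsetBytes + dataBytes;
    }

    public static void main(String[] args) {
        // 1000 elements averaging 8 bytes each.
        System.out.println(estimateBufferSize(1000, 8.0));
    }
}
```

Because the data size comes from `density` rather than from the last written offset, the estimate works on a vector that has not been allocated at all, which is the point of the proposal.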