Re: [PR] Arrow: Close child allocators [iceberg]

via GitHub Wed, 17 Sep 2025 19:00:48 -0700


RussellSpitzer commented on PR #13976:
URL: https://github.com/apache/iceberg/pull/13976#issuecomment-3271447046


   > @kevinjqliu @RussellSpitzer while working on this I discovered that we
   > might still have a memory leak in the Arrow reader. The leak is in the
   > last-update-sequence vectorized reader and the rowId reader. The problem
   > is that these don't implement the reuse logic that the other readers do.
   > When `ArrowBatchReader` populates the `vectorHolders`:
   > 
   > ```
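   >     for (int i = 0; i < readers.length; i += 1) {
   >       // read() is handed the previous holder so that a reader can reuse it
   >       vectorHolders[i] = readers[i].read(vectorHolders[i], numRowsToRead);
   >       int numRowsInVector = vectorHolders[i].numValues();
   >       // ... (remainder of the loop body not shown)
   >     }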
   > ```
   > 
   > these two readers always allocate a new vector, and we lose the reference
   > to the old value vector (hence nobody closes it later on). It is not yet
   > clear to me why the tests are not failing; if I recall correctly,
   > @RussellSpitzer, you recently added tests to catch similar memory leaks.
   > 
   > If you think that this is indeed a potential memory leak, should we
   > address it in a separate item, or in this one?
   
   The tests probably aren't hitting it because they don't populate these
   non-table columns (I think?). These are populated based on ID, not on the
   table schema.
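
To make the pattern from the quoted comment concrete, here is a minimal, self-contained sketch. It does not use the real Arrow or Iceberg classes: `CountingAllocator`, `Holder`, `Reader`, `AlwaysAllocatingReader`, `ReusingReader`, and `LeakSketch` are all hypothetical stand-ins, and the "reuse" behaviour shown is only an assumption about what the other readers do. The loop mirrors the quoted one: each call's result overwrites the previous holder in the same slot.

```java
// Hypothetical stand-in for an Arrow allocator: it only counts open buffers.
final class CountingAllocator {
  private int open = 0;

  Holder allocate() {
    open += 1;
    return new Holder(this);
  }

  void release() {
    open -= 1;
  }

  int openBuffers() {
    return open;
  }
}

// Hypothetical stand-in for a vector holder that owns exactly one buffer.
final class Holder implements AutoCloseable {
  private final CountingAllocator allocator;
  private boolean closed = false;

  Holder(CountingAllocator allocator) {
    this.allocator = allocator;
  }

  @Override
  public void close() {
    if (!closed) {
      closed = true;
      allocator.release();
    }
  }
}

// Same shape as the quoted call: the previous holder is passed in for reuse.
interface Reader {
  Holder read(Holder reuse, int numRowsToRead);
}

// Ignores `reuse` and allocates every time, so the old holder is orphaned.
final class AlwaysAllocatingReader implements Reader {
  private final CountingAllocator allocator;

  AlwaysAllocatingReader(CountingAllocator allocator) {
    this.allocator = allocator;
  }

  @Override
  public Holder read(Holder reuse, int numRowsToRead) {
    return allocator.allocate();
  }
}

// Assumed "reuse" behaviour: hand back the existing holder when there is one.
final class ReusingReader implements Reader {
  private final CountingAllocator allocator;

  ReusingReader(CountingAllocator allocator) {
    this.allocator = allocator;
  }

  @Override
  public Holder read(Holder reuse, int numRowsToRead) {
    return reuse != null ? reuse : allocator.allocate();
  }
}

final class LeakSketch {
  // Reads `batches` batches into one slot, closes the last holder,
  // and reports how many buffers are still open afterwards.
  static int openAfterBatches(CountingAllocator allocator, Reader reader, int batches) {
    Holder holder = null;
    for (int i = 0; i < batches; i += 1) {
      holder = reader.read(holder, 10);
    }
    if (holder != null) {
      holder.close(); // only the last holder is still reachable here
    }
    return allocator.openBuffers();
  }
}
```

With three batches, the always-allocating reader leaves two buffers open after the final close, while the reusing reader leaves none. A test would only observe this if it actually drives a reader of the first kind, which matches the guess above that the tests don't populate these columns.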


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

