nandorKollar commented on PR #13976:
URL: https://github.com/apache/iceberg/pull/13976#issuecomment-3270175702

   @kevinjqliu @RussellSpitzer while working on this I discovered that we 
might still have a memory leak in the Arrow reader. The leak is in the 
vectorized readers for the last updated sequence number and for rowId. The 
problem is that these two readers don't implement the reuse logic that the 
other readers do. When `ArrowBatchReader` populates the `vectorHolders`
   ```
       for (int i = 0; i < readers.length; i += 1) {
         vectorHolders[i] = readers[i].read(vectorHolders[i], numRowsToRead);
         int numRowsInVector = vectorHolders[i].numValues();
   ```
   these two readers always allocate a new vector, and we lose the reference 
to the old value vector (so nobody closes it later on). It is not yet clear to 
me why the tests are not failing; if I recall correctly, @RussellSpitzer, you 
recently added tests to catch similar memory leaks.
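
   To illustrate the pattern, here is a toy model of the loop above. It is 
only a sketch and not Iceberg or Arrow code: `VectorReuseSketch`, `Holder`, 
`AllocatingReader`, `ReusingReader` and `leakedAfter` are names made up for 
this example, and the open-holder counter stands in for whatever the real 
allocator tracks. It assumes only the final holder in each slot gets closed, 
which is what the description above implies.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: none of these types are Iceberg or Arrow classes.
class VectorReuseSketch {
  // Number of holders that were allocated but not yet closed.
  static final AtomicInteger OPEN = new AtomicInteger();

  // Hypothetical stand-in for a holder that owns a value vector.
  static final class Holder implements AutoCloseable {
    private boolean closed;

    Holder() {
      OPEN.incrementAndGet();
    }

    @Override
    public void close() {
      if (!closed) {
        closed = true;
        OPEN.decrementAndGet();
      }
    }
  }

  interface Reader {
    Holder read(Holder reuse, int numRows);
  }

  // Ignores `reuse` and always allocates, so the previous holder
  // becomes unreachable while it is still open.
  static final class AllocatingReader implements Reader {
    @Override
    public Holder read(Holder reuse, int numRows) {
      return new Holder();
    }
  }

  // One possible fix (an assumption, not the actual patch):
  // hand back the previous holder when there is one.
  static final class ReusingReader implements Reader {
    @Override
    public Holder read(Holder reuse, int numRows) {
      return reuse != null ? reuse : new Holder();
    }
  }

  // Same shape as the batch loop: the slot is overwritten on every
  // batch and only the last holder is closed at the end.
  static int leakedAfter(Reader reader, int batches) {
    OPEN.set(0);
    Holder slot = null;
    for (int i = 0; i < batches; i += 1) {
      slot = reader.read(slot, 1024);
    }
    if (slot != null) {
      slot.close();
    }
    return OPEN.get();
  }

  public static void main(String[] args) {
    System.out.println("allocating reader, still open: " + leakedAfter(new AllocatingReader(), 3));
    System.out.println("reusing reader, still open: " + leakedAfter(new ReusingReader(), 3));
  }
}
```

   In this model, a reader that ignores the holder passed to it leaves every 
holder except the last one open, while a reader that reuses it leaves none 
open. Closing the old holder before allocating a new one would be another way 
to get the same effect.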


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


