nandorKollar commented on PR #13976:
URL: https://github.com/apache/iceberg/pull/13976#issuecomment-3270175702
@kevinjqliu @RussellSpitzer while working on this I discovered that we
might still have a memory leak in the Arrow reader. The leak is in the
vectorized reader for the last-updated sequence number and in the row ID
reader. The problem is that these two don't implement the reuse logic that the
other readers do. When `ArrowBatchReader` populates `vectorHolders`:
```
for (int i = 0; i < readers.length; i += 1) {
  vectorHolders[i] = readers[i].read(vectorHolders[i], numRowsToRead);
  int numRowsInVector = vectorHolders[i].numValues();
  // ... (rest of the loop body omitted)
}
```
these two readers always allocate a new vector, so we lose the reference to
the old value vector (hence nobody closes it later on). It is not yet clear to
me why the tests are not failing. If I recall correctly, @RussellSpitzer, you
recently added tests to guard against similar memory leaks.
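
To make the suspected pattern concrete, here is a minimal, self-contained sketch. It does not use the real Arrow or Iceberg classes: `CountingAllocator`, `SketchVector`, `Reader`, `LEAKY` and `REUSING` are all hypothetical stand-ins I made up for illustration, and the loop only mimics the shape of the `ArrowBatchReader` loop above. It assumes the caller closes only the last holder it still references.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not Iceberg/Arrow code: shows how ignoring the
// "reuse" argument leaves earlier vectors unclosed.
final class VectorReuseSketch {

  // Stand-in for an allocator that tracks how many vectors are still open.
  static final class CountingAllocator {
    private final AtomicInteger open = new AtomicInteger();

    SketchVector allocate() {
      open.incrementAndGet();
      return new SketchVector(this);
    }

    int openCount() {
      return open.get();
    }
  }

  // Stand-in for a value vector that must be closed to release memory.
  static final class SketchVector implements AutoCloseable {
    private final CountingAllocator owner;
    private boolean closed;

    SketchVector(CountingAllocator owner) {
      this.owner = owner;
    }

    @Override
    public void close() {
      if (!closed) {
        closed = true;
        owner.open.decrementAndGet();
      }
    }
  }

  // Same shape as read(reuse, ...) in the loop above, heavily simplified.
  interface Reader {
    SketchVector read(SketchVector reuse, CountingAllocator allocator);
  }

  // Ignores the previous vector and always allocates a new one.
  static final Reader LEAKY = (reuse, allocator) -> allocator.allocate();

  // Hands back the previous vector when there is one.
  static final Reader REUSING =
      (reuse, allocator) -> reuse != null ? reuse : allocator.allocate();

  // Reads `batches` batches, closes the last holder, returns what is still open.
  static int openVectorsAfter(Reader reader, int batches) {
    CountingAllocator allocator = new CountingAllocator();
    SketchVector holder = null;
    for (int i = 0; i < batches; i += 1) {
      // The old holder reference is overwritten here, as in the loop above.
      holder = reader.read(holder, allocator);
    }
    if (holder != null) {
      holder.close(); // the caller can only close what it still references
    }
    return allocator.openCount();
  }

  public static void main(String[] args) {
    System.out.println("leaky reader, still open:   " + openVectorsAfter(LEAKY, 3));
    System.out.println("reusing reader, still open: " + openVectorsAfter(REUSING, 3));
  }
}
```

With three batches, the always-allocating reader leaves two vectors open in this sketch, while the reusing one leaves none. Whether the real fix should reuse the previous vector or close it before allocating is a separate question that the sketch does not settle.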
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]