boaz-gold commented on issue #15898:
URL: https://github.com/apache/iceberg/issues/15898#issuecomment-4321648741
Hi @anoopj, thanks for engaging on this — here's the data you asked for.
JVM / GC config (from our live STG cluster, j-1TCOQHIFPXGDA, EMR 7.12):
- JVM: Corretto 17
- GC: G1GC, -Xmx32G, -XX:InitiatingHeapOccupancyPercent=35,
  -XX:+ExplicitGCInvokesConcurrent
- GC log enabled: /var/log/spark/driver-gc.log
Observed after ~24h uptime:
- Total JVM threads: 31,041
- sdk-ScheduledExecutor threads (via jstack): 30,830
- Young (Normal) GCs: frequent
- Mixed GCs completed: 1,202
- Full GCs: 0
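
For anyone who wants to track the same count from inside the driver rather than through jstack, here is a minimal sketch using only the JDK. The class and method names are hypothetical, and it assumes the leaked threads all share the `sdk-ScheduledExecutor` name prefix seen in the dump:

```java
import java.util.Set;

// Hypothetical helper: counts live threads whose name starts with a prefix.
class ThreadCounter {
    static long countThreads(String namePrefix) {
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        Set<Thread> live = Thread.getAllStackTraces().keySet();
        return live.stream()
                .filter(t -> t.getName().startsWith(namePrefix))
                .count();
    }

    public static void main(String[] args) {
        System.out.println("total live threads: "
                + Thread.getAllStackTraces().size());
        System.out.println("sdk-ScheduledExecutor threads: "
                + countThreads("sdk-ScheduledExecutor"));
    }
}
```

Logging this value periodically should show whether the count grows with each cache eviction or levels off.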
The old gen is being scanned regularly via mixed GCs, but the
sdk-ScheduledExecutor threads persist through all of them. My understanding is
that these are live Java threads — they appear as running/waiting in jstack —
and live threads are GC roots, so weak-reference cleanup wouldn't reclaim them
regardless of GC frequency. But I might be missing something about how
FileIOTracker interacts with GlueCatalog + CachingCatalog specifically — if you
see a gap in that reasoning I'd genuinely appreciate the correction.
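
To make the GC-root reasoning concrete, here is a minimal sketch in plain Java. It does not use any Iceberg or AWS SDK classes; the class name and the `demo-ScheduledExecutor` thread name are made up for illustration. It creates a scheduled executor, drops every strong reference the caller holds, requests a GC, and then checks whether the worker thread and the executor are still around:

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class AbandonedExecutorDemo {
    static final String WORKER_NAME = "demo-ScheduledExecutor";

    // Starts an executor, forces its worker thread to exist, and returns
    // only a weak reference so the caller holds no strong reference.
    static WeakReference<ScheduledExecutorService> startAndAbandon() {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r, WORKER_NAME);
            t.setDaemon(true); // lets the demo JVM exit on its own
            return t;
        });
        try {
            // Running one task makes the pool create its core worker thread.
            pool.schedule(() -> { }, 0, TimeUnit.MILLISECONDS).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return new WeakReference<>(pool); // shutdown() is never called
    }

    // Looks the worker up by name instead of holding a Thread reference.
    static boolean workerAlive() {
        return Thread.getAllStackTraces().keySet().stream()
                .anyMatch(t -> t.getName().equals(WORKER_NAME) && t.isAlive());
    }

    public static void main(String[] args) {
        WeakReference<ScheduledExecutorService> ref = startAndAbandon();
        System.gc(); // only a hint, but the result does not depend on it
        System.out.println("worker alive: " + workerAlive());
        System.out.println("executor still reachable: " + (ref.get() != null));
    }
}
```

The idle worker is parked waiting for the next task, and it holds a reference back to its executor, so neither should be collectable until something calls `shutdown()`. If the SDK's executor behaves the same way, that would match what the jstack output shows.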
Regarding your point about long-running queries: that's a valid concern and
I don't have a clean answer. In my case, with a 30-second TTL, a table can be
evicted while a query that loaded it is still reading — calling fileIO.close()
at that point could break in-flight reads. Also, I noticed that the test in PR
#15910 still asserts verify(mockIO).close(), which seemed to conflict with the
removal of the close() call — not sure if that's intentional or an oversight.
Of the two directions you suggested, the catalog-level cleanup hook feels
more tractable to us since each catalog implementation knows its own FileIO
ownership semantics. The reference counting approach seems cleaner in principle
but I am not sure how the decrement side would work without changing the
loadTable contract. That said, I am very much in learning mode here and would
follow whatever direction the maintainers think is correct.
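
To show what I mean about the decrement side, here is a rough sketch of reference counting around a closeable resource. Everything in it is hypothetical: `RefCountedCloseable`, `tryRetain()` and `release()` are names I made up, and a plain `AutoCloseable` stands in for the FileIO. It is not a proposal for the actual API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: closes the delegate only after the cache and
// every reader have released their reference.
class RefCountedCloseable {
    private final AutoCloseable delegate;
    // Starts at 1: the cache holds one reference until it evicts the entry.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCountedCloseable(AutoCloseable delegate) {
        this.delegate = delegate;
    }

    // A reader calls this before use; returns false if already closed.
    boolean tryRetain() {
        while (true) {
            int n = refs.get();
            if (n == 0) {
                return false;
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    // Called once by each reader when done, and once by the cache on
    // eviction. Whoever drops the count to zero closes the delegate.
    void release() {
        int n = refs.decrementAndGet();
        if (n < 0) {
            throw new IllegalStateException("released more often than retained");
        }
        if (n == 0) {
            try {
                delegate.close();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
    }
}
```

With this shape, eviction during a running query only drops the cache's reference, and the close happens when the last reader calls `release()`. The catch is that every caller of loadTable would need to call `release()` at some point, which is the contract change I don't see a way around.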
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]