boaz-gold commented on issue #15898: URL: https://github.com/apache/iceberg/issues/15898#issuecomment-4325582294
Thanks @anoopj
Here are the results from both diagnostics.
Thrift server uptime at capture: ~3.5 hours, with 5,546 leaked
sdk-ScheduledExecutor threads already accumulated (~1,580/hour).
---
Forced full GC (jcmd <pid> GC.run)
- sdk-ScheduledExecutor threads before: 5,546
- sdk-ScheduledExecutor threads after (~10s): 5,744
- Total threads before/after: 6,102 → 6,055
The sdk-ScheduledExecutor thread count did not decrease: the leaked
executor threads are completely unaffected by a full GC.
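For anyone who wants to repeat the before/after count from inside the JVM rather than from a thread dump, here is a minimal sketch. It assumes the leaked threads are named with the `sdk-ScheduledExecutor` prefix followed by numeric pool/thread suffixes (that naming shape is my assumption from the dumps, and `ThreadPrefixCount` is a hypothetical helper, not part of Iceberg or the AWS SDK).

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadPrefixCount {
    // Strip trailing "-<digits>" groups so that, for example,
    // "sdk-ScheduledExecutor-3-1" is grouped under "sdk-ScheduledExecutor".
    static String prefixOf(String threadName) {
        return threadName.replaceAll("[-\\d]+$", "");
    }

    // Count live threads in this JVM, grouped by name prefix.
    static Map<String, Integer> countByPrefix() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(prefixOf(t.getName()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByPrefix().forEach((prefix, n) -> System.out.println(n + "\t" + prefix));
    }
}
```

Calling `countByPrefix()` once before and once after `jcmd <pid> GC.run` (from code running inside the server process) should give the same kind of comparison as the numbers above.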
---
Heap histogram (jmap -histo:live)
Key counts after forced GC:
- S3FileIO — 11,073 instances
- GlueTableOperations — 2,874 instances
- PrefixedS3Client / DefaultS3Client — ~4,439 instances
- Caffeine SoftValueReference — 4,439
The 2,874 GlueTableOperations are the currently cached tables, held by
soft references as expected. But there are 11,073 S3FileIO instances alive,
~4x the number of active cache entries. The extra ~8,200 have no surviving
parent TableOperations: those were evicted and their soft refs cleared. Yet
the S3FileIO objects remain strongly reachable via
Thread → PeriodicScheduledTask → S3AsyncClient → S3FileIO. Since live
threads are GC roots, a full GC cannot collect them regardless of generation.
This rules out the old-gen/weak-reference theory: these objects are not
unreachable and awaiting collection, they are anchored by their own threads.
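The anchoring effect can be reproduced in isolation. The sketch below is not Iceberg or AWS SDK code: `LeakyClient` is a hypothetical stand-in for any object that starts a scheduled executor and is never closed. Its periodic task captures the object, so the executor's worker thread keeps it strongly reachable after every other reference is dropped.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadAnchorDemo {
    // Hypothetical stand-in for a client that owns a scheduler and is never closed.
    static final class LeakyClient {
        final ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);

        LeakyClient() {
            // Scheduling starts a worker thread; the queued task captures `this`.
            scheduler.scheduleAtFixedRate(this::tick, 1, 1, TimeUnit.HOURS);
        }

        void tick() { }
    }

    // Returns true if the client is still reachable after the caller drops its
    // only strong reference and requests a GC.
    static boolean survivesGcWithoutClose() {
        WeakReference<LeakyClient> ref = new WeakReference<>(new LeakyClient());
        System.gc();
        LeakyClient survivor = ref.get();
        if (survivor != null) {
            // Clean up so the demo's non-daemon worker thread does not keep the JVM alive.
            survivor.scheduler.shutdownNow();
        }
        return survivor != null;
    }

    public static void main(String[] args) {
        System.out.println("still reachable after GC: " + survivesGcWithoutClose());
    }
}
```

The weak reference is not cleared because the path worker thread → executor → queued task → client is a strong one. Only shutting the executor down removes that path, which matches the Thread → PeriodicScheduledTask → S3AsyncClient → S3FileIO chain seen in the heap.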
Side observation: 7.1M BaseSnapshot instances in the heap (~1.2GB, ~2,490
per table). This appears to be pushing the cache against its
max-total-bytes=2GB limit and increasing eviction frequency, which worsens
the leak rate.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
