boaz-gold commented on issue #15898:
URL: https://github.com/apache/iceberg/issues/15898#issuecomment-4325582294

   Thanks @anoopj 
   
   Here are the results from both diagnostics.

     Thrift server uptime at capture: ~3.5 hours, with 5,546 leaked sdk-ScheduledExecutor threads already accumulated (~1,580/hour).

     ---

     Forced full GC (jcmd <pid> GC.run)
                                       
     - sdk-ScheduledExecutor threads before: 5,546
     - sdk-ScheduledExecutor threads after (~10s): 5,744
     - Total threads before/after: 6,102 → 6,055

     The sdk-ScheduledExecutor thread count did not decrease (it grew by 198 while the total dropped by only 47), so the leaked executor threads are completely unaffected by a full GC.
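
For repeat measurements, the same before/after count can be taken from inside a JVM. This is a minimal sketch, assuming it runs in the same process as the threads being counted (for an external process, a thread dump from `jstack` or `jcmd <pid> Thread.print` would be the source instead). The class name `ThreadPrefixCount` and its `countThreads` helper are hypothetical, and `System.gc()` is only a request that the JVM may ignore.

```java
class ThreadPrefixCount {
    /** Counts live threads in this JVM whose name starts with the given prefix. */
    static long countThreads(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }

    public static void main(String[] args) throws InterruptedException {
        String prefix = args.length > 0 ? args[0] : "sdk-ScheduledExecutor";
        long before = countThreads(prefix);
        System.gc();           // a hint only; the JVM may ignore it
        Thread.sleep(10_000);  // same ~10 s settle time as in the measurement above
        long after = countThreads(prefix);
        System.out.printf("%s threads before=%d after=%d%n", prefix, before, after);
    }
}
```

If the count stays flat or grows across the GC, the threads are not waiting on collection; something is keeping them alive.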
                      
                     
     ---

     Heap histogram (jmap -histo:live)
                      
     Key counts after forced GC:

     - S3FileIO: 11,073 instances
     - GlueTableOperations: 2,874 instances
     - PrefixedS3Client / DefaultS3Client: ~4,439 instances
     - Caffeine SoftValueReference: 4,439
                      
     The 2,874 GlueTableOperations are the currently cached tables, held by soft references as expected. But there are 11,073 S3FileIO instances alive, ~4x as many as there are active cache entries. The extra ~8,200 have no surviving parent TableOperations: those were evicted and their soft references cleared. Yet the S3FileIO objects remain strongly reachable via Thread → PeriodicScheduledTask → S3AsyncClient → S3FileIO. Since live threads are GC roots, a full GC cannot collect them regardless of generation.

     This rules out the old-gen/weak-reference theory: these objects are not unreachable and awaiting collection; they are anchored by their own threads.
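
The reachability argument can be reproduced in isolation. This is a minimal sketch that uses a plain `ScheduledThreadPoolExecutor` in place of the SDK's executor. `FakeClient`, `createAndDrop`, and `clearedAfterGc` are hypothetical names, not Iceberg or AWS SDK APIs. It also assumes a JVM that honors `System.gc()` (the HotSpot default, i.e. without `-XX:+DisableExplicitGC`).

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ThreadAnchorsObject {
    /** Hypothetical stand-in for a client that owns a scheduler; not an Iceberg or AWS SDK class. */
    static final class FakeClient {
        private final ScheduledExecutorService scheduler = new ScheduledThreadPoolExecutor(1);
        private final AtomicBoolean stop;

        FakeClient(AtomicBoolean stop) {
            this.stop = stop;
        }

        void start() {
            // The method reference captures `this`, so the scheduler's task queue
            // (and therefore its live worker thread) keeps this client strongly reachable.
            scheduler.scheduleAtFixedRate(this::tick, 0, 50, TimeUnit.MILLISECONDS);
        }

        private void tick() {
            if (stop.get()) {
                scheduler.shutdown(); // cancels the periodic task and lets the worker thread exit
            }
        }
    }

    /** Creates a client, drops every caller-side strong reference, and returns a weak one. */
    static WeakReference<FakeClient> createAndDrop(AtomicBoolean stop) {
        FakeClient client = new FakeClient(stop);
        client.start();
        return new WeakReference<>(client);
    }

    /** Requests GC repeatedly (up to ~2 s) and reports whether the referent was cleared. */
    static boolean clearedAfterGc(WeakReference<?> ref) throws InterruptedException {
        for (int i = 0; i < 20 && ref.get() != null; i++) {
            System.gc();
            Thread.sleep(100);
        }
        return ref.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean stop = new AtomicBoolean(false);
        WeakReference<FakeClient> ref = createAndDrop(stop);
        System.out.println("cleared while thread alive: " + clearedAfterGc(ref));
        stop.set(true); // the next tick shuts the scheduler down
        System.out.println("cleared after shutdown:     " + clearedAfterGc(ref));
    }
}
```

While the worker thread is alive, the weak reference cannot be cleared because the client is still strongly reachable from that thread. Once the scheduler is shut down and the thread exits, a later GC is free to collect it, which suggests the remedy lies in shutting the executors down rather than in GC tuning.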
   
     Side observation: 7.1M BaseSnapshot instances in the heap (~1.2 GB, ~2,490 per table). This appears to be pushing the cache against its max-total-bytes=2GB limit and increasing eviction frequency, which worsens the leak rate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

