ForestYang20 opened a new issue, #11017:
URL: https://github.com/apache/iceberg/issues/11017

   ### Query engine
   
   Iceberg v1.0.0 on Spark v3.3.0 (Glue v4.0)
   
   ### Question
   
   On an existing Iceberg table, we ran the following procedure for the first time:
   ```py
   sql = f"""
       CALL spark_catalog.system.expire_snapshots(
           table => '{database_name}.{table}',
           older_than => TIMESTAMP '<one week ago>',
           max_concurrent_deletes => 20,
           stream_results => true
       )
       """
   ```
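
For anyone reproducing this, the `<one week ago>` placeholder above could be filled in along these lines. This is only a minimal sketch: the helper name `build_expire_snapshots_sql` and its `now`/`days` parameters are hypothetical, and it assumes the Spark session time zone is UTC, since a `TIMESTAMP` literal without a zone is interpreted in the session time zone.

```python
from datetime import datetime, timedelta, timezone


def build_expire_snapshots_sql(database_name, table, now=None, days=7):
    """Build the expire_snapshots CALL with a concrete cutoff (hypothetical helper)."""
    # Assumption: the Spark session time zone is UTC, so a UTC cutoff is formatted as-is.
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    cutoff_literal = cutoff.strftime("%Y-%m-%d %H:%M:%S")
    # Same arguments as the original call; only the older_than value is computed.
    return f"""
    CALL spark_catalog.system.expire_snapshots(
        table => '{database_name}.{table}',
        older_than => TIMESTAMP '{cutoff_literal}',
        max_concurrent_deletes => 20,
        stream_results => true
    )
    """
```

The returned string would then be passed to `spark.sql(...)` on an existing Spark session, as with the `sql` variable above.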
   This results in some tasks, most likely related to [this union](https://github.com/apache/iceberg/blob/5e7cfffdf57c40398d0e52bd7610271faa42125f/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java#L225-L230), being executed on a single 16 GB executor.
   <img width="1556" alt="Screenshot 2024-08-27 at 2 24 31 PM" src="https://github.com/user-attachments/assets/c2f0bdd3-aebb-4a4e-8b12-f6d061502edf">
   <img width="2106" alt="Screenshot 2024-08-27 at 2 20 18 PM" src="https://github.com/user-attachments/assets/cbc30ea8-3169-4321-ae73-20acd90aceda">
   After rewriting manifest and data files and removing orphan files, from our 
second maintenance workflow onward we saw a dramatic increase in the compute 
required to execute the same `expire_snapshots` task, which is now often 
automatically retried or fails with OOM. A large number of executors (50+) is 
now used to perform this task. Below are the summary statistics for these 
executors: 
   <img width="2457" alt="Screenshot 2024-08-27 at 3 05 57 PM" src="https://github.com/user-attachments/assets/7020dec8-3769-420d-912f-94d530f6efff">
   Is there some form of configuration that could have caused this to occur, or 
some form of config that we can change to avoid the sudden increase in executor 
usage? Any pointers would be appreciated.
   
   It also appears that the executions where we did not encounter OOM have 
skipped stage 513 in the following image, which is part of the job circled in 
red above. 
   <img width="928" alt="Screenshot 2024-08-27 at 2 54 53 PM" src="https://github.com/user-attachments/assets/2ac2d705-50d9-4b48-a7ac-c1def61cd63b">
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

