koderka2020 opened a new issue, #11426:
URL: https://github.com/apache/iceberg/issues/11426

   Hi Iceberg team,
   I've been searching for some time for information on the maximum insert
rate per second or per minute on an Iceberg table. We've been ingesting large
amounts of data (in tandem with Trino and Nessie) via concurrently running AWS
Glue jobs. These jobs are failing at a pretty high rate ("SystemExit: ERROR: An
error occurred while calling o213.append.") even with increased commit-retry
table properties (25 retries, 1000 ms minimum wait, 1500 ms maximum wait).
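
   For reference, a minimal sketch of those retry settings as table properties
(the property keys are Iceberg's documented commit-retry settings; the catalog,
database, and table names here are placeholders):

```python
# Assumes a Spark session with an Iceberg catalog configured as "glue_catalog";
# the database/table names below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    ALTER TABLE glue_catalog.db.events SET TBLPROPERTIES (
        'commit.retry.num-retries' = '25',
        'commit.retry.min-wait-ms' = '1000',
        'commit.retry.max-wait-ms' = '1500'
    )
""")
```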
   If the parallelism is too high (1,000-2,500 concurrently running jobs
trying to write a total of about 100k rows / 500 MB to the Iceberg table within
30 minutes), is there a recommended way around it? I was thinking of creating a
staging table in Postgres, or creating multiple staging tables in Iceberg to
distribute the load, then migrating the data to the main Iceberg table at the
end and dropping the staging tables (roughly as in the sketch below). What are
your thoughts on that?
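
   A rough sketch of the staging-table idea, to make the question concrete
(PySpark; the catalog, database, and table names are hypothetical, and each
Glue job is assumed to own one staging table so concurrent jobs never commit
to the same table):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Each ingest job writes its batch to its own staging table, so no two jobs
# commit to the same Iceberg table concurrently. "df" stands in for the job's
# real data; the job id would come from the Glue run in practice.
job_id = "run_0001"
df = spark.createDataFrame([(1, "example")], ["id", "payload"])
df.writeTo(f"glue_catalog.db.stg_events_{job_id}").createOrReplace()

# A single consolidation job later appends each staging table into the main
# table (commits are serialized here, so there is no contention) and then
# drops the staging tables.
staging = [
    row.tableName
    for row in spark.sql("SHOW TABLES IN glue_catalog.db").collect()
    if row.tableName.startswith("stg_events_")
]
for name in staging:
    spark.table(f"glue_catalog.db.{name}").writeTo("glue_catalog.db.events").append()
    spark.sql(f"DROP TABLE glue_catalog.db.{name}")
```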

