koderka2020 opened a new issue, #11426: URL: https://github.com/apache/iceberg/issues/11426

Hi Iceberg team,

I've been searching for a while for information on the maximum insert (commit) rate per second or per minute on an Iceberg table. We've been ingesting large amounts of data (in tandem with Trino and Nessie) by running AWS Glue jobs concurrently. These jobs are failing at a pretty high rate (`SystemExit: ERROR: An error occurred while calling o213.append.`), even with increased "retry" table property settings (25 retries, 1000 ms min wait, 1500 ms max wait); the properties we set are shown below.

If the parallelism is too high (1000-2500 concurrently running jobs trying to write a total of about 100k rows / 500 MB to one Iceberg table within 30 minutes), would you recommend some way around it? I was thinking of creating a staging table in Postgres, or of creating multiple staging tables in Iceberg to distribute the load and then, after migrating the data into the main Iceberg table at the end, just dropping the staging tables (rough sketch after the properties below). What are your thoughts on that?
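For reference, the retry settings mentioned above correspond to Iceberg's commit-retry table properties. A minimal sketch of how we set them from a Glue/PySpark job follows; the catalog and table names (`glue_catalog.db.events`) are hypothetical placeholders:

```python
# Sketch: raising Iceberg's commit-retry table properties from PySpark.
# Table name is a placeholder; values match the settings described above.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    ALTER TABLE glue_catalog.db.events SET TBLPROPERTIES (
        'commit.retry.num-retries' = '25',    -- attempts before the commit fails
        'commit.retry.min-wait-ms' = '1000',  -- initial backoff between attempts
        'commit.retry.max-wait-ms' = '1500'   -- cap on the backoff
    )
""")
```

Even with these values, most of the concurrent appends still fail once the job count gets into the thousands.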
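And a rough sketch of the multi-staging-table idea, assuming each Glue job is routed to one of N pre-created Iceberg staging tables (same schema as the main table) and a single consolidation job runs at the end. All names (`glue_catalog`, `db.events`, `NUM_STAGING`, `job_index`) are hypothetical:

```python
# Sketch: spread concurrent appends across N staging tables to reduce commit
# contention on the main table, then consolidate with a single writer.
# Staging tables are assumed to already exist with the main table's schema.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
NUM_STAGING = 16  # number of staging tables; placeholder value

def write_batch(df, job_index):
    # Route this job's rows to one staging table based on its index.
    staging = f"glue_catalog.db.events_staging_{job_index % NUM_STAGING}"
    df.writeTo(staging).append()  # the call that currently fails on the main table

def consolidate():
    # Single writer: move all staged rows into the main table, then clean up.
    for i in range(NUM_STAGING):
        staging = f"glue_catalog.db.events_staging_{i}"
        spark.table(staging).writeTo("glue_catalog.db.events").append()
        spark.sql(f"DROP TABLE IF EXISTS {staging}")
```

Would this be a reasonable pattern, or is there a better-supported way to handle this level of write concurrency?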
Hi Iceberg team, I've been searching for some time information on what is the max insert rate per sec or per min on iceberg table. We've been ingesting some large amounts of data (in tandem with trino and nessie) by concurrently running aws glue jobs. These jobs are failing at pretty high rate ("SystemExit: ERROR: An error occurred while calling o213.append.") even with the increased "retry" table property settings (25 retry, min 1000 ms wait, max 1500ms wait). If the parallelism is too high (1000-2500 concurrently running jobs trying to write to iceberg total of about 100k rows /500MB within 30mins) would you recommend some way around it? I was thinking creating staging table in postgress or creating multiple staging tables in iceberg to distribute the load and later after migrating the data to the main iceberg table at the end just dropping the staging tables. What are your thoughts on that? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org