pvary commented on issue #12761: URL: https://github.com/apache/iceberg/issues/12761#issuecomment-2796396531
> By the way, https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table-write-ordered-by Is this effective?
>
> 1. After adding this config, will all write operations to this table become ordered?

This depends on the writer. AFAIK Spark uses this config, but Flink just ignores it (because streaming writes have different constraints).

> 3. How much does this write-time sorting action increase the cost of the write operation?

I can't help you here 😢 as I did not run tests with Spark. That said, sorting is never a trivial task, and it definitely involves at least one shuffle stage, which can cause performance degradation. You can check your Spark job's plan for this.
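
For checking the plan, something along these lines might help. It is only a rough sketch, not something I have run against a real cluster: the table names are hypothetical, the helper functions are made up for illustration, and it assumes a SparkSession that already has the Iceberg SQL extensions and catalog configured. Whether the sort and shuffle actually appear in the `EXPLAIN` output may depend on your Spark and Iceberg versions.

```python
from typing import Dict, List


def write_ordered_by_sql(table: str, sort_columns: List[str]) -> str:
    """Build the Iceberg Spark DDL that sets a table's write order."""
    if not sort_columns:
        raise ValueError("at least one sort column is required")
    return f"ALTER TABLE {table} WRITE ORDERED BY {', '.join(sort_columns)}"


def explain_insert_sql(target: str, source: str) -> str:
    """Build an EXPLAIN statement for an INSERT into the target table."""
    return f"EXPLAIN FORMATTED INSERT INTO {target} SELECT * FROM {source}"


def plan_operators(plan_text: str) -> Dict[str, bool]:
    """Rough substring check for shuffle and sort operators in a plan.

    Spark prints shuffles as "Exchange" and sorts as "Sort". This is a
    crude check: an unrelated node such as SortMergeJoin also matches "Sort".
    """
    return {
        "shuffle": "Exchange" in plan_text,
        "sort": "Sort" in plan_text,
    }


def inspect_write_plan(spark, target: str, source: str) -> Dict[str, bool]:
    """Run the EXPLAIN through an existing SparkSession (not created here)."""
    # EXPLAIN returns a single row whose first column holds the plan text.
    plan_text = spark.sql(explain_insert_sql(target, source)).collect()[0][0]
    return plan_operators(plan_text)
```

With a configured session, the usage would be roughly `spark.sql(write_ordered_by_sql("db.events", ["event_date", "user_id"]))` followed by `inspect_write_plan(spark, "db.events", "db.staging_events")`, where `db.events` and `db.staging_events` are placeholder names. Comparing the result before and after setting the write order gives a first hint of the extra work; the Spark UI is still the better place to see what the shuffle actually costs.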