fbertsch commented on issue #8179: URL: https://github.com/apache/iceberg/issues/8179#issuecomment-2840609092
Raising this again to discuss why it's useful.

We’ve started using DBT with Spark + Iceberg. When using DBT, you write a SQL `SELECT` statement that DBT uses to either create the table or insert data into it:

- If the table you’re writing to doesn’t exist, DBT creates it by building and executing a `CREATE TABLE AS SELECT`.
- If the table does exist, DBT uses the select to build and execute an `INSERT OVERWRITE` (or `MERGE INTO`, depending on your setup).

The problem I’m running into: I’d like to set sort ordering on my tables, but I don’t get a chance to before the `CREATE TABLE AS SELECT` runs. DBT provides a pre-hook and a post-hook, which run before and after your SQL statement executes, but when the table doesn’t exist yet:

- The pre-hook runs before the table exists, so an `ALTER TABLE SET WRITE ORDER` won’t work.
- The post-hook runs after the data has been loaded, so that first run (the `CREATE TABLE AS SELECT`) won’t adhere to the write order.

I’d like to avoid putting an `ORDER BY` clause in every query and instead rely on the write ordering set on the table.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org