fbertsch commented on issue #8179:
URL: https://github.com/apache/iceberg/issues/8179#issuecomment-2840609092

   Raising this again, to discuss why it's useful.
   
   We’ve started using DBT with Spark + Iceberg. When using DBT, you write a SQL 
`SELECT` statement that DBT uses to either create the table or insert data into 
it.
   
   - If the table you’re writing to doesn’t exist, it creates it by building + 
executing a `CREATE TABLE AS SELECT`
   - If the table does exist, it uses the select to build + execute an `INSERT 
OVERWRITE` (or `MERGE INTO`, depending on your setup)
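   
   As a rough illustration, the two paths might produce statements shaped like 
the ones below. The table `analytics.events`, the source `raw.events`, and the 
columns are hypothetical, and the exact SQL that DBT generates depends on the 
adapter and materialization settings.
   
   ```sql
   -- First run: the target table does not exist yet, so it is created from the SELECT.
   CREATE TABLE analytics.events
   USING iceberg
   AS SELECT event_id, event_ts, user_id
   FROM raw.events;

   -- Later runs: the table already exists, so the same SELECT feeds an overwrite.
   INSERT OVERWRITE analytics.events
   SELECT event_id, event_ts, user_id
   FROM raw.events;
   ```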
   
   Now the problem I’m running into is: I’d like to set a sort order on my 
tables, but I don’t get a chance to do so before the `CREATE TABLE AS SELECT` 
runs. DBT provides a pre-hook and a post-hook, which run before and after your 
SQL statement executes, but when the table doesn’t exist:
   
   - The pre-hook runs before the table even exists, so an `ALTER TABLE ... 
WRITE ORDERED BY` won’t work
   - The post-hook runs after the data has been loaded, so that first run (the 
`CREATE TABLE AS SELECT`) won’t adhere to the write order
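   
   For reference, a minimal sketch of the statement in question, using 
Iceberg's Spark SQL extensions. The table `analytics.events` and the sort 
columns are hypothetical. It needs the table to already exist, and it only 
affects writes that happen after it runs, which is why neither hook helps on 
the first run.
   
   ```sql
   -- Requires the Iceberg Spark SQL extensions.
   -- As a pre-hook on the first run this fails, because analytics.events does not exist yet.
   -- As a post-hook it succeeds, but the initial CTAS data has already been written unsorted.
   ALTER TABLE analytics.events WRITE ORDERED BY user_id, event_ts;
   ```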
   
   I’d like to avoid putting an `ORDER BY` clause in every query, and instead 
would like to rely on write ordering as set by the table.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

