snfddl opened a new issue, #10181: URL: https://github.com/apache/iceberg/issues/10181
### Query engine

Spark 3.2

### Question

1. Create a partitioned table

```
create table temp.partition_table (
  dt string
  ,contents string
)
partitioned by spec (dt)
stored as iceberg;
```

2. Insert data with a single partition key value into the partitioned table

```
insert into temp.partition_table
select
  dt
  ,text as contents
from temp.dataset
where dt = '20240418'
```

3. Physical plan

```
AppendData org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy$$Lambda$3485/49797107@3d2cac62, IcebergWrite(table=spark_catalog.temp.partition_table, format=PARQUET)
+- AdaptiveSparkPlan isFinalPlan=false
   +- Sort [dt#107 ASC NULLS FIRST], false, 0 <<<<<<<<<
      +- Exchange hashpartitioning(dt#107, 200), REPARTITION_BY_NUM, [plan_id=175]
         +- Project [dt#107, ansi_cast(contents#106 as string) AS ctnt#110]
            +- FileScan parquet temp.dataset
```

In this case the partition key column `dt` has only one value, so I don't think the sort on the partition key right before the write (the line marked `<<<<<<<<<` in the plan) is needed. However, a sort on the partition key appears to be added whenever data is inserted into a partitioned Iceberg table.

Is there any way to avoid this?

In Impala, unnecessary sorting can be avoided with the `/*+noclustered*/` hint when performing the same kind of insert into a regular Parquet table, so I expected an equivalent option to exist here, but I couldn't find one.

I also tried a static partition insert, but the plan was the same: `insert into temp.partition_table partition(dt='20240406') select contents from temp.dataset where dt='20240406'`