baotran306 opened a new issue, #12738: URL: https://github.com/apache/iceberg/issues/12738
### Apache Iceberg version 1.7.2 ### Query engine Spark ### Please describe the bug 🐞 ### Describe bug When I try this command`CREATE OR REPLACE TABLE AS SELECT`, it works well for several cases including: - Replace table with structure only (I try to limit 0 in my select statement) - Replace table from non-partition table to partition table - Replace table from identity partition column to another identity partition column However, there's one case I've tested that was unsuccessful: attempting to replace a table with a transform partitiontion to non-partitioned. ### Step reproduce Firstly, I created an empty non-partitioned table using the CREATE TABLE command, followed by inserting test data. Next, I created another partitioned table with a different structure using the same command as in the first step.  Then, I tried to replace the existing partitioned table using the CREATE OR REPLACE TABLE AS SELECT command along with my non-partitioned table. My expectation was that the table would be completely replaced, including both the schema and data, with the non-partitioned table successfully. However, it didn't work and raised `org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 24.0 failed 1 times, most recent failure: Lost task 0.0 in stage 24.0 (TID 50) (10.16.14.4 executor driver): org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: id_bucket: bucket[16](1)`  Okay, I tried one more time. Before replacing the table, I dropped all partition columns using ALTER TABLE DROP PARTITION FIELD. However, it still resulted in the same errors. I don't understand why it is still checking for partition fields when I had already dropped them before replacing the table.(I've already checked metadata file, and there's no partition spec now as I dropped before)  ### My code: Spark + Rest Catalog(catalog_name = 'lakekeeper') ```python # Create non-partition table spark.sql(""" CREATE TABLE IF NOT EXISTS lakekeeper.risk_mon_out.my_data_non_partitioned( vendor_id int, product string, price double, event_time timestamp ) """) spark.sql(""" INSERT INTO lakekeeper.risk_mon_out.my_data_non_partitioned VALUES (1, "Car", 2.34, TIMESTAMP '2025-03-01 11:03:01'), (2, "Bike", 5.43, TIMESTAMP '2025-03-01 11:19:21'), (3, "House", 100.4, TIMESTAMP '2025-03-01 12:43:17'), (4, "Plane", 124.4, TIMESTAMP '2025-03-01 13:05:29') """) spark.sql("SELECT * FROM lakekeeper.risk_mon_out.my_data_non_partitioned").show() # Create partition table with transformation spark.sql(""" CREATE TABLE lakekeeper.risk_mon_out.data_sample_test( id int, category string, event_time timestamp ) PARTITIONED BY ( bucket(16, id), category ) """) spark.sql(""" INSERT INTO lakekeeper.risk_mon_out.data_sample_test VALUES (5, "HOVal", TIMESTAMP '2025-03-01 11:03:09'), (6, "NextMi", TIMESTAMP '2025-03-01 12:19:29'), (7, "OGel", TIMESTAMP '2025-01-09 06:43:17'), (8, "AAAA", TIMESTAMP '2025-01-09 06:05:29') """) spark.sql("SELECT * FROM lakekeeper.risk_mon_out.data_sample_test").show() # Replace partition table to non-partition table try: spark.sql(""" CREATE OR REPLACE TABLE lakekeeper.risk_mon_out.data_sample_test USING iceberg AS SELECT * FROM lakekeeper.risk_mon_out.my_data_non_partitioned """) except Exception as e: print(f"Error 111: {e}") try: spark.sql("ALTER TABLE lakekeeper.risk_mon_out.data_sample_test DROP PARTITION FIELD id_bucket") spark.sql("ALTER TABLE lakekeeper.risk_mon_out.data_sample_test DROP PARTITION FIELD category") print("Drop done") spark.sql(""" CREATE OR REPLACE TABLE lakekeeper.risk_mon_out.data_sample_test USING iceberg AS SELECT * FROM lakekeeper.risk_mon_out.my_data_non_partitioned """) except Exception as e: print("=====") print(f"Error 222: {e}") ``` ### Expected Behavior Table will be replaced entirely with both schema and data. ### Environment Apache Iceberg Version: 1.7.2 Execution Engines Tested: Spark 3.5.2 Catalog Type: REST catalog Storage: S3-compatible ### Willingness to contribute - [ ] I can contribute a fix for this bug independently - [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community - [x] I cannot contribute a fix for this bug at this time -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org