lime-squeeze opened a new issue, #8881: URL: https://github.com/apache/iceberg/issues/8881
### Query engine

Spark 3.3 within AWS Glue 4.0 and using iceberg-spark-runtime-3.3_2.12-1.4.0.jar

### Question

I am attempting to test branching via SparkSQL within AWS Glue. I am able to successfully create a branch and modify it, but when I attempt to merge the branch into main via the fast_forward command nothing happens. The query completes successfully without error, but when I query the data on the main branch the data is unchanged. Below is what I am executing.

```
# DROP
spark.sql(f"""
DROP TABLE IF EXISTS iceberg_catalog.db.employee
""")

spark.sql(f"""
DROP TABLE IF EXISTS iceberg_catalog.db.employee_stg
""")

# CREATE
spark.sql("""
CREATE TABLE iceberg_catalog.db.employee (
    employee_id int,
    name string,
    salary double,
    eff_start_dt date,
    eff_end_dt date,
    etl_state string
)
USING iceberg
LOCATION 's3://<bucket/db/tbl_nm>'
TBLPROPERTIES (
    'format-version'='2',
    'write.format.default'='parquet',
    'write.target-file-size-bytes'='536870912',
    'write.parquet.compression-codec'='snappy',
    'history.expire.max-snapshot-age-ms'='86400000',
    'write.wap.enabled'='true',
    'write.object-storage.enabled'=true,
    'external.table.purge'='true'
)
""")

spark.sql("""
CREATE TABLE iceberg_catalog.db.employee_stg (
    employee_id int,
    name string,
    salary double,
    load_date date
)
USING iceberg
LOCATION 's3://<bucket/db/stg_tbl_nm>'
TBLPROPERTIES (
    'format-version'='2',
    'write.format.default'='parquet',
    'write.target-file-size-bytes'='536870912',
    'write.parquet.compression-codec'='snappy',
    'history.expire.max-snapshot-age-ms'='86400000',
    'write.wap.enabled'='true',
    'write.object-storage.enabled'=true,
    'external.table.purge'='true'
)
""")

# INSERT INITIAL VALUES
spark.sql("""
insert into iceberg_catalog.db.employee values
    (101, 'tom', 9000, date('2022-01-01'), date('9999-12-31'), 'current'),
    (102, 'sara', 5000, date('2022-02-01'), date('9999-12-31'),'current'),
    (103, 'bob', 9000, date('2022-01-01'), date('9999-12-31'),'current'),
    (104, 'mike', 5000, date('2022-11-01'),
        date('9999-12-31'),'current')
""")

spark.sql("""
insert into iceberg_catalog.db.employee_stg values
    (102, 'sara', 7000, date('2023-09-27')),
    (104, 'mike', 7000, date('2023-09-27'))
""")

# APPLY UPDATES ON BRANCH
spark.sql("""
ALTER TABLE iceberg_catalog.db.employee DROP BRANCH IF EXISTS `validation-branch`
""")

spark.sql("""
ALTER TABLE iceberg_catalog.db.employee CREATE BRANCH `validation-branch`
RETAIN 7 DAYS
WITH SNAPSHOT RETENTION 2 SNAPSHOTS
""")

spark.sql("""
SET spark.wap.branch = 'validation-branch';
""")

spark.sql(f"""
MERGE INTO iceberg_catalog.db.employee as T
USING iceberg_catalog.db.employee_stg as S
ON T.employee_id = S.employee_id
WHEN MATCHED and etl_state = 'current' THEN
    UPDATE SET etl_state='history', eff_end_dt=S.load_date
""")

spark.sql(f"""
INSERT INTO iceberg_catalog.db.employee
select employee_id, name, salary, load_date, date('9999-12-31'), 'current'
from iceberg_catalog.db.employee_stg
""")

# MERGE BRANCHES
spark.sql("""CALL iceberg_catalog.system.fast_forward('iceberg_catalog.datransforms.employee_iceberg_demo', 'main', 'validation-branch')""")
```

While still on the validation-branch, querying the data yields the following result, as expected:

```
+-----------+-------+------+------------+----------+---------+
|employee_id|name   |salary|eff_start_dt|eff_end_dt|etl_state|
+-----------+-------+------+------------+----------+---------+
|102        |sara   |5000.0|2022-02-01  |2023-09-27|history  |
|104        |mike   |5000.0|2022-11-01  |2023-09-27|history  |
|101        |tom    |9000.0|2022-01-01  |9999-12-31|current  |
|103        |bob    |9000.0|2022-01-01  |9999-12-31|current  |
|102        |sara   |7000.0|2023-09-27  |9999-12-31|current  |
|104        |mike   |7000.0|2023-09-27  |9999-12-31|current  |
+-----------+-------+------+------------+----------+---------+
```

If I switch over to the main branch, what I see is ONLY the initial inserts:

```
+-----------+-------+------+------------+----------+---------+
|employee_id|name   |salary|eff_start_dt|eff_end_dt|etl_state|
+-----------+-------+------+------------+----------+---------+
|101        |tom    |9000.0|2022-01-01  |9999-12-31|current  |
|102        |sara   |5000.0|2022-02-01  |9999-12-31|current  |
|103        |bob    |9000.0|2022-01-01  |9999-12-31|current  |
|104        |mike   |5000.0|2022-11-01  |9999-12-31|current  |
+-----------+-------+------+------------+----------+---------+
```

Is there anything here that I am missing or doing wrong?
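
Note that the `fast_forward` call in the script names `iceberg_catalog.datransforms.employee_iceberg_demo`, whereas the branch was created and written on `iceberg_catalog.db.employee`. Whether that mismatch exists in the real job or only in the pasted snippet is not known. The following is a minimal sketch, assuming the two are meant to be the same table, that derives both the merge call and a verification query from a single identifier. The helper names (`split_catalog`, `fast_forward_sql`, `refs_sql`) are hypothetical and not part of Iceberg; only the generated SQL mirrors the statements in the question and Iceberg's `refs` metadata table.

```python
# Hypothetical helpers: build every statement from one table identifier so the
# fast_forward call cannot drift from the table the branch was created on.

TABLE = "iceberg_catalog.db.employee"  # table used in the CREATE BRANCH statement
BRANCH = "validation-branch"           # branch that received the MERGE and INSERT


def split_catalog(table):
    """Split 'catalog.db.table' into ('catalog', 'db.table')."""
    catalog, _, rest = table.partition(".")
    if not catalog or not rest:
        raise ValueError(f"expected a catalog-qualified identifier, got {table!r}")
    return catalog, rest


def fast_forward_sql(table, source_branch, target_branch="main"):
    """Build the CALL that fast-forwards target_branch to source_branch."""
    catalog, _ = split_catalog(table)
    return (
        f"CALL {catalog}.system.fast_forward("
        f"'{table}', '{target_branch}', '{source_branch}')"
    )


def refs_sql(table):
    """Build a query over the refs metadata table (one row per branch or tag)."""
    return f"SELECT name, type, snapshot_id FROM {table}.refs"


if __name__ == "__main__":
    print(fast_forward_sql(TABLE, BRANCH))
    print(refs_sql(TABLE))
```

With an active session, the strings would be passed to `spark.sql(...)` in the same way as the statements in the script. If `main` and `validation-branch` report the same `snapshot_id` in the `refs` output after the call, the fast-forward took effect; if they differ, the call did not move `main` on that table.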