oneonestar opened a new issue, #9723: URL: https://github.com/apache/iceberg/issues/9723
### Query engine

Spark

### Question

In the `metadata_log_entries` table, `latest_snapshot_id`, `latest_schema_id`, and `latest_sequence_number` return null in some cases. Values that were previously populated also became null after `CREATE OR REPLACE`.

The current implementation relies on the `snapshot-log` field in the metadata file, and `snapshot-log` is reset by the `CREATE OR REPLACE` statement.

Is this intended behavior? Could someone provide a precise definition of `latest_snapshot_id`, `latest_schema_id`, and `latest_sequence_number`?

```
spark-sql> create table test.t1 (c1 integer);
spark-sql> alter table test.t1 add columns (c2 string);
spark-sql> SELECT * FROM test.t1.metadata_log_entries;
2024-01-22 18:51:41.331 hdfs://hadoop/metadata/00000-d1ad9769-3bf3-497d-a86e-20f71d2700f7.metadata.json NULL NULL NULL
2024-01-22 18:51:41.593 hdfs://hadoop/metadata/00001-9eb3275c-f53d-4e24-bc23-fdcadae28d92.metadata.json NULL NULL NULL

spark-sql> insert into test.t1 values (1, 'a');
spark-sql> SELECT * FROM test.t1.metadata_log_entries;
2024-01-22 18:51:41.331 hdfs://hadoop/metadata/00000-d1ad9769-3bf3-497d-a86e-20f71d2700f7.metadata.json NULL NULL NULL
2024-01-22 18:51:41.593 hdfs://hadoop/metadata/00001-9eb3275c-f53d-4e24-bc23-fdcadae28d92.metadata.json NULL NULL NULL
2024-01-22 18:51:47.538 hdfs://hadoop/metadata/00002-ec3f0aa6-7dae-4c45-90ad-13839eefbd54.metadata.json 1836642618692808023 1 0

spark-sql> delete from test.t1 where c1 = 1;
spark-sql> SELECT * FROM test.t1.metadata_log_entries;
2024-01-22 18:51:41.331 hdfs://hadoop/metadata/00000-d1ad9769-3bf3-497d-a86e-20f71d2700f7.metadata.json NULL NULL NULL
2024-01-22 18:51:41.593 hdfs://hadoop/metadata/00001-9eb3275c-f53d-4e24-bc23-fdcadae28d92.metadata.json NULL NULL NULL
2024-01-22 18:51:47.538 hdfs://hadoop/metadata/00002-ec3f0aa6-7dae-4c45-90ad-13839eefbd54.metadata.json 1836642618692808023 1 0
2024-01-22 18:51:52.6 hdfs://hadoop/metadata/00003-806771f6-e7f1-44f5-ac80-350f2c084505.metadata.json 8876099574020403871 1 0

spark-sql> CALL local.system.rewrite_data_files('test.t1');
spark-sql> SELECT * FROM test.t1.metadata_log_entries;
2024-01-22 18:51:41.331 hdfs://hadoop/metadata/00000-d1ad9769-3bf3-497d-a86e-20f71d2700f7.metadata.json NULL NULL NULL
2024-01-22 18:51:41.593 hdfs://hadoop/metadata/00001-9eb3275c-f53d-4e24-bc23-fdcadae28d92.metadata.json NULL NULL NULL
2024-01-22 18:51:47.538 hdfs://hadoop/metadata/00002-ec3f0aa6-7dae-4c45-90ad-13839eefbd54.metadata.json 1836642618692808023 1 0
2024-01-22 18:51:52.6 hdfs://hadoop/metadata/00003-806771f6-e7f1-44f5-ac80-350f2c084505.metadata.json 8876099574020403871 1 0

spark-sql> create or replace table test.t1 (c3 integer);
spark-sql> SELECT * FROM test.t1.metadata_log_entries;
2024-01-22 18:51:41.331 hdfs://hadoop/metadata/00000-d1ad9769-3bf3-497d-a86e-20f71d2700f7.metadata.json NULL NULL NULL
2024-01-22 18:51:41.593 hdfs://hadoop/metadata/00001-9eb3275c-f53d-4e24-bc23-fdcadae28d92.metadata.json NULL NULL NULL
2024-01-22 18:51:47.538 hdfs://hadoop/metadata/00002-ec3f0aa6-7dae-4c45-90ad-13839eefbd54.metadata.json NULL NULL NULL
2024-01-22 18:51:52.6 hdfs://hadoop/metadata/00003-806771f6-e7f1-44f5-ac80-350f2c084505.metadata.json NULL NULL NULL
2024-01-22 18:52:01.749 hdfs://hadoop/metadata/00004-d9c74f49-37c8-4463-8565-7b31660723e2.metadata.json NULL NULL NULL
```
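
The dependence on `snapshot-log` described in the question can be modeled with a small Python sketch. This is not Iceberg's actual code. It assumes that `latest_snapshot_id` for a metadata-log entry is the id of the last `snapshot-log` entry whose timestamp is at or before the metadata file's timestamp, and null when no such entry exists. The function name, the `SnapshotLogEntry` type, and the millisecond timestamps are hypothetical; only the two snapshot ids are taken from the output above.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class SnapshotLogEntry(NamedTuple):
    """Hypothetical stand-in for one entry of the `snapshot-log` field."""
    timestamp_ms: int
    snapshot_id: int


def latest_snapshot_id(snapshot_log: List[SnapshotLogEntry],
                       metadata_ts_ms: int) -> Optional[int]:
    """Id of the last snapshot-log entry at or before metadata_ts_ms, else None.

    Assumes snapshot_log is sorted by timestamp in ascending order.
    """
    timestamps = [entry.timestamp_ms for entry in snapshot_log]
    idx = bisect_right(timestamps, metadata_ts_ms)
    return snapshot_log[idx - 1].snapshot_id if idx > 0 else None


# Illustrative timestamps (ms) for metadata files 00000..00003.
metadata_log_before = [100, 200, 300, 400]
snapshot_log_before = [
    SnapshotLogEntry(300, 1836642618692808023),  # insert
    SnapshotLogEntry(400, 8876099574020403871),  # delete
]
before = [latest_snapshot_id(snapshot_log_before, ts) for ts in metadata_log_before]

# After CREATE OR REPLACE: one more metadata file, and (per the report)
# an empty snapshot-log, so no lookup can succeed.
metadata_log_after = metadata_log_before + [500]
snapshot_log_after: List[SnapshotLogEntry] = []
after = [latest_snapshot_id(snapshot_log_after, ts) for ts in metadata_log_after]

print(before)
print(after)
```

Under these assumptions, the first two entries are null because no snapshot existed yet, and every entry becomes null once `snapshot-log` is empty, which matches the output reported above. The sketch covers only `latest_snapshot_id`.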