leedarHawk opened a new issue, #6235: URL: https://github.com/apache/iceberg/issues/6235
### Apache Iceberg version

0.14.0

### Query engine

Hive

### Please describe the bug 🐞

When I insert data using Hive with Tez, the data is stored on HDFS successfully, but no rows are returned by a subsequent select. If I change the engine to MR, it works: the data is stored and loaded successfully.

hive version: 3.1.3
tez: 0.10.2
hadoop: Hadoop 3.1.1.3.1.5.0-152

set hive.execution.engine=tez;
add jar /tmp/iceberg-hive-runtime-0.14.1.jar;
describe formatted test.x6;

+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| col_name                      | data_type                                          | comment                                            |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| # col_name                    | data_type                                          | comment                                            |
| user_name                     | string                                             | from deserializer                                  |
| area                          | string                                             | from deserializer                                  |
|                               | NULL                                               | NULL                                               |
| # Detailed Table Information  | NULL                                               | NULL                                               |
| Database:                     | test                                               | NULL                                               |
| OwnerType:                    | USER                                               | NULL                                               |
| Owner:                        | hive                                               | NULL                                               |
| CreateTime:                   | Mon Nov 21 13:31:53 CST 2022                       | NULL                                               |
| LastAccessTime:               | UNKNOWN                                            | NULL                                               |
| Retention:                    | 0                                                  | NULL                                               |
| Location:                     | hdfs://hdp46:8020/user/hive/warehouse/test.db/x6   | NULL                                               |
| Table Type:                   | MANAGED_TABLE                                      | NULL                                               |
| Table Parameters:             | NULL                                               | NULL                                               |
|                               | bucketing_version                                  | 2                                                  |
|                               | current-schema                                     | {\"type\":\"struct\",\"schema-id\":0,\"fields\":[{\"id\":1,\"name\":\"user_name\",\"required\":false,\"type\":\"string\"},{\"id\":2,\"name\":\"area\",\"required\":false,\"type\":\"string\"}]} |
|                               | engine.hive.enabled                                | true                                               |
|                               | external.table.purge                               | TRUE                                               |
|                               | metadata_location                                  | hdfs://hdp46:8020/user/hive/warehouse/test.db/x6/metadata/00000-362d4f5e-6575-4be7-aad1-a9c027d1ff43.metadata.json |
|                               | numFiles                                           | 0                                                  |
|                               | numRows                                            | 0                                                  |
|                               | rawDataSize                                        | 0                                                  |
|                               | snapshot-count                                     | 0                                                  |
|                               | storage_handler                                    | org.apache.iceberg.mr.hive.HiveIcebergStorageHandler |
|                               | table_type                                         | ICEBERG                                            |
|                               | totalSize                                          | 0                                                  |
|                               | transient_lastDdlTime                              | 1669008713                                         |
|                               | uuid                                               | 5fe91beb-0cfc-4903-96e6-268b933eca73               |
|                               | NULL                                               | NULL                                               |
| # Storage Information         | NULL                                               | NULL                                               |
| SerDe Library:                | org.apache.iceberg.mr.hive.HiveIcebergSerDe        | NULL                                               |
| InputFormat:                  | org.apache.iceberg.mr.hive.HiveIcebergInputFormat  | NULL                                               |
| OutputFormat:                 | org.apache.iceberg.mr.hive.HiveIcebergOutputFormat | NULL                                               |
| Compressed:                   | No                                                 | NULL                                               |
| Num Buckets:                  | 0                                                  | NULL                                               |
| Bucket Columns:               | []                                                 | NULL                                               |
| Sort Columns:                 | []                                                 | NULL                                               |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+

=================================insert data===================
insert into test.x6 values ('a1', 'b1');

=================================insert data log start================
INFO : Compiling command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433): insert into test.x6 values ('a1', 'b1')
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col1, type:string, comment:null), FieldSchema(name:col2, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433); Time taken: 0.782 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433): insert into test.x6 values ('a1', 'b1')
INFO : Query ID = root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433
INFO : Total jobs = 1
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Starting task [Stage-1:DDL] in serial mode
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-2:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433
INFO : Session is already open
INFO : Dag name: insert into test.x6 values ('a1', 'b1') (Stage-2)
INFO : Tez session was closed. Reopening...
INFO : Session re-established.
INFO : Session re-established.
INFO : Status: Running (Executing on YARN cluster with App id application_1668952240853_0007)
INFO : Starting task [Stage-4:DDL] in serial mode
INFO : Completed executing command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433); Time taken: 17.825 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Compiling command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3): select * from test.x6
INFO : Concurrency mode is disabled, not creating a lock manager
=================================insert data log end================

=================================select data======================
select * from test.x6

=================================select data log start===================
INFO : Compiling command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3): select * from test.x6
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:x6.user_name, type:string, comment:null), FieldSchema(name:x6.area, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3); Time taken: 0.441 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3): select * from test.x6
INFO : Completed executing command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3); Time taken: 0.001 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+---------------+----------+
| x6.user_name  | x6.area  |
+---------------+----------+
+---------------+----------+
=================================select data log end===================

--
This is an
automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org