myfjdthink commented on issue #10452:
URL: https://github.com/apache/doris/issues/10452#issuecomment-1168365071

In addition to the authorization issues, Doris is also unable to read data stored on GCS. Take a look at this example.

I wrote an Iceberg table with Spark, with the data stored on GCS, and then read it in the Hive environment. It works fine:

`hive> select * from gs_table2;`

```
Query ID = nick_20220628041137_6fe75ec9-bae7-40b2-8d96-f6653fcdfb49
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1656302766976_0013)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 8.34 s
----------------------------------------------------------------------------------------------
OK
1       a
2       b
3       c
4       a
1       a
2       b
3       c
4       a
Time taken: 9.569 seconds, Fetched: 8 row(s)
```

`hive> describe formatted gs_table2;`

```
OK
# col_name              data_type               comment
id                      int                     from deserializer
data                    string                  from deserializer

# Detailed Table Information
Database:               gsdb
OwnerType:              USER
Owner:                  nick
CreateTime:             Tue Jun 28 02:34:36 UTC 2022
LastAccessTime:         Sun Dec 14 22:38:21 UTC 1969
Retention:              2147483647
Location:               gs://iceberg-spark/datasets/gsdb.db/gs_table2
Table Type:             EXTERNAL_TABLE
Table Parameters:
        EXTERNAL                        TRUE
        metadata_location               gs://iceberg-spark/datasets/gsdb.db/gs_table2/metadata/00002-833dafb8-cda5-4c7c-a2ea-04e2c84fa372.metadata.json
        numFiles                        8
        numRows                         8
        owner                           nick
        previous_metadata_location      gs://iceberg-spark/datasets/gsdb.db/gs_table2/metadata/00001-0c51975f-0d0a-4c2e-a5d1-aad3251c0393.metadata.json
        storage_handler                 org.apache.iceberg.mr.hive.HiveIcebergStorageHandler
        table_type                      ICEBERG
        totalSize                       4952
        transient_lastDdlTime           1656383676
        uuid                            86645efc-8a03-47f0-a397-1ec30877b1bb

# Storage Information
SerDe Library:          org.apache.iceberg.mr.hive.HiveIcebergSerDe
InputFormat:            org.apache.iceberg.mr.hive.HiveIcebergInputFormat
OutputFormat:           org.apache.iceberg.mr.hive.HiveIcebergOutputFormat
Compressed:             No
Num Buckets:            0
Bucket Columns:         []
Sort Columns:           []
Time taken: 0.206 seconds, Fetched: 34 row(s)
```

Then I tried to create the Iceberg table in Doris:

```sql
CREATE TABLE `gs_table2`
ENGINE = ICEBERG
PROPERTIES (
    "iceberg.database" = "gsdb",
    "iceberg.table" = "gs_table2",
    "iceberg.hive.metastore.uris" = "thrift://10.201.0.104:9083",
    "iceberg.catalog.type" = "HIVE_CATALOG"
);
```

The SQL execution timed out and table creation failed.

Now let's look at another example. I wrote an Iceberg table with Spark, this time with the data stored in HDFS:

`hive> select * from test_table;`

```
Query ID = nick_20220628042917_7209ca76-1704-45e1-8678-88af4297b64a
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.
Session re-established.
Status: Running (Executing on YARN cluster with App id application_1656302766976_0014)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 7.40 s
----------------------------------------------------------------------------------------------
OK
1       a
2       b
3       c
Time taken: 14.78 seconds, Fetched: 3 row(s)
```

`hive> describe formatted test_table;`

```
OK
# col_name              data_type               comment
id                      bigint                  from deserializer
data                    string                  from deserializer

# Detailed Table Information
Database:               testdb
OwnerType:              USER
Owner:                  nick
CreateTime:             Thu Jun 23 11:11:56 UTC 2022
LastAccessTime:         Wed Dec 10 07:15:41 UTC 1969
Retention:              2147483647
Location:               hdfs://10.201.0.104:8020/user/hive/warehouse/testdb.db/test_table
Table Type:             EXTERNAL_TABLE
Table Parameters:
        EXTERNAL                        TRUE
        metadata_location               hdfs://10.201.0.104:8020/user/hive/warehouse/testdb.db/test_table/metadata/00001-ceb6024b-3d7d-4304-8fff-f2aad293d2cf.metadata.json
        numFiles                        3
        numRows                         3
        owner                           nick
        previous_metadata_location      hdfs://10.201.0.104:8020/user/hive/warehouse/testdb.db/test_table/metadata/00000-f95f2c72-1e64-421b-a4b9-bccb77984d32.metadata.json
        storage_handler                 org.apache.iceberg.mr.hive.HiveIcebergStorageHandler
        table_type                      ICEBERG
        totalSize                       1929
        transient_lastDdlTime           1655982716
        uuid                            b3a2408e-bc56-4486-bd81-65b2be22b2f3

# Storage Information
SerDe Library:          org.apache.iceberg.mr.hive.HiveIcebergSerDe
InputFormat:            org.apache.iceberg.mr.hive.HiveIcebergInputFormat
OutputFormat:           org.apache.iceberg.mr.hive.HiveIcebergOutputFormat
Compressed:             No
Num Buckets:            0
Bucket Columns:         []
Sort Columns:           []
Time taken: 0.144 seconds, Fetched: 34 row(s)
```

Then I tried to create the Iceberg table in Doris. It was created successfully and the data could be read:

```sql
CREATE TABLE `test_table`
ENGINE = ICEBERG
PROPERTIES (
    "iceberg.database" = "testdb",
    "iceberg.table" = "test_table",
    "iceberg.hive.metastore.uris" = "thrift://10.201.0.104:9083",
    "iceberg.catalog.type" = "HIVE_CATALOG"
);

select * from iceberg_db.test_table;
```

Query result:

    data|id|
    ----+--+
    a   | 1|
    b   | 2|
    c   | 3|

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: commits-h...@doris.apache.org