myfjdthink commented on issue #10452:
URL: https://github.com/apache/doris/issues/10452#issuecomment-1168365071

   Besides the authorization issue, Doris is also unable to read data 
stored on GCS.
   Here is an example.
   I wrote an Iceberg table with Spark, with the data stored on GCS.
   Reading it from Hive works fine:
   
   `hive> select * from gs_table2;`
   ```
   Query ID = nick_20220628041137_6fe75ec9-bae7-40b2-8d96-f6653fcdfb49
   Total jobs = 1
   Launching Job 1 out of 1
   Status: Running (Executing on YARN cluster with App id application_1656302766976_0013)
   
   ----------------------------------------------------------------------------------------------
           VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
   ----------------------------------------------------------------------------------------------
   Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0
   ----------------------------------------------------------------------------------------------
   VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 8.34 s
   ----------------------------------------------------------------------------------------------
   OK
   1    a
   2    b
   3    c
   4    a
   1    a
   2    b
   3    c
   4    a
   Time taken: 9.569 seconds, Fetched: 8 row(s)
   ```
   
   `hive> describe formatted gs_table2;`
   ```
   OK
   # col_name                   data_type               comment
   id                   int                     from deserializer
   data                 string                  from deserializer
   
   # Detailed Table Information
   Database:            gsdb
   OwnerType:           USER
   Owner:               nick
   CreateTime:          Tue Jun 28 02:34:36 UTC 2022
   LastAccessTime:      Sun Dec 14 22:38:21 UTC 1969
   Retention:           2147483647
   Location:            gs://iceberg-spark/datasets/gsdb.db/gs_table2
   Table Type:          EXTERNAL_TABLE
   Table Parameters:
        EXTERNAL                TRUE
        metadata_location       gs://iceberg-spark/datasets/gsdb.db/gs_table2/metadata/00002-833dafb8-cda5-4c7c-a2ea-04e2c84fa372.metadata.json
        numFiles                8
        numRows                 8
        owner                   nick
        previous_metadata_location      gs://iceberg-spark/datasets/gsdb.db/gs_table2/metadata/00001-0c51975f-0d0a-4c2e-a5d1-aad3251c0393.metadata.json
        storage_handler         org.apache.iceberg.mr.hive.HiveIcebergStorageHandler
        table_type              ICEBERG
        totalSize               4952
        transient_lastDdlTime   1656383676
        uuid                    86645efc-8a03-47f0-a397-1ec30877b1bb
   
   # Storage Information
   SerDe Library:       org.apache.iceberg.mr.hive.HiveIcebergSerDe
   InputFormat:         org.apache.iceberg.mr.hive.HiveIcebergInputFormat
   OutputFormat:        org.apache.iceberg.mr.hive.HiveIcebergOutputFormat
   Compressed:          No
   Num Buckets:         0
   Bucket Columns:      []
   Sort Columns:        []
   Time taken: 0.206 seconds, Fetched: 34 row(s)
   ```
   
   Then I tried to create the Iceberg table in Doris:
   ```sql
   CREATE TABLE `gs_table2` 
   ENGINE = ICEBERG
   PROPERTIES (
   "iceberg.database" = "gsdb",
   "iceberg.table" = "gs_table2",
   "iceberg.hive.metastore.uris" = "thrift://10.201.0.104:9083",
   "iceberg.catalog.type"  =  "HIVE_CATALOG"
   );
   ```
   The SQL execution timed out and the table creation failed.
   
   
   
   Here is another example.
   I wrote an Iceberg table with Spark, this time with the data stored in HDFS. Hive reads it fine:
   `hive> select * from test_table;`
   ```
   Query ID = nick_20220628042917_7209ca76-1704-45e1-8678-88af4297b64a
   Total jobs = 1
   Launching Job 1 out of 1
   Tez session was closed. Reopening...
   Session re-established.
   Session re-established.
   Status: Running (Executing on YARN cluster with App id application_1656302766976_0014)
   
   ----------------------------------------------------------------------------------------------
           VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
   ----------------------------------------------------------------------------------------------
   Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0
   ----------------------------------------------------------------------------------------------
   VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 7.40 s
   ----------------------------------------------------------------------------------------------
   OK
   1    a
   2    b
   3    c
   Time taken: 14.78 seconds, Fetched: 3 row(s)
   ```
   
   `hive> describe formatted test_table;`
   
   ```
   OK
   # col_name                   data_type               comment
   id                   bigint                  from deserializer
   data                 string                  from deserializer
   
   # Detailed Table Information
   Database:            testdb
   OwnerType:           USER
   Owner:               nick
   CreateTime:          Thu Jun 23 11:11:56 UTC 2022
   LastAccessTime:      Wed Dec 10 07:15:41 UTC 1969
   Retention:           2147483647
   Location:            hdfs://10.201.0.104:8020/user/hive/warehouse/testdb.db/test_table
   Table Type:          EXTERNAL_TABLE
   Table Parameters:
        EXTERNAL                TRUE
        metadata_location       hdfs://10.201.0.104:8020/user/hive/warehouse/testdb.db/test_table/metadata/00001-ceb6024b-3d7d-4304-8fff-f2aad293d2cf.metadata.json
        numFiles                3
        numRows                 3
        owner                   nick
        previous_metadata_location      hdfs://10.201.0.104:8020/user/hive/warehouse/testdb.db/test_table/metadata/00000-f95f2c72-1e64-421b-a4b9-bccb77984d32.metadata.json
        storage_handler         org.apache.iceberg.mr.hive.HiveIcebergStorageHandler
        table_type              ICEBERG
        totalSize               1929
        transient_lastDdlTime   1655982716
        uuid                    b3a2408e-bc56-4486-bd81-65b2be22b2f3
   
   # Storage Information
   SerDe Library:       org.apache.iceberg.mr.hive.HiveIcebergSerDe
   InputFormat:         org.apache.iceberg.mr.hive.HiveIcebergInputFormat
   OutputFormat:        org.apache.iceberg.mr.hive.HiveIcebergOutputFormat
   Compressed:          No
   Num Buckets:         0
   Bucket Columns:      []
   Sort Columns:        []
   Time taken: 0.144 seconds, Fetched: 34 row(s)
   ```
   
   Then I created the Iceberg table in Doris. The table was created 
successfully and the data was read successfully:
   ```sql
   CREATE TABLE `test_table` 
   ENGINE = ICEBERG
   PROPERTIES (
   "iceberg.database" = "testdb",
   "iceberg.table" = "test_table",
   "iceberg.hive.metastore.uris" = "thrift://10.201.0.104:9083",
   "iceberg.catalog.type"  =  "HIVE_CATALOG"
   );
   
   
   select * from iceberg_db.test_table;
   ```
   Query result:
   
   data|id|
   ----+--+
   a   | 1|
   b   | 2|
   c   | 3|
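
To make the contrast between the two cases explicit, here is a minimal sketch that compares the `Location` values from the two `describe formatted` outputs above by URI scheme. The helper names (`storage_scheme`, `summarize`) are hypothetical, and the outcomes are simply the ones reported in this comment. It only illustrates that the failing table lives on `gs://` while the working one lives on `hdfs://`; it does not diagnose the cause of the timeout.

```python
from urllib.parse import urlparse

# Location values copied from the two `describe formatted` outputs above.
TABLE_LOCATIONS = {
    "gsdb.gs_table2": "gs://iceberg-spark/datasets/gsdb.db/gs_table2",
    "testdb.test_table": "hdfs://10.201.0.104:8020/user/hive/warehouse/testdb.db/test_table",
}

# Outcome of CREATE TABLE ... ENGINE = ICEBERG in Doris, as reported in this comment.
DORIS_RESULT = {
    "gsdb.gs_table2": "timeout",
    "testdb.test_table": "ok",
}


def storage_scheme(location: str) -> str:
    """Return the URI scheme (for example 'gs' or 'hdfs') of a table location."""
    return urlparse(location).scheme


def summarize() -> dict:
    """Map each table name to (storage scheme, reported Doris outcome)."""
    return {
        name: (storage_scheme(loc), DORIS_RESULT[name])
        for name, loc in TABLE_LOCATIONS.items()
    }


if __name__ == "__main__":
    for name, (scheme, result) in summarize().items():
        print(f"{name}: scheme={scheme}, doris={result}")
```

Both tables use the same metastore URI and the same `CREATE TABLE` properties, so the storage scheme is the only difference visible in the outputs above.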


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

