This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 500c7fb702 [improvement](multi-catalog) support unsupported column 
type (#15660)
500c7fb702 is described below

commit 500c7fb702cd9a13e4079d8c8c59e78294c4006d
Author: Mingyu Chen <morning...@163.com>
AuthorDate: Sun Jan 8 10:07:10 2023 +0800

    [improvement](multi-catalog) support unsupported column type (#15660)
    
    When creating an external catalog, Doris will automatically sync the schema 
of table from external catalog.
    But some of column type are not supported by Doris now, such as struct, 
map, etc.
    
    In previous, when meeting these unsupported column, Doris will throw an 
exception, and the corresponding
    table can not be synced. But user may just want to query other supported 
columns.
    
    In this PR, I add a new column type: UNSUPPORTED. And now it is just used 
for external table schema sync.
    When meeting unsupported column, it will be synced as column with 
UNSUPPORTED type.
    
    When query this table, there are serval situation:
    
    select * from table: throw error Unsupported type 'UNSUPPORTED_TYPE' xxx
    select k1 from table: k1 is with supported type. query OK.
    select * except(k2): k2 is with unsupported type. query OK
---
 .../hive/scripts/create_preinstalled_table.hql     | 12 ++++++++++
 docs/en/docs/advanced/variables.md                 |  1 +
 .../docs/ecosystem/external-table/multi-catalog.md | 28 ++++++++++++++++++++--
 .../docs/ecosystem/external-table/multi-catalog.md | 28 ++++++++++++++++++++--
 .../doris/catalog/HiveMetaStoreClientHelper.java   |  2 +-
 .../org/apache/doris/catalog/PrimitiveType.java    |  1 +
 .../java/org/apache/doris/catalog/ScalarType.java  |  1 +
 .../main/java/org/apache/doris/catalog/Type.java   |  6 ++---
 .../doris/external/elasticsearch/EsUtil.java       |  3 ++-
 .../org/apache/doris/external/jdbc/JdbcClient.java |  6 ++---
 gensrc/thrift/Types.thrift                         |  1 +
 .../external_catalog_p0/hive/test_hive_other.out   | 12 ++++++++++
 .../hive/test_hive_other.groovy                    | 20 +++++++++++++++-
 13 files changed, 107 insertions(+), 14 deletions(-)

diff --git 
a/docker/thirdparties/docker-compose/hive/scripts/create_preinstalled_table.hql 
b/docker/thirdparties/docker-compose/hive/scripts/create_preinstalled_table.hql
index dfab9b0d1f..8bb9f456b8 100644
--- 
a/docker/thirdparties/docker-compose/hive/scripts/create_preinstalled_table.hql
+++ 
b/docker/thirdparties/docker-compose/hive/scripts/create_preinstalled_table.hql
@@ -274,4 +274,16 @@ TBLPROPERTIES (
 
 msck repair table table_with_x01;
 
+CREATE TABLE `unsupported_type_table`(
+  k1 int,
+  k2 string,
+  k3 double,
+  k4 map<string,int>,
+  k5 STRUCT<
+            houseno:    STRING
+           ,streetname: STRING
+           >,
+  k6 int
+);
+
 show tables;
diff --git a/docs/en/docs/advanced/variables.md 
b/docs/en/docs/advanced/variables.md
index 881b5835a6..d23825f722 100644
--- a/docs/en/docs/advanced/variables.md
+++ b/docs/en/docs/advanced/variables.md
@@ -563,3 +563,4 @@ Translated with www.DeepL.com/Translator (free version)
 * `rewrite_or_to_in_predicate_threshold`
 
     The default threshold of rewriting OR to IN. The default value is 2, which 
means that when there are 2 ORs, if they can be compact, they will be rewritten 
as IN predicate.
+
diff --git a/docs/en/docs/ecosystem/external-table/multi-catalog.md 
b/docs/en/docs/ecosystem/external-table/multi-catalog.md
index 3d5c5bd200..d11370e32a 100644
--- a/docs/en/docs/ecosystem/external-table/multi-catalog.md
+++ b/docs/en/docs/ecosystem/external-table/multi-catalog.md
@@ -24,10 +24,11 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-<version since="1.2.0">
 
 # Multi-Catalog
 
+<version since="1.2.0">
+
 Multi-Catalog is a feature introduced in Doris 1.2.0, which aims to make it 
easier to interface with external data sources to enhance Doris' data lake 
analysis and federated data query capabilities.
 
 In previous versions of Doris, there were only two levels of user data: 
Database and Table. When we need to connect to an external data source, we can 
only connect at the Database or Table level. For example, create a mapping of a 
table in an external data source through `create external table`, or map a 
Database in an external data source through `create external database`. If 
there are too many Databases or Tables in the external data source, users need 
to manually map them one by one, [...]
@@ -40,6 +41,8 @@ The new Multi-Catalog function adds a new layer of Catalog to 
the original metad
 
 This function will be used as a supplement and enhancement to the previous 
external table connection method (External Table) to help users perform fast 
multi-catalog federated queries.
 
+</version>
+
 ## Basic Concepts
 
 1. Internal Catalog
@@ -595,6 +598,28 @@ MySQL [db1]> select * from tbl1;
 
 After the user creates the catalog, Doris will automatically synchronize the 
database and tables of the data catalog. For different data catalog and data 
table formats, Doris will perform the following mapping relationships.
 
+<version since="dev">
+
+For types that cannot currently be mapped to Doris column types, such as map, 
struct, etc. Doris will map the column type to UNSUPPORTED type. For queries of 
type UNSUPPORTED, an example is as follows:
+
+Suppose the table schema after synchronization is:
+
+```
+k1 INT,
+k2 INT,
+k3 UNSUPPORTED,
+k4 INT
+```
+
+```
+select * from table;                // Error: Unsupported type 
'UNSUPPORTED_TYPE' in '`k3`
+select * except(k3) from table;     // Query OK.
+select k1, k3 from table;           // Error: Unsupported type 
'UNSUPPORTED_TYPE' in '`k3`
+select k1, k4 from table;           // Query OK.
+```
+
+</version>
+
 ### Hive MetaStore
 
 For Hive/Iceberge/Hudi
@@ -715,4 +740,3 @@ The following configurations solves the problem `failed to 
get schema for table
 
 Restart `Hive Metastore` after the configuration is complete. 
 
-</version>
diff --git a/docs/zh-CN/docs/ecosystem/external-table/multi-catalog.md 
b/docs/zh-CN/docs/ecosystem/external-table/multi-catalog.md
index 8519a75866..016c9921ff 100644
--- a/docs/zh-CN/docs/ecosystem/external-table/multi-catalog.md
+++ b/docs/zh-CN/docs/ecosystem/external-table/multi-catalog.md
@@ -24,10 +24,11 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-<version since="1.2.0">
 
 # 多源数据目录
 
+<version since="1.2.0">
+
 多源数据目录(Multi-Catalog)是 Doris 1.2.0 
版本中推出的功能,旨在能够更方便对接外部数据目录,以增强Doris的数据湖分析和联邦数据查询能力。
 
 在之前的 Doris 版本中,用户数据只有两个层级:Database 和 Table。当我们需要连接一个外部数据目录时,我们只能在Database 或 
Table 层级进行对接。比如通过 `create external table` 的方式创建一个外部数据目录中的表的映射,或通过 `create 
external database` 的方式映射一个外部数据目录中的 Database。 如果外部数据目录中的 Database 或 Table 
非常多,则需要用户手动进行一一映射,使用体验不佳。
@@ -40,6 +41,8 @@ under the License.
 
 该功能将作为之前外表连接方式(External Table)的补充和增强,帮助用户进行快速的多数据目录联邦查询。
 
+</version>
+
 ## 基础概念
 
 1. Internal Catalog
@@ -594,6 +597,28 @@ MySQL [db1]> select * from tbl1;
 
 用户创建 Catalog 后,Doris 会自动同步数据目录的数据库和表,针对不同的数据目录和数据表格式,Doris 会进行以下列映射关系。
 
+<version since="dev">
+
+对于当前无法映射到 Doris 列类型的外表类型,如 map,struct 等。Doris 会将列类型映射为 UNSUPPORTED 类型。对于 
UNSUPPORTED 类型的查询,示例如下:
+
+假设同步后的表 schema 为:
+
+```
+k1 INT,
+k2 INT,
+k3 UNSUPPORTED,
+k4 INT
+```
+
+```
+select * from table;                // Error: Unsupported type 
'UNSUPPORTED_TYPE' in '`k3`
+select * except(k3) from table;     // Query OK.
+select k1, k3 from table;           // Error: Unsupported type 
'UNSUPPORTED_TYPE' in '`k3`
+select k1, k4 from table;           // Query OK.
+```
+
+</version>
+
 ### Hive MetaStore
 
 适用于 Hive/Iceberge/Hudi
@@ -714,4 +739,3 @@ Doris 的权限管理功能提供了对 Cataloig 层级的扩展,具体可参
   
 配置完成后需要重启Hive Metastore。
 
-</version>
diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/catalog/HiveMetaStoreClientHelper.java
 
b/fe/fe-core/src/main/java/org/apache/doris/catalog/HiveMetaStoreClientHelper.java
index e43cecb0c7..0e88ec9e0d 100644
--- 
a/fe/fe-core/src/main/java/org/apache/doris/catalog/HiveMetaStoreClientHelper.java
+++ 
b/fe/fe-core/src/main/java/org/apache/doris/catalog/HiveMetaStoreClientHelper.java
@@ -727,7 +727,7 @@ public class HiveMetaStoreClientHelper {
             }
             return ScalarType.createDecimalType(precision, scale);
         }
-        return Type.INVALID;
+        return Type.UNSUPPORTED;
     }
 
     public static String 
showCreateTable(org.apache.hadoop.hive.metastore.api.Table remoteTable) {
diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/catalog/PrimitiveType.java 
b/fe/fe-core/src/main/java/org/apache/doris/catalog/PrimitiveType.java
index 8ae8c76ed5..e91face521 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/catalog/PrimitiveType.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/PrimitiveType.java
@@ -31,6 +31,7 @@ import java.util.List;
 
 public enum PrimitiveType {
     INVALID_TYPE("INVALID_TYPE", -1, TPrimitiveType.INVALID_TYPE),
+    UNSUPPORTED("UNSUPPORTED_TYPE", -1, TPrimitiveType.UNSUPPORTED),
     // NULL_TYPE - used only in LiteralPredicate and NullLiteral to make NULLs 
compatible
     // with all other types.
     NULL_TYPE("NULL_TYPE", 1, TPrimitiveType.NULL_TYPE),
diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/ScalarType.java 
b/fe/fe-core/src/main/java/org/apache/doris/catalog/ScalarType.java
index 6510ecdff3..ab1be2c4c9 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/catalog/ScalarType.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/ScalarType.java
@@ -764,6 +764,7 @@ public class ScalarType extends Type {
     public boolean isSupported() {
         switch (type) {
             case BINARY:
+            case UNSUPPORTED:
                 return false;
             default:
                 return true;
diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/Type.java 
b/fe/fe-core/src/main/java/org/apache/doris/catalog/Type.java
index cffcc3a597..ee762f634c 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/catalog/Type.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/Type.java
@@ -51,6 +51,7 @@ public abstract class Type {
 
     // Static constant types for scalar types that don't require additional 
information.
     public static final ScalarType INVALID = new 
ScalarType(PrimitiveType.INVALID_TYPE);
+    public static final ScalarType UNSUPPORTED = new 
ScalarType(PrimitiveType.UNSUPPORTED);
     public static final ScalarType NULL = new 
ScalarType(PrimitiveType.NULL_TYPE);
     public static final ScalarType BOOLEAN = new 
ScalarType(PrimitiveType.BOOLEAN);
     public static final ScalarType TINYINT = new 
ScalarType(PrimitiveType.TINYINT);
@@ -1419,7 +1420,6 @@ public abstract class Type {
         compatibilityMatrix[DECIMAL128.ordinal()][DECIMAL32.ordinal()] = 
PrimitiveType.DECIMAL128;
         compatibilityMatrix[DECIMAL128.ordinal()][DECIMAL64.ordinal()] = 
PrimitiveType.DECIMAL128;
 
-
         // HLL
         compatibilityMatrix[HLL.ordinal()][TIME.ordinal()] = 
PrimitiveType.INVALID_TYPE;
         compatibilityMatrix[HLL.ordinal()][TIMEV2.ordinal()] = 
PrimitiveType.INVALID_TYPE;
@@ -1455,7 +1455,6 @@ public abstract class Type {
         compatibilityMatrix[QUANTILE_STATE.ordinal()][DECIMAL64.ordinal()] = 
PrimitiveType.INVALID_TYPE;
         compatibilityMatrix[QUANTILE_STATE.ordinal()][DECIMAL128.ordinal()] = 
PrimitiveType.INVALID_TYPE;
 
-
         // TIME why here not???
         compatibilityMatrix[TIME.ordinal()][TIME.ordinal()] = 
PrimitiveType.INVALID_TYPE;
         compatibilityMatrix[TIME.ordinal()][TIMEV2.ordinal()] = 
PrimitiveType.INVALID_TYPE;
@@ -1485,7 +1484,8 @@ public abstract class Type {
                         || t1 == PrimitiveType.TIME || t2 == PrimitiveType.TIME
                         || t1 == PrimitiveType.TIMEV2 || t2 == 
PrimitiveType.TIMEV2
                         || t1 == PrimitiveType.MAP || t2 == PrimitiveType.MAP
-                        || t1 == PrimitiveType.STRUCT || t2 == 
PrimitiveType.STRUCT) {
+                        || t1 == PrimitiveType.STRUCT || t2 == 
PrimitiveType.STRUCT
+                        || t1 == PrimitiveType.UNSUPPORTED || t2 == 
PrimitiveType.UNSUPPORTED) {
                     continue;
                 }
                 Preconditions.checkNotNull(compatibilityMatrix[i][j]);
diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/external/elasticsearch/EsUtil.java 
b/fe/fe-core/src/main/java/org/apache/doris/external/elasticsearch/EsUtil.java
index 7cf22cfda3..0cde72067d 100644
--- 
a/fe/fe-core/src/main/java/org/apache/doris/external/elasticsearch/EsUtil.java
+++ 
b/fe/fe-core/src/main/java/org/apache/doris/external/elasticsearch/EsUtil.java
@@ -440,8 +440,9 @@ public class EsUtil {
             case "ip":
             case "nested":
             case "object":
-            default:
                 return ScalarType.createStringType();
+            default:
+                return Type.UNSUPPORTED;
         }
     }
 
diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcClient.java 
b/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcClient.java
index 9b1bd31975..2da7cd38ff 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcClient.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcClient.java
@@ -440,8 +440,7 @@ public class JdbcClient {
             case "ENUM":
                 return ScalarType.createStringType();
             default:
-                throw new JdbcClientException("Can not convert mysql data type 
to doris data type for type ["
-                        + mysqlType + "]");
+                return Type.UNSUPPORTED;
         }
     }
 
@@ -505,8 +504,7 @@ public class JdbcClient {
             case "uuid":
                 return ScalarType.createStringType();
             default:
-                throw new JdbcClientException("Can not convert postgresql data 
type to doris data type for type ["
-                        + pgType + "]");
+                return Type.UNSUPPORTED;
         }
     }
 
diff --git a/gensrc/thrift/Types.thrift b/gensrc/thrift/Types.thrift
index 08e8e69bf1..589b95f77c 100644
--- a/gensrc/thrift/Types.thrift
+++ b/gensrc/thrift/Types.thrift
@@ -91,6 +91,7 @@ enum TPrimitiveType {
   DECIMAL64,
   DECIMAL128I,
   JSONB,
+  UNSUPPORTED
 }
 
 enum TTypeNodeType {
diff --git a/regression-test/data/external_catalog_p0/hive/test_hive_other.out 
b/regression-test/data/external_catalog_p0/hive/test_hive_other.out
index 05c25b0e19..0ce6861e34 100644
--- a/regression-test/data/external_catalog_p0/hive/test_hive_other.out
+++ b/regression-test/data/external_catalog_p0/hive/test_hive_other.out
@@ -224,6 +224,10 @@ zyLjAtVdXV GrJRf8WvRR
 2022-11-10     2022-11-10      zvbIuF1cg8      33ZkOc8Fl7      2022-11-10
 2022-11-10     2022-11-10      zyLjAtVdXV      GrJRf8WvRR      2022-11-10
 
+-- !q51 --
+
+-- !q52 --
+
 -- !q24 --
 zhangsan       1
 lisi   1
@@ -444,6 +448,10 @@ zyLjAtVdXV GrJRf8WvRR
 2022-11-10     2022-11-10      zvbIuF1cg8      33ZkOc8Fl7      2022-11-10
 2022-11-10     2022-11-10      zyLjAtVdXV      GrJRf8WvRR      2022-11-10
 
+-- !q51 --
+
+-- !q52 --
+
 -- !q24 --
 zhangsan       1
 lisi   1
@@ -664,6 +672,10 @@ zyLjAtVdXV GrJRf8WvRR
 2022-11-10     2022-11-10      zvbIuF1cg8      33ZkOc8Fl7      2022-11-10
 2022-11-10     2022-11-10      zyLjAtVdXV      GrJRf8WvRR      2022-11-10
 
+-- !q51 --
+
+-- !q52 --
+
 -- !after_refresh --
 2022-11-24     2022-11-24      zVhHON3m5L      ecmOZhvo5C      2022-11-24
 2022-11-24     2022-11-24      zceLX7IpL5      JZtsmIjKGQ      2022-11-24
diff --git 
a/regression-test/suites/external_catalog_p0/hive/test_hive_other.groovy 
b/regression-test/suites/external_catalog_p0/hive/test_hive_other.groovy
index 2d8e85e461..df2d489e0e 100644
--- a/regression-test/suites/external_catalog_p0/hive/test_hive_other.groovy
+++ b/regression-test/suites/external_catalog_p0/hive/test_hive_other.groovy
@@ -47,8 +47,26 @@ suite("test_hive_other", "p0") {
         order_qt_q48 """ select k2, k5 from table_with_x01 where dt in 
('2022-11-25') order by k2 desc limit 10;"""
         order_qt_q49 """ select k2, k5 from table_with_x01 where dt in 
('2022-11-10', '2022-11-10') order by k2 desc limit 10;"""
         order_qt_q50 """ select dt, dt, k2, k5, dt from table_with_x01 where 
dt in ('2022-11-10') or dt in ('2022-11-10') order by k2 desc limit 10;"""
-    }
 
+        test {
+            sql """select * from unsupported_type_table"""
+            exception """Unsupported type 'UNSUPPORTED_TYPE'"""
+        }
+
+        qt_q51 """select * except(k4,k5) from unsupported_type_table;"""
+
+        test {
+            sql """select k1,k4 from unsupported_type_table"""
+            exception """Unsupported type 'UNSUPPORTED_TYPE'"""
+        }
+
+        test {
+            sql """select k1 from unsupported_type_table where k4 is null;"""
+            exception """Unsupported type 'UNSUPPORTED_TYPE'"""
+        }
+
+        qt_q52 """select k1,k3,k6 from unsupported_type_table"""
+    }
 
     String enabled = context.config.otherConfigs.get("enableHiveTest")
     if (enabled != null && enabled.equalsIgnoreCase("true")) {


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

Reply via email to