morningman commented on a change in pull request #3819:
URL: https://github.com/apache/incubator-doris/pull/3819#discussion_r438202036
########## File path: docs/zh-CN/sql-reference/sql-statements/Data Definition/CREATE TABLE.md ##########
@@ -152,6 +152,18 @@ under the License.
     If "path" contains multiple files, separate them with a comma [,]. If a file name contains a comma, use %2c instead; if a file name contains %, use %25 instead. The file content format currently supports CSV, and the GZ, BZ2, LZ4, and LZO (LZOP) compression formats.
+    3) For Hive, the following information must be provided in PROPERTIES:
+    ```
+    PROPERTIES (
+    "database" = "hive_db_name",
+    "table" = "hive_table_name",
+    "hive.metastore.uris" = "thrift://127.0.0.1:9083"
+    )
+    ```
+    Here, database is the name of the database the Hive table belongs to, table is the name of the Hive table, and hive.metastore.uris is the Hive metastore service address.
+    Note: currently, Hive external tables are only used for Spark Load.

Review comment:
```suggestion
    Note: currently, Hive external tables are only for Spark Load
```

########## File path: fe/src/main/java/org/apache/doris/load/BrokerFileGroup.java ##########
@@ -165,6 +170,33 @@ public void parse(Database db, DataDescription dataDescription) throws DdlExcept
         // FilePath
         filePaths = dataDescription.getFilePaths();
+
+        if (dataDescription.isLoadFromTable()) {
+            String srcTableName = dataDescription.getSrcTableName();
+            // src table should be hive table
+            Table srcTable = db.getTable(srcTableName);
+            if (srcTable == null) {
+                throw new DdlException("Unknown table " + srcTableName + " in database " + db.getFullName());
+            }
+            if (!(srcTable instanceof HiveTable)) {
+                throw new DdlException("Source table " + srcTableName + " is not HiveTable");
+            }
+            // src table columns should include all columns of loaded table

Review comment:
Is this check necessary? I think we could allow some of the OLAP table's columns to be absent from the Hive table, and have them filled with a default value or NULL.
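The reviewer's suggestion above could be sketched roughly as follows. This is a hypothetical illustration, not Doris code: the class `ColumnFill`, the `resolve` method, and the string-based schema maps are all made up for this example. The idea is simply that, for each OLAP column, we read it from Hive when present and otherwise fall back to its default value or NULL, instead of rejecting the load outright.

```java
import java.util.*;

// Hypothetical sketch of lenient column resolution between an OLAP table
// schema and a Hive source schema (not the actual BrokerFileGroup logic).
public class ColumnFill {
    // olapColumns: column name -> default value ("null" value means no default).
    // hiveColumns: set of column names present in the Hive table.
    // Returns: column name -> source expression (Hive column, default, or null).
    static Map<String, String> resolve(Map<String, String> olapColumns,
                                       Set<String> hiveColumns) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : olapColumns.entrySet()) {
            if (hiveColumns.contains(e.getKey())) {
                mapping.put(e.getKey(), e.getKey());   // read from Hive column
            } else if (e.getValue() != null) {
                mapping.put(e.getKey(), e.getValue()); // fill with default value
            } else {
                mapping.put(e.getKey(), null);         // fill with NULL
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, String> olap = new LinkedHashMap<>();
        olap.put("k1", null);  // no default, but present in Hive
        olap.put("v1", "0");   // default "0", absent from Hive
        olap.put("v2", null);  // no default, absent from Hive -> NULL
        Set<String> hive = new HashSet<>(Collections.singletonList("k1"));
        System.out.println(resolve(olap, hive)); // prints {k1=k1, v1=0, v2=null}
    }
}
```

Under this scheme, only an OLAP column that is both missing from the Hive table and lacking a default (for a NOT NULL column) would need to be a hard error.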
########## File path: fe/src/main/cup/sql_parser.cup ##########
@@ -1244,6 +1244,15 @@ data_desc ::=
         RESULT = new DataDescription(tableName, partitionNames, files, colList, colSep, fileFormat, columnsFromPath, isNeg, colMappingList, whereExpr);
     :}
+    | KW_DATA KW_FROM KW_TABLE ident:srcTableName
+      opt_negative:isNeg
+      KW_INTO KW_TABLE ident:tableName
+      opt_partition_names:partitionNames
+      opt_col_mapping_list:colMappingList

Review comment:
How do we map the Hive table's columns to the OLAP table's columns? What if a column has the same name in both tables? How about referencing the [DeltaLake](https://docs.microsoft.com/en-us/azure/databricks/spark/latest/spark-sql/language-manual/copy-into) `COPY INTO` statement and using a `SELECT` statement instead?
```
DATA AS (SELECT xxx FROM hive_table WHERE xxx)
INTO TABLE olap_table
PARTITION(p1, p2, ...)
(k1, k2, k3, v1, v2) /* indicate the columns of olap table which will be loaded */
```
SQL is more flexible, and can easily be used by Spark to read from a Hive table.

########## File path: fe/src/main/java/org/apache/doris/catalog/Catalog.java ##########
@@ -3908,6 +3911,18 @@ private void createBrokerTable(Database db, CreateTableStmt stmt) throws DdlExce
         return;
     }
+
+    private void createHiveTable(Database db, CreateTableStmt stmt) throws DdlException {
+        String tableName = stmt.getTableName();
+        List<Column> columns = stmt.getColumns();
+        long tableId = Catalog.getCurrentCatalog().getNextId();

Review comment:
```suggestion
        long tableId = getNextId();
```

----------------------------------------------------------------

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.