morningman commented on code in PR #16055:
URL: https://github.com/apache/doris/pull/16055#discussion_r1090766395


##########
docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD.md:
##########
@@ -180,6 +180,8 @@ ERRORS:
 
 25. trim_double_quotes: Boolean type, default value is false. When true, the outermost double quotes of each field in the csv file are trimmed.
 
+26. skip_lines: <version since="1.2" type="inline"> Integer type, default value is 0. It means skipping the first few lines of the csv file. This parameter has no effect when format is set to csv_with_names or csv_with_names_and_types. </version>

Review Comment:
   ```suggestion
   26. skip_lines: <version since="dev" type="inline"> Integer type, default value is 0. It means skipping the first few lines of the csv file. This parameter has no effect when format is set to `csv_with_names` or `csv_with_names_and_types`. </version>
   ```



##########
be/src/vec/exec/format/csv/csv_reader.cpp:
##########
@@ -88,14 +88,18 @@ CsvReader::~CsvReader() = default;
 Status CsvReader::init_reader(bool is_load) {
     // set the skip lines and start offset
     int64_t start_offset = _range.start_offset;
-    if (start_offset == 0 && _params.__isset.file_attributes &&
-        _params.file_attributes.__isset.header_type &&
-        _params.file_attributes.header_type.size() > 0) {
-        std::string header_type = to_lower(_params.file_attributes.header_type);
-        if (header_type == BeConsts::CSV_WITH_NAMES) {
-            _skip_lines = 1;
-        } else if (header_type == BeConsts::CSV_WITH_NAMES_AND_TYPES) {
-            _skip_lines = 2;
+    if (start_offset == 0) {
+        // check header typer first
+        if (_params.__isset.file_attributes && _params.file_attributes.__isset.header_type &&
+            _params.file_attributes.header_type.size() > 0) {
+            std::string header_type = to_lower(_params.file_attributes.header_type);
+            if (header_type == BeConsts::CSV_WITH_NAMES) {
+                _skip_lines = 1;
+            } else if (header_type == BeConsts::CSV_WITH_NAMES_AND_TYPES) {
+                _skip_lines = 2;
+            }
+        } else if (_params.file_attributes.__isset.skip_lines) {

Review Comment:
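   Should this branch also check `_params.__isset.file_attributes`? The `else if` is reached when `file_attributes` is unset, so `_params.file_attributes.__isset.skip_lines` is read without that guard.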
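
   One way to address this is to test the outer flag once, before either inner field is read. The following is a minimal sketch only, not the PR's code: `ScanParams`, `FileAttributes`, the `isset` members (standing in for Thrift's generated `__isset`), the two string constants (standing in for `BeConsts::CSV_WITH_NAMES` and `BeConsts::CSV_WITH_NAMES_AND_TYPES`), and `resolve_skip_lines` are all hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the Thrift-generated structs. Only the fields
// named in the diff are modeled; `isset` mirrors Thrift's `__isset`.
struct FileAttributesIsset {
    bool header_type = false;
    bool skip_lines = false;
};

struct FileAttributes {
    std::string header_type;
    int32_t skip_lines = 0;
    FileAttributesIsset isset;
};

struct ScanParamsIsset {
    bool file_attributes = false;
};

struct ScanParams {
    FileAttributes file_attributes;
    ScanParamsIsset isset;
};

// Assumed values of the header-type constants (taken from the format names
// used in the docs).
static const std::string kCsvWithNames = "csv_with_names";
static const std::string kCsvWithNamesAndTypes = "csv_with_names_and_types";

static std::string to_lower_copy(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Returns how many leading lines to skip. The outer flag is checked first,
// so neither inner field is consulted when file_attributes is unset.
int64_t resolve_skip_lines(const ScanParams& params) {
    if (!params.isset.file_attributes) {
        return 0;
    }
    const FileAttributes& attrs = params.file_attributes;
    // The header type takes precedence over skip_lines, as in the diff.
    if (attrs.isset.header_type && !attrs.header_type.empty()) {
        const std::string header_type = to_lower_copy(attrs.header_type);
        if (header_type == kCsvWithNames) {
            return 1;
        }
        if (header_type == kCsvWithNamesAndTypes) {
            return 2;
        }
        return 0;
    }
    if (attrs.isset.skip_lines) {
        return attrs.skip_lines;
    }
    return 0;
}
```

   With this shape, a `skip_lines` value is honored only when `file_attributes` is present and no header type is set.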



##########
docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD.md:
##########
@@ -183,6 +183,8 @@ ERRORS:
 
 25. trim_double_quotes: Boolean type, The default value is false. True means that the outermost double quotes of each field in the csv file are trimmed.
 
+26. skip_lines: <version since="1.2" type="inline"> Integer type, the default value is 0. It will skip some lines in the head of csv file. It will be disable when format is csv_with_names or csv_with_names_and_types. </version>

Review Comment:
   ```suggestion
   26. skip_lines: <version since="dev" type="inline"> Integer type, the default value is 0. It skips the given number of lines at the head of the csv file. It is ignored when format is `csv_with_names` or `csv_with_names_and_types`. </version>
   ```
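
   For readers of the doc entry, the intended effect of `skip_lines` can be illustrated with a small standalone sketch. This is not how `CsvReader` is implemented; `read_rows_after_skip` is a hypothetical helper that only shows the observable behavior on a line-oriented stream.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: discards the first `skip_lines` lines of `in` and
// returns every remaining line. If the input has fewer lines than
// `skip_lines`, the result is empty.
std::vector<std::string> read_rows_after_skip(std::istream& in, int64_t skip_lines) {
    std::string line;
    // Consume and drop the leading lines; stop early at end of input.
    for (int64_t i = 0; i < skip_lines && std::getline(in, line); ++i) {
    }
    std::vector<std::string> rows;
    while (std::getline(in, line)) {
        rows.push_back(line);
    }
    return rows;
}
```

   For example, with `skip_lines` set to 1 and an input whose first line is a comment or title row, only the lines after it are treated as data.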



##########
fe/fe-core/src/main/cup/sql_parser.cup:
##########
@@ -621,7 +621,8 @@ terminal String
     KW_AUTO,
     KW_PREPARE,
     KW_EXECUTE,
-    KW_LINES;
+    KW_LINES,
+    KW_IGNORE;

Review Comment:
   `KW_IGNORE` also needs to be added to the `keywords ::=` entry.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

