[GitHub] [doris] dataroaring closed pull request #21249: [enhancement](log) pretty log in vtablet_sink

2023-09-03 Thread via GitHub


dataroaring closed pull request #21249: [enhancement](log) pretty log in vtablet_sink
URL: https://github.com/apache/doris/pull/21249


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [doris] dataroaring commented on pull request #21090: [Improvement](fe)Simplified type system initialization Array List

2023-09-03 Thread via GitHub


dataroaring commented on PR #21090:
URL: https://github.com/apache/doris/pull/21090#issuecomment-1704030725

   run buildall





[GitHub] [doris] dataroaring commented on pull request #21090: [Improvement](fe)Simplified type system initialization Array List

2023-09-03 Thread via GitHub


dataroaring commented on PR #21090:
URL: https://github.com/apache/doris/pull/21090#issuecomment-1704030802

   run buildall





[GitHub] [doris] github-actions[bot] commented on pull request #22155: [improvement](case function) add check to avoid stack overflow

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #22155:
URL: https://github.com/apache/doris/pull/22155#issuecomment-1704030811

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] dataroaring merged pull request #23680: [enhancement](load) support dry_run_query for load

2023-09-03 Thread via GitHub


dataroaring merged PR #23680:
URL: https://github.com/apache/doris/pull/23680





[doris] branch master updated: [enhancement](load) support dry_run_query for load (#23680)

2023-09-03 Thread dataroaring
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 89eacd4751 [enhancement](load) support dry_run_query for load (#23680)
89eacd4751 is described below

commit 89eacd47515b256e95a371ab6950efb3c24b2e87
Author: Yongqiang YANG <98214048+dataroar...@users.noreply.github.com>
AuthorDate: Sun Sep 3 15:08:10 2023 +0800

[enhancement](load) support dry_run_query for load (#23680)

If dry_run_query is set, a sink just discards blocks and does
not send them to the destination.
---
 be/src/vec/sink/vtablet_sink.cpp  | 4 
 be/src/vec/sink/vtablet_sink_v2.cpp   | 4 
 docs/en/docs/advanced/variables.md| 2 +-
 docs/zh-CN/docs/advanced/variables.md | 2 +-
 4 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/be/src/vec/sink/vtablet_sink.cpp b/be/src/vec/sink/vtablet_sink.cpp
index 02782a8220..09bcdc6e25 100644
--- a/be/src/vec/sink/vtablet_sink.cpp
+++ b/be/src/vec/sink/vtablet_sink.cpp
@@ -1297,6 +1297,10 @@ Status VOlapTableSink::send(RuntimeState* state, vectorized::Block* input_block,
 SCOPED_CONSUME_MEM_TRACKER(_mem_tracker.get());
 Status status = Status::OK();
 
+if (state->query_options().dry_run_query) {
+return status;
+}
+
 auto rows = input_block->rows();
 auto bytes = input_block->bytes();
 if (UNLIKELY(rows == 0)) {
diff --git a/be/src/vec/sink/vtablet_sink_v2.cpp b/be/src/vec/sink/vtablet_sink_v2.cpp
index eb66eac664..a88f05b88d 100644
--- a/be/src/vec/sink/vtablet_sink_v2.cpp
+++ b/be/src/vec/sink/vtablet_sink_v2.cpp
@@ -248,6 +248,10 @@ Status VOlapTableSinkV2::send(RuntimeState* state, vectorized::Block* input_bloc
 SCOPED_CONSUME_MEM_TRACKER(_mem_tracker.get());
 Status status = Status::OK();
 
+if (state->query_options().dry_run_query) {
+return status;
+}
+
 auto input_rows = input_block->rows();
 auto input_bytes = input_block->bytes();
 if (UNLIKELY(input_rows == 0)) {
diff --git a/docs/en/docs/advanced/variables.md b/docs/en/docs/advanced/variables.md
index f2838dde23..d77c1f06ed 100644
--- a/docs/en/docs/advanced/variables.md
+++ b/docs/en/docs/advanced/variables.md
@@ -646,7 +646,7 @@ Translated with www.DeepL.com/Translator (free version)
 
 
 
-If set to true, for query requests, the actual result set will no longer be returned, but only the number of rows. The default is false.
+If set to true, for query requests, the actual result set will no longer be returned, but only the number of rows, while for load and insert, the data is discarded by sink node, no writing happens. The default is false.
 
 This parameter can be used to avoid the time-consuming result set transmission when testing a large number of data sets, and focus on the time-consuming underlying query execution.
 
diff --git a/docs/zh-CN/docs/advanced/variables.md b/docs/zh-CN/docs/advanced/variables.md
index 4587f86e81..6302ea604e 100644
--- a/docs/zh-CN/docs/advanced/variables.md
+++ b/docs/zh-CN/docs/advanced/variables.md
@@ -633,7 +633,7 @@ try (Connection conn = DriverManager.getConnection("jdbc:mysql://127.0.0.1:9030/
 
 
 
-如果设置为true,对于查询请求,将不再返回实际结果集,而仅返回行数。默认为 false。
+如果设置为true,对于查询请求,将不再返回实际结果集,而仅返回行数。对于导入和insert,Sink 丢掉了数据,不会有实际的写发生。额默认为 false。
 
 该参数可以用于测试返回大量数据集时,规避结果集传输的耗时,重点关注底层查询执行的耗时。
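
For illustration only, the early return added to both send() methods in this commit can be sketched in Python. This is a minimal sketch, not Doris code: QueryOptions, TabletSink, and the written list are hypothetical stand-ins for the query options held by the C++ RuntimeState and for the real sink, and the string "OK" stands in for Status::OK().

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryOptions:
    # Hypothetical stand-in for the query options carried by RuntimeState.
    dry_run_query: bool = False


@dataclass
class TabletSink:
    # Hypothetical sink; `written` records blocks that would reach the destination.
    options: QueryOptions
    written: List[list] = field(default_factory=list)

    def send(self, block: list) -> str:
        # Same shape as the guard in the diff: in dry-run mode, report success
        # without forwarding the block anywhere.
        if self.options.dry_run_query:
            return "OK"
        self.written.append(block)
        return "OK"


normal = TabletSink(QueryOptions(dry_run_query=False))
dry = TabletSink(QueryOptions(dry_run_query=True))
normal.send([1, 2, 3])
dry.send([1, 2, 3])
print(len(normal.written), len(dry.written))  # prints "1 0"
```

Because the guard sits before any row or byte accounting, a dry-run load in this sketch still returns success to the caller while nothing is recorded as written.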
 





[GitHub] [doris] xiaokang commented on a diff in pull request #23498: [Feature-Variant](Variant Type) support variant type

2023-09-03 Thread via GitHub


xiaokang commented on code in PR #23498:
URL: https://github.com/apache/doris/pull/23498#discussion_r1314143669


##
gensrc/proto/segment_v2.proto:
##
@@ -137,6 +137,20 @@ message ZoneMapPB {
 optional bool pass_all = 5 [default = false];
 }
 
+// For semi-structure column info, perisist info for PathInData
+message ColumnPathPartInfo {
+optional string key = 1;

Review Comment:
   What's the meaning of key? Add a comment for each field.



##
fe/fe-core/src/main/java/org/apache/doris/analysis/SlotRef.java:
##
@@ -72,6 +73,14 @@ public SlotRef(TableName tblName, String col) {
 this.label = "`" + col + "`";
 }
 
+public SlotRef(TableName tblName, String col, List subColLables) {
+super();
+this.tblName = tblName;
+this.col = col;
+this.label = "`" + col + "`";
+this.subColLables = subColLables;

Review Comment:
   It seems that label is the backtick-escaped column name. What's the meaning of subColLables?



##
fe/fe-core/src/main/java/org/apache/doris/analysis/SlotDescriptor.java:
##
@@ -304,6 +304,9 @@ public TSlotDescriptor toThrift() {
 tSlotDescriptor.setColUniqueId(column.getUniqueId());
 tSlotDescriptor.setIsKey(column.isKey());
 }
+if (subColLabels != null) {
+tSlotDescriptor.setColumnPaths(subColLabels);

Review Comment:
   The name is not consistent between thrift and java object.



##
gensrc/proto/segment_v2.proto:
##
@@ -161,7 +175,12 @@ message ColumnMetaPB {
 // required by array/struct/map reader to create child reader.
 optional uint64 num_rows = 11;
 repeated string children_column_names = 12;
+// persist info for PathInData that represents path in document, e.g. JSON.
+optional ColumnPathInfo column_path_info = 13;

Review Comment:
   Does column_path_info mean a single subcolumn of variant or all sub columns?



##
fe/fe-core/src/main/java/org/apache/doris/planner/FileLoadScanNode.java:
##
@@ -313,10 +313,6 @@ protected void finalizeParamsForLoad(ParamCreateContext context,
 String name = "jsonb_parse_" + nullable + "_error_to_null";
 expr = new FunctionCallExpr(name, args);
 expr.analyze(analyzer);
-} else if (dstType == PrimitiveType.VARIANT) {
-// Generate SchemaChange expr for dynamicly generating columns
-TableIf targetTbl = desc.getTable();
-expr = new SchemaChangeExpr((SlotRef) expr, (int) targetTbl.getId());

Review Comment:
   old dynamic table code?



##
fe/fe-core/src/main/java/org/apache/doris/analysis/SlotDescriptor.java:
##
@@ -64,7 +66,6 @@ public class SlotDescriptor {
 
 private ColumnStats stats;  // only set if 'column' isn't set
 private boolean isAgg;
-private boolean isMultiRef;

Review Comment:
   Why delete it?



##
gensrc/thrift/Descriptors.thrift:
##
@@ -59,6 +59,8 @@ struct TSlotDescriptor {
   // materialize them.Used to optmize to read less data and less memory usage
   13: optional bool need_materialize = true
   14: optional bool is_auto_increment = false;
+  // `$.a.b.c` => ['$', 'a', 'b', 'c']
+  15: optional list column_paths

Review Comment:
   Submit a small separate PR to change the thrift/pb for variant so it can be merged quickly.
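
The `$.a.b.c` => ['$', 'a', 'b', 'c'] mapping in the thrift comment above can be illustrated with a naive split on dots. This is only a sketch under that assumption: split_column_path is a hypothetical helper, not a Doris function, and it ignores quoted keys, escaped dots, and array subscripts.

```python
from typing import List


def split_column_path(path: str) -> List[str]:
    # Naive split on "."; assumes no key contains a dot.
    return path.split(".")


print(split_column_path("$.a.b.c"))  # prints ['$', 'a', 'b', 'c']
```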



##
fe/fe-common/src/main/java/org/apache/doris/catalog/Type.java:
##
@@ -725,6 +726,8 @@ public static boolean canCastTo(Type sourceType, Type targetType) {
 return true;
 } else if (sourceType.isStructType() && targetType.isStructType()) {
 return StructType.canCastTo((StructType) sourceType, (StructType) targetType);
+} else if (sourceType.isVariantType() && targetType.isArrayType()) {
+} else if (sourceType.isVariantType() && targetType.isArrayType()) {

Review Comment:
   Can variant be cast to struct or map?



##
be/src/common/config.h:
##
@@ -1109,6 +1107,12 @@ DECLARE_mInt64(lookup_connection_cache_bytes_limit);
 
 // level of compression when using LZ4_HC, whose defalut value is LZ4HC_CLEVEL_DEFAULT
 DECLARE_mInt64(LZ4_HC_compression_level);
+// Whether flatten nested arrays in variant column
+// Notice: TEST ONLY
+DECLARE_mBool(enable_flatten_nested_for_variant);

Review Comment:
   use variant_ prefix for variant related configurations



##
gensrc/proto/segment_v2.proto:
##
@@ -137,6 +137,20 @@ message ZoneMapPB {
 optional bool pass_all = 5 [default = false];
 }
 
+// For semi-structure column info, perisist info for PathInData
+message ColumnPathPartInfo {
+optional string key = 1;
+optional bool is_nested = 2;
+optional uint32 anonymous_array_level = 3;
+}
+
+message ColumnPathInfo {
+optional string path = 1;
+repeated ColumnPathPartInfo path_part_infos = 2;

Review Comment:
   Why is path part info a list?



##
gensrc/proto/olap_file.proto:
##
@@ -206,6 +206,8 @@ message ColumnPB {
 repeate

[GitHub] [doris] dataroaring merged pull request #22491: [typo][doc]modify some error descriptions.

2023-09-03 Thread via GitHub


dataroaring merged PR #22491:
URL: https://github.com/apache/doris/pull/22491





[doris] branch master updated (89eacd4751 -> 8c213f8498)

2023-09-03 Thread dataroaring
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


from 89eacd4751 [enhancement](load) support dry_run_query for load (#23680)
 add 8c213f8498 [typo][doc]modify some error decriptions. (#22491)

No new revisions were added by this update.

Summary of changes:
 docs/en/docs/admin-manual/maint-monitor/metadata-operation.md| 2 +-
 docs/zh-CN/docs/admin-manual/maint-monitor/metadata-operation.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)





[GitHub] [doris] jackwener commented on pull request #23563: [fix](Nereids): resolve to parse unnormalized date literal

2023-09-03 Thread via GitHub


jackwener commented on PR #23563:
URL: https://github.com/apache/doris/pull/23563#issuecomment-1704035290

   run buildall





[GitHub] [doris] hello-stephen commented on pull request #22869: [Enhancement](multi-catalog) merge hms partition events.

2023-09-03 Thread via GitHub


hello-stephen commented on PR #22869:
URL: https://github.com/apache/doris/pull/22869#issuecomment-1704035389

   (From new machine) TeamCity pipeline, clickbench performance test result:
   the sum of best hot time: 49.42 seconds
   stream load tsv:      529 seconds loaded 74807831229 Bytes, about 134 MB/s
   stream load json:     20 seconds loaded 2358488459 Bytes, about 112 MB/s
   stream load orc:      64 seconds loaded 1101869774 Bytes, about 16 MB/s
   stream load parquet:  30 seconds loaded 861443392 Bytes, about 27 MB/s
   insert into select:   29.2 seconds inserted 1000 Rows, about 342K ops/s
   storage size: 17162097605 Bytes
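
The rounded MB/s figures in these reports appear to be consistent with dividing the byte count by the elapsed seconds and by 2^20, then truncating. That reading is inferred from the numbers and is not stated by the report itself; throughput_mib_per_s is a hypothetical helper name used only for this check.

```python
MIB = 1024 * 1024


def throughput_mib_per_s(total_bytes: int, seconds: int) -> int:
    # Whole mebibytes per second, truncated (integer division avoids float error).
    return total_bytes // (seconds * MIB)


# Byte counts and durations copied from the report above.
stream_loads = {
    "tsv": (74807831229, 529),
    "json": (2358488459, 20),
    "orc": (1101869774, 64),
    "parquet": (861443392, 30),
}

for name, (total_bytes, seconds) in stream_loads.items():
    print(f"stream load {name}: about {throughput_mib_per_s(total_bytes, seconds)} MB/s")
# prints 134, 112, 16 and 27 MB/s for tsv, json, orc and parquet respectively
```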





[GitHub] [doris] hello-stephen commented on pull request #22155: [improvement](case function) add check to avoid stack overflow

2023-09-03 Thread via GitHub


hello-stephen commented on PR #22155:
URL: https://github.com/apache/doris/pull/22155#issuecomment-1704036076

   (From new machine) TeamCity pipeline, clickbench performance test result:
   the sum of best hot time: 48.97 seconds
   stream load tsv:      530 seconds loaded 74807831229 Bytes, about 134 MB/s
   stream load json:     20 seconds loaded 2358488459 Bytes, about 112 MB/s
   stream load orc:      64 seconds loaded 1101869774 Bytes, about 16 MB/s
   stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
   insert into select:   28.9 seconds inserted 1000 Rows, about 346K ops/s
   storage size: 17162052993 Bytes





[GitHub] [doris] yujun777 commented on pull request #23543: [improvement](tablet schedule) colocate balance between all groups

2023-09-03 Thread via GitHub


yujun777 commented on PR #23543:
URL: https://github.com/apache/doris/pull/23543#issuecomment-1704036579

   run buildall





[GitHub] [doris] yujun777 commented on pull request #23543: [improvement](tablet schedule) colocate balance between all groups

2023-09-03 Thread via GitHub


yujun777 commented on PR #23543:
URL: https://github.com/apache/doris/pull/23543#issuecomment-1704037367

   run buildall





[GitHub] [doris] hello-stephen commented on pull request #21090: [Improvement](fe)Simplified type system initialization Array List

2023-09-03 Thread via GitHub


hello-stephen commented on PR #21090:
URL: https://github.com/apache/doris/pull/21090#issuecomment-1704038002

   (From new machine) TeamCity pipeline, clickbench performance test result:
   the sum of best hot time: 48.54 seconds
   stream load tsv:      531 seconds loaded 74807831229 Bytes, about 134 MB/s
   stream load json:     20 seconds loaded 2358488459 Bytes, about 112 MB/s
   stream load orc:      64 seconds loaded 1101869774 Bytes, about 16 MB/s
   stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
   insert into select:   29.1 seconds inserted 1000 Rows, about 343K ops/s
   storage size: 17162153587 Bytes





[GitHub] [doris] zzzzzzzs opened a new pull request, #23797: [Enhancement](Load) stream tvf support csv header

2023-09-03 Thread via GitHub


zzzzzzzs opened a new pull request, #23797:
URL: https://github.com/apache/doris/pull/23797

   ## Proposed changes
   
   Issue Number: close https://github.com/apache/doris/issues/23678
   
   
   
   ## Further comments
   
   stream tvf support csv header
   csv_with_names
   ```
   id,name,age
   1,alice,18
   2,bob,20
   3,jack,24
   4,jackson,19
   5,liming,18
   6,luffy,20
   7,zoro,22
   8,sanzi,26
   9,wusuopu,21
   10,nami,18
   ```
   csv_with_names_and_types
   ```
   id,name,age
   INT,STRING,INT
   1,alice,18
   2,bob,20
   3,jack,24
   4,jackson,19
   5,liming,18
   6,luffy,20
   7,zoro,22
   8,sanzi,26
   9,wusuopu,21
   10,nami,18
   ```
   example:
   ```
   curl -v --location-trusted -u root: -H "sql: insert into test.t1(id, name, age) select cast(id as INT) as id, name, age from http_stream(\"format\" = \"csv_with_names\", \"column_separator\" = \",\")" -T csv_with_names.txt http://127.0.0.1:8030/api/_http_stream
   ```
   
   ```
   curl -v --location-trusted -u root: -H "sql: insert into test.t1(id, name, age) select cast(id as INT) as id, name, age from http_stream(\"format\" = \"csv_with_names_and_types\", \"column_separator\" = \",\")" -T csv_with_names_and_types.txt http://127.0.0.1:8030/api/_http_stream
   ```
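
As a rough illustration of how the two header layouts in the sample files differ, the sketch below separates header lines from data rows. It is a minimal sketch, not the Doris reader: split_header is a hypothetical helper, and it assumes, as the samples above suggest, that csv_with_names has one header line (column names) and csv_with_names_and_types has two (column names, then types).

```python
import csv
import io
from typing import List, Optional, Tuple


def split_header(
    text: str, fmt: str, sep: str = ","
) -> Tuple[List[str], Optional[List[str]], List[List[str]]]:
    # Parse every line first, then peel off one or two header lines by format.
    rows = list(csv.reader(io.StringIO(text), delimiter=sep))
    if fmt == "csv_with_names":
        return rows[0], None, rows[1:]
    if fmt == "csv_with_names_and_types":
        return rows[0], rows[1], rows[2:]
    raise ValueError(f"unsupported format: {fmt}")


sample = "id,name,age\nINT,STRING,INT\n1,alice,18\n2,bob,20\n"
names, types, data = split_header(sample, "csv_with_names_and_types")
print(names)  # prints ['id', 'name', 'age']
print(types)  # prints ['INT', 'STRING', 'INT']
print(data)   # prints [['1', 'alice', '18'], ['2', 'bob', '20']]
```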





[GitHub] [doris] zzzzzzzs commented on pull request #23797: [Enhancement](Load) stream tvf support csv header

2023-09-03 Thread via GitHub


zzzzzzzs commented on PR #23797:
URL: https://github.com/apache/doris/pull/23797#issuecomment-1704040743

   run buildall





[GitHub] [doris] hello-stephen commented on pull request #23563: [fix](Nereids): resolve to parse unnormalized date literal

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23563:
URL: https://github.com/apache/doris/pull/23563#issuecomment-1704040972

   (From new machine) TeamCity pipeline, clickbench performance test result:
   the sum of best hot time: 46.4 seconds
   stream load tsv:      530 seconds loaded 74807831229 Bytes, about 134 MB/s
   stream load json:     20 seconds loaded 2358488459 Bytes, about 112 MB/s
   stream load orc:      64 seconds loaded 1101869774 Bytes, about 16 MB/s
   stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
   insert into select:   29.0 seconds inserted 1000 Rows, about 344K ops/s
   storage size: 17162135738 Bytes





[GitHub] [doris] github-actions[bot] commented on pull request #23792: [improvement](show backends) show backends print trash used

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23792:
URL: https://github.com/apache/doris/pull/23792#issuecomment-1704041576

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #23792: [improvement](show backends) show backends print trash used

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23792:
URL: https://github.com/apache/doris/pull/23792#issuecomment-1704041587

   PR approved by anyone and no changes requested.





[GitHub] [doris] hello-stephen commented on pull request #23543: [improvement](tablet schedule) colocate balance between all groups

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23543:
URL: https://github.com/apache/doris/pull/23543#issuecomment-1704043993

   (From new machine) TeamCity pipeline, clickbench performance test result:
   the sum of best hot time: 46.37 seconds
   stream load tsv:      527 seconds loaded 74807831229 Bytes, about 135 MB/s
   stream load json:     20 seconds loaded 2358488459 Bytes, about 112 MB/s
   stream load orc:      64 seconds loaded 1101869774 Bytes, about 16 MB/s
   stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
   insert into select:   28.9 seconds inserted 1000 Rows, about 346K ops/s
   storage size: 17162244454 Bytes





[GitHub] [doris] morningman closed pull request #16661: [Enhencement](CsvReader) Optimize CsvReader

2023-09-03 Thread via GitHub


morningman closed pull request #16661: [Enhencement](CsvReader) Optimize CsvReader
URL: https://github.com/apache/doris/pull/16661





[GitHub] [doris] hello-stephen commented on pull request #23797: [Enhancement](Load) stream tvf support csv header

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23797:
URL: https://github.com/apache/doris/pull/23797#issuecomment-1704047410

   (From new machine) TeamCity pipeline, clickbench performance test result:
   the sum of best hot time: 46.2 seconds
   stream load tsv:      530 seconds loaded 74807831229 Bytes, about 134 MB/s
   stream load json:     20 seconds loaded 2358488459 Bytes, about 112 MB/s
   stream load orc:      64 seconds loaded 1101869774 Bytes, about 16 MB/s
   stream load parquet:  30 seconds loaded 861443392 Bytes, about 27 MB/s
   insert into select:   29.1 seconds inserted 1000 Rows, about 343K ops/s
   storage size: 17162152881 Bytes





[GitHub] [doris] github-actions[bot] commented on pull request #23751: [Improvement](test)Add property to support manually use auto analyzer to analyze db.

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23751:
URL: https://github.com/apache/doris/pull/23751#issuecomment-1704048800

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #23751: [Improvement](test)Add property to support manually use auto analyzer to analyze db.

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23751:
URL: https://github.com/apache/doris/pull/23751#issuecomment-1704048808

   PR approved by anyone and no changes requested.





[GitHub] [doris] morningman commented on a diff in pull request #23635: [fix](Export) Concatenation the outfile sql for Export

2023-09-03 Thread via GitHub


morningman commented on code in PR #23635:
URL: https://github.com/apache/doris/pull/23635#discussion_r1314184096


##
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ExportCommand.java:
##
@@ -49,35 +100,248 @@ public class ExportCommand extends Command implements ForwardWithSync {
 public ExportCommand(List nameParts, List partitions, String whereSql, String path,
 Map fileProperties, BrokerDesc brokerDesc) {
 super(PlanType.EXPORT_COMMAND);
-this.nameParts = nameParts;
+this.nameParts = ImmutableList.copyOf(Objects.requireNonNull(nameParts, "nameParts should not be null"));
+this.path = Objects.requireNonNull(path.trim(), "export path should not be null");
 this.partitionsNameList = partitions;
 this.whereSql = whereSql;
-this.path = path.trim();
 this.fileProperties = fileProperties;
-this.brokerDesc = brokerDesc;
+if (brokerDesc == null) {
+this.brokerDesc = new BrokerDesc("local", StorageBackend.StorageType.LOCAL, null);
+} else {
+this.brokerDesc = brokerDesc;
+}
 }
 
 @Override
 public void run(ConnectContext ctx, StmtExecutor executor) throws Exception {
-ExportStmt exportStmt = generateExportStmt();
-Analyzer analyzer = new Analyzer(ctx.getEnv(), ctx);
-exportStmt.analyze(analyzer);
-ctx.getEnv().getExportMgr().addExportJobAndRegisterTask(exportStmt);
+// get tblName
+TableName tblName = getTableName(ctx);
+
+// check auth
+if (!Env.getCurrentEnv().getAccessManager().checkTblPriv(ctx, tblName.getDb(), tblName.getTbl(),
+PrivPredicate.SELECT)) {
+ErrorReport.reportAnalysisException(ErrorCode.ERR_TABLEACCESS_DENIED_ERROR, "EXPORT",
+ctx.getQualifiedUser(),
+ctx.getRemoteIP(),
+tblName.getDb() + ": " + tblName.getTbl());
+}
+
+// convert key to lowercase
+Map lowercaseProperties = convertPropertyKeyToLowercase(fileProperties);
+
+// check phases
+checkAllParameter(ctx, tblName, lowercaseProperties);
+
+ExportJob exportJob = generateExportJob(ctx, lowercaseProperties, tblName);
+// register job
+ctx.getEnv().getExportMgr().addExportJobAndRegisterTask(exportJob);
+}
+
+private void checkAllParameter(ConnectContext ctx, TableName tblName, Map lowercaseProperties)

Review Comment:
   ```suggestion
   private void checkAllParameters(ConnectContext ctx, TableName tblName, Map lowercaseProperties)
   ```






[GitHub] [doris] github-actions[bot] commented on pull request #23635: [fix](Export) Concatenation the outfile sql for Export

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23635:
URL: https://github.com/apache/doris/pull/23635#issuecomment-1704056439

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #23635: [fix](Export) Concatenation the outfile sql for Export

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23635:
URL: https://github.com/apache/doris/pull/23635#issuecomment-1704056450

   PR approved by anyone and no changes requested.





[GitHub] [doris] xiaokang commented on a diff in pull request #23498: [Feature-Variant](Variant Type) support variant type

2023-09-03 Thread via GitHub


xiaokang commented on code in PR #23498:
URL: https://github.com/apache/doris/pull/23498#discussion_r1314162358


##
be/src/olap/delta_writer.cpp:
##
@@ -51,6 +51,7 @@
 #include "util/mem_info.h"
 #include "util/ref_count_closure.h"
 #include "util/stopwatch.hpp"
+#include "util/time.h"

Review Comment:
   useless include?



##
be/src/olap/rowset/segment_v2/segment_writer.cpp:
##
@@ -598,7 +575,8 @@ Status 
SegmentWriter::fill_missing_columns(vectorized::MutableColumns& mutable_f
 if (tablet_column.has_default_value()) {
 mutable_full_columns[cids_missing[i]]->insert_from(
 *mutable_default_value_columns[i].get(), 0);
-} else if (tablet_column.is_nullable()) {
+} else if (tablet_column.is_nullable() &&
+   
mutable_full_columns[cids_missing[i]]->can_be_inside_nullable()) {

Review Comment:
   unrelated to variant?



##
be/src/vec/columns/column.h:
##
@@ -145,15 +145,19 @@ class IColumn : public COW {
 return nullptr;
 }
 
+/// Some columns may require finalization before using of other operations.
+virtual void finalize() {}
+
+MutablePtr clone_finalized() const {
+auto finalized = IColumn::mutate(get_ptr());

Review Comment:
   no clone



##
be/src/olap/rowset/beta_rowset_writer_v2.h:
##
@@ -102,6 +98,8 @@ class BetaRowsetWriterV2 : public RowsetWriter {
 return nullptr;
 }
 
+RowsetWriterContext& mutable_context() override { LOG(FATAL) << "not 
implemented"; }

Review Comment:
   Is v2 not used?



##
be/src/olap/rowset/rowset_writer_context.h:
##
@@ -40,24 +40,25 @@ struct RowsetWriterContext {
 RowsetWriterContext()
 : tablet_id(0),
   tablet_schema_hash(0),
-  index_id(0),
   partition_id(0),
+  index_id(0),
   rowset_type(BETA_ROWSET),
   rowset_state(PREPARED),
   version(Version(0, 0)),
   sender_id(0),
   txn_id(0),
   tablet_uid(0, 0),
-  segments_overlap(OVERLAP_UNKNOWN) {
+  segments_overlap(OVERLAP_UNKNOWN),
+  schema_lock(new std::mutex) {
 load_id.set_hi(0);
 load_id.set_lo(0);
 }
 
 RowsetId rowset_id;
 int64_t tablet_id;
 int64_t tablet_schema_hash;
-int64_t index_id;
 int64_t partition_id;
+int64_t index_id;

Review Comment:
   why change the order of index_id?



##
be/src/olap/rowset/rowset_writer.h:
##
@@ -151,6 +151,8 @@ class RowsetWriter {
 
 virtual int64_t segment_writer_ns() { return 0; }
 
+virtual RowsetWriterContext& mutable_context() = 0;

Review Comment:
   why need a mutable context?



##
be/src/vec/columns/column.h:
##
@@ -145,15 +145,19 @@ class IColumn : public COW {
 return nullptr;
 }
 
+/// Some columns may require finalization before using of other operations.
+virtual void finalize() {}
+
+MutablePtr clone_finalized() const {
+auto finalized = IColumn::mutate(get_ptr());
+finalized->finalize();
+return finalized;
+}
+
 // Only used on ColumnDictionary
 virtual void set_rowset_segment_id(std::pair 
rowset_segment_id) {}
 
 virtual std::pair get_rowset_segment_id() const { 
return {}; }
-// todo(Amory) from column to get data type is not correct ,column is 
memory data,can not to assume memory data belong to which data type
-virtual TypeIndex get_data_type() const {

Review Comment:
   why delete it



##
be/src/olap/rowset/beta_rowset_writer.cpp:
##
@@ -750,76 +757,130 @@ Status 
BetaRowsetWriter::flush_segment_writer_for_segcompaction(
 return Status::OK();
 }
 
-Status BetaRowsetWriter::_unfold_variant_column(vectorized::Block& block,
-TabletSchemaSPtr& 
flush_schema) {
-if (block.rows() == 0) {
+Status BetaRowsetWriter::expand_variant_to_subcolumns(vectorized::Block& block,
+  TabletSchemaSPtr& 
flush_schema) {
+size_t num_rows = block.rows();
+if (num_rows == 0) {
 return Status::OK();
 }
 
-// Sanitize block to match exactly from the same type of frontend meta
-vectorized::schema_util::FullBaseSchemaView schema_view;
-schema_view.table_id = _context.tablet_schema->table_id();
-vectorized::ColumnWithTypeAndName* variant_column =
-block.try_get_by_name(BeConsts::DYNAMIC_COLUMN_NAME);
-if (!variant_column) {
-return Status::OK();
+std::vector variant_column_pos;
+if (_context.tablet_schema->is_partial_update()) {
+// check columns that used to do partial updates should not include 
variant
+for (int i : _context.tablet_schema->get_update_cids()) {
+if (_context.tablet_schema->co

[GitHub] [doris] zzzzzzzs opened a new pull request, #23798: [Enhancement](Load) stream tvf support two phase commit

2023-09-03 Thread via GitHub


zzzzzzzs opened a new pull request, #23798:
URL: https://github.com/apache/doris/pull/23798

   ## Proposed changes
   
   Issue Number: close https://github.com/apache/doris/issues/23678
   
   
   
   ## Further comments
   
   ``` shell
   curl -v --location-trusted -u root: -H "two_phase_commit:true" -H "sql: insert into test.t1(c1, c2) select c1, c2 from http_stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_http_stream
   ```
   result:
   ```
   {
   "TxnId": 3008,
   "Label": "c38b6252-69a4-4aa6-8d82-7645a85f50d0",
   "Comment": "",
   "TwoPhaseCommit": "true",
   "Status": "Success",
   "Message": "OK",
   "NumberTotalRows": 2,
   "NumberLoadedRows": 2,
   "NumberFilteredRows": 0,
   "NumberUnselectedRows": 0,
   "LoadBytes": 26,
   "LoadTimeMs": 132,
   "BeginTxnTimeMs": 0,
   "StreamLoadPutTimeMs": 35,
   "ReadDataTimeMs": 35,
   "WriteDataTimeMs": 88,
   "CommitAndPublishTimeMs": 0
   }
   ```
   two phase commit
   ```shell
   curl -X PUT --location-trusted -u root: -H "txn_id:3008" -H "txn_operation:commit" http://127.0.0.1:8030/api/test/t1/_stream_load_2pc
   ```
   ```
   {
   "status": "Success",
   "msg": "transaction [3008] commit successfully."
   }
   ```
   
   





[GitHub] [doris] zzzzzzzs closed pull request #23798: [Enhancement](Load) stream tvf support two phase commit

2023-09-03 Thread via GitHub


zzzzzzzs closed pull request #23798: [Enhancement](Load) stream tvf support two phase commit
URL: https://github.com/apache/doris/pull/23798





[GitHub] [doris] dutyu opened a new pull request, #23799: [Fix](multi-catalog) Do not throw exceptions when hdfs file not exists.

2023-09-03 Thread via GitHub


dutyu opened a new pull request, #23799:
URL: https://github.com/apache/doris/pull/23799

   
   ## Proposed changes
   
   ```
   errCode = 2, detailMessage = (xxx.xxx.xxx.xxx)[CANCELLED][INTERNAL_ERROR]failed to init reader for file hdfs://xxx/dwd_tmp.db/check_dam_table_relation_record_day_data/part-0-c4ee3118-ae94-4bf7-8c40-1f12da07a292-c000.snappy.orc, err: [INTERNAL_ERROR]Init OrcReader failed. reason = Failed to read hdfs://xxx/dwd_tmp.db/check_dam_table_relation_record_day_data/part-0-c4ee3118-ae94-4bf7-8c40-1f12da07a292-c000.snappy.orc: [INTERNAL_ERROR]Read hdfs file failed. (BE: xxx.xxx.xxx.xxx) namenode:hdfs://xxx/dwd_tmp.db/check_dam_table_relation_record_day_data/part-0-c4ee3118-ae94-4bf7-8c40-1f12da07a292-c000.snappy.orc, err: (2), No such file or directory), reason: RemoteException: File does not exist: /xxx/dwd_tmp.db/check_dam_table_relation_record_day_data/part-0-c4ee3118-ae94-4bf7-8c40-1f12da07a292-c000.snappy.orc
       at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
       at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
       at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:158)
       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1927)
       at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:738)
       at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:426)
       at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
       at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:422)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
   ```
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #23798: [Enhancement](Load) stream tvf support two phase commit

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23798:
URL: https://github.com/apache/doris/pull/23798#issuecomment-1704068845

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] zzzzzzzs opened a new pull request, #23800: [Enhancement](Load) stream tvf support two phase commit

2023-09-03 Thread via GitHub


zzzzzzzs opened a new pull request, #23800:
URL: https://github.com/apache/doris/pull/23800

   ## Proposed changes
   
   Issue Number: close https://github.com/apache/doris/issues/23678
   
   
   
   ## Further comments
   ```
   curl -v --location-trusted -u root: -H "two_phase_commit:true" -H "sql: insert into test.t1(c1, c2) select c1, c2 from http_stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_http_stream
   ```
   result:
   ```
   {
   "TxnId": 3008,
   "Label": "c38b6252-69a4-4aa6-8d82-7645a85f50d0",
   "Comment": "",
   "TwoPhaseCommit": "true",
   "Status": "Success",
   "Message": "OK",
   "NumberTotalRows": 2,
   "NumberLoadedRows": 2,
   "NumberFilteredRows": 0,
   "NumberUnselectedRows": 0,
   "LoadBytes": 26,
   "LoadTimeMs": 132,
   "BeginTxnTimeMs": 0,
   "StreamLoadPutTimeMs": 35,
   "ReadDataTimeMs": 35,
   "WriteDataTimeMs": 88,
   "CommitAndPublishTimeMs": 0
   }
   ```
   
   two phase commit
   ```
   curl -X PUT --location-trusted -u root: -H "txn_id:3008" -H "txn_operation:commit" http://127.0.0.1:8030/api/test/t1/_stream_load_2pc
   ```
   ```
   {
   "status": "Success",
   "msg": "transaction [3008] commit successfully."
   }
   ```
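
   The commit step above could also be issued from Java. The following is a minimal sketch, assuming Java 11+ and the standard `java.net.http` API; the class name, method name, and parameters are hypothetical, and only the URL shape and the `txn_id` / `txn_operation` headers come from the curl command above. It only builds the request; sending it and handling redirects (what `--location-trusted` does for curl) are left out.

   ```java
   import java.net.URI;
   import java.net.http.HttpRequest;
   import java.nio.charset.StandardCharsets;
   import java.util.Base64;

   // Hypothetical helper: builds the PUT request that commits a pre-committed transaction.
   final class TwoPhaseCommitRequestSketch {
       static HttpRequest commitRequest(String feHostPort, String db, String table, long txnId) {
           // Same endpoint shape as the curl example: /api/{db}/{table}/_stream_load_2pc
           URI uri = URI.create("http://" + feHostPort + "/api/" + db + "/" + table + "/_stream_load_2pc");
           // Equivalent of curl's "-u root:" (user root, empty password) as a Basic auth header.
           String basic = Base64.getEncoder().encodeToString("root:".getBytes(StandardCharsets.UTF_8));
           return HttpRequest.newBuilder(uri)
                   .header("Authorization", "Basic " + basic)
                   .header("txn_id", Long.toString(txnId))
                   .header("txn_operation", "commit")
                   .PUT(HttpRequest.BodyPublishers.noBody())
                   .build();
       }

       public static void main(String[] args) {
           HttpRequest request = commitRequest("127.0.0.1:8030", "test", "t1", 3008L);
           System.out.println(request.method() + " " + request.uri());
       }
   }
   ```

   The `TxnId` returned by the first request is the value to pass as `txnId` here.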
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #23799: [Fix](multi-catalog) Do not throw exceptions when hdfs file not exists.

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23799:
URL: https://github.com/apache/doris/pull/23799#issuecomment-1704070207

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] zzzzzzzs commented on pull request #23800: [Enhancement](Load) stream tvf support two phase commit

2023-09-03 Thread via GitHub


zzzzzzzs commented on PR #23800:
URL: https://github.com/apache/doris/pull/23800#issuecomment-1704070275

   run buildall





[GitHub] [doris] dutyu commented on pull request #23799: [Fix](multi-catalog) Do not throw exceptions when hdfs file not exists.

2023-09-03 Thread via GitHub


dutyu commented on PR #23799:
URL: https://github.com/apache/doris/pull/23799#issuecomment-1704070316

   run buildall





[GitHub] [doris] github-actions[bot] commented on pull request #23800: [Enhancement](Load) stream tvf support two phase commit

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23800:
URL: https://github.com/apache/doris/pull/23800#issuecomment-1704071909

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] dutyu commented on pull request #23391: [Feature](multi-catalog) Support sql cache for hms catalog

2023-09-03 Thread via GitHub


dutyu commented on PR #23391:
URL: https://github.com/apache/doris/pull/23391#issuecomment-1704075467

   run buildall





[GitHub] [doris] hello-stephen commented on pull request #23800: [Enhancement](Load) stream tvf support two phase commit

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23800:
URL: https://github.com/apache/doris/pull/23800#issuecomment-1704101796

   (From new machine) TeamCity pipeline, clickbench performance test result:
   the sum of best hot time: 50.38 seconds
   stream load tsv: 535 seconds loaded 74807831229 Bytes, about 133 MB/s
   stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
   stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
   stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
   insert into select: 29.1 seconds inserted 1000 Rows, about 343K ops/s
   storage size: 17161865472 Bytes
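
   The "about N MB/s" figures can be reproduced from the byte and second counts above. This is a minimal sketch under one assumption that the report does not state: MB is taken as 1024 * 1024 bytes and the result is truncated rather than rounded. The class and method names are hypothetical.

   ```java
   // Hypothetical helper: recomputes the throughput figures from the reported bytes and seconds.
   final class ThroughputSketch {
       // Assumption: 1 MB = 1024 * 1024 bytes, with integer truncation of the result.
       static long mbPerSecond(long bytes, long seconds) {
           return bytes / seconds / (1024L * 1024L);
       }

       public static void main(String[] args) {
           System.out.println("tsv:     " + mbPerSecond(74807831229L, 535));
           System.out.println("json:    " + mbPerSecond(2358488459L, 20));
           System.out.println("orc:     " + mbPerSecond(1101869774L, 64));
           System.out.println("parquet: " + mbPerSecond(861443392L, 31));
       }
   }
   ```

   Under that assumption the four results match the reported 133, 112, 16 and 26 MB/s; the parquet value is about 26.5 before truncation, which is why rounding would not reproduce the report.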





[GitHub] [doris] hello-stephen commented on pull request #23799: [Fix](multi-catalog) Do not throw exceptions when hdfs file not exists.

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23799:
URL: https://github.com/apache/doris/pull/23799#issuecomment-1704121879

   (From new machine) TeamCity pipeline, clickbench performance test result:
   the sum of best hot time: 47.78 seconds
   stream load tsv: 531 seconds loaded 74807831229 Bytes, about 134 MB/s
   stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
   stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
   stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
   insert into select: 28.9 seconds inserted 1000 Rows, about 346K ops/s
   storage size: 17162150760 Bytes





[GitHub] [doris] hello-stephen commented on pull request #23391: [Feature](multi-catalog) Support sql cache for hms catalog

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23391:
URL: https://github.com/apache/doris/pull/23391#issuecomment-1704139487

   (From new machine) TeamCity pipeline, clickbench performance test result:
   the sum of best hot time: 48.42 seconds
   stream load tsv: 530 seconds loaded 74807831229 Bytes, about 134 MB/s
   stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
   stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
   stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
   insert into select: 29.2 seconds inserted 1000 Rows, about 342K ops/s
   storage size: 17161880421 Bytes





[GitHub] [doris] morningman commented on pull request #23801: [fix](row-policy) fix creating row policy with forward issue

2023-09-03 Thread via GitHub


morningman commented on PR #23801:
URL: https://github.com/apache/doris/pull/23801#issuecomment-1704152632

   rum buildall





[GitHub] [doris] morningman opened a new pull request, #23801: [fix](row-policy) fix creating row policy with forward issue

2023-09-03 Thread via GitHub


morningman opened a new pull request, #23801:
URL: https://github.com/apache/doris/pull/23801

   ## Proposed changes
   
   The `CreateRowPolicyCommand` is implemented without overriding the `run()` method.
   
   So when `create row policy` is executed on a non-master FE and forwarded to the Master FE,
   it calls the `execute(TUniqueId queryId)` method and goes through `executeByNereids()`.
   Because there is no `run()` method, it does nothing and returns OK.
   So a subsequent `show row policy` returns an empty result.
   
   This PR fixes it by implementing the `run()` method so that it throws an Exception, which makes
   execution fall back to the old planner and run the create row policy command normally.
   
   The full implementation of the `run()` method should be added later.
   This is just a temporary fix.
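   
   The fallback described above can be illustrated with a minimal sketch. Every name in it is hypothetical and none of it is the Doris API; it only shows why a no-op `run()` reports OK without doing any work, while a throwing `run()` routes the statement to the legacy path.

   ```java
   import java.util.function.Supplier;

   // Hypothetical names throughout; this only models the control flow, not Doris classes.
   final class PlannerFallbackSketch {
       interface Command {
           void run() throws Exception;
       }

       // Tries the new-planner command first; any exception routes the statement to the legacy path.
       static String execute(Command command, Supplier<String> legacyPlanner) {
           try {
               command.run();
               return "OK (new planner)";
           } catch (Exception e) {
               return legacyPlanner.get();
           }
       }

       public static void main(String[] args) {
           Supplier<String> legacy = () -> "OK (legacy planner created the policy)";
           // Before the fix: run() is effectively a no-op, so OK is returned but nothing was created.
           System.out.println(execute(() -> { }, legacy));
           // After the fix: run() throws, so the legacy planner handles the statement.
           System.out.println(execute(() -> {
               throw new UnsupportedOperationException("not implemented");
           }, legacy));
       }
   }
   ```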
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] dataroaring merged pull request #23792: [improvement](show backends) show backends print trash used

2023-09-03 Thread via GitHub


dataroaring merged PR #23792:
URL: https://github.com/apache/doris/pull/23792





[doris] branch master updated: [improvement](show backends) show backends print trash used (#23792)

2023-09-03 Thread dataroaring
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new acbd8ca185 [improvement](show backends) show backends print trash used 
(#23792)
acbd8ca185 is described below

commit acbd8ca1851debec99cbd653911ee94be268c6f3
Author: yujun 
AuthorDate: Sun Sep 3 20:30:58 2023 +0800

[improvement](show backends) show backends print trash used (#23792)
---
 be/src/agent/task_worker_pool.cpp |  1 +
 be/src/agent/task_worker_pool.h   |  2 ++
 be/src/olap/data_dir.cpp  | 11 +++
 be/src/olap/data_dir.h|  5 +
 be/src/olap/olap_common.h |  1 +
 be/src/olap/storage_engine.cpp| 11 ++-
 be/src/olap/storage_engine.h  |  3 ++-
 be/src/service/backend_service.cpp| 19 ++-
 .../main/java/org/apache/doris/catalog/DiskInfo.java  | 16 ++--
 .../org/apache/doris/common/proc/BackendsProcDir.java | 10 --
 .../main/java/org/apache/doris/system/Backend.java| 13 +
 .../tablefunction/BackendsTableValuedFunction.java|  1 +
 .../apache/doris/tablefunction/MetadataGenerator.java |  3 +++
 .../apache/doris/utframe/DemoMultiBackendsTest.java   |  2 +-
 gensrc/thrift/MasterService.thrift|  1 +
 .../external_table_p0/tvf/test_backends_tvf.groovy|  2 +-
 .../nereids_syntax_p0/information_schema.groovy   |  2 +-
 17 files changed, 85 insertions(+), 18 deletions(-)

diff --git a/be/src/agent/task_worker_pool.cpp 
b/be/src/agent/task_worker_pool.cpp
index 71fe09b9d9..625fa7abe7 100644
--- a/be/src/agent/task_worker_pool.cpp
+++ b/be/src/agent/task_worker_pool.cpp
@@ -678,6 +678,7 @@ void 
TaskWorkerPool::_report_disk_state_worker_thread_callback() {
 disk.__set_data_used_capacity(root_path_info.local_used_capacity);
 
disk.__set_remote_used_capacity(root_path_info.remote_used_capacity);
 disk.__set_disk_available_capacity(root_path_info.available);
+disk.__set_trash_used_capacity(root_path_info.trash_used_capacity);
 disk.__set_used(root_path_info.is_used);
 request.disks[root_path_info.path] = disk;
 }
diff --git a/be/src/agent/task_worker_pool.h b/be/src/agent/task_worker_pool.h
index 598c77d3ef..83fcc1ff1d 100644
--- a/be/src/agent/task_worker_pool.h
+++ b/be/src/agent/task_worker_pool.h
@@ -181,6 +181,8 @@ public:
 // notify the worker. currently for task/disk/tablet report thread
 void notify_thread();
 
+TaskWorkerType task_worker_type() const { return _task_worker_type; }
+
 protected:
 bool _register_task_info(const TTaskType::type task_type, int64_t 
signature);
 void _remove_task_info(const TTaskType::type task_type, int64_t signature);
diff --git a/be/src/olap/data_dir.cpp b/be/src/olap/data_dir.cpp
index bc488d06a4..7b4b88d9f9 100644
--- a/be/src/olap/data_dir.cpp
+++ b/be/src/olap/data_dir.cpp
@@ -75,6 +75,7 @@ DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(disks_total_capacity, 
MetricUnit::BYTES);
 DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(disks_avail_capacity, MetricUnit::BYTES);
 DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(disks_local_used_capacity, 
MetricUnit::BYTES);
 DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(disks_remote_used_capacity, 
MetricUnit::BYTES);
+DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(disks_trash_used_capacity, 
MetricUnit::BYTES);
 DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(disks_state, MetricUnit::BYTES);
 DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(disks_compaction_score, MetricUnit::NOUNIT);
 DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(disks_compaction_num, MetricUnit::NOUNIT);
@@ -88,6 +89,7 @@ DataDir::DataDir(const std::string& path, int64_t 
capacity_bytes,
   _fs(io::LocalFileSystem::create(path)),
   _available_bytes(0),
   _disk_capacity_bytes(0),
+  _trash_used_bytes(0),
   _storage_medium(storage_medium),
   _is_used(false),
   _tablet_manager(tablet_manager),
@@ -103,6 +105,7 @@ DataDir::DataDir(const std::string& path, int64_t 
capacity_bytes,
 INT_GAUGE_METRIC_REGISTER(_data_dir_metric_entity, disks_avail_capacity);
 INT_GAUGE_METRIC_REGISTER(_data_dir_metric_entity, 
disks_local_used_capacity);
 INT_GAUGE_METRIC_REGISTER(_data_dir_metric_entity, 
disks_remote_used_capacity);
+INT_GAUGE_METRIC_REGISTER(_data_dir_metric_entity, 
disks_trash_used_capacity);
 INT_GAUGE_METRIC_REGISTER(_data_dir_metric_entity, disks_state);
 INT_GAUGE_METRIC_REGISTER(_data_dir_metric_entity, disks_compaction_score);
 INT_GAUGE_METRIC_REGISTER(_data_dir_metric_entity, disks_compaction_num);
@@ -122,6 +125,7 @@ Status DataDir::init() {
"check file exist failed");
 }
 
+u

[GitHub] [doris] morningman commented on pull request #23801: [fix](row-policy) fix creating row policy with forward issue

2023-09-03 Thread via GitHub


morningman commented on PR #23801:
URL: https://github.com/apache/doris/pull/23801#issuecomment-1704295881

   run buildall





[GitHub] [doris] yujun777 commented on pull request #23543: [improvement](tablet schedule) colocate balance between all groups

2023-09-03 Thread via GitHub


yujun777 commented on PR #23543:
URL: https://github.com/apache/doris/pull/23543#issuecomment-1704296837

   run buildall





[GitHub] [doris] yujun777 commented on pull request #23543: [improvement](tablet schedule) colocate balance between all groups

2023-09-03 Thread via GitHub


yujun777 commented on PR #23543:
URL: https://github.com/apache/doris/pull/23543#issuecomment-1704297983

   run buildall





[doris] 01/03: [function](bitmap) support bitmap_to_base64 and bitmap_from_base64 (#23759)

2023-09-03 Thread kxiao
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 9bf7a0e9559dab7e55e07721b8d548bd1810f987
Author: TengJianPing <18241664+jackte...@users.noreply.github.com>
AuthorDate: Sat Sep 2 00:58:48 2023 +0800

[function](bitmap) support bitmap_to_base64 and bitmap_from_base64 (#23759)
---
 be/src/util/bitmap_value.h |  23 ++-
 be/src/vec/functions/function_bitmap.cpp   | 106 +
 be/test/vec/function/function_bitmap_test.cpp  | 171 +
 be/test/vec/function/function_test_util.h  |  15 ++
 .../doris/catalog/BuiltinScalarFunctions.java  |   4 +
 .../functions/scalar/BitmapFromBase64.java |  71 +
 .../functions/scalar/BitmapToBase64.java   |  69 +
 .../expressions/visitor/ScalarFunctionVisitor.java |  10 ++
 gensrc/script/doris_builtins_functions.py  |   3 +
 .../bitmap_functions/test_bitmap_function.out  | 117 +-
 .../bitmap_functions/test_bitmap_function.groovy   |  78 ++
 11 files changed, 627 insertions(+), 40 deletions(-)

diff --git a/be/src/util/bitmap_value.h b/be/src/util/bitmap_value.h
index 32486008c7..02a5595440 100644
--- a/be/src/util/bitmap_value.h
+++ b/be/src/util/bitmap_value.h
@@ -1229,7 +1229,7 @@ public:
 return;
 }
 
-if (bits.size() == 1 && !config::enable_set_in_bitmap_value) {
+if (bits.size() == 1) {
 _type = SINGLE;
 _sv = bits[0];
 return;
@@ -1247,6 +1247,27 @@ public:
 }
 }
 
+BitmapTypeCode::type get_type_code() const {
+switch (_type) {
+case EMPTY:
+return BitmapTypeCode::EMPTY;
+case SINGLE:
+if (_sv <= std::numeric_limits<uint32_t>::max()) {
+return BitmapTypeCode::SINGLE32;
+} else {
+return BitmapTypeCode::SINGLE64;
+}
+case SET:
+return BitmapTypeCode::SET;
+case BITMAP:
+if (_bitmap->is32BitsEnough()) {
+return BitmapTypeCode::BITMAP32;
+} else {
+return BitmapTypeCode::BITMAP64;
+}
+}
+}
+
 template <typename T>
 void add_many(const T* values, const size_t count) {
 switch (_type) {
diff --git a/be/src/vec/functions/function_bitmap.cpp b/be/src/vec/functions/function_bitmap.cpp
index 578ce0e34d..8c7d81cd9b 100644
--- a/be/src/vec/functions/function_bitmap.cpp
+++ b/be/src/vec/functions/function_bitmap.cpp
@@ -40,6 +40,7 @@
 #include "util/hash_util.hpp"
 #include "util/murmur_hash3.h"
 #include "util/string_parser.hpp"
+#include "util/url_coding.h"
 #include "vec/aggregate_functions/aggregate_function.h"
 #include "vec/columns/column.h"
 #include "vec/columns/column_array.h"
@@ -250,6 +251,58 @@ struct BitmapFromString {
 }
 };
 
+struct NameBitmapFromBase64 {
+static constexpr auto name = "bitmap_from_base64";
+};
+struct BitmapFromBase64 {
+using ArgumentType = DataTypeString;
+
+static constexpr auto name = "bitmap_from_base64";
+
+static Status vector(const ColumnString::Chars& data, const ColumnString::Offsets& offsets,
+ std::vector<BitmapValue>& res, NullMap& null_map,
+ size_t input_rows_count) {
+res.reserve(input_rows_count);
+if (offsets.size() == 0 && input_rows_count == 1) {
+// For NULL constant
+res.emplace_back();
+null_map[0] = 1;
+return Status::OK();
+}
+std::string decode_buff;
+int last_decode_buff_len = 0;
+int curr_decode_buff_len = 0;
+for (size_t i = 0; i < input_rows_count; ++i) {
+const char* src_str = reinterpret_cast<const char*>(&data[offsets[i - 1]]);
+int64_t src_size = offsets[i] - offsets[i - 1];
+if (0 != src_size % 4) {
+// return Status::InvalidArgument(
+// fmt::format("invalid base64: {}", std::string(src_str, src_size)));
+res.emplace_back();
+null_map[i] = 1;
+continue;
+}
+curr_decode_buff_len = src_size + 3;
+if (curr_decode_buff_len > last_decode_buff_len) {
+decode_buff.resize(curr_decode_buff_len);
+last_decode_buff_len = curr_decode_buff_len;
+}
+int outlen = base64_decode(src_str, src_size, decode_buff.data());
+if (outlen < 0) {
+res.emplace_back();
+null_map[i] = 1;
+} else {
+BitmapValue bitmap_val;
+if (!bitmap_val.deserialize(decode_buff.data())) {
+return Status::RuntimeError(
+fmt::format("bitmap_from_base64 decode failed: base64: {}", src_str));
+}
+ 

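The `get_type_code()` hunk in `bitmap_value.h` above picks the narrowest serialization code that can hold the bitmap's contents. A minimal sketch of the single-value case follows, assuming a hypothetical `TypeCode` enum and helper name that stand in for Doris's `BitmapTypeCode` and are not part of the commit:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the SINGLE32/SINGLE64 values of BitmapTypeCode.
enum class TypeCode { SINGLE32, SINGLE64 };

// Choose the 32-bit code when the value fits in uint32_t, else the 64-bit code.
inline TypeCode single_value_type_code(uint64_t value) {
    return value <= std::numeric_limits<uint32_t>::max() ? TypeCode::SINGLE32
                                                         : TypeCode::SINGLE64;
}
```

The boundary is inclusive: 4294967295 still maps to the 32-bit code, and 4294967296 is the first value that needs the 64-bit one.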
[doris] branch branch-2.0 updated (c1b8c23ee0 -> 9ec056675c)

2023-09-03 Thread kxiao
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a change to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git


from c1b8c23ee0 [Fix](planner) fix to_date failed in create table as select (#23613)
 new 9bf7a0e955 [function](bitmap) support bitmap_to_base64 and bitmap_from_base64 (#23759)
 new 444836f1ae [improvement](config) add a specific be config for segment_cache_capacity (#23701)
 new 9ec056675c [Fix](vscanner) remove TEMP column in block after filter (#23778)

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 be/src/common/config.cpp   |   3 +
 be/src/common/config.h |   3 +
 be/src/runtime/exec_env_init.cpp   |   7 +-
 be/src/util/bitmap_value.h |  23 ++-
 be/src/vec/exec/scan/vscanner.cpp  |   6 +
 be/src/vec/functions/function_bitmap.cpp   | 106 +
 be/test/vec/function/function_bitmap_test.cpp  | 171 +
 be/test/vec/function/function_test_util.h  |  15 ++
 docs/en/docs/admin-manual/config/be-config.md  |  12 +-
 docs/zh-CN/docs/admin-manual/config/be-config.md   |  12 +-
 .../doris/catalog/BuiltinScalarFunctions.java  |   4 +
 ...BitmapFromString.java => BitmapFromBase64.java} |  12 +-
 .../{BitmapToString.java => BitmapToBase64.java}   |  14 +-
 .../expressions/visitor/ScalarFunctionVisitor.java |  10 ++
 gensrc/script/doris_builtins_functions.py  |   3 +
 .../bitmap_functions/test_bitmap_function.out  | 117 +-
 .../bitmap_functions/test_bitmap_function.groovy   |  78 ++
 17 files changed, 527 insertions(+), 69 deletions(-)
 copy fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/scalar/{BitmapFromString.java => BitmapFromBase64.java} (88%)
 copy fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/scalar/{BitmapToString.java => BitmapToBase64.java} (85%)





[doris] 03/03: [Fix](vscanner) remove TEMP column in block after filter (#23778)

2023-09-03 Thread kxiao
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 9ec056675c497e34cdccd85be523cbcccddcda7c
Author: airborne12 
AuthorDate: Sat Sep 2 21:54:27 2023 +0800

[Fix](vscanner) remove TEMP column in block after filter (#23778)
---
 be/src/vec/exec/scan/vscanner.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/be/src/vec/exec/scan/vscanner.cpp b/be/src/vec/exec/scan/vscanner.cpp
index 9140e3487f..19d5302286 100644
--- a/be/src/vec/exec/scan/vscanner.cpp
+++ b/be/src/vec/exec/scan/vscanner.cpp
@@ -113,6 +113,12 @@ Status VScanner::_filter_output_block(Block* block) {
 auto old_rows = block->rows();
 Status st = VExprContext::filter_block(_conjuncts, block, block->columns());
 _counter.num_rows_unselected += old_rows - block->rows();
+auto all_column_names = block->get_names();
+for (auto& name : all_column_names) {
+if (name.rfind(BeConsts::BLOCK_TEMP_COLUMN_PREFIX, 0) == 0) {
+block->erase(name);
+}
+}
 return st;
 }
 
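The hunk above relies on the `rfind(prefix, 0) == 0` idiom to test whether a column name starts with the temp prefix. A standalone sketch of that check is given here; the helper names and the `"__TEMP__"` value in the usage note are illustrative assumptions, not taken from the commit:

```cpp
#include <string>
#include <vector>

// rfind(prefix, 0) only considers a match that begins at index 0,
// so it acts as a starts-with test without scanning the whole string.
inline bool has_prefix(const std::string& name, const std::string& prefix) {
    return name.rfind(prefix, 0) == 0;
}

// Return the names that do NOT carry the temporary-column prefix.
inline std::vector<std::string> drop_temp_columns(const std::vector<std::string>& names,
                                                  const std::string& temp_prefix) {
    std::vector<std::string> kept;
    for (const auto& name : names) {
        if (!has_prefix(name, temp_prefix)) {
            kept.push_back(name);
        }
    }
    return kept;
}
```

With an assumed prefix of `"__TEMP__"`, a name such as `c__TEMP__` is kept because the marker is not at the start. Like the commit, the sketch works from a copy of the names rather than erasing while iterating.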





[doris] 02/03: [improvement](config) add a specific be config for segment_cache_capacity (#23701)

2023-09-03 Thread kxiao
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 444836f1ae8fd236ea9c2c630ba28aadf8d5ed13
Author: Kang 
AuthorDate: Sat Sep 2 01:14:14 2023 +0800

[improvement](config) add a specific be config for segment_cache_capacity (#23701)

* add segment_cache_capacity config instead of fd limit * 2/5
* default -1 for backward compatibility
---
 be/src/common/config.cpp |  3 +++
 be/src/common/config.h   |  3 +++
 be/src/runtime/exec_env_init.cpp |  7 +--
 docs/en/docs/admin-manual/config/be-config.md| 12 +---
 docs/zh-CN/docs/admin-manual/config/be-config.md | 12 +---
 5 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/be/src/common/config.cpp b/be/src/common/config.cpp
index 9bc477c27a..3ac4535966 100644
--- a/be/src/common/config.cpp
+++ b/be/src/common/config.cpp
@@ -1021,6 +1021,9 @@ DEFINE_Bool(enable_shrink_memory, "false");
 DEFINE_mInt32(schema_cache_capacity, "1024");
 DEFINE_mInt32(schema_cache_sweep_time_sec, "100");
 
+// max number of segment cache, default -1 for backward compatibility fd_number*2/5
+DEFINE_mInt32(segment_cache_capacity, "-1");
+
 // enable feature binlog, default false
 DEFINE_Bool(enable_feature_binlog, "false");
 
diff --git a/be/src/common/config.h b/be/src/common/config.h
index 4a1a4ca18d..5525745fe4 100644
--- a/be/src/common/config.h
+++ b/be/src/common/config.h
@@ -1059,6 +1059,9 @@ DECLARE_Bool(enable_shrink_memory);
 DECLARE_mInt32(schema_cache_capacity);
 DECLARE_mInt32(schema_cache_sweep_time_sec);
 
+// max number of segment cache
+DECLARE_mInt32(segment_cache_capacity);
+
 // enable binlog
 DECLARE_Bool(enable_feature_binlog);
 
diff --git a/be/src/runtime/exec_env_init.cpp b/be/src/runtime/exec_env_init.cpp
index 5fca5e325e..7f88efe689 100644
--- a/be/src/runtime/exec_env_init.cpp
+++ b/be/src/runtime/exec_env_init.cpp
@@ -281,8 +281,11 @@ Status ExecEnv::_init_mem_env() {
 }
 // SegmentLoader caches segments in rowset granularity. So the size of
 // opened files will greater than segment_cache_capacity.
-uint64_t segment_cache_capacity = fd_number * 2 / 5;
-LOG(INFO) << "segment_cache_capacity = fd_number * 2 / 5, fd_number: " << fd_number
+int64_t segment_cache_capacity = config::segment_cache_capacity;
+if (segment_cache_capacity < 0 || segment_cache_capacity > fd_number * 2 / 5) {
+segment_cache_capacity = fd_number * 2 / 5;
+}
+LOG(INFO) << "segment_cache_capacity <= fd_number * 2 / 5, fd_number: " << fd_number
   << " segment_cache_capacity: " << segment_cache_capacity;
 SegmentLoader::create_global_instance(segment_cache_capacity);
 
diff --git a/docs/en/docs/admin-manual/config/be-config.md b/docs/en/docs/admin-manual/config/be-config.md
index 88a044fdfd..ec85dc41f1 100644
--- a/docs/en/docs/admin-manual/config/be-config.md
+++ b/docs/en/docs/admin-manual/config/be-config.md
@@ -1012,13 +1012,6 @@ BaseCompaction:546859:
   - Increasing this value can reduce the number of calls to read remote data, but it will increase memory overhead.
 * Default value: 16MB
 
- `segment_cache_capacity`
-
-* Type: int32
-* Description: The maximum number of Segments cached by Segment Cache.
-  - The default value is currently only an empirical value, and may need to be modified according to actual scenarios. Increasing this value can cache more segments and avoid some IO. Decreasing this value will reduce memory usage.
-* Default value: 100
-
  `file_cache_type`
 
 * Type: string
@@ -1181,6 +1174,11 @@ BaseCompaction:546859:
 * Description: Index page cache as a percentage of total storage page cache, value range is [0, 100]
 * Default value: 10
 
+ `segment_cache_capacity`
+* Type: int32
+* Description: Max number of segment cache (the key is rowset id) entries. -1 is for backward compatibility as fd_number * 2/5.
+* Default value: -1
+
  `storage_strict_check_incompatible_old_format`
 
 * Type: bool
diff --git a/docs/zh-CN/docs/admin-manual/config/be-config.md b/docs/zh-CN/docs/admin-manual/config/be-config.md
index 5e48d603f6..109bdd5462 100644
--- a/docs/zh-CN/docs/admin-manual/config/be-config.md
+++ b/docs/zh-CN/docs/admin-manual/config/be-config.md
@@ -1037,13 +1037,6 @@ BaseCompaction:546859:
   - 增大这个值,可以减少远端数据读取的调用次数,但会增加内存开销。
 * 默认值: 16MB
 
- `segment_cache_capacity`
-
-* 类型: int32
-* 描述: Segment Cache 缓存的 Segment 最大数量
-  - 默认值目前只是一个经验值,可能需要根据实际场景修改。增大该值可以缓存更多的segment从而避免一些IO。减少该值则会降低内存使用。
-* 默认值: 100
-
  `file_cache_type`
 
 * 类型:string
@@ -1206,6 +1199,11 @@ BaseCompaction:546859:
 * 描述:索引页缓存占总页面缓存的百分比,取值为[0, 100]。
 * 默认值:10
 
+ `segment_cache_capacity`
+* Type: int32
+* Description: segment元数据缓存(以rowset id为key)的最大rowset个数. -1代表向后兼容取值为fd_number * 
2/5
+* Default value: -1
+
  `storage_strict_check_incompatible_old

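The `exec_env_init.cpp` hunk in this commit treats a negative `segment_cache_capacity`, or one above `fd_number * 2 / 5`, as a request for the old default. The rule can be sketched as a pure function; the function name and the plain `int64_t` parameters are assumptions made for illustration, and the real code reads the value from `config::segment_cache_capacity`:

```cpp
#include <cstdint>

// Clamp a configured capacity to the fd-derived upper bound.
// Negative values (the default -1) fall back to the bound itself.
inline int64_t effective_segment_cache_capacity(int64_t configured, int64_t fd_number) {
    const int64_t upper = fd_number * 2 / 5;  // integer division, as in the diff
    if (configured < 0 || configured > upper) {
        return upper;
    }
    return configured;
}
```

For example, with an fd limit of 65536 the bound is 26214, so the default -1 and an oversized 100000 both resolve to 26214, while 1000 is used as given.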
[GitHub] [doris] hello-stephen commented on pull request #23801: [fix](row-policy) fix creating row policy with forward issue

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23801:
URL: https://github.com/apache/doris/pull/23801#issuecomment-1704305426

   (From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.73 seconds
stream load tsv:  531 seconds loaded 74807831229 Bytes, about 134 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc:  64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select:  29.2 seconds inserted 1000 Rows, about 342K ops/s
storage size: 17161958715 Bytes





[GitHub] [doris] hello-stephen commented on pull request #23543: [improvement](tablet schedule) colocate balance between all groups

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23543:
URL: https://github.com/apache/doris/pull/23543#issuecomment-1704307681

   (From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 48.14 seconds
stream load tsv:  527 seconds loaded 74807831229 Bytes, about 135 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc:  64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet:  30 seconds loaded 861443392 Bytes, about 27 MB/s
insert into select:  28.9 seconds inserted 1000 Rows, about 346K ops/s
storage size: 17162392832 Bytes





[GitHub] [doris] Lchangliang commented on a diff in pull request #23561: [improve](segment-cache) Change the segment cache granularity from rowset_id to rowset_id+segment_id

2023-09-03 Thread via GitHub


Lchangliang commented on code in PR #23561:
URL: https://github.com/apache/doris/pull/23561#discussion_r1314260194


##
be/src/olap/segment_loader.cpp:
##
@@ -68,28 +64,34 @@ Status SegmentLoader::load_segments(const BetaRowsetSharedPtr& rowset,
 if (cache_handle->is_inited()) {
 return Status::OK();
 }
-
-SegmentCache::CacheKey cache_key(rowset->rowset_id());
-if (_segment_cache->lookup(cache_key, cache_handle)) {
-return Status::OK();
-}
-
-std::vector<segment_v2::SegmentSharedPtr> segments;
-RETURN_IF_ERROR(rowset->load_segments(&segments));
-
-if (use_cache) {
-// memory of SegmentCache::CacheValue will be handled by SegmentCache
-SegmentCache::CacheValue* cache_value = new SegmentCache::CacheValue();
-cache_value->segments = std::move(segments);
-_segment_cache->insert(cache_key, *cache_value, cache_handle);
-} else {
-cache_handle->init(std::move(segments));
+for (int64_t i = 0; i < rowset->num_segments(); i++) {
+SegmentCache::CacheKey cache_key(rowset->rowset_id(), i);
+if (_segment_cache->lookup(cache_key, cache_handle)) {
+continue;
+}
+segment_v2::SegmentSharedPtr segment;
+RETURN_IF_ERROR(rowset->load_segment(i, &segment));
+if (use_cache) {
+// memory of SegmentCache::CacheValue will be handled by SegmentCache
+SegmentCache::CacheValue* cache_value = new SegmentCache::CacheValue();
+cache_value->segment = std::move(segment);
+_segment_cache->insert(cache_key, *cache_value, cache_handle);

Review Comment:
   The segment is pushed in the method SegmentCache::insert.
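
As a rough illustration of the granularity change under review, the sketch below keys a cache on the pair (rowset id, segment index) instead of the rowset id alone. `SegmentKey`, `SegmentKeyHash`, and `count_cache_entries` are hypothetical names; the PR's actual `SegmentCache::CacheKey` encoding is not shown in this thread:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical composite key: one cache entry per segment of a rowset.
struct SegmentKey {
    std::string rowset_id;
    int64_t segment_id;
    bool operator==(const SegmentKey& other) const {
        return rowset_id == other.rowset_id && segment_id == other.segment_id;
    }
};

// Combine the two field hashes so keys from the same rowset spread out.
struct SegmentKeyHash {
    std::size_t operator()(const SegmentKey& key) const {
        std::size_t h1 = std::hash<std::string>{}(key.rowset_id);
        std::size_t h2 = std::hash<int64_t>{}(key.segment_id);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};

// Insert one entry per segment and report how many entries the rowset occupies.
inline std::size_t count_cache_entries(const std::string& rowset_id, int64_t num_segments) {
    std::unordered_map<SegmentKey, int, SegmentKeyHash> cache;
    for (int64_t i = 0; i < num_segments; ++i) {
        cache.emplace(SegmentKey{rowset_id, i}, 0);
    }
    return cache.size();
}
```

Under this scheme a rowset with three segments occupies three cache entries, so individual segments can be looked up or evicted independently.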






[GitHub] [doris] Lchangliang commented on a diff in pull request #23561: [improve](segment-cache) Change the segment cache granularity from rowset_id to rowset_id+segment_id

2023-09-03 Thread via GitHub


Lchangliang commented on code in PR #23561:
URL: https://github.com/apache/doris/pull/23561#discussion_r1314260710


##
be/src/olap/segment_loader.cpp:
##
@@ -37,26 +37,22 @@ bool SegmentCache::lookup(const SegmentCache::CacheKey& key, SegmentCacheHandle*
 if (lru_handle == nullptr) {
 return false;
 }
-handle->init(_cache.get(), lru_handle);
+handle->push_segment(_cache.get(), lru_handle);

Review Comment:
   Not needed. Concurrency occurs only inside the methods of LRUCache. The variable `handle` is used by only one thread.






[GitHub] [doris] Lchangliang commented on pull request #23561: [improve](segment-cache) Change the segment cache granularity from rowset_id to rowset_id+segment_id

2023-09-03 Thread via GitHub


Lchangliang commented on PR #23561:
URL: https://github.com/apache/doris/pull/23561#issuecomment-1704309535

   run buildall





[GitHub] [doris] hello-stephen commented on pull request #23543: [improvement](tablet schedule) colocate balance between all groups

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23543:
URL: https://github.com/apache/doris/pull/23543#issuecomment-1704310330

   (From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.23 seconds
stream load tsv:  535 seconds loaded 74807831229 Bytes, about 133 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc:  64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select:  29.0 seconds inserted 1000 Rows, about 344K ops/s
storage size: 17161870331 Bytes





[GitHub] [doris] github-actions[bot] commented on pull request #23561: [improve](segment-cache) Change the segment cache granularity from rowset_id to rowset_id+segment_id

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23561:
URL: https://github.com/apache/doris/pull/23561#issuecomment-1704311100

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] Kikyou1997 commented on a diff in pull request #23717: [feat](optimizer) Support tablesample in new optimizer

2023-09-03 Thread via GitHub


Kikyou1997 commented on code in PR #23717:
URL: https://github.com/apache/doris/pull/23717#discussion_r1314261988


##
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalOlapScan.java:
##
@@ -45,6 +46,7 @@
  */
 public class PhysicalOlapScan extends PhysicalCatalogRelation implements OlapScan {
 
+public final TableSample tableSample;

Review Comment:
   It's not allowed by our checkstyle rules






[GitHub] [doris] Kikyou1997 commented on a diff in pull request #23717: [feat](optimizer) Support tablesample in new optimizer

2023-09-03 Thread via GitHub


Kikyou1997 commented on code in PR #23717:
URL: https://github.com/apache/doris/pull/23717#discussion_r1314262220


##
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalOlapScan.java:
##
@@ -51,6 +52,8 @@
  */
 public class LogicalOlapScan extends LogicalCatalogRelation implements OlapScan {
 
+public final TableSample tableSample;

Review Comment:
   Declaring public fields after private ones is not allowed by our checkstyle rules






[GitHub] [doris] github-actions[bot] commented on pull request #23793: [cherry-pick](branch-2.0) support delete pred v2

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23793:
URL: https://github.com/apache/doris/pull/23793#issuecomment-1704311537

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] morningman commented on a diff in pull request #23760: [improvement](index) support CANCEL BUILD INDEX

2023-09-03 Thread via GitHub


morningman commented on code in PR #23760:
URL: https://github.com/apache/doris/pull/23760#discussion_r1314262634


##
fe/fe-core/src/main/java/org/apache/doris/alter/IndexChangeJob.java:
##
@@ -321,10 +322,31 @@ protected void runRunningJob() throws AlterCancelException {
 LOG.info("inverted index job finished: {}", jobId);
 }
 
+/**
+ * cancelImpl() can be called any time any place.
+ * We need to clean any possible residual of this job.
+ */
 protected boolean cancelImpl(String errMsg) {
+if (jobState.isFinalState()) {
+return false;
+}
+
+cancelInternal();
+
+jobState = JobState.CANCELLED;
+this.errMsg = errMsg;
+this.finishedTimeMs = System.currentTimeMillis();
+LOG.info("cancel index job {}, err: {}", jobId, errMsg);

Review Comment:
   Print the log after `Env.getCurrentEnv().getEditLog().logIndexChangeJob(this);`



##
fe/fe-core/src/main/java/org/apache/doris/alter/SchemaChangeHandler.java:
##
@@ -2370,6 +2378,69 @@ public void cancel(CancelStmt stmt) throws DdlException {
 }
 }
 
+private void cancelIndexJob(CancelAlterTableStmt cancelAlterTableStmt) throws DdlException {
+String dbName = cancelAlterTableStmt.getDbName();
+String tableName = cancelAlterTableStmt.getTableName();
+Preconditions.checkState(!Strings.isNullOrEmpty(dbName));
+Preconditions.checkState(!Strings.isNullOrEmpty(tableName));
+
+Database db = Env.getCurrentInternalCatalog().getDbOrDdlException(dbName);
+
+List<IndexChangeJob> jobList = new ArrayList<>();
+
+OlapTable olapTable;
+try {
+olapTable = (OlapTable) db.getTableOrMetaException(tableName, Table.TableType.OLAP);
+} catch (MetaNotFoundException e) {
+throw new DdlException(e.getMessage());
+}
+olapTable.writeLock();
+try {
+// if (olapTable.getState() != OlapTableState.SCHEMA_CHANGE
+// && olapTable.getState() != OlapTableState.WAITING_STABLE) {
+// throw new DdlException("Table[" + tableName + "] is not under SCHEMA_CHANGE.");
+// }
+
+// find from index change jobs first
+if (cancelAlterTableStmt.getAlterJobIdList() != null
+&& cancelAlterTableStmt.getAlterJobIdList().size() > 0) {
+for (Long jobId : cancelAlterTableStmt.getAlterJobIdList()) {
+IndexChangeJob job = indexChangeJobs.get(jobId);
+if (job == null) {
+continue;
+}
+jobList.add(job);
+LOG.debug("add build index job {} on table {} for specific id", jobId, tableName);
+}
+} else {
+for (IndexChangeJob job : indexChangeJobs.values()) {
+if (!job.isDone() && job.getTableId() == olapTable.getId()) {
+jobList.add(job);
+LOG.debug("add build index job {} on table {} for all", job.getJobId(), tableName);
+}
+}
+}
+} finally {
+olapTable.writeUnlock();
+}
+
+// alter job v2's cancel must be called outside the table lock
+if (jobList.size() > 0) {
+for (IndexChangeJob job : jobList) {
+long jobId = job.getJobId();
+LOG.debug("cancel build index job {} on table {}", jobId, tableName);
+if (!job.cancel("user cancelled")) {
+LOG.info("cancel build index job {} on table {} failed", jobId, tableName);

Review Comment:
   ```suggestion
   LOG.warn("cancel build index job {} on table {} failed", jobId, tableName);
   ```



##
fe/fe-core/src/main/java/org/apache/doris/alter/IndexChangeJob.java:
##
@@ -321,10 +322,31 @@ protected void runRunningJob() throws AlterCancelException {
 LOG.info("inverted index job finished: {}", jobId);
 }
 
+/**
+ * cancelImpl() can be called any time any place.
+ * We need to clean any possible residual of this job.
+ */
 protected boolean cancelImpl(String errMsg) {
+if (jobState.isFinalState()) {
+return false;
+}
+
+cancelInternal();
+
+jobState = JobState.CANCELLED;
+this.errMsg = errMsg;
+this.finishedTimeMs = System.currentTimeMillis();
+LOG.info("cancel index job {}, err: {}", jobId, errMsg);

Review Comment:
   Missing `replayCancelled()`



##
fe/fe-core/src/main/java/org/apache/doris/alter/SchemaChangeHandler.java:
##
@@ -2370,6 +2378,69 @@ public void cancel(CancelStmt stmt) throws DdlException {
 }
 }
 
+private void cancelIndexJob(CancelAlterTableStmt cancelAlterTableStmt) throws DdlException {
+String dbName = cancelAlterTableStmt.

[GitHub] [doris] zclllyybb commented on pull request #23236: [Feature-WIP](partitions) Support auto partition

2023-09-03 Thread via GitHub


zclllyybb commented on PR #23236:
URL: https://github.com/apache/doris/pull/23236#issuecomment-1704313879

   run buildall





[doris] 01/02: [improvement](index) support CANCEL BUILD INDEX (#23760)

2023-09-03 Thread kxiao
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 35d8c9450d815c75502760373831ea157f11ca50
Author: Kang 
AuthorDate: Sun Sep 3 21:25:42 2023 +0800

[improvement](index) support CANCEL BUILD INDEX (#23760)
---
 docs/en/docs/data-table/index/inverted-index.md|  40 +++--
 docs/zh-CN/docs/data-table/index/inverted-index.md |  34 ++--
 fe/fe-core/src/main/cup/sql_parser.cup |  10 +-
 .../org/apache/doris/alter/IndexChangeJob.java |  22 +++
 .../apache/doris/alter/SchemaChangeHandler.java|  71 +
 .../org/apache/doris/analysis/ShowAlterStmt.java   |   2 +-
 .../main/java/org/apache/doris/catalog/Env.java|   3 +-
 .../inverted_index_p0/test_build_index.groovy  | 172 +
 8 files changed, 318 insertions(+), 36 deletions(-)

diff --git a/docs/en/docs/data-table/index/inverted-index.md b/docs/en/docs/data-table/index/inverted-index.md
index 06102632f8..fbcb56253f 100644
--- a/docs/en/docs/data-table/index/inverted-index.md
+++ b/docs/en/docs/data-table/index/inverted-index.md
@@ -110,22 +110,28 @@ ALTER TABLE table_name ADD INDEX idx_name(column_name) USING INVERTED [PROPERTIE
 
 **After version 2.0-beta (including 2.0-beta):**
 
-The above 'create/add index' operation only generates inverted index for incremental data. The syntax of build index is added to add inverted index to stock data:
+The above 'create/add index' operation only generates inverted index for incremental data. The syntax of BUILD INDEX is added to add inverted index to stock data:
 ```sql
 -- syntax 1, add inverted index to the stock data of the whole table by default
 BUILD INDEX index_name ON table_name;
 -- syntax 2, partition can be specified, and one or more can be specified
 BUILD INDEX index_name ON table_name PARTITIONS(partition_name1, partition_name2);
 ```
-(**The above 'create/add index' operation needs to be executed before executing the build index**)
+(**The above 'create/add index' operation needs to be executed before executing the BUILD INDEX**)
 
-To view the progress of the `build index`, you can use the following statement
+To view the progress of the `BUILD INDEX`, you can run the following statement
 ```sql
-show build index [FROM db_name];
--- Example 1: Viewing the progress of all build index tasks
-show build index;
--- Example 2: Viewing the progress of the build index task for a specified table
-show build index where TableName = "table1";
+SHOW BUILD INDEX [FROM db_name];
+-- Example 1: Viewing the progress of all BUILD INDEX tasks
+SHOW BUILD INDEX;
+-- Example 2: Viewing the progress of the BUILD INDEX task for a specified table
+SHOW BUILD INDEX where TableName = "table1";
+```
+
+To cancel `BUILD INDEX`, you can run the following statement
+```sql
+CANCEL BUILD INDEX ON table_name;
+CANCEL BUILD INDEX ON table_name (job_id1,jobid_2,...);
 ```
 
 - drop an inverted index
@@ -349,13 +355,13 @@ mysql> SELECT count() FROM hackernews_1m WHERE timestamp > '2007-08-23 04:17:00'
 mysql> CREATE INDEX idx_timestamp ON hackernews_1m(timestamp) USING INVERTED;
 Query OK, 0 rows affected (0.03 sec)
 ```
-**After 2.0-beta (including 2.0-beta), you need to execute `build index` to add inverted index to the stock data:**
+**After 2.0-beta (including 2.0-beta), you need to execute `BUILD INDEX` to add inverted index to the stock data:**
 ```sql
 mysql> BUILD INDEX idx_timestamp ON hackernews_1m;
 Query OK, 0 rows affected (0.01 sec)
 ```
 
-- progress of building index can be view by SQL. It just costs 1s (compare 
FinishTime and CreateTime) to build index for timestamp column with 1 million 
rows.
+- progress of building index can be view by SQL. It just costs 1s (compare 
FinishTime and CreateTime) to BUILD INDEX for timestamp column with 1 million 
rows.
 ```sql
 mysql> SHOW ALTER TABLE COLUMN;
 
+---+---+-+-+---+-+---+---+---+--+--+--+-+
@@ -366,10 +372,10 @@ mysql> SHOW ALTER TABLE COLUMN;
 1 row in set (0.00 sec)
 ```
 
-**After 2.0-beta (including 2.0-beta), you can view the progress of stock data 
creating index by `show build index`:**
+**After 2.0-beta (including 2.0-beta), you can view the progress of stock data 
creating index by `SHOW BUILD INDEX`:**
 ```sql
 -- If the table has no partitions, the PartitionName defaults to TableName
-mysql> show build index;
+mysql> SHOW BUILD INDEX;
 
+---+---+---+--+-+-+---+--+--+--+
 | JobId | TableName | PartitionName | AlterInvertedIndexes 
| CreateTime  | FinishTime  | 
TransactionId | State| Msg  | Prog

[GitHub] [doris] github-actions[bot] commented on pull request #23236: [Feature-WIP](partitions) Support auto partition

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23236:
URL: https://github.com/apache/doris/pull/23236#issuecomment-1704315451

   clang-tidy review says "All clean, LGTM! :+1:"





[doris] branch branch-2.0 updated (9ec056675c -> c22a08a5e4)

2023-09-03 Thread kxiao
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a change to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git


from 9ec056675c [Fix](vscanner) remove TEMP column in block after filter 
(#23778)
 new 35d8c9450d [improvement](index) support CANCEL BUILD INDEX (#23760)
 new c22a08a5e4 change version to 2.0.2-rc01

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 docs/en/docs/data-table/index/inverted-index.md|  40 +++--
 docs/zh-CN/docs/data-table/index/inverted-index.md |  34 ++--
 fe/fe-core/src/main/cup/sql_parser.cup |  10 +-
 .../org/apache/doris/alter/IndexChangeJob.java |  22 +++
 .../apache/doris/alter/SchemaChangeHandler.java|  71 +
 .../org/apache/doris/analysis/ShowAlterStmt.java   |   2 +-
 .../main/java/org/apache/doris/catalog/Env.java|   3 +-
 gensrc/script/gen_build_version.sh |   4 +-
 .../inverted_index_p0/test_build_index.groovy  | 172 +
 9 files changed, 320 insertions(+), 38 deletions(-)
 create mode 100644 
regression-test/suites/inverted_index_p0/test_build_index.groovy





[doris] 02/02: change version to 2.0.2-rc01

2023-09-03 Thread kxiao
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit c22a08a5e412a5019cb948b69ce4d9705010a600
Author: Kang 
AuthorDate: Sun Sep 3 22:04:48 2023 +0800

change version to 2.0.2-rc01
---
 gensrc/script/gen_build_version.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gensrc/script/gen_build_version.sh 
b/gensrc/script/gen_build_version.sh
index 32172374cb..3d522a508c 100755
--- a/gensrc/script/gen_build_version.sh
+++ b/gensrc/script/gen_build_version.sh
@@ -30,8 +30,8 @@ set -eo pipefail
 build_version_prefix="doris"
 build_version_major=2
 build_version_minor=0
-build_version_patch=1
-build_version_rc_version="rc04"
+build_version_patch=2
+build_version_rc_version="rc01"
 
 
build_version="${build_version_prefix}-${build_version_major}.${build_version_minor}.${build_version_patch}-${build_version_rc_version}"
 





[GitHub] [doris] Kikyou1997 commented on a diff in pull request #23717: [feat](optimizer) Support tablesample in new optimizer

2023-09-03 Thread via GitHub


Kikyou1997 commented on code in PR #23717:
URL: https://github.com/apache/doris/pull/23717#discussion_r1314266071


##
fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundRelation.java:
##
@@ -44,6 +45,7 @@
  */
 public class UnboundRelation extends LogicalRelation implements Unbound {
 
+public final TableSample tableSample;

Review Comment:
   According to the engineer who proposed the Optional API for Java 8, we should 
almost never wrap a member field in Optional; its only intended use is as the 
return type of a function: 
https://stackoverflow.com/questions/26327957/should-java-8-getters-return-optional-type/26328555#26328555
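
   The point above can be illustrated with a minimal sketch. The class 
`SampledRelation` and its String-typed field are hypothetical stand-ins, not 
the actual `UnboundRelation` or `TableSample` from this PR: the member stays a 
plain nullable field, and Optional appears only in the getter's return type.

```java
import java.util.Optional;

// Hypothetical stand-in for a relation that may carry a sampling clause.
// This is not the UnboundRelation class from the PR.
final class SampledRelation {
    // Plain nullable member; it is not declared as Optional<String>.
    private final String tableSample;

    SampledRelation(String tableSample) {
        this.tableSample = tableSample;
    }

    // Optional is used only at the API boundary, as the return type.
    Optional<String> getTableSample() {
        return Optional.ofNullable(tableSample);
    }

    public static void main(String[] args) {
        SampledRelation sampled = new SampledRelation("10 PERCENT");
        SampledRelation plain = new SampledRelation(null);
        System.out.println(sampled.getTableSample().orElse("<no sample>"));
        System.out.println(plain.getTableSample().orElse("<no sample>"));
    }
}
```

   Callers then handle absence explicitly through `orElse` or `isPresent`, 
while the object itself holds no Optional member.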






[GitHub] [doris] Kikyou1997 commented on pull request #23717: [feat](optimizer) Support tablesample in new optimizer

2023-09-03 Thread via GitHub


Kikyou1997 commented on PR #23717:
URL: https://github.com/apache/doris/pull/23717#issuecomment-1704316995

   run buildall





[doris] branch branch-2.0 updated: add analyze table in tpcds_sf1_index/load.groovy

2023-09-03 Thread kxiao
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/branch-2.0 by this push:
 new 880a7b3425 add analyze table in tpcds_sf1_index/load.groovy
880a7b3425 is described below

commit 880a7b3425b520c67fe358fe522e7d133c5e0367
Author: Kang 
AuthorDate: Sun Sep 3 22:20:04 2023 +0800

add analyze table in tpcds_sf1_index/load.groovy
---
 regression-test/suites/inverted_index_p1/tpcds_sf1_index/load.groovy | 2 ++
 1 file changed, 2 insertions(+)

diff --git 
a/regression-test/suites/inverted_index_p1/tpcds_sf1_index/load.groovy 
b/regression-test/suites/inverted_index_p1/tpcds_sf1_index/load.groovy
index 1b527dcc81..5e1422b58d 100644
--- a/regression-test/suites/inverted_index_p1/tpcds_sf1_index/load.groovy
+++ b/regression-test/suites/inverted_index_p1/tpcds_sf1_index/load.groovy
@@ -98,5 +98,7 @@ suite("load") {
 assertTrue(json.NumberLoadedRows > 0 && json.LoadBytes > 0)
 }
 }
+sql """ ANALYZE TABLE $tableName WITH SYNC """
 }
+sql """ sync """
 }





[GitHub] [doris] zclllyybb commented on pull request #23236: [Feature-WIP](partitions) Support auto partition

2023-09-03 Thread via GitHub


zclllyybb commented on PR #23236:
URL: https://github.com/apache/doris/pull/23236#issuecomment-1704319580

   run buildall





[GitHub] [doris] liugddx opened a new pull request, #23802: [feature] (multi-catalog) support cassandra catalog

2023-09-03 Thread via GitHub


liugddx opened a new pull request, #23802:
URL: https://github.com/apache/doris/pull/23802

   ## Proposed changes
   
   Issue Number: close #xxx
   
   
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] hello-stephen commented on pull request #23561: [improve](segment-cache) Change the segment cache granularity from rowset_id to rowset_id+segment_id

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23561:
URL: https://github.com/apache/doris/pull/23561#issuecomment-1704320725

   (From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.57 seconds
stream load tsv:  549 seconds loaded 74807831229 Bytes, about 129 
MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc:  64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 
MB/s
insert into select:  28.7 seconds inserted 1000 Rows, about 
348K ops/s
storage size: 17162193407 Bytes
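
The throughput figures in this report appear to be bytes divided by seconds, 
expressed in units of 2^20 bytes and rounded down. That reading is an 
assumption inferred from the numbers, not something the report states, and 
`ThroughputCheck` is a hypothetical helper written only to reproduce it:

```java
// Hypothetical helper; assumes "MB/s" means 2^20 bytes per second, rounded down.
final class ThroughputCheck {
    static long mibPerSecond(long bytes, long seconds) {
        // Integer division truncates at each step, which rounds the result down.
        return bytes / seconds / (1L << 20);
    }

    public static void main(String[] args) {
        // stream load tsv: 74807831229 bytes in 549 seconds
        System.out.println(mibPerSecond(74807831229L, 549));
        // stream load json: 2358488459 bytes in 20 seconds
        System.out.println(mibPerSecond(2358488459L, 20));
    }
}
```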





[GitHub] [doris] Kikyou1997 opened a new pull request, #23803: [enhancement](optimizer) Recycle expired table stats

2023-09-03 Thread via GitHub


Kikyou1997 opened a new pull request, #23803:
URL: https://github.com/apache/doris/pull/23803

   ## Proposed changes
   
   Issue Number: close #xxx
   
   
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #23177: [fix](broadcast join) fix bug of probe side early eos if enable shared hash table

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23177:
URL: https://github.com/apache/doris/pull/23177#issuecomment-1704322258

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] github-actions[bot] commented on pull request #23802: [feature] (multi-catalog) support cassandra catalog

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23802:
URL: https://github.com/apache/doris/pull/23802#issuecomment-1704322379

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] github-actions[bot] commented on pull request #23802: [feature] (multi-catalog) support cassandra catalog

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23802:
URL: https://github.com/apache/doris/pull/23802#issuecomment-1704322525

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] Kikyou1997 commented on pull request #23803: [enhancement](optimizer) Recycle expired table stats

2023-09-03 Thread via GitHub


Kikyou1997 commented on PR #23803:
URL: https://github.com/apache/doris/pull/23803#issuecomment-1704324916

   run buildall





[GitHub] [doris] yujun777 commented on pull request #23218: [improvement](create tablet) backend create tablet round robin among disks

2023-09-03 Thread via GitHub


yujun777 commented on PR #23218:
URL: https://github.com/apache/doris/pull/23218#issuecomment-1704327028

   run buildall





[GitHub] [doris] eldenmoon closed pull request #20219: [Improve](BloomFilter) Suport caching bloom filters and filter segmen…

2023-09-03 Thread via GitHub


eldenmoon closed pull request #20219: [Improve](BloomFilter) Suport caching 
bloom filters and filter segmen…
URL: https://github.com/apache/doris/pull/20219





[GitHub] [doris] github-actions[bot] commented on pull request #23218: [improvement](create tablet) backend create tablet round robin among disks

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23218:
URL: https://github.com/apache/doris/pull/23218#issuecomment-1704328427

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] xiaokang commented on pull request #23760: [improvement](index) support CANCEL BUILD INDEX

2023-09-03 Thread via GitHub


xiaokang commented on PR #23760:
URL: https://github.com/apache/doris/pull/23760#issuecomment-1704329006

   run buildall 





[GitHub] [doris] Kikyou1997 opened a new pull request, #23804: [fix](optimizer) Fix sql block when new optimizer is enabled

2023-09-03 Thread via GitHub


Kikyou1997 opened a new pull request, #23804:
URL: https://github.com/apache/doris/pull/23804

   ## Proposed changes
   
   The check would be skipped because, when checkBlockPolicy is invoked, the new 
optimizer has not yet produced a plan
   
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] Kikyou1997 commented on pull request #23804: [fix](optimizer) Fix sql block when new optimizer is enabled

2023-09-03 Thread via GitHub


Kikyou1997 commented on PR #23804:
URL: https://github.com/apache/doris/pull/23804#issuecomment-1704331775

   run buildall





[GitHub] [doris] Kikyou1997 commented on pull request #23803: [enhancement](optimizer) Recycle expired table stats

2023-09-03 Thread via GitHub


Kikyou1997 commented on PR #23803:
URL: https://github.com/apache/doris/pull/23803#issuecomment-1704332112

   run buildall





[GitHub] [doris] Kikyou1997 commented on pull request #23717: [feat](optimizer) Support tablesample in new optimizer

2023-09-03 Thread via GitHub


Kikyou1997 commented on PR #23717:
URL: https://github.com/apache/doris/pull/23717#issuecomment-1704334023

   run buildall





[GitHub] [doris] hello-stephen commented on pull request #23803: [enhancement](optimizer) Recycle expired table stats

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23803:
URL: https://github.com/apache/doris/pull/23803#issuecomment-1704334107

   (From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.36 seconds
stream load tsv:  529 seconds loaded 74807831229 Bytes, about 134 
MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc:  64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 
MB/s
insert into select:  29.1 seconds inserted 1000 Rows, about 
343K ops/s
storage size: 17161979969 Bytes





[GitHub] [doris] Kikyou1997 commented on pull request #23717: [feat](optimizer) Support tablesample in new optimizer

2023-09-03 Thread via GitHub


Kikyou1997 commented on PR #23717:
URL: https://github.com/apache/doris/pull/23717#issuecomment-1704335682

   run buildall





[GitHub] [doris] github-actions[bot] commented on pull request #23802: [feature] (multi-catalog) support cassandra catalog

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23802:
URL: https://github.com/apache/doris/pull/23802#issuecomment-1704337837

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] zclllyybb commented on pull request #23236: [Feature-WIP](partitions) Support auto partition

2023-09-03 Thread via GitHub


zclllyybb commented on PR #23236:
URL: https://github.com/apache/doris/pull/23236#issuecomment-1704339080

   run buildall





[GitHub] [doris] jacktengg opened a new pull request, #23805: [doc](bitmap) add docs from bitmap_to_base64 and bitmap_from_base64

2023-09-03 Thread via GitHub


jacktengg opened a new pull request, #23805:
URL: https://github.com/apache/doris/pull/23805

   ## Proposed changes
   
   Issue Number: close #xxx
   
   
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #23236: [Feature-WIP](partitions) Support auto partition

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23236:
URL: https://github.com/apache/doris/pull/23236#issuecomment-1704340266

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] hello-stephen commented on pull request #23218: [improvement](create tablet) backend create tablet round robin among disks

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23218:
URL: https://github.com/apache/doris/pull/23218#issuecomment-1704340769

   (From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.51 seconds
stream load tsv:  547 seconds loaded 74807831229 Bytes, about 130 
MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc:  64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet:  29 seconds loaded 861443392 Bytes, about 28 
MB/s
insert into select:  29.0 seconds inserted 1000 Rows, about 
344K ops/s
storage size: 17162062376 Bytes





[GitHub] [doris] hello-stephen commented on pull request #23804: [fix](optimizer) Fix sql block when new optimizer is enabled

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23804:
URL: https://github.com/apache/doris/pull/23804#issuecomment-1704340612

   (From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.01 seconds
stream load tsv:  531 seconds loaded 74807831229 Bytes, about 134 
MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc:  64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 
MB/s
insert into select:  28.8 seconds inserted 1000 Rows, about 
347K ops/s
storage size: 17162213470 Bytes





[GitHub] [doris] github-actions[bot] commented on pull request #23802: [feature](multi-catalog) support cassandra catalog

2023-09-03 Thread via GitHub


github-actions[bot] commented on PR #23802:
URL: https://github.com/apache/doris/pull/23802#issuecomment-1704341277

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] hello-stephen commented on pull request #23717: [feat](optimizer) Support tablesample in new optimizer

2023-09-03 Thread via GitHub


hello-stephen commented on PR #23717:
URL: https://github.com/apache/doris/pull/23717#issuecomment-1704342545

   (From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.12 seconds
stream load tsv:  528 seconds loaded 74807831229 Bytes, about 135 
MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc:  65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet:  32 seconds loaded 861443392 Bytes, about 25 
MB/s
insert into select:  29.1 seconds inserted 1000 Rows, about 
343K ops/s
storage size: 17162102113 Bytes





[GitHub] [doris] yujun777 opened a new pull request, #23806: [fix](publish) publish go ahead even if quorum is not met

2023-09-03 Thread via GitHub


yujun777 opened a new pull request, #23806:
URL: https://github.com/apache/doris/pull/23806

   ## Proposed changes
   
   pick: #18696
   
   If the publish txn does not reach quorum, writes get stuck. Let the publish go 
ahead if each tablet has at least one successful replica and the elapsed time 
exceeds 10 min.
   
This PR also changes some hints.
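
   The rule described above can be sketched as follows. This is an 
illustration under stated assumptions, not the code in this PR: 
`PublishDecision`, `canPublish`, and `TIMEOUT_MS` are hypothetical names, and 
quorum is assumed to mean a strict majority of each tablet's replicas.

```java
// Hypothetical sketch; none of these names come from the Doris source.
final class PublishDecision {
    // Assumed to mirror the "10 min" threshold in the description.
    static final long TIMEOUT_MS = 10 * 60 * 1000L;

    // succReplicas[i] and totalReplicas[i] describe tablet i.
    static boolean canPublish(int[] succReplicas, int[] totalReplicas, long waitedMs) {
        boolean quorumEverywhere = true;
        boolean oneReplicaEverywhere = true;
        for (int i = 0; i < succReplicas.length; i++) {
            // Assumption: quorum is a strict majority of the tablet's replicas.
            int quorum = totalReplicas[i] / 2 + 1;
            if (succReplicas[i] < quorum) {
                quorumEverywhere = false;
            }
            if (succReplicas[i] < 1) {
                oneReplicaEverywhere = false;
            }
        }
        // Normal path: quorum met on every tablet. Fallback path: every tablet
        // has at least one successful replica and the wait exceeded the timeout.
        return quorumEverywhere || (oneReplicaEverywhere && waitedMs > TIMEOUT_MS);
    }
}
```

   Under this sketch a tablet with zero successful replicas blocks the publish 
regardless of how long it has waited.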
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] yujun777 commented on pull request #23806: [fix](publish) publish go ahead even if quorum is not met

2023-09-03 Thread via GitHub


yujun777 commented on PR #23806:
URL: https://github.com/apache/doris/pull/23806#issuecomment-1704344328

   run buildall




