hubgeter opened a new pull request, #37530: URL: https://github.com/apache/doris/pull/37530
bp #37377

## Proposed changes

Since the value of a partition column is fixed for every row read from a given partition, we can deserialize the value only once and then repeatedly insert it into the block.

```sql
-- In Hive:
CREATE TABLE parquet_partition_tb (
    col1 STRING,
    col2 INT,
    col3 DOUBLE
)
PARTITIONED BY (
    partition_col1 STRING,
    partition_col2 INT
)
STORED AS PARQUET;

insert into parquet_partition_tb partition (partition_col1="hello",partition_col2=1) values("word",2,2.3);
insert into parquet_partition_tb partition (partition_col1="hello",partition_col2=1) select col1,col2,col3 from parquet_partition_tb where partition_col1="hello" and partition_col2=1;
-- Repeat the `insert into xxx select xxx` operation several times.

-- In Doris:
-- before:
mysql> select count(partition_col1) from parquet_partition_tb;
+-----------------------+
| count(partition_col1) |
+-----------------------+
|              33554432 |
+-----------------------+
1 row in set (3.24 sec)

mysql> select count(partition_col2) from parquet_partition_tb;
+-----------------------+
| count(partition_col2) |
+-----------------------+
|              33554432 |
+-----------------------+
1 row in set (3.34 sec)

-- after:
mysql> select count(partition_col1) from parquet_partition_tb;
+-----------------------+
| count(partition_col1) |
+-----------------------+
|              33554432 |
+-----------------------+
1 row in set (0.79 sec)

mysql> select count(partition_col2) from parquet_partition_tb;
+-----------------------+
| count(partition_col2) |
+-----------------------+
|              33554432 |
+-----------------------+
1 row in set (0.51 sec)
```

## Summary:

Test SQL: `select count(partition_col) from tbl;`

Number of rows: 33554432

| partition column type | before (sec) | after (sec) |
|---|---|---|
| boolean | 3.96 | 0.47 |
| tinyint | 3.39 | 0.47 |
| smallint | 3.14 | 0.50 |
| int | 3.34 | 0.51 |
| bigint | 3.61 | 0.51 |
| float | 4.59 | 0.51 |
| double | 4.60 | 0.55 |
| decimal(5,2) | 3.96 | 0.61 |
| date | 5.80 | 0.52 |
| timestamp | 7.68 | 0.52 |
| string | 3.24 | 0.79 |

Issue Number: close #xxx

<!--Describe your changes.-->

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
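
The idea under "Proposed changes" (deserialize the fixed partition value once, then reuse it for every row) can be illustrated with a minimal sketch. This is not the Doris implementation, which lives in the C++ backend; the function names `fill_per_row` and `fill_once` and the use of Python lists as a stand-in for a block column are assumptions made purely for illustration.

```python
from datetime import date
from typing import Any, Callable, List


def fill_per_row(raw: str, parse: Callable[[str], Any], num_rows: int) -> List[Any]:
    """Baseline (hypothetical): parse the same partition text once per row."""
    column = []
    for _ in range(num_rows):
        column.append(parse(raw))
    return column


def fill_once(raw: str, parse: Callable[[str], Any], num_rows: int) -> List[Any]:
    """Sketch of the optimization: parse once, then repeat the decoded value.

    The list holds num_rows references to one object, which is safe here
    because the decoded values (int, float, str, date) are immutable.
    """
    value = parse(raw)
    return [value] * num_rows


if __name__ == "__main__":
    # Both strategies must produce the same column contents.
    raw_partition_value = "2024-07-09"
    slow = fill_per_row(raw_partition_value, date.fromisoformat, 4)
    fast = fill_once(raw_partition_value, date.fromisoformat, 4)
    assert slow == fast
    print(len(fast), fast[0])
```

The two functions return identical columns; the difference is that `fill_once` calls the parser a single time regardless of row count. That is consistent with the larger gains reported above for types that are costlier to parse, such as `date` and `timestamp`.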