yixiutt opened a new issue, #12966:
URL: https://github.com/apache/doris/issues/12966

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Description
   
   Compaction can consume a lot of memory in some cases, such as wide tables. Every row is loaded during compaction, so once each column has loaded its first page the memory cost is rowset_num * segment_num * column_num * page_size. The default page size is 64K, so a table with 1000 columns costs about rowset_num * segment_num * 64M.
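
To make the estimate concrete, here is a minimal sketch of the formula above. The function names, the sample rowset and segment counts, and the column group size of 5 are assumptions for illustration only; they are not taken from Doris.

```python
PAGE_SIZE = 64 * 1024  # default page size from the description: 64K


def row_wise_compaction_memory(rowset_num, segment_num, column_num,
                               page_size=PAGE_SIZE):
    """Bytes held when every column of every segment has its first page loaded."""
    return rowset_num * segment_num * column_num * page_size


def vertical_compaction_memory(rowset_num, segment_num, group_size,
                               page_size=PAGE_SIZE):
    """Same estimate, but only one column group is loaded at a time."""
    return rowset_num * segment_num * group_size * page_size


# Hypothetical workload: 10 rowsets, 5 segments each, 1000 columns.
full = row_wise_compaction_memory(10, 5, 1000)
grouped = vertical_compaction_memory(10, 5, group_size=5)

print(full / 2**30)     # 3.0517578125 (GiB)
print(grouped / 2**20)  # 15.625 (MiB)
```

Under these assumed numbers, the estimate drops from about 3 GiB to about 16 MiB, since it scales with the number of columns loaded at once.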
   
   Vertical compaction processes columns in groups, so it does not cost as much memory.
   
   I'll complete this work in a few weeks.
   
   ### Solution
   
   Vertical compaction, steps to do:
   1. Framework: support vertical compaction for duplicate key tables, including all basic functions, and make sure it runs correctly.
   2. Unique key.
   3. Agg key.
   4. Optimizations, such as writing RowSourceBuffer to a tmp file to support large compactions.
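
The idea behind step 1 can be sketched roughly as follows for the duplicate key case. This is a simplified illustration, not the Doris implementation: the function names are hypothetical, rows are modeled as Python dicts, and the `row_sources` list only stands in for what the issue calls RowSourceBuffer, under the assumption that it records which input each merged row came from.

```python
import heapq


def merge_key_columns(sources, key_cols):
    """Merge sorted inputs on the key columns only.

    Returns the merged keys plus, for each output row, the index of the
    input it came from.
    """
    def keyed(src_idx, rows):
        for row in rows:
            yield (tuple(row[c] for c in key_cols), src_idx)

    merged = heapq.merge(*(keyed(i, rows) for i, rows in enumerate(sources)))
    keys, row_sources = [], []
    for key, src_idx in merged:
        keys.append(key)
        row_sources.append(src_idx)
    return keys, row_sources


def replay_column_group(sources, row_sources, group_cols):
    """Emit one column group in merged order by replaying the row sources."""
    cursors = [0] * len(sources)  # next unread row per input
    out = []
    for src_idx in row_sources:
        row = sources[src_idx][cursors[src_idx]]
        cursors[src_idx] += 1
        out.append(tuple(row[c] for c in group_cols))
    return out


def vertical_compact(sources, key_cols, value_groups):
    """Compact key columns first, then each value column group in turn."""
    keys, row_sources = merge_key_columns(sources, key_cols)
    columns = {tuple(key_cols): keys}
    for group in value_groups:
        columns[tuple(group)] = replay_column_group(sources, row_sources, group)
    return columns, row_sources


# Two hypothetical inputs, each already sorted by key column "k".
inputs = [
    [{"k": 1, "a": "x", "b": 10}, {"k": 3, "a": "z", "b": 30}],
    [{"k": 2, "a": "y", "b": 20}, {"k": 3, "a": "w", "b": 31}],
]
result, order = vertical_compact(inputs, ["k"], [["a"], ["b"]])
```

Only the key columns take part in the merge comparison. Each value group is then produced by replaying the recorded order, so in a real columnar reader only one group's pages would need to be resident at a time. In this sketch the rows are fully in memory, so it shows the control flow rather than the memory saving itself.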
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org
