tibrewalpratik17 opened a new issue, #14305:
URL: https://github.com/apache/pinot/issues/14305

   The concept of compaction traditionally refers to the process of making 
something denser or more tightly packed. In its current implementation, the 
Upsert-Compaction task in Apache Pinot operates at the segment level, where it 
rebuilds individual segments by removing unused or invalid rows. This approach 
has proven highly effective in controlling the disk usage of upsert tables.
   
   However, this issue focuses on a different problem that per-segment 
compaction does not address: the continuously growing number of segments in 
upsert tables. To mitigate this challenge, we 
propose a multi-segment compaction model for upsert tables. In this model, 
multiple segments will be combined and re-uploaded as a single, consolidated 
segment, with invalid or unused rows removed. This approach aims to reduce the 
overall segment count while maintaining the storage efficiency benefits of the 
current upsert-compaction mechanism.
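
   To illustrate the idea, here is a minimal sketch of the proposed model, not 
Pinot's actual implementation. The `Segment` class, its `valid_doc_ids` field, 
and the `compact_segments` function are hypothetical names made up for this 
example. It assumes each segment knows which of its rows are still valid.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for a segment: rows plus the ids of valid rows."""
    name: str
    rows: list
    valid_doc_ids: set


def compact_segments(segments, merged_name):
    """Combine several segments into one, keeping only the valid rows."""
    merged_rows = []
    for segment in segments:
        for doc_id, row in enumerate(segment.rows):
            # Rows superseded by a newer upsert are not in valid_doc_ids.
            if doc_id in segment.valid_doc_ids:
                merged_rows.append(row)
    # Every row in the consolidated segment is valid by construction.
    return Segment(merged_name, merged_rows, set(range(len(merged_rows))))


# Two small segments; the first "k1" row was replaced by the one in seg_1.
seg_0 = Segment("seg_0", [{"pk": "k1", "v": 1}, {"pk": "k2", "v": 2}], {1})
seg_1 = Segment("seg_1", [{"pk": "k1", "v": 3}], {0})
merged = compact_segments([seg_0, seg_1], "seg_merged")
```

   Single-segment compaction would leave two segments here; the merged form 
leaves one segment that holds only the two valid rows.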
   
   Sharing the [design doc](https://docs.google.com/document/d/1uzFJggSAxxVpnro5yr-HnWQ-8j5G3EggdSh5bjG78kI/edit?pli=1&tab=t.0) 
here for review and feedback from the community.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
