vvivekiyer opened a new pull request, #14300:
URL: https://github.com/apache/pinot/pull/14300

   Currently, when columns are added or indexes are added/removed, segments 
are reloaded on the servers. There are a number of issues with this 
approach:
   
   1. Increased startup times for Pinot Server hosts. Servers have to reload 
segments (generating indexes and columns) every time they start up. This is 
particularly severe for upsert tables. cc: @tibrewalpratik17 @ankitsultana 
   2. The reload compute cost is paid on each server when indexes/columns 
are added. This leads to over-provisioning of servers to account for this 
compute cost.
   3. Reloading on servers while queries are being processed affects latencies. 
   4. Reloading all segments takes a long time (by default, 1 segment is 
reloaded at a time). Increasing the concurrency affects query latencies. 
   5. The segment in the deepstore never contains the new indexes/columns. So 
the segment in the deepstore diverges from the one on the server (making it 
not ideal for disaster recovery).
   
   
   
   This PR creates a minion task to automatically refresh segments when there 
are index/column updates to table config/schema. It can support automatic 
refresh for the following operations:
   1. Adding/Removing indexes
   2. Adding columns
   3. Changing compatible datatypes. 
   4. Converting segment versions
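
   The four cases above amount to comparing what a segment currently 
contains with what the table config/schema now asks for. A minimal sketch of 
that comparison follows. It is illustrative only and is not the PR's code: 
`SegmentView` and `needs_refresh` are hypothetical names, and the real task 
works on Pinot's segment metadata rather than on these simplified fields.

```python
from dataclasses import dataclass


@dataclass
class SegmentView:
    """Hypothetical, simplified view of a segment or of the desired state."""
    column_types: dict   # column name -> datatype name, e.g. {"clicks": "INT"}
    indexes: frozenset   # (column name, index type) pairs
    version: str         # segment format version label


def needs_refresh(segment: SegmentView, desired: SegmentView) -> bool:
    """Return True if the segment is out of date relative to the desired state."""
    # Added columns and datatype changes: every desired column must exist
    # in the segment with the same datatype.
    for column, data_type in desired.column_types.items():
        if segment.column_types.get(column) != data_type:
            return True
    # Added or removed indexes: the two index sets must match exactly.
    if segment.indexes != desired.indexes:
        return True
    # Segment version conversion.
    return segment.version != desired.version
```

   For example, a segment holding only column `a` with an inverted index is 
reported as stale once the schema also lists a column `b`, or once the 
inverted index is dropped from the table config.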
   
   
   Followup Work:
   1. When there are table config/schema updates, we can validate if the 
datatype changes for columns are compatible. We can allow compatible updates. 
   2. Schedule the SegmentRefresh tasks when there are table config/schema 
updates rather than waiting for the next iteration of the periodic job. 
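
   The validation in follow-up item 1 could take the shape of an allow-list 
check. The sketch below is an assumption about how that might look, not part 
of this PR: the PR does not define which datatype changes count as 
compatible, so the `COMPATIBLE_CHANGES` table and the `incompatible_changes` 
helper are hypothetical and chosen only for illustration.

```python
# Hypothetical allow-list: a column of the key type may be changed to any
# type in the value set. Assumed for illustration; not taken from Pinot.
COMPATIBLE_CHANGES = {
    "INT": {"LONG", "DOUBLE"},
    "FLOAT": {"DOUBLE"},
}


def incompatible_changes(old_types: dict, new_types: dict) -> list:
    """Return the sorted names of columns whose datatype change is not allowed.

    Both arguments map column name -> datatype name. Columns that appear only
    in new_types are newly added and are always accepted. Columns missing from
    new_types are treated as unchanged, since column removal is out of scope.
    """
    rejected = []
    for column, old_type in old_types.items():
        new_type = new_types.get(column, old_type)
        allowed = COMPATIBLE_CHANGES.get(old_type, set())
        if new_type != old_type and new_type not in allowed:
            rejected.append(column)
    return sorted(rejected)
```

   A schema update would be accepted only when this returns an empty list.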
   
   
   Tested using integration tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

