[GitHub] [pinot] snleee commented on issue #5089: Enhance Data Ingestion Engine

GitBox Tue, 26 Jul 2022 09:53:15 -0700


snleee commented on issue #5089:
URL: https://github.com/apache/pinot/issues/5089#issuecomment-1195734895


   To add to this, I think we should introduce a column-based interface for 
data indexing (maybe it's the same idea as `Design an interface (close to the 
idea of the stats collector) to store all the column data`).
   
   If the input data is in a columnar format, we can generate the 
dictionary and indices column by column. This will probably consume much less 
heap, because we don't need to hold all of the column data in memory at the 
same time. We can also add a parallelism config so that the engine processes 
multiple columns concurrently to speed up indexing.
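
   As a rough illustration of the idea only, here is a minimal sketch in 
plain Java. It is not Pinot code: `ColumnReader`, `InMemoryColumnReader`, 
`buildDictionary`, `buildAll` and the `parallelism` argument are hypothetical 
names made up for this example, and the "dictionary" is simply the sorted set 
of distinct string values of a column.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these types exist in Pinot.
final class ColumnarIndexingSketch {

  /** Assumed column-based interface: hands out one column's values at a time. */
  interface ColumnReader {
    List<String> columnNames();

    List<String> readColumn(String column);
  }

  /** Toy reader backed by an in-memory map, standing in for a columnar file. */
  static final class InMemoryColumnReader implements ColumnReader {
    private final Map<String, List<String>> columns;

    InMemoryColumnReader(Map<String, List<String>> columns) {
      this.columns = columns;
    }

    @Override
    public List<String> columnNames() {
      return new ArrayList<>(columns.keySet());
    }

    @Override
    public List<String> readColumn(String column) {
      return columns.get(column);
    }
  }

  /** Builds a sorted dictionary (distinct values) for a single column. */
  static List<String> buildDictionary(List<String> values) {
    return new ArrayList<>(new TreeSet<>(values));
  }

  /**
   * Builds one dictionary per column. Each task reads only its own column,
   * and at most `parallelism` tasks run at the same time.
   */
  static Map<String, List<String>> buildAll(ColumnReader reader, int parallelism) {
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      // Submit one task per column, remembering the column order.
      Map<String, Future<List<String>>> futures = new LinkedHashMap<>();
      for (String column : reader.columnNames()) {
        Callable<List<String>> task = () -> buildDictionary(reader.readColumn(column));
        futures.put(column, pool.submit(task));
      }
      // Collect the results in the same column order.
      Map<String, List<String>> dictionaries = new LinkedHashMap<>();
      for (Map.Entry<String, Future<List<String>>> entry : futures.entrySet()) {
        dictionaries.put(entry.getKey(), entry.getValue().get());
      }
      return dictionaries;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while indexing columns", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Failed to index a column", e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

   In this sketch the raw values held in memory at any moment are bounded by 
the number of columns being processed concurrently rather than by the total 
number of columns, so with `parallelism = 1` only one column's raw values are 
loaded at a time. A real implementation would also need to build the other 
indices and handle non-string types.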


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


