marregui commented on PR #12668:
URL: https://github.com/apache/pinot/pull/12668#issuecomment-2036617662

   @siddharthteotia 
   
   AFAIK the API here takes two buffers for the compress/decompress 
operations: one containing the bytes to be processed, the other to hold the 
results. I wonder whether there is a way to associate the gzipped segment 
data with metadata recording its original size (somewhere in ZooKeeper). Or 
botch it altogether and append the original (uncompressed) size directly to 
the compressed buffer before storing it to disk, knowing that these bytes 
have to be stripped off first when decompressing. 
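
   A minimal sketch of that second option, to make the framing concrete. It 
assumes the stored value is the original (uncompressed) length, written as a 
4-byte header in front of the gzip payload rather than after it, and it uses 
plain `java.util.zip` on byte arrays. The class and method names are 
hypothetical and are not Pinot's compressor API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: frames a gzip payload with its original length.
final class SizePrefixedGzip {
  private static final int HEADER_BYTES = Integer.BYTES;

  // Returns [4-byte original length][gzip payload].
  static byte[] compress(byte[] raw) throws IOException {
    ByteArrayOutputStream compressed = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(compressed)) {
      gzip.write(raw);
    }
    byte[] payload = compressed.toByteArray();
    return ByteBuffer.allocate(HEADER_BYTES + payload.length)
        .putInt(raw.length)
        .put(payload)
        .array();
  }

  // Reads the header to size the output buffer, then inflates the rest.
  static byte[] decompress(byte[] framed) throws IOException {
    int originalSize = ByteBuffer.wrap(framed).getInt();
    byte[] out = new byte[originalSize];
    try (GZIPInputStream gzip = new GZIPInputStream(
        new ByteArrayInputStream(framed, HEADER_BYTES, framed.length - HEADER_BYTES))) {
      int offset = 0;
      while (offset < originalSize) {
        int read = gzip.read(out, offset, originalSize - offset);
        if (read < 0) {
          throw new IOException("gzip stream ended before the declared size");
        }
        offset += read;
      }
    }
    return out;
  }
}
```

   With the length in the header, the caller can allocate the destination 
buffer before decompressing, which is what the two-buffer API needs.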
   
   There are options for the codec, yes, but choosing one over another 
depends on the shape of the data. At most, I suppose we could let users pick 
the codec in configuration, perhaps per table. In that case a table of data 
shapes, codecs, compression ratios and compression speeds would indeed make 
sense.
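
   As a rough illustration of how much the result depends on data shape, the 
sketch below measures the deflated size of an input at a given 
`java.util.zip.Deflater` level. The class name is hypothetical, and it only 
covers the JDK's deflate implementation, not the other codecs under 
discussion.

```java
import java.util.zip.Deflater;

// Hypothetical helper for comparing compressed sizes across data shapes.
final class CodecComparison {
  // Returns the number of bytes produced by deflating input at the given level.
  static int deflatedSize(byte[] input, int level) {
    Deflater deflater = new Deflater(level);
    try {
      deflater.setInput(input);
      deflater.finish();
      byte[] chunk = new byte[8192];
      int total = 0;
      while (!deflater.finished()) {
        total += deflater.deflate(chunk);
      }
      return total;
    } finally {
      deflater.end(); // release native resources
    }
  }
}
```

   Running it over a highly repetitive column and over random bytes shows the 
two extremes: the first shrinks to a small fraction of its size, the second 
barely shrinks at all.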
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

