marregui commented on PR #12668: URL: https://github.com/apache/pinot/pull/12668#issuecomment-2036617662
@siddharthteotia afaik the API here is to receive two buffers for the compress/decompress operations: one containing the bytes to be processed, the other to hold the result. I wonder if there is a way to associate the Gzipped segment data with metadata indicating its original size (somewhere in ZooKeeper). Or we could botch it altogether and append the size directly to the compressed buffer before storing it to disk, knowing that these bytes have to be removed first when decompressing.

There are options for the codec, yes, but choosing one over another depends on the shape of the data. At most I suppose we could let the user pick one via configuration, perhaps per table. In that case, a table of data shapes / codecs / compression ratios / compression speeds would indeed make sense.
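To make the inline option concrete, here is a minimal sketch of the "size travels with the buffer" idea, using plain `java.util.zip.Deflater`/`Inflater`. It is not the PR's actual compressor interface; the class and method names are hypothetical, and it assumes the value carried in the header is the original (uncompressed) size, written as a 4-byte prefix so the decompress side can size its output before stripping the header:

```java
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not the interface touched by this PR: stores
// [originalLength][compressedBytes] so decompression is self-describing.
public final class LengthPrefixedGzipCodec {

  /** Compresses 'input' and writes a 4-byte original-length header followed by the deflated bytes into 'output'. */
  public static void compress(ByteBuffer input, ByteBuffer output) {
    byte[] raw = new byte[input.remaining()];
    input.get(raw);

    Deflater deflater = new Deflater();
    deflater.setInput(raw);
    deflater.finish();

    output.putInt(raw.length); // header: original (uncompressed) size
    byte[] chunk = new byte[8192];
    while (!deflater.finished()) {
      int n = deflater.deflate(chunk);
      output.put(chunk, 0, n);
    }
    deflater.end();
    output.flip();
  }

  /** Reads the 4-byte header, then inflates the remaining bytes into 'output'. */
  public static void decompress(ByteBuffer input, ByteBuffer output) throws DataFormatException {
    int originalLength = input.getInt(); // strip the header first
    byte[] compressed = new byte[input.remaining()];
    input.get(compressed);

    Inflater inflater = new Inflater();
    inflater.setInput(compressed);
    byte[] restored = new byte[originalLength];
    int written = inflater.inflate(restored);
    inflater.end();

    output.put(restored, 0, written);
    output.flip();
  }
}
```

The trade-off versus keeping the size in ZooKeeper is that the header is read and written on the hot path but needs no extra metadata lookup; the 4 bytes per chunk are negligible next to the compressed payload.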