cbalci opened a new pull request, #10643:
URL: https://github.com/apache/pinot/pull/10643

   Introduces a new approximate percentile function, `PercentileKLL`, and its 
variations (MV and Raw), built on the KLL sketch from the Apache DataSketches 
library.
   
   This is part of a proposal to improve Apache DataSketches support in Pinot: 
   [(Google Docs Link) [Proposal] Improved Apache DataSketches Support in Pinot](https://docs.google.com/document/d/1ctmKVRi67lpO6x1RYKDvDYf05EZx2Vbs2OnUudYP-bU/edit)
   
   Some advantages listed and discussed in the linked document:
   - [Well-defined](https://datasketches.apache.org/docs/KLL/KLLAccuracyAndSize.html) error bound ([comparison](https://datasketches.apache.org/docs/QuantilesStudies/KllSketchVsTDigest.html) to t-Digest)
   - Faster updates, serialization/deserialization
   - Binary compatibility with external systems, hence the ability to use Pinot 
as a sketch store
   - Ability to compute ‘Rank’ and ‘Histogram’ besides ‘Percentile’
   - Feature parity with Druid
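
   To make the ‘Rank’ and ‘Histogram’ item concrete, here is a minimal sketch of how the three queries relate to each other. It computes them exactly over a sorted list as a stand-in; it is not the KLL algorithm and not Pinot's or DataSketches' implementation, which answer the same questions approximately in bounded memory. The function names, the nearest-rank percentile rule, and the inclusive bucket boundaries are assumptions made for illustration.

```python
import math
from bisect import bisect_right


def percentile(sorted_values, p):
    """Exact nearest-rank percentile: smallest value with at least p% of items <= it."""
    if not sorted_values:
        raise ValueError("empty input")
    if not 0 <= p <= 100:
        raise ValueError("p must be in [0, 100]")
    n = len(sorted_values)
    # Nearest-rank index is ceil(p/100 * n) - 1, clamped so p = 0 maps to the minimum.
    idx = max(0, math.ceil(p * n / 100) - 1)
    return sorted_values[idx]


def rank(sorted_values, x):
    """Normalized rank: fraction of items <= x (inclusive convention, assumed here)."""
    if not sorted_values:
        raise ValueError("empty input")
    return bisect_right(sorted_values, x) / len(sorted_values)


def histogram(sorted_values, split_points):
    """Item counts for buckets (-inf, s0], (s0, s1], ..., (s_last, +inf)."""
    # Cumulative counts at each split point, then differences between neighbours.
    bounds = [bisect_right(sorted_values, s) for s in split_points]
    edges = [0] + bounds + [len(sorted_values)]
    return [edges[i + 1] - edges[i] for i in range(len(edges) - 1)]


if __name__ == "__main__":
    values = sorted(range(1, 11))  # 1..10
    print(percentile(values, 50))
    print(rank(values, 5))
    print(histogram(values, [3, 7]))
```

   Percentile and rank are inverses of each other, and a histogram is a set of rank differences at the split points, which is why a single quantile sketch can serve all three.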
   
   Please leave design-related comments on the linked document and code-related 
comments in this PR.
   
   **Testing**
   - Added unit tests covering basic use cases that call `PercentileKLL`, 
`PercentileKLLMV`, `PercentileRawKLL`, and `PercentileRawKLLMV`
   - Added tests covering group-by scenarios
   - Manually tested ingesting raw (externally generated) data sketches
   
   
   `feature` `performance`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

