[GitHub] [pinot] richardstartin commented on pull request #8408: satisfy queries using column metadata when convenient to

GitBox Fri, 25 Mar 2022 03:41:12 -0700


richardstartin commented on pull request #8408:
URL: https://github.com/apache/pinot/pull/8408#issuecomment-1078891090



   This obviously has a huge impact on performance for raw columns. The first run below averages roughly 12 to 16 ms per query, the second roughly 0.5 ms:
   
   
   ```
   Benchmark               (_numRows)                                                          (_query)  (_scenario)  Mode  Cnt      Score      Error  Units
   BenchmarkQueries.query     1500000  SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable   EXP(0.001)  avgt    5  12050.914 ± 1603.683  us/op
   BenchmarkQueries.query     1500000  SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable     EXP(0.5)  avgt    5  16311.600 ± 2757.909  us/op
   BenchmarkQueries.query     1500000  SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable   EXP(0.999)  avgt    5  15463.667 ± 3098.050  us/op
   ```
   
   ```
   Benchmark               (_numRows)                                                          (_query)  (_scenario)  Mode  Cnt    Score     Error  Units
   BenchmarkQueries.query     1500000  SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable   EXP(0.001)  avgt    5  513.716 ±  51.830  us/op
   BenchmarkQueries.query     1500000  SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable     EXP(0.5)  avgt    5  484.079 ±  88.200  us/op
   BenchmarkQueries.query     1500000  SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable   EXP(0.999)  avgt    5  506.688 ± 119.060  us/op
   ```
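
To put a number on the difference, here is a small sketch that divides the scores of the first run by those of the second, scenario by scenario. The scores are copied from the two tables above. The variable and function names are made up for illustration, and the error bars are ignored, so the ratios (roughly 23x to 34x) are only approximate.

```python
# Average times in us/op, copied from the two JMH runs above.
first_run = {"EXP(0.001)": 12050.914, "EXP(0.5)": 16311.600, "EXP(0.999)": 15463.667}
second_run = {"EXP(0.001)": 513.716, "EXP(0.5)": 484.079, "EXP(0.999)": 506.688}


def speedups(slow, fast):
    """Return the slow/fast score ratio for each scenario present in `slow`."""
    return {scenario: slow[scenario] / fast[scenario] for scenario in slow}


ratios = speedups(first_run, second_run)
for scenario, ratio in ratios.items():
    # The reported errors are large relative to the scores, so treat this as a rough figure.
    print(f"{scenario}: {ratio:.1f}x")
```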
   
   However, this optimisation is so obvious that it raises the question of why it
wasn't done before: is realtime metadata less trustworthy than the equivalent
info from a dictionary?
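
For readers skimming the thread, the idea being benchmarked can be sketched roughly as follows. This is not Pinot's code: `ColumnMetadata` and `min_max_count` are hypothetical names, and the sketch assumes an unfiltered query over a non-empty column. The fast path is only correct if the metadata is as reliable as a scan would be, which is exactly the open question above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class ColumnMetadata:
    """Hypothetical per-segment column metadata (not Pinot's actual class)."""
    min_value: Optional[int]
    max_value: Optional[int]
    total_docs: int


def min_max_count(values: Sequence[int],
                  metadata: Optional[ColumnMetadata]) -> Tuple[int, int, int]:
    """Answer MIN(col), MAX(col), COUNT(*) for an unfiltered query."""
    if (metadata is not None
            and metadata.min_value is not None
            and metadata.max_value is not None):
        # Fast path: answer from metadata without touching the column data.
        return metadata.min_value, metadata.max_value, metadata.total_docs
    # Fallback: scan the raw values (assumes the column is non-empty).
    return min(values), max(values), len(values)
```

If the metadata were missing or incomplete, the function would fall back to the scan, so the result would stay the same and only the cost would change.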


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


