richardstartin commented on pull request #8408: URL: https://github.com/apache/pinot/pull/8408#issuecomment-1078891090
This obviously has a huge impact on performance for raw columns:

```
Benchmark               (_numRows)  (_query)                                                          (_scenario)  Mode  Cnt  Score      Error       Units
BenchmarkQueries.query  1500000     SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable  EXP(0.001)   avgt  5    12050.914  ± 1603.683  us/op
BenchmarkQueries.query  1500000     SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable  EXP(0.5)     avgt  5    16311.600  ± 2757.909  us/op
BenchmarkQueries.query  1500000     SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable  EXP(0.999)   avgt  5    15463.667  ± 3098.050  us/op
```

```
Benchmark               (_numRows)  (_query)                                                          (_scenario)  Mode  Cnt  Score      Error       Units
BenchmarkQueries.query  1500000     SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable  EXP(0.001)   avgt  5    513.716    ± 51.830    us/op
BenchmarkQueries.query  1500000     SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable  EXP(0.5)     avgt  5    484.079    ± 88.200    us/op
BenchmarkQueries.query  1500000     SELECT MIN(RAW_INT_COL), MAX(RAW_INT_COL), COUNT(*) FROM MyTable  EXP(0.999)   avgt  5    506.688    ± 119.060   us/op
```

However, this optimisation is so obvious that it raises the question of why it wasn't done before. Is realtime metadata less trustworthy than the equivalent info from a dictionary?
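For a rough sense of scale, the ratio between the two sets of scores can be worked out per scenario. The snippet below is only a sketch: it assumes the first table is the baseline and the second is with the change applied (the comment does not label them), it uses the mean scores only, and it ignores the error bars. The class and method names are hypothetical and not part of Pinot.

```java
import java.util.Locale;

public class SpeedupSketch {
    // Scenario labels and mean scores (us/op) copied from the two JMH tables above.
    static final String[] SCENARIOS = {"EXP(0.001)", "EXP(0.5)", "EXP(0.999)"};
    // Assumption: the first table is the baseline, the second is with the change.
    static final double[] FIRST_TABLE = {12050.914, 16311.600, 15463.667};
    static final double[] SECOND_TABLE = {513.716, 484.079, 506.688};

    // Ratio of the slower mean to the faster mean; error bars are ignored.
    static double speedup(double slowMicros, double fastMicros) {
        return slowMicros / fastMicros;
    }

    // One ratio per scenario, in the same order as SCENARIOS.
    static double[] speedups() {
        double[] ratios = new double[SCENARIOS.length];
        for (int i = 0; i < SCENARIOS.length; i++) {
            ratios[i] = speedup(FIRST_TABLE[i], SECOND_TABLE[i]);
        }
        return ratios;
    }

    public static void main(String[] args) {
        double[] ratios = speedups();
        for (int i = 0; i < SCENARIOS.length; i++) {
            System.out.printf(Locale.ROOT, "%-10s %.1fx%n", SCENARIOS[i], ratios[i]);
        }
    }
}
```

Under those assumptions the ratio is somewhere between roughly 23x and 34x depending on the scenario, though with error margins this wide the per-scenario differences should not be over-read.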