yashmayya commented on PR #14843: URL: https://github.com/apache/pinot/pull/14843#issuecomment-2605174430
> Can we verify that empirically with a test?

I updated the existing `BenchmarkQueries` to also support a larger number of segments, and here are the results for the group by queries in that benchmark.

Old (without changes from this PR):

```
Benchmark  (_numRows)  (_numSegments)  (_query)  (_scenario)  Mode  Cnt  Score  Error  Units
BenchmarkQueries.query  1500000  1  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  36.764 ± 1.189  ms/op
BenchmarkQueries.query  1500000  1  SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  134.669 ± 2.014  ms/op
BenchmarkQueries.query  1500000  1  SELECT NO_INDEX_STRING_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY NO_INDEX_STRING_COL,INT_COL ORDER BY NO_INDEX_STRING_COL, INT_COL ASC  EXP(0.5)  avgt  5  5.147 ± 0.334  ms/op
BenchmarkQueries.query  1500000  1  SELECT NO_INDEX_STRING_COL,LOW_CARDINALITY_STRING_COL,COUNT(*) FROM MyTable GROUP BY LOW_CARDINALITY_STRING_COL,NO_INDEX_STRING_COL ORDER BY LOW_CARDINALITY_STRING_COL, NO_INDEX_STRING_COL ASC  EXP(0.5)  avgt  5  3.568 ± 0.084  ms/op
BenchmarkQueries.query  1500000  1  select count(*), year(INT_COL) as y, month(INT_COL) as m from MyTable group by y, m  EXP(0.5)  avgt  5  35.912 ± 1.409  ms/op
BenchmarkQueries.query  1500000  2  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  39.488 ± 0.565  ms/op
BenchmarkQueries.query  1500000  2  SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  157.133 ± 3.754  ms/op
BenchmarkQueries.query  1500000  2  SELECT NO_INDEX_STRING_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY NO_INDEX_STRING_COL,INT_COL ORDER BY NO_INDEX_STRING_COL, INT_COL ASC  EXP(0.5)  avgt  5  5.430 ± 0.158  ms/op
BenchmarkQueries.query  1500000  2  SELECT NO_INDEX_STRING_COL,LOW_CARDINALITY_STRING_COL,COUNT(*) FROM MyTable GROUP BY LOW_CARDINALITY_STRING_COL,NO_INDEX_STRING_COL ORDER BY LOW_CARDINALITY_STRING_COL, NO_INDEX_STRING_COL ASC  EXP(0.5)  avgt  5  3.731 ± 0.163  ms/op
BenchmarkQueries.query  1500000  2  select count(*), year(INT_COL) as y, month(INT_COL) as m from MyTable group by y, m  EXP(0.5)  avgt  5  36.784 ± 1.647  ms/op
BenchmarkQueries.query  1500000  10  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  166.385 ± 10.191  ms/op
BenchmarkQueries.query  1500000  10  SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  735.311 ± 20.755  ms/op
BenchmarkQueries.query  1500000  10  SELECT NO_INDEX_STRING_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY NO_INDEX_STRING_COL,INT_COL ORDER BY NO_INDEX_STRING_COL, INT_COL ASC  EXP(0.5)  avgt  5  25.390 ± 0.409  ms/op
BenchmarkQueries.query  1500000  10  SELECT NO_INDEX_STRING_COL,LOW_CARDINALITY_STRING_COL,COUNT(*) FROM MyTable GROUP BY LOW_CARDINALITY_STRING_COL,NO_INDEX_STRING_COL ORDER BY LOW_CARDINALITY_STRING_COL, NO_INDEX_STRING_COL ASC  EXP(0.5)  avgt  5  16.623 ± 0.327  ms/op
BenchmarkQueries.query  1500000  10  select count(*), year(INT_COL) as y, month(INT_COL) as m from MyTable group by y, m  EXP(0.5)  avgt  5  175.212 ± 3.031  ms/op
BenchmarkQueries.query  1500000  50  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  961.633 ± 17.295  ms/op
BenchmarkQueries.query  1500000  50  SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  3640.062 ± 21.391  ms/op
BenchmarkQueries.query  1500000  50  SELECT NO_INDEX_STRING_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY NO_INDEX_STRING_COL,INT_COL ORDER BY NO_INDEX_STRING_COL, INT_COL ASC  EXP(0.5)  avgt  5  123.276 ± 1.000  ms/op
BenchmarkQueries.query  1500000  50  SELECT NO_INDEX_STRING_COL,LOW_CARDINALITY_STRING_COL,COUNT(*) FROM MyTable GROUP BY LOW_CARDINALITY_STRING_COL,NO_INDEX_STRING_COL ORDER BY LOW_CARDINALITY_STRING_COL, NO_INDEX_STRING_COL ASC  EXP(0.5)  avgt  5  81.729 ± 0.784  ms/op
BenchmarkQueries.query  1500000  50  select count(*), year(INT_COL) as y, month(INT_COL) as m from MyTable group by y, m  EXP(0.5)  avgt  5  862.859 ± 49.178  ms/op
```

New (with changes from this PR):

```
Benchmark  (_numRows)  (_numSegments)  (_query)  (_scenario)  Mode  Cnt  Score  Error  Units
BenchmarkQueries.query  1500000  1  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  37.356 ± 0.371  ms/op
BenchmarkQueries.query  1500000  1  SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  133.085 ± 1.483  ms/op
BenchmarkQueries.query  1500000  1  SELECT NO_INDEX_STRING_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY NO_INDEX_STRING_COL,INT_COL ORDER BY NO_INDEX_STRING_COL, INT_COL ASC  EXP(0.5)  avgt  5  5.282 ± 0.309  ms/op
BenchmarkQueries.query  1500000  1  SELECT NO_INDEX_STRING_COL,LOW_CARDINALITY_STRING_COL,COUNT(*) FROM MyTable GROUP BY LOW_CARDINALITY_STRING_COL,NO_INDEX_STRING_COL ORDER BY LOW_CARDINALITY_STRING_COL, NO_INDEX_STRING_COL ASC  EXP(0.5)  avgt  5  3.648 ± 0.147  ms/op
BenchmarkQueries.query  1500000  1  select count(*), year(INT_COL) as y, month(INT_COL) as m from MyTable group by y, m  EXP(0.5)  avgt  5  34.617 ± 0.604  ms/op
BenchmarkQueries.query  1500000  2  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  37.743 ± 0.558  ms/op
BenchmarkQueries.query  1500000  2  SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  157.974 ± 1.498  ms/op
BenchmarkQueries.query  1500000  2  SELECT NO_INDEX_STRING_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY NO_INDEX_STRING_COL,INT_COL ORDER BY NO_INDEX_STRING_COL, INT_COL ASC  EXP(0.5)  avgt  5  5.401 ± 0.134  ms/op
BenchmarkQueries.query  1500000  2  SELECT NO_INDEX_STRING_COL,LOW_CARDINALITY_STRING_COL,COUNT(*) FROM MyTable GROUP BY LOW_CARDINALITY_STRING_COL,NO_INDEX_STRING_COL ORDER BY LOW_CARDINALITY_STRING_COL, NO_INDEX_STRING_COL ASC  EXP(0.5)  avgt  5  3.633 ± 0.145  ms/op
BenchmarkQueries.query  1500000  2  select count(*), year(INT_COL) as y, month(INT_COL) as m from MyTable group by y, m  EXP(0.5)  avgt  5  34.755 ± 1.062  ms/op
BenchmarkQueries.query  1500000  10  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  191.722 ± 4.703  ms/op
BenchmarkQueries.query  1500000  10  SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  762.582 ± 5.516  ms/op
BenchmarkQueries.query  1500000  10  SELECT NO_INDEX_STRING_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY NO_INDEX_STRING_COL,INT_COL ORDER BY NO_INDEX_STRING_COL, INT_COL ASC  EXP(0.5)  avgt  5  24.967 ± 0.320  ms/op
BenchmarkQueries.query  1500000  10  SELECT NO_INDEX_STRING_COL,LOW_CARDINALITY_STRING_COL,COUNT(*) FROM MyTable GROUP BY LOW_CARDINALITY_STRING_COL,NO_INDEX_STRING_COL ORDER BY LOW_CARDINALITY_STRING_COL, NO_INDEX_STRING_COL ASC  EXP(0.5)  avgt  5  16.709 ± 0.314  ms/op
BenchmarkQueries.query  1500000  10  select count(*), year(INT_COL) as y, month(INT_COL) as m from MyTable group by y, m  EXP(0.5)  avgt  5  175.788 ± 11.362  ms/op
BenchmarkQueries.query  1500000  50  SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  917.493 ± 17.065  ms/op
BenchmarkQueries.query  1500000  50  SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL  EXP(0.5)  avgt  5  3623.057 ± 87.547  ms/op
BenchmarkQueries.query  1500000  50  SELECT NO_INDEX_STRING_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY NO_INDEX_STRING_COL,INT_COL ORDER BY NO_INDEX_STRING_COL, INT_COL ASC  EXP(0.5)  avgt  5  122.368 ± 1.879  ms/op
BenchmarkQueries.query  1500000  50  SELECT NO_INDEX_STRING_COL,LOW_CARDINALITY_STRING_COL,COUNT(*) FROM MyTable GROUP BY LOW_CARDINALITY_STRING_COL,NO_INDEX_STRING_COL ORDER BY LOW_CARDINALITY_STRING_COL, NO_INDEX_STRING_COL ASC  EXP(0.5)  avgt  5  82.259 ± 2.439  ms/op
BenchmarkQueries.query  1500000  50  select count(*), year(INT_COL) as y, month(INT_COL) as m from MyTable group by y, m  EXP(0.5)  avgt  5  781.819 ± 13.464  ms/op
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org