gortiz opened a new issue, #8671: URL: https://github.com/apache/pinot/issues/8671
By reading the test linked below, it is clear that there are some cases where Pinot does not prune segments that it could prune. Specifically:

1. If all values are outside the min/max range but at least one of them is in the bloom filter, the segment is not pruned.
2. If at least one candidate value is in the min/max range, the bloom filter is not applied.
3. When there is no bloom filter, `eq` doesn't prune, but `in` does. Semantically this isn't a problem because issue 2 applies first. But once that is fixed, it will be evident that the bloom filter condition has to be changed as well.

The tests in [ColumnValueSegmentPrunerTest](https://github.com/apache/pinot/blob/b9bfb8e752f61079f42c2c857cacbd673eadf5d1/pinot-core/src/test/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPrunerTest.java#L163) actually show that the current behavior is wrong. Specifically:

```java
when(bloomFilterReader.mightContain("1")).thenReturn(true);
when(bloomFilterReader.mightContain("2")).thenReturn(true);
when(bloomFilterReader.mightContain("3")).thenReturn(true);
when(dataSourceMetadata.getMinValue()).thenReturn(5);
when(dataSourceMetadata.getMaxValue()).thenReturn(10);
...
assertFalse(runPruner(indexSegment, "SELECT COUNT(*) FROM testTable WHERE column IN (0, 1, 2)"));
```

In that situation, 0, 1 and 2 are all lower than the min value of the segment (5), so no row can match, but the test asserts that the segment is not pruned.
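To make the expected behavior concrete, here is a minimal sketch of the per-value check I would expect, assuming a value can only match when it is inside the min/max range and the bloom filter (if there is one) says it might be present. `PruneSketch`, `canPrune` and the `mightContain` predicate are hypothetical names for illustration only; this is not the actual `ColumnValueSegmentPruner` code, and `eq` is simply treated as an `in` with a single value.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names come from the Pinot code base.
final class PruneSketch {

  /**
   * Returns true when the segment can be skipped for `column IN (values)`.
   * mightContain stands in for the bloom filter lookup; null means the
   * column has no bloom filter.
   */
  static boolean canPrune(List<Integer> values, int min, int max,
      Predicate<String> mightContain) {
    for (int value : values) {
      // Min/max check: a value outside [min, max] cannot be in the segment.
      boolean inRange = value >= min && value <= max;
      // Bloom filter check: without a filter we must assume the value may exist.
      boolean maybeInBloom =
          mightContain == null || mightContain.test(Integer.toString(value));
      if (inRange && maybeInBloom) {
        return false; // at least one value may match, keep the segment
      }
    }
    return true; // every value was ruled out by min/max or by the bloom filter
  }
}
```

With min = 5 and max = 10, `canPrune(List.of(0, 1, 2), 5, 10, v -> true)` returns `true` under this sketch, which is the opposite of what the test above currently asserts.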