troywinter opened a new issue #6910:
URL: https://github.com/apache/incubator-pinot/issues/6910


   When the filter is applied to a transformed column, segment pruning does not 
happen. The query below takes more than 2s to finish:
   ```
   SELECT datetimeconvert(__time, '1:MILLISECONDS:EPOCH', '1:MILLISECONDS:EPOCH', '30:MINUTES'),
          COUNT(*)
   FROM product_log
   WHERE datetimeconvert(__time, '1:MILLISECONDS:EPOCH', '1:MILLISECONDS:EPOCH', '30:MINUTES') >= 1620830760000
     AND datetimeconvert(__time, '1:MILLISECONDS:EPOCH', '1:MILLISECONDS:EPOCH', '30:MINUTES') < 1620917160000
     AND method = 'DeviceInternalService.CheckDeviceInSameGroup'
     AND container_name = 'whale-device'
     AND error > '0'
   GROUP BY datetimeconvert(__time, '1:MILLISECONDS:EPOCH', '1:MILLISECONDS:EPOCH', '30:MINUTES')
   ORDER BY COUNT(*) DESC
   ```
   The query result is:
   ```
   {
       "resultTable": {
           "dataSchema": {
               "columnDataTypes": [
                   "LONG",
                   "LONG"
               ],
               "columnNames": [
                   
"datetimeconvert(__time,'1:MILLISECONDS:EPOCH','1:MILLISECONDS:EPOCH','30:MINUTES')",
                   "count(*)"
               ]
           },
           "rows": [
               [
                   1620873000000,
                   180
               ],
               [
                   1620869400000,
                   179
               ],
               [
                   1620871200000,
                   178
               ],
               [
                   1620894600000,
                   172
               ],
               [
                   1620892800000,
                   166
               ],
               [
                   1620874800000,
                   164
               ],
               [
                   1620876600000,
                   163
               ],
               [
                   1620896400000,
                   163
               ],
               [
                   1620867600000,
                   162
               ],
               [
                   1620885600000,
                   161
               ]
           ]
       },
       "exceptions": [],
       "numServersQueried": 1,
       "numServersResponded": 1,
       "numSegmentsQueried": 41,
       "numSegmentsProcessed": 41,
       "numSegmentsMatched": 12,
       "numConsumingSegmentsQueried": 3,
       "numDocsScanned": 7706,
       "numEntriesScannedInFilter": 195554753,
       "numEntriesScannedPostFilter": 7706,
       "numGroupsLimitReached": false,
       "totalDocs": 165272282,
       "timeUsedMs": 2335,
       "segmentStatistics": [],
       "traceInfo": {},
       "minConsumingFreshnessTimeMs": 1620917392724
   }
   ```
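
Note that `numSegmentsProcessed` equals `numSegmentsQueried` (41) in this response, so no segment was skipped. As a minimal sketch of the behaviour I would expect, assuming pruning compares the filter's time range with each segment's min/max time: the `Segment` type, `prune_by_time` function, and sample values below are hypothetical and are not Pinot's actual classes or data.

```python
from typing import List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    """Hypothetical stand-in for a segment's time metadata."""
    name: str
    min_time: int  # inclusive, epoch millis
    max_time: int  # inclusive, epoch millis


def prune_by_time(
    segments: List[Segment],
    time_range: Optional[Tuple[int, int]],
) -> List[Segment]:
    """Keep only segments overlapping the half-open range [lower, upper).

    time_range is None when no range on the raw time column could be
    extracted from the filter (for example, the predicate is on a
    function of the column). In that case nothing can be pruned.
    """
    if time_range is None:
        return list(segments)
    lower, upper = time_range
    return [s for s in segments if s.max_time >= lower and s.min_time < upper]


# Made-up segment boundaries, for illustration only.
segments = [
    Segment("seg_0", 1620700000000, 1620800000000),
    Segment("seg_1", 1620800000001, 1620900000000),
    Segment("seg_2", 1620900000001, 1621000000000),
]

# Predicate on the raw column: a range is available, seg_0 is skipped.
pruned = prune_by_time(segments, (1620830760000, 1620917160000))

# Predicate on the transformed column: no range, every segment is kept.
unpruned = prune_by_time(segments, None)
```

With the made-up boundaries above, only the first call skips a segment, which mirrors the difference between the two responses in this issue.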
   If the filter uses the raw time column instead of the transformed one, the 
query returns in 647ms:
   ```
   SELECT datetimeconvert(__time, '1:MILLISECONDS:EPOCH', '1:MILLISECONDS:EPOCH', '30:MINUTES'),
          COUNT(*)
   FROM product_log
   WHERE __time >= 1620830760000
     AND __time < 1620917160000
     AND method = 'DeviceInternalService.CheckDeviceInSameGroup'
     AND container_name = 'whale-device'
     AND error > '0'
   GROUP BY datetimeconvert(__time, '1:MILLISECONDS:EPOCH', '1:MILLISECONDS:EPOCH', '30:MINUTES')
   ORDER BY COUNT(*) DESC
   ```
   The result is:
   ```
   {
       "resultTable": {
           "dataSchema": {
               "columnDataTypes": [
                   "LONG",
                   "LONG"
               ],
               "columnNames": [
                   
"datetimeconvert(__time,'1:MILLISECONDS:EPOCH','1:MILLISECONDS:EPOCH','30:MINUTES')",
                   "count(*)"
               ]
           },
           "rows": [
               [
                   1620873000000,
                   180
               ],
               [
                   1620869400000,
                   179
               ],
               [
                   1620871200000,
                   178
               ],
               [
                   1620894600000,
                   172
               ],
               [
                   1620892800000,
                   166
               ],
               [
                   1620874800000,
                   164
               ],
               [
                   1620876600000,
                   163
               ],
               [
                   1620896400000,
                   163
               ],
               [
                   1620867600000,
                   162
               ],
               [
                   1620865800000,
                   161
               ]
           ]
       },
       "exceptions": [],
       "numServersQueried": 1,
       "numServersResponded": 1,
       "numSegmentsQueried": 41,
       "numSegmentsProcessed": 12,
       "numSegmentsMatched": 12,
       "numConsumingSegmentsQueried": 3,
       "numDocsScanned": 7770,
       "numEntriesScannedInFilter": 68503679,
       "numEntriesScannedPostFilter": 7770,
       "numGroupsLimitReached": false,
       "totalDocs": 165381107,
       "timeUsedMs": 647,
       "segmentStatistics": [],
       "traceInfo": {},
       "minConsumingFreshnessTimeMs": 1620917833431
   }
   ```
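
The two filters are not strictly the same range, which may partly account for the different `numDocsScanned` values (7706 vs 7770). A rough sketch of the arithmetic, assuming `datetimeconvert` with `'30:MINUTES'` granularity floors the epoch millis to a multiple of 30 minutes (the bucket values in the result rows are consistent with that). The helper names are hypothetical. It shows that a range on the bucketed value can always be rewritten as a range on raw `__time`, which is the form the time-based pruning picks up in the second query:

```python
BUCKET_MS = 30 * 60 * 1000  # 30 minutes in milliseconds


def bucket(t: int, size: int = BUCKET_MS) -> int:
    """Floor t to the start of its bucket (assumed datetimeconvert behaviour)."""
    return (t // size) * size


def ceil_to_bucket(t: int, size: int = BUCKET_MS) -> int:
    """Round t up to the nearest bucket boundary."""
    return -(-t // size) * size


def raw_range_for_bucketed_filter(lower: int, upper: int, size: int = BUCKET_MS):
    """Return (lo, hi) such that

        bucket(t) >= lower and bucket(t) < upper

    holds exactly when lo <= t < hi. Both bounds round up because
    bucket(t) only takes values that are multiples of size.
    """
    return ceil_to_bucket(lower, size), ceil_to_bucket(upper, size)


# Bounds taken from the first query in this issue.
lo, hi = raw_range_for_bucketed_filter(1620830760000, 1620917160000)
print(lo, hi)
```

For the bounds in this issue the equivalent raw range is `__time >= 1620831600000 AND __time < 1620918000000`, i.e. both ends shift forward by 14 minutes compared with the second query.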
   




