atifiu commented on PR #6252: URL: https://github.com/apache/iceberg/pull/6252#issuecomment-1764246045
@huaxingao Based on your suggestion, I have narrowed the filter criteria so that, even accounting for the timezone issue, we don't filter on more than two partitions and the filter can be pushed down completely, or alternatively the filter covers complete partitions once the timestamps are adjusted to UTC. In either case I still see post-scan filters and no aggregate pushdown, although I do see the following in the log. Please let me know what I am missing here.

> "Evaluating completely on Iceberg side: IsNotNull(initial_page_view_dtm)"

```
23/10/16 06:36:08 INFO SparkScanBuilder: Evaluating completely on Iceberg side: IsNotNull(initial_page_view_dtm)
23/10/16 06:36:08 INFO V2ScanRelationPushDown: Pushing operators to spark_catalog.schema.table1
Pushed Filters: IsNotNull(initial_page_view_dtm), GreaterThanOrEqual(initial_page_view_dtm,2023-06-02 06:00:00.0), LessThanOrEqual(initial_page_view_dtm,2023-06-02 08:59:59.0)
Post-Scan Filters: (initial_page_view_dtm#3 >= 2023-06-02 06:00:00),(initial_page_view_dtm#3 <= 2023-06-02 08:59:59)
23/10/16 06:36:08 INFO V2ScanRelationPushDown: Output: pageviewdate#0, initial_page_view_dtm#3
23/10/16 06:36:09 INFO SnapshotScan: Scanning table spark_catalog.schema.table1 snapshot 3251312493606204579 created at 2023-10-05T08:25:16.490+00:00 with filter ((initial_page_view_dtm IS NOT NULL AND initial_page_view_dtm >= (16-digit-int)) AND initial_page_view_dtm <= (16-digit-int))
23/10/16 06:36:09 INFO LoggingMetricsReporter: Received metrics report: ScanReport{tableName=spark_catalog.schema.table1, snapshotId=3251312493606204579, filter=((not_null(ref(name="initial_page_view_dtm")) and ref(name="initial_page_view_dtm") >= "(16-digit-int)") and ref(name="initial_page_view_dtm") <= "(16-digit-int)"), schemaId=0, projectedFieldIds=[1, 4], projectedFieldNames=[pageviewdate, initial_page_view_dtm], scanMetrics=ScanMetricsResult{totalPlanningDuration=TimerResult{timeUnit=NANOSECONDS, totalDuration=PT0.383991592S, count=1}, resultDataFiles=CounterResult{unit=COUNT, value=1}, resultDeleteFiles=CounterResult{unit=COUNT, value=0}, totalDataManifests=CounterResult{unit=COUNT, value=68}, totalDeleteManifests=CounterResult{unit=COUNT, value=0}, scannedDataManifests=CounterResult{unit=COUNT, value=1}, skippedDataManifests=CounterResult{unit=COUNT, value=67}, totalFileSizeInBytes=CounterResult{unit=BYTES, value=340185692}, totalDeleteFileSizeInBytes=CounterResult{unit=BYTES, value=0}, skippedDataFiles=CounterResult{unit=COUNT, value=30}, skippedDeleteFiles=CounterResult{unit=COUNT, value=0}, scannedDeleteManifests=CounterResult{unit=COUNT, value=0}, skippedDeleteManifests=CounterResult{unit=COUNT, value=0}, indexedDeleteFiles=CounterResult{unit=COUNT, value=0}, equalityDeleteFiles=CounterResult{unit=COUNT, value=0}, positionalDeleteFiles=CounterResult{unit=COUNT, value=0}}, metadata={engine-version=3.3.1, iceberg-version=Apache Iceberg 1.3.0 (commit 7dbdfd33a667a721fbb21c7c7d06fec9daa30b88), app-id=application_1689900894764_104752, engine-name=spark}}
```
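For reference, a minimal sketch of the kind of query I am testing with, assuming a simple MAX aggregate over the filtered range (the MAX and the exact SQL shape are illustrative; only the table, column, and predicate values come from the log above):

```scala
// Sketch only: MAX is an assumed aggregate, table/column/predicates are taken from the log above.
val df = spark.sql(
  """
    |SELECT MAX(initial_page_view_dtm)
    |FROM spark_catalog.schema.table1
    |WHERE initial_page_view_dtm >= TIMESTAMP '2023-06-02 06:00:00'
    |  AND initial_page_view_dtm <= TIMESTAMP '2023-06-02 08:59:59'
  """.stripMargin)

// Inspect the plan: if the aggregate were pushed down I would expect the scan node to report it
// and no Post-Scan Filters, but instead I still see the two post-scan filters shown in the log.
df.explain(true)
```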