This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git


The following commit(s) were added to refs/heads/main by this push:
     new 4ae19ebce1 fix: update clickbench expected plan for NDV-aware optimization (#21050)
4ae19ebce1 is described below

commit 4ae19ebce11b02fc73d37d25dacc07d36c7221ef
Author: Alessandro Solimando <[email protected]>
AuthorDate: Thu Mar 19 13:41:52 2026 +0100

    fix: update clickbench expected plan for NDV-aware optimization (#21050)
    
    ## Which issue does this PR close?
    
    Fixes CI breakage on `main` introduced by #19957.
    
    ## Rationale for this change
    
    #19957 introduced extraction of NDV (number of distinct values) from
    Parquet metadata. The optimizer now sees NDV=1 for `HitColor`,
    `BrowserCountry`, `BrowserLanguage` in the clickbench test file and
    short-circuits `COUNT(DISTINCT)` to a constant projection, skipping
    the full table scan.
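    
    A minimal sketch of the idea, not the actual DataFusion code: the
    types `Stat` and `ColumnStats` and the functions below are
    hypothetical names made up for illustration. It assumes the NDV
    figure is exact and does not count NULLs; only then can
    `COUNT(DISTINCT col)` be answered from metadata. The scan can be
    dropped only when every aggregate in the query folds to a constant.
    
    ```rust
    /// Hypothetical statistic value: known exactly, estimated, or missing.
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum Stat {
        Exact(u64),
        Inexact(u64),
        Absent,
    }
    
    /// Hypothetical per-column statistics read from file metadata.
    #[derive(Debug, Clone, Copy, PartialEq)]
    struct ColumnStats {
        /// Number of distinct non-NULL values (assumption: NULLs excluded).
        distinct_count: Stat,
    }
    
    /// Returns the constant that can replace COUNT(DISTINCT col),
    /// or None when the statistic is estimated or missing.
    fn count_distinct_from_stats(stats: &ColumnStats) -> Option<u64> {
        match stats.distinct_count {
            Stat::Exact(ndv) => Some(ndv),
            Stat::Inexact(_) | Stat::Absent => None,
        }
    }
    
    /// Folds every aggregate to a constant, or returns None if any one of
    /// them still needs the scan (in which case the plan stays as it was).
    fn fold_all(columns: &[ColumnStats]) -> Option<Vec<u64>> {
        columns.iter().map(count_distinct_from_stats).collect()
    }
    ```
    
    With three columns that each report an exact NDV of 1, `fold_all`
    yields `Some(vec![1, 1, 1])`, which corresponds to the three `1 as
    count(DISTINCT ...)` expressions in the new plan. If any column has
    an estimated or missing NDV, the result is `None` and the aggregate
    over the scan is kept.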
    
    ## What changes are included in this PR?
    
    Updates the expected EXPLAIN plan in `clickbench.slt` to match the new
    (better) physical plan:
    
    ```diff
    -   01)AggregateExec: mode=Single, gby=[], aggr=[count(DISTINCT hits.HitColor), ...]
    -   02)--DataSourceExec: file_groups={1 group: [...]}, projection=[HitColor, BrowserLanguage, BrowserCountry], file_type=parquet
    +   01)ProjectionExec: expr=[1 as count(DISTINCT hits.HitColor), 1 as count(DISTINCT hits.BrowserCountry), 1 as count(DISTINCT hits.BrowserLanguage)]
    +   02)--PlaceholderRowExec
    +   02)--PlaceholderRowExec
    ```
    
    ## Are these changes tested?
    
    This PR *is* the test fix. Verified locally with `cargo test --profile
    ci -p datafusion-sqllogictest --test sqllogictests`.
    
    ## Are there any user-facing changes?
    
    No.
---
 datafusion/sqllogictest/test_files/clickbench.slt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/datafusion/sqllogictest/test_files/clickbench.slt b/datafusion/sqllogictest/test_files/clickbench.slt
index 42f066a80d..4e9849e365 100644
--- a/datafusion/sqllogictest/test_files/clickbench.slt
+++ b/datafusion/sqllogictest/test_files/clickbench.slt
@@ -1203,8 +1203,8 @@ logical_plan
 02)--SubqueryAlias: hits
 03)----TableScan: hits_raw projection=[HitColor, BrowserLanguage, BrowserCountry]
 physical_plan
-01)AggregateExec: mode=Single, gby=[], aggr=[count(DISTINCT hits.HitColor), count(DISTINCT hits.BrowserCountry), count(DISTINCT hits.BrowserLanguage)]
-02)--DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/clickbench_hits_10.parquet]]}, projection=[HitColor, BrowserLanguage, BrowserCountry], file_type=parquet
+01)ProjectionExec: expr=[1 as count(DISTINCT hits.HitColor), 1 as count(DISTINCT hits.BrowserCountry), 1 as count(DISTINCT hits.BrowserLanguage)]
+02)--PlaceholderRowExec
 
 query III
 SELECT COUNT(DISTINCT "HitColor"), COUNT(DISTINCT "BrowserCountry"), COUNT(DISTINCT "BrowserLanguage")  FROM hits;

