[ 
https://issues.apache.org/jira/browse/CALCITE-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18069407#comment-18069407
 ] 

TJ Banghart commented on CALCITE-7380:
--------------------------------------

I apologize for the lengthy delay here. I am still interested in working on 
this and will take it up shortly.

Yes, likely an in-memory implementation: {{EnumerableWCOJ}} and associated 
conversion rule. I believe Calcite has much of the existing patterns/operators 
for us to make this a physical optimization only with no requirement for new 
logical operators or rules.

At the very least we'll need to recognize when WCOJ is appropriate over other 
join options by testing for cyclicity. I think reusing parts of the existing 
{{JoinToHyperGraphRule}} to get a hypergraph and performing a simple GYO 
reduction would suffice but it wouldn't protect from false positives. As noted 
in the paper, binary joins that _don't produce_ plans with exploding 
intermediate results (e.g. it may have join conditions that may reduce 
cardinality) could still be treated as WCOJ and actually hurt performance.

{{EnumerableWCOJ}} should consist of a trie where each level corresponds to a 
join variable in the global variable ordering. Following the paper, this trie 
can be used to perform hash-based intersections of join candidates, ensuring 
that each lookup is constrained by all prior join bindings.

> Add support for worst-case optimal join algorithms
> --------------------------------------------------
>
>                 Key: CALCITE-7380
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7380
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.42.0
>            Reporter: TJ Banghart
>            Assignee: TJ Banghart
>            Priority: Major
>         Attachments: p1891-freitag.pdf
>
>
> This issue proposes introducing worst-case optimal join (WCOJ) algorithms 
> into Calcite as an opt-in optimization. WCOJs evaluate multi-way joins in a 
> way that avoids generating large intermediate results and guarantees runtime 
> proportional to the worst-case output size of the query, which can 
> significantly outperform traditional binary join plans for certain query 
> patterns (e.g., cyclic or graph-like joins).
> The implementation would follow a hybrid approach, similar to [Freitag et al. 
> (PVLDB 2020|https://db.in.tum.de/~freitag/papers/p1891-freitag.pdf]), where 
> the optimizer can choose worst-case optimal joins when beneficial but fall 
> back to existing join strategies when they are more appropriate.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to