Re: [DISCUSS] CEP-39: Cost Based Optimizer

Caleb Rackliffe Thu, 21 Dec 2023 14:24:32 -0800

> We are also currently working on some SAI features that need cost based
> optimization.


I don't think we even need to consider *new* SAI features to see where SAI
would benefit from further *local* optimization, and I'm sympathetic to
that happening in the context of a larger framework, as long as the
framework itself starts as thin as possible and grows over time.

For SAI, the main difficulties we're likely to have in the very short term
are a.) how to order/choose predicates during AND queries to minimize
intersection complexity, b.) how to make decisions about when to use an
index or simple filtering, and c.) combinations of those two, where we
might take different paths depending on how many predicates exist and the
cardinality of the term indexes those predicates touch.
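
To make (b) concrete, here is a minimal sketch of an index-vs-filtering
decision driven by a local cardinality estimate. Everything in it is
hypothetical: the Predicate type, the choose_access_path function, and the
two cost constants are illustrative assumptions, not anything that exists
in SAI today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    column: str
    estimated_matches: int  # hypothetical local cardinality estimate


def choose_access_path(predicate, total_rows,
                       index_cost_per_match=4.0, scan_cost_per_row=1.0):
    """Return "index" or "filter", whichever has the lower estimated cost.

    The per-match and per-row costs are made-up constants; a real cost
    model would have to be measured.
    """
    index_cost = predicate.estimated_matches * index_cost_per_match
    scan_cost = total_rows * scan_cost_per_row
    return "index" if index_cost < scan_cost else "filter"


selective = Predicate("age", estimated_matches=100)
broad = Predicate("age", estimated_matches=9_000)
print(choose_access_path(selective, total_rows=10_000))
print(choose_access_path(broad, total_rows=10_000))
```

The point is only that the decision flips once a predicate matches enough
rows that index lookups cost more than scanning and filtering.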

For example, we have a system property called
SAI_INTERSECTION_CLAUSE_LIMIT (in CRP) that controls the maximum number of
index query intersections that will participate in an AND query, leaving
the rest for post-filtering. Having local cardinality estimation on the
individual column indexes might make it a lot easier to pick the two most
selective predicates. (Numeric range predicates, for example, can have
matching posting lists of wildly varying sizes.)
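
A rough sketch of what that selection could look like, assuming we already
have a per-column estimate of matching rows. The split_predicates function
and the estimates dict are hypothetical, and the limit of 2 simply mirrors
the "two most selective" case above.

```python
from typing import Dict, List, Tuple


def split_predicates(estimates: Dict[str, int],
                     intersection_limit: int = 2) -> Tuple[List[str], List[str]]:
    """Split AND-ed predicates into (intersect, post_filter).

    `estimates` maps a column name to its estimated number of matching
    rows (a hypothetical local cardinality estimate). The columns with the
    fewest estimated matches are intersected via their indexes; the rest
    are left for post-filtering.
    """
    # Sort by estimated matches, breaking ties by name for a stable plan.
    ranked = sorted(estimates, key=lambda col: (estimates[col], col))
    return ranked[:intersection_limit], ranked[intersection_limit:]


intersect, post_filter = split_predicates(
    {"price": 50_000, "sku": 12, "region": 900})
print(intersect, post_filter)
```

With these made-up estimates, sku and region would be intersected and the
wide price range would be applied as a post-filter.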

tl;dr I'd like to see us start by enumerating the specific scenarios where
query optimization will benefit SAI in conjunction w/ creating a template
for how a high-level CBO apparatus would work (which is sort of what we
have in this CEP, even if it doesn't go into extreme detail). Then, we
build from the bottom up to ship improvements as quickly as possible w/o
compromising the longer term CBO vision.
