Re: Indexes on expressions with multiple columns and operators
I wrote:
> Sigh ... so the answer is this used to work (since commit 39df0f150)
> and then I carelessly broke it in commit a391ff3c3. If you try this
> test case in versions 9.5..11 you get a spot-on rowcount estimate.
> Serves me right for not having a test case I guess, but I'm astonished
> that nobody complained sooner.
The attached fixes things so it works like it did pre-a391ff3c3.
I spent some time trying to devise a test case, and was reminded
of why I didn't have one before: it's hard to make a case that
will be robust enough to not show diffs in the buildfarm.
I'll keep thinking about that though.
regards, tom lane
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 5d51f97f219..d0f516b7645 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -874,6 +874,10 @@ clause_selectivity_ext(PlannerInfo *root,
varRelid,
jointype,
sjinfo);
+
+ /* If no support, fall back on boolvarsel */
+ if (s1 < 0)
+ s1 = boolvarsel(root, clause, varRelid);
}
else if (IsA(clause, ScalarArrayOpExpr))
{
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index f8641204a67..da5d901ec3c 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -2143,9 +2143,8 @@ join_selectivity(PlannerInfo *root,
/*
* function_selectivity
*
- * Returns the selectivity of a specified boolean function clause.
- * This code executes registered procedures stored in the
- * pg_proc relation, by calling the function manager.
+ * Attempt to estimate the selectivity of a specified boolean function clause
+ * by asking its support function. If the function lacks support, return -1.
*
* See clause_selectivity() for the meaning of the additional parameters.
*/
@@ -2163,15 +2162,8 @@ function_selectivity(PlannerInfo *root,
SupportRequestSelectivity req;
SupportRequestSelectivity *sresult;
- /*
- * If no support function is provided, use our historical default
- * estimate, 0.333. This seems a pretty unprincipled choice, but
- * Postgres has been using that estimate for function calls since 1992.
- * The hoariness of this behavior suggests that we should not be in too
- * much hurry to use another value.
- */
if (!prosupport)
- return (Selectivity) 0.333;
+ return (Selectivity) -1; /* no support function */
req.type = T_SupportRequestSelectivity;
req.root = root;
@@ -2188,9 +2180,8 @@ function_selectivity(PlannerInfo *root,
DatumGetPointer(OidFunctionCall1(prosupport,
PointerGetDatum(&req)));
- /* If support function fails, use default */
if (sresult != &req)
- return (Selectivity) 0.333;
+ return (Selectivity) -1; /* function did not honor request */
if (req.selectivity < 0.0 || req.selectivity > 1.0)
elog(ERROR, "invalid function selectivity: %f", req.selectivity);
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 1c480cfaaf7..e5e066a5537 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1528,6 +1528,17 @@ boolvarsel(PlannerInfo *root, Node *arg, int varRelid)
selec = var_eq_const(&vardata, BooleanEqualOperator, InvalidOid,
BoolGetDatum(true), false, true, false);
}
+ else if (is_funcclause(arg))
+ {
+ /*
+ * If we have no stats and it's a function call, estimate 0.333.
+ * This seems a pretty unprincipled choice, but Postgres has been
+ * using that estimate for function calls since 1992. The hoariness
+ * of this behavior suggests that we should not be in too much hurry
+ * to use another value.
+ */
+ selec = 0.333;
+ }
else
{
/* Otherwise, the default estimate is 0.5 */
Re: Why isn't PG using an index-only scan?
On Thu, 18 Sept 2025 at 23:55, Andrei Lepikhov wrote: > It looks like a makeshift solution. By implementing a callback, we could > elevate 'interrupter' to a first-class feature, enabling us to monitor > the state of the entire query tree (it is especially cool in EXPLAIN > ANALYZE mode when we may be OK with a partial result). > What's more interesting if Robert's work on nodes' extensibility > successes, it enables modules to save planning decisions in the plan. > Thereby, the in-executor callback provides a means to kick off the query > planned with incorrect estimations (e.g., too many spilled tuples). > > Perhaps we should start working on introducing this type of callback/ hook? I don't really have a fully formed opinion here, but I expect that if there is something that needs a new hook function, it might be easier to convince everyone that it's a good idea to accept that change if you first write a patch to add the hook, then go off and write the extension that uses it and then come back with that as evidence that the hook is useful enough to be committed to core. There's certainly places where you could add a hook that would just add an unacceptable overhead that we couldn't stomach. I expect ExecProcNode would be one of those places. I do agree that trying to come up with something for this is worthy of some time and effort. Helping people with performance issues when they can't even get EXPLAIN ANALYZE to finish is quite tricky. David
Re: Why isn't PG using an index-only scan?
On Thu, 18 Sept 2025 at 18:36, Jean-Christophe BOGGIO wrote: > Insert on copyrightad (cost=613790.59..4448045.97 rows=0 width=0) > -> Merge Join (cost=613790.59..4448045.97 rows=84972138 width=328) > Merge Cond: (((c.imcompid)::numeric) = ip.sipa_ip_code) > -> Sort (cost=35712.97..36475.44 rows=304986 width=8) > Sort Key: ((c.imcompid)::numeric) > -> Index Only Scan using ix_ad_imcompid on ad c > (cost=0.42..7931.21 rows=304986 width=8) > -> Materialize (cost=578077.61..594521.17 rows=3288712 width=23) > -> Sort (cost=578077.61..586299.39 rows=3288712 width=23) > Sort Key: ip.sipa_ip_code > -> Hash Join (cost=56043.04..154644.48 rows=3288712 > width=23) > Hash Cond: ((o.imworkid)::numeric = > ip.sipa_song_code) > -> Seq Scan on oeu o (cost=0.00..41901.33 > rows=1587533 width=8) > -> Hash (cost=48542.24..48542.24 rows=600064 > width=25) > -> Seq Scan on msipfl ip > (cost=0.00..48542.24 rows=600064 width=25) > Filter: ((sipa_comp_or_publ)::text = > 'C'::text) > Is it normal that PG is doing a seq scan on table oeu but an index-only scan > on ad? I had to stop the query after 5 hours, how can I make this faster? Of > course I ran VACUUM ANALYZE. Yes. Since *all* records of "oeu" are required and they're not required in any particular order, then Seq Scan should be the fastest way to access those records. 5 hours seems very slow for the estimated number of records. Have you tried running the SELECT using EXPLAIN ANALYZE without the INSERT part? Even if the 84 million Merge Join row estimate is accurate, 5 hours seems excessively long. If it still takes a long time, you might try SET enable_mergejoin = 0; and run the EXPLAIN ANALYZE SELECT .. part. That'll at least give us more accurate row counts of what we're actually working with. David
Re: Indexes on expressions with multiple columns and operators
Em qui., 18 de set. de 2025 às 13:40, Tom Lane escreveu: > I wrote: > > Sigh ... so the answer is this used to work (since commit 39df0f150) > > and then I carelessly broke it in commit a391ff3c3. If you try this > > test case in versions 9.5..11 you get a spot-on rowcount estimate. > > Serves me right for not having a test case I guess, but I'm astonished > > that nobody complained sooner. > > The attached fixes things so it works like it did pre-a391ff3c3. > > I spent some time trying to devise a test case, and was reminded > of why I didn't have one before: it's hard to make a case that > will be robust enough to not show diffs in the buildfarm. > I'll keep thinking about that though. > One question, make difference return -1.0 (float point notation)? If not, static analyzers will have less work. best regards, Ranier Vilela
Re: Indexes on expressions with multiple columns and operators
Jehan-Guillaume de Rorthais writes: > On a fresh instance from HEAD with its default configuration, it shows: > Index Scan using foo_s_idx on foo (cost=0.29..8.39 rows=3 width=13) > Index Cond: (s(crit, ackid) = true) > It seems statistics shown in "pg_stats" view for function "s()" are good. The > query itself even have the same costs than the query using the syntax tips you > provide before. > However, the estimated row number seems wrong in regard with the costs shown > and statistics. Yeah. The problem is that clause_selectivity_ext fails to consider use of statistics if the clause looks like "bool_valued_function(...)". If it looks like "bool_valued_function(...) = true", that goes down a different code path that does the right thing. Additional factors: * If you just write "WHERE bool_valued_function(...) = true", that gets stripped down to "WHERE bool_valued_function(...)" in the name of making equivalent expressions look equivalent. (IS TRUE doesn't get stripped, which is why you have to use that wording to avoid that.) * Index condition building puts back the "= true" in order to construct something that satisfies the index AM API. And then it uses that form to get a selectivity estimate for costing purposes --- so the right number goes into the indexscan cost estimate. * But the rowcount estimate is made on the form without "= true". That's the number shown in EXPLAIN and used when considering joins. regards, tom lane
Re: Indexes on expressions with multiple columns and operators
=?UTF-8?Q?Fr=C3=A9d=C3=A9ric_Yhuel?= writes: > On 9/17/25 16:41, Tom Lane wrote: >> =?UTF-8?Q?Fr=C3=A9d=C3=A9ric_Yhuel?= writes: >>> 2) the number of estimated rows is completely off in the second EXPLAIN, >>> whereas the planner could easily use the statistics of foo_f_idx. >> Hmm, not sure about that. Again, boolean-valued indexes aren't >> something we've worked on too hard, but I don't see why that >> would affect this case. > OK, thanks anyway, I think the ju-jitsu mentioned above will do, even > though the application code will have to be patched. Sigh ... so the answer is this used to work (since commit 39df0f150) and then I carelessly broke it in commit a391ff3c3. If you try this test case in versions 9.5..11 you get a spot-on rowcount estimate. Serves me right for not having a test case I guess, but I'm astonished that nobody complained sooner. regards, tom lane
Re: Why isn't PG using an index-only scan?
Thanks David, Le 18/09/2025 à 09:20, David Rowley a écrit : Yes. Since *all* records of "oeu" are required and they're not required in any particular order, then Seq Scan should be the fastest way to access those records. Ok but then why is it doing it on the AD table? Is it because of the number of rows? 5 hours seems very slow for the estimated number of records. Have you tried running the SELECT using EXPLAIN ANALYZE without the INSERT part? Even if the 84 million Merge Join row estimate is accurate, 5 hours seems excessively long. I tried this, it took 1 second. And then I discovered that I had an old trigger on the copyrightad table. The insert took 2 seconds. Very sorry for wasting your time and thanks again for putting me on the right track. Have a great day!
Re: Why isn't PG using an index-only scan?
On 18/9/2025 13:35, David Rowley wrote: On Thu, 18 Sept 2025 at 19:55, Andrei Lepikhov wrote: Imagine if we had a hook within the ExecProcNode. In that scenario, we could create a trivial extension that would stop the query after, let's say, 10 minutes of execution and display the current state. This would give us more reliable data on estimation and the state of the plan tree. I recall something along those lines existing once in the extension world. Or maybe just from a previous employer. I don't recall many of the details. Maybe something like a function you pass an SQL string and if you cancelled the query it reported the EXPLAIN ANALYZE done so far. I assume it must have done something like LOG it as it couldn't have shown the EXPLAIN as query results on cancel. It looks like a makeshift solution. By implementing a callback, we could elevate 'interrupter' to a first-class feature, enabling us to monitor the state of the entire query tree (it is especially cool in EXPLAIN ANALYZE mode when we may be OK with a partial result). What's more interesting if Robert's work on nodes' extensibility successes, it enables modules to save planning decisions in the plan. Thereby, the in-executor callback provides a means to kick off the query planned with incorrect estimations (e.g., too many spilled tuples). Perhaps we should start working on introducing this type of callback/ hook? -- regards, Andrei Lepikhov
Re: Why isn't PG using an index-only scan?
On 18/9/2025 09:20, David Rowley wrote: On Thu, 18 Sept 2025 at 18:36, Jean-Christophe BOGGIO If it still takes a long time, you might try SET enable_mergejoin = 0; and run the EXPLAIN ANALYZE SELECT .. part. That'll at least give us more accurate row counts of what we're actually working with.This appears to be a good example of a missing feature: the in-execution callback or hook. Imagine if we had a hook within the ExecProcNode. In that scenario, we could create a trivial extension that would stop the query after, let's say, 10 minutes of execution and display the current state. This would give us more reliable data on estimation and the state of the plan tree. What are your thoughts? -- regards, Andrei Lepikhov
Re: Why isn't PG using an index-only scan?
On Thu, 18 Sept 2025 at 19:45, Jean-Christophe BOGGIO wrote: > Ok but then why is it doing it on the AD table? Is it because of the > number of rows? It's hard to tell as I don't have a clear view on which columns are from which tables. Perhaps "ad" can Index Only Scan because all of the columns required are in an index but with "oeu" there's a column being used that's not indexed? David
Re: Why isn't PG using an index-only scan?
On Thu, 18 Sept 2025 at 19:55, Andrei Lepikhov wrote: > Imagine if we had a hook within the ExecProcNode. In that scenario, we > could create a trivial extension that would stop the query after, let's > say, 10 minutes of execution and display the current state. This would > give us more reliable data on estimation and the state of the plan tree. I recall something along those lines existing once in the extension world. Or maybe just from a previous employer. I don't recall many of the details. Maybe something like a function you pass an SQL string and if you cancelled the query it reported the EXPLAIN ANALYZE done so far. I assume it must have done something like LOG it as it couldn't have shown the EXPLAIN as query results on cancel. David
Re: Indexes on expressions with multiple columns and operators
Hi there,
I think this discussion has a nice solution, thank you!
However, while poking around this issue yesterday, we also found something
surprising between estimated rows and costs when using a function. Bellow the
scenario to apply on top of Frederic's one to quickly expose the weirdness:
CREATE OR REPLACE FUNCTION s(crit text, ackid int)
RETURNS bool LANGUAGE plpgsql IMMUTABLE AS $$
BEGIN RETURN crit = 'WARNING' AND ackid IS NULL;
END $$;
CREATE INDEX foo_s_idx ON foo (s(crit, ackid));
ANALYZE foo ;
EXPLAIN SELECT * FROM foo WHERE s(crit, ackid);
EXPLAIN SELECT * FROM foo
WHERE (ackid IS NULL AND crit = 'WARNING') is true;
SELECT most_common_vals, most_common_freqs
FROM pg_stats WHERE tablename = 'foo_s_idx';
On a fresh instance from HEAD with its default configuration, it shows:
Index Scan using foo_s_idx on foo (cost=0.29..8.39 rows=3 width=13)
Index Cond: (s(crit, ackid) = true)
Index Scan using foo_expr_idx on foo (cost=0.29..8.39 rows=5 width=13)
Index Cond: (((ackid IS NULL) AND (crit = 'WARNING'::text)) = true)
most_common_vals | most_common_freqs
--+---
{f,t}| {0.5,5e-05}
It seems statistics shown in "pg_stats" view for function "s()" are good. The
query itself even have the same costs than the query using the syntax tips you
provide before.
However, the estimated row number seems wrong in regard with the costs shown
and statistics. It looks like a default hardcoded 33% has been applied. As
Frederic said earlier, this bad row estimation drives the upper join to the
wrong method.
Is this a known planer behavior?
(again, the ju-jitsu syntax provided by Laurenz and you is definitely on
point here, I just hijack the thread to discuss this weirdness.)
Regards,
