RE: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of returned rows in join
Hi Tom, Thanks a lot for your help! If it is in the HEAD, does it mean, it will be included in v14? I'll have to see, if we dare building our own v13 version with the patch. (I would love to, because I am simply thrilled to pieces, having a patch made by you for us 😉) Regards, Sigrid -Original Message- From: Tom Lane Sent: Wednesday, October 28, 2020 5:55 PM To: Ehrenreich, Sigrid Cc: David Rowley ; [email protected] Subject: Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of returned rows in join "Ehrenreich, Sigrid" writes: > A patch would be very much appreciated. > We are currently running on Version 12, but could upgrade to 13, if necessary. > Could you send me a notification if you managed to program a patch for that? I've pushed a patch for this to HEAD, but current thinking is that we will not be back-patching it. Still, if you're desperate you could consider running a custom build of v13 with that patch --- a quick check suggests that it would back-patch easily. v12 would be a slightly harder lift (I see one hunk doesn't apply) but probably not by much. https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=ad1c36b0709e47cdb3cc4abd6c939fe64279b63f regards, tom lane
Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of returned rows in join
On Thu, Oct 29, 2020 at 9:43 AM Ehrenreich, Sigrid wrote: > > Hi Tom, > > Thanks a lot for your help! > > If it is in the HEAD, does it mean, it will be included in v14? Yes, that's precisely what it means. Unless someone finds something bad with it and it has to be removed of course, but in principle it means that it will be in 14. //Magnus > > > I'll have to see, if we dare building our own v13 version with the patch. > (I would love to, because I am simply thrilled to pieces, having a patch made > by you for us ) > > Regards, > Sigrid > > -Original Message- > From: Tom Lane > Sent: Wednesday, October 28, 2020 5:55 PM > To: Ehrenreich, Sigrid > Cc: David Rowley ; > [email protected] > Subject: Re: Postgres Optimizer ignores information about foreign key > relationship, severly misestimating number of returned rows in join > > "Ehrenreich, Sigrid" writes: > > A patch would be very much appreciated. > > We are currently running on Version 12, but could upgrade to 13, if > > necessary. > > Could you send me a notification if you managed to program a patch for that? > > I've pushed a patch for this to HEAD, but current thinking is that we > will not be back-patching it. Still, if you're desperate you could > consider running a custom build of v13 with that patch --- a quick > check suggests that it would back-patch easily. v12 would be a > slightly harder lift (I see one hunk doesn't apply) but probably > not by much. > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=ad1c36b0709e47cdb3cc4abd6c939fe64279b63f > > regards, tom lane
RE: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of returned rows in join
Thanks! Regards, Sigrid -Original Message- From: Magnus Hagander Sent: Thursday, October 29, 2020 10:57 AM To: Ehrenreich, Sigrid Cc: Tom Lane ; David Rowley ; [email protected] Subject: Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of returned rows in join On Thu, Oct 29, 2020 at 9:43 AM Ehrenreich, Sigrid wrote: > > Hi Tom, > > Thanks a lot for your help! > > If it is in the HEAD, does it mean, it will be included in v14? Yes, that's precisely what it means. Unless someone finds something bad with it and it has to be removed of course, but in principle it means that it will be in 14. //Magnus > > > I'll have to see, if we dare building our own v13 version with the patch. > (I would love to, because I am simply thrilled to pieces, having a patch made > by you for us ) > > Regards, > Sigrid > > -Original Message- > From: Tom Lane > Sent: Wednesday, October 28, 2020 5:55 PM > To: Ehrenreich, Sigrid > Cc: David Rowley ; > [email protected] > Subject: Re: Postgres Optimizer ignores information about foreign key > relationship, severly misestimating number of returned rows in join > > "Ehrenreich, Sigrid" writes: > > A patch would be very much appreciated. > > We are currently running on Version 12, but could upgrade to 13, if > > necessary. > > Could you send me a notification if you managed to program a patch for that? > > I've pushed a patch for this to HEAD, but current thinking is that we > will not be back-patching it. Still, if you're desperate you could > consider running a custom build of v13 with that patch --- a quick > check suggests that it would back-patch easily. v12 would be a > slightly harder lift (I see one hunk doesn't apply) but probably > not by much. > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=ad1c36b0709e47cdb3cc4abd6c939fe64279b63f > > regards, tom lane
Re: query plan using partial index expects a much larger number of rows than is possible
On Wed, Oct 28, 2020 at 5:30 PM Tom Lane wrote: > "Olivier Poquet" writes: > > Looking at it in more detail, I found that the planner is assuming that > I'll get millions of rows back even when I do a simple query that does an > index scan on my partial index: > > We don't look at partial-index predicates when trying to estimate the > selectivity of a WHERE clause. It's not clear to me whether that'd be > a useful thing to do, or whether it could be shoehorned into the system > easily. (One big problem is that while the index size could provide > an upper bound, it's not apparent how to combine that knowledge with > selectivities of unrelated conditions. Also, it's riskier to extrapolate > a current rowcount estimate from stale relpages/reltuples data for an > index than it is for a table, because the index is less likely to scale > up linearly.) > > regards, tom lane > Aren't there custom stats created for functional indexes? Would it be feasible to create those for partial indexes as well, maybe only optionally? I assume there may be giant gaps with that notion.
Re: Understanding bad estimate (related to FKs?)
> On Oct 28, 2020, at 9:13 PM, Justin Pryzby wrote: > > On Mon, Oct 26, 2020 at 11:20:01AM -0600, Michael Lewis wrote: >> On Mon, Oct 26, 2020 at 11:14 AM Philip Semanchuk >> wrote: >> > The item I'm focused on is node 23. The estimate is for 7 rows, actual >>> is 896 (multiplied by 1062 loops). I'm confused about two things in this >>> node. > > The first is Postgres' estimate. The condition for this index scan >>> contains three expressions -- > > (five_uniform = zulu_five.five_uniform) AND > (whiskey_mike = juliet_india.whiskey_mike) AND > (bravo = 'mike'::text) >>> >> >> Are the columns correlated? > > I guess it shouldn't matter, since the FKs should remove all but one of the > conditions. Yes, I had the same expectation. I thought Postgres would calculate the selectivity as 1.0 * 1.0 * whatever estimate it has for the frequency of ‘mike’, but since the frequency estimate is very accurate but the planner’s estimate is not, there’s something else going on. > Maybe you saw this other thread, which I tentatively think also affects your > case (equijoin with nonjoin condition) > https://www.postgresql.org/message-id/AM6PR02MB5287A0ADD936C1FA80973E72AB190%40AM6PR02MB5287.eurprd02.prod.outlook.com Yes, thank you, I read that thread with interest. I tried your clever trick using BETWEEN, but it didn’t change the plan. Does that suggest there’s some other cause for the planner’s poor estimate? Cheers Philip
Re: Understanding bad estimate (related to FKs?)
On Thu, Oct 29, 2020 at 11:25:48AM -0400, Philip Semanchuk wrote: On Oct 28, 2020, at 9:13 PM, Justin Pryzby wrote: On Mon, Oct 26, 2020 at 11:20:01AM -0600, Michael Lewis wrote: On Mon, Oct 26, 2020 at 11:14 AM Philip Semanchuk wrote: The item I'm focused on is node 23. The estimate is for 7 rows, actual is 896 (multiplied by 1062 loops). I'm confused about two things in this node. The first is Postgres' estimate. The condition for this index scan contains three expressions -- (five_uniform = zulu_five.five_uniform) AND (whiskey_mike = juliet_india.whiskey_mike) AND (bravo = 'mike'::text) Are the columns correlated? I guess it shouldn't matter, since the FKs should remove all but one of the conditions. Yes, I had the same expectation. I thought Postgres would calculate the selectivity as 1.0 * 1.0 * whatever estimate it has for the frequency of ‘mike’, but since the frequency estimate is very accurate but the planner’s estimate is not, there’s something else going on. Well, this is quite a bit more complicated, I'm afraid :-( The clauses include parameters passed from the nodes above the index scan. So even if we had extended stats on the table, we couldn't use them as that requires (Var op Const) conditions. So this likely ends up with a product of estimates for each clause, and even then we can't use any particular value so we probably end up with something like 1/ndistinct or something like that. So if the values actually passed to the index scan are more common and/or if the columns are somehow correlated, it's not surprising we end up with an overestimate. Maybe you saw this other thread, which I tentatively think also affects your case (equijoin with nonjoin condition) https://www.postgresql.org/message-id/AM6PR02MB5287A0ADD936C1FA80973E72AB190%40AM6PR02MB5287.eurprd02.prod.outlook.com Yes, thank you, I read that thread with interest. I tried your clever trick using BETWEEN, but it didn’t change the plan. Does that suggest there’s some other cause for the planner’s poor estimate? I don't think that's related - to hit that bug, there would have to be implied conditions pushed-down to the scan level. And there's nothing like that in this case. FWIW I don't think this has anything to do with join cardinality estimation - at least not for the node 23. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
