Queries in plpgsql are 6 times slower on partitioned tables
After migrating to a partitioned table, I noticed that a
performance-critical plpgsql function is a few times slower.
Basically, the function takes a key as an argument, and performs SELECT,
UPDATE and DELETE operations on tables partitioned by the key.
I narrowed the problem down to the following: take an empty table
"demo" with a column "key", and two plpgsql functions that run "DELETE FROM
demo WHERE key = XYZ" in a loop, in two flavours: one takes the key as an
argument, and the other has the key hardcoded.
Here are the running times:
- delete by hardcoded value from non-partitioned table: 39.807 ms
- delete by argument from non-partitioned table: 45.734 ms
- delete by hardcoded value from partitioned table: 47.101 ms
- delete by argument from partitioned table: 295.748 ms
Deleting by argument from an empty partitioned table is 6 times slower!
Why is that? The number of partitions doesn't seem to matter. And DELETE is
just an example; SELECT behaves the same way (a SELECT-flavoured sketch
follows after the sample output below).
Sample code:
-- partitioned table
DROP TABLE IF EXISTS demo_partitioned;
CREATE TABLE demo_partitioned(key BIGINT, val BIGINT) PARTITION BY LIST
(key);
DO $$
DECLARE
i BIGINT;
BEGIN
FOR i IN SELECT * FROM generate_series(1, 15)
LOOP
EXECUTE 'CREATE TABLE demo_partitioned_key_'|| i ||' PARTITION OF
demo_partitioned FOR VALUES IN (' || i || ');';
END LOOP;
END$$;
CREATE OR REPLACE FUNCTION del_from_partitioned_by_arg(k BIGINT)
RETURNS VOID AS $$
DECLARE
i BIGINT;
BEGIN
FOR i IN SELECT * FROM generate_series(1, 1)
LOOP
DELETE FROM demo_partitioned WHERE key = k;
END LOOP;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION del_from_partitioned_hardcoded()
RETURNS VOID AS $$
DECLARE
i BIGINT;
BEGIN
FOR i IN SELECT * FROM generate_series(1, 1)
LOOP
DELETE FROM demo_partitioned WHERE key = 3;
END LOOP;
END;
$$ LANGUAGE plpgsql;
ANALYZE demo_partitioned;
EXPLAIN ANALYZE DELETE FROM demo_partitioned WHERE key = 3;
EXPLAIN ANALYZE SELECT * FROM del_from_partitioned_hardcoded();
EXPLAIN ANALYZE SELECT * FROM del_from_partitioned_by_arg(3);
-- non-partitioned table
DROP TABLE IF EXISTS demo_non_partitioned;
CREATE TABLE demo_non_partitioned(key BIGINT, val BIGINT);
ANALYZE demo_non_partitioned;
CREATE OR REPLACE FUNCTION del_from_non_partitioned_by_arg(k BIGINT)
RETURNS VOID AS $$
DECLARE
i BIGINT;
BEGIN
FOR i IN SELECT * FROM generate_series(1, 1)
LOOP
DELETE FROM demo_non_partitioned WHERE key = k;
END LOOP;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION del_from_non_partitioned_hardcoded()
RETURNS VOID AS $$
DECLARE
i BIGINT;
BEGIN
FOR i IN SELECT * FROM generate_series(1, 1)
LOOP
DELETE FROM demo_non_partitioned WHERE key = 3;
END LOOP;
END;
$$ LANGUAGE plpgsql;
EXPLAIN ANALYZE DELETE FROM demo_non_partitioned WHERE key = 3;
EXPLAIN ANALYZE SELECT * FROM del_from_non_partitioned_hardcoded();
EXPLAIN ANALYZE SELECT * FROM del_from_non_partitioned_by_arg(3);
Output:
DROP TABLE
CREATE TABLE
DO
CREATE FUNCTION
CREATE FUNCTION
ANALYZE
QUERY PLAN
---
Delete on demo_partitioned (cost=0.00..29.43 rows=9 width=6) (actual
time=0.002..0.002 rows=0 loops=1)
Delete on demo_partitioned_key_3
-> Seq Scan on demo_partitioned_key_3 (cost=0.00..29.43 rows=9
width=6) (actual time=0.001..0.001 rows=0 loops=1)
Filter: (key = 3)
Planning Time: 0.180 ms
Execution Time: 0.069 ms
(6 rows)
QUERY PLAN
---
Function Scan on del_from_partitioned_hardcoded (cost=0.05..0.06 rows=1
width=4) (actual time=47.030..47.030 rows=1 loops=1)
Planning Time: 0.020 ms
Execution Time: 47.101 ms
(3 rows)
QUERY PLAN
--
Function Scan on del_from_partitioned_by_arg (cost=0.05..0.06 rows=1
width=4) (actual time=295.737..295.737 rows=1 loops=1)
Planning Time: 0.023 ms
Execution Time: 295.748 ms
(3 rows)
DROP TABLE
CREATE TABLE
ANALYZE
CREATE FUNCTION
CREATE FUNCTION
QUERY PLAN
-
Delete on demo_non_partitioned (cost=0.00..29.43 rows=9 width=6) (actual
time=0.002..0.003 rows=0 loops=1)
-> Seq Scan on demo_non_partitioned (cost=0.00..29.43 rows=9 width=6)
(actual time=0.002..0.002 rows=0 loops=1)
Filter: (key = 3)
Planning Time: 0.046 ms
Ex
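Since SELECT reportedly behaves the same way, a SELECT-flavoured variant of
the by-argument function might look like this (a sketch in the same pattern
as the functions above, not part of the original benchmark):
CREATE OR REPLACE FUNCTION sel_from_partitioned_by_arg(k BIGINT)
RETURNS BIGINT AS $$
DECLARE
i BIGINT;
n BIGINT := 0;
BEGIN
FOR i IN SELECT * FROM generate_series(1, 1)
LOOP
SELECT count(*) INTO n FROM demo_partitioned WHERE key = k;
END LOOP;
RETURN n;
END;
$$ LANGUAGE plpgsql;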
Bad query plan decision when using multiple column index - postgresql uses only first column then filters
Hello List, I'm Cosmin. This is my first post and I'll get right down to
the problem. I'm using Postgresql 10 (because that's what's installed by
default on Ubuntu 18.04):
explain analyze
select R, C, V from LBD
where Ver = 92 and Id in (10,11)
Index Scan using "IX_LBD_Ver_Id" on "LBD" (cost=0.56..2.37 rows=1
width=13) (actual time=0.063..857.725 rows=2 loops=1)
Index Cond: ("Ver" = 92)
Filter: ("Id" = ANY ('{10,11}'::integer[]))
Rows Removed by Filter: 1869178
Planning time: 0.170 ms
Execution time: 857.767 ms
The IX_LBD_Ver_Id index is on two columns (Ver, Id) - it's not on "Ver"
alone!
Somehow the query planner thinks that scanning the index on "Ver" alone
should only return 1 record. The problem is that there are, on average,
millions of records for each "Ver"!
The current query is not my real query: the original problem was with a
JOIN. I boiled it down to this simple query because it shows the same
problem: when dealing with more than one "Id", the planner scans on "Ver"
and filters on "Id". Running the query with a single "Id" does use the
index on both columns, and the query finishes in only 0.7 ms (one thousand
times faster).
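For comparison, the single-"Id" variant described above would look like this
(a sketch based on the description; the plan shape in the comments is
illustrative, not captured output):
explain analyze
select R, C, V from LBD
where Ver = 92 and Id = 10
-- Reported to use both index columns, roughly:
--   Index Scan using "IX_LBD_Ver_Id" on "LBD"
--     Index Cond: (("Ver" = 92) AND ("Id" = 10))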
The planner doesn't always get it this wrong. The original JOIN usually runs
instantaneously - unless the planner gets into its current funk, in which
case the original JOIN never finishes.
- I've reindexed the whole database
- I ran ANALYZE on all tables
- I checked "pg_stats", here are the stats:
select attname, null_frac, avg_width, n_distinct, correlation from pg_stats
where tablename = 'LBD' and (attname in ('Id', 'Ver'))
 attname | null_frac | avg_width | n_distinct | correlation
---------+-----------+-----------+------------+-------------
 Id      |         0 |         4 |    2029846 |   0.0631249
 Ver     |         0 |         2 |         22 |    0.624823
According to the data in "pg_stats", the query planner should know that the
"LBD" table has, on average, millions of records per "Ver".
The fact that this works right most of the time tells me I'm dealing with
some kind of statistical information (something along the lines of
n_distinct from pg_stats) and gives me hope. Once I know why the planner
gets this wrong I should be able to make it right.
Please point me in the right direction. Where should I look, what should I
try?
Thank you,
Cosmin
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
Cosmin Prund writes:
> explain analyze
>select R, C, V from LBD
>where Ver = 92 and Id in (10,11)
> Index Scan using "IX_LBD_Ver_Id" on "LBD" (cost=0.56..2.37 rows=1
> width=13) (actual time=0.063..857.725 rows=2 loops=1)
> Index Cond: ("Ver" = 92)
> Filter: ("Id" = ANY ('{10,11}'::integer[]))
> Rows Removed by Filter: 1869178
> Planning time: 0.170 ms
> Execution time: 857.767 ms
> The IX_LBD_Ver_Id index is on two columns (Ver, Id) - it's not in "Ver"
> alone!
Seems like an odd choice of plan, then, but you haven't provided any
detail that would let anyone guess why it's not using the second index
column. For starters it would be good to show the exact table and
index schema (eg via \d+ in psql). Also, does explicitly ANALYZE'ing
the table change anything?
https://wiki.postgresql.org/wiki/Slow_Query_Questions
regards, tom lane
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
Does the behavior change with different values of the Ver column? I'd be
curious whether the fraction in the MCVs frequency list in the stats
indicates that rows with Ver = 92 are rare, such that the index on only the
Ver column is expected to be sufficient to find the rows quickly. What is
reltuples for this table, by the way?

I also wonder if the situation may be helped by re-indexing the "index on
both columns" to remove any chance of issues with bloat in the index. Which
order are the columns in, by the way? If Ver is first, is there also an
index on only the Id column? Since you aren't on v12, you don't get to
reindex concurrently, but I assume you know the workaround of create
concurrently (different name), drop concurrently (old one), and finally
rename the new index.
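In SQL, the workaround mentioned at the end might look like this (a sketch;
table and index names follow the redacted example above, and the "_new"
name is a placeholder):
-- Rebuild the two-column index without locking out writes (pre-v12 workaround):
CREATE INDEX CONCURRENTLY "IX_LBD_Ver_Id_new" ON "LBD" ("Ver", "Id");
DROP INDEX CONCURRENTLY "IX_LBD_Ver_Id";
ALTER INDEX "IX_LBD_Ver_Id_new" RENAME TO "IX_LBD_Ver_Id";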
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
Hi Tom, and thanks.
Running ANALYZE doesn't change a thing. REINDEXING doesn't change a thing.
I know it's an odd choice of plan - that's why I'm here!
I thought I'd just post what felt relevant, hoping it's not something out
of the ordinary and I'm just missing something obvious.
Here's lots of data:
===
SELECT version()
PostgreSQL 10.10 (Ubuntu 10.10-0ubuntu0.18.04.1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit
===
\d "LucrareBugetDate"
                               Table "public.LucrareBugetDate"
           Column           |         Type          | Collation | Nullable |                             Default
----------------------------+-----------------------+-----------+----------+------------------------------------------------------------------
 OrdonatorPrincipalId       | uuid                  |           |          |
 UnitateSubordonataId       | uuid                  |           |          |
 CentralizatorSelectiv      | text                  |           |          |
 IdRand                     | character varying(32) |           |          |
 IdColoana                  | character varying(32) |           |          |
 ClasEc                     | character varying(32) |           |          |
 CodSector                  | character varying(4)  |           |          |
 CodSursa                   | character varying(4)  |           |          |
 Paragraf                   | character varying(16) |           |          |
 Venit                      | character varying(16) |           |          |
 FelValoare                 | integer               |           | not null |
 Valoare                    | numeric               |           | not null |
 RangOperator               | integer               |           | not null |
 OrdineCalcul               | integer               |           | not null |
 ConflictFormuleAlternative | boolean               |           | not null | false
 Sectiune                   | integer               |           |          |
 RefColoana                 | text                  |           |          |
 RefDocument                | text                  |           |          |
 RefLinie                   | text                  |           |          |
 SeqModificare              | integer               |           | not null | 0
 LucrareBugetDateId         | integer               |           | not null | nextval('"LucrareBugetDate_LucrareBugetDateIdV2_seq"'::regclass)
 LucrareBugetVersiuneId     | smallint              |           | not null |
 CentralizatorSelectivId    | uuid                  |           |          |
 Stil                       | text                  |           |          |
 ValoareArhivata            | boolean               |           |          |
Indexes:
"PK_LucrareBugetDate" PRIMARY KEY, btree ("LucrareBugetVersiuneId",
"LucrareBugetDateId")
"IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" btree
("LucrareBugetVersiuneId", "LucrareBugetDateId")
Foreign-key constraints:
"FK_LucrareBugetDate_LucrareBugetVersiune_LucrareBugetVersiuneId"
FOREIGN KEY ("LucrareBugetVersiuneId") REFERENCES
"LucrareBugetVersiune"("LucrareBugetVersiuneId") ON DELETE CASCADE
===
SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts,
relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE
relname='LucrareBugetDate';
     relname      | relpages |  reltuples  | relallvisible | relkind | relnatts | relhassubclass |   reloptions    | pg_table_size
------------------+----------+-------------+---------------+---------+----------+----------------+-----------------+---------------
 LucrareBugetDate |  2659660 | 4.17124e+07 |        671510 | r       |       25 | f              | {fillfactor=50} |   21793775616
(1 row)
===
Does the table have anything unusual about it?
- contains large objects: NO
- has a large proportion of NULLs in several columns: NO
- receives a large number of UPDATEs or DELETEs regularly: YES - lots of
UPDATEs, but no UPDATEs to indexed columns. No DELETEs.
- is growing rapidly: I'm inserting millions of records at once but not
very often. Have manually done ANALYZE and REINDEX
- has many indexes on it: NO
- uses triggers that may be executing database functions, or is calling
functions directly: NO
===
EXPLAIN (ANALYZE, BUFFERS) select "IdRand", "IdColoana", "Valoare" from
"LucrareBugetDate" where ("LucrareBugetVersiuneId" = 92) and
("LucrareBugetDateId" in (10,11));
QUERY PLAN
--------------------------------------------------------------------------
 Index Scan using
"IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" on
"LucrareBugetDate" (cost=0.56..2.37 rows=1 width=13) (actual
time=0.096..978.398 rows=2 loops=1)
   Index Cond: ("LucrareBugetVersiuneId" = 92)
   Filter: ("LucrareBugetDateId" = ANY ('{10,11}'::integer[]))
   Rows Removed by Filter: 1869178
   Buffers: shared hit=161178
 Planning time: 0.699 ms
 Execution time: 978.433 ms
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
Michael Lewis writes:
> Does the behavior change with different values of Ver column?

By and large, indxpath.c will just add all qual clauses that match an index
to the indexscan's conditions --- there's no attempt to decide that some of
them might not be worth it on cost grounds. So I'd be pretty surprised if
altering the Ver constant made any difference.

My money is on there being some reason why the IN clause doesn't match the
index, perhaps a type mismatch. Without seeing the table schema, and the
exact query, it's hard to say what that reason is. (I'll not insult your
intelligence by saying how I know that the OP didn't just copy-and-paste
that query.)

regards, tom lane
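As an illustration of the kind of mismatch meant here (a hypothetical table,
not the OP's schema): comparing an integer column against numeric literals
forces a numeric comparison, which cannot be used as an index condition on
that column, so it ends up as a filter instead:
create table cc(f1 smallint, f2 integer, primary key(f1, f2));
explain select * from cc where f1 = 92 and f2 in (10.0, 11.0);
-- The condition on f2 becomes something like
--   Filter: ((f2)::numeric = ANY ('{10.0,11.0}'::numeric[]))
-- while only (f1 = 92) remains usable as an Index Cond.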
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
Hello Michael and hello again Tom, sorry for mailing you directly. I just
hit Reply in gmail - I expected the emails to have a reply-to=Pgsql.
Apparently they do not.
Running the same query with a different "Ver" produces a proper plan.
Here's a non-redacted example (Ver=91):
EXPLAIN (ANALYZE, BUFFERS) select "IdRand", "IdColoana", "Valoare" from
"LucrareBugetDate" where ("LucrareBugetVersiuneId" = 91) and
("LucrareBugetDateId" in (10,11));
QUERY PLAN
-
Index Scan using
"IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" on
"LucrareBugetDate" (cost=0.56..4.95 rows=2 width=13) (actual
time=3.617..3.631 rows=2 loops=1)
Index Cond: (("LucrareBugetVersiuneId" = 91) AND ("LucrareBugetDateId" =
ANY ('{10,11}'::integer[])))
Buffers: shared hit=9 read=3
Planning time: 0.223 ms
Execution time: 3.663 ms
(5 rows)
I have reindexed everything, not just this index.
"reltuples" for this table is 41712436.
> I'd be curious of the fraction in the MCVs frequency list in stats
indicates that rows with Ver = 92 are rare and therefore the index on
only Ver column is sufficient to find the rows quickly.
There are 25 valid values for "Ver" in this database. I ran the query for
all of them. The only one misbehaving is "92". I ran the query with
random values for Ver (invalid values); the query plan always attempts to
use the index on both columns.
I looked into "most_common_values" in pg_stats, this value (92) is not in
that list.
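That check can be done with something like the following (a sketch; table
and column names follow the schema shown earlier):
SELECT most_common_vals, most_common_freqs, n_distinct
FROM pg_stats
WHERE tablename = 'LucrareBugetDate'
  AND attname = 'LucrareBugetVersiuneId';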
Finally I ran "ANALYZE" again and now the problem went away. Running the
query with Ver=92 uses the proper plan. I'm not happy with this - I know I
haven't solved the problem (I've run ANALYZE multiple times before).
On Thu, 16 Jan 2020 at 19:00, Michael Lewis wrote:
> Does the behavior change with different values of Ver column? I'd be
> curious of the fraction in the MCVs frequency list in stats indicates that
> rows with Ver = 92 are rare and therefore the index on only Ver column is
> sufficient to find the rows quickly. What is reltuples for this table by
> the way?
>
> I also wonder if the situation may be helped by re-indexing the "index on
> both columns" to remove any chance of issues on bloat in the index. Which
> order are the columns by the way? If Ver is first, is there also an index
> on only id column?. Since you aren't on v12, you don't get to re-index
> concurrently but I assume you know the work around of create concurrently
> (different name), drop concurrently (old one), and finally rename new index.
>
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
On Thu, 2020-01-16 at 19:18 +0200, Cosmin Prund wrote:
> Indexes:
> "PK_LucrareBugetDate" PRIMARY KEY, btree ("LucrareBugetVersiuneId",
> "LucrareBugetDateId")
> "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" btree
> ("LucrareBugetVersiuneId", "LucrareBugetDateId")
> Foreign-key constraints:
> "FK_LucrareBugetDate_LucrareBugetVersiune_LucrareBugetVersiuneId" FOREIGN
> KEY ("LucrareBugetVersiuneId") REFERENCES
> "LucrareBugetVersiune"("LucrareBugetVersiuneId") ON DELETE CASCADE
>
> EXPLAIN (ANALYZE, BUFFERS) select "IdRand", "IdColoana", "Valoare" from
> "LucrareBugetDate" where ("LucrareBugetVersiuneId" = 92) and
> ("LucrareBugetDateId" in (10,11));
>
> QUERY PLAN
> ---
> Index Scan using
> "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" on
> "LucrareBugetDate" (cost=0.56..2.37 rows=1 width=13) (actual
> time=0.096..978.398 rows=2 loops=1)
>    Index Cond: ("LucrareBugetVersiuneId" = 92)
>    Filter: ("LucrareBugetDateId" = ANY ('{10,11}'::integer[]))
>    Rows Removed by Filter: 1869178
>    Buffers: shared hit=161178
> Planning time: 0.699 ms
> Execution time: 978.433 ms
Well, what should the poor thing do?
There is no index on "LucrareBugetDateId".
Rather, you have two indexes on ("LucrareBugetVersiuneId",
"LucrareBugetDateId"),
one of which should be dropped.
Try with an index on "LucrareBugetDateId".
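In SQL, those two suggestions might look like this (a sketch; the new index
name is illustrative):
-- Drop the duplicate of the primary key's index:
DROP INDEX "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId";
-- Add a single-column index on "LucrareBugetDateId":
CREATE INDEX "IX_LucrareBugetDate_LucrareBugetDateId"
    ON "LucrareBugetDate" ("LucrareBugetDateId");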
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
Cosmin Prund writes:
> I know it's an odd choice of plan - that's why I'm here!
Indeed. I cannot reproduce it here on 10.11:
regression=# create table bb(f1 smallint, f2 serial, primary key(f1,f2));
CREATE TABLE
regression=# explain select * from bb where f1 = 92 and f2 in (10,11);
QUERY PLAN
---
Index Only Scan using bb_pkey on bb (cost=0.15..8.34 rows=1 width=6)
Index Cond: ((f1 = 92) AND (f2 = ANY ('{10,11}'::integer[])))
(2 rows)
As I said before, as long as it chooses an indexscan at all, I wouldn't
expect variation in what clauses it chooses to use with the index.
So I don't see why this trivial example doesn't replicate your result.
If you try exactly the above on your database, do you get my result,
or a plan more like yours?
I wonder if you have some extension installed that's causing the
operators to be interpreted differently.
BTW, why do you have two identical indexes on the table?
> Indexes:
> "PK_LucrareBugetDate" PRIMARY KEY, btree ("LucrareBugetVersiuneId",
> "LucrareBugetDateId")
> "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" btree
> ("LucrareBugetVersiuneId", "LucrareBugetDateId")
That shouldn't be affecting this either, but it seems wasteful.
regards, tom lane
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
"Finally I ran "ANALYZE" again and now the problem went away. Running the
query with Ver=92 uses the proper plan. I'm not happy with this - I know I
haven't solved the problem (I've ran ANALYZE multiple times before)."
Does 92 appear in MCVs list with that new sampling? I wonder if
default_statistics_target should be increased a bit to help ensure a
thorough sample of the data in this table. Note- don't go too high (maybe
250, not 1000) or planning time can increase significantly. Also, perhaps
only increase on this Ver column.
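A per-column version of that suggestion might look like this (a sketch; 250
is the value floated above):
ALTER TABLE "LucrareBugetDate"
    ALTER COLUMN "LucrareBugetVersiuneId" SET STATISTICS 250;
ANALYZE "LucrareBugetDate";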
What is the real frequency of value 92? With default_statistics_target =
100, analyze takes 100*300 rows as sample, and if it is missed in that 30k
rows set, or very very small when in fact it has equal weight with other
values, then during planning time it is expected to be very very rare when
in fact it is only slightly less common than the others in the list. If the
others in the list are expected to be 100% of the table as you showed with
the query to compute "frac_MCV" from pg_stats for that column, then perhaps
the optimizer is wise to scan only the LucrareBugetVersiuneId column of the
composite index and filter in memory.
Curious, when you get bad plans (re-analyze the table repeatedly to get new
samples until the wrong plan is chosen), what does PG estimate for total
rows returned with ONLY LucrareBugetVersiuneId = 92 as the where condition?
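That estimate can be read off a plain EXPLAIN (a sketch of the check being
asked for):
EXPLAIN
SELECT * FROM "LucrareBugetDate"
WHERE "LucrareBugetVersiuneId" = 92;
-- Compare the planner's rows= estimate here against the ~1.87 million rows
-- actually removed by the filter in the bad plan.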
Note- Tom & Laurenz are real experts. I might have no idea what I am doing
yet. It is too early to say.
On Thu, Jan 16, 2020 at 11:15 AM Cosmin Prund wrote:
> Hello Michael and hello again Tom, sorry for mailing you directly. I just
> hit Reply in gmail - I expected the emails to have a reply-to=Pgsql.
> Apparently they do not.
>
> Running the same query with a different "Ver" produces a proper plan.
> Here's a non-redacted example (Ver=91):
>
> EXPLAIN (ANALYZE, BUFFERS) select "IdRand", "IdColoana", "Valoare" from
> "LucrareBugetDate" where ("LucrareBugetVersiuneId" = 91) and
> ("LucrareBugetDateId" in (10,11));
>
>QUERY PLAN
>
> -
> Index Scan using
> "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" on
> "LucrareBugetDate" (cost=0.56..4.95 rows=2 width=13) (actual
> time=3.617..3.631 rows=2 loops=1)
>Index Cond: (("LucrareBugetVersiuneId" = 91) AND ("LucrareBugetDateId"
> = ANY ('{10,11}'::integer[])))
>Buffers: shared hit=9 read=3
> Planning time: 0.223 ms
> Execution time: 3.663 ms
> (5 rows)
>
> I have reindex everything, not just this INDEX.
>
> "reltuples" for this table is 41712436.
>
> > I'd be curious of the fraction in the MCVs frequency list in stats
> indicates that rows with Ver = 92 are rare and therefore the index on
> only Ver column is sufficient to find the rows quickly.
>
> There are 25 valid values for "Ver" in this database. I ran the query for
> all of them. The only one miss-behaving is "92". I ran the query with
> random values for Ver (invalid values), the query plan always attempts to
> use the index using both values.
> I looked into "most_common_values" in pg_stats, this value (92) is not in
> that list.
> Finally I ran "ANALYZE" again and now the problem went away. Running the
> query with Ver=92 uses the proper plan. I'm not happy with this - I know I
> haven't solved the problem (I've ran ANALYZE multiple times before).
>
>
> On Thu, 16 Jan 2020 at 19:00, Michael Lewis wrote:
>
>> Does the behavior change with different values of Ver column? I'd be
>> curious of the fraction in the MCVs frequency list in stats indicates that
>> rows with Ver = 92 are rare and therefore the index on only Ver column is
>> sufficient to find the rows quickly. What is reltuples for this table by
>> the way?
>>
>> I also wonder if the situation may be helped by re-indexing the "index on
>> both columns" to remove any chance of issues on bloat in the index. Which
>> order are the columns by the way? If Ver is first, is there also an index
>> on only id column?. Since you aren't on v12, you don't get to re-index
>> concurrently but I assume you know the work around of create concurrently
>> (different name), drop concurrently (old one), and finally rename new index.
>>
>
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
Cosmin Prund writes:
> Running the same query with a different "Ver" produces a proper plan.

Oh, that *is* interesting.

After studying the code a bit more I see why this is possible when I
originally thought not. The case that you are interested in is one that has
special handling -- it's a "lower-order ScalarArrayOpExpr" in the terms of
the code. This means that get_index_paths will actually produce two index
paths, one with the IN clause as an indexqual and one without, because it
expects that they have different sort behaviors [1]. So then we do have a
chance for a cost-based choice, making it possible for the estimated
selectivity of the higher-order clause to affect the outcome.

I'm still a bit surprised that it wouldn't choose the alternative with the
IN ... but if the estimated number of rows matching just the first column
is small enough, it might see the paths as having indistinguishable costs,
and then it's down to luck which it chooses.

> There are 25 valid values for "Ver" in this database. I ran the query for
> all of them. The only one miss-behaving is "92". I ran the query with
> random values for Ver (invalid values), the query plan always attempts to
> use the index using both values.
> I looked into "most_common_values" in pg_stats, this value (92) is not in
> that list.

Are the other 24 all in the list?

> Finally I ran "ANALYZE" again and now the problem went away. Running the
> query with Ver=92 uses the proper plan. I'm not happy with this - I know I
> haven't solved the problem (I've ran ANALYZE multiple times before).

Maybe increasing the stats target for the "Ver" column would help. It
sounds like you want to get to a point where all the valid values are given
in the MCV list, so that the estimates for them will be accurate.

regards, tom lane

[1] Right at the moment, it seems like that's wrong and we could just
generate one path. Need to study this.
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
On Thu, 16 Jan 2020 at 20:20, Laurenz Albe wrote:
> Well, what should the poor thing do?
> There is no index on "LucrareBugetDateId".
>
I did add an index on "LucrareBugetDateId" (before accidentally "fixing"
the problem with ANALYZE) and it didn't help.
> Rather, you have two indexes on ("LucrareBugetVersiuneId",
> "LucrareBugetDateId"),
> one of which should be dropped.
>
One will be dropped. The second one was added out of desperation (because
it wasn't using the first one).
> Try with an index on "LucrareBugetDateId".
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
most_common_values before the ANALYZE had 22 values. After ANALYZE it has
23 values. After ANALYZE I get an entry for "92" with 0.0441333 frequency
(the frequency is about right).

The stats target for the "Ver" column is already at 1. I'm going to have to
bring the stats target back on everything, but I'm not sure about this.

The life-cycle of this table is a bit special. Once in a while a new
"Version" is created: 1 to 3 million records are inserted at once, all with
the same Version and with sequential Ids (re-starting from 1 with each
version). The unfortunate side-effect is that I get huge clusters of
records with the same "Ver".

I created a script that calculates the correct "n_distinct" value for the
column and repeatedly runs ANALYZE until the reported "n_distinct" value is
greater than 75% of the correct number; on each loop of the script the
stats target is increased by 5%. I thought this would help me find a good
value for the stats target, but it invariably brings the stats target all
the way up to 1.

Finally I have one last option: take "stats" into my own hands. Since
inserting anything into those tables is such a big (but rare and well
defined) event, I could simply set the stats target to ZERO and compute
correct values on my own after pushing a new version. The issue here is
that I don't understand the system well enough to make this work (one
possible knob in that direction is sketched after the quoted message
below).

Hopefully I'll be able to reproduce this on a backup of the database so I
can safely experiment. Until I manage to reproduce this I don't think I can
make any more progress, so thank you everyone for the help.

On Thu, 16 Jan 2020 at 20:55, Tom Lane wrote:

> Cosmin Prund writes:
> > Running the same query with a different "Ver" produces a proper plan.
>
> Oh, that *is* interesting.
>
> After studying the code a bit more I see why this is possible when I
> originally thought not. The case that you are interested in is one that
> has special handling -- it's a "lower-order ScalarArrayOpExpr" in the
> terms of the code. This means that get_index_paths will actually produce
> two index paths, one with the IN clause as an indexqual and one without,
> because it expects that they have different sort behaviors [1]. So then
> we do have a chance for a cost-based choice, making it possible for the
> estimated selectivity of the higher-order clause to affect the outcome.
>
> I'm still a bit surprised that it wouldn't choose the alternative with
> the IN ... but if the estimated number of rows matching just the first
> column is small enough, it might see the paths as having
> indistinguishable costs, and then it's down to luck which it chooses.
>
> > There are 25 valid values for "Ver" in this database. I ran the query
> > for all of them. The only one miss-behaving is "92". I ran the query
> > with random values for Ver (invalid values), the query plan always
> > attempts to use the index using both values.
> > I looked into "most_common_values" in pg_stats, this value (92) is not
> > in that list.
>
> Are the other 24 all in the list?
>
> > Finally I ran "ANALYZE" again and now the problem went away. Running
> > the query with Ver=92 uses the proper plan. I'm not happy with this - I
> > know I haven't solved the problem (I've ran ANALYZE multiple times
> > before).
>
> Maybe increasing the stats target for the "Ver" column would help. It
> sounds like you want to get to a point where all the valid values are
> given in the MCV list, so that the estimates for them will be accurate.
>
> regards, tom lane
>
> [1] Right at the moment, it seems like that's wrong and we could just
> generate one path. Need to study this.
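A minimal sketch of one supported knob for the "stats by my own hands" idea
above, assuming the goal is to pin n_distinct for the version column rather
than to hand-edit pg_statistic (the value 25 is illustrative):
-- The override is stored as a per-column option and applied at the next ANALYZE:
ALTER TABLE "LucrareBugetDate"
    ALTER COLUMN "LucrareBugetVersiuneId" SET (n_distinct = 25);
ANALYZE "LucrareBugetDate";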
