Help with row estimate problem
Hi all,

I'm trying to come up with an efficient query, ideally without having to coerce the planner into doing what I want, but I'm running up against a problem with row estimates, and I'm curious if there's a better way to address the problem than what I've come up with. Here's a straightforward version of the query:

SELECT v.patient_class, count(DISTINCT v.id) AS num_visits
FROM patient_visits AS v
JOIN patients AS p ON p.id = v.patient_id
JOIN members AS m ON m.patient_id = p.id
JOIN user_populations AS up ON up.population_id = m.population_id
JOIN users AS u ON u.id = up.user_id
JOIN sponsor_populations AS sp ON sp.population_id = m.population_id
                              AND sp.sponsor_id = u.sponsor_id
WHERE u.id = 3962
  AND v.start_on < (m.last_activity_on + sp.authz_expiration_interval)
  AND v.end_on IS NULL
GROUP BY v.patient_class;

The plan looks like this:

GroupAggregate  (cost=5666.32..5666.42 rows=5 width=12) (actual time=22239.478..22244.339 rows=5 loops=1)
  Group Key: v.patient_class
  Buffers: shared hit=20466857
  ->  Sort  (cost=5666.32..5666.34 rows=7 width=12) (actual time=22236.049..22239.030 rows=50988 loops=1)
        Sort Key: v.patient_class, v.id
        Sort Method: quicksort  Memory: 3528kB
        Buffers: shared hit=20466857
        ->  Nested Loop  (cost=2.94..5666.22 rows=7 width=12) (actual time=1.759..22203.441 rows=50988 loops=1)
              Join Filter: (v.patient_id = p.id)
              Buffers: shared hit=20466857
              ->  Nested Loop  (cost=2.37..5659.62 rows=11 width=28) (actual time=1.743..21892.354 rows=50988 loops=1)
                    Join Filter: (v.start_on < (m.last_activity_on + sp.authz_expiration_interval))
                    Rows Removed by Join Filter: 19368
                    Buffers: shared hit=20204287
                    ->  Nested Loop  (cost=1.80..3616.14 rows=2285 width=28) (actual time=0.065..3322.440 rows=2968729 loops=1)
                          Join Filter: (m.population_id = up.population_id)
                          Buffers: shared hit=2053968
                          ->  Nested Loop  (cost=1.11..11.73 rows=1 width=73) (actual time=0.041..0.286 rows=9 loops=1)
                                Join Filter: (u.sponsor_id = sp.sponsor_id)
                                Buffers: shared hit=67
                                ->  Nested Loop  (cost=0.70..9.09 rows=1 width=95) (actual time=0.028..0.164 rows=9 loops=1)
                                      Buffers: shared hit=31
                                      ->  Index Only Scan using index_user_populations_on_user_id_and_population_id on user_populations up  (cost=0.42..1.57 rows=3 width=36) (actual time=0.015..0.025 rows=9 loops=1)
                                            Index Cond: (user_id = 3962)
                                            Heap Fetches: 0
                                            Buffers: shared hit=4
                                      ->  Index Scan using index_sponsor_populations_on_population_id_and_sponsor_id on sponsor_populations sp  (cost=0.28..2.50 rows=1 width=59) (actual time=0.011..0.012 rows=1 loops=9)
                                            Index Cond: (population_id = up.population_id)
                                            Buffers: shared hit=27
                                ->  Index Scan using users_pkey on users u  (cost=0.41..2.63 rows=1 width=24) (actual time=0.011..0.011 rows=1 loops=9)
                                      Index Cond: (id = 3962)
                                      Buffers: shared hit=36
                          ->  Index Only Scan using index_members_on_population_id on members m  (cost=0.70..3173.12 rows=34503 width=40) (actual time=0.023..324.991 rows=329859 loops=9)
                                Index Cond: (population_id = sp.population_id)
                                Heap Fetches: 2729516
                                Buffers: shared hit=2053901
                    ->  Index Scan using idx_new_patient_visits_unique on patient_visits v  (cost=0.57..0.88 rows=1 width=24) (actual time=0.006..0.006 rows=0 loops=2968729)
                          Index Cond: (patient_id = m.patient_id)
                          Filter: (end_on IS NULL)
                          Rows Removed by Filter: 4
                          Buffers: shared hit=18150319
              ->  Index Only Scan using patients_pkey on patients p  (cost=0.57..0.59 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=50988)
                    Index Cond: (id = m.patient_id)
                    Heap Fetches: 12947
                    Buffers: shared hit=262570
Settings: effective_cache_size = '372GB', max_parallel_workers = '1
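For anyone digging into the estimates: the members scan above is planned at rows=34503 but returns 329859 rows per loop, and that estimate is derived from per-column statistics rather than the actual skew of the data. A minimal sketch for inspecting what the planner is working from (table and column names taken from the query above; pg_stats is the standard statistics view):

```
-- With an unknown outer value, the planner estimates rows per population
-- roughly as reltuples / n_distinct(population_id), which data skew can
-- make badly wrong for the large populations.
SELECT attname, n_distinct, null_frac, correlation
FROM pg_stats
WHERE tablename = 'members'
  AND attname IN ('population_id', 'patient_id', 'last_activity_on');
```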
Re: Help with row estimate problem
On Tue, Jul 30, 2024 at 11:34 AM Andrei Lepikhov wrote:
>
> Thanks for the report. I see such cases frequently enough, and the key
> problem here is data skew, as you already mentioned. Extended statistics
> don't help here. Also, because we can't estimate the specific values
> coming from the outer NestLoop, we can't use the MCV list to estimate
> the selectivity of the population. That's the reason the optimiser uses
> the ndistinct value.
>
> What can you do? I see only one option: split the table into partitions
> where the data will be distributed more or less uniformly, and invent
> a criterion for pruning unnecessary partitions.
>
> Of course, you can also try pg_hint_plan and force the planner to use a
> MergeJoin or HashJoin in that suspicious case.

Thanks for the reply, Andrei, and for the advice about partitioning.

- Jon
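For reference, the pg_hint_plan route Andrei mentions would look roughly like this (a sketch, assuming the extension is installed and loaded; the aliases come from the original query, and which join to hint depends on the plan shape you're trying to avoid):

```
/*+ HashJoin(m v) */
-- The hint must appear in the first block comment of the query text.
-- HashJoin(m v) asks for a hash join wherever members and patient_visits
-- meet, replacing the nested loop that ran ~3M inner index scans above.
SELECT v.patient_class, count(DISTINCT v.id) AS num_visits
FROM patient_visits AS v
JOIN patients AS p ON p.id = v.patient_id
JOIN members AS m ON m.patient_id = p.id
JOIN user_populations AS up ON up.population_id = m.population_id
JOIN users AS u ON u.id = up.user_id
JOIN sponsor_populations AS sp ON sp.population_id = m.population_id
                              AND sp.sponsor_id = u.sponsor_id
WHERE u.id = 3962
  AND v.start_on < (m.last_activity_on + sp.authz_expiration_interval)
  AND v.end_on IS NULL
GROUP BY v.patient_class;
```

A hint pins the plan shape rather than fixing the estimate, so it trades an estimation problem for a maintenance one.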
Why a bitmap scan in this case?
I'm trying to speed up a particular query, so I tried out a very specific index meant to target this one query alone. (I'm not at all convinced that's a good idea, but I'm curious to see just how fast I can make this one query.) The index is like this:

create index idx_foo on my_tbl (start_on)
  where end_on is null
    and bar_id is null
    and population_kind = 2;

The query needs to find rows that are before a certain start_on date where all of the `where` conditions listed in the index are satisfied. The query planner insists on using the index to do a bitmap scan:

```
db=# explain select start_on from my_tbl where end_on is null and bar_id is null and population_kind = 2 and start_on < '2024-11-07';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on my_tbl  (cost=1116.28..329938.13 rows=317919 width=4)
   Recheck Cond: ((start_on < '2024-11-07'::date) AND (end_on IS NULL) AND (bar_id IS NULL) AND (population_kind = 2))
   ->  Bitmap Index Scan on idx_foo  (cost=0.00..1036.81 rows=317919 width=0)
         Index Cond: (start_on < '2024-11-07'::date)
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true
(7 rows)
```

Why wouldn't it do an index (or, really, an index only) scan in this case, given that the index contains exactly the data that the query needs? I'm running PG 16.4.

- Jon
Re: Why a bitmap scan in this case?
On Thu, Dec 19, 2024 at 2:09 PM Jon Zeppieri wrote:
>
> The row estimate is not good. The query estimates 317919 rows but
> there are only 27701. There is some correlation here; if end_on is
> null, start_on is a lot more likely to be recent, so maybe extended
> statistics would be useful here.

Though, given that the index only contains rows where end_on is null, it seems odd that the planner would estimate more rows than are present in the index. That said, I have no idea whether the planner uses that sort of information.

-J
Re: Why a bitmap scan in this case?
On Thu, Dec 19, 2024 at 1:39 PM Greg Sabino Mullane wrote:
>>
>> Why wouldn't it do an index (or, really, an index only) scan in this case
>
> Well, it did do an index scan (and a bitmap scan is a pretty good solution
> here), but as to why no index-only scan, there is probably not enough
> assurance that it won't have to hit the heap heavily anyway. Try doing a SET
> enable_bitmapscan=0; and re-run with EXPLAIN ANALYZE. If you see a large
> number of "Heap Fetches", that could be why. Vacuum the table and try again
> after doing SET enable_bitmapscan=1;

The table is freshly vacuumed. If I disable bitmap scans, it will do an index only scan, which performs better. For the bitmap heap scan, it says "Heap Blocks: exact=27393," whereas for the index only scan, it's "Heap Fetches: 27701."

The row estimate is not good. The query estimates 317919 rows but there are only 27701. There is some correlation here; if end_on is null, start_on is a lot more likely to be recent, so maybe extended statistics would be useful here.

- Jon
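A minimal sketch of the extended-statistics idea from the last paragraph, using the column names from the earlier post (untested here, and dependency/MCV statistics mainly help equality clauses, so they may not move this estimate much given the IS NULL plus range predicate):

```
-- Tell the planner that end_on and start_on are correlated.
CREATE STATISTICS my_tbl_end_start_stats (dependencies, mcv)
  ON end_on, start_on FROM my_tbl;

-- Extended statistics are only populated by ANALYZE.
ANALYZE my_tbl;
```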
Re: Why a bitmap scan in this case?
On Fri, Dec 20, 2024 at 4:57 AM Frédéric Yhuel wrote:
>
> On 12/20/24 09:16, Frédéric Yhuel wrote:
> >
> > On 12/19/24 20:09, Jon Zeppieri wrote:
> >> The table is freshly vacuumed. If I disable bitmap scans, it will do
> >> an index only scan, which performs better. For the bitmap heap scan,
> >> it says "Heap Blocks: exact=27393," whereas for the index only scan,
> >> it's "Heap Fetches: 27701."
> >
>
> So you have 100% heap fetches. Are you sure that your table is freshly
> vacuumed? Please note that VACUUM FULL doesn't create the visibility
> map, so you still have to run a plain VACUUM for this.

Ah, thanks -- I didn't know that.

-J
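For completeness, a sketch of the follow-up Frédéric's point suggests: run a plain VACUUM, then confirm the visibility map actually covers the table (relpages and relallvisible are both in pg_class):

```
-- Plain VACUUM sets visibility-map bits; VACUUM FULL does not.
VACUUM my_tbl;

-- An index-only scan can skip heap fetches only for all-visible pages,
-- so relallvisible should now be close to relpages.
SELECT relpages, relallvisible
FROM pg_class
WHERE relname = 'my_tbl';
```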
