Re: GIN JSONB path index is not always used

2023-10-23 Thread Tomasz Szymański
Sorry for missing analyze and buffers, we did only had these plans at the time, 
providing ones performed with such:

When it does us an index:
--+
Limit  (cost=255.29..329.26 rows=21 width=0) (actual time=8.023..8.025 rows=1 
loops=1)
  Buffers: shared hit=54 read=6 
  
  I/O Timings: read=7.094   
  
  ->  Bitmap Heap Scan on account_user  (cost=255.29..16293.12 rows=4553 
width=0) (actual time=8.022..8.023 rows=1 loops=1)   
Recheck Cond: (private_metadata @> '{"somekey": "somevalue"}'::jsonb)   
 
Heap Blocks: exact=2
  
Buffers: shared hit=54 read=6   
  
I/O Timings: read=7.094 
  
->  Bitmap Index Scan on user_p_meta_idx  (cost=0.00..254.15 rows=4553 
width=0) (actual time=7.985..7.985 rows=2 loops=1) |
  Index Cond: (private_metadata @> '{"somekey": 
"somevalue"}'::jsonb)|
  Buffers: shared hit=52 read=6 
  
  I/O Timings: read=7.094   
  
Planning Time: 1.134 ms 
  
Execution Time: 8.065 ms
  
--+

When it does not:
--+
 Limit  (cost=0.00..1184.30 rows=21 width=4) (actual time=1567.136..1619.956 
rows=1 loops=1)
   Buffers: shared hit=199857
   ->  Seq Scan on account_user  (cost=0.00..256768.27 rows=4553 width=4) 
(actual time=1567.135..1619.953 rows=1 loops=1)
 Filter: (private_metadata @> '{"somekey": "somevalue"}'::jsonb)
 Rows Removed by Filter: 4592408
 Buffers: shared hit=199857
 Planning Time: 0.072 ms
 Execution Time: 1619.972 ms
--+

> Should we assume that not using the index is much slower (otherwise, why 
> would you be asking the question?)?
Yes, the issue is the sequence scan being expensive and slow.






Re: GIN JSONB path index is not always used

2023-10-23 Thread Jeff Janes
On Mon, Oct 23, 2023 at 6:33 AM Tomasz Szymański  wrote:


>  Limit  (cost=0.00..1184.30 rows=21 width=4) (actual
> time=1567.136..1619.956 rows=1 loops=1)
>->  Seq Scan on account_user  (cost=0.00..256768.27 rows=4553 width=4)
> (actual time=1567.135..1619.953 rows=1 loops=1)
>


It thinks the seq scan will stop 99.5% early, after finding 21 out of 4553
qualifying tuples.  But instead it has to read the entire table to actually
find only 1.

The selectivity estimate of the @> operator has been substantially improved
in v13.  It is still far from perfect, but should be good enough to solve
this problem for this case and most similar cases.  Turning off fastupdate
on the index would probably also solve the problem, for a different reason.

Cheers,

Jeff