Slow Bitmap Index Scan

2018-11-28 Thread Scott Rankin
Hello all,

We recently moved our production database systems from a 9.4 running on a 
self-managed EC2 instance to 9.6.10 on Amazon’s AWS (same RAM, CPU).  After the 
move, we’re finding that certain queries that we run against a GIN full-text 
index have some occasionally very slow executions and I’m struggling to figure 
out what to do about it.   I would be very grateful for any ideas!

Thanks,
Scott

The setup we have is a 32-core, 244 GB RAM primary with a same-sized read 
replica.  The queries are running off the replica, but performance is roughly 
the same between the master and the replica.

Here’s a query that’s performing badly:


SELECT ls.location AS locationId
FROM location_search ls
WHERE ls.client = 83
  AND search_field_tsvector @@ to_tsquery('9000:* &Smith''s:* &Mill:*')
  AND ls.favorite = TRUE
LIMIT 1

The mean time for this query (and others like it) is about 900ms, but the std 
deviation is over 1000ms and the max is 11000ms.
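(For context, those figures are the kind of thing pg_stat_statements reports; a minimal sketch of such a query, assuming the extension is installed and using the 9.6-era column names, would be:)

-- Hedged sketch: aggregate timings for the search queries, in milliseconds.
-- Assumes the pg_stat_statements extension is installed; mean_time,
-- stddev_time and max_time are the pre-v13 column names.
SELECT calls,
       round(mean_time::numeric, 1)   AS mean_ms,
       round(stddev_time::numeric, 1) AS stddev_ms,
       round(max_time::numeric, 1)    AS max_ms,
       shared_blks_hit,
       shared_blks_read
FROM pg_stat_statements
WHERE query LIKE '%location_search%'
ORDER BY mean_time DESC
LIMIT 10;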

The explain looks like this:

Limit  (cost=1516.25..1520.52 rows=1 width=223) (actual time=4506.482..4506.482 rows=0 loops=1)
  Buffers: shared hit=9073
  ->  Bitmap Heap Scan on location_search ls  (cost=1516.25..1520.52 rows=1 width=223) (actual time=4506.480..4506.480 rows=0 loops=1)
        Recheck Cond: (search_field_tsvector @@ to_tsquery('9000:* &Smith''s:* &Mill:*'::text))
        Filter: (favorite AND (client = 83))
        Rows Removed by Filter: 8
        Heap Blocks: exact=12
        Buffers: shared hit=9073
        ->  Bitmap Index Scan on location_search_tsvector_idx  (cost=0.00..1516.25 rows=1 width=0) (actual time=4506.450..4506.450 rows=12 loops=1)
              Index Cond: (search_field_tsvector @@ to_tsquery('9000:* &Smith''s:* &Mill:*'::text))
              Buffers: shared hit=9061
Planning time: 0.240 ms
Execution time: 4509.995 ms

The table has about 30 million rows in it.  The table and index definitions are:

CREATE TABLE public.location_search
(
    id bigint NOT NULL DEFAULT nextval('location_search_id_seq'::regclass),
    person_location bigint,
    person bigint,
    client_location bigint,
    client bigint,
    location bigint,
    org_unit_id bigint,
    latitude numeric(10,7),
    longitude numeric(10,7),
    geofence numeric(10,7),
    address_line_one text COLLATE pg_catalog."default",
    address_line_two text COLLATE pg_catalog."default",
    city text COLLATE pg_catalog."default",
    state text COLLATE pg_catalog."default",
    postal_code text COLLATE pg_catalog."default",
    country text COLLATE pg_catalog."default",
    full_address text COLLATE pg_catalog."default",
    is_google_verified boolean,
    address_source text COLLATE pg_catalog."default",
    active boolean,
    name character varying(255) COLLATE pg_catalog."default",
    external_client_location_id character varying(500) COLLATE pg_catalog."default",
    custom_field_values hstore,
    location_tags hstore,
    legacy_location_id bigint,
    favorite boolean,
    search_field_tsvector tsvector
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

CREATE INDEX location_search_tsvector_idx
    ON public.location_search USING gin
    (search_field_tsvector)
    TABLESPACE pg_default;

Right now the output of pgstatginindex is this:
 version | pending_pages | pending_tuples
---------+---------------+----------------
       2 |           214 |           9983
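(That output comes from the pgstattuple extension; roughly, assuming it is installed on the instance:)

-- Sketch of how the pending-list figures above can be gathered.
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT version, pending_pages, pending_tuples
FROM pgstatginindex('location_search_tsvector_idx');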

Lastly, here are some of the relevant config entries:

autovacuum                            on
autovacuum_analyze_scale_factor       0
autovacuum_analyze_threshold          50
autovacuum_freeze_max_age             4
autovacuum_max_workers                3
autovacuum_multixact_freeze_max_age   4
autovacuum_naptime                    30s
autovacuum_vacuum_cost_delay          20ms
autovacuum_vacuum_cost_limit          -1
autovacuum_vacuum_scale_factor        0
autovacuum_vacuum_threshold           50
cpu_index_tuple_cost                  0
cpu_operator_cost                     0
cpu_tuple_cost                        0
cursor_tuple_fraction                 0
effective_cache_size                  12584kB
effective_io_concurrency              1
gin_fuzzy_search_limit                0
gin_pending_list_limit                4MB
maintenance_work_mem                  4027MB
seq_page_cost                         1
shared_buffers                        6288kB
work_mem                              20kB
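(If it helps, an equivalent way to pull those settings with their units, as a sketch against pg_settings:)

-- Sketch: list the configuration values above straight from pg_settings.
SELECT name, setting, unit
FROM pg_settings
WHERE name LIKE 'autovacuum%'
   OR name IN ('cpu_index_tuple_cost', 'cpu_operator_cost', 'cpu_tuple_cost',
               'cursor_tuple_fraction', 'effective_cache_size',
               'effective_io_concurrency', 'gin_fuzzy_search_limit',
               'gin_pending_list_limit', 'maintenance_work_mem',
               'seq_page_cost', 'shared_buffers', 'work_mem')
ORDER BY name;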


SCOTT RANKIN
VP, Technology
Motus, LLC
Two Financial Center, 60 South Street, Boston, MA 02111
617.467.1900 (O) | [email protected]




Re: Slow Bitmap Index Scan

2018-11-28 Thread Justin Pryzby
On Wed, Nov 28, 2018 at 07:08:53PM +, Scott Rankin wrote:
> We recently moved our production database systems from a 9.4 running on a 
> self-managed EC2 instance to 9.6.10 on Amazon’s AWS (same RAM, CPU).  After 
> the move, we’re finding that certain queries that we run against a GIN 
> full-text index have some occasionally very slow executions and I’m 
> struggling to figure out what to do about it.   I would be very grateful for 
> any ideas!
> 
> The setup we have is a 32-core, 244 GB RAM primary with a same-sized read 
> replica.  The queries are running off the replica, but performance is roughly 
> the same between the master and the replica.
> 
> Here’s a query that’s performing badly:

Can you compare or show the explain(analyze,buffers) for a fast query instance
vs slow query instance ?  Is it slower due to index access or heap?  Due to
cache misses ?
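For instance, something along these lines (a sketch using the query from your mail), captured once for a fast execution and once for a slow one:

EXPLAIN (ANALYZE, BUFFERS)
SELECT ls.location AS locationId
FROM location_search ls
WHERE ls.client = 83
  AND search_field_tsvector @@ to_tsquery('9000:* &Smith''s:* &Mill:*')
  AND ls.favorite = TRUE
LIMIT 1;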

Also, you have big ram - have you tried disabling KSM or THP ?
https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

Justin



Re: Slow Bitmap Index Scan

2018-11-28 Thread Scott Rankin

On 11/28/18, 2:18 PM, "Justin Pryzby"  wrote:

On Wed, Nov 28, 2018 at 07:08:53PM +, Scott Rankin wrote:
> We recently moved our production database systems from a 9.4 running on a
> self-managed EC2 instance to 9.6.10 on Amazon’s AWS (same RAM, CPU).  After the
> move, we’re finding that certain queries that we run against a GIN full-text
> index have some occasionally very slow executions and I’m struggling to figure
> out what to do about it.  I would be very grateful for any ideas!
>
> The setup we have is a 32-core, 244 GB RAM primary with a same-sized read
> replica.  The queries are running off the replica, but performance is roughly
> the same between the master and the replica.
>
> Here’s a query that’s performing badly:

Can you compare or show the explain(analyze,buffers) for a fast query instance
vs slow query instance ?  Is it slower due to index access or heap?  Due to
cache misses ?

If I reduce the number of search terms in the query, I get this:

SELECT ls.location AS locationId
FROM location_search ls
WHERE ls.client = 83
  AND search_field_tsvector @@ to_tsquery('9000:*'::text)
  AND ls.favorite = TRUE
LIMIT 100

Limit  (cost=13203.99..13627.40 rows=100 width=8) (actual time=66.568..66.759 rows=100 loops=1)
  Buffers: shared hit=1975
  ->  Bitmap Heap Scan on location_search ls  (cost=13203.99..13923.79 rows=170 width=8) (actual time=66.568..66.729 rows=100 loops=1)
        Recheck Cond: ((search_field_tsvector @@ to_tsquery('9000:*'::text)) AND (client = 83))
        Filter: favorite
        Heap Blocks: exact=86
        Buffers: shared hit=1975
        ->  BitmapAnd  (cost=13203.99..13203.99 rows=170 width=0) (actual time=66.471..66.472 rows=0 loops=1)
              Buffers: shared hit=1889
              ->  Bitmap Index Scan on location_search_tsvector_idx  (cost=0.00..2235.02 rows=11570 width=0) (actual time=20.603..20.604 rows=29155 loops=1)
                    Index Cond: (search_field_tsvector @@ to_tsquery('9000:*'::text))
                    Buffers: shared hit=546
              ->  Bitmap Index Scan on location_search_client_idx  (cost=0.00..10968.63 rows=442676 width=0) (actual time=40.682..40.682 rows=482415 loops=1)
                    Index Cond: (client = 83)
                    Buffers: shared hit=1343
Planning time: 0.181 ms
Execution time: 66.806 ms

I see almost no IO reads, and pg_stat_statements shows no cache misses.

Also, you have big ram - have you tried disabling KSM or THP ?
https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

Since this is Amazon RDS, we don't have any control over or access to the
underlying OS.  I know that huge_pages support is on for these instances.  I
would hope that Amazon's already done that...
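(One thing that can be checked from SQL even on RDS, as a sketch:)

-- Sketch: the huge_pages setting is visible without OS access.
SHOW huge_pages;        -- returns 'on', 'try', or 'off'
SHOW shared_buffers;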

Justin





Re: Query with high planning time at version 11.1 compared versions 10.5 and 11.0

2018-11-28 Thread Justin Pryzby
On Wed, Nov 28, 2018 at 05:03:15PM +1300, David Rowley wrote:
> On Wed, 28 Nov 2018 at 03:16, Sanyo Moura  wrote:
> > 11.0
> > Planning Time: 7.238 ms
> > Planning Time: 2.638 ms
> >
> > 11.5
> > Planning Time: 15138.533 ms
> > Execution Time: 2.310 ms
> 
> Does it still take that long after running ANALYZE on the partitioned table?

Note, I'm sure 11.5 was meant to say 11.1.

Also note this earlier message indicates that "high partitions" tests were with
just 10.6 and 11.1, and that times under 11.0 weren't a useful datapoint:

On Tue, Nov 27, 2018 at 11:36:09PM -0200, Sanyo Moura wrote:
> However, in the test I did in version 11.0, "Precio" is partitioned into
> only 21 partitions.

I reduced the query a bit further:

|postgres=# explain SELECT m-n FROM (SELECT a.i2-b.i2 n FROM partbench a, partbench b WHERE a.i2=b.i2) x, (SELECT max(partbench.i2) m FROM partbench)y WHERE m=n;
|Time: 35182.536 ms (00:35.183)

I should have said, that's with only 1k partitions, not 10k as you used in
June.
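(A hypothetical stand-in for partbench, for anyone wanting to experiment: an i2 column split into 1,000 range partitions would look roughly like this; this is a sketch, not the original definition.)

-- Hypothetical reproduction sketch; not the original partbench definition.
CREATE TABLE partbench (i2 int NOT NULL) PARTITION BY RANGE (i2);

DO $$
BEGIN
    FOR p IN 0..999 LOOP
        EXECUTE format(
            'CREATE TABLE partbench_%s PARTITION OF partbench
                 FOR VALUES FROM (%s) TO (%s)',
            p, p * 1000, (p + 1) * 1000);
    END LOOP;
END $$;

ANALYZE partbench;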

I also tried doing what the query seems to be aiming for by using a window
function, but that also experiences 30+ sec planning time:

|explain SELECT rank() OVER(ORDER BY var) AS k FROM (SELECT p.plusalesprice-q.plusalesprice as var from precio p, precio q ) l_variacao
|Time: 34173.401 ms (00:34.173)

Justin



Re: Query with high planning time at version 11.1 compared versions 10.5 and 11.0

2018-11-28 Thread Sanyo Moura
On Wed, Nov 28, 2018 at 22:40, Justin Pryzby wrote:

> On Wed, Nov 28, 2018 at 05:03:15PM +1300, David Rowley wrote:
> > On Wed, 28 Nov 2018 at 03:16, Sanyo Moura  wrote:
> > > 11.0
> > > Planning Time: 7.238 ms
> > > Planning Time: 2.638 ms
> > >
> > > 11.5
> > > Planning Time: 15138.533 ms
> > > Execution Time: 2.310 ms
> >
> > Does it still take that long after running ANALYZE on the partitioned
> table?
>
> Note, I'm sure 11.5 was meant to say 11.1.
>

Yeah, 11.1, sorry for the mistake.


>
> Also note this earlier message indicates that "high partitions" tests were with
> just 10.6 and 11.1, and that times under 11.0 weren't a useful datapoint:
>
>

That's true; in version 11.0 I had tested with only 21 partitions, because at
that time I hadn't realized that it was an issue with a huge number of partitions.
In versions 10.6 and 11.1 I tested with 730 partitions each
(2 years of data partitioned by day).

Sanyo


>
> On Tue, Nov 27, 2018 at 11:36:09PM -0200, Sanyo Moura wrote:
> > However, in the test I did in version 11.0, "Precio" is partitioned into
> > only 21 partitions.
>
> I reduced the query a bit further:
>
> |postgres=# explain SELECT m-n FROM (SELECT a.i2-b.i2 n FROM partbench a,
> partbench b WHERE a.i2=b.i2) x, (SELECT max(partbench.i2) m FROM
> partbench)y WHERE m=n;
> |Time: 35182.536 ms (00:35.183)
>
> I should have said, that's with only 1k partitions, not 10k as you used in
> June.
>
> I also tried doing what the query seems to be aiming for by using a window
> function, but that also experiences 30+ sec planning time:
>
> |explain SELECT rank() OVER(ORDER BY var) AS k FROM (SELECT
> p.plusalesprice-q.plusalesprice as var from precio p, precio q ) l_variacao
> |Time: 34173.401 ms (00:34.173)
>
> Justin
>