Slow "not in array" operation

2019-11-12 Thread Marco Colli
I have a large table with millions of rows. Each row has an array field
"tags". I also have the proper GIN index on tags.

Counting the rows that have a tag is fast (~7s):
SELECT COUNT(*) FROM "subscriptions" WHERE (tags @> ARRAY['t1']::varchar[]);

However, counting the rows that don't have a tag is extremely slow (~70s):
SELECT COUNT(*) FROM "subscriptions" WHERE NOT (tags @> ARRAY['t1']::varchar[]);

I have also tried other variants, but with the same results (~70s):
SELECT COUNT(*) FROM "subscriptions" WHERE NOT ('t1' = ANY (tags));

How can I make the "not in array" operation fast?

Any help would be appreciated, thank you!
Marco Colli

PostgreSQL 11 on Ubuntu 18.04 LTS


Re: Slow "not in array" operation

2019-11-12 Thread Michael Lewis
What's the plan for the slow one? What's the time to just count all rows?



Re: Slow "not in array" operation

2019-11-12 Thread Marco Colli
To be honest, I have simplified the question above. In order to show you
the plan, I must show you the actual query, which is this:

=== QUERY ===

SELECT COUNT(*) FROM "subscriptions"
WHERE "subscriptions"."project_id" = 123
  AND "subscriptions"."trashed_at" IS NULL
  AND NOT (tags @> ARRAY['en']::varchar[]);


=== QUERY PLAN ===

                                  QUERY PLAN
 -----------------------------------------------------------------------------
 Finalize Aggregate  (cost=2152593.04..2152593.05 rows=1 width=8) (actual time=70555.561..70555.561 rows=1 loops=1)
   ->  Gather  (cost=2152592.31..2152593.02 rows=7 width=8) (actual time=70540.641..70702.365 rows=8 loops=1)
         Workers Planned: 7
         Workers Launched: 7
         ->  Partial Aggregate  (cost=2151592.31..2151592.32 rows=1 width=8) (actual time=70537.376..70537.377 rows=1 loops=8)
               ->  Parallel Seq Scan on subscriptions  (cost=0.00..2149490.49 rows=840731 width=0) (actual time=0.742..70479.359 rows=611828 loops=8)
                     Filter: ((trashed_at IS NULL) AND (NOT (tags @> '{en}'::character varying[])) AND (project_id = 123))
                     Rows Removed by Filter: 4572769
 Planning Time: 1.304 ms
 Execution Time: 70702.463 ms
(10 rows)


=== INDEXES ===


Indexes:
    "subscriptions_pkey" PRIMARY KEY, btree (id)
    "index_subscriptions_on_project_id_and_created_at" btree (project_id, created_at DESC)
    "index_subscriptions_on_project_id_and_tags" gin (project_id, tags) WHERE trashed_at IS NULL
    "index_subscriptions_on_project_id_and_trashed_at" btree (project_id, trashed_at DESC)

=== NOTES ===

Running the query without the last filter on tags takes only 500ms.
Unfortunately, I cannot make strict assumptions about the data or tags: for
example, I also have to count the subscriptions in a project that don't have
tag A and don't have tag B, etc. This means that I cannot simply calculate
the total and then make a subtraction.
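
For example, a count that excludes two tags at once (hypothetical tags 'A'
and 'B' here) would look roughly like this; NOT (tags && ...) with the
overlap operator is equivalent to excluding each tag individually:

SELECT COUNT(*) FROM "subscriptions"
WHERE "subscriptions"."project_id" = 123
  AND "subscriptions"."trashed_at" IS NULL
  AND NOT (tags && ARRAY['A', 'B']::varchar[]);  -- && is true if tags contains any listed value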

On Tue, Nov 12, 2019 at 7:40 PM Michael Lewis  wrote:

> What's the plan for the slow one? What's the time to just count all rows?


Re: Slow "not in array" operation

2019-11-12 Thread Michael Lewis
It is very interesting to me that the optimizer chose a parallel sequential
scan rather than an index scan on either of your indexes that start with
project_id and also reference trashed_at.

1) Are you running on SSD type storage? Has random_page_cost been lowered
to 1-1.5 or so (close to 1 assumes good cache hits)?
2) It seems you have increased parallel workers. Have you also changed the
startup or other cost configs related to how inclined the system is to use
sequential scans?
3) If you disable sequential scan, what does the plan look like for this
query? (SET ENABLE_SEQSCAN TO OFF;)
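
For example, something along these lines in a test session (just a sketch):

SELECT name, setting FROM pg_settings
WHERE name IN ('random_page_cost', 'seq_page_cost', 'effective_cache_size',
               'parallel_setup_cost', 'parallel_tuple_cost', 'work_mem');

SET enable_seqscan TO off;  -- session-local, for testing only
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM "subscriptions"
WHERE "subscriptions"."project_id" = 123
  AND "subscriptions"."trashed_at" IS NULL
  AND NOT (tags @> ARRAY['en']::varchar[]);
RESET enable_seqscan;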



Re: Slow "not in array" operation

2019-11-12 Thread Justin Pryzby
On Tue, Nov 12, 2019 at 12:20:10PM -0700, Michael Lewis wrote:
> It is very interesting to me that the optimizer chose a parallel sequential
> scan rather than an index scan on either of your indexes that start
> with project_id that also reference trashed_at.

Maybe because of low correlation on any of those columns?
https://wiki.postgresql.org/wiki/Slow_Query_Questions#Statistics:_n_distinct.2C_MCV.2C_histogram
SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV,
       tablename, attname, inherited, null_frac, n_distinct,
       array_length(most_common_vals, 1) n_mcv,
       array_length(histogram_bounds, 1) n_hist,
       correlation
FROM pg_stats
WHERE tablename = 'subscriptions' AND attname IN ('project_id', 'tags')
ORDER BY 1 DESC;

Maybe clustering the table on project_id (and ANALYZEing) would help, but that
might need to be done consistently.
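
Something like this, for example (untested sketch; note that CLUSTER takes an
ACCESS EXCLUSIVE lock and is not maintained automatically, so it would have to
be repeated as the table changes):

CLUSTER subscriptions USING index_subscriptions_on_project_id_and_created_at;
ANALYZE subscriptions;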

Michael previously suggested partitioning which, if done on project_id,
would then no longer need to be specially CLUSTERed.
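
A rough sketch of what list-partitioning on project_id could look like
(hypothetical DDL, column list abbreviated):

CREATE TABLE subscriptions_partitioned (
    id          bigint,
    project_id  bigint NOT NULL,
    trashed_at  timestamptz,
    tags        varchar[]
) PARTITION BY LIST (project_id);

CREATE TABLE subscriptions_p123
    PARTITION OF subscriptions_partitioned FOR VALUES IN (123);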

Is the plan for the fast query the same as in August?
https://www.postgresql.org/message-id/CAFvCgN4UijKTYiOF61Tyd%2BgHvF_oqnMabatS9%2BDcX%2B_PK2SHRw%40mail.gmail.com

Justin




Re: Slow "not in array" operation

2019-11-12 Thread Marco Colli
1) It is running on a DigitalOcean CPU-optimized droplet with dedicated
hyperthreads (16 cores) and SSD.
SHOW random_page_cost; => 2

2) What config names should I check exactly? I used some suggestions from
the online PGTune when I first configured the DB some months ago:
max_worker_processes = 16
max_parallel_workers_per_gather = 8
max_parallel_workers = 16

3) Here's the query plan that I get after disabling the seq scan:


                                  QUERY PLAN
 -----------------------------------------------------------------------------
 Finalize Aggregate  (cost=2183938.89..2183938.90 rows=1 width=8) (actual time=94972.253..94972.254 rows=1 loops=1)
   ->  Gather  (cost=2183938.16..2183938.87 rows=7 width=8) (actual time=94952.895..95132.626 rows=8 loops=1)
         Workers Planned: 7
         Workers Launched: 7
         ->  Partial Aggregate  (cost=2182938.16..2182938.17 rows=1 width=8) (actual time=94950.958..94950.958 rows=1 loops=8)
               ->  Parallel Bitmap Heap Scan on subscriptions  (cost=50294.50..2180801.47 rows=854677 width=0) (actual time=1831.342..94895.208 rows=611828 loops=8)
                     Recheck Cond: ((project_id = 123) AND (trashed_at IS NULL))
                     Rows Removed by Index Recheck: 2217924
                     Filter: (NOT (tags @> '{en}'::character varying[]))
                     Rows Removed by Filter: 288545
                     Heap Blocks: exact=120301 lossy=134269
                     ->  Bitmap Index Scan on index_subscriptions_on_project_id_and_tags  (cost=0.00..48798.81 rows=6518094 width=0) (actual time=1493.823..1493.823 rows=7203173 loops=1)
                           Index Cond: (project_id = 123)
 Planning Time: 1.273 ms
 Execution Time: 95132.766 ms
(15 rows)


On Tue, Nov 12, 2019 at 8:20 PM Michael Lewis  wrote:

> It is very interesting to me that the optimizer chose a parallel
> sequential scan rather than an index scan on either of your indexes that
> start with project_id that also reference trashed_at.
>
> 1) Are you running on SSD type storage? Has random_page_cost been lowered
> to 1-1.5 or so (close to 1 assumes good cache hits)?
> 2) It seems you have increased parallel workers. Have you also changed the
> startup or other cost configs related to how inclined the system is to use
> sequential scans?
> 3) If you disable sequential scan, what does the plan look like for this
> query? (SET ENABLE_SEQSCAN TO OFF;)


Re: Slow "not in array" operation

2019-11-12 Thread Michael Lewis
Odd index choice by the optimizer given what is available. The bitmap being
lossy means more work_mem is needed, if I remember correctly.
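
For example (a sketch; the right value depends on available RAM and how many
backends run this concurrently):

SET work_mem = '256MB';  -- per sort/hash/bitmap node, per backend; try in one session first
-- then re-run EXPLAIN (ANALYZE, BUFFERS) and check whether "lossy" disappears
-- from the "Heap Blocks" line of the Parallel Bitmap Heap Scan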

It is interesting that skipping the WHERE condition on the array brings the
query down to only half a second. Is the array being TOASTed, or is it small
enough to be stored in the same file as the primary table?
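
A quick way to check is to look at the typical on-disk size of the tags
values (sketch; values approaching ~2 kB per row are candidates for TOAST):

SELECT avg(pg_column_size(tags)) AS avg_tags_bytes,
       max(pg_column_size(tags)) AS max_tags_bytes
FROM subscriptions
WHERE project_id = 123;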

What is the result for this count query? Is it roughly 4 million?


On Tue, Nov 12, 2019, 1:06 PM Marco Colli  wrote:

> 1) It is running on a DigitalOcean CPU-optimized droplet with dedicated
> hyperthreads (16 cores) and SSD.
> SHOW random_page_cost; => 2
>
> 2) What config names should I check exactly? I used some suggestions from
> the online PGTune, when I first configured the db some months ago:
> max_worker_processes = 16
> max_parallel_workers_per_gather = 8
> max_parallel_workers = 16
>
> 3) Here's the query plan that I get after disabling the seq scan:
>
>                                   QUERY PLAN
>  -----------------------------------------------------------------------------
>  Finalize Aggregate  (cost=2183938.89..2183938.90 rows=1 width=8) (actual time=94972.253..94972.254 rows=1 loops=1)
>    ->  Gather  (cost=2183938.16..2183938.87 rows=7 width=8) (actual time=94952.895..95132.626 rows=8 loops=1)
>          Workers Planned: 7
>          Workers Launched: 7
>          ->  Partial Aggregate  (cost=2182938.16..2182938.17 rows=1 width=8) (actual time=94950.958..94950.958 rows=1 loops=8)
>                ->  Parallel Bitmap Heap Scan on subscriptions  (cost=50294.50..2180801.47 rows=854677 width=0) (actual time=1831.342..94895.208 rows=611828 loops=8)
>                      Recheck Cond: ((project_id = 123) AND (trashed_at IS NULL))
>                      Rows Removed by Index Recheck: 2217924
>                      Filter: (NOT (tags @> '{en}'::character varying[]))
>                      Rows Removed by Filter: 288545
>                      Heap Blocks: exact=120301 lossy=134269
>                      ->  Bitmap Index Scan on index_subscriptions_on_project_id_and_tags  (cost=0.00..48798.81 rows=6518094 width=0) (actual time=1493.823..1493.823 rows=7203173 loops=1)
>                            Index Cond: (project_id = 123)
>  Planning Time: 1.273 ms
>  Execution Time: 95132.766 ms
> (15 rows)
>
>
> On Tue, Nov 12, 2019 at 8:20 PM Michael Lewis  wrote:
>
>> It is very interesting to me that the optimizer chose a parallel
>> sequential scan rather than an index scan on either of your indexes that
>> start with project_id that also reference trashed_at.
>>
>> 1) Are you running on SSD type storage? Has random_page_cost been lowered
>> to 1-1.5 or so (close to 1 assumes good cache hits)?
>> 2) It seems you have increased parallel workers. Have you also changed
>> the startup or other cost configs related to how inclined the system is to
>> use sequential scans?
>> 3) If you disable sequential scan, what does the plan look like for this
>> query? (SET ENABLE_SEQSCAN TO OFF;)


Re: Slow "not in array" operation

2019-11-12 Thread Tom Lane
Marco Colli  writes:
> 3) Here's the query plan that I get after disabling the seq scan:

>  Finalize Aggregate  (cost=2183938.89..2183938.90 rows=1 width=8) (actual
> time=94972.253..94972.254 rows=1 loops=1)

So, this is slower than the seqscan, which means the planner made the
right choice.

You seem to be imagining that there's some way the index could be used
with the NOT clause, but there isn't.  Indexable clauses are of the form
indexed_column indexable_operator constant
and there's no provision for a NOT in that.  If we had a "not contained
in" array operator, the NOT could be folded to be of this syntactic form,
but it's highly unlikely that any index operator class would consider such
an operator to be a supported indexable operator.  It doesn't lend itself
to searching an index.

So the planner is doing the best it can, which in this case is a
full-table scan.

A conceivable solution, if the tags array is a lot smaller than
the table as a whole and the table is fairly static, is that you could
make a btree index on the tags array and let the planner fall back
to an index-only scan that is just using the index as a cheaper
source of the array data.  (This doesn't work for your existing GIN
index because GIN can't reconstruct the original arrays on-demand.)
I suspect though that this wouldn't win much, even if you disregard
the maintenance costs for the extra index.  The really fundamental
problem here is that a large fraction of the table satisfies the
NOT-in condition, and no index is going to beat a seqscan by all that
much when that's true.  Indexes are good at retrieving small portions
of tables.
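
One possible shape for such an index, if you want to experiment (sketch only;
the leading project_id column and the partial predicate are extra assumptions,
added to match the query's other filters):

CREATE INDEX CONCURRENTLY subscriptions_project_id_tags_btree
    ON subscriptions (project_id, tags)
    WHERE trashed_at IS NULL;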

regards, tom lane




Re: Slow "not in array" operation

2019-11-12 Thread Marco Colli
I am not a PostgreSQL expert; however, I think that the following
algorithm should be possible and fast:

1. find the bitmap of all subscriptions in a project that are not trashed
(it can use the index and takes only ~500ms)
2. find the bitmap of all subscriptions that match the above condition and
HAVE the tag (~7s)
3. calculate [bitmap 1] - [bitmap 2] to find the subscriptions of the
project that DON'T HAVE the tag
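
In SQL terms, the same idea could be written roughly as (just a sketch):

SELECT
  (SELECT COUNT(*) FROM subscriptions
    WHERE project_id = 123 AND trashed_at IS NULL)
  -
  (SELECT COUNT(*) FROM subscriptions
    WHERE project_id = 123 AND trashed_at IS NULL
      AND tags @> ARRAY['en']::varchar[]) AS count_without_tag;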



On Tue, Nov 12, 2019 at 9:50 PM Tom Lane  wrote:

> Marco Colli  writes:
> > 3) Here's the query plan that I get after disabling the seq scan:
>
> >  Finalize Aggregate  (cost=2183938.89..2183938.90 rows=1 width=8) (actual
> > time=94972.253..94972.254 rows=1 loops=1)
>
> So, this is slower than the seqscan, which means the planner made the
> right choice.
>
> You seem to be imagining that there's some way the index could be used
> with the NOT clause, but there isn't.  Indexable clauses are of the form
> indexed_column indexable_operator constant
> and there's no provision for a NOT in that.  If we had a "not contained
> in" array operator, the NOT could be folded to be of this syntactic form,
> but it's highly unlikely that any index operator class would consider such
> an operator to be a supported indexable operator.  It doesn't lend itself
> to searching an index.
>
> So the planner is doing the best it can, which in this case is a
> full-table scan.
>
> A conceivable solution, if the tags array is a lot smaller than
> the table as a whole and the table is fairly static, is that you could
> make a btree index on the tags array and let the planner fall back
> to an index-only scan that is just using the index as a cheaper
> source of the array data.  (This doesn't work for your existing GIN
> index because GIN can't reconstruct the original arrays on-demand.)
> I suspect though that this wouldn't win much, even if you disregard
> the maintenance costs for the extra index.  The really fundamental
> problem here is that a large fraction of the table satisfies the
> NOT-in condition, and no index is going to beat a seqscan by all that
> much when that's true.  Indexes are good at retrieving small portions
> of tables.
>
> regards, tom lane
>


Re: Slow "not in array" operation

2019-11-12 Thread Jeff Janes
> 3) Here's the query plan that I get after disabling the seq scan:
>
>                                   QUERY PLAN
>  -----------------------------------------------------------------------------
>  Finalize Aggregate  (cost=2183938.89..2183938.90 rows=1 width=8) (actual time=94972.253..94972.254 rows=1 loops=1)
>    ->  Gather  (cost=2183938.16..2183938.87 rows=7 width=8) (actual time=94952.895..95132.626 rows=8 loops=1)
>          Workers Planned: 7
>          Workers Launched: 7
>          ->  Partial Aggregate  (cost=2182938.16..2182938.17 rows=1 width=8) (actual time=94950.958..94950.958 rows=1 loops=8)
>                ->  Parallel Bitmap Heap Scan on subscriptions  (cost=50294.50..2180801.47 rows=854677 width=0) (actual time=1831.342..94895.208 rows=611828 loops=8)
>                      Recheck Cond: ((project_id = 123) AND (trashed_at IS NULL))
>                      Rows Removed by Index Recheck: 2217924
>                      Filter: (NOT (tags @> '{en}'::character varying[]))
>                      Rows Removed by Filter: 288545
>                      Heap Blocks: exact=120301 lossy=134269
>                      ->  Bitmap Index Scan on index_subscriptions_on_project_id_and_tags  (cost=0.00..48798.81 rows=6518094 width=0) (actual time=1493.823..1493.823 rows=7203173 loops=1)
>                            Index Cond: (project_id = 123)
>  Planning Time: 1.273 ms
>  Execution Time: 95132.766 ms
> (15 rows)
>

What was the plan for the one that took 500ms? I don't see how it is
possible that this one is 180 times slower than that one. Maybe a hot
cache versus a cold cache? Also, it seems weird to me that "trashed_at IS
NULL" shows up in the Recheck Cond but not in the original Index Cond.
Increasing work_mem can also help, but since the Bitmap Index Scan itself
took half the time, there is only so much it can do.
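
To rule the cache in or out, it may help to run both counts with BUFFERS and
compare "shared hit" vs "shared read" in the output (sketch; the 500ms query
is assumed to be the one without the tags filter):

EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM "subscriptions"
WHERE "subscriptions"."project_id" = 123
  AND "subscriptions"."trashed_at" IS NULL;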

Cheers,

Jeff