Re: Bitmap heap scan performance

2019-08-12 Thread Rob Emery
Aha!

That's a great hint, we had that set down to an obscenely low value
due to our max_connections setting being quite high. I've tweaked it
back up to 4MB for now and it's definitely had a marked improvement!
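For the archives: the setting can be trialled per session before editing postgresql.conf. The query below is only a sketch in the shape described in this thread; the predicate values are hypothetical.

```sql
-- Session-local experiment; postgresql.conf is untouched.
SET work_mem = '4MB';

-- Re-run the slow query; with enough work_mem the bitmap stays exact,
-- so the plan should show "Heap Blocks: exact=..." with no "lossy=..." part.
EXPLAIN (ANALYZE, BUFFERS)
SELECT reference
  FROM proposal.proposal
 WHERE has_been_anonymised = false
   AND system_id = 11;  -- hypothetical predicate values
```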

Many Thanks,
Rob

On 09/08/2019, Jeff Janes  wrote:
> On Fri, Aug 9, 2019 at 4:42 AM Rob Emery  wrote:
>
>
>>
>> It
>> seems to me like the Bitmap Heap Scan on proposal is the issue because
>> the recheck is throwing away enormous amounts of data.
>
>
> Have you tried increasing work_mem?  The probable reason for the recheck is
> that your bitmap overflows the allowed memory, and then switches
> from storing every tid to storing just the block numbers.  As indicated by
> the lossy part of "Heap Blocks: exact=3983 lossy=27989"
>
> The
>> has_been_anonymised flag on the proposal is effectively a soft-delete;
>> so I’ve tried adding something like :
>>
>> CREATE INDEX ON proposal.proposal (system_id, legacy_organisation_id,
>> reference)
>> WHERE has_been_anonymised = false;
>>
>> Which I was hoping would shrink the size of the index significantly
>>
>
> The partial index should be smaller, but when comparing to the index with
> "has_been_anonymised" as the leading column, it won't make a lot of
> difference.  You only have to scan a smaller part of the larger index, and
> the sizes of part of the index you have to scan in each case will be
> roughly comparable.
>
>
>> and encourage an index scan rather than bitmap, however it didn’t have
>> that effect.
>
>
> To encourage index scans over bitmap scans, you can increase
> effective_cache_size.  Or to really force the issue, you can "set
> enable_bitmapscan=off" but that is something you would usually do locally
> for experimental purposes, not do it in production's config settings.
>
> Cheers,
>
> Jeff
>


-- 
Robert Emery
Infrastructure Director

E: [email protected] | T: 01785 711633 | W: www.codeweavers.net


Re: Bitmap heap scan performance

2019-08-12 Thread Jeremy Finzel
>
> Presumably I could partition proposal on has_been_anonymised, however
> the row counts seem low enough that it feels a bit like overkill? We
> also need referential integrity so I'll need to wait until that's in
> (I think it's coming in PG12?)
>
> If I decrease the number of legacy_organisation_id’s that are being
> used then the query performance gets much better, but presumably
> that’s because there’s a smaller dataset.
>

What are the actual counts that your queries are returning?

For your first query at least, are you sure your issue is not simply that
you have no index on proposal.proposal.reference?  Because the entry_time
filter is highly selective (and that part of the query only took 180ms), I
would think the planner would first filter on the note table, then join
back to proposal.proposal using an index scan on reference.  But you have
no index there.  You might even consider an index on (reference) WHERE
has_been_anonymised = false?
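Spelled out, the index being suggested would look something like this (table and column names are taken from the thread; the exact column list is an assumption):

```sql
-- Partial index keyed on the join column, covering only live rows.
CREATE INDEX ON proposal.proposal (reference)
WHERE has_been_anonymised = false;
```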

Also, one of your challenges seems to be that all of your indexed fields
are low cardinality.  Rather than partitioning on has_been_anonymised,
perhaps you could consider partitioning on system_id and sub-partition on
legacy_organisation_id? It depends on whether your attached queries always
follow the standard pattern, though. This is something you might play
around with.
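A rough sketch of that layout, using LIST partitioning and entirely hypothetical key values:

```sql
-- Hypothetical two-level layout; the system_id value 11 and
-- legacy_organisation_id value 1 are placeholders.
CREATE TABLE proposal.proposal_part (
    LIKE proposal.proposal INCLUDING DEFAULTS
) PARTITION BY LIST (system_id);

CREATE TABLE proposal.proposal_sys11
    PARTITION OF proposal.proposal_part FOR VALUES IN (11)
    PARTITION BY LIST (legacy_organisation_id);

CREATE TABLE proposal.proposal_sys11_org1
    PARTITION OF proposal.proposal_sys11 FOR VALUES IN (1);
```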

Another option is to try yet further specificity in your partial index
conditions, and also to only then index your primary key.  For example:

CREATE INDEX ON proposal.proposal (id)
WHERE has_been_anonymised = false AND system_id = 11;

I'm curious if any of these ideas would make a difference.

Thanks,
Jeremy


Planner performance in partitions

2019-08-12 Thread Piotr Włodarczyk
Hello,

We have tables partitioned at two levels, both by range. We see that planner
and executor time is about 10x slower when we query the main (parent) table
rather than a partition directly. My question is: are the planner and
executor working optimally here? I have my doubts. Consider this situation,
because I think the planner could be smarter in many cases: when a table is
partitioned by a key that appears in the WHERE clause, we have a guarantee
that all matching rows are in exactly ONE partition: the one whose range
contains the value, or the default partition if one exists. The planner does
show that the query will be executed against just one partition, but
planning still takes a lot of time. So my direct question is: does the
planner stop searching the remaining partitions once it has found the
correct one? Are the partition ranges stored sorted, so the search method is
optimal and stops after the first (and only) hit?

I've noticed that increasing the number of partitions increases planning
time proportionally. Additionally, having an index on the column you are
searching on adds extra time (one index adds roughly +100% per table). That
is understandable, but optimizing the planner and executor along the lines
of the first paragraph would automatically reduce the time spent searching
indexes as well.

Reproduction:
1. ADD MAIN TABLE

-- Table: public.book

--DROP TABLE public.book;

CREATE TABLE public.book
(
    id bigserial,
    id_owner bigint NOT NULL,
    added date NOT NULL
) PARTITION BY RANGE (id_owner)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public.book
    OWNER to postgres;
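The gists below script the partition creation; for a self-contained reproduction, each partition is of this form (the range width here is illustrative):

```sql
-- One range partition plus a default, matching the parent above.
CREATE TABLE public.book_1_500 PARTITION OF public.book
    FOR VALUES FROM (1) TO (501);

CREATE TABLE public.book_default PARTITION OF public.book DEFAULT;
```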

2. ADD PARTITIONS (run the "a" variant first, then drop table book,
recreate it, and run the "b" variant):
a. 1200 partitions:
https://gist.github.com/piotrwlodarczyk/4faa05729d1bdd3b5f5738a2a3faabc0 
b. 6000 partitions:
https://gist.github.com/piotrwlodarczyk/2747e0984f521768f5d36ab2b382ea36 

3. ANALYZE ON MAIN TABLE:
EXPLAIN ANALYZE SELECT * FROM public.book WHERE id_owner = 4;
a. My result for 1200 partitions:
https://gist.github.com/piotrwlodarczyk/500f20a0b6e2cac6d36ab88d4fea2c00 
b. My result for 6000 partitions:
https://gist.github.com/piotrwlodarczyk/277687b21201340377116a18a3dd8be8

4. ANALYZE ON PARTITIONED TABLE (only on first level):
  EXPLAIN ANALYZE SELECT * FROM public.book WHERE id_owner = 4;
  a. My result for 1200:
https://gist.github.com/piotrwlodarczyk/4285907c68b34b486cbf39eb8ae5cf92
  b. My result for 6000:
https://gist.github.com/piotrwlodarczyk/c157cc9321b6e1a1d0f900310f14f1cc

5. CONCLUSIONS
Planner time for a select on public.book (the main table) was 469.416 ms
with 1200 partitions and 2530.179 ms with 6000. It looks like planning time
is linear in the partition count, which makes me fairly sure that all
partitions are checked instead of the search stopping at the first match.
I intentionally searched for id_owner = 4 so that in both cases the first
partition should be marked as correct and planner time should have been
constant. What is also interesting is that the actual execution time was
roughly equal in both cases; is the executor working better than the planner?
  When we query the first-level partition directly, planner time is
58.736 ms for 1200 partitions and 60.555 ms for 6000, which we can call
equal. Why? Because the planner does not have to search for other matching
partitions: the first one found must be the match, guaranteed by the rule
that partition ranges cannot overlap. Execution time in this case is
50 times faster!







Re: Planner performance in partitions

2019-08-12 Thread Michael Lewis
"It is also important to consider the overhead of partitioning during query
planning and execution. The query planner is generally able to handle
partition hierarchies with *up to a few hundred partitions fairly well*,
provided that typical queries allow the query planner to prune all but a
small number of partitions. Planning times become longer and memory
consumption becomes higher as more partitions are added." (emphasis added)

--https://www.postgresql.org/docs/current/ddl-partitioning.html


Re: Planner performance in partitions

2019-08-12 Thread MichaelDBA
Queries against tables with a lot of partitions (> 1000) start to incur an
increasing planning-time penalty even in the current version, V11. V12
purportedly fixes this problem, allowing thousands of partitions without a
heavy planning cost. I can't seem to find the threads on this topic, but
they are out there. I personally noted a gigantic increase in planning time
once I got past 1500 partitions in V11.


On another note, hopefully runtime partition pruning has been fixed in V12:
V11 introduced it, but some query plans don't use it, so you have to
rewrite some queries as subqueries to make it work correctly.
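The kind of rewrite being described is, roughly, turning a join on the partition key into a scalar subquery so the key value is available for runtime pruning. The owners table below is invented purely for illustration:

```sql
-- In V11 this join shape may not prune partitions at run time:
SELECT b.*
  FROM public.book b
  JOIN owners o ON o.id = b.id_owner
 WHERE o.name = 'some owner';

-- Rewritten so the partition key arrives as an initplan value,
-- which runtime pruning can use:
SELECT b.*
  FROM public.book b
 WHERE b.id_owner = (SELECT id FROM owners WHERE name = 'some owner');
```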


Regards,
Michael Vitale






Re: Planner performance in partitions

2019-08-12 Thread Piotr Włodarczyk
@Michael Lewis: I know the documentation. I'm just thinking about possible
performance tricks in the current production version. I've tested this on V12
on another computer and I can say that I'm impressed. I've checked with 1200
partitions and the times are:

PostgreSQL 11.5:
• select on the main table (public.book): planner: 60 ms, execution: 5 ms
• select on a partition directly (public.book_1-1000): planner: 2.7 ms,
execution: 2.4 ms
PostgreSQL 12B3:
• select on the main table (public.book): planner: 2.5 ms, execution: 1.2 ms
• select on a partition directly (public.book_1-1000): planner: 2.5 ms,
execution: 1.2 ms

So looking at the above results we have two options:
• Wait for the 12.0 stable version 😉
• Wait for patches to 11 – PostgreSQL Team: can you do this? 😊

Regards,
Piotr Włodarczyk





Re: Planner performance in partitions

2019-08-12 Thread Michael Lewis
Thanks for clarifying your position and sharing the results you have seen.
That is impressive indeed.

It seems likely that waiting for v12 is needed, since features are not
back-patched. Perhaps one of the contributors will confirm, but that is my
expectation.


Last event per user

2019-08-12 Thread Luís Roberto Weck

Hey guys,

So I have two tables: users and events. It is very common for my
application to request a user's last event.

Usually, what I'll do is get the user, and then SELECT * FROM events
WHERE user_id = :user ORDER BY timestamp_inc DESC LIMIT 1.

I have a big problem, however: my app uses an ORM for SQL generation and
execution, and it can't create subselects at all. The ideal solution for me
would be a view which has all the users' last events.


I tried:

creating a view (last_user_event_1) on "SELECT DISTINCT ON (user_id) *
FROM events ORDER BY user_id, timestamp_inc DESC", and another one
(last_user_event_2) which is a view on users with a lateral join on the
last event.

Running the query with the lateral join by itself is very fast, and exactly
what I need; it usually runs in < 1 ms. The one with DISTINCT ON (user_id)
takes around 20 ms to complete, which is just too slow for my needs.


My problem is that when I run a query JOINing users with 
last_user_event_2, it takes about 2 seconds:


This is the explain output from joining users with "last_user_event_2":

https://explain.depesz.com/s/oyEp

And this is with "last_user_event_1":

https://explain.depesz.com/s/hWwF

Any help would be greatly appreciated.




Re: Last event per user

2019-08-12 Thread Michael Lewis
The obfuscation makes it difficult to guess at the query you are writing
and the schema you are using. Can you provide any additional information
without revealing sensitive info?

1) Do you have an index on ( user_id ASC, timestamp_inc DESC ) ?
2) Sub-queries can't be re-written inline by the optimizer when there is an
aggregate inside the subquery, and I think DISTINCT ON would behave the
same. So, that might explain the significant change in behavior when the
lateral is used. I am guessing at how you wrote the two versions of the
view though.

Obviously not the best design, but you could insert events with "is_latest"
set to true and demote any prior events for that user via a trigger
(is_latest = false).
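A minimal sketch of that trigger approach, with invented object names and assuming the events schema described elsewhere in the thread:

```sql
ALTER TABLE events ADD COLUMN is_latest boolean NOT NULL DEFAULT true;

CREATE FUNCTION events_clear_latest() RETURNS trigger AS $$
BEGIN
    -- Demote every other "latest" event for this user.
    UPDATE events
       SET is_latest = false
     WHERE user_id = NEW.user_id
       AND user_group = NEW.user_group
       AND is_latest
       AND event_id <> NEW.event_id;
    RETURN NULL;  -- AFTER trigger; return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER events_latest_trg
    AFTER INSERT ON events
    FOR EACH ROW EXECUTE PROCEDURE events_clear_latest();

-- The lookup then becomes a cheap partial-index scan:
CREATE INDEX ON events (user_id, user_group) WHERE is_latest;
```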


Re: Planner performance in partitions

2019-08-12 Thread David Rowley
On Tue, 13 Aug 2019 at 08:03, Piotr Włodarczyk
 wrote:
> PostgreSQL11.5:
>
> select on main partition (public.book): planner: 60ms, execution: 5ms
> select on partitioned table (public.book_1-1000): planner: 2.7 ms, execution: 
> 2,4 ms
>
> PostgreSQL 12B3:
>
> select on main partition (public.book): planner: 2,5ms , execution: 1,2ms
> select on partitioned table (public.book_1-1000): planner: 2.5 ms, execution: 
> 1,2 ms
>
> So looking at above results we have two options:
>
> Wait for 12.0 stable version
> Wait for patches to 11 – PostgreSQL Team: can You do this?

You'll need to either reduce the number of partitions down to
something realistic or wait for 12.0.

The work done to speed up the planner with partitioned tables for v12
won't be going into v11.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Last event per user

2019-08-12 Thread Luís Roberto Weck
> The obfuscation makes it difficult to guess at the query you are writing and 
> the schema you are using. Can you provide any additional information without 
> revealing sensitive info?
> 
> 1) Do you have an index on ( user_id ASC, timestamp_inc DESC ) ? 
> 2) Sub-queries can't be re-written inline by the optimizer when there is an 
> aggregate inside the subquery, and I think DISTINCT ON would behave the same. 
> So, that might explain the significant change in behavior when the lateral is 
> used. I am guessing at how you wrote the two versions of the view though. 
> 
> Obviously not best design, but you could insert events as "is_latest" and 
> update any prior events for that user via trigger as is_latest = false.

Thanks for the reply! 

the schema is basically this (simplified):

table users (user_id, user_group, user_name)

table events (user_id, user_group, event_id, timestamp_inc, event_description)

Views:

"last_user_event_2"

SELECT e.*
  FROM users u
  JOIN LATERAL (SELECT *
                  FROM events
                 WHERE user_id = u.user_id
                   AND user_group = u.user_group
                 ORDER BY timestamp_inc DESC
                 LIMIT 1) e ON TRUE

"last_user_event_1"

SELECT DISTINCT ON (user_id) *
  FROM events
 ORDER BY user_id, timestamp_inc DESC

The query itself is:

SELECT *
  FROM users u
  JOIN last_user_event_(1|2) e USING (user_id, user_group)

This explain plan: https://explain.depesz.com/s/oyEp is what Postgres uses
with "last_user_event_2", and https://explain.depesz.com/s/hWwF with
"last_user_event_1".

I do have a btree index on (user_id, user_group, timestamp_inc DESC).
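For completeness, that index would have been created along these lines (assuming a plain btree, which is the default):

```sql
CREATE INDEX ON events (user_id, user_group, timestamp_inc DESC);
```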

Re: Last event per user

2019-08-12 Thread Michael Lewis
It seems like it should be-
SELECT * FROM users u JOIN last_user_event_1 e USING (user_id, user_group);
--OR--
SELECT * FROM last_user_event_2 e;

for them to produce the same result set, since last_user_event_2 already
has (or could very simply have) the users info in it, by selecting *
instead of e.* in that view definition.

Are there other important joins/where/order by/limits that would be on this
"main query" that is just SELECT * FROM  right now, which you have
dropped to try to simplify the example?


Re: Planner performance in partitions

2019-08-12 Thread Michael Lewis
Was there a reason to exceed 100-500 partitions in real life that pushed
you to do this test? Is there some issue you see when using 100 partitions
that is solved or reduced in severity by increasing to 1200 or 6000
partitions?


Re:

2019-08-12 Thread Luís Roberto Weck
> It seems like it should be-
> 
> SELECT * FROM users u JOIN last_user_event_1 e USING (user_id,user_group); 
> --OR-- 
> SELECT * FROM last_user_event_2 e; 
> 
> for them to produce the same result set, since the last_user_event_2 already 
> (could) have users info in it very simply by select * instead of e.* in that 
> view definition. 
> 
> Are there other important joins/where/order by/limits that would be on this 
> "main query" that is just SELECT * FROM  right now which you have dropped 
> to try to simplify the example?

You're right about the queries, I made a mistake. 

Yes, I'm going to filter them by user_id and user_group, possibly (but
not likely) using LIMIT 1. In the explain examples I am using user_id =
1272897 and user_group = 19117

Re:

2019-08-12 Thread Michael Lewis
If you modify last_user_event_2 to select both user and event info in the
view, and then put the where clause directly on the view (which is not
joined to anything), instead of on the "extra copy" of the users table like
you were showing previously, I would expect that the performance should be
excellent.

Re: Last event per user

2019-08-12 Thread Luís Roberto Weck
> If you modify last_user_event_2 to select user and event info in the view, 
> and just put there where clause directly on the view which is not joined to 
> anything, instead of on the "extra copy" of the users table like you were 
> showing previously, I would expect that the performance should be excellent.

But I need user_id and user_group to be outside of the view definition.
They are dynamic values; I need to call this query multiple times for
different user_ids and user_groups.

Re: Last event per user

2019-08-12 Thread Michael Lewis
On Mon, Aug 12, 2019 at 5:03 PM Luís Roberto Weck <
[email protected]> wrote:

> If you modify last_user_event_2 to select user and event info in the view,
> and just put there where clause directly on the view which is not joined to
> anything, instead of on the "extra copy" of the users table like you were
> showing previously, I would expect that the performance should be excellent.
>
> But I need user_id and user_group to be outside of the view definition.
> user_id and user_group are dynamic values, as in, I need to call this query
> multiple times for different user_ids and user_groups .
>

I don't follow. Perhaps there is something within the limitations of the
ORM layer that I am not expecting. If you have this view-

"last_user_event_2"

SELECT u.*, e.*
  FROM users u
  JOIN LATERAL (SELECT *
                  FROM events
                 WHERE user_id = u.user_id
                   AND user_group = u.user_group
                 ORDER BY timestamp_inc DESC
                 LIMIT 1) e ON TRUE

And you execute a query like this-
SELECT * FROM last_user_event_2 e
 WHERE user_id = 1272897 AND user_group = 19117;

Then I would expect very good performance.


Re: Planner performance in partitions

2019-08-12 Thread Piotr Włodarczyk
As you wrote, we have about 400-500 partitions in real life, so the time
problem is much smaller, but it is still there, and in one place in our
application we have decided to help the database by indicating in the query
the exact partition we need. What pushed me to do this test? Just curiosity,
I think. After I saw in pg_locks that all partitions selected in an
uncommitted transaction hold ACCESS SHARE locks, I started thinking about
efficiency, and that is how we got here. Why do we need some hundreds of
partitions? It's because our main table (public.book) has hundreds of
millions of records. It's not maintainable: VACUUM never ends, the space on
the device is huge, and we cannot take the database down for longer than
2-3 hours, which is too short to maintain them manually. So we've
partitioned it on two levels: first on id_owner (which is in every query),
and the second level based on date. That will let us detach partitions with
old data we no longer need.

 
Regards,
Piotr Włodarczyk
