RE: PostgreSQL 12.8 Same Query Same Execution Plan Different Time

2022-01-20 Thread Michel SALAIS
Michel SALAIS

De : David G. Johnston  
Envoyé : mercredi 19 janvier 2022 16:11
À : Ludwig Isaac Lim 
Cc : [email protected]
Objet : Re: PostgreSQL 12.8 Same Query Same Execution Plan Different Time

 

On Wed, Jan 19, 2022 at 7:59 AM Ludwig Isaac Lim mailto:[email protected]> > wrote:


I noticed that different is actually in Nested Loop join. One is taking 2 
minutes, other is taking 12 seconds. I find this puzzling as I assume the 
nested loop should be done in memory.

 

Everything is done in memory, but the data has to get there first (hence 
BUFFERS as you figured out below).

 


The disk is gp2 SDD so I'm even more baffled by this. What could be the factors 
that affect the speed of nested loop. I notice for that both loops the rows is 
7780 and loops is 1. I don't think those are big numbers

 

The loops are ~= 400 and 6,000

 


It was only after the running the 2 queries that I realize I could do EXPLAIN 
(ANALYZE, BUFFERS), but I couldn't reproduce the slowness.

 

Did you (can you even in RDS) attempt to clear those buffers?  If the first 
query ran slowly because none of the data was in memory (which you don't know 
for certain because you didn't run with BUFFERS option then) then subsequent 
runs would indeed be faster (the implementation of shared buffers having 
fulfilled one of its major purposes in life).

 

I'll agree buffers for that query does not seem to account for nearly two 
minutes...though as RDS is a shared resource I'd probably chalk at least some 
of it to contention on the underlying hardware (disk likely being more 
problematic than memory).

 

David J.

Hi,

 

Another point to check is eventually IOPS…

It depends on the contracted service, If the quantity of IOPS is guaranteed or 
not. When it is not guaranteed and a sufficiently heavy load (in I/O) was 
executed for a while, the value of IOPS falls down dramatically and then you 
are sure to have performance problems…

 

Michel SALAIS

 



Slow query fixed by replacing equality with a nested query

2022-01-20 Thread Valentin Janeiko
Hi everyone,

I have a SELECT query that uses a chain of CTEs (4) that is slow to run on
a large
database. But if I change a where clause in one of the small CTEs from an
equality to an equivalent nested IN query, then the query becomes fast.
Looking
at the query plan I can see that after the change Postgres avoids a large
and
slow index scan by using a different index and aggregation. I am reluctant
to
accept the accidental "fix" because it seems odd and counter intuitive. Can
anyone shed some light on what's going on? Is my fix the intended solution
or
is there a better way to write this query?

We have a system which stores resource blobs and extracts search parameters
into a number of tables. The query in question tries to find all resources
with
a specific tag (cte0) that are related to resource X (cte2) and are dated
before some (recent) date Y (cte1) and sort them by date (cte3 & cte4). The
query was working okay on a small database, but over time as the database
grew
the query started to timeout. Which is why I am looking at it now.

I have accidentally fixed the performance by replacing `system_id = 20` with
`system_id IN (SELECT system_id FROM fhir.system WHERE value = 'REDACTED')`.
The nested query here returns a single row with a value `20`.

Here are the results of EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT
JSON):
- Slow: https://explain.depesz.com/s/joHK
- Fast: https://explain.depesz.com/s/tgd4

Some more info about the CTEs:
- cte0: select resources with a specific tag
(most common resource types with the most common tag)
- cte1: filter resource by date no later than Y
(matches ~50% of the table, and most of resource from cte0)
- cte2: select resources that are related to a specific resource X
(matches 1-5 resources)
- cte3: adds the date as a sort value
- cte4: sorts the result

I have also created a gist:
https://gist.github.com/valeneiko/89f8cbe26db7ca2651b47524462b5d18
- Schema.sql: the SQL script to create tables and indexes
- Query.sql: the query I am trying to run
- Postgres Settings, Table sizes and Statistics are also included in the
gist

PostgreSQL Version:
PostgreSQL 13.3 (Ubuntu 13.3-1.pgdg18.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit

Setup: PostgreSQL is running inside a docker container on a dedicate node
in a
Kubernetes cluster (using Zalando Spilo image:
https://github.com/zalando/spilo)

Thank you,
Valentinas


Poor performance PostgreSQL 13/ PostGIS 3.x

2022-01-20 Thread Lugosi, Jim
Hi folks,

We are struggling to figure out what is going on. We are migrating from 
PostgreSQL 9.6 to PostgreSQL 13 w/ PostGIS. Our 9.6 version was compiled from 
source and the new version (13) was installed using Yum. BTW, the new version 
is on a VM that has 16GB of memory, two cores, and 500 GB of disk. In addition, 
we are using MapServer as our mapping engine and OpenLayers as the client side 
interface. Once we switch over to the new version of PostgreSQL, the 
performance takes a big nose dive. We have being tweaking and tuning the 
database and it appears to be happy but the response times from mapfile 
requests are 3 -7 seconds. Previously, the response time was below a second.

Another point is that we populated the new database from the old (9.6), using 
pg_dump. Could this be causing issues? Should we load the data from scratch? We 
use ogr2ogr (GDAL) to help assist with loading of spatial data. Anyway, not 
really sure what the problem is.

Lastly, why am I seeing so many requests as to the PostGIS version. It appears 
that every map request sends the following query "SELECT PostGIS_Version();", 
which in turn takes up a connection.

Any help would be greatly appreciated.

Thanks

 __:)
   _ \<,_
  (*)/ (*)
James Lugosi
Clackamas County GISP
IS Software Specialist, Senior
121 Library Court, Oregon City OR 97045
503-723-4829



PostgreSQL and Linux CPU's

2022-01-20 Thread Sbob

All;


I am looking for information on how PostgreSQL leverages or interfaces 
with CPU's on Linux. Does PostgreSQL let Linux do the work? Does it 
bypass the OS? Any information or docs you can send my way would be much 
appreciated.



Thanks in advance






Re: PostgreSQL and Linux CPU's

2022-01-20 Thread David G. Johnston
On Thu, Jan 20, 2022 at 4:22 PM Sbob  wrote:

> I am looking for information on how PostgreSQL leverages or interfaces
> with CPU's on Linux. Does PostgreSQL let Linux do the work? Does it
> bypass the OS? Any information or docs you can send my way would be much
> appreciated.
>
>
PostgreSQL is a user land process in Linux.  Linux doesn't allow itself to
be bypassed by user land processes when dealing with the CPU.  That is kind
of its main reason for existing...

PostgreSQL uses a process forking model and each process runs on a single
thread.

You can probably verify all of that by perusing the PostgreSQL
documentation.  Don't know what to recommend regarding Linxu, user land,
kernel mode, and CPUs...

David J.


Re: Slow query fixed by replacing equality with a nested query

2022-01-20 Thread Michael Lewis
I don't see any reference to cte1. Is that expected?

I'm unclear why these sets are not just inner join'd
on resource_surrogate_id. It seems like that column it is being selected as
Sid1 in each CTE, and then the next one does the below. Why?

where resource_surrogate_id IN (SELECT Sid1 FROM cte_previous_number)