RE: PostgreSQL 12.8 Same Query Same Execution Plan Different Time
Michel SALAIS De : David G. Johnston Envoyé : mercredi 19 janvier 2022 16:11 À : Ludwig Isaac Lim Cc : [email protected] Objet : Re: PostgreSQL 12.8 Same Query Same Execution Plan Different Time On Wed, Jan 19, 2022 at 7:59 AM Ludwig Isaac Lim mailto:[email protected]> > wrote: I noticed that different is actually in Nested Loop join. One is taking 2 minutes, other is taking 12 seconds. I find this puzzling as I assume the nested loop should be done in memory. Everything is done in memory, but the data has to get there first (hence BUFFERS as you figured out below). The disk is gp2 SDD so I'm even more baffled by this. What could be the factors that affect the speed of nested loop. I notice for that both loops the rows is 7780 and loops is 1. I don't think those are big numbers The loops are ~= 400 and 6,000 It was only after the running the 2 queries that I realize I could do EXPLAIN (ANALYZE, BUFFERS), but I couldn't reproduce the slowness. Did you (can you even in RDS) attempt to clear those buffers? If the first query ran slowly because none of the data was in memory (which you don't know for certain because you didn't run with BUFFERS option then) then subsequent runs would indeed be faster (the implementation of shared buffers having fulfilled one of its major purposes in life). I'll agree buffers for that query does not seem to account for nearly two minutes...though as RDS is a shared resource I'd probably chalk at least some of it to contention on the underlying hardware (disk likely being more problematic than memory). David J. Hi, Another point to check is eventually IOPS… It depends on the contracted service, If the quantity of IOPS is guaranteed or not. When it is not guaranteed and a sufficiently heavy load (in I/O) was executed for a while, the value of IOPS falls down dramatically and then you are sure to have performance problems… Michel SALAIS
Slow query fixed by replacing equality with a nested query
Hi everyone, I have a SELECT query that uses a chain of CTEs (4) that is slow to run on a large database. But if I change a where clause in one of the small CTEs from an equality to an equivalent nested IN query, then the query becomes fast. Looking at the query plan I can see that after the change Postgres avoids a large and slow index scan by using a different index and aggregation. I am reluctant to accept the accidental "fix" because it seems odd and counter intuitive. Can anyone shed some light on what's going on? Is my fix the intended solution or is there a better way to write this query? We have a system which stores resource blobs and extracts search parameters into a number of tables. The query in question tries to find all resources with a specific tag (cte0) that are related to resource X (cte2) and are dated before some (recent) date Y (cte1) and sort them by date (cte3 & cte4). The query was working okay on a small database, but over time as the database grew the query started to timeout. Which is why I am looking at it now. I have accidentally fixed the performance by replacing `system_id = 20` with `system_id IN (SELECT system_id FROM fhir.system WHERE value = 'REDACTED')`. The nested query here returns a single row with a value `20`. Here are the results of EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON): - Slow: https://explain.depesz.com/s/joHK - Fast: https://explain.depesz.com/s/tgd4 Some more info about the CTEs: - cte0: select resources with a specific tag (most common resource types with the most common tag) - cte1: filter resource by date no later than Y (matches ~50% of the table, and most of resource from cte0) - cte2: select resources that are related to a specific resource X (matches 1-5 resources) - cte3: adds the date as a sort value - cte4: sorts the result I have also created a gist: https://gist.github.com/valeneiko/89f8cbe26db7ca2651b47524462b5d18 - Schema.sql: the SQL script to create tables and indexes - Query.sql: the query I am trying to run - Postgres Settings, Table sizes and Statistics are also included in the gist PostgreSQL Version: PostgreSQL 13.3 (Ubuntu 13.3-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit Setup: PostgreSQL is running inside a docker container on a dedicate node in a Kubernetes cluster (using Zalando Spilo image: https://github.com/zalando/spilo) Thank you, Valentinas
Poor performance PostgreSQL 13/ PostGIS 3.x
Hi folks, We are struggling to figure out what is going on. We are migrating from PostgreSQL 9.6 to PostgreSQL 13 w/ PostGIS. Our 9.6 version was compiled from source and the new version (13) was installed using Yum. BTW, the new version is on a VM that has 16GB of memory, two cores, and 500 GB of disk. In addition, we are using MapServer as our mapping engine and OpenLayers as the client side interface. Once we switch over to the new version of PostgreSQL, the performance takes a big nose dive. We have being tweaking and tuning the database and it appears to be happy but the response times from mapfile requests are 3 -7 seconds. Previously, the response time was below a second. Another point is that we populated the new database from the old (9.6), using pg_dump. Could this be causing issues? Should we load the data from scratch? We use ogr2ogr (GDAL) to help assist with loading of spatial data. Anyway, not really sure what the problem is. Lastly, why am I seeing so many requests as to the PostGIS version. It appears that every map request sends the following query "SELECT PostGIS_Version();", which in turn takes up a connection. Any help would be greatly appreciated. Thanks __:) _ \<,_ (*)/ (*) James Lugosi Clackamas County GISP IS Software Specialist, Senior 121 Library Court, Oregon City OR 97045 503-723-4829
PostgreSQL and Linux CPU's
All; I am looking for information on how PostgreSQL leverages or interfaces with CPU's on Linux. Does PostgreSQL let Linux do the work? Does it bypass the OS? Any information or docs you can send my way would be much appreciated. Thanks in advance
Re: PostgreSQL and Linux CPU's
On Thu, Jan 20, 2022 at 4:22 PM Sbob wrote: > I am looking for information on how PostgreSQL leverages or interfaces > with CPU's on Linux. Does PostgreSQL let Linux do the work? Does it > bypass the OS? Any information or docs you can send my way would be much > appreciated. > > PostgreSQL is a user land process in Linux. Linux doesn't allow itself to be bypassed by user land processes when dealing with the CPU. That is kind of its main reason for existing... PostgreSQL uses a process forking model and each process runs on a single thread. You can probably verify all of that by perusing the PostgreSQL documentation. Don't know what to recommend regarding Linxu, user land, kernel mode, and CPUs... David J.
Re: Slow query fixed by replacing equality with a nested query
I don't see any reference to cte1. Is that expected? I'm unclear why these sets are not just inner join'd on resource_surrogate_id. It seems like that column it is being selected as Sid1 in each CTE, and then the next one does the below. Why? where resource_surrogate_id IN (SELECT Sid1 FROM cte_previous_number)
