Fwd: different execution time for the same query (and same DB status)
Hi all, I would appreciate if somebody could help me understanding why the query described below takes very different execution times to complete, almost completely randomly. I have two very "simple" tables, A and B: CREATE TABLE A ( a1 varchar(10) NULL, a2 varchar(10) NULL, v int4 NULL ); CREATE TABLE B ( a1 varchar(10) NULL, a2 varchar(10) NULL, v int4 NULL ); I load A and B with some random numbers (10^7 records each one) and, when the loading is complete, I run a query that "essentially" counts the number of the records obtained through a sequence of JOINs, which has the following form: SELECT count(*) from (... A join B ) The query does not write/update any value, it is simply a SELECT applied over the aforementioned tables (plus some unions and intersections). The problem is the following: the query can take between 20 seconds and 4 minutes to complete. Most of times, when I run the query for the first time after the server initialisation, it takes 20 seconds; but if I re-run it again (without changing anything) right after the first execution, the probability to take more than 4 minutes is very high. My impression is that the "status" of the internal structures of the DBMS is somehow affected by the first execution, but I cannot figure out neither what nor how I can fix it. To give some additional information, I can state the following facts: - There are no other processes reading or writing the tables on the schema and the status of A and B is constant. - During the loading of A and B (process that takes more or less one minute to complete), I get the following messages: LOG: checkpoints are occurring too frequently (29 seconds apart). HINT: Consider increasing the configuration parameter "max_wal_size". LOG: checkpoints are occurring too frequently (18 seconds apart) HINT: Consider increasing the configuration parameter "max_wal_size". - I am running my query through DBeaver and PostgreSQL runs in a Docker container. The version that I am using is the following: 13.2 (Debian 13.2-1.pgdg100+1). The only configuration parameters I have changed are the following: - force_parallel_mode = on - max_parallel_workers_per_gather = 64 - parallel_setup_cost = 1 - parallel_tuple_cost = 0.001 - work_mem = '800MB' I hope somebody could help me, as I really don't know why I am experiencing such a strange behaviour. Thanks a lot. Best regards, Francesco
RE: different execution time for the same query (and same DB status)
Hi, Have you tried to use EXPLAIN ANALYZE at least? It could give valuable information about why this is occurring. Michel SALAIS De : Francesco De Angelis Envoyé : samedi 6 mars 2021 22:40 À : [email protected] Objet : Fwd: different execution time for the same query (and same DB status) Hi all, I would appreciate if somebody could help me understanding why the query described below takes very different execution times to complete, almost completely randomly. I have two very "simple" tables, A and B: CREATE TABLE A ( a1 varchar(10) NULL, a2 varchar(10) NULL, v int4 NULL ); CREATE TABLE B ( a1 varchar(10) NULL, a2 varchar(10) NULL, v int4 NULL ); I load A and B with some random numbers (10^7 records each one) and, when the loading is complete, I run a query that "essentially" counts the number of the records obtained through a sequence of JOINs, which has the following form: SELECT count(*) from (... A join B ) The query does not write/update any value, it is simply a SELECT applied over the aforementioned tables (plus some unions and intersections). The problem is the following: the query can take between 20 seconds and 4 minutes to complete. Most of times, when I run the query for the first time after the server initialisation, it takes 20 seconds; but if I re-run it again (without changing anything) right after the first execution, the probability to take more than 4 minutes is very high. My impression is that the "status" of the internal structures of the DBMS is somehow affected by the first execution, but I cannot figure out neither what nor how I can fix it. To give some additional information, I can state the following facts: - There are no other processes reading or writing the tables on the schema and the status of A and B is constant. - During the loading of A and B (process that takes more or less one minute to complete), I get the following messages: LOG: checkpoints are occurring too frequently (29 seconds apart). HINT: Consider increasing the configuration parameter "max_wal_size". LOG: checkpoints are occurring too frequently (18 seconds apart) HINT: Consider increasing the configuration parameter "max_wal_size". - I am running my query through DBeaver and PostgreSQL runs in a Docker container. The version that I am using is the following: 13.2 (Debian 13.2-1.pgdg100+1). The only configuration parameters I have changed are the following: - force_parallel_mode = on - max_parallel_workers_per_gather = 64 - parallel_setup_cost = 1 - parallel_tuple_cost = 0.001 - work_mem = '800MB' I hope somebody could help me, as I really don't know why I am experiencing such a strange behaviour. Thanks a lot. Best regards, Francesco
Re: different execution time for the same query (and same DB status)
On Sun, Mar 07, 2021 at 03:51:05PM +0100, Michel SALAIS wrote: > > Have you tried to use EXPLAIN ANALYZE at least? > > It could give valuable information about why this is occurring. +1, and more generally please follow https://wiki.postgresql.org/wiki/Slow_Query_Questions.
Re: different execution time for the same query (and same DB status)
Julien Rouhaud writes: > +1, and more generally please follow > https://wiki.postgresql.org/wiki/Slow_Query_Questions. Yeah. FWIW, the most likely explanation for the change in behavior is that by the time of the second execution, auto-analyze has managed to update the table's statistics, and that for some reason that's changing the plan for the worse. But then the next question is why and what can be done about that. See the wiki entry for info that would be helpful in diagnosing this. regards, tom lane
