transaction blocking on COMMIT
I am seeing a strange issue on a database using JDBC. Regularly, 4 or 5 times a day, I see something like a "stutter", where a bundle of maybe 30 transactions suddenly finish at the same time. It looks like (it is quite hard to catch this exactly) the lead transaction, which has been blocking the rest, has itself been blocked in COMMIT. In each case it blocks for almost exactly 30s, just over; once it goes through, it releases its locks and the others clear behind it.

My question: what is the range of possibilities that might cause a COMMIT to block? I haven't seen this before. Is there anything suspicious about the regular 30s? Occasionally we see 60s, which seems likely to be two sets of 30.

Regards
Bob
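[Editor's note: not part of the original mail — one way to catch the culprit in the act is to poll pg_stat_activity during a stall. This is only a sketch; the column selection is a suggestion, and it assumes PostgreSQL 9.6 or later for pg_blocking_pids() and the wait_event columns.]

```sql
-- Run during a stall to see what the lead session is waiting on.
-- A session genuinely stuck in COMMIT typically shows wait_event_type
-- 'IO' (e.g. WALSync) or 'IPC' (e.g. SyncRep) rather than 'Lock'.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       pg_blocking_pids(pid) AS blocked_by,  -- PIDs holding locks this session waits on
       now() - xact_start    AS xact_age,
       left(query, 60)       AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY xact_age DESC NULLS LAST;
```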
Re: transaction blocking on COMMIT
I think there have been similar issues reported earlier as well, but it would be too early to generalize.

Where is the db server running? Cloud? Also, what version is it?

On Mon, May 24, 2021, 5:00 PM Bob Jolliffe wrote:
> [...]
Optimising outer joins in the presence of non-nullable references
Hi list.

I have a question about the different plans produced by postgres for an outer join versus an inner join. (Complete SQL script attached.)

Take these two tables:

CREATE TABLE album (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE INDEX album_title ON album (title);

CREATE TABLE track (
    id SERIAL PRIMARY KEY,
    album_id INT NOT NULL REFERENCES album(id),
    title TEXT NOT NULL
);
CREATE INDEX track_album ON track(album_id);
CREATE INDEX track_title ON track(title);

where crucially `track` references `album(id)` with a `NOT NULL` reference.

Now, if we query both an inner and an outer join on `track` and `album` (after having suitably filled in data), we get very different plans, where only the inner join exploits the indices. That is:

EXPLAIN ANALYZE
SELECT t.id
FROM track t
INNER JOIN album a ON (t.album_id = a.id)
ORDER BY a.title ASC
LIMIT 10;

produces this query plan:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.56..3.29 rows=10 width=36) (actual time=0.038..0.052 rows=10 loops=1)
   ->  Nested Loop  (cost=0.56..3606.93 rows=13200 width=36) (actual time=0.036..0.046 rows=10 loops=1)
         ->  Index Scan using album_title on album a  (cost=0.28..113.23 rows=1397 width=36) (actual time=0.015..0.016 rows=1 loops=1)
         ->  Index Scan using track_album on track t  (cost=0.29..1.84 rows=66 width=8) (actual time=0.012..0.018 rows=10 loops=1)
               Index Cond: (album_id = a.id)
 Planning Time: 0.473 ms
 Execution Time: 0.096 ms
(7 rows)

While this:

EXPLAIN ANALYZE
SELECT t.id
FROM track t
LEFT JOIN album a ON (t.album_id = a.id)
ORDER BY a.title ASC
LIMIT 10;

produces this query plan:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=604.43..604.45 rows=10 width=36) (actual time=20.934..20.943 rows=10 loops=1)
   ->  Sort  (cost=604.43..637.43 rows=13200 width=36) (actual time=20.932..20.934 rows=10 loops=1)
         Sort Key: a.title
         Sort Method: top-N heapsort  Memory: 27kB
         ->  Hash Left Join  (cost=42.43..319.18 rows=13200 width=36) (actual time=1.082..12.333 rows=1 loops=1)
               Hash Cond: (t.album_id = a.id)
               ->  Seq Scan on track t  (cost=0.00..242.00 rows=13200 width=8) (actual time=0.031..3.919 rows=1 loops=1)
               ->  Hash  (cost=24.97..24.97 rows=1397 width=36) (actual time=0.990..0.991 rows=1000 loops=1)
                     Buckets: 2048  Batches: 1  Memory Usage: 101kB
                     ->  Seq Scan on album a  (cost=0.00..24.97 rows=1397 width=36) (actual time=0.014..0.451 rows=1000 loops=1)
 Planning Time: 0.251 ms
 Execution Time: 20.999 ms
(12 rows)

My question then is: shouldn't the inner and outer join queries be semantically equivalent when the columns we are joining on are non-nullable foreign keys? Is there some corner case I'm not considering? Would it be a good addition to postgres if it could detect this and produce a plan that exploits the indices?

(My root motivation for asking this question is this github issue: https://github.com/hasura/graphql-engine/issues/5949)

Regards,
Philip

Attachment: outer-join-indexes.sql (application/sql)
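[Editor's note: not part of the original mail — the equivalence intuition can be sanity-checked against the schema above. With the NOT NULL foreign key in place, every track matches exactly one album, so the two joins should return identical multisets of rows and the following should return zero rows (barring in-flight deferred constraint violations):]

```sql
-- With track.album_id NOT NULL REFERENCES album(id), every track row
-- matches exactly one album row, so LEFT JOIN and INNER JOIN agree.
SELECT t.id FROM track t LEFT JOIN album a ON t.album_id = a.id
EXCEPT ALL
SELECT t.id FROM track t INNER JOIN album a ON t.album_id = a.id;
```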
Re: transaction blocking on COMMIT
Hello Jain

Sorry, forgot to indicate: it is running the Ubuntu-packaged version 13.3 on Ubuntu 20.04.

It is not in the cloud, but is a VM in a government datacentre. I am not sure of the underlying hypervisor; I could find out.

Regards
Bob

On Mon, 24 May 2021 at 12:35, Vijaykumar Jain wrote:
> [...]
Re: transaction blocking on COMMIT
No worries.

There were some threads earlier which mentioned some automated changes to disk by the provider that resulted in some slowness.

But otherwise, do you collect system and disk metrics? Did you see any anomaly in disk I/O (wait) when you saw the blocking? If so, did the I/O return to normal when the blocks cleared?

On Mon, May 24, 2021, 7:23 PM Bob Jolliffe wrote:
> [...]
Re: Optimising outer joins in the presence of non-nullable references
Philip Lykke Carlsen writes:
> My question then is, shouldn't the inner and outer join queries be
> semantically equivalent when the columns we are joining on are
> non-nullable foreign keys?

Maybe, but no such knowledge is built into the planner.

> Is there some corner case I'm not considering?

I'm a little suspicious whether it's actually a safe assumption to make, in view of the fact that enforcement of FKs is delayed till end-of-statement or even end-of-transaction. Thus, the relationship isn't necessarily valid at every instant.

> Would it be a good addition to postgres if it could detect this and
> produce a plan that exploits the indices?

Maybe. Aside from semantic correctness issues, the big question would be whether the detection could be made cheap enough to not be a drag on the 99.99% of cases where it's not helpful.

			regards, tom lane
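[Editor's note: not part of the original mail — a sketch of the corner case Tom describes, using the album/track schema from upthread. It assumes the FK got the default generated name track_album_id_fkey; adjust if yours differs.]

```sql
-- Make the FK check wait until commit (assumes default constraint name).
ALTER TABLE track ALTER CONSTRAINT track_album_id_fkey
    DEFERRABLE INITIALLY DEFERRED;

BEGIN;
-- No album 999999 exists yet; with a deferred FK this INSERT succeeds.
INSERT INTO track (album_id, title) VALUES (999999, 'orphan');
-- At this instant the FK relationship does not hold, so the two joins
-- disagree: LEFT JOIN returns the orphan track with a NULL album,
-- while INNER JOIN omits it.
ROLLBACK;
```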
Re: transaction blocking on COMMIT
It is hard to say, as it only happens for 30s a couple of times per day. Everything does return to normal after the blocking transaction is committed. It could be a disk thing, or even a network issue (the java app is on a different machine to the db). But I never saw transactions blocked in COMMIT before, so was wondering if there is any rational set of reasons why it might do that.

On Mon, 24 May 2021 at 15:09, Vijaykumar Jain wrote:
> [...]
Re: transaction blocking on COMMIT
> On May 24, 2021, at 09:22, Bob Jolliffe wrote:
> [...]

One thing you can check is to turn off synchronous_commit (understanding the possibility of "time loss" in the event of a system crash). If that mitigates the problem, the issue is likely the I/O subsystem blocking during the fsync() operation.
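[Editor's note: not part of the original mail — a minimal way to try this suggestion; the scope chosen here is only a suggestion. With asynchronous commit a crash can lose the last few transactions, but it does not corrupt the database.]

```sql
-- Per-session, for a controlled test:
SET synchronous_commit = off;

-- Or cluster-wide and persisted (takes effect on reload):
ALTER SYSTEM SET synchronous_commit = off;
SELECT pg_reload_conf();
```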
Re: transaction blocking on COMMIT
On 05/24/21 19:24, Christophe Pettus wrote:
> [...]

Just a question: is there btrfs (with compression, maybe) around? 30 seconds is the commit (filesystem) timeout for btrfs. Are some processes like btrfs-cleaner/allocate/worker at the top of CPU/IO use?
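[Editor's note: not part of the original mail — a quick way to check this on the db host. The path shown is an assumption for the Ubuntu 13.3 packaging; point findmnt at whatever `SHOW data_directory` actually reports.]

```shell
# Where is the data directory? (run on the db host)
psql -Atc 'SHOW data_directory;'

# What filesystem is it on, and with what mount options?
# btrfs defaults to a 30-second commit interval when no commit= option appears.
findmnt -no FSTYPE,OPTIONS --target /var/lib/postgresql/13/main
```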
