RE: LWLocks by LockManager slowing large DB
Thanks for this, I read too quickly! I've attached the 2 perf reports.

From the 2nd one, I can see lots of time waiting for TOAST table locks on
the geometry column, but I definitely don't fully understand the
implications or why LockManager would be struggling here.

Thanks for the continued help!

---Paul

Paul Friedman
CTO
677 Harrison St | San Francisco, CA 94107
M: (650) 270-7676
E-mail: [email protected]

-----Original Message-----
From: Andres Freund
Sent: Monday, April 12, 2021 4:04 PM
To: Paul Friedman
Cc: [email protected]
Subject: Re: LWLocks by LockManager slowing large DB

Hi,

On 2021-04-12 15:56:08 -0700, Paul Friedman wrote:
> Also, I didn't understand your comment about a 'futex profile', could
> you point me in the right direction here?

My earlier mail included a section about running a perf profile showing
the callers of the futex() system call, which in turn is what lwlocks end
up using on linux when the lock is contended. Check the second half of:
https://www.postgresql.org/message-id/20210412215738.xytq33wlciljyva5%40alap3.anarazel.de

Greetings,

Andres Freund

perf1.out (Description: Binary data)
perf2.out (Description: Binary data)
RE: LWLocks by LockManager slowing large DB
Thanks for this – these tools (and the raw selects on pg_stat_activity and
pg_locks) are all showing the wait events being created by LockManager
waiting on an LWLock.

---Paul

From: Nikolay Samokhvalov
Sent: Monday, April 12, 2021 4:34 PM
To: Andres Freund
Cc: Paul Friedman; [email protected]
Subject: Re: LWLocks by LockManager slowing large DB

On Mon, Apr 12, 2021 at 14:57 Andres Freund wrote:
> Without knowing the proportion of LockManager wait events compared to
> the rest it's hard to know what to make of it.

These OSS tools can be useful to understand the proportion:

- pgCenter: https://github.com/lesovsky/pgcenter
- pg_wait_sampling (can be used together with POWA monitoring):
  https://github.com/postgrespro/pg_wait_sampling
- pgsentinel: https://github.com/pgsentinel/pgsentinel
- PASH Viewer (good for visualization, integrates with pgsentinel):
  https://github.com/dbacvetkov/PASH-Viewer
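A sampling query along these lines is one way to get the proportion of
LockManager waits relative to everything else straight from
pg_stat_activity; the filters shown (active backends only, excluding the
sampling session itself) are just one reasonable choice:

-- run repeatedly while the workload is active, e.g. with psql's \watch
SELECT wait_event_type, wait_event, count(*)
FROM pg_stat_activity
WHERE state = 'active' AND pid <> pg_backend_pid()
GROUP BY wait_event_type, wait_event
ORDER BY count(*) DESC;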
Re: LWLocks by LockManager slowing large DB
Hi,
On 2021-04-13 09:33:48 -0700, Paul Friedman wrote:
> I've attached the 2 perf reports. From the 2nd one, I can see lots of
> time waiting for TOAST table locks on the geometry column, but I
> definitely don't fully understand the implications or why LockManager
> would be struggling here.
Oh, that is interesting. For toast tables we do not keep locks held for
the duration of the transaction, but release the lock as soon as one
access is done. It seems your workload is doing so many toast accesses
that the table / index level locking for toast tables gets to be the
bottleneck.
It'd be interesting to know if the issue vanishes if you force the lock
on the toast table and its index to be acquired explicitly.
You can find the toast table names with something like:
SELECT reltoastrelid::regclass
FROM pg_class
WHERE oid IN ('travel_processing_v5.llc_zone'::regclass,
'travel_processing_v5.llc_zone'::regclass);
That should give you two relation names looking like
"pg_toast.pg_toast_24576", just with a different number.
If you then change your workload to be (with adjusted OIDs of course):
BEGIN;
SELECT * FROM pg_toast.pg_toast_24576 LIMIT 0;
SELECT * FROM pg_toast.pg_toast_64454 LIMIT 0;
COMMIT;
Does the scalability issue vanish?
Regards,
Andres
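A quick way to check that it really is the toast relations being locked
over and over is to sample pg_locks while the workload runs; the name
filter below is illustrative and assumes the toast tables print with their
pg_toast schema prefix:

SELECT pid, locktype, relation::regclass AS rel, mode, granted, fastpath
FROM pg_locks
WHERE relation IS NOT NULL
  AND relation::regclass::text LIKE 'pg_toast.%';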
RE: LWLocks by LockManager slowing large DB
YES!!! This completely alleviates the bottleneck and I'm able to run the
queries full-throttle. Thank you SO much for your help+insight.

Interestingly, "lock pg_toast.pg_toast_2233612264 in ACCESS SHARE MODE;",
which should do the same thing, returns an error:
ERROR: "pg_toast_2233612264" is not a table or view

Sounds like I should file this as a requested improvement?

Thanks again.

---Paul
Re: LWLocks by LockManager slowing large DB
Hi,

On 2021-04-13 11:47:06 -0700, Paul Friedman wrote:
> YES!!! This completely alleviates the bottleneck and I'm able to run the
> queries full-throttle. Thank you SO much for your help+insight.

Cool. And damn: I can't immediately think of a way to optimize this to not
require this kind of hack in the future.

> Interestingly, "lock pg_toast.pg_toast_2233612264 in ACCESS SHARE MODE;"
> which should do the same thing returns an error " ERROR:
> "pg_toast_2233612264" is not a table or view"
>
> Sounds like I should file this as a requested improvement?

The ability to lock a toast table? Yea, it might be worth doing that. I
seem to recall this being discussed not too long ago...

Greetings,

Andres Freund
Re: LWLocks by LockManager slowing large DB
On 2021-Apr-13, Andres Freund wrote:
> > Sounds like I should file this as a requested improvement?
>
> The ability to lock a toast table? Yea, it might be worth doing that. I
> seem to recall this being discussed not too long ago...

Yep, commit 59ab4ac32460, reverted by eeda7f633809. There were some issues
with the semantics of locking views. It didn't seem insurmountable, but I
didn't get around to it.

-- 
Álvaro Herrera       39°49'30"S 73°17'W
"Use it up, wear it out, make it do, or do without"
Re: LWLocks by LockManager slowing large DB
Andres Freund writes:
> On 2021-04-13 11:47:06 -0700, Paul Friedman wrote:
>> YES!!! This completely alleviates the bottleneck and I'm able to run the
>> queries full-throttle. Thank you SO much for your help+insight.

> Cool. And damn: I can't immediately think of a way to optimize this to
> not require this kind of hack in the future.

Maybe the same thing we do with user tables, ie not give up the lock when
we close a toast rel? As long as the internal lock counters are 64-bit,
we'd not have to worry about overflowing them.

regards, tom lane
RE: LWLocks by LockManager slowing large DB
> For toast tables we do not keep locks held for the duration of the
> transaction, but release the lock as soon as one access is done.
...
> The ability to lock a toast table? Yea, it might be worth doing that. I
> seem to recall this being discussed not too long ago...

Actually, the requested improvement I was thinking of was to have the
locks on the toast table somehow have the same lifespan as the locks on
the main table, to avoid this problem to begin with.

---Paul
Re: LWLocks by LockManager slowing large DB
I wrote:
> Andres Freund writes:
>> Cool. And damn: I can't immediately think of a way to optimize this to
>> not require this kind of hack in the future.
> Maybe the same thing we do with user tables, ie not give up the lock
> when we close a toast rel? As long as the internal lock counters
> are 64-bit, we'd not have to worry about overflowing them.
Like this? This passes check-world, modulo the one very-unsurprising
regression test change. I've not tried to do any performance testing.
regards, tom lane
diff --git a/src/backend/access/common/detoast.c b/src/backend/access/common/detoast.c
index 545a6b8867..3826b4d93e 100644
--- a/src/backend/access/common/detoast.c
+++ b/src/backend/access/common/detoast.c
@@ -367,7 +367,7 @@ toast_fetch_datum(struct varlena *attr)
* case. */
/*
- * Open the toast relation and its indexes
+ * Open the toast relation (but its indexes are dealt with by the AM)
*/
toastrel = table_open(toast_pointer.va_toastrelid, AccessShareLock);
@@ -376,7 +376,7 @@ toast_fetch_datum(struct varlena *attr)
attrsize, 0, attrsize, result);
/* Close toast table */
- table_close(toastrel, AccessShareLock);
+ table_close(toastrel, NoLock);
return result;
}
@@ -457,7 +457,7 @@ toast_fetch_datum_slice(struct varlena *attr, int32 sliceoffset,
result);
/* Close toast table */
- table_close(toastrel, AccessShareLock);
+ table_close(toastrel, NoLock);
return result;
}
diff --git a/src/backend/access/common/toast_internals.c b/src/backend/access/common/toast_internals.c
index 730cd04a2d..f0d0251ce0 100644
--- a/src/backend/access/common/toast_internals.c
+++ b/src/backend/access/common/toast_internals.c
@@ -27,7 +27,7 @@
#include "utils/rel.h"
#include "utils/snapmgr.h"
-static bool toastrel_valueid_exists(Relation toastrel, Oid valueid);
+static bool toastrel_valueid_exists(Relation toastrel, Oid valueid, LOCKMODE lock);
static bool toastid_valueid_exists(Oid toastrelid, Oid valueid);
/* --
@@ -264,7 +264,8 @@ toast_save_datum(Relation rel, Datum value,
* be reclaimed by VACUUM.
*/
if (toastrel_valueid_exists(toastrel,
- toast_pointer.va_valueid))
+ toast_pointer.va_valueid,
+ RowExclusiveLock))
{
/* Match, so short-circuit the data storage loop below */
data_todo = 0;
@@ -359,8 +360,8 @@ toast_save_datum(Relation rel, Datum value,
/*
* Done - close toast relation and its indexes
*/
- toast_close_indexes(toastidxs, num_indexes, RowExclusiveLock);
- table_close(toastrel, RowExclusiveLock);
+ toast_close_indexes(toastidxs, num_indexes, NoLock);
+ table_close(toastrel, NoLock);
/*
* Create the TOAST pointer value that we'll return
@@ -440,8 +441,8 @@ toast_delete_datum(Relation rel, Datum value, bool is_speculative)
* End scan and close relations
*/
systable_endscan_ordered(toastscan);
- toast_close_indexes(toastidxs, num_indexes, RowExclusiveLock);
- table_close(toastrel, RowExclusiveLock);
+ toast_close_indexes(toastidxs, num_indexes, NoLock);
+ table_close(toastrel, NoLock);
}
/* --
@@ -453,7 +454,7 @@ toast_delete_datum(Relation rel, Datum value, bool is_speculative)
* --
*/
static bool
-toastrel_valueid_exists(Relation toastrel, Oid valueid)
+toastrel_valueid_exists(Relation toastrel, Oid valueid, LOCKMODE lock)
{
bool result = false;
ScanKeyData toastkey;
@@ -464,7 +465,7 @@ toastrel_valueid_exists(Relation toastrel, Oid valueid)
/* Fetch a valid index relation */
validIndex = toast_open_indexes(toastrel,
- RowExclusiveLock,
+ lock,
&toastidxs,
&num_indexes);
@@ -489,7 +490,7 @@ toastrel_valueid_exists(Relation toastrel, Oid valueid)
systable_endscan(toastscan);
/* Clean up */
- toast_close_indexes(toastidxs, num_indexes, RowExclusiveLock);
+ toast_close_indexes(toastidxs, num_indexes, NoLock);
return result;
}
@@ -508,9 +509,9 @@ toastid_valueid_exists(Oid toastrelid, Oid valueid)
toastrel = table_open(toastrelid, AccessShareLock);
- result = toastrel_valueid_exists(toastrel, valueid);
+ result = toastrel_valueid_exists(toastrel, valueid, AccessShareLock);
- table_close(toastrel, AccessShareLock);
+ table_close(toastrel, NoLock);
return result;
}
diff --git a/src/backend/access/heap/heaptoast.c b/src/backend/access/heap/heaptoast.c
index 55bbe1d584..c3bfb97c62 100644
--- a/src/backend/access/heap/heaptoast.c
+++ b/src/backend/access/heap/heaptoast.c
@@ -789,5 +789,5 @@ heap_fetch_toast_slice(Relation toastrel, Oid valueid, int32 attrsize,
/* End scan and close indexes. */
systable_endscan_ordered(toastscan);
- toast_close_indexes(toastidxs, num_indexes, AccessShareLock);
+ toast_close_indexes(toastidxs, num_indexes, NoLock);
}
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index ec14b060a6..b7a007904e 100644
--- a/src/test/regress/expected/alter_t
Re: LWLocks by LockManager slowing large DB
Hi,

On 2021-04-13 19:16:46 -0400, Tom Lane wrote:
> > Maybe the same thing we do with user tables, ie not give up the lock
> > when we close a toast rel? As long as the internal lock counters
> > are 64-bit, we'd not have to worry about overflowing them.

Well, I was assuming we'd not want to do that, but I am generally on board
with the concept (and think our early lock release in a bunch of places is
problematic).

> Like this? This passes check-world, modulo the one very-unsurprising
> regression test change. I've not tried to do any performance testing.

I wonder if there's a realistic chance it could create additional
deadlocks that don't exist right now?

Would it be a problem that we'd still release the locks on catalog tables
early, but not on its toast table?

Greetings,

Andres Freund
Re: LWLocks by LockManager slowing large DB
Andres Freund writes:
> On 2021-04-13 19:16:46 -0400, Tom Lane wrote:
>> Like this? This passes check-world, modulo the one very-unsurprising
>> regression test change. I've not tried to do any performance testing.

> I wonder if there's a realistic chance it could create additional
> deadlocks that don't exist right now?

Not on user tables, because we'd always be holding at least as much of a
lock on the parent table. However ...

> Would it be a problem that we'd still release the locks on catalog
> tables early, but not on its toast table?

... hmm, not sure. I can't immediately think of a scenario where it'd be
problematic (or any more problematic than DDL on a catalog would be
anyway). But that doesn't mean there isn't one.

The concerns that had come to my mind were more along the lines of things
like pg_dump requiring a larger footprint in the shared lock table. We
could alleviate that by increasing the default value of
max_locks_per_transaction, perhaps.

regards, tom lane
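For an individual installation (as opposed to changing the built-in
default), the shared lock table can already be made larger today; the
value below is only illustrative, and the setting takes effect only after
a server restart:

ALTER SYSTEM SET max_locks_per_transaction = 256;  -- built-in default is 64
-- then restart the server, e.g.: pg_ctl restart -D $PGDATA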
Re: LWLocks by LockManager slowing large DB
Hi,

On 2021-04-13 23:04:50 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2021-04-13 19:16:46 -0400, Tom Lane wrote:
> >> Like this? This passes check-world, modulo the one very-unsurprising
> >> regression test change. I've not tried to do any performance testing.
>
> > I wonder if there's a realistic chance it could create additional
> > deadlocks that don't exist right now?
>
> Not on user tables, because we'd always be holding at least as much
> of a lock on the parent table. However ...

I suspect that's not strictly *always* the case, due to some corner cases
around a variable referring to a toast value in plpgsql surviving
subtransactions etc...

> The concerns that had come to my mind were more along the lines
> of things like pg_dump requiring a larger footprint in the shared
> lock table. We could alleviate that by increasing the default
> value of max_locks_per_transaction, perhaps.

Probably worth doing one of these releases independently - especially with
partitioning, the current value strikes me as being on the too low side.

Greetings,

Andres Freund
