On Fri, Mar 29, 2019 at 4:27 AM Andres Freund <[email protected]> wrote:
> On March 28, 2019 11:24:46 AM EDT, Peter Geoghegan <[email protected]> wrote:
> >On Thu, Mar 28, 2019 at 5:28 AM Andres Freund <[email protected]>
> >wrote:
> >> Hm, good catch.  I don't like this fix very much (even if it were
> >> commented), but I don't have a great idea right now.
> >
> >That was just a POC, to verify the problem. Not a proposal.
>
> I'm mildly inclined to push a commented version of this. And add a open items 
> entry.  Alternatively I'm thinking of just but taking the tablespace setting 
> into account.

I didn't understand that last sentence.

Here's an attempt to write a suitable comment for the quick fix.  And
I suppose effective_io_concurrency is a reasonable default.

It's pretty hard to think of a good way to get your hands on the real
value safely from here.  I wondered if there was a way to narrow this
to just GLOBALTABLESPACE_OID since that's where pg_tablespace lives,
but that doesn't work, because we access other catalogs too in that path.

Hmm, it seems a bit odd that 0 is supposed to mean "disable issuance
of asynchronous I/O requests" according to config.sgml, but here 0
will prefetch 10 buffers.
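
To make that concrete, here's a standalone sketch of the prefetch
distance arithmetic used in heap_compute_xid_horizon_for_tuples().  It
is not code from the tree: the sketch_* names are made up for
illustration, and the cap of 1000 is my assumption about the value of
MAX_IO_CONCURRENCY.

```c
#include <assert.h>

/* Assumption: stands in for MAX_IO_CONCURRENCY; 1000 is assumed here. */
#define SKETCH_MAX_IO_CONCURRENCY 1000

/* Local stand-in for PostgreSQL's Min() macro. */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper: computes the prefetch distance the same way the
 * patched function does, i.e. Min(io_concurrency + 10, cap).
 */
static int
sketch_prefetch_distance(int io_concurrency)
{
	/* The setting is never negative. */
	assert(io_concurrency >= 0);

	return SKETCH_MIN(io_concurrency + 10, SKETCH_MAX_IO_CONCURRENCY);
}
```

With a setting of 0 the helper returns 10, so under this formula
prefetching is never actually switched off; the cap only matters once
the setting gets within 10 of the maximum.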

-- 
Thomas Munro
https://enterprisedb.com
From 835d94dc1e37473a7bc93cee095aa3c123c8e614 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Sat, 30 Mar 2019 22:40:10 +1300
Subject: [PATCH] Fix deadlock in heap_compute_xid_horizon_for_tuples().

We can't call code that uses syscache while we hold buffer locks
on a catalog.  If called for a catalog, just fall back to the
effective_io_concurrency GUC rather than trying to look up the
tablespace's IO concurrency setting.

Diagnosed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CA%2BhUKGLCwPF0S4Mk7S8qw%2BDK0Bq65LueN9rofAA3HHSYikW-Zw%40mail.gmail.com
---
 src/backend/access/heap/heapam.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index de5bb9194e..8fefd2a3fe 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6976,8 +6976,15 @@ heap_compute_xid_horizon_for_tuples(Relation rel,
 	 * more prefetching in this case, too. It may be that this formula is too
 	 * simplistic, but at the moment there is no evidence of that or any idea
 	 * about what would work better.
+	 *
+	 * Since the caller holds a buffer lock somewhere in rel, we'd better make
+	 * sure that isn't a catalog before we call code that does syscache
+	 * lookups, or it could deadlock.
 	 */
-	io_concurrency = get_tablespace_io_concurrency(rel->rd_rel->reltablespace);
+	if (IsCatalogRelation(rel))
+		io_concurrency = effective_io_concurrency;
+	else
+		io_concurrency = get_tablespace_io_concurrency(rel->rd_rel->reltablespace);
 	prefetch_distance = Min((io_concurrency) + 10, MAX_IO_CONCURRENCY);
 
 	/* Start prefetching. */
-- 
2.21.0
