On Fri, Mar 29, 2019 at 4:27 AM Andres Freund <[email protected]> wrote:
> On March 28, 2019 11:24:46 AM EDT, Peter Geoghegan <[email protected]> wrote:
> >On Thu, Mar 28, 2019 at 5:28 AM Andres Freund <[email protected]>
> >wrote:
> >> Hm, good catch. I don't like this fix very much (even if it were
> >> commented), but I don't have a great idea right now.
> >
> >That was just a POC, to verify the problem. Not a proposal.
>
> I'm mildly inclined to push a commented version of this. And add a open items
> entry. Alternatively I'm thinking of just but taking the tablespace setting
> into account.
I didn't understand that last sentence.

Here's an attempt to write a suitable comment for the quick fix. And I
suppose effective_io_concurrency is a reasonable default.

It's pretty hard to think of a good way to get your hands on the real
value safely from here. I wondered if there was a way to narrow this
to just GLOBALTABLESPACE_OID, since that's where pg_tablespace lives,
but that doesn't work: we access other catalogs too in that path.

Hmm, it seems a bit odd that 0 is supposed to mean "disable issuance
of asynchronous I/O requests" according to config.sgml, but here 0
will prefetch 10 buffers.

-- 
Thomas Munro
https://enterprisedb.com
From 835d94dc1e37473a7bc93cee095aa3c123c8e614 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Sat, 30 Mar 2019 22:40:10 +1300
Subject: [PATCH] Fix deadlock in heap_compute_xid_horizon_for_tuples().

We can't call code that uses syscache while we hold buffer locks
on a catalog. If called for a catalog, just fall back to the
effective_io_concurrency GUC rather than trying to look up the
tablespace's IO concurrency setting.

Diagnosed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CA%2BhUKGLCwPF0S4Mk7S8qw%2BDK0Bq65LueN9rofAA3HHSYikW-Zw%40mail.gmail.com
---
 src/backend/access/heap/heapam.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index de5bb9194e..8fefd2a3fe 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6976,8 +6976,15 @@ heap_compute_xid_horizon_for_tuples(Relation rel,
 	 * more prefetching in this case, too. It may be that this formula is too
 	 * simplistic, but at the moment there is no evidence of that or any idea
 	 * about what would work better.
+	 *
+	 * Since the caller holds a buffer lock somewhere in rel, we'd better make
+	 * sure that isn't a catalog before we call code that does syscache
+	 * lookups, or it could deadlock.
 	 */
-	io_concurrency = get_tablespace_io_concurrency(rel->rd_rel->reltablespace);
+	if (IsCatalogRelation(rel))
+		io_concurrency = effective_io_concurrency;
+	else
+		io_concurrency = get_tablespace_io_concurrency(rel->rd_rel->reltablespace);
 	prefetch_distance = Min((io_concurrency) + 10, MAX_IO_CONCURRENCY);
 
 	/* Start prefetching. */
-- 
2.21.0
