On Thu, 2026-04-16 at 11:42 -0700, Jeff Davis wrote:
> I plan to commit this soon.
>
> I don't plan to backport unless someone sees a reason that it should
> be backported (and if so, how far?).
Actually, this does need to be backported. A NULL pointer dereference
is easily reproducible on master and v18:
PGOPTIONS="-c zero_damaged_pages=on" \
pg_receivewal -D archive -U repl
On 17 the symptom is slightly different but the fix is the same.
I attached a new patch, which I plan to backport to 17. Only the commit
message is different.
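As an aside on why "C" is safe here, as I understand it: for a
deterministic collation, text equality comes down to a length check
plus memcmp(), with no locale state needed. A minimal standalone sketch
of that idea, not the PostgreSQL code (bytes_eq is a hypothetical name
used only for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of PostgreSQL: equality of two byte
 * strings the way a deterministic collation allows, i.e. equal length
 * and identical bytes. No locale object is consulted.
 */
static bool
bytes_eq(const char *a, size_t alen, const char *b, size_t blen)
{
	assert(a != NULL && b != NULL);

	/* Strings of different length can never be bytewise equal. */
	if (alen != blen)
		return false;

	return memcmp(a, b, alen) == 0;
}
```

Nothing in that comparison depends on per-database initialization,
which is the property the patch relies on.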
There's another bug, though. Even with the patch applied, if you do the
same pg_receivewal command immediately after starting the server
(without any other connections), you get:
FATAL: cannot read pg_class without having selected a database
The path is similar: it's trying to do pg_parameter_aclcheck, but is
unable to open pg_parameter_acl at all because it can't read pg_class.
It seems to work if you connect another backend first, where it does
some initialization, though I haven't worked out the details. I
think it goes back to when parameter ACLs were introduced in
a0ffa885e47, so CC Mark Dilger.
Regards,
Jeff Davis
From 39f0bc866b092a5def31184bf7f8ce7b7fcd7f89 Mon Sep 17 00:00:00 2001
From: Jeff Davis <[email protected]>
Date: Mon, 13 Apr 2026 12:09:40 -0700
Subject: [PATCH v2] catcache.c: always use C_COLLATION_OID.
The problem report was about setting GUCs in the startup packet when
initiating a replication connection. Setting the GUC required an ACL
check, which performed a catalog lookup on pg_parameter_acl.parname
(type TEXT). The catalog cache was hardwired to use
DEFAULT_COLLATION_OID for text attributes, but walsender never calls
CheckMyDatabase(), so the default collation was uninitialized and it
caused a NULL pointer dereference.
As the comments stated, using DEFAULT_COLLATION_OID was arbitrary
anyway: if the collation actually mattered, it should use the column's
actual collation. (In the catalog, some text columns are the default
collation and some are "C".)
Fix by using C_COLLATION_OID, which doesn't require any initialization
and is always available. When any deterministic collation will do,
it's best to consistently use the simplest and fastest one, so this is
a good idea anyway.
There may be other problems in this general area, so this should not
be considered a complete fix. But this is an independently good change
and solves the immediate problem.
Reported-by: Andrey Borodin <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 17
---
src/backend/utils/cache/catcache.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 87ed5506460..a8e7bf649d2 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -205,6 +205,10 @@ nameeqfast(Datum a, Datum b)
char *ca = NameStr(*DatumGetName(a));
char *cb = NameStr(*DatumGetName(b));
+ /*
+ * Catalogs only use deterministic collations, so ignore column collation
+ * and use fast path.
+ */
return strncmp(ca, cb, NAMEDATALEN) == 0;
}
@@ -213,6 +217,10 @@ namehashfast(Datum datum)
{
char *key = NameStr(*DatumGetName(datum));
+ /*
+ * Catalogs only use deterministic collations, so ignore column collation
+ * and use fast path.
+ */
return hash_bytes((unsigned char *) key, strlen(key));
}
@@ -244,17 +252,20 @@ static bool
texteqfast(Datum a, Datum b)
{
/*
- * The use of DEFAULT_COLLATION_OID is fairly arbitrary here. We just
- * want to take the fast "deterministic" path in texteq().
+ * Catalogs only use deterministic collations, so ignore column collation
+ * and use "C" locale for efficiency.
*/
- return DatumGetBool(DirectFunctionCall2Coll(texteq, DEFAULT_COLLATION_OID, a, b));
+ return DatumGetBool(DirectFunctionCall2Coll(texteq, C_COLLATION_OID, a, b));
}
static uint32
texthashfast(Datum datum)
{
- /* analogously here as in texteqfast() */
- return DatumGetInt32(DirectFunctionCall1Coll(hashtext, DEFAULT_COLLATION_OID, datum));
+ /*
+ * Catalogs only use deterministic collations, so ignore column collation
+ * and use "C" locale for efficiency.
+ */
+ return DatumGetInt32(DirectFunctionCall1Coll(hashtext, C_COLLATION_OID, datum));
}
static bool
--
2.43.0