Hi All, I ran into this issue when pg_upgrade-ing a DB with PostGIS. This is my first code patch, so any feedback on the approach will be appreciated!
The problem: ============ pg_upgrade uses pg_dump --schema-only --binary-upgrade to copy the schema from $oldcluster to $newcluster. Because this excludes all table data, it leaves out data in extension config tables registered with pg_extension_config_dump(). In $newcluster, binary_upgrade_create_empty_extension() creates the extensions without populating any table data. The extensions' CREATE EXTENSION scripts never get executed so any INSERTs are skipped. As a consequence, if any CREATE TABLE statement in $newcluster requires validation against these empty config tables, the upgrade fails. As an example, PostGIS registers config table spatial_ref_sys to hold ~8500 spatial reference system definitions (SRIDs). When a table has, e.g. a geometry column that specifies an SRID, this gets validated during the CREATE TABLE: CREATE TABLE points (id int, location geometry(Point, 27700)); ERROR: Cannot find SRID (27700) in spatial_ref_sys This will happen for any SRID-constrained column, which will prevent many real-world PostGIS deployments from being able to pg_upgrade. To summarise the problem, our ordering is wrong here because extension configuration data must be present before user tables that depend on it get created, but --schema-only strips this data. The patch: ========== We are adding a new dump object type DO_EXTENSION_DATA that dumps extension config table data in SECTION_PRE_DATA during --binary-upgrade ONLY. This restores the needed data between extension creation and user object creation, allowing the DDL to succeed. Four files are modified in bin/pg_dump: pg_dump.h: Add DO_EXTENSION_DATA to the DumpableObjectType enum, between DO_EXTENSION and DO_TYPE pg_dump_sort.c: Add PRIO_EXTENSION_DATA between PRIO_EXTENSION and PRIO_TYPE pg_dump.c: 1. Add makeExtensionDataInfo() to create a TableDataInfo with objType = DO_EXTENSION_DATA. Called for plain tables (RELKIND_RELATION) during --binary-upgrade ONLY. As it depends on the table def, the COPY will be emitted after the CREATE TABLE. 2. Add dumpExtensionData() to emit the entry in SECTION_PRE_DATA with description "EXTENSION DATA" using dumpTableData_copy(). This allows the config table data to go into the schema-only dump. 3. In processExtensionTables(), when dopt->binary_upgrade is true, call makeExtensionDataInfo() instead of makeTableDataInfo(). Additionally, skip extcondition filter because we need to dump all rows here. 4. Include DO_EXTENSION_DATA in pre-data boundary in addBoundaryDependencies() pg_backup_archiver.c: Add "EXTENSION DATA" to the whitelist in _tocEntryRequired() similar to BLOB, BLOB METADATA, etc. to include extension config table data in --schema-only dumps during --binary-upgrade ONLY. What ends up happening: ======================= The inserted rows are basically scaffolding to allow the upgrade, and do not persist. The pg_upgrade sequence goes like: 1. pg_dump includes $oldcluster extension config data in schema-only dump 2. pg_restore replays the dump into $newcluster and "EXTENSION DATA" entries populate tables like spatial_ref_sys with COPY. Subsequent CREATE TABLEs with e.g. SRID-constrained columns pass validation. 3. pg_upgrade transfers all data files from $oldcluster to $newcluster, making spatial_ref_sys byte-for-byte identical to its previous state. This patch: 1. Does NOT affect normal pg_dumps (without --binary-upgrade). DO_EXTENSION_DATA objects are not created in this case. 2. Leaves binary_upgrade_create_empty_extension() unchanged. 3. Is not PostGIS-specific, and should solve this class of problem for any extension that registers config tables that will be needed for DDL validation. 4. Has been tested against HEAD at 29bf4ee7496 with $oldcluster PostGIS 3.3.9 on PG14 and $newcluster PostGIS 3.7.0dev/master on PG19-devel. Thanks in advance for your review! Please find attached the patch for HEAD. I believe this should be easily backpatchable to (at least) PG15, and will be happy to work on backports. Best regards, Jimmy
From 29019b8638b28a487b66c7ad01893a2de40ff79d Mon Sep 17 00:00:00 2001 From: Jimmy Angelakos <[email protected]> Date: Fri, 20 Mar 2026 16:04:45 +0000 Subject: [PATCH] pg_dump: Restore extension config table data before user objects during binary upgrade pg_upgrade uses pg_dump --schema-only --binary-upgrade, which excludes all table data including extension configuration tables registered via pg_extension_config_dump(). Since binary_upgrade_create_empty_extension() does not populate these tables, any user table whose CREATE TABLE triggers validation against config data will fail. For example, PostGIS tables with SRID-constrained geometry/geography columns fail because spatial_ref_sys is empty during schema restore. Fix by introducing a new dump object type DO_EXTENSION_DATA that dumps extension config table data into SECTION_PRE_DATA during binary upgrade. This puts the data restore between extension creation and user object creation, allowing DDL-time validation to succeed. The data is scaffolding: it is overwritten when pg_upgrade transfers the old cluster's data files to the new cluster. This is not PostGIS-specific and applies to any extension that registers config tables via pg_extension_config_dump() where that data is needed for DDL-time validation. --- src/bin/pg_dump/pg_backup_archiver.c | 2 + src/bin/pg_dump/pg_dump.c | 109 ++++++++++++++++++++++++++- src/bin/pg_dump/pg_dump.h | 1 + src/bin/pg_dump/pg_dump_sort.c | 7 ++ 4 files changed, 117 insertions(+), 2 deletions(-) diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c index 271a2c3e481..c9f9e574f16 100644 --- a/src/bin/pg_dump/pg_backup_archiver.c +++ b/src/bin/pg_dump/pg_backup_archiver.c @@ -3309,6 +3309,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) */ if (strcmp(te->desc, "SEQUENCE SET") == 0 || strcmp(te->desc, "BLOB") == 0 || + strcmp(te->desc, "EXTENSION DATA") == 0 || strcmp(te->desc, "BLOB METADATA") == 0 || (strcmp(te->desc, "ACL") == 0 && strncmp(te->tag, "LARGE OBJECT", 12) == 0) || @@ -3350,6 +3351,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0) && !(ropt->binary_upgrade && (strcmp(te->desc, "BLOB") == 0 || + strcmp(te->desc, "EXTENSION DATA") == 0 || strcmp(te->desc, "BLOB METADATA") == 0 || (strcmp(te->desc, "ACL") == 0 && strncmp(te->tag, "LARGE OBJECT", 12) == 0) || diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index ad09677c336..2ffb8eb863f 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -264,6 +264,7 @@ static void prohibit_crossdb_refs(PGconn *conn, const char *dbname, static NamespaceInfo *findNamespace(Oid nsoid); static void dumpTableData(Archive *fout, const TableDataInfo *tdinfo); static void refreshMatViewData(Archive *fout, const TableDataInfo *tdinfo); +static void dumpExtensionData(Archive *fout, const TableDataInfo *tdinfo); static const char *getRoleName(const char *roleoid_str); static void collectRoleNames(Archive *fout); static void getAdditionalACLs(Archive *fout); @@ -352,6 +353,7 @@ static void addConstrChildIdxDeps(DumpableObject *dobj, const IndxInfo *refidx); static void getDomainConstraints(Archive *fout, TypeInfo *tyinfo); static void getTableData(DumpOptions *dopt, TableInfo *tblinfo, int numTables, char relkind); static void makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo); +static void makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo); static void buildMatViewRefreshDependencies(Archive *fout); static void getTableDataFKConstraints(void); static void determineNotNullFlags(Archive *fout, PGresult *res, int r, @@ -2961,6 +2963,48 @@ dumpTableData(Archive *fout, const TableDataInfo *tdinfo) destroyPQExpBuffer(clistBuf); } +/* + * dumpExtensionData - + * dump extension configuration table data for binary upgrade + * + * Entry goes into SECTION_PRE_DATA so the data is available before + * user tables that may need it for validation. + */ +static void +dumpExtensionData(Archive *fout, const TableDataInfo *tdinfo) +{ + TableInfo *tbinfo = tdinfo->tdtable; + PQExpBuffer copyBuf = createPQExpBuffer(); + PQExpBuffer clistBuf = createPQExpBuffer(); + + /* Check that we have per-column details about this table */ + Assert(tbinfo->interesting); + + /* Build COPY statement */ + printfPQExpBuffer(copyBuf, "COPY %s ", + fmtQualifiedDumpable(tbinfo)); + appendPQExpBuffer(copyBuf, "%s FROM stdin;\n", + fmtCopyColumnList(tbinfo, clistBuf)); + + if (tdinfo->dobj.dump & DUMP_COMPONENT_DATA) + { + ArchiveEntry(fout, tdinfo->dobj.catId, tdinfo->dobj.dumpId, + ARCHIVE_OPTS(.tag = tbinfo->dobj.name, + .namespace = tbinfo->dobj.namespace->dobj.name, + .owner = tbinfo->rolname, + .description = "EXTENSION DATA", + .section = SECTION_PRE_DATA, + .copyStmt = copyBuf->data, + .deps = &(tbinfo->dobj.dumpId), + .nDeps = 1, + .dumpFn = dumpTableData_copy, + .dumpArg = tdinfo)); + } + + destroyPQExpBuffer(copyBuf); + destroyPQExpBuffer(clistBuf); +} + /* * refreshMatViewData - * load or refresh the contents of a single materialized view @@ -3105,6 +3149,48 @@ makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo) tbinfo->interesting = true; } +/* + * makeExtensionDataInfo --- create TableDataInfo for extension config table + * + * This is used during binary upgrades to ensure extension configuration + * table data is dumped early (before user tables that may depend on it). + * For example, PostGIS's spatial_ref_sys must be populated before any + * table with geography(Point, 4283) can be created due to SRID validation. + */ +static void +makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo) +{ + TableDataInfo *tdinfo; + + /* Already have a data object? */ + if (tbinfo->dataObj != NULL) + return; + + /* + * Caller ensures that this is only called for RELKIND_RELATION. + */ + + /* OK, create the data object */ + tdinfo = (TableDataInfo *) pg_malloc(sizeof(TableDataInfo)); + + tdinfo->dobj.objType = DO_EXTENSION_DATA; + + tdinfo->dobj.catId.tableoid = 0; + tdinfo->dobj.catId.oid = tbinfo->dobj.catId.oid; + AssignDumpId(&tdinfo->dobj); + tdinfo->dobj.name = tbinfo->dobj.name; + tdinfo->dobj.namespace = tbinfo->dobj.namespace; + tdinfo->tdtable = tbinfo; + tdinfo->filtercond = NULL; + addObjectDependency(&tdinfo->dobj, tbinfo->dobj.dumpId); + + /* Mark that this object contains data */ + tdinfo->dobj.components |= DUMP_COMPONENT_DATA; + + tbinfo->dataObj = tdinfo; + tbinfo->interesting = true; +} + /* * The refresh for a materialized view must be dependent on the refresh for * any materialized view that this one is dependent on. @@ -11828,6 +11914,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj) case DO_EXTENSION: dumpExtension(fout, (const ExtensionInfo *) dobj); break; + case DO_EXTENSION_DATA: + dumpExtensionData(fout, (const TableDataInfo *) dobj); + break; case DO_TYPE: dumpType(fout, (const TypeInfo *) dobj); break; @@ -20380,10 +20469,25 @@ processExtensionTables(Archive *fout, ExtensionInfo extinfo[], if (dumpobj) { - makeTableDataInfo(dopt, configtbl); + /* + * For binary upgrades, dump extension config table data + * before user tables are created so it's available for + * validation (e.g. PostGIS SRIDs). + */ + if (dopt->binary_upgrade && + configtbl->relkind == RELKIND_RELATION) + makeExtensionDataInfo(dopt, configtbl); + else + makeTableDataInfo(dopt, configtbl); if (configtbl->dataObj != NULL) { - if (strlen(extconditionarray[j]) > 0) + /* + * For binary upgrade (DO_EXTENSION_DATA), don't apply + * the filter condition - we need ALL data since the + * extension won't populate built-in data in binary + * upgrade mode. + */ + if (strlen(extconditionarray[j]) > 0 && !dopt->binary_upgrade) configtbl->dataObj->filtercond = pg_strdup(extconditionarray[j]); } } @@ -20661,6 +20765,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs, { case DO_NAMESPACE: case DO_EXTENSION: + case DO_EXTENSION_DATA: case DO_TYPE: case DO_SHELL_TYPE: case DO_FUNC: diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h index 2b9c01b2c0a..185396a01a4 100644 --- a/src/bin/pg_dump/pg_dump.h +++ b/src/bin/pg_dump/pg_dump.h @@ -40,6 +40,7 @@ typedef enum /* When modifying this enum, update priority tables in pg_dump_sort.c! */ DO_NAMESPACE, DO_EXTENSION, + DO_EXTENSION_DATA, /* extension config table data for binary upgrade */ DO_TYPE, DO_SHELL_TYPE, DO_FUNC, diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c index 03e5c1c1116..3cced9c27be 100644 --- a/src/bin/pg_dump/pg_dump_sort.c +++ b/src/bin/pg_dump/pg_dump_sort.c @@ -58,6 +58,7 @@ enum dbObjectTypePriorities PRIO_COLLATION, PRIO_TRANSFORM, PRIO_EXTENSION, + PRIO_EXTENSION_DATA, /* ext config data: used for binary upgrade */ PRIO_TYPE, /* used for DO_TYPE and DO_SHELL_TYPE */ PRIO_CAST, PRIO_FUNC, @@ -106,6 +107,7 @@ static const int dbObjectTypePriority[] = { [DO_NAMESPACE] = PRIO_NAMESPACE, [DO_EXTENSION] = PRIO_EXTENSION, + [DO_EXTENSION_DATA] = PRIO_EXTENSION_DATA, [DO_TYPE] = PRIO_TYPE, [DO_SHELL_TYPE] = PRIO_TYPE, [DO_FUNC] = PRIO_FUNC, @@ -1525,6 +1527,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize) "EXTENSION %s (ID %d OID %u)", obj->name, obj->dumpId, obj->catId.oid); return; + case DO_EXTENSION_DATA: + snprintf(buf, bufsize, + "EXTENSION DATA %s (ID %d OID %u)", + obj->name, obj->dumpId, obj->catId.oid); + return; case DO_TYPE: snprintf(buf, bufsize, "TYPE %s (ID %d OID %u)", -- 2.51.0
