This is the patch.

shawn wang <[email protected]> wrote on Mon, Mar 23, 2026 at 21:19:
> Hi hackers,
>
> == Motivation ==
>
> We operate a fleet of PostgreSQL instances with logical replication. On
> several occasions, we have experienced production incidents where logical
> decoding spill files (pg_replslot/<slot>/xid-*.spill) grew uncontrollably,
> consuming tens of gigabytes and eventually filling up the data disk. This
> caused the entire instance to go read-only, impacting not just replication
> but all write workloads.
>
> The typical scenario is a large transaction (e.g. a bulk data load or a
> long-running DDL) combined with a subscriber that is either slow or
> temporarily disconnected. The reorder buffer exceeds
> logical_decoding_work_mem and starts spilling, but there is no upper bound
> on how much can be spilled. The only backstop today is the OS returning
> ENOSPC, at which point the damage is already done.
>
> We looked for existing protections:
>
> - max_slot_wal_keep_size: limits WAL retention, but does not affect
>   spill files at all.
> - logical_decoding_work_mem: controls *when* spilling starts, but not
>   *how much* can be spilled.
> - There is no existing GUC, patch, or commitfest entry that addresses
>   a spill file disk quota.
>
> The "Report reorder buffer size" patch (CF #6053, by Ashutosh Bapat)
> improves observability of reorder buffer state, which is complementary,
> but observability alone cannot prevent disk-full incidents.
>
> == Proposed solution ==
>
> The attached patch adds a new GUC:
>
>     logical_decoding_spill_limit (integer, unit kB, default 0)
>
> When set to a positive value, it limits the total size of on-disk spill
> files per replication slot. Key design points:
>
> 1. Tracking: We add two new fields:
>    - ReorderBuffer.spillBytesOnDisk: the current total on-disk spill
>      size for this slot (unlike spillBytes, which is a cumulative
>      statistics counter, this is a live gauge).
>    - ReorderBufferTXN.serialized_size: the per-transaction on-disk
>      size, so we can accurately decrement the global counter during
>      cleanup.
> 2. Increment: In ReorderBufferSerializeChange(), after a successful
>    write(), both counters are incremented by the size written.
> 3. Decrement: In ReorderBufferRestoreCleanup(), when spill files are
>    unlinked, the global counter is decremented by the transaction's
>    serialized_size.
> 4. Enforcement: In ReorderBufferCheckMemoryLimit(), before calling
>    ReorderBufferSerializeTXN(), we check:
>
>        if (spillBytesOnDisk + txn->size > spill_limit)
>            ereport(ERROR, ...);
>
>    This is only checked on the spill-to-disk path, not on the streaming
>    path (which involves no disk I/O).
> 5. Behavior on limit exceeded: An ERROR is raised with
>    ERRCODE_CONFIGURATION_LIMIT_EXCEEDED. The walsender exits, but the
>    slot's restart_lsn and confirmed_flush are preserved. The subscriber
>    can reconnect after the DBA:
>    1. increases logical_decoding_spill_limit, or
>    2. increases logical_decoding_work_mem (to reduce spilling), or
>    3. switches to a streaming-capable output plugin (which avoids
>       spilling entirely).
> 6. Default 0 means unlimited, which is fully backward compatible.
>
> == Why per-slot, not global? ==
>
> Each ReorderBuffer instance lives in a single walsender process and
> corresponds to exactly one replication slot. A per-slot limit is:
>
> - Lock-free (no shared-memory coordination needed)
> - Simple to reason about (each slot has its own budget)
> - Sufficient to protect against disk-full (the DBA sets the limit
>   based on available disk / number of slots)
>
> A global (cross-slot) limit could be layered on top later if needed, but
> would require shared-memory counters with spinlock/atomic protection.
>
> == Performance impact ==
>
> - Hot path (in-memory change queuing): zero overhead.
> - Spill path: one integer comparison before serialization and one
>   integer addition after write(); negligible compared to the I/O cost.
> - Cleanup path: one integer subtraction after unlink(); negligible.
>
> Looking forward to feedback.
>
> Thanks,
> Shawn.
0001-Add-logical_decoding_spill_limit-GUC-to-cap-spill-file-limit.patch
Description: Binary data
