Alvaro Herrera <[email protected]> wrote: > On 2026-Mar-25, Srinath Reddy Sadipiralla wrote: > > > Then as suggested by Alvaro off-list I checked the lock upgrade > > behavior during the table swap phase. I observed that if another > > transaction holds a conflicting lock on the table when the swap is > > attempted, it can lead to “transient table” data loss during a manual > > or timeout abort. when a REPACK (concurrent) waits for a conflicting > > lock to be released and eventually hits a lock_timeout (or is > > cancelled via ctrl+c), the transaction aborts. During this abort, the > > cleanup process triggers smgrDoPendingDeletes. This results in the > > removal of all transient table relfiles and decoder worker files > > created during the process. This effectively wipes out the work done > > by the transient table creation before the swap could successfully > > complete, this happens because during transient table creation we add > > the table to the PendingRelDelete list. > > I think we certainly need to make the files be deleted in some > reasonable fashion if repack fails partway through.
I think that Srinath tries to explain the cleanup in detail, but in fact that's a normal processing of transaction abort. Not sure we need to do anything special. > As for lock upgrade, I wonder if the best way to handle this isn't to > hack the deadlock detector so that it causes any *other* process to die, > if they detect that they would block on REPACK. Arguably there's > nothing that you can do to a table while its undergoing REPACK > CONCURRENTLY; any alterations would have to wait until the repacking is > compelted. We can implement that idea simply enough, as shown in this > crude prototype. (I omitted the last three patches in the series, and > squashed my proposed changes into 0003, as announced in my previous > posting.) I haven't thought of it because I'm not familiar with the deadlock detector, but what you do seems consistent with the way blocking by autovacuum is handled. The only problem I noticed is that PROC_IN_CONCURRENT_REPACK is not cleared at the end of transaction. Perhaps it should be added to PROC_VACUUM_STATE_MASK (name of which is already misleading due to the presence of PROC_IN_SAFE_IC, but that's another problem). > The isolation test file is also a bit crude; I just copied repack.spec > to a new file and removed the uninteresting bits. Maybe just add a new permutation to repack.spec? I don't remembery if I created repack_toast.spec as a separate file just for better readability or if there was some other issue, but the deadlock test might fit into repack.spec. -- Antonin Houska Web: https://www.cybertec-postgresql.com
