Antonin Houska <[email protected]> wrote:

> Antonin Houska <[email protected]> wrote:
>
> > Srinath Reddy Sadipiralla <[email protected]> wrote:
> >
> > > The concurrency test failed once. I tried to reproduce the below scenario
> > > but no luck. I think the assert failure happened because after a
> > > speculative insert there might be no spec CONFIRM or ABORT,
> > > thoughts?
> >
> > Perhaps, I'll try. I'm not sure the REPACK decoding worker does anything
> > special regarding decoding. If you happen to see the problem again, please
> > try to preserve the related WAL segments - if this is a bug in PG executor,
> > pg_waldump might reveal that.
>
> I could not reproduce the failure, and have no idea how speculative insert can
> stay w/o CONFIRM / ABORT record. The only problem I could imagine is that
> change_useless_for_repack() filters out the CONFIRM / ABORT record
> accidentally, but neither code review nor debugger proves that
> theory. (Actually if this was the problem, the test failure probably wouldn't
> be that rare.)

I confirm that I was able to reproduce the crash using a debugger and your
more recent diagnosis [1]. Indeed, filtering was the problem.

Unfortunately, I wasn't able to make the crash easily reproducible using the
isolation tester. The problem is that the logical decoding is performed by a
background worker. When the backend executing REPACK waits for the background
worker, which in turn waits on an injection point, the isolation tester does
not recognize that it is effectively the backend that is waiting on the
injection point. Therefore the isolation tester does not proceed to the next
step.

Anyway, thanks again for your testing!

[1] https://www.postgresql.org/message-id/CAFC%2Bb6qk3-DQTi43QMqvVLP%2BsudPV4vsLQm5iHfcCeObrNaVyA%40mail.gmail.com

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com
