On 14.07.25 at 15:43, Kevin Wolf wrote:
> On 01.07.2025 at 19:16, Kevin Wolf wrote:
>> On 30.05.2025 at 17:10, Fiona Ebner wrote:
>>> This series is an attempt to fix a deadlock issue reported by Andrey
>>> here [3].
>>>
>>> bdrv_drained_begin() polls and is not allowed to be called with the
>>> block graph lock held. Mark the function as GRAPH_UNLOCKED.
>>>
>>> This alone does not catch the issue reported by Andrey, because there
>>> is a bdrv_graph_rdunlock_main_loop() before bdrv_drained_begin() in
>>> the function bdrv_change_aio_context(). That unlock is of course
>>> ineffective if the exclusive lock is held, but it prevents TSA from
>>> finding the issue.
>>>
>>> Thus the bdrv_drained_begin() call from inside
>>> bdrv_change_aio_context() needs to be moved up the call stack before
>>> acquiring the locks. This is the bulk of the series.
>>>
>>> Granular draining is not trivially possible, because many of the
>>> affected functions can recursively call themselves.
>>>
>>> In place where bdrv_drained_begin() calls were removed, assertions
>>> are added, checking the quiesced_counter to ensure that the nodes
>>> already got drained further up in the call stack.
>>
>> I finished review for this series. I had some minor comments on patches
>> 24, 27 and 41. Once we agree what to do there, I can probably just make
>> any changes myself while applying.
>
> I don't see any objections, so I just applied this and made all the
> changes I had suggested.

Sorry for not responding sooner. I was on vacation for a while and will
still be busy with other stuff in the coming weeks. The changes you
suggested sound good to me, thanks!

Best Regards,
Fiona