On 19.12.19 11:02, Kevin Wolf wrote:
> On 18.12.2019 at 11:28, Vladimir Sementsov-Ogievskiy wrote:
[...]

>> qcow2_write_snapshots actually called unlocked from
>> qcow2_check_fix_snapshot_table.. It seems unsafe.
>
> This is curious, I'm not sure why you would drop the lock there. Max?

I don’t remember why but it may certainly have to do with the fact that
everything that calls qcow2_write_snapshots() (i.e., qcow2_snapshot_*)
does so without having taken the lock.  I suppose I simply assumed this
would have to be how it’s done.

I don’t think it’s a problem right now because you can only check (and
repair) the image from qemu-img (or when it is opened with the dirty
flag set), so there shouldn’t be concurrent I/O.

Anyway.  I tried to remove it and then 261 hangs.  This is because
qcow2_write_snapshots() calls bdrv_flush(bs) twice.  It would have to
drop the lock around those calls at least.  I’m actually not sure
whether this is safe to do (in the sense of whether it’s fundamentally
safer than just not holding the lock at all and trusting that there are
no concurrent requests).

In any case, it’s also not purely trivial, because if we were to make
qcow2_write_snapshots() drop the locks around bdrv_flush(), all of its
callers would in turn need to take the lock around it.  (I’m not saying
that is difficult, I’m just saying it’s more difficult than dropping
three lines in qcow2_write_snapshots()).

Max
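
To illustrate the shape of the change discussed above, here is a minimal
standalone sketch. It is not QEMU code: ToyLock and all toy_* functions
are hypothetical stand-ins, and it assumes that the flush path acquires
the same non-recursive lock itself, which is what the hang in 261
suggests. A lock attempt on an already held lock returns -1 here instead
of hanging, so the failure mode is observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive lock such as s->lock. */
typedef struct ToyLock {
    bool held;
} ToyLock;

/* Returns 0 on success, -1 where a real non-recursive lock would hang. */
int toy_lock(ToyLock *l)
{
    if (l->held) {
        return -1;
    }
    l->held = true;
    return 0;
}

void toy_unlock(ToyLock *l)
{
    l->held = false;
}

/* Assumption: the flush path takes the lock on its own. */
int toy_flush(ToyLock *l)
{
    if (toy_lock(l) < 0) {
        return -1; /* models the hang */
    }
    /* ... flush work would happen here ... */
    toy_unlock(l);
    return 0;
}

/* Naive variant: keeps the caller's lock held across the flush. */
int toy_write_snapshots_naive(ToyLock *l)
{
    assert(l->held); /* caller must hold the lock */
    return toy_flush(l);
}

/* Variant that drops the lock around each of the two flushes. */
int toy_write_snapshots(ToyLock *l)
{
    assert(l->held); /* caller must hold the lock */
    for (int i = 0; i < 2; i++) {
        /* ... metadata would be written under the lock here ... */
        toy_unlock(l);          /* window where other requests could run */
        int ret = toy_flush(l);
        if (toy_lock(l) < 0) {
            return -1;
        }
        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}

/* Every caller now has to take the lock around the call. */
int toy_snapshot_caller(ToyLock *l)
{
    if (toy_lock(l) < 0) {
        return -1;
    }
    int ret = toy_write_snapshots(l);
    toy_unlock(l);
    return ret;
}
```

The unlocked window around each flush in toy_write_snapshots() is the
part whose safety is in question: anything that was checked under the
lock before the flush may have changed by the time the lock is retaken.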
