> -----Original Message-----
> From: Klaus Jensen <[email protected]>
> Sent: Tuesday, 29 September 2020 20.00
> To: Keith Busch <[email protected]>
> Cc: Damien Le Moal <[email protected]>; Fam Zheng
> <[email protected]>; Kevin Wolf <[email protected]>; qemu-
> [email protected]; Niklas Cassel <[email protected]>; Klaus Jensen
> <[email protected]>; [email protected]; Alistair Francis
> <[email protected]>; Philippe Mathieu-Daudé <[email protected]>;
> Matias Bjorling <[email protected]>
> Subject: Re: [PATCH v4 00/14] hw/block/nvme: Support Namespace Types and
> Zoned Namespace Command Set
>
> On Sep 29 10:29, Keith Busch wrote:
> > On Tue, Sep 29, 2020 at 12:46:33PM +0200, Klaus Jensen wrote:
> > > It is unmistakably clear that you are invalidating my arguments
> > > about portability and endianness issues by suggesting that we just
> > > remove persistent state and deal with it later, but persistence is
> > > the killer feature that sets the QEMU emulated device apart from
> > > other emulation options. It is not about using emulation in
> > > production (because yeah, why would you?), but persistence is what
> > > makes it possible to develop and test "zoned FTLs" or something that
> > > requires recovery at power up.
> > > This is what allows testing of how your host software deals with
> > > opened zones being transitioned to FULL on power up and the
> > > persistent tracking of LBA allocation (in my series) can be used to
> > > properly test error recovery if you lost state in the app.
> >
> > Hold up -- why does an OPEN zone transition to FULL on power up? The
> > spec suggests it should be CLOSED. The spec does appear to support
> > going to FULL on a NVM Subsystem Reset, though. Actually, now that I'm
> > looking at this part of the spec, these implicit transitions seem a
> > bit less clear than I expected. I'm not sure it's clear enough to
> > evaluate qemu's compliance right now.
> >
> > But I don't see what testing these transitions has to do with having a
> > persistent state. You can reboot your VM without tearing down the
> > running QEMU instance. You can also unbind the driver or shutdown the
> > controller within the running operating system. That should make those
> > implicit state transitions reachable in order to exercise your FTL's
> > recovery.
> >
>
> Oh dear - don't "spec" with me ;)
>
> NVMe v1.4 Section 7.3.1:
>
>     An NVM Subsystem Reset is initiated when:
>       * Main power is applied to the NVM subsystem;
>       * A value of 4E564D64h ("NVMe") is written to the NSSR.NSSRC
>         field;
>       * Requested using a method defined in the NVMe Management
>         Interface specification; or
>       * A vendor specific event occurs.
>
> In the context of QEMU, "Main power" is tearing down QEMU and starting it
> from scratch. Just like on a "real" host, unbinding the driver, rebooting or
> shutting down the controller does not cause a subsystem reset (and does not
> cause the zones to change state). And since the device does not indicate
> support for the optional NSSR.NSSRC register, that way to initiate a subsystem
> cannot be used.
>
> The reason for moving to FULL is that write pointer updates are not persisted
> on each advancement, only when the zone state changes. So zones that were
> opened might have valid data, but invalid write pointer.
> So the device transitions them to FULL as it is allowed to.
>

How about when one must also recover from intermediate states (i.e., zones that were open or closed upon power loss)? For example, I would hope that a real SSD implementation does not transition zones to FULL when it has thousands of them open simultaneously. That could be a disaster for the P/E cycles, with a lot of media going to waste. One would want applications to support that kind of failure mode as well.
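
To make the trade-off concrete, here is a rough sketch of the two power-up policies being discussed. It is not taken from the patch series; the type names, the `recover_zones()` helper, and the policy enum are all hypothetical, and it assumes the persisted write pointer never exceeds the zone capacity. With the to-FULL policy, the remainder of every open zone becomes unwritable until the zone is reset, which is the waste I am worried about when many zones are open.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical zone states; names are illustrative, not QEMU's. */
enum zone_state { ZS_EMPTY, ZS_OPEN, ZS_CLOSED, ZS_FULL };

struct zone {
    enum zone_state state;
    uint64_t capacity; /* writable blocks in the zone */
    uint64_t wp;       /* last persisted write pointer, relative to zone start */
};

/* Hypothetical policy switch for what happens to open zones at power-up. */
enum recover_policy { RECOVER_TO_FULL, RECOVER_TO_CLOSED };

/*
 * Apply the power-up transition to every zone that was open at power loss.
 *
 * Returns an upper bound on the number of blocks that become unwritable
 * until the affected zones are reset. It is only an upper bound because the
 * persisted write pointer may lag behind what was actually written.
 *
 * RECOVER_TO_CLOSED is only safe if the persisted write pointer can be
 * trusted, which is exactly what the quoted explanation says is not the
 * case when it is persisted on state changes only.
 */
static uint64_t recover_zones(struct zone *zones, size_t n,
                              enum recover_policy policy)
{
    uint64_t lost = 0;

    for (size_t i = 0; i < n; i++) {
        struct zone *z = &zones[i];

        if (z->state != ZS_OPEN) {
            continue; /* only open zones are affected */
        }

        if (policy == RECOVER_TO_FULL) {
            lost += z->capacity - z->wp;
            z->state = ZS_FULL;
        } else {
            z->state = ZS_CLOSED;
        }
    }

    return lost;
}
```

Calling `recover_zones()` with `RECOVER_TO_FULL` over a few thousand barely written open zones gives a lost-capacity figure close to the full capacity of those zones, whereas `RECOVER_TO_CLOSED` loses nothing but pushes the write pointer recovery problem onto the device or the host.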
