nvme: Support Namespace Types and Zoned Namespace Command Set

Matias Bjorling Tue, 29 Sep 2020 13:41:57 -0700


> -----Original Message-----
> From: Klaus Jensen <[email protected]>
> Sent: Tuesday, 29 September 2020 20.00
> To: Keith Busch <[email protected]>
> Cc: Damien Le Moal <[email protected]>; Fam Zheng
> <[email protected]>; Kevin Wolf <[email protected]>; qemu-
> [email protected]; Niklas Cassel <[email protected]>; Klaus Jensen
> <[email protected]>; [email protected]; Alistair Francis
> <[email protected]>; Philippe Mathieu-Daudé <[email protected]>;
> Matias Bjorling <[email protected]>
> Subject: Re: [PATCH v4 00/14] hw/block/nvme: Support Namespace Types and
> Zoned Namespace Command Set
> 
> On Sep 29 10:29, Keith Busch wrote:
> > On Tue, Sep 29, 2020 at 12:46:33PM +0200, Klaus Jensen wrote:
> > > It is unmistakably clear that you are invalidating my arguments
> > > about portability and endianness issues by suggesting that we just
> > > remove persistent state and deal with it later, but persistence is
> > > the killer feature that sets the QEMU emulated device apart from
> > > other emulation options. It is not about using emulation in
> > > production (because yeah, why would you?), but persistence is what
> > > > makes it possible to develop and test "zoned FTLs" or something that
> > > > requires recovery at power up.
> > > This is what allows testing of how your host software deals with
> > > opened zones being transitioned to FULL on power up and the
> > > persistent tracking of LBA allocation (in my series) can be used to
> > > properly test error recovery if you lost state in the app.
> >
> > Hold up -- why does an OPEN zone transition to FULL on power up? The
> > spec suggests it should be CLOSED. The spec does appear to support
> > going to FULL on a NVM Subsystem Reset, though. Actually, now that I'm
> > looking at this part of the spec, these implicit transitions seem a
> > bit less clear than I expected. I'm not sure it's clear enough to
> > evaluate qemu's compliance right now.
> >
> > But I don't see what testing these transitions has to do with having a
> > persistent state. You can reboot your VM without tearing down the
> > running QEMU instance. You can also unbind the driver or shutdown the
> > controller within the running operating system. That should make those
> > implicit state transitions reachable in order to exercise your FTL's
> > recovery.
> >
> 
> Oh dear - don't "spec" with me ;)
> 
> NVMe v1.4 Section 7.3.1:
> 
>     An NVM Subsystem Reset is initiated when:
>       * Main power is applied to the NVM subsystem;
>       * A value of 4E564D64h ("NVMe") is written to the NSSR.NSSRC
>         field;
>       * Requested using a method defined in the NVMe Management
>         Interface specification; or
>       * A vendor specific event occurs.
> 
> In the context of QEMU, "Main power" is tearing down QEMU and starting it
> from scratch. Just like on a "real" host, unbinding the driver, rebooting or
> shutting down the controller does not cause a subsystem reset (and does not
> cause the zones to change state). And since the device does not indicate
> support for the optional NSSR.NSSRC register, that way to initiate a subsystem
> reset cannot be used.
> 
> The reason for moving to FULL is that write pointer updates are not persisted
> on each advancement, only when the zone state changes. So zones that were
> opened might have valid data, but invalid write pointer.
> So the device transitions them to FULL as it is allowed to.
>


What about the case where one must also recover from intermediate states
(i.e., zones left open or closed at power loss)? For example, I would hope
that a real SSD implementation does not transition zones to full when it has
thousands of them open simultaneously. That could be a disaster for the P/E
cycles, and a lot of media would go to waste. One would want applications to
support that kind of failure mode as well.
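
To put a rough number on the media-waste concern, here is a minimal sketch in
Python. It is a toy model, not the QEMU code or anything from the patch
series: the Zone class, power_up() and the capacity figures are all
hypothetical. It only assumes what is stated above, namely that the write
pointer is persisted on zone state changes and that zones found open at
power up are moved to FULL.

```python
from dataclasses import dataclass
from enum import Enum


class ZoneState(Enum):
    EMPTY = "empty"
    OPEN = "open"
    CLOSED = "closed"
    FULL = "full"


@dataclass
class Zone:
    capacity: int                        # writable LBAs in the zone (made-up unit)
    state: ZoneState = ZoneState.EMPTY
    wp: int = 0                          # volatile write pointer
    persisted_wp: int = 0                # write pointer as of the last state change

    def _set_state(self, state):
        # Assumption from the thread: the write pointer is persisted only
        # when the zone state changes, not on each advancement.
        self.state = state
        self.persisted_wp = self.wp

    def write(self, nlb):
        if self.state is ZoneState.FULL or self.wp + nlb > self.capacity:
            raise ValueError("invalid zone write")
        if self.state is not ZoneState.OPEN:
            self._set_state(ZoneState.OPEN)   # implicit open
        self.wp += nlb
        if self.wp == self.capacity:
            self._set_state(ZoneState.FULL)

    def close(self):
        if self.state is ZoneState.OPEN:
            self._set_state(ZoneState.CLOSED)


def power_up(zones):
    """Simulate a power cycle; return the LBAs made unwritable by recovery."""
    wasted = 0
    for z in zones:
        if z.state is ZoneState.OPEN:
            # The persisted write pointer is stale, so the zone goes to FULL.
            # The simulation still knows the lost volatile wp, which lets it
            # count the capacity that was never written.
            wasted += z.capacity - z.wp
            z.state = ZoneState.FULL
            z.wp = z.persisted_wp = z.capacity
        else:
            # A state change persisted the write pointer; nothing is lost.
            z.wp = z.persisted_wp
    return wasted
```

With 1000 zones of 4096 LBAs each, all open with 8 LBAs written, power_up()
reports 1000 * 4088 = 4088000 LBAs that can no longer be written without a
zone reset. A zone that was closed before the power cycle keeps its state and
write pointer, so it loses nothing.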
