I'm sorry for the late reply, we have discussed your point of view
and agree with the point, so we stop pursuing this patchset unless
we find a persuasive scene that needs to disable these features.

Thank Peter for the explanation of the main objection. :)

Yong

On Wed, Jul 19, 2023 at 6:17 PM Peter Krempa <[email protected]> wrote:

> On Wed, Jul 19, 2023 at 11:52:45 +0800, Yong Huang wrote:
> > On Tue, Jul 18, 2023 at 9:31 PM Peter Krempa <[email protected]> wrote:
> >
> > > On Sun, Jul 16, 2023 at 22:36:21 +0800, ~hyman wrote:
> > > > From: Hyman Huang(黄勇) <[email protected]>
> > > >
> > > > Add 'virtio_discard' and 'virtio_write_zeroes' attribute to control
> > > > whether discard and write-zeroes requests are handled by the
> > > > virtio-blk device.
> > > >
> > > > To distinguish the attributes in block drive layer, use the 'virtio'
> > > > prefix to indicate that these attributes are specific for virtio-blk.
> > > >
> > > > Signed-off-by: Hyman Huang(黄勇) <[email protected]>
> > > > ---
> > > >  docs/formatdomain.rst             |  8 ++++++++
> > > >  src/conf/domain_conf.c            | 16 ++++++++++++++++
> > > >  src/conf/domain_conf.h            |  2 ++
> > > >  src/conf/schemas/domaincommon.rng | 10 ++++++++++
> > > >  src/conf/storage_source_conf.c    |  2 ++
> > > >  src/conf/storage_source_conf.h    |  2 ++
> > > >  src/qemu/qemu_domain.c            |  2 ++
> > > >  src/qemu/qemu_driver.c            |  4 +++-
> > > >  src/vz/vz_utils.c                 | 12 ++++++++++++
> > > >  9 files changed, 57 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
> > > > index 4af0b82569..7be12ff08e 100644
> > > > --- a/docs/formatdomain.rst
> > > > +++ b/docs/formatdomain.rst
> > > > @@ -3259,6 +3259,14 @@ paravirtualized driver is specified via the
> > > ``disk`` element.
> > > >        value can be either "unmap" (allow the discard request to be
> > > passed) or
> > > >        "ignore" (ignore the discard request). :since:`Since 1.0.6
> (QEMU
> > > and KVM
> > > >        only)`
> > > > +   -  The optional ``virtio_discard`` and ``virtio_write_zeroes``
> are
> > > attributes
> > > > +      that control whether discard and write-zeroes requests are
> > > handled by the
> > > > +      virtio-blk device. The feature is based on DISCARD and
> > > WRITE_ZEROES
> > > > +      commands introduced in virtio-blk protocol to improve
> performance
> > > when
> > > > +      using SSD backend. The value can be either 'on' or 'off'. Note
> > > that
> > > > +      ``discard`` and ``write_zeroes`` implementations in the block
> > > drive layer
> > > > +      are parts of the feature logically and should be turned on
> when
> > > enabling
> > > > +      the feature. :since:`Since 9.6.0 (QEMU and KVM only)`
> > >
> > > Based on current released qemu both 'discard' and 'write-zeroes'
> feature
> > > of the 'virtio-blk' device is enabled by default:
> > >
> > >  $ qemu-system-x86_64 -device virtio-blk-pci,? | grep discard
> > >    discard=<bool>         - on/off (default: true)
> > >    discard_granularity=<size> -  (default: 4294967295)
> > >    max-discard-sectors=<uint32> -  (default: 4194303)
> > >    report-discard-granularity=<bool> -  (default: true)
> > >  $ qemu-system-x86_64 -device virtio-blk-pci,? | grep write-zeroes
> > >    max-write-zeroes-sectors=<uint32> -  (default: 4194303)
> > >    write-zeroes=<bool>    - on/off (default: true)
> > >
> > > Do you need a way to disable this feature? Based on the description the
> > > default seems to be sane and actually what you'd want users to set.
> > >
> > The default is reasonable indeed. But there are still some scenarios
> > in the production environment that discard may  need to be disabled.
> > For example:
> > - The virtio-blk discard/write-zeroes commands was introduced in
> >   2017 in virtio-blk spec, so the OS distribution before 2017 can not
> >   use this feature. In that case, the cloud management platform(CMP)
> >   could recognize this issue and disable the feature in advance.
>
> Older drivers which do not support the feature are supposed to ignore
> any new feature bits:
>
>
> https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-140002
>
>  2.2 Feature Bits
>
>   Each virtio device offers all the features it understands. During device
>   initialization, the driver reads this and tells the device the subset
>   that it accepts. The only way to renegotiate is to reset the device.
>
>   This allows for forwards and backwards compatibility: if the device is
>   enhanced with a new feature bit, older drivers will not write that
>   feature bit back to the device. Similarly, if a driver is enhanced with
>   a feature that the device doesn’t support, it see the new feature is not
>   offered.
>
> Based on the above it's just fine to use a older OS with the new device
> definition and management doesn't need to do anything to ensure
> compatibility.
>
> > - Discard/write-zeroes may need to be configured at disk granularity
> >   in some scenarios. Assume that CMP support 2 distributed storage
> >   clusters,  one cluster supports discard and another does not.
> >   If the VM is configured with 2 disks - vda, vdb. Which are located in
> >   2 clusters respectively. Or, the VM with disks located in the
> >   discard-supportive cluster and want to migrate disks to another
> >   cluster for some reason(eg. storage capability is exhausted)
> >   CMP may want to turn discard off explicitly.
>
> There are two distinct operations which have different requirements and
> limitations based on the specification:
>
>  - discard:
>    The virtio spec says:
>      "For discard commands, the device MAY deallocate the specified
>       range of sectors in the device backend storage."
>
>    Thus the device implementation is free to ignore any unsupported
>    discard requests. In fact the implementation in qemu does ignore
>    the request if e.g. the device's backend has discard disabled.
>
>    The backend is in fact configured by the <driver discard="unmap">
>    attribute, thus if you need to disable discard you can do that even
>    now on the backend level.
>
>  - write zeroes:
>
>    Write zeroes becomes mandatory once negotiated, but qemu has it's own
>    implementation which has integrated fallback of simply writing zeroes
>    to the backing image if there isn't any faster method available.
>
> > Though the above scenarios are quite rare and the virtio spec can
> > ensure the feature can be negotiated and used correctly.
> > CMP still wants to control the features it supports carefully and
> > precisely.
>
> As we both note there's not an actual technical problem here, right? As
> in, everything will work as configured.
>
> I'd prefer if you could justify/elaborate a bit more why you still need
> to control those features.
>
> > To summarise, IMHO, libvirt may not forbid the upper layer to
> > control the 2 features of the virtio-blk disk. Leaving the option
> > configurable for CMP or even customers.
>
> As noted above, disard requests reaching the block device below the
> image can be already configured via <driver discard="unmap/ignore">.
>
> With the 'write-zeroes' feature there isn't really any semantic
> difference regardless of what feature is supported by the underlying
> storage. The difference is only in who actually does the actual writing
> of zeroes.
>
> > There's one more background may need to be stated:
> > Current released QEMU do not provide hmp/qmp to query the
> > final features that are negotiated between front-end and back-end
> > from the hypervisor side. So if CMP wants to query what features a
> > guest VM uses, it has to query it inside the guest VM or hack the
> > qemu process.
>
> I don't quite understand why that would be something any management
> application needs to know. The hypervisor provides the features and if
> the guest OS decides not to use them it' doesn't really matter, or does
> it?
>
> There are multiple other layers apart from the hypervisor support for
> passing discards and the software that runs inside the VM (kernel,
> volume manager, filesystem driver, filesystem configuration) which can
> decide to use/don't use discards so the knowledge at the hypervisor
> level doesn't tell you much.
>
> > This way is inconvenient for control-planes, so the CMP
> > needs to control the feature as aggressively as it can.
>
> Based on the above, the only thing you can achieve is to forbid discards
> for a VM on another level than those already supported. You will not be
> able to force the OS to use discards and neither will be able to tell
> whether the OS does use them in any other way you do now can (you still
> can query number of discard requests send to the image backend).
>
> > Thank Peter for the attention to this patchset.
>
> My objections to this patchset are on the fact that the code originally
> didn't provide much justification for this patch. With your explanations
> I can't say I'm persuaded more of the contrary.
>
> If you decide to still pursue this patchset:
>
>  - Add justification to the commit message why that is needed. This is
>    needed because the code itself can't justify the need for itself.
>    Additionally do not use the first justification from this message, it
>    simply doesn't have technical grounds.
>
>  - rename the 'virtio_discard' and 'virtio_write_zeroes' features to
>    'frontend_discard' and 'frontend_write_zeroes', so that if any other
>    device frontend will expose such knobs they can be used without the
>    need for yet another knob
>
>  - modify the documentation where it states that in the vast majority of
>    cases the user doesn't need to configure it because it simply does
>    the right thing (this is actually the main objection, if the user
>    doesn't need to set a feature it doesn't make sense to expose a
>    knob)
>
>  - Since this is a device frontend feature you'll need to add a ABI
>    stability check for it (see. virDomainDiskDefCheckABIStability). Note
>    that the existing discard/write_zeroes knobs don't need a ABI
>    stability check because they are backend features thus can be changed
>    e.g. on migration.
>
>

-- 
Best regards

Reply via email to