On 5/22/26 12:19 PM, John Groves wrote:
> From: John Groves <[email protected]>
>
> Clear holder_ops before holder_data so that a concurrent fs_dax_get()
> cannot have its newly installed holder_ops overwritten. Also add a
> kerneldoc comment documenting that fs_put_dax() must only be called
> by the current holder.
>
> Fixes: eec38f5d86d27 ("dax: add fs_dax_get() for devdax")
> Signed-off-by: John Groves <[email protected]>
Couple things from Claude that may be worth taking a look at:
1. Memory ordering is now load-bearing and missing
The whole correctness argument depends on the reader observing holder_ops =
NULL before observing holder_data = NULL. The patch uses a plain store
followed by cmpxchg. On x86 plain stores are ordered, but on arm64/ppc they
are not — the reader can observe cmpxchg's release of holder_data while still
seeing the old holder_ops. That puts us back in the dangerous (holder_data ==
NULL, holder_ops == old) state on weakly-ordered arches.
Required:
smp_store_release(&dax_dev->holder_ops, NULL); /* publish ops=NULL first */
cmpxchg(&dax_dev->holder_data, holder, NULL); /* then release holder_data
*/
And the reader in dax_holder_notify_failure should use
smp_load_acquire/READ_ONCE because today it reads dax_dev->holder_ops twice
(line 334 and line 339), allowing tearing or stale-cache reads. Pre-existing
weakness, but this patch is what makes the ordering matter.
kill_dax (line 461-462) has the same naked-store pattern — it should be made
consistent.
2. Unconditional holder_ops = NULL is a behavior regression
Pre-patch was defensive: if a caller passed the wrong holder, the cmpxchg
failed and nothing got cleared.
Post-patch clears holder_ops unconditionally whenever dax_dev && holder is
truthy. A wrong-holder fs_put_dax() now actively damages the legitimate
holder's state — sets holder_ops to NULL while holder_data retains the
legitimate holder's pointer. From that point, all dax_holder_notify_failure()
calls return -EOPNOTSUPP, silently breaking the legitimate holder's
poison-recovery path.
DJ
> ---
> drivers/dax/super.c | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 25cf99dd9360b..fa1d2a6eb2408 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -116,11 +116,31 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
>
> #if IS_ENABLED(CONFIG_FS_DAX)
>
> +/**
> + * fs_put_dax() - release holder ownership of a dax_device
> + * @dax_dev: dax device to release (may be NULL)
> + * @holder: the holder pointer previously passed to fs_dax_get() or
> + * fs_dax_get_by_bdev(); must match exactly, as it is used
> + * in a cmpxchg to atomically release ownership
> + *
> + * Must only be called by the current holder. Clears holder_ops before
> + * holder_data to avoid a race where a concurrent fs_dax_get() could have
> + * its newly installed holder_ops overwritten.
> + */
> void fs_put_dax(struct dax_device *dax_dev, void *holder)
> {
> - if (dax_dev && holder &&
> - cmpxchg(&dax_dev->holder_data, holder, NULL) == holder)
> + if (dax_dev && holder) {
> + /*
> + * Clear holder_ops before holder_data so that a concurrent
> + * fs_dax_get() cannot have its newly installed holder_ops
> + * overwritten. holder_ops is only consulted when holder_data
> + * is non-NULL, so clearing ops first is safe — any in-flight
> + * holder_notify_failure() will see the old holder_data with
> + * NULL ops (a no-op) rather than new ops with wrong context.
> + */
> dax_dev->holder_ops = NULL;
> + cmpxchg(&dax_dev->holder_data, holder, NULL);
> + }
> put_dax(dax_dev);
> }
> EXPORT_SYMBOL_GPL(fs_put_dax);