On Mon, Nov 23, 2015 at 12:51 PM, Dan McDonald <[email protected]> wrote:

> Make sure you have a window pointing at the reproduction for this bug.
>
> > On Nov 23, 2015, at 2:02 PM, Dan McDonald <[email protected]> wrote:
> >
> > I'm playing with a prototype where I lifted the checks into the
> zfs_ioctl.c function zfs_ioc_recv() instead.  I'm about to reboot an ONUed
> system to try it.  I'll report back with details (and if things go well
> with the bug-reported case, I'll see about some existing-filesystem tests
> as well).
>
> Still failing on the original reproduction.  New cause, however.  I added
> these changes to suspend the refquota...:
>
> diff --git a/usr/src/uts/common/fs/zfs/zfs_ioctl.c
> b/usr/src/uts/common/fs/zfs/zfs_ioctl.c
> index 933802f..d65ec44 100644
> --- a/usr/src/uts/common/fs/zfs/zfs_ioctl.c
> +++ b/usr/src/uts/common/fs/zfs/zfs_ioctl.c
> @@ -4130,6 +4130,7 @@ zfs_ioc_recv(zfs_cmd_t *zc)
>         char *tosnap;
>         char tofs[ZFS_MAXNAMELEN];
>         boolean_t first_recvd_props = B_FALSE;
> +       uint64_t saved_refquota;
>
>         if (dataset_namecheck(zc->zc_value, NULL, NULL) != 0 ||
>             strchr(zc->zc_value, '@') == NULL ||
> @@ -4221,6 +4222,23 @@ zfs_ioc_recv(zfs_cmd_t *zc)
>                 props_error = SET_ERROR(EINVAL);
>         }
>
> +       /*
> +        * Corner case:  If we receive a replication stream, we need to
> +        * suspend the refquota, because intermediate snapshots may exceed
> +        * the refquota.
> +        *
> +        * For now, just suspend the refquota.  We may wish to be more
> +        * intelligent about this based on data in the drc.
> +        */
> +       if (dsl_prop_get(tofs, zfs_prop_to_name(ZFS_PROP_REFQUOTA), 1,
> +               sizeof (saved_refquota), &saved_refquota, NULL) != 0) {
> +               /* Don't bother with saving the refquota. */
> +               saved_refquota = 0;
> +       } else if (dsl_dataset_set_refquota(tofs,
> +           ZPROP_SRC_TEMPORARY | ZPROP_SRC_RECEIVED, 0) != 0) {
> +               saved_refquota = 0;
> +       }
> +
>

So you've removed the existing on-disk refquota.  What happens if you crash
now?  Seems like you have no way of knowing what it should be reset to.



>         off = fp->f_offset;
>         error = dmu_recv_stream(&drc, fp->f_vnode, &off, zc->zc_cleanup_fd,
>             &zc->zc_action_handle);
> @@ -4247,6 +4265,18 @@ zfs_ioc_recv(zfs_cmd_t *zc)
>                 }
>         }
>
> +       if (saved_refquota != 0) {
> +               /*
> +                * Corner case endgame:  Attempt to restore the
> +                * refquota to the received filesystem.  If THAT fails,
> +                * we should return EDQUOT.
> +                */
> +               if (dsl_dataset_set_refquota(tofs, ZPROP_SRC_RECEIVED,
> +                   saved_refquota) != 0) {
> +                       error = EDQUOT;
> +               }
> +       }
> +
>         zc->zc_cookie = off - fp->f_offset;
>         if (VOP_SEEK(fp->f_vnode, fp->f_offset, &off, NULL) == 0)
>                 fp->f_offset = off
>
>
> In dsl_dataset_set_refquota_check(), there is this check:
>
>         if (newval < dsl_dataset_phys(ds)->ds_referenced_bytes ||
>             newval < ds->ds_reserved) {
>                 dsl_dataset_rele(ds, FTAG);
>                 return (SET_ERROR(ENOSPC));
>         }
>
> "newval" is the 150M in the reproduction on 4986's bug report.
> dsl_dataset_phys(ds)->ds_referenced_bytes  is set to the 200M that the
> first, pre-refquota-set, snapshot takes up.
>
> Furthermore, ds->ds_reserved is set to 0, so I'm not sure how either of
> these would pass.


I don't understand, can you elaborate?  It is checking:
(newval < ds_referenced_bytes || newval < ds_reserved)
ds_reserved is 0, so
(newval < ds_referenced_bytes || newval < 0)
no uint64_t is < 0, so
(newval < ds_referenced_bytes)

This check seems valid to me -- you can't set the refquota to be less than
what is currently referenced, and you also can't set the refquota to be
less than the refreservation.


> I see that there's a refreservation that sets ds_reserved, but it's not at
> all clear how or why the dsl_phys's ds_referenced_bytes should come into
> play when the snapshot (not the current, newly-received
> snapshot-now-dataset) being received is actually within the requested limit.
>
> I'll dig around some more and see what else I can find.  If I'm missing
> something obvious I'd love to know.  I'll be back after I dig around some
> more.
>
> Dan
> _______________________________________________
> developer mailing list
> [email protected]
> http://lists.open-zfs.org/mailman/listinfo/developer
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to