On Mon, Jul 27, 2015 at 11:44 PM, Jorgen Lundman <[email protected]>
wrote:
>
> Hello list,
>
> (I sure hope this maintains some legibility)
>
> So we have cause to add another SA_ZPL_ entry under OSX (because its fun,
> and someone gave me root).
>
> In our case, we have added SA_ZPL_ADDTIME, which keeps the time when an
> entry was added to a directory. (So it is updated if you move the entry to
> a new directory, diverging from CRTIME - used by Spotlight for indexing).
>
> It seemed fairly straight forward, diff looks like:
>
> include//sys/zfs_sa.h ZPL_DXATTR,
> include//sys/zfs_sa.h+#ifdef __APPLE__
> include//sys/zfs_sa.h+ ZPL_ADDTIME,
> include//sys/zfs_sa.h+#endif
> include//sys/zfs_sa.h ZPL_END
>
> include//sys/zfs_znode.h #define SA_ZPL_PAD(z) z->z_attr_table[ZPL_PAD]
> include//sys/zfs_znode.h+#ifdef __APPLE__
> include//sys/zfs_znode.h+#define SA_ZPL_ADDTIME(z)
> z->z_attr_table[ZPL_ADDTIME]
> include//sys/zfs_znode.h+#endif
>
> module//zfs/zfs_sa.c {"ZPL_DXATTR", 0, SA_UINT8_ARRAY, 0},
> module//zfs/zfs_sa.c+#ifdef __APPLE__
> module//zfs/zfs_sa.c+ {"ZPL_ADDTIME",sizeof(uint64_t)*2,SA_UINT64_ARRAY,0},
> module//zfs/zfs_sa.c+#endif
> module//zfs/zfs_sa.c {NULL, 0, 0, 0}
>
>
> Which takes care of the initialisation. Then we go update it when needed.
> Currently, we have two places that could happen. First one is in zfs_dir.c,
> when it is already working on a bulk update:
>
> if (!(flag & ZRENAMING) &&
> zfsvfs->z_use_sa == B_TRUE) {
> timestruc_t now;
> gethrestime(&now);
> ZFS_TIME_ENCODE(&now, addtime);
> SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_ADDTIME(zfsvfs), NULL,
> addtime, sizeof (addtime));
> }
>
> and I bumped the total number of "bulk" entries by one.
>
>
>
> Second case is an update outside of a dmu_tx, so I do it manually with:
>
> /* Write the new ADDTIME to SA */
> if ((zfsvfs->z_use_sa == B_TRUE) &&
> !vfs_isrdonly(zfsvfs->z_vfs) &&
> spa_writeable(dmu_objset_spa(zfsvfs->z_os))) {
>
> tx = dmu_tx_create(zfsvfs->z_os);
> dmu_tx_hold_sa_create(tx, sizeof(addtime));
> dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_TRUE);
>
> error = dmu_tx_assign(tx, TXG_WAIT);
> if (error) {
> dmu_tx_abort(tx);
> } else {
> error = sa_update(zp->z_sa_hdl, SA_ZPL_ADDTIME(zfsvfs),
> &addtime, sizeof(addtime), tx);
> if (error)
> dmu_tx_abort(tx);
>
You can't dmu_tx_abort() after dmu_tx_assign() returns successfully.
(maybe dmu_tx_abort and dmu_tx_commit should just be one call that does
the right thing depending on whether it is assigned or not.)
> else
> dmu_tx_commit(tx);
> }
>
> if (error)
>
>
> To read the value in vnop_getattr, I use the simple:
>
> sa_lookup(zp->z_sa_hdl, SA_ZPL_ADDTIME(zfsvfs),
> &addtime, sizeof(addtime));
>
>
>
> That is all well so far. This does indeed appear to work, and ADDTIME is
> updated, and fetches as expected. Hurrah!
>
>
>
>
> But, the timing is a bit suspicious. Soon after adding this code, lets say
> 48 hours or so, we have come across SA corruption. Since we got tired of SA
> panics from previous problem we shared with ZOL, we still have some
> additional sanity testing.
>
>
>
> One of them has tripped on my test pools which was previously free of
> troubles.
>
>
> kernel[0]: ZFS: NULL SA: SA_COPY_DATA(0, 0, 0xffffff80b60d32b8, 16) i=15
>
>
> from sa.c's sa_build_layouts()
>
> if (!attr_desc[i].sa_data) {
> printf("ZFS: NULL SA: SA_COPY_DATA(%p, %p, %p, %u) i=%u\n",
> attr_desc[i].sa_data_func, attr_desc[i].sa_data,
> data_start, length, i);
> } else {
> SA_COPY_DATA(attr_desc[i].sa_data_func, attr_desc[i].sa_data,
> data_start, length);
> }
>
>
> Generally followed by a
> kernel[0]: kernel memory allocator: redzone violation: write past end of
> buffer
> Jul 28 14:27:58 icgi-vip kernel[0]: buffer=0xffffff80b6dd1b00
> bufctl=0xffffff80b6cba7b8 cache: kmem_alloc_640
>
> but I suspect that is simply from us skipping a copy (but still
> incrementing counters), and is directly from the sa_data conditional we
> added.
>
>
>
> So, are we using the SA_ZPL_ wrong? Can we not dynamically add attributes
> we like, when we like? All other SA entries are always set/read from
> zfs_znode after all. 99% of the time everything does work correctly, but it
> seems eventually something happens to the SA.
>
AFAIK you are doing it right. You can add attributes. There may be some
subtlety to the SA API that is currently eluding me, or there may be a bug
in the SA code.
--matt
>
> Any pointers are appreciated :)
>
> Lund
>
> --
> Jorgen Lundman | <[email protected]>
> Unix Administrator | +81 (0)90-5578-8500 (work)
> Shibuya-ku, Tokyo | +81 (0)80-2090-5800 (cell)
> Japan | +81 (0)3 -3375-1767 (home)
> _______________________________________________
> developer mailing list
> [email protected]
> http://lists.open-zfs.org/mailman/listinfo/developer
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer