Someone previously brought up that SA has some bugs, apparently ZoL has
fixed some of those, see:
http://permalink.gmane.org/gmane.comp.file-systems.openzfs.devel/1545

As Matt has suggested - perhaps you are running into one of those SA bugs.

-Alek

On Tue, Jul 28, 2015 at 11:55 AM Matthew Ahrens <[email protected]> wrote:

> On Mon, Jul 27, 2015 at 11:44 PM, Jorgen Lundman <[email protected]>
> wrote:
>
>>
>> Hello list,
>>
>> (I sure hope this maintains some legibility)
>>
>> So we have cause to add another SA_ZPL_ entry under OSX (because its fun,
>> and someone gave me root).
>>
>> In our case, we have added SA_ZPL_ADDTIME, which keeps the time when an
>> entry was added to a directory. (So it is updated if you move the entry to
>> a new directory, diverging from CRTIME - used by Spotlight for indexing).
>>
>> It seemed fairly straight forward, diff looks like:
>>
>> include//sys/zfs_sa.h     ZPL_DXATTR,
>> include//sys/zfs_sa.h+#ifdef __APPLE__
>> include//sys/zfs_sa.h+  ZPL_ADDTIME,
>> include//sys/zfs_sa.h+#endif
>> include//sys/zfs_sa.h   ZPL_END
>>
>> include//sys/zfs_znode.h #define SA_ZPL_PAD(z) z->z_attr_table[ZPL_PAD]
>> include//sys/zfs_znode.h+#ifdef __APPLE__
>> include//sys/zfs_znode.h+#define SA_ZPL_ADDTIME(z)
>> z->z_attr_table[ZPL_ADDTIME]
>> include//sys/zfs_znode.h+#endif
>>
>> module//zfs/zfs_sa.c     {"ZPL_DXATTR", 0, SA_UINT8_ARRAY, 0},
>> module//zfs/zfs_sa.c+#ifdef __APPLE__
>> module//zfs/zfs_sa.c+
>> {"ZPL_ADDTIME",sizeof(uint64_t)*2,SA_UINT64_ARRAY,0},
>> module//zfs/zfs_sa.c+#endif
>> module//zfs/zfs_sa.c    {NULL, 0, 0, 0}
>>
>>
>> Which takes care of the initialisation. Then we go update it when needed.
>> Currently, we have two places that could happen. First one is in
>> zfs_dir.c,
>> when it is already working on a bulk update:
>>
>>         if (!(flag & ZRENAMING) &&
>>                 zfsvfs->z_use_sa == B_TRUE) {
>>                 timestruc_t     now;
>>                 gethrestime(&now);
>>                 ZFS_TIME_ENCODE(&now, addtime);
>>                 SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_ADDTIME(zfsvfs),
>> NULL,
>>                     addtime, sizeof (addtime));
>>         }
>>
>> and I bumped the total number of "bulk" entries by one.
>>
>>
>>
>> Second case is an update outside of a dmu_tx, so I do it manually with:
>>
>>         /* Write the new ADDTIME to SA */
>>         if ((zfsvfs->z_use_sa == B_TRUE) &&
>>                 !vfs_isrdonly(zfsvfs->z_vfs) &&
>>                 spa_writeable(dmu_objset_spa(zfsvfs->z_os))) {
>>
>>                 tx = dmu_tx_create(zfsvfs->z_os);
>>                 dmu_tx_hold_sa_create(tx, sizeof(addtime));
>>                 dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_TRUE);
>>
>>                 error = dmu_tx_assign(tx, TXG_WAIT);
>>                 if (error) {
>>                         dmu_tx_abort(tx);
>>                 } else {
>>       error = sa_update(zp->z_sa_hdl, SA_ZPL_ADDTIME(zfsvfs),
>>                                    &addtime, sizeof(addtime), tx);
>>                         if (error)
>>                                 dmu_tx_abort(tx);
>>
>
> You can't dmu_tx_abort() after dmu_tx_assign() returns successfully.
>  (maybe dmu_tx_abort and dmu_tx_commit should just be one call that does
> the right thing depending on whether it is assigned or not.)
>
>
>>                         else
>>                                 dmu_tx_commit(tx);
>>                 }
>>
>>                 if (error)
>>
>>
>> To read the value in vnop_getattr, I use the simple:
>>
>>                 sa_lookup(zp->z_sa_hdl, SA_ZPL_ADDTIME(zfsvfs),
>>                                   &addtime, sizeof(addtime));
>>
>>
>>
>> That is all well so far. This does indeed appear to work, and ADDTIME is
>> updated, and fetches as expected. Hurrah!
>>
>>
>>
>>
>> But, the timing is a bit suspicious. Soon after adding this code, lets say
>> 48 hours or so, we have come across SA corruption. Since we got tired of
>> SA
>> panics from previous problem we shared with ZOL, we still have some
>> additional sanity testing.
>>
>>
>>
>> One of them has tripped on my test pools which was previously free of
>> troubles.
>>
>>
>> kernel[0]: ZFS: NULL SA: SA_COPY_DATA(0, 0, 0xffffff80b60d32b8, 16) i=15
>>
>>
>> from sa.c's sa_build_layouts()
>>
>> if (!attr_desc[i].sa_data) {
>>         printf("ZFS: NULL SA: SA_COPY_DATA(%p, %p, %p, %u) i=%u\n",
>>                    attr_desc[i].sa_data_func, attr_desc[i].sa_data,
>>                    data_start, length, i);
>> } else {
>>         SA_COPY_DATA(attr_desc[i].sa_data_func, attr_desc[i].sa_data,
>>                                  data_start, length);
>> }
>>
>>
>> Generally followed by a
>> kernel[0]: kernel memory allocator: redzone violation: write past end of
>> buffer
>> Jul 28 14:27:58 icgi-vip kernel[0]: buffer=0xffffff80b6dd1b00
>> bufctl=0xffffff80b6cba7b8  cache: kmem_alloc_640
>>
>> but I suspect that is simply from us skipping a copy (but still
>> incrementing counters), and is directly from the sa_data conditional we
>> added.
>>
>>
>>
>> So, are we using the SA_ZPL_ wrong? Can we not dynamically add attributes
>> we like, when we like? All other SA entries are always set/read from
>> zfs_znode after all. 99% of the time everything does work correctly, but
>> it
>> seems eventually something happens to the SA.
>>
>
> AFAIK you are doing it right.  You can add attributes.  There may be some
> subtlety to the SA API that is currently eluding me, or there may be a bug
> in the SA code.
>
> --matt
>
>
>>
>> Any pointers are appreciated :)
>>
>> Lund
>>
>> --
>> Jorgen Lundman       | <[email protected]>
>> Unix Administrator   | +81 (0)90-5578-8500          (work)
>> Shibuya-ku, Tokyo    | +81 (0)80-2090-5800          (cell)
>> Japan                | +81 (0)3 -3375-1767          (home)
>> _______________________________________________
>> developer mailing list
>> [email protected]
>> http://lists.open-zfs.org/mailman/listinfo/developer
>>
> _______________________________________________
> developer mailing list
> [email protected]
> http://lists.open-zfs.org/mailman/listinfo/developer
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to