----- Original Message ----- From: "Andriy Gapon" <a...@freebsd.org>

on 18/10/2013 17:57 Steven Hartland said the following:
I think we may well need the following patch to set the minblock
size based on the vdev ashift and not SPA_MINBLOCKSIZE.

svn diff -x -p sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c        (revision 256554)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c        (working copy)
@@ -5147,7 +5147,7 @@ l2arc_compress_buf(l2arc_buf_hdr_t *l2hdr)
       len = l2hdr->b_asize;
       cdata = zio_data_buf_alloc(len);
       csize = zio_compress_data(ZIO_COMPRESS_LZ4, l2hdr->b_tmp_cdata,
-           cdata, l2hdr->b_asize, (size_t)SPA_MINBLOCKSIZE);
+           cdata, l2hdr->b_asize, (size_t)(1ULL << l2hdr->b_dev->l2ad_vdev->vdev_ashift));

       if (csize == 0) {
               /* zero block, indicate that there's nothing to write */
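
For readers unfamiliar with ashift, the expression in the patch can be
illustrated with a small stand-alone sketch. It is not ZFS code: the function
name and the SKETCH_ constant are hypothetical, and the only assumption taken
from ZFS is that SPA_MINBLOCKSIZE is 512 bytes (1 << 9).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors SPA_MINBLOCKSIZE (1 << 9 = 512 bytes) in ZFS. */
#define	SKETCH_SPA_MINBLOCKSIZE	512ULL

/*
 * Hypothetical helper: the smallest unit a device with the given
 * ashift can write, i.e. the value the patch passes as the minimum
 * block size instead of the fixed 512-byte constant.
 */
static uint64_t
sketch_min_block_size(unsigned ashift)
{
	assert(ashift < 64);	/* shifting by 64 or more is undefined */
	return (1ULL << ashift);
}
```

With ashift 9 the result equals the old 512-byte constant, while with
ashift 12 it is 4096 bytes, eight times larger.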


This is a rather old thread and change, but I think that I have identified
another problem with 4KB cache devices.

I noticed that on some of our systems we were getting a clearly abnormal number
of l2arc checksum errors accounted in l2_cksum_bad.  The hardware appeared to be
in good health.  Using DTrace I noticed that the data seemed to be overwritten
with other data.  After more DTrace analysis I observed that sometimes
l2arc_write_buffers() would advance l2ad_hand by more than target_sz.
This meant that l2arc_write_buffers() would write beyond a region cleared by
l2arc_evict() and thus overwrite data belonging to non-evicted buffers.  Havoc
ensues.

The cache devices in question are all SSDs with logical sector size of 4KB.
I am not sure about other ZFS platforms, but on FreeBSD this fact is detected
and ashift of 12 is used for the cache vdevs.

Looking at l2arc_write_buffers() code you can see that it properly accounts for
ashift when actually writing buffers and advancing l2ad_hand:
                       /*
                        * Keep the clock hand suitably device-aligned.
                        */
                       buf_p_sz = vdev_psize_to_asize(dev->l2ad_vdev, buf_sz);
                       write_psize += buf_p_sz;
                       dev->l2ad_hand += buf_p_sz;

But the same is not done when selecting buffers to be written and checking that
target_sz is not exceeded.
So if the ARC contains many buffers smaller than 4KB, the aligned on-disk size
of the L2ARC buffers can be considerably larger than their non-aligned size.
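
The mismatch can be reproduced with a simplified stand-alone model. This is
only a sketch, not the code from arc.c or from the proposed patch: all
sketch_ names are hypothetical, and the rounding helper assumes that
vdev_psize_to_asize() on a plain vdev rounds the size up to a multiple of
1 << ashift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round size up to a multiple of (1 << ashift). */
static uint64_t
sketch_align_up(uint64_t size, unsigned ashift)
{
	uint64_t unit;

	assert(ashift < 64);
	unit = 1ULL << ashift;
	return ((size + unit - 1) & ~(unit - 1));
}

/*
 * Accept buffers while the running total stays within target_sz and
 * return how far the write hand advances.  The hand always moves by
 * the aligned size.  If account_aligned is zero, the limit check uses
 * the raw buffer sizes (the behaviour described above); otherwise it
 * uses the aligned sizes.
 */
static uint64_t
sketch_hand_advance(const uint64_t *buf_sz, size_t n, uint64_t target_sz,
    unsigned ashift, int account_aligned)
{
	uint64_t selected = 0, advance = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t asize = sketch_align_up(buf_sz[i], ashift);
		uint64_t cost = account_aligned ? asize : buf_sz[i];

		if (selected + cost > target_sz)
			break;		/* target reached, stop selecting */
		selected += cost;
		advance += asize;
	}
	return (advance);
}
```

In this model, sixteen 512-byte buffers with a target of 8KB and ashift 12 are
all accepted by the unaligned check, and the hand advances by 64KB, eight
times the target. With the aligned check only two buffers are accepted and the
hand advances by exactly 8KB.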

I propose the following patch which has been tested and seems to fix the problem
without introducing any side effects:
https://github.com/avg-I/freebsd/compare/review;l2arc-write-target-size.diff
https://github.com/avg-I/freebsd/compare/review;l2arc-write-target-size

Looks good to me.

   Regards
   Steve

