Lots of good info here. I've copied Matt's reply to the project page on the
open-zfs wiki: http://open-zfs.org/wiki/Projects#Sorted_Scrub

On Sat, Jul 9, 2016 at 2:24 PM, Matthew Ahrens <[email protected]> wrote:

> We had an intern work on "sorted scrub" last year.  Essentially the idea
> was to read the metadata to gather into memory all the BP's that need to be
> scrubbed, sort them by DVA (i.e. offset on disk) and then issue the scrub
> i/os in that sorted order.  However, memory can't hold all of the BP's, so
> we do multiple passes over the metadata, each pass gathering the next chunk
> of BP's.  This code is implemented and seems to work but probably needs
> some more testing and code cleanup.
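
A minimal sketch of the multi-pass idea described above, not the intern's actual code. It assumes a hypothetical `traverse()` callback that walks the metadata and yields one `(vdev, offset)` tuple per block pointer, and it assumes every DVA is unique. How each pass picks its "next chunk" is not stated in the mail; here each pass keeps the smallest not-yet-scrubbed DVAs that fit in memory.

```python
import heapq
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical stand-in for a block pointer: (vdev id, offset on disk).
DVA = Tuple[int, int]


def multi_pass_sorted_scrub(
    traverse: Callable[[], Iterable[DVA]],
    max_in_memory: int,
) -> Iterator[DVA]:
    """Yield DVAs in sorted order, holding at most max_in_memory at a time.

    Assumes DVAs are unique: a duplicate of the last DVA of a pass
    would be skipped by the next pass.
    """
    last = None  # highest DVA issued so far
    while True:
        # One full pass over the metadata, keeping only the smallest
        # not-yet-scrubbed DVAs that fit in memory.
        pending = (d for d in traverse() if last is None or d > last)
        batch = heapq.nsmallest(max_in_memory, pending)
        if not batch:
            return
        yield from batch  # nsmallest returns the batch already sorted
        last = batch[-1]


if __name__ == "__main__":
    blocks = [(0, 700), (1, 50), (0, 100), (1, 10), (0, 300)]
    for dva in multi_pass_sorted_scrub(lambda: blocks, max_in_memory=2):
        print("scrub", dva)
```

Every loop iteration is one more full traversal, which is the cost the next paragraph of the mail is concerned with.
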
>
> One of the downsides of that approach is having to do multiple passes over
> the metadata if it doesn't all fit in memory (which it typically does
> not).  In some circumstances, this is worth it, but in others not so much.
> To improve on that, we would like to do just one pass over the metadata to
> find all the block pointers.  Rather than storing the BP's sorted in
> memory, we would store them on disk, but only roughly sorted.  There are
> several ways we could do the sorting, which is one of the issues that makes
> this problem interesting.
>
> We could divide each top-level vdev into chunks (like metaslabs, but
> probably a different number of them) and for each chunk have an on-disk
> list of BP's in that chunk that need to be scrubbed/resilvered.  When we
> find a BP, we would append it to the appropriate list.  Once we have
> traversed all the metadata to find all the BP's, we would load one chunk's
> list of BP's into memory, sort it, and then issue the resilver i/os in
> sorted order.
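
The per-chunk lists could look roughly like the sketch below. It is only an illustration under simplifying assumptions: a single top-level vdev, block pointers reduced to integer offsets in `[0, vdev_size)`, and temporary text files standing in for the on-disk lists. The names `chunked_sorted_scrub`, `vdev_size` and `num_chunks` are made up for this example.

```python
import os
import tempfile
from typing import Iterable, Iterator


def chunked_sorted_scrub(
    offsets: Iterable[int],
    vdev_size: int,
    num_chunks: int,
) -> Iterator[int]:
    """Bucket offsets into per-chunk files in one pass, then sort chunk by chunk."""
    chunk_size = -(-vdev_size // num_chunks)  # ceiling division
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"chunk{i}.txt") for i in range(num_chunks)]
        lists = [open(p, "w") for p in paths]
        try:
            # Single traversal: append each BP to the list for its chunk.
            for off in offsets:
                lists[off // chunk_size].write(f"{off}\n")
        finally:
            for f in lists:
                f.close()
        # Chunks cover increasing offset ranges, so sorting each one in
        # turn yields a globally sorted order.
        for p in paths:
            with open(p) as f:
                yield from sorted(int(line) for line in f)


if __name__ == "__main__":
    found = [900, 5, 450, 120, 899, 451]
    for off in chunked_sorted_scrub(found, vdev_size=1000, num_chunks=4):
        print("resilver", off)
```

Only one chunk's list has to fit in memory at a time, but the traversal appends to `num_chunks` different lists, and `num_chunks` has to be picked up front.
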
>
> As an alternative, it might be better to accumulate as many BP's as fit in
> memory, sort them, and then write that sorted list to disk.  Then remove
> those BP's from memory and start filling memory again, write that list,
> etc.  Then read all the sorted lists in parallel to do a merge sort.  This
> has the advantage that we do not need to append to lots of lists as we are
> traversing the metadata. Instead we have to read from lots of lists as we
> do the scrubs, but this should be more efficient.  We also don't have to
> determine beforehand how many chunks to divide each vdev into.
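
The alternative is essentially an external merge sort. A rough sketch, again with block pointers reduced to integer offsets and temporary files standing in for the on-disk sorted lists; `merge_sorted_scrub` and `max_in_memory` are names invented for this example:

```python
import heapq
import tempfile
from typing import IO, Iterable, Iterator, List


def _write_run(sorted_offsets: List[int]) -> IO[str]:
    """Write one sorted run to a temporary file and rewind it for reading."""
    run = tempfile.TemporaryFile("w+")
    run.writelines(f"{off}\n" for off in sorted_offsets)
    run.seek(0)
    return run


def merge_sorted_scrub(offsets: Iterable[int], max_in_memory: int) -> Iterator[int]:
    """Sort offsets using bounded memory: sorted runs on disk, then a k-way merge."""
    runs: List[IO[str]] = []
    buf: List[int] = []
    # Single traversal: fill memory, sort, write the run out, start over.
    for off in offsets:
        buf.append(off)
        if len(buf) >= max_in_memory:
            buf.sort()
            runs.append(_write_run(buf))
            buf = []
    if buf:
        buf.sort()
        runs.append(_write_run(buf))
    try:
        # Read all runs in parallel; heapq.merge keeps one entry per run in memory.
        yield from heapq.merge(*((int(line) for line in run) for run in runs))
    finally:
        for run in runs:
            run.close()


if __name__ == "__main__":
    found = [42, 7, 19, 3, 88, 61, 5]
    for off in merge_sorted_scrub(found, max_in_memory=3):
        print("scrub", off)
```

Each run is written sequentially exactly once, and the number of runs falls out of the memory limit instead of being chosen beforehand.
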
>
> If you'd like to continue working on sorted scrub along these lines, let
> me know.
>
> --matt
>
>
> On Sat, Jul 9, 2016 at 7:10 AM, Gvozden Neskovic <[email protected]>
> wrote:
>
>> Dear OpenZFS developers,
>>
>> Since the SIMD RAID-Z code was merged into ZoL [1], I started to look into the
>> rest of the scrub/resilvering code path.
>> I've found some existing specs and ideas about how to make the process
>> more rotational drive friendly [2][3][4][5].
>> What I've gathered from these is that scrub should be split into metadata
>> and data traversal phases. As I'm new to ZFS,
>> I've made a quick prototype simulating a large elevator using an AVL tree to
>> sort blocks by DVA offset [6]. It's probably
>> broken in more than a few ways, but this is just a quick hack to get a
>> grasp of the code. The solution turned out similar to the
>> 'ASYNC_DESTROY' feature, so I'm wondering if this might be a direction to
>> take?
>>
>> At this stage, I would appreciate any input on how to proceed with this
>> project. If you're a core dev and would like
>> to provide any kind of mentorship or are willing to answer some questions
>> from time to time, please let me know.
>> Or, if there's a perfect solution for this just waiting to be
>> implemented, even better.
>> For starters, pointers like: read this article, make sure you understand
>> this piece of code, etc., would also be very helpful.
>>
>> Regards,
>>
>> [1]
>> https://github.com/zfsonlinux/zfs/commit/ab9f4b0b824ab4cc64a4fa382c037f4154de12d6
>> [2] https://blogs.oracle.com/roch/entry/sequential_resilvering
>> [3]
>> http://wiki.old.lustre.org/images/f/ff/Rebuild_performance-2009-06-15.pdf
>> [4] https://blogs.oracle.com/ahrens/entry/new_scrub_code
>> [5] http://open-zfs.org/wiki/Projects#Periodic_Data_Validation
>> [6]
>> https://github.com/ironMann/zfs/commit/9a2ec765d2afc38ec76393dd694216fae0221443
>>
>



-------------------------------------------
openzfs-developer
Archives: https://www.listbox.com/member/archive/274414/=now
RSS Feed: https://www.listbox.com/member/archive/rss/274414/28015062-cce53afa
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=28015062&id_secret=28015062-f966d51c
Powered by Listbox: http://www.listbox.com
