Re: [Qemu-devel] QCOW2 deduplication design

Stefan Hajnoczi Thu, 10 Jan 2013 00:17:00 -0800

On Wed, Jan 9, 2013 at 5:40 PM, Benoît Canet <benoit.ca...@irqsave.net> wrote:
>> > I.5) cluster removal
>> > When an L2 entry to a cluster becomes stale the qcow2 code decrements
>> > the refcount.
>> > When the refcount reaches zero the L2 hash block of the stale cluster
>> > is written to clear the hash.
>> > This happens often and requires the second GTree to find the hash by
>> > its physical sector number.
>>
>> This happens often?  I'm surprised.  Thought this only happens when
>> you delete snapshots or resize the image file?  Maybe I misunderstood
>> this case.
>
> Yes, the preliminary metrics code shows that cluster removal happens often.
> Maybe some recurrent filesystem structure is written to disk first and
> then overwritten (inode skeleton, or journal zeroing).


Now I understand.  This case covers overwriting existing data with new
contents.  That is common :).
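
To make the removal path concrete, here is a minimal sketch in Python of
the bookkeeping as I read it from the quoted design. It is not the actual
qcow2 code: the class name, the method names and the plain dicts standing
in for the two GTrees are all assumptions made for illustration.

```python
class DedupIndex:
    """Toy model of the two lookup tables (hypothetical names)."""

    def __init__(self):
        self.by_hash = {}    # content hash -> physical sector (first tree)
        self.by_sector = {}  # physical sector -> content hash (second tree)
        self.refcount = {}   # physical sector -> number of L2 entries using it

    def add_reference(self, content_hash, sector):
        """Record one more L2 entry pointing at the cluster at `sector`."""
        self.by_hash[content_hash] = sector
        self.by_sector[sector] = content_hash
        self.refcount[sector] = self.refcount.get(sector, 0) + 1

    def drop_reference(self, sector):
        """An L2 entry pointing at `sector` became stale.

        Returns True if the refcount reached zero and the hash was cleared.
        """
        self.refcount[sector] -= 1
        if self.refcount[sector] > 0:
            return False
        del self.refcount[sector]
        # Only the physical sector is known here, hence the reverse lookup.
        content_hash = self.by_sector.pop(sector)
        del self.by_hash[content_hash]
        return True
```

The point of the second table is visible in drop_reference(): an overwrite
only knows which physical sector went stale, not its hash, so without the
reverse mapping the stale hash could not be found cheaply.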

But are you seeing a cluster with refcount > 1 being overwritten
often?  If so, it's worth looking into why that happens.  It may be a
common pattern for certain file systems or applications to write
initial data 'A' first and then change it later.  This actually
argues against online dedup, or at least in favor of something like a
qcow2 delayed write where we don't "commit" yet because the guest will
probably still modify or append to the data.

If the initial write with data 'A' is usually followed up by a rewrite
shortly afterwards it may be possible to use a timer to dedup after a
delay.
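
A rough sketch of the delayed-dedup idea, in Python for brevity. Everything
here is hypothetical (the DelayedDedup class, its fields, the explicit
"now" argument used instead of a real timer, and SHA-256 as the hash); it
only illustrates that a rewrite within the delay restarts the wait, so the
short-lived data 'A' is never hashed or indexed. It deliberately ignores
overwrites that arrive after a cluster has already been committed.

```python
import hashlib


class DelayedDedup:
    """Defer dedup until a cluster has been left alone for `delay` ticks."""

    def __init__(self, delay):
        self.delay = delay
        self.pending = {}    # cluster index -> (data, time of last write)
        self.by_hash = {}    # content hash -> cluster holding the first copy
        self.dedup_map = {}  # cluster index -> cluster it shares data with

    def write(self, cluster, data, now):
        # A rewrite replaces the pending entry and restarts the delay.
        self.pending[cluster] = (data, now)

    def tick(self, now):
        """Dedup every pending cluster not written for at least `delay`.

        Returns the list of clusters committed by this call.
        """
        committed = []
        for cluster, (data, written_at) in list(self.pending.items()):
            if now - written_at < self.delay:
                continue  # still "hot", the guest may rewrite it
            del self.pending[cluster]
            digest = hashlib.sha256(data).hexdigest()
            # The first cluster seen with this content becomes the target.
            target = self.by_hash.setdefault(digest, cluster)
            self.dedup_map[cluster] = target
            committed.append(cluster)
        return committed
```

For example, writing 'A' to a cluster at time 0 and 'B' at time 3 with a
delay of 5 means tick(5) commits nothing, and only 'B' is ever hashed once
the cluster has been quiet long enough.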

Stefan
