On Thu, Jan 10, 2013 at 4:18 PM, Benoît Canet wrote:
> Now I understand. This case covers overwriting existing data with new
> contents. That is common :).
>
> But are you seeing a cluster with refcount > 1 being overwritten
> often? If so, it's worth looking into why that happens. It may be a
> common pattern for certain file systems or applica
On Wed, Jan 9, 2013 at 5:40 PM, Benoît Canet wrote:
>> > I.5) cluster removal
>> > When an L2 entry to a cluster becomes stale, the qcow2 code decrements the
>> > refcount.
>> > When the refcount reaches zero, the L2 hash block of the stale cluster
>> > is written to clear the hash.
>> > This happens ofte
On Wed, Jan 9, 2013 at 5:32 PM, Eric Blake wrote:
> On 01/09/2013 09:16 AM, Stefan Hajnoczi wrote:
>
>>> I.6) max refcount reached
>>> The L2 hash block of the cluster is written in order to remember at next
>>> startup that it must not be used anymore for deduplication. The hash is
>>> dropped
> > Two GTrees are used to give access to the hashes: one indexed by hash and
> > the other indexed by physical offset.
>
> What is the GTree indexed by physical offset used for?
I think I can get rid of the second GTree for RAM-based deduplication.
It needs to:
-Start qcow2 with the deduplicati
On Wed, Jan 9, 2013 at 4:24 PM, Benoît Canet wrote:
> Here is a mail to open a discussion on QCOW2 deduplication design and
> performance.
>
> The current deduplication strategy is RAM-based.
> One of the goals of the project is to plan and implement an alternative way to
> do the lookups from di
>
> What is the GTree indexed by physical offset used for?
It's used for two things: deletion and loading of the hashes.
-Deletion is a hook in the refcount code that triggers when zero is reached.
The only information the code has is the physical offset of the cluster about
to be discarded. The hash m
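
To illustrate the two-index arrangement discussed in this reply, here is a
minimal Python sketch, not the actual qcow2 code (which is C and uses GLib
GTrees). It keeps one mapping keyed by hash and one keyed by physical offset,
so that a refcount-zero hook that only knows the offset can still find and
drop the matching hash. The names `DedupIndex` and `on_refcount_zero`, and the
choice of SHA-256, are assumptions made for illustration only.

```python
import hashlib


class DedupIndex:
    """Toy in-RAM dedup index with two lookup paths (hypothetical names)."""

    def __init__(self):
        self.by_hash = {}    # cluster hash -> physical offset
        self.by_offset = {}  # physical offset -> cluster hash

    def insert(self, data, offset):
        # The hash function is an assumption; the thread does not name one here.
        digest = hashlib.sha256(data).digest()
        self.by_hash[digest] = offset
        self.by_offset[offset] = digest
        return digest

    def lookup(self, data):
        # Write path: is there already a cluster with identical contents?
        return self.by_hash.get(hashlib.sha256(data).digest())

    def on_refcount_zero(self, offset):
        # Deletion hook: only the physical offset is known at this point,
        # so the offset index is what lets us find the hash to drop.
        digest = self.by_offset.pop(offset, None)
        if digest is None:
            return False
        self.by_hash.pop(digest, None)
        return True


index = DedupIndex()
index.insert(b"A" * 4096, 0x10000)
```

Without the offset-keyed mapping, the hook would have to scan every hash entry
to find the one pointing at the freed cluster.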
On 01/09/2013 09:16 AM, Stefan Hajnoczi wrote:
>> I.6) max refcount reached
>> The L2 hash block of the cluster is written in order to remember at next
>> startup that it must not be used anymore for deduplication. The hash is
>> dropped from the GTrees.
>
> Interesting case. This means
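
The "max refcount reached" case quoted above can be sketched roughly as
follows. This is only an illustration under stated assumptions: `Cluster`,
`add_reference`, and the `no_dedup` flag are hypothetical names, the flag
stands in for whatever is persisted in the L2 hash block, and the 16-bit
`MAX_REFCOUNT` limit is an assumption about the refcount width.

```python
MAX_REFCOUNT = 0xFFFF  # assumption: 16-bit refcounts


class Cluster:
    def __init__(self, digest):
        self.digest = digest
        self.refcount = 1
        # Stands in for the marker written to the L2 hash block so that the
        # cluster is skipped for deduplication after the next startup.
        self.no_dedup = False


def add_reference(by_hash, cluster):
    """Take one more reference; retire the hash once the refcount saturates."""
    cluster.refcount += 1
    if cluster.refcount >= MAX_REFCOUNT:
        cluster.no_dedup = True
        # Drop the hash from the in-RAM index so no new write can match it.
        by_hash.pop(cluster.digest, None)
    return cluster.refcount
```

After the hash is dropped, a later write with identical contents simply misses
the index and gets a fresh cluster.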
Hello,
Here is a mail to open a discussion on QCOW2 deduplication design and
performance.
The current deduplication strategy is RAM-based.
One of the goals of the project is to plan and implement an alternative way to do
the lookups from disk for bigger images.
I will in a first section enumerate