On 10/09/2018 01:21 AM, Max Reitz wrote:
> On 09.10.18 00:14, Vladimir Sementsov-Ogievskiy wrote:
>> On 10/09/2018 01:08 AM, Max Reitz wrote:
>>> On 09.10.18 00:02, Vladimir Sementsov-Ogievskiy wrote:
>>>> On 10/08/2018 11:51 PM, Max Reitz wrote:
>>>>> On 17.08.18 14:22, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> Rewrite a corrupted L2 table entry which references space outside
>>>>>> the underlying file.
>>>>>>
>>>>>> Make this L2 table entry read-as-all-zeros without any allocation.
>>>>>>
>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
>>>>>> ---
>>>>>>  block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
>>>>>>  1 file changed, 32 insertions(+)
>>>>>>
>>>>>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>>>>>> index 3c004e5bfe..3de3768a3c 100644
>>>>>> --- a/block/qcow2-refcount.c
>>>>>> +++ b/block/qcow2-refcount.c
>>>>>> @@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>>>              /* Mark cluster as used */
>>>>>>              csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
>>>>>>                  BDRV_SECTOR_SIZE;
>>>>>> +            if (csize > s->cluster_size) {
>>>>>> +                ret = fix_l2_entry_to_zero(
>>>>>> +                    bs, res, fix, l2_offset, i, active,
>>>>>> +                    "compressed cluster larger than cluster: size 0x%"
>>>>>> +                    PRIx64, csize);
>>>>>> +                if (ret < 0) {
>>>>>> +                    goto fail;
>>>>>> +                }
>>>>>> +                continue;
>>>>>> +            }
>>>>>> +
>>>>>
>>>>> This seems recoverable, doesn't it? Can we not try to just limit the
>>>>> csize, or decompress the cluster with the given csize from the given
>>>>> offset, disregarding the cluster limit?
>>>>
>>>> Hm, you want to assume that csize is corrupted but coffset may be
>>>> correct? Unlikely, I think.
>>>
>>> Better to reconstruct probably-garbage data than to write
>>> definitely-garbage data (all zeroes), is what I think.
>>>
>>>> So, to carefully repair csize, we should decompress one cluster (or one
>>>> cluster - 1 byte) of data, trying to get one cluster of decompressed
>>>> data. If we succeed, we know csize, or we can safely set it to one
>>>> cluster.
>>>
>>> Yes.
>>>
>>>> Or we can just set csize = 1 cluster if it is larger, and leave the
>>>> problems to real execution, which will lead to EIO in the worst case.
>>>
>>> Or this, yes.
>>>
>>>>>>              coffset = l2_entry & s->cluster_offset_mask &
>>>>>>                  ~(BDRV_SECTOR_SIZE - 1);
>>>>>> +            if (coffset >= bdrv_getlength(bs->file->bs)) {
>>>>>> +                ret = fix_l2_entry_to_zero(
>>>>>> +                    bs, res, fix, l2_offset, i, active,
>>>>>> +                    "compressed cluster out of file: offset 0x%" PRIx64,
>>>>>> +                    coffset);
>>>>>> +                if (ret < 0) {
>>>>>> +                    goto fail;
>>>>>> +                }
>>>>>> +                continue;
>>>>>> +            }
>>>>>> +
>>>>>>              ret = qcow2_inc_refcounts_imrt(bs, res, refcount_table,
>>>>>>                                             refcount_table_size,
>>>>>>                                             coffset, csize);
>>>>>> @@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>>>          {
>>>>>>              uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>>>>>>
>>>>>> +            if (offset >= bdrv_getlength(bs->file->bs)) {
>>>>>> +                ret = fix_l2_entry_to_zero(
>>>>>> +                    bs, res, fix, l2_offset, i, active,
>>>>>> +                    "cluster out of file: offset 0x%" PRIx64, offset);
>>>>>> +                if (ret < 0) {
>>>>>> +                    goto fail;
>>>>>> +                }
>>>>>> +                continue;
>>>>>> +            }
>>>>>> +
>>>>>
>>>>> These other two look OK, but they have another issue: if this is a v2
>>>>> image, you cannot create zero clusters, so you'll have to unallocate the
>>>>> cluster in that case.
>>>>
>>>> Oho, that's a problem. It may be unsafe to discard clusters, making the
>>>> backing image visible through the holes. What does discard do on v2?
>>>> Zeroing or holes?
>>>
>>> Oh, right! discard on v2 punches a hole.
>>> So I see three ways:
>>> (1) You can do the same and point to that bit of code, or
>>> (2) You allocate a data cluster full of zeroes in case of v2, or
>>> (3) You just error out.
>>>
>>> (3) doesn't seem like the worst option. Amending the image to be v3 is
>>> always possible and trivial.
>>
>> How to do it for a corrupted image?
>
> Oh, yeah, you can't open a corrupted image, can you... I suppose we
> want a way to force-clear the flag anyway. :-)
Hm, which flag?

> Max
