δΊ 10/19/2012 12:10 AM, Eric Blake ει: > On 10/18/2012 03:51 AM, Dong Xu Wang wrote: >> Document for add-cow format, the usage and spec of add-cow are introduced. >> >> Signed-off-by: Dong Xu Wang <wdon...@linux.vnet.ibm.com> >> --- >> docs/specs/add-cow.txt | 139 >> ++++++++++++++++++++++++++++++++++++++++++++++++ >> 1 files changed, 139 insertions(+), 0 deletions(-) >> create mode 100644 docs/specs/add-cow.txt >> >> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt >> new file mode 100644 >> index 0000000..dc1e107 >> --- /dev/null >> +++ b/docs/specs/add-cow.txt >> @@ -0,0 +1,139 @@ >> +== General == >> + >> +The raw file format does not support backing files or copy on write feature. >> +The add-cow image format makes it possible to use backing files with raw > > s/with raw/with a raw/ > Okay.
>> +image by keeping a separate .add-cow metadata file. Once all sectors >> +have been written into the raw image it is safe to discard the .add-cow >> +and backing files, then we can use the raw image directly. >> + >> +An example usage of add-cow would look like:: >> +(ubuntu.img is a disk image which has been installed OS.) > > s/has been installed/has an installed/ Okay. > >> + 1) Create a raw image with the same size of ubuntu.img >> + qemu-img create -f raw test.raw 8G >> + 2) Create an add-cow image which will store dirty bitmap >> + qemu-img create -f add-cow test.add-cow \ >> + -o backing_file=ubuntu.img,image_file=test.raw >> + 3) Run qemu with add-cow image >> + qemu -drive if=virtio,file=test.add-cow >> + >> +test.raw may be larger than ubuntu.img, in that case, the size of >> test.add-cow >> +will be calculated from the size of test.raw. >> + >> +=Specification= >> + >> +The file format looks like this: >> + >> + +---------------+-------------+-----------------+ >> + | Header | Reserved | COW bitmap | >> + +---------------+-------------+-----------------+ > > Since you call out what all 4096 bytes must be in the Header section > (all bytes not occupied by a backing file name or format must be NUL), > do your really need Reserved in this section, or can you just claim that > the 4096-byte header is directly followed by the COW bitmap? > Okay, I think Header + COW would be enough. >> + >> +All numbers in add-cow are stored in Little Endian byte order. >> + >> +== Header == >> + >> +The Header is included in the first bytes: >> +(#define HEADER_SIZE (4096 * header_size)) >> + Byte 0 - 7: magic >> + add-cow magic string ("ADD_COW\xff"). >> + >> + 8 - 11: version >> + Version number (only valid value is 1 now). >> + >> + 12 - 15: backing file name offset >> + Offset in the add-cow file at which the backing file >> + name is stored (NB: The string is not >> nul-terminated). >> + If backing file name does NOT exist, this field >> will be >> + 0. Must be between 80 and [HEADER_SIZE - 2](a file >> name >> + must be at least 1 byte). >> + >> + 16 - 19: backing file name size >> + Length of the backing file name in bytes. It will >> be 0 >> + if the backing file name offset is 0. If backing >> file >> + name offset is non-zero, then it must be non-zero. >> Must >> + be less than [HEADER_SIZE - 80] to fit in the >> reserved >> + part of the header. > > More specifically, it must be small enough so that offset+size <= > HEADER_SIZE. Okay. > >> + >> + 20 - 23: image file name offset >> + Offset in the add-cow file at which the image file >> name >> + is stored (NB: The string is not null terminated). >> It >> + must be between 80 and [HEADER_SIZE - 2]. >> + >> + 24 - 27: image file name size >> + Length of the image file name in bytes. >> + Must be less than [HEADER_SIZE - 80] to fit in the >> reserved >> + part of the header. > > More specifically, it must be small enough so that offset+size <= > HEADER_SIZE. > Okay. >> + >> + 28 - 31: cluster bits >> + Number of bits that are used for addressing an >> offset >> + within a cluster (1 << cluster_bits is the cluster >> size). >> + Must not be less than 9 (i.e. 512 byte clusters). >> + >> + Note: qemu as of today has an implementation limit >> of 2 MB >> + as the maximum cluster size and won't be able to >> open images >> + with larger cluster sizes. >> + >> + 32 - 39: features >> + Bitmask of features. An implementation can safely >> ignore >> + any unknown bits that are set. > > Really? That sounds more like optional features, if an implementation > can ignore a set bit. You really want to require that implementations > reject operations on a file with a feature bit set that they don't > recognize. > Yep, I should reject the file if un-recognized bits are set. >> + >> + Bit 0: All allocated bit. If this bit is set >> then >> + backing file and COW bitmap will not be >> used, >> + and can read from or write to image >> file directly. > > And this particular bit sounds like an optional feature - setting the > bit is an optimization in speed, but leaving the bit clear or ignoring > the bit when it is set does not change correctness. > Okay, but if this bit is categorized to optional feature, I will be no features bit then... >> + >> + Bits 1-63: Reserved (set to 0) >> + >> + 40 - 47: optional features >> + Not used now. Reserved for future use. It must be >> set to 0. >> + And must be ignored while reading. > > s/. And must be ignored/, and ignored/ > Okay. >> + >> + 48 - 51: header size >> + The header field is variable-sized. This field >> indicates >> + how many 4096 bytes will be used to store add-cow >> header. >> + In add-cow v1, it is fixed to 1, so the header size >> will >> + be 4096 * 1 = 4096 bytes. > > So is the value '1' or '4096' in this field? The wording isn't quite > clear. But reading elsewhere, it looks like this should always be '1' > in version one add-cow. > I mean '1' in this field, I will describe more clearly in future patch. >> + >> + 52 - 67: backing file format >> + Format of backing file. It will be filled with 0 if >> + backing file name offset is 0. If backing file name >> + offset is non-empty, it must be non-empty. > > Are you going to enforce this? Normally, if backing file format is > omitted, then qemu knows how to probe backing file format (with the > caveat that it is a security risk if the probe returns non-raw but the > backing file really was raw). > To avoid the security risk I add this field in header, can I do anything to enforce this more? >> It is coded >> + in free-form ASCII, and is not NUL-terminated. Zero >> + padded on the right. >> + >> + 68 - 83: image file format >> + Format of image file. It must be non-empty. It is >> coded >> + in free-form ASCII, and is not NUL-terminated. Zero >> + padded on the right. > > Why do we need this field? Isn't the image file ALWAYS raw? > In v1, it only supports raw as a image_file, but in future it might support other formats which do not support COW, so I add image file format here. >> + >> + 84 - [HEADER_SIZE - 1]: > > Elsewhere in your spec, you use 80 rather than 84 for the starting point > of valid offsets. Which is it? Sorry, I add "cluster bits", which is 4 bytes, and I forgot changing 80 to 84. Will correct. > >> + It is used to make sure COW bitmap field starts at >> the >> + HEADER_SIZE byte, backing file name and image file >> name >> + will be stored here. The bytes that is not pointing >> to >> + backing file and image file names must be set to 0. > > Not just file names, but also backing file format. > Okay. >> + >> +== COW bitmap == >> + >> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap >> related to > > Shouldn't that be HEADER_SIZE * number of header pages, since you > dedicated a field in the header for that purpose? > There is one line in this doc: (#define HEADER_SIZE (4096 * header_size)) HEADER_SIZE and header_size are not good names, will correct. >> +backing file and image file. The bitmap will track whether the sector in >> +backing file is dirty or not. >> + >> +Each bit in the bitmap tracks one cluster's status. For example, if cluster >> +bit is 16, then each bit tracks one cluster, (1 >> 16) = 65536 bytes. The >> size > > s/>>/<</ > Okay. >> +of bitmap is calculated according to virtual size of image file, and it must >> +be multiple of 65536, the bits not used will be set to 0. Within each byte, > > What must be a multiple of 65536, the image file, or the size of the > bitmap in the add-cow file? I think what you want is: > I mean bitmap must be multiple for 65536. Will describe more clearly. > The image size is rounded up to cluster size (where any bytes in the > last cluster that do not fit in the image are ignored), then if the > number of clusters is not a multiple of 8, then remaining bits in the > bitmap will be set to 0. > > Or do you really want to require that the bitmap is a multiple of 64k > bytes (at 8 bits per byte, that means the bitmap covers a multiple of > 512k clusters, and at 512 bytes as the minimum cluster size, that the > add-cow file format manages a minimum of 256M)? That is, are you > requiring that the bitmap end on an aligned boundary, to make the bitmap > easier to use without having to special case a short-read on the last > page of the bitmap? > I think my description was wrong. add-cow is using block-cache.c, which is from qcow2-cache.c. I was not to change qcow2-cache.c to a great extent, so I use it directly, and it uses cluster_size as writing unit: ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table, s->cluster_size); So the size of bitmap would be multiple of cluster_size of add-cow, do you think if it is acceptable? >> +the least significant bit covers the first cluster. Bit orders in one >> +byte look like: >> + +----+----+----+----+----+----+----+----+ >> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 | >> + +----+----+----+----+----+----+----+----+ >> + >> +If the bit is 0, indicates the sector has not been allocated in image file, >> data >> +should be loaded from backing file while reading; if the bit is 1, >> indicates the > > s/indicates/it indicates/ (twice) > Okay. > If there is no backing file, or if the image file is larger than the > backing file and the offset is beyond the end of the backing file, then > the data should be read as all zero bytes instead. > Okay. >> +related sector has been dirty, should be loaded from image file while >> reading. >> +Writing to a sector causes the corresponding bit to be set to 1. >> + >> +If raw image is not an even multiple of cluster bytes, bits that correspond >> to >> +bytes beyond the raw file size in add-cow must be written as 0 and must be >> +ignored when reading. >> + >> +Image file name and backing file name must NOT be the same, we prevent this >> +while creating add-cow files. > > You prevent it when creating add-cow files via qemu-img, but that > doesn't stop malicious users from creating a file with those properties > and then trying to get you to parse it as add-cow. I think this needs > to instead be a requirement on the consumer of a potentially bad file, > and not a requirement on the producer to avoid bad files, since you > can't control all producers, as in: > If image file name and backing file resolve to the same file, the > add-cow image must be treated as invalid. Okay, I will check it while opening add-cow file. > Thank you Eric.