On 24.09.2016 20:32, Alex Bligh wrote:
>> On 24 Sep 2016, at 18:13, Vladimir Sementsov-Ogievskiy
>> <[email protected]> wrote:
>>
>> On 24.09.2016 19:49, Alex Bligh wrote:
>>>> On 24 Sep 2016, at 17:42, Vladimir Sementsov-Ogievskiy
>>>> <[email protected]> wrote:
>>>>
>>>> On 24.09.2016 19:31, Alex Bligh wrote:
>>>>>> On 24 Sep 2016, at 13:06, Vladimir Sementsov-Ogievskiy
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Note: if the disk size is not aligned to X, we will have to send a
>>>>>> request larger than the disk size to clear the whole disk.
>>>>> If you look at the block size extension, the size of the disk must be an
>>>>> exact multiple of the minimum block size. So that would work.
>>
>> This means that this extension could not be used with just any qcow2 disk,
>> as qcow2 may have a size that is not aligned to its cluster size:
>>
>> # qemu-img create -f qcow2 mega 1K
>> Formatting 'mega', fmt=qcow2 size=1024 encryption=off cluster_size=65536
>> lazy_refcounts=off refcount_bits=16
>> # qemu-img info mega
>> image: mega
>> file format: qcow2
>> virtual size: 1.0K (1024 bytes)
>> disk size: 196K
>> cluster_size: 65536
>> Format specific information:
>>     compat: 1.1
>>     lazy refcounts: false
>>     refcount bits: 16
>>     corrupt: false
>>
>> And there is no such restriction in the documentation. Otherwise we would
>> have to consider the sector size (512 bytes) as the block size for qcow2,
>> which is too small for our needs.
>
> If by "this extension" you mean the INFO extension (which reports block
> sizes), that's incorrect.
>
> An NBD server using a qcow2 file as the backend would report the sector size
> as the minimum block size. It might report the cluster size or the sector
> size as the preferred block size, or anything in between.
>
> The qcow2 cluster size essentially determines the allocation unit. NBD is
> not bothered as to the underlying allocation unit. It does not (currently)
> support the concept of making holes visible to the client.
> If you use NBD_CMD_WRITE_ZEROES you get zeroes, which might or might not be
> implemented as one or more holes or 'real' zeroes (save if you specify
> NBD_CMD_FLAG_NO_HOLE, in which case you are guaranteed to get 'real'
> zeroes). If you use NBD_CMD_TRIM, then the trimmed area might or might not
> be written with one or more holes. There is (currently) no way to detect
> the presence of holes separately from zeroes (though a bitmap extension
> was discussed).
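[Editorial illustration of the semantics described above. The backend class and handler below are hypothetical toys, not real nbd-server or qemu code; only the NBD_CMD_FLAG_NO_HOLE name and its meaning come from the WRITE_ZEROES extension draft.]

```python
NBD_CMD_FLAG_NO_HOLE = 1 << 1  # flag bit from the WRITE_ZEROES extension draft


class MemBackend:
    """Toy in-memory export, standing in for a real file or qcow2 backend."""

    def __init__(self, size):
        self.data = bytearray(b"\xff" * size)  # pretend there is old data
        self.holes = []                        # ranges punched as holes

    def write(self, offset, buf):
        self.data[offset:offset + len(buf)] = buf

    def punch_hole(self, offset, length):
        # A hole reads back as zeroes but allocates no storage.
        self.holes.append((offset, length))
        self.data[offset:offset + length] = b"\x00" * length


def handle_write_zeroes(backend, offset, length, flags):
    if flags & NBD_CMD_FLAG_NO_HOLE:
        # Client demanded 'real' zeroes: the server must actually write them.
        backend.write(offset, b"\x00" * length)
    else:
        # Otherwise the server is free to punch a hole instead.
        backend.punch_hole(offset, length)
```

Either way the client reads back zeroes; NBD_CMD_FLAG_NO_HOLE only constrains how the server is allowed to produce them.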
I just wanted to say that if we want the possibility of clearing the whole
disk in one request for qcow2, we have to take 512 bytes as the granularity
for such requests (i.e. X = 9). And this is too small: 1 TB would be the
upper bound for the request.

>>>> But there is no guarantee that disk_size/block_size < INT_MAX..
>>> I think you mean 2^32-1, but yes there is no guarantee of that. In that
>>> case you would need to break the call up into multiple calls.
>>>
>>> However, being able to break the call up into multiple calls seems pretty
>>> sensible given that NBD_CMD_WRITE_ZEROES may take a large amount of
>>> time, and a REALLY long time if the server doesn't support trim.
>>>
>>>> Maybe an additional option specifying the shift would be better, with
>>>> the convention that if offset+length exceeds the disk size, length is
>>>> recalculated as disk_size-offset.
>>> I don't think we should do that. We already have clear semantics that
>>> prevent operations beyond the end of the disk. Again, just break the
>>> command up into multiple commands. No great hardship.
>>
>> I agree that requests larger than the disk size are ugly.. But splitting
>> the request brings me again to the idea of having a separate command or
>> flag for clearing the whole disk without that dance. The server could
>> report availability of this command/flag only if the target driver
>> supports fast write_zeroes (qcow2 in our case).
> Why? In the general case you need to break up requests anyway (particularly
> with the INFO extension, where there is a maximum command size), and issuing
> a command over a TCP connection that might take hours or days to complete,
> with no hint of progress and no TCP traffic to keep NAT etc. alive, sounds
> like bad practice. The overhead is tiny.
>
> I would be against this change.

Full backup, for example: 1.
target can do fast write_zeroes: clear the whole disk (great if we can
   do it in one request, without splitting, etc.), then back up all data
   except zero or unallocated clusters (saving a lot of time by skipping
   them). 2. target cannot do fast write_zeroes: just back up all the data;
   we do not need to clear the disk, as clearing would not save us any time.

So here we do not need splitting in general: either clear the whole disk,
or do not clear it at all.

-- 
Best regards,
Vladimir

------------------------------------------------------------------------------
_______________________________________________
Nbd-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nbd-general
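[Editorial note: for reference, the request splitting discussed in the thread can be sketched as follows. This is a hypothetical helper, not real client code; MAX_LEN reflects the 32-bit NBD request length field, and min_block stands for the minimum block size the server would report via the INFO extension.]

```python
MAX_LEN = 0xFFFFFFFF  # upper bound of the 32-bit NBD request length field


def whole_disk_zero_requests(disk_size, min_block=512):
    """Yield (offset, length) WRITE_ZEROES requests covering [0, disk_size)."""
    # Largest per-request length that is still a multiple of min_block.
    chunk = (MAX_LEN // min_block) * min_block
    offset = 0
    while offset < disk_size:
        length = min(chunk, disk_size - offset)
        yield offset, length
        offset += length
```

Note that when disk_size is not a multiple of min_block (the unaligned qcow2 case raised earlier in the thread), the final request is simply shorter, so no request ever extends past the end of the disk.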
