Re: [PATCH v5 03/13] mm/shmem: Support memfile_notifier

2022-03-10 Thread Dave Chinner
) and so could generate both notify_invalidate() and notify_populate() events. Hence "fallocate" as an internal mm namespace or operation does not belong anywhere in core MM infrastructure - it should never get used anywhere other than the VFS/filesystem layers that implement the fallocate() syscall or use it directly. Cheers, Dave. -- Dave Chinner da...@fromorbit.com
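For illustration, a minimal userspace sketch (not from the patch series; file name and sizes are placeholders) of the two classes of fallocate() operation being referred to: a hole punch, which must invalidate the affected range, and a plain preallocation, which populates it:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/falloc.h>

int main(void)
{
    int fd = open("image.raw", O_RDWR);

    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Punch a hole: any cached data over the range must be invalidated
     * before the blocks are freed. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  0, 1 << 20) < 0)
        perror("punch hole");

    /* Preallocate: the range is populated (as unwritten extents) without
     * disturbing any existing data. */
    if (fallocate(fd, 0, 0, 1 << 20) < 0)
        perror("preallocate");

    close(fd);
    return 0;
}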

Re: [PATCH 0/1] qcow2: Skip copy-on-write when allocating a zero cluster

2020-08-23 Thread Dave Chinner
usters to the end of the file (i.e. the file itself is not sparse), while the extent size hint will just add 64kB extents into the file around the write offset. That demonstrates the other behavioural advantage of extent size hints: they avoid needing to extend the file, which is yet another way to serialise concurrent IO and create IO pipeline stalls... Cheers, Dave. -- Dave Chinner da...@fromorbit.com
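For reference, a minimal sketch (not from the thread; the file path is a placeholder) of setting a 64kB extent size hint from userspace via the FS_IOC_FSSETXATTR ioctl - roughly what xfs_io's "extsize" command does. XFS generally only accepts the hint on a file that has no extents allocated yet:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>           /* struct fsxattr, FS_IOC_FS[GS]ETXATTR */

int main(int argc, char **argv)
{
    struct fsxattr fsx;
    int fd;

    if (argc != 2)
        return 1;
    fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0) {
        perror("FS_IOC_FSGETXATTR");
        return 1;
    }

    fsx.fsx_xflags |= FS_XFLAG_EXTSIZE;     /* use a fixed allocation unit */
    fsx.fsx_extsize = 64 * 1024;            /* 64kB, in bytes */

    if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0) {
        perror("FS_IOC_FSSETXATTR");
        return 1;
    }
    close(fd);
    return 0;
}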

Re: [PATCH 0/1] qcow2: Skip copy-on-write when allocating a zero cluster

2020-08-23 Thread Dave Chinner
you what the device and filesystem are doing in real time (e.g. I use PCP for this and visualise the behaviour in real time via pmchart) gives a lot of insight into exactly what is changing during transient workload changes like starting a benchmark... > I was running fio with --ramp_time=5 which ignores the first 5 seconds > of data in order to let performance settle, but if I remove that I can > see the effect more clearly. I can observe it with raw files (in 'off' > and 'prealloc' modes) and qcow2 files in 'prealloc' mode. With qcow2 and > preallocation=off the performance is stable during the whole test. What does "preallocation=off" mean again? Is that using fallocate(ZERO_RANGE) prior to the data write rather than preallocating the metadata/entire file? If so, I would expect the limiting factor is the rate at which IO can be issued because of the fallocate()-triggered pipeline bubbles. That leaves idle device time so you're not pushing the limits of the hardware and hence none of the behaviours above will be evident... Cheers, Dave. -- Dave Chinner da...@fromorbit.com
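As a concrete reference for the question above, an illustrative sketch of the "fallocate(ZERO_RANGE) prior to the data write" pattern - whether qemu actually issues exactly this is the open question; file names and sizes are placeholders:

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <linux/falloc.h>

/* Zero an entire cluster, then write the data into it.  Each fallocate()
 * call is a synchronous extent manipulation that can stall the IO
 * pipeline as described above. */
static ssize_t write_into_new_cluster(int fd, const void *buf, size_t len,
                                      off_t cluster_off, off_t cluster_size)
{
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, cluster_off, cluster_size) < 0)
        return -1;
    return pwrite(fd, buf, len, cluster_off);
}

int main(void)
{
    char data[4096] = { 0 };
    int fd = open("scratch.img", O_RDWR | O_CREAT, 0644);

    if (fd < 0)
        return 1;
    if (write_into_new_cluster(fd, data, sizeof(data), 0, 64 * 1024) < 0)
        return 1;
    close(fd);
    return 0;
}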

Re: [PATCH 0/1] qcow2: Skip copy-on-write when allocating a zero cluster

2020-08-20 Thread Dave Chinner
. Thing is, once your writes into sparse image files regularly start hitting written extents, the performance of (1), (2) and (4) will trend towards (5) as writes hit already allocated ranges of the file and the serialisation of extent mapping changes goes away. This occurs with guest filesystems th

Re: [Qemu-devel] [PATCH v6 6/6] xfs: disable map_sync for async flush

2019-04-23 Thread Dave Chinner
onous.
>  */

/*
 * This is the correct multi-line comment format. Please
 * update the patch to maintain the existing comment format.
 */

Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [Qemu-devel] [PATCH v4 5/5] xfs: disable map_sync for async flush

2019-04-03 Thread Dave Chinner
tor out all the "MAP_SYNC supported" checks into a helper so that the filesystem code just doesn't have to care about the details of checking for DAX+MAP_SYNC support. Cheers, Dave. -- Dave Chinner da...@fromorbit.com
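For context on what is being checked: MAP_SYNC is requested from userspace roughly as below (an illustrative sketch, not from the patch; the path is a placeholder, and it assumes a libc new enough to define MAP_SYNC and MAP_SHARED_VALIDATE, otherwise <linux/mman.h> is needed). The kernel-side checks being discussed decide whether this mmap() succeeds or fails with EOPNOTSUPP:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/mnt/pmem/file", O_RDWR);
    void *p;

    if (fd < 0)
        return 1;

    /* MAP_SYNC must be paired with MAP_SHARED_VALIDATE so an older
     * kernel cannot silently ignore the flag. */
    p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
             MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (p == MAP_FAILED)
        perror("mmap(MAP_SYNC)");   /* e.g. EOPNOTSUPP without DAX+sync */

    close(fd);
    return 0;
}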

Re: [Qemu-devel] security implications of caching with virtio pmem (was Re: [PATCH v3 0/5] kvm "virtio pmem" device)

2019-02-11 Thread Dave Chinner
or daring to ask hard questions about this topic. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [Qemu-devel] [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-15 Thread Dave Chinner
wn, which makes it very difficult for admins to manage. > We are also planning to support qcow2 sparse image format at > host side with virtio-pmem. So you're going to be remapping a huge number of disjoint regions into a linear pmem mapping? ISTR discussions about similar things for virtio+fuse+dax that came up against "large numbers of mapped regions don't scale" and so it wasn't a practical solution compared to just using raw sparse files. > - There is no existing solution for Qemu persistent memory > emulation with write support currently. This solution provides > us the paravirtualized way of emulating persistent memory. Sure, but the question is why do you need to create an emulation that doesn't actually perform like pmem? The whole point of pmem is performance, and emulating pmem by mmap() of a file on spinning disks is going to be horrible for performance. Even on SSDs it's going to be orders of magnitude slower than real pmem. So exactly what problem are you trying to solve with this driver? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [Qemu-devel] [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-14 Thread Dave Chinner
On Mon, Jan 14, 2019 at 01:35:57PM -0800, Dan Williams wrote: > On Mon, Jan 14, 2019 at 1:25 PM Dave Chinner wrote: > > > > On Mon, Jan 14, 2019 at 02:15:40AM -0500, Pankaj Gupta wrote: > > > > > > > > Until you have images (and hence host page cache) s

Re: [Qemu-devel] [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-14 Thread Dave Chinner
ache exceptional > entries. > Its solely decision of host to take action on the host page cache pages. > > In case of virtio-pmem, guest does not modify host file directly i.e. don't > perform hole punch & truncation operation directly on host file. ... this will no longer be true, and the nuclear landmine in this driver interface will have been armed. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [Qemu-devel] [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-13 Thread Dave Chinner
On Sun, Jan 13, 2019 at 03:38:21PM -0800, Matthew Wilcox wrote: > On Mon, Jan 14, 2019 at 10:29:02AM +1100, Dave Chinner wrote: > > Until you have images (and hence host page cache) shared between > > multiple guests. People will want to do this, because it means they > > only

Re: [Qemu-devel] [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-13 Thread Dave Chinner
y of the same set of pages. If the guests can then, in any way, control eviction of the pages from the host cache, then we have a guest-to-guest information leak channel. i.e. it's something we need to be aware of and really careful about enabling infrastructure that /will/ be abused if guests can find a way to influence the host side cache residency. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [Qemu-devel] [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-09 Thread Dave Chinner
I might be wrong, but if I'm not we're going to have to be very careful about how guest VMs can access and manipulate host side resources like the page cache. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [Qemu-devel] semantics of FIEMAP without FIEMAP_FLAG_SYNC (was Re: [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server)

2016-07-22 Thread Dave Chinner
M operating on top of the filesystem before layout can be determined? All of the above are *valid* and *correct*, because the filesystem defines what FIEMAP returns for a given file offset. Just because ext4 and XFS have mostly the same behaviour, it doesn't mean that every other filesystem behaves the same way. The assumptions being made about FIEMAP behaviour will only lead to user data corruption, as they already have several times in the past. Cheers, Dave. -- Dave Chinner dchin...@redhat.com
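To make the point concrete, here is a minimal FIEMAP query (an illustrative sketch, not from the thread). Note how much per-extent state a caller has to interpret correctly, and that each filesystem is free to fill it in differently:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

int main(int argc, char **argv)
{
    unsigned int i, nr = 32;
    struct fiemap *fm;
    int fd;

    if (argc != 2)
        return 1;
    fd = open(argv[1], O_RDONLY);
    if (fd < 0)
        return 1;

    fm = calloc(1, sizeof(*fm) + nr * sizeof(struct fiemap_extent));
    if (!fm)
        return 1;
    fm->fm_start = 0;
    fm->fm_length = ~0ULL;              /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;    /* flush dirty data before mapping */
    fm->fm_extent_count = nr;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        perror("FS_IOC_FIEMAP");
        return 1;
    }

    /* The flags change what each extent means: UNWRITTEN extents are
     * allocated but read back as zeroes, DELALLOC extents have no on-disk
     * location yet, and so on. */
    for (i = 0; i < fm->fm_mapped_extents; i++)
        printf("logical %llu len %llu flags 0x%x\n",
               (unsigned long long)fm->fm_extents[i].fe_logical,
               (unsigned long long)fm->fm_extents[i].fe_length,
               fm->fm_extents[i].fe_flags);

    free(fm);
    close(fd);
    return 0;
}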

Re: [Qemu-devel] semantics of FIEMAP without FIEMAP_FLAG_SYNC (was Re: [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server)

2016-07-21 Thread Dave Chinner
On Thu, Jul 21, 2016 at 01:31:21PM +0100, Pádraig Brady wrote: > On 21/07/16 12:43, Dave Chinner wrote: > > On Wed, Jul 20, 2016 at 03:35:17PM +0200, Niels de Vos wrote: > >> Oh... And I was surprised to learn that "cp" does use FIEMAP and not > >> SEEK_HOLE/SE

Re: [Qemu-devel] semantics of FIEMAP without FIEMAP_FLAG_SYNC (was Re: [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server)

2016-07-21 Thread Dave Chinner
exactly what you need. Using FIEMAP, fallocate and moving data through userspace won't ever be reliable without special filesystem help (that only exists for XFS right now), nor will it enable the application to transparently use smart storage protocols and hardware when it is present on user systems. Cheers, Dave. -- Dave Chinner dchin...@redhat.com
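For comparison, an in-kernel copy interface such as copy_file_range() is the kind of "filesystem help" that lets the kernel pick the mechanism itself (reflink, storage offload, or a plain data copy) instead of userspace guessing from FIEMAP output. An illustrative sketch, not something proposed in this thread; paths come from the command line:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    ssize_t n;
    int in, out;

    if (argc != 3)
        return 1;
    in = open(argv[1], O_RDONLY);
    out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in < 0 || out < 0) {
        perror("open");
        return 1;
    }

    /* The kernel/filesystem decides how each chunk is copied; NULL
     * offsets mean the normal file offsets are used and advanced. */
    while ((n = copy_file_range(in, NULL, out, NULL, 1 << 20, 0)) > 0)
        ;
    if (n < 0)
        perror("copy_file_range");

    close(in);
    close(out);
    return n < 0 ? 1 : 0;
}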

Re: [Qemu-devel] semantics of FIEMAP without FIEMAP_FLAG_SYNC (was Re: [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server)

2016-07-21 Thread Dave Chinner
On Wed, Jul 20, 2016 at 03:35:17PM +0200, Niels de Vos wrote: > On Wed, Jul 20, 2016 at 10:30:25PM +1000, Dave Chinner wrote: > > On Wed, Jul 20, 2016 at 05:19:37AM -0400, Paolo Bonzini wrote: > > > Adding ext4 and XFS guys (Lukas and Dave respectively). As a quick > >

Re: [Qemu-devel] semantics of FIEMAP without FIEMAP_FLAG_SYNC (was Re: [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server)

2016-07-20 Thread Dave Chinner
s will report clean unwritten extents as data. 3. Maybe - if there is written data in memory over the unwritten extent on disk (i.e. it hasn't been flushed to disk), it will be considered a data region with non-zero data. (FIEMAP will still report it as unwritten) > If not, would > it be acceptable to introduce Linux-specific SEEK_ZERO/SEEK_NONZERO, which > would be similar to what SEEK_HOLE/SEEK_DATA do now? To solve what problem? You haven't explained what problem you are trying to solve yet. > 2) for FIEMAP do we really need FIEMAP_FLAG_SYNC? And if not, for what > filesystems and kernel releases is it really not needed? I can't answer this question, either, because I don't know what you want the fiemap information for. Cheers, Dave. -- Dave Chinner dchin...@redhat.com
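For reference, a minimal sketch (illustrative, not from the thread) of how an application walks a file with SEEK_DATA/SEEK_HOLE today; as described above, "data" can include clean unwritten extents on some filesystems and dirty in-memory data sitting over unwritten extents:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    off_t end, hole, data = 0;
    int fd;

    if (argc != 2)
        return 1;
    fd = open(argv[1], O_RDONLY);
    if (fd < 0)
        return 1;
    end = lseek(fd, 0, SEEK_END);

    while (data < end) {
        data = lseek(fd, data, SEEK_DATA);  /* fails with ENXIO when no data remains */
        if (data < 0)
            break;
        hole = lseek(fd, data, SEEK_HOLE);  /* there is always an implicit hole at EOF */
        printf("data segment: %lld..%lld\n", (long long)data, (long long)hole);
        data = hole;
    }
    close(fd);
    return 0;
}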

Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-17 Thread Dave Chinner
On Fri, Jul 15, 2016 at 03:55:20PM +0800, Zhangfei Gao wrote: > Dear Dave > > On Wed, Jul 13, 2016 at 7:03 AM, Dave Chinner wrote: > > On Tue, Jul 12, 2016 at 12:43:24PM -0400, Theodore Ts'o wrote: > >> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei

Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-12 Thread Dave Chinner
> > Given that I'm regularly testing ext4 using kvm, and I haven't seen > anything like this in a very long time, I suspect the problem is with > your SCSI code, and not with ext4. It's the same error I reported yesterday for ext3 on 4.7-rc6 when rebooting a VM after it hung. Cheers, Dave. -- Dave Chinner da...@fromorbit.com