Re: [Xen-devel] [PATCH] dma-mapping: remove unused attrs parameter to dma_common_get_sgtable

2019-01-03 Thread h...@lst.de
On Fri, Jan 04, 2019 at 01:45:26AM +, Huaisheng HS1 Ye wrote:
> From: Stefano Stabellini 
> Sent: Friday, January 04, 2019 1:55 AM
> > On Thu, 3 Jan 2019, Huaisheng Ye wrote:
> > > From: Huaisheng Ye 
> > >
> > > dma_common_get_sgtable has parameter attrs which is not used at all.
> > > Remove it.
> > >
> > > Signed-off-by: Huaisheng Ye 
> > 
> > Acked-by: Stefano Stabellini 
> > 
> > FYI the patch doesn't apply cleanly to master.
> 
> Got it, I will rebase it to branch master and resend the patch later.

I think we can skip it.  For the next merge window I plan to remove
the fallbacks for get_sgtable and mmap and instead wire everything up
explicitly.  And to be a possible method instance it will have to match
the prototype and thus keep the unused argument.

___
Xen-devel mailing list
[email protected]
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [PATCH V3 11/13] HV/IOMMU: Enable swiotlb bounce buffer for Isolation VM

2021-08-19 Thread h...@lst.de
On Thu, Aug 19, 2021 at 06:11:30PM +, Michael Kelley wrote:
> This function is manipulating page tables in the guest VM.  It is not involved
> in communicating with Hyper-V, or passing PFNs to Hyper-V.  The pfn array
> contains guest PFNs, not Hyper-V PFNs.  So it should use PAGE_SIZE
> instead of HV_HYP_PAGE_SIZE, and similarly PAGE_SHIFT and virt_to_pfn().
> If this code were ever to run on ARM64 in the future with PAGE_SIZE other
> than 4 Kbytes, the use of PAGE_SIZE is correct choice.

Yes.  I just stumled over this yesterday.  I think we can actually use a
nicer helper ased around vmap_range() to simplify this exactly because it
always uses the host page size.  I hope I can draft up a RFC today.




Re: [PATCH V3 12/13] HV/Netvsc: Add Isolation VM support for netvsc driver

2021-08-19 Thread h...@lst.de
On Thu, Aug 19, 2021 at 06:14:51PM +, Michael Kelley wrote:
> > +   if (!pfns)
> > +   return NULL;
> > +
> > +   for (i = 0; i < size / HV_HYP_PAGE_SIZE; i++)
> > +   pfns[i] = virt_to_hvpfn(buf + i * HV_HYP_PAGE_SIZE)
> > +   + (ms_hyperv.shared_gpa_boundary >> HV_HYP_PAGE_SHIFT);
> > +
> > +   vaddr = vmap_pfn(pfns, size / HV_HYP_PAGE_SIZE, PAGE_KERNEL_IO);
> > +   kfree(pfns);
> > +
> > +   return vaddr;
> > +}
> 
> This function appears to be a duplicate of hv_map_memory() in Patch 11 of this
> series.  Is it possible to structure things so there is only one 
> implementation?  In

So right now it it identical, but there is an important difference:
the swiotlb memory is physically contiguous to start with, so we can
do the simple remap using vmap_range as suggested in the last mail.
The cases here are pretty weird in that netvsc_remap_buf is called right
after vzalloc.  That is we create _two_ mappings in vmalloc space right
after another, where the original one is just used for establishing the
"GPADL handle" and freeing the memory.  In other words, the obvious thing
to do here would be to use a vmalloc variant that allows to take the
shared_gpa_boundary into account when setting up the PTEs.

And here is somthing I need help from the x86 experts:  does the CPU
actually care about this shared_gpa_boundary?  Or does it just matter
for the generated DMA address?  Does somehow have a good pointer to
how this mechanism works?



Re: [PATCH V3 13/13] HV/Storvsc: Add Isolation VM support for storvsc driver

2021-08-19 Thread h...@lst.de
On Thu, Aug 19, 2021 at 06:17:40PM +, Michael Kelley wrote:
> > +#define storvsc_dma_map(dev, page, offset, size, dir) \
> > +   dma_map_page(dev, page, offset, size, dir)
> > +
> > +#define storvsc_dma_unmap(dev, dma_range, dir) \
> > +   dma_unmap_page(dev, dma_range.dma,  \
> > +  dma_range.mapping_size,  \
> > +  dir ? DMA_FROM_DEVICE : DMA_TO_DEVICE)
> > +
> 
> Each of these macros is used only once.  IMHO, they don't
> add a lot of value.  Just coding dma_map/unmap_page()
> inline would be fine and eliminate these lines of code.

Yes, I had the same thought when looking over the code.  Especially
as macros tend to further obsfucate the code (compared to actual helper
functions).

> > +   for (i = 0; i < request->hvpg_count; i++)
> > +   storvsc_dma_unmap(&device->device,
> > +   request->dma_range[i],
> > +   
> > request->vstor_packet.vm_srb.data_in == READ_TYPE);
> 
> I think you can directly get the DMA direction as 
> request->cmd->sc_data_direction.

Yes.

> > 
> > @@ -1824,6 +1848,13 @@ static int storvsc_queuecommand(struct Scsi_Host 
> > *host, struct scsi_cmnd *scmnd)
> > payload->range.len = length;
> > payload->range.offset = offset_in_hvpg;
> > 
> > +   cmd_request->dma_range = kcalloc(hvpg_count,
> > +sizeof(*cmd_request->dma_range),
> > +GFP_ATOMIC);
> 
> With this patch, it appears that storvsc_queuecommand() is always
> doing bounce buffering, even when running in a non-isolated VM.
> The dma_range is always allocated, and the inner loop below does
> the dma mapping for every I/O page.  The corresponding code in
> storvsc_on_channel_callback() that does the dma unmap allows for
> the dma_range to be NULL, but that never happens.

Maybe I'm missing something in the hyperv code, but I don't think
dma_map_page would bounce buffer for the non-isolated case.  It
will just return the physical address.

> > +   if (!cmd_request->dma_range) {
> > +   ret = -ENOMEM;
> 
> The other memory allocation failure in this function returns
> SCSI_MLQUEUE_DEVICE_BUSY.   It may be debatable as to whether
> that's the best approach, but that's a topic for a different patch.  I
> would suggest being consistent and using the same return code
> here.

Independent of if SCSI_MLQUEUE_DEVICE_BUSY is good (it it a common
pattern in SCSI drivers), ->queuecommand can't return normal
negative errnos.  It must return the SCSI_MLQUEUE_* codes or 0.
We should probably change the return type of the method definition
to a suitable enum to make this more clear.

> > +   if (offset_in_hvpg) {
> > +   payload->range.offset = dma & 
> > ~HV_HYP_PAGE_MASK;
> > +   offset_in_hvpg = 0;
> > +   }
> 
> I'm not clear on why payload->range.offset needs to be set again.
> Even after the dma mapping is done, doesn't the offset in the first
> page have to be the same?  If it wasn't the same, Hyper-V wouldn't
> be able to process the PFN list correctly.  In fact, couldn't the above
> code just always set offset_in_hvpg = 0?

Careful.  DMA mapping is supposed to keep the offset in the page, but
for that the DMA mapping code needs to know what the device considers a
"page".  For that the driver needs to set the min_align_mask field in
struct device_dma_parameters.

> 
> The whole approach here is to do dma remapping on each individual page
> of the I/O buffer.  But wouldn't it be possible to use dma_map_sg() to map
> each scatterlist entry as a unit?  Each scatterlist entry describes a range of
> physically contiguous memory.  After dma_map_sg(), the resulting dma
> address must also refer to a physically contiguous range in the swiotlb
> bounce buffer memory.   So at the top of the "for" loop over the scatterlist
> entries, do dma_map_sg() if we're in an isolated VM.  Then compute the
> hvpfn value based on the dma address instead of sg_page().  But everything
> else is the same, and the inner loop for populating the pfn_arry is 
> unmodified.
> Furthermore, the dma_range array that you've added is not needed, since
> scatterlist entries already have a dma_address field for saving the mapped
> address, and dma_unmap_sg() uses that field.

Yes, I think dma_map_sg is the right thing to use here, probably even
for the non-isolated case so that we can get the hv drivers out of their
little corner and into being more like a normal kernel driver.  That
is, use the scsi_dma_map/scsi_dma_unmap helpers, and then iterate over
the dma addresses one page at a time using for_each_sg_dma_page.

> 
> One thing:  There's a maximum swiotlb mapping size, which I think works
> out to be 256 Kbytes.  See swiotlb_max_mapping_siz

Re: [PATCH V3 13/13] HV/Storvsc: Add Isolation VM support for storvsc driver

2021-08-24 Thread h...@lst.de
On Sat, Aug 21, 2021 at 02:04:11AM +0800, Tianyu Lan wrote:
> After dma_map_sg(), we still need to go through scatter list again to 
> populate payload->rrange.pfn_array. We may just go through the scatter list 
> just once if dma_map_sg() accepts a callback and run it during go
> through scatter list.

Iterating a cache hot array is way faster than doing lots of indirect
calls.



min_align_mask Re: [PATCH V3 13/13] HV/Storvsc: Add Isolation VM support for storvsc driver

2021-08-24 Thread h...@lst.de
On Fri, Aug 20, 2021 at 03:40:08PM +, Michael Kelley wrote:
> I see that the swiotlb code gets and uses the min_align_mask field.  But
> the NVME driver is the only driver that ever sets it, so the value is zero
> in all other cases.  Does swiotlb just use PAGE_SIZE in that that case?  I
> couldn't tell from a quick glance at the swiotlb code.

The encoding isn't all that common.  I only know it for the RDMA memory
registration format, and RDMA in general does not interact very well
with swiotlb (although the in-kernel drivers should work fine, it is
userspace RDMA that is the problem).  It seems recently a new driver
using the format (mpi3mr) also showed up.  All these should probably set
the min_align_mask.



Re: [RFC 1/1] swiotlb: Reduce calls to swiotlb_find_pool()

2024-06-26 Thread h...@lst.de
On Wed, Jun 26, 2024 at 11:58:13PM +, Michael Kelley wrote:
> > This patch trades off making many of the core swiotlb APIs take
> > an additional argument in order to avoid duplicating calls to
> > swiotlb_find_pool(). The current code seems rather wasteful in
> > making 6 calls per round-trip, but I'm happy to accept others'
> > judgment as to whether getting rid of the waste is worth the
> > additional code complexity.
> 
> Quick ping on this RFC.  Is there any interest in moving forward?
> Quite a few lines of code are affected because of adding the
> additional "pool" argument to several functions, but the change
> is conceptually pretty simple.

Yes, this looks sensible to me.  I'm tempted to apply it.




Re: [RFC 1/1] swiotlb: Reduce calls to swiotlb_find_pool()

2024-06-27 Thread h...@lst.de
On Thu, Jun 27, 2024 at 02:59:03PM +, Michael Kelley wrote:
> Conceptually, it's still being used as a boolean function based on
> whether the return value is NULL.  Renaming it to swiotlb_get_pool()
> more accurately describes the return value, but obscures the
> intent of determining if it is a swiotlb buffer.  I'll think about it.
> Suggestions are welcome.

Just keep is_swiotlb_buffer as a trivial inline helper that returns
bool.




Re: [RFC 1/1] swiotlb: Reduce calls to swiotlb_find_pool()

2024-06-27 Thread h...@lst.de
On Thu, Jun 27, 2024 at 04:02:59PM +, Michael Kelley wrote:
> > > Conceptually, it's still being used as a boolean function based on
> > > whether the return value is NULL.  Renaming it to swiotlb_get_pool()
> > > more accurately describes the return value, but obscures the
> > > intent of determining if it is a swiotlb buffer.  I'll think about it.
> > > Suggestions are welcome.
> > 
> > Just keep is_swiotlb_buffer as a trivial inline helper that returns
> > bool.
> 
> I don't understand what you are suggesting.  Could you elaborate a bit?
> is_swiotlb_buffer() can't be trivial when CONFIG_SWIOTLB_DYNAMIC
> is set.

Call the main function that finds and retuns the pool swiotlb_find_pool,
and then have a is_swiotlb_buffer wrapper that just returns bool.




Re: [RFC 1/1] swiotlb: Reduce calls to swiotlb_find_pool()

2024-06-29 Thread h...@lst.de
On Sat, Jun 29, 2024 at 03:55:58PM +, Michael Kelley wrote:
> Unless there is further discussion on this point, I'll just keep the original
> "is_swiotlb_buffer()" in v2.

That is the wrong name for something that returns the pool as pointed
out before.



Re: [RFC 1/1] swiotlb: Reduce calls to swiotlb_find_pool()

2024-06-30 Thread h...@lst.de
On Sun, Jun 30, 2024 at 02:02:52PM +, Michael Kelley wrote:
> 1) Rename is_swiotlb_buffer() to swiotlb_find_pool(), since it
> now returns a pool.  A NULL return value indicates that the
> paddr is not an swiotlb buffer.
> 
> 2) Similarly, rename is_xen_swiotlb_buffer() to
> xen_swiotlb_find_pool()
> 
> 3) The existing swiotlb_find_pool() has the same function signature,
> but it is used only where the paddr is known to be an swiotlb buffer
> and hence always succeeds. Rename it to __swiotlb_find_pool() as
> the "internal" version of swiotlb_find_pool().

Sounds good.

> 4) Do you still want is_swiotlb_buffer() as a trivial wrapper around
> the new swiotlb_find_pool(), for use solely in dma_direct_need_sync()
> where only a Boolean is needed and not the pool?

If there is really just a single caller left we can skip the wrapper,
otherwise it might be handy.