On Fri, 10 May 2019 16:10:13 +0200
Cornelia Huck <[email protected]> wrote:

> On Fri, 10 May 2019 00:11:12 +0200
> Halil Pasic <[email protected]> wrote:
> 
> > On Thu, 9 May 2019 12:11:06 +0200
> > Cornelia Huck <[email protected]> wrote:
> > 
> > > On Wed, 8 May 2019 23:22:10 +0200
> > > Halil Pasic <[email protected]> wrote:
> > >   
> > > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > > Sebastian Ott <[email protected]> wrote:  
> > >   
> > > > > > @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
> > > > > >             unregister_reboot_notifier(&css_reboot_notifier);
> > > > > >             goto out_unregister;
> > > > > >     }
> > > > > > +   cio_dma_pool_init();      
> > > > > 
> > > > > This is too late for early devices (ccw console!).    
> > > > 
> > > > You have already raised concern about this last time (thanks). I think,
> > > > I've addressed this issue: the cio_dma_pool is only used by the airq
> > > > stuff. I don't think the ccw console needs it. Please have an other look
> > > > at patch #6, and explain your concern in more detail if it persists.  
> > > 
> > > What about changing the naming/adding comments here, so that (1) folks
> > > aren't confused by the same thing in the future and (2) folks don't try
> > > to use that pool for something needed for the early ccw consoles?
> > >   
> > 
> > I'm all for clarity! Suggestions for better names?
> 
> css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
> need it?
> 

Ouch! I was considering using cio_dma_zalloc() for the adapter
interruption vectors, but in the end I fell between two stools.
So with this series there are no users of the cio_dma_pool.

I don't feel strongly about this going one way or the other.

What speaks against getting rid of the cio_dma_pool and sticking with
dma_alloc_coherent() is that we would waste a DMA page per vector, which
is a non-obvious side effect.
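
For example, with this series the vector allocation boils down to:

	size = iv_size(bits);	/* e.g. just 8 bytes for a 64 bit vector */
	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
					&iv->vector_dma, GFP_KERNEL);
	/*
	 * dma_alloc_coherent() works in page granularity, so even such a
	 * small vector ends up pinning a full 4 KiB DMA page.
	 */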

What speaks against the cio_dma_pool is that it is slightly more code,
and that it currently cannot be used for very early stuff, which I don't
think is relevant here. What also used to speak against it is that
allocations asking for more than a page would simply fail, but I have
addressed that in the patch I hacked up on top of the series and pasted
below. While at it, I addressed some other issues as well.

I've also got code that deals with AIRQ_IV_CACHELINE by turning the
kmem_cache into a dma_pool.
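
Roughly along these lines (untested sketch, just to show the idea; I'm
keeping vector_dma around for this case and reusing the airq_iv_cache
name for the pool):

	static struct dma_pool *airq_iv_cache;	/* instead of a kmem_cache */

	/* set up once the css device exists, e.g. from css_bus_init() */
	airq_iv_cache = dma_pool_create("airq_iv_cache",
					cio_get_dma_css_dev(),
					cache_line_size(),
					cache_line_size(), 0);

	/* airq_iv_create(), AIRQ_IV_CACHELINE case */
	iv->vector = dma_pool_zalloc(airq_iv_cache, GFP_KERNEL,
				     &iv->vector_dma);

	/* airq_iv_release() */
	dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);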

Cornelia, Sebastian, which approach do you prefer:
1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
vector, or
2) go with the approach taken by the patch below?


Regards,
Halil
-----------------------8<---------------------------------------------
From: Halil Pasic <[email protected]>
Date: Sun, 12 May 2019 18:08:05 +0200
Subject: [PATCH] WIP: use cio dma pool for airqs

Let's not waste a DMA page per adapter interrupt bit vector.
---
Lightly tested...
---
 arch/s390/include/asm/airq.h |  1 -
 drivers/s390/cio/airq.c      | 10 +++-------
 drivers/s390/cio/css.c       | 18 +++++++++++++++---
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index 1492d48..981a3eb 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -30,7 +30,6 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
 /* Adapter interrupt bit vector */
 struct airq_iv {
        unsigned long *vector;  /* Adapter interrupt bit vector */
-       dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
        unsigned long *avail;   /* Allocation bit mask for the bit vector */
        unsigned long *bitlock; /* Lock bit mask for the bit vector */
        unsigned long *ptr;     /* Pointer associated with each bit */
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index 7a5c0a0..f11f437 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -136,8 +136,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
                goto out;
        iv->bits = bits;
        size = iv_size(bits);
-       iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
-                                                &iv->vector_dma, GFP_KERNEL);
+       iv->vector = cio_dma_zalloc(size);
        if (!iv->vector)
                goto out_free;
        if (flags & AIRQ_IV_ALLOC) {
@@ -172,8 +171,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
        kfree(iv->ptr);
        kfree(iv->bitlock);
        kfree(iv->avail);
-       dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
-                         iv->vector_dma);
+       cio_dma_free(iv->vector, size);
        kfree(iv);
 out:
        return NULL;
@@ -189,9 +187,7 @@ void airq_iv_release(struct airq_iv *iv)
        kfree(iv->data);
        kfree(iv->ptr);
        kfree(iv->bitlock);
-       kfree(iv->vector);
-       dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
-                         iv->vector, iv->vector_dma);
+       cio_dma_free(iv->vector, iv_size(iv->bits));
        kfree(iv->avail);
        kfree(iv);
 }
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index 7087cc3..88d9c92 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -1063,7 +1063,10 @@ struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
 static void __gp_dma_free_dma(struct gen_pool *pool,
                              struct gen_pool_chunk *chunk, void *data)
 {
-       dma_free_coherent((struct device *) data, PAGE_SIZE,
+
+       size_t chunk_size = chunk->end_addr - chunk->start_addr + 1;
+
+       dma_free_coherent((struct device *) data, chunk_size,
                         (void *) chunk->start_addr,
                         (dma_addr_t) chunk->phys_addr);
 }
@@ -1088,13 +1091,15 @@ void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
 {
        dma_addr_t dma_addr;
        unsigned long addr = gen_pool_alloc(gp_dma, size);
+       size_t chunk_size;
 
        if (!addr) {
+               chunk_size = round_up(size, PAGE_SIZE);
                addr = (unsigned long) dma_alloc_coherent(dma_dev,
-                                       PAGE_SIZE, &dma_addr, CIO_DMA_GFP);
+                                        chunk_size, &dma_addr, CIO_DMA_GFP);
                if (!addr)
                        return NULL;
-               gen_pool_add_virt(gp_dma, addr, dma_addr, PAGE_SIZE, -1);
+               gen_pool_add_virt(gp_dma, addr, dma_addr, chunk_size, -1);
                addr = gen_pool_alloc(gp_dma, size);
        }
        return (void *) addr;
@@ -1108,6 +1113,13 @@ void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
        gen_pool_free(gp_dma, (unsigned long) cpu_addr, size);
 }
 
+/**
+ * cio_dma_zalloc - allocate DMA memory from the css global pool
+ * @size: number of bytes to allocate
+ *
+ * Intended for memory not specific to any single device within the css.
+ * Caution: not suitable for early stuff like the ccw console.
+ */
 void *cio_dma_zalloc(size_t size)
 {
        return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);
-- 
2.5.5
