RE: [RFC PATCH v1] devres: align devres.data strictly only for devm_kmalloc()
Hi Marc,

We sort of expected something like that to happen at some point. Funny
enough, it's been a year since my change was accepted in v4.20 and only
now has somebody noticed :) That said, quite a few questions below.

> Commit a66d972465d15 ("devres: Align data[] to ARCH_KMALLOC_MINALIGN")
> increased the alignment of devres.data unconditionally.
>
> Some platforms have very strict alignment requirements for DMA-safe
> addresses, e.g. 128 bytes on arm64. There, struct devres amounts to:
>     3 pointers + pad_to_128 + data + pad_to_256
> i.e. ~220 bytes of padding.

Could you please elaborate a bit on the mentioned paddings?
I can understand the first one, to 128 bytes, but where does the
second one, to 256 bytes, come from?

> Let's enforce the alignment only for devm_kmalloc().

OK, so for devm_kmalloc() we don't change anything, right?
We still add the same padding before the real data array.

> ---
> I had not been aware that dynamic allocation granularity on arm64 was
> 128 bytes. This means there's a lot of waste on small allocations.

Probably I'm missing something, but where do you expect to save anything?
If those smaller allocations are done with devm_kmalloc(), you aren't
saving anything.

> I suppose there's no easy solution, though.

Right! It took a while till I was able to propose something
people [almost silently] agreed with.

> ---
>  drivers/base/devres.c | 23 +--
>  1 file changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/base/devres.c b/drivers/base/devres.c
> index 0bbb328bd17f..bf39188613d9 100644
> --- a/drivers/base/devres.c
> +++ b/drivers/base/devres.c
> @@ -26,14 +26,7 @@ struct devres_node {
>
>  struct devres {
>  	struct devres_node node;
> -	/*
> -	 * Some archs want to perform DMA into kmalloc caches
> -	 * and need a guaranteed alignment larger than
> -	 * the alignment of a 64-bit integer.
> -	 * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
> -	 * buffer alignment as if it was allocated by plain kmalloc().
> -	 */
> -	u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
> +	u8 data[];
>  };
>
>  struct devres_group {
> @@ -789,9 +782,16 @@ static void devm_kmalloc_release(struct device *dev, void *res)
>  	/* noop */
>  }
>
> +#define DEVM_KMALLOC_PADDING_SIZE \
> +	(ARCH_KMALLOC_MINALIGN - sizeof(struct devres) % ARCH_KMALLOC_MINALIGN)

Even given your update with:

--->8---
#define DEVM_KMALLOC_PADDING_SIZE \
	((ARCH_KMALLOC_MINALIGN - sizeof(struct devres)) % ARCH_KMALLOC_MINALIGN)
--->8---

I don't think I understand why you need that "% ARCH_KMALLOC_MINALIGN"
part.

>  static int devm_kmalloc_match(struct device *dev, void *res, void *data)
>  {
> -	return res == data;
> +	/*
> +	 * 'res' is dr->data (not DMA-safe)
> +	 * 'data' is the hand-aligned address from devm_kmalloc
> +	 */
> +	return res + DEVM_KMALLOC_PADDING_SIZE == data;
>  }
>
>  /**
> @@ -811,6 +811,9 @@ void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
>  {
>  	struct devres *dr;
>
> +	/* Add enough padding to provide a DMA-safe address */
> +	size += DEVM_KMALLOC_PADDING_SIZE;

This implementation gets ugly and will potentially lead to problems later,
when people start changing the code here. Compared to that, dr->data
initially aligned by the compiler looks much more foolproof.

>  	/* use raw alloc_dr for kmalloc caller tracing */
>  	dr = alloc_dr(devm_kmalloc_release, size, gfp, dev_to_node(dev));
>  	if (unlikely(!dr))
> @@ -822,7 +825,7 @@ void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
>  	 */
>  	set_node_dbginfo(&dr->node, "devm_kzalloc_release", size);
>  	devres_add(dev, dr->data);
> -	return dr->data;
> +	return dr->data + DEVM_KMALLOC_PADDING_SIZE;

Ditto.

But first I'd like to understand what you are really trying to do
with your change, and then we'll see if there could be a better
implementation.

-Alexey

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [RFC PATCH v1] devres: align devres.data strictly only for devm_kmalloc()
On 18/12/2019 15:20, Alexey Brodkin wrote:
> On 17/12/2019 16:30, Marc Gonzalez wrote:
>
>> Commit a66d972465d15 ("devres: Align data[] to ARCH_KMALLOC_MINALIGN")
>> increased the alignment of devres.data unconditionally.
>>
>> Some platforms have very strict alignment requirements for DMA-safe
>> addresses, e.g. 128 bytes on arm64. There, struct devres amounts to:
>>     3 pointers + pad_to_128 + data + pad_to_256
>> i.e. ~220 bytes of padding.
>
> Could you please elaborate a bit on the mentioned paddings?
> I can understand the first one, to 128 bytes, but where does the
> second one, to 256 bytes, come from?

Sure thing.

struct devres {
	struct devres_node node;
	u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
};

struct devres_node = 3 pointers

kmalloc dishes out memory in multiples of ARCH_KMALLOC_MINALIGN bytes.
On arm64, ARCH_KMALLOC_MINALIGN = 128.
(Everything written below assumes ARCH_KMALLOC_MINALIGN = 128.)

In alloc_dr(), we request sizeof(struct devres) + sizeof(data) from
kmalloc. sizeof(struct devres) = 128 because of the alignment directive,
i.e. the 'data' field is automatically padded out to 128 by the
compiler. For most devm allocs (non-devm_kmalloc allocs), data is just
1 or 2 pointers, so kmalloc(128 + 16) allocates 256 bytes.

>> Let's enforce the alignment only for devm_kmalloc().
>
> OK, so for devm_kmalloc() we don't change anything, right?
> We still add the same padding before the real data array.

(My commit message probably requires improvement & refining.)

Yes, the objective of my patch is to keep the same behavior for
devm_kmalloc(), while reverting to the old behavior for all other
uses of struct devres.

>> I had not been aware that dynamic allocation granularity on arm64 was
>> 128 bytes. This means there's a lot of waste on small allocations.
>
> Probably I'm missing something, but where do you expect to save anything?
> If those smaller allocations are done with devm_kmalloc(), you aren't
> saving anything.

With my patch, a "non-kmalloc" struct devres would take 128 bytes,
instead of 256.

>> I suppose there's no easy solution, though.
>
> Right! It took a while till I was able to propose something
> people [almost silently] agreed with.

I meant the wider subject of dynamic allocation granularity.
The 128-byte requirement is only for DMA, yet some (most?) uses of
kmalloc are not for DMA. If the user could provide a flag ("this is
to be used for DMA"), we could save lots of memory for small non-DMA
allocs.

>> +#define DEVM_KMALLOC_PADDING_SIZE \
>> +	(ARCH_KMALLOC_MINALIGN - sizeof(struct devres) % ARCH_KMALLOC_MINALIGN)
>
> Even given your update with:
> --->8---
> #define DEVM_KMALLOC_PADDING_SIZE \
> 	((ARCH_KMALLOC_MINALIGN - sizeof(struct devres)) % ARCH_KMALLOC_MINALIGN)
> --->8---
> I don't think I understand why you need that "% ARCH_KMALLOC_MINALIGN"
> part.

To handle the case where sizeof(struct devres) > ARCH_KMALLOC_MINALIGN,
e.g. ARCH_KMALLOC_MINALIGN = 8 and sizeof(struct devres) = 12.

>> +	/* Add enough padding to provide a DMA-safe address */
>> +	size += DEVM_KMALLOC_PADDING_SIZE;
>
> This implementation gets ugly and will potentially lead to problems later,
> when people start changing the code here. Compared to that, dr->data
> initially aligned by the compiler looks much more foolproof.

Yes, it's better to let the compiler handle the padding... but we don't
want any padding in the non-devm_kmalloc use-case. We could add a
pointer to the data field, but then arches with a small
ARCH_KMALLOC_MINALIGN would have to pay the size increase, which doesn't
seem fair to them (x86, amd64).

>> @@ -822,7 +825,7 @@ void * devm_kmalloc(struct device *dev, size_t size, gfp_t gfp)
>>  	 */
>>  	set_node_dbginfo(&dr->node, "devm_kzalloc_release", size);
>>  	devres_add(dev, dr->data);
>> -	return dr->data;
>> +	return dr->data + DEVM_KMALLOC_PADDING_SIZE;
>
> Ditto.
> But first I'd like to understand what you are really trying to do
> with your change, and then we'll see if there could be a better
> implementation.

Basically, every call to devres_alloc() or devm_add_action() allocates
256 bytes instead of 128. A typical arm64 system will call these
thousands of times during driver probe.

Regards.
[PATCH v17 02/23] arc: mm: Add p?d_leaf() definitions
walk_page_range() is going to be allowed to walk page tables other than
those of user space. For this it needs to know when it has reached a
'leaf' entry in the page tables. This information will be provided by
the p?d_leaf() functions/macros.

For arc, we only have two levels, so only pmd_leaf() is needed.

CC: Vineet Gupta
CC: linux-snps-arc@lists.infradead.org
Acked-by: Vineet Gupta
Signed-off-by: Steven Price
---
 arch/arc/include/asm/pgtable.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arc/include/asm/pgtable.h b/arch/arc/include/asm/pgtable.h
index 9019ed9f9c94..12be7e1b7cc0 100644
--- a/arch/arc/include/asm/pgtable.h
+++ b/arch/arc/include/asm/pgtable.h
@@ -273,6 +273,7 @@ static inline void pmd_set(pmd_t *pmdp, pte_t *ptep)
 #define pmd_none(x)		(!pmd_val(x))
 #define	pmd_bad(x)		((pmd_val(x) & ~PAGE_MASK))
 #define pmd_present(x)		(pmd_val(x))
+#define pmd_leaf(x)		(pmd_val(x) & _PAGE_HW_SZ)
 #define pmd_clear(xp)		do { pmd_val(*(xp)) = 0; } while (0)

 #define pte_page(pte)		pfn_to_page(pte_pfn(pte))
-- 
2.20.1