Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-05 Thread Anshuman Khandual
On 09/04/2019 07:49 PM, Kirill A. Shutemov wrote:
> On Tue, Sep 03, 2019 at 01:31:46PM +0530, Anshuman Khandual wrote:
>> This adds a test module which will validate architecture page table helpers
>> and accessors regarding compliance with generic MM semantics expectations.
>> This will help various architectures in validating changes to the existing
>> page table helpers or addition of new ones.
>>
>> Test page table and memory pages creating it's entries at various level are
>> all allocated from system memory with required alignments. If memory pages
>> with required size and alignment could not be allocated, then all depending
>> individual tests are skipped.
> 
> See my comments below.
> 
>>
>> Cc: Andrew Morton 
>> Cc: Vlastimil Babka 
>> Cc: Greg Kroah-Hartman 
>> Cc: Thomas Gleixner 
>> Cc: Mike Rapoport 
>> Cc: Jason Gunthorpe 
>> Cc: Dan Williams 
>> Cc: Peter Zijlstra 
>> Cc: Michal Hocko 
>> Cc: Mark Rutland 
>> Cc: Mark Brown 
>> Cc: Steven Price 
>> Cc: Ard Biesheuvel 
>> Cc: Masahiro Yamada 
>> Cc: Kees Cook 
>> Cc: Tetsuo Handa 
>> Cc: Matthew Wilcox 
>> Cc: Sri Krishna chowdary 
>> Cc: Dave Hansen 
>> Cc: Russell King - ARM Linux 
>> Cc: Michael Ellerman 
>> Cc: Paul Mackerras 
>> Cc: Martin Schwidefsky 
>> Cc: Heiko Carstens 
>> Cc: "David S. Miller" 
>> Cc: Vineet Gupta 
>> Cc: James Hogan 
>> Cc: Paul Burton 
>> Cc: Ralf Baechle 
>> Cc: linux-snps-arc@lists.infradead.org
>> Cc: linux-m...@vger.kernel.org
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: linux-i...@vger.kernel.org
>> Cc: linuxppc-...@lists.ozlabs.org
>> Cc: linux-s...@vger.kernel.org
>> Cc: linux...@vger.kernel.org
>> Cc: sparcli...@vger.kernel.org
>> Cc: x...@kernel.org
>> Cc: linux-ker...@vger.kernel.org
>>
>> Suggested-by: Catalin Marinas 
>> Signed-off-by: Anshuman Khandual 
>> ---
>>  mm/Kconfig.debug   |  14 ++
>>  mm/Makefile|   1 +
>>  mm/arch_pgtable_test.c | 425 +
>>  3 files changed, 440 insertions(+)
>>  create mode 100644 mm/arch_pgtable_test.c
>>
>> diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
>> index 327b3ebf23bf..ce9c397f7b07 100644
>> --- a/mm/Kconfig.debug
>> +++ b/mm/Kconfig.debug
>> @@ -117,3 +117,17 @@ config DEBUG_RODATA_TEST
>>  depends on STRICT_KERNEL_RWX
>>  ---help---
>>This option enables a testcase for the setting rodata read-only.
>> +
>> +config DEBUG_ARCH_PGTABLE_TEST
>> +bool "Test arch page table helpers for semantics compliance"
>> +depends on MMU
>> +depends on DEBUG_KERNEL
>> +help
>> +  This options provides a kernel module which can be used to test
>> +  architecture page table helper functions on various platform in
>> +  verifying if they comply with expected generic MM semantics. This
>> +  will help architectures code in making sure that any changes or
>> +  new additions of these helpers will still conform to generic MM
>> +  expected semantics.
>> +
>> +  If unsure, say N.
>> diff --git a/mm/Makefile b/mm/Makefile
>> index d996846697ef..bb572c5aa8c5 100644
>> --- a/mm/Makefile
>> +++ b/mm/Makefile
>> @@ -86,6 +86,7 @@ obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
>>  obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
>>  obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
>>  obj-$(CONFIG_DEBUG_RODATA_TEST) += rodata_test.o
>> +obj-$(CONFIG_DEBUG_ARCH_PGTABLE_TEST) += arch_pgtable_test.o
>>  obj-$(CONFIG_PAGE_OWNER) += page_owner.o
>>  obj-$(CONFIG_CLEANCACHE) += cleancache.o
>>  obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
>> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
>> new file mode 100644
>> index ..f15be8a73723
>> --- /dev/null
>> +++ b/mm/arch_pgtable_test.c
>> @@ -0,0 +1,425 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * This kernel module validates architecture page table helpers &
>> + * accessors and helps in verifying their continued compliance with
>> + * generic MM semantics.
>> + *
>> + * Copyright (C) 2019 ARM Ltd.
>> + *
>> + * Author: Anshuman Khandual 
>> + */
>> +#define pr_fmt(fmt) "arch_pgtable_test: %s " fmt, __func__
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +/*
>> + * Basic operations
>> + *
>> + * mkold(entry) = An old and not a young entry
>> + * mkyoung(entry)   = A young and not an old entry
>> + * mkdirty(entry)   = A dirty and not a clean entry
>> + * mkclean(entry)   = A clean and not a dirty entry
>> + * mkwrite(entry)   = A write and not a write protected entry
>> + * wrprotect(entry) = A write protected and not a write entry
>> + * pxx_bad(entry)   = A mapped and non-table entry
>> + * pxx_same(entry1, entry2) = Both entries hold the exact same value
>> + */
>> +#define VADDR_TEST  (PGDIR_SIZE + PUD_SIZE + PMD_SIZE + PAGE_SIZE)
> 

Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-05 Thread Kirill A. Shutemov
On Thu, Sep 05, 2019 at 01:48:27PM +0530, Anshuman Khandual wrote:
> >> +#define VADDR_TEST(PGDIR_SIZE + PUD_SIZE + PMD_SIZE + PAGE_SIZE)
> > 
> > What is special about this address? How do you know if it is not occupied
> > yet?
> 
> We are creating the page table from scratch after allocating an mm_struct
> for a given random virtual address 'VADDR_TEST'. Hence nothing is occupied
> just yet. There is nothing special about this address, just that it tries
> to ensure the page table entries are being created with some offset from
> beginning of respective page table page at all levels ? The idea is to
> have a more representative form of page table structure for test.

Why P4D_SIZE is missing?

Are you sure it will not land into kernel address space on any arch?

I think more robust way to deal with this would be using
get_unmapped_area() instead of fixed address.

> This makes sense for runtime cases but there is a problem here.
> 
> On arm64, pgd_populate() which takes (pud_t *) as last argument instead of
> (p4d_t *) will fail to build when not wrapped in !__PAGETABLE_P4D_FOLDED
> on certain configurations.
> 
> ./arch/arm64/include/asm/pgalloc.h:81:75: note:
> expected ‘pud_t *’ {aka ‘struct  *’}
> but argument is of type ‘pgd_t *’ {aka ‘struct  *’}
> static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t 
> *pudp)
>~~~^~~~
> Wondering if this is something to be fixed on arm64 or its more general
> problem. Will look into this further.

I think you need wrap this into #ifndef __ARCH_HAS_5LEVEL_HACK.

> >> +  pmd_populate_tests(mm, pmdp, (pgtable_t) page);
> > 
> > This is not correct for architectures that defines pgtable_t as pte_t
> > pointer, not struct page pointer.
> 
> Right, a grep on the source confirms that.
> 
> These platforms define pgtable_t as struct page *
> 
> arch/alpha/include/asm/page.h:typedef struct page *pgtable_t;
> arch/arm/include/asm/page.h:typedef struct page *pgtable_t;
> arch/arm64/include/asm/page.h:typedef struct page *pgtable_t;
> arch/csky/include/asm/page.h:typedef struct page *pgtable_t;
> arch/hexagon/include/asm/page.h:typedef struct page *pgtable_t;
> arch/ia64/include/asm/page.h:  typedef struct page *pgtable_t;
> arch/ia64/include/asm/page.h:typedef struct page *pgtable_t;
> arch/m68k/include/asm/page.h:typedef struct page *pgtable_t;
> arch/microblaze/include/asm/page.h:typedef struct page *pgtable_t;
> arch/mips/include/asm/page.h:typedef struct page *pgtable_t;
> arch/nds32/include/asm/page.h:typedef struct page *pgtable_t;
> arch/nios2/include/asm/page.h:typedef struct page *pgtable_t;
> arch/openrisc/include/asm/page.h:typedef struct page *pgtable_t;
> arch/parisc/include/asm/page.h:typedef struct page *pgtable_t;
> arch/riscv/include/asm/page.h:typedef struct page *pgtable_t;
> arch/sh/include/asm/page.h:typedef struct page *pgtable_t;
> arch/sparc/include/asm/page_32.h:typedef struct page *pgtable_t;
> arch/um/include/asm/page.h:typedef struct page *pgtable_t;
> arch/unicore32/include/asm/page.h:typedef struct page *pgtable_t;
> arch/x86/include/asm/pgtable_types.h:typedef struct page *pgtable_t;
> arch/xtensa/include/asm/page.h:typedef struct page *pgtable_t;
> 
> These platforms define pgtable_t as pte_t *
> 
> arch/arc/include/asm/page.h:typedef pte_t * pgtable_t;
> arch/powerpc/include/asm/mmu.h:typedef pte_t *pgtable_t;
> arch/s390/include/asm/page.h:typedef pte_t *pgtable_t;
> arch/sparc/include/asm/page_64.h:typedef pte_t *pgtable_t;
> 
> Should we need have two pmd_populate_tests() definitions with
> different arguments (struct page pointer or pte_t pointer) and then
> call either one after detecting the given platform ?

Use pte_alloc_one() instead of alloc_mapped_page() to allocate the page
table.

> >> +  pud_populate_tests(mm, pudp, pmdp);
> >> +  p4d_populate_tests(mm, p4dp, pudp);
> >> +  pgd_populate_tests(mm, pgdp, p4dp);
> > 
> > This is wrong. All p?dp points to the second entry in page table entry.
> > This is not valid pointer for page table and triggers p?d_bad() on x86.
> 
> Yeah these are second entries because of the way we create the page table.
> But I guess its applicable only to the second argument in all these above
> cases because the first argument can be any valid entry on previous page
> table level.

Yes:

@@ -397,9 +396,9 @@ static int __init arch_pgtable_tests_init(void)
pgd_clear_tests(pgdp);
 
pmd_populate_tests(mm, pmdp, (pgtable_t) page);
-   pud_populate_tests(mm, pudp, pmdp);
-   p4d_populate_tests(mm, p4dp, pudp);
-   pgd_populate_tests(mm, pgdp, p4dp);
+   pud_populate_tests(mm, pudp, saved_pmdp);
+   p4d_populate_tests(mm, p4dp, saved_pudp);
+   pgd_populate_tests(mm, pgdp, saved_p4dp);
 
p4d_free(mm, saved_p4dp);
pud_free(mm, saved_pudp);

-- 
 Kirill A. Shutemov

___
linux-snps-arc mailing list
linux-snps-arc@l

Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-05 Thread Anshuman Khandual


On 09/05/2019 01:46 AM, Gerald Schaefer wrote:
> On Tue,  3 Sep 2019 13:31:46 +0530
> Anshuman Khandual  wrote:
> 
>> This adds a test module which will validate architecture page table helpers
>> and accessors regarding compliance with generic MM semantics expectations.
>> This will help various architectures in validating changes to the existing
>> page table helpers or addition of new ones.
>>
>> Test page table and memory pages creating it's entries at various level are
>> all allocated from system memory with required alignments. If memory pages
>> with required size and alignment could not be allocated, then all depending
>> individual tests are skipped.
> 
> This looks very useful, thanks. Of course, s390 is quite special and does
> not work nicely with this patch (yet), mostly because of our dynamic page
> table levels/folding. Still need to figure out what can be fixed in the arch

Hmm.

> code and what would need to be changed in the test module. See below for some
> generic comments/questions.

Sure.

> 
> At least one real bug in the s390 code was already revealed by this, which
> is very nice. In pmd/pud_bad(), we also check large pmds/puds for sanity,
> instead of reporting them as bad, which is apparently not how it is expected.

Hmm, so it has already started being useful :)

> 
> [...]
>> +/*
>> + * Basic operations
>> + *
>> + * mkold(entry) = An old and not a young entry
>> + * mkyoung(entry)   = A young and not an old entry
>> + * mkdirty(entry)   = A dirty and not a clean entry
>> + * mkclean(entry)   = A clean and not a dirty entry
>> + * mkwrite(entry)   = A write and not a write protected entry
>> + * wrprotect(entry) = A write protected and not a write entry
>> + * pxx_bad(entry)   = A mapped and non-table entry
>> + * pxx_same(entry1, entry2) = Both entries hold the exact same value
>> + */
>> +#define VADDR_TEST  (PGDIR_SIZE + PUD_SIZE + PMD_SIZE + PAGE_SIZE)
> 
> Why is P4D_SIZE missing in the VADDR_TEST calculation?

This was a random possible virtual address to generate a representative
page table structure for the test. As there is a default (PGDIR_SIZE) for
P4D_SIZE on platforms which really do not have P4D level, it should be okay
to add P4D_SIZE in the above calculation.

> 
> [...]
>> +
>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>> +static void pud_clear_tests(pud_t *pudp)
>> +{
>> +memset(pudp, RANDOM_NZVALUE, sizeof(pud_t));
>> +pud_clear(pudp);
>> +WARN_ON(!pud_none(READ_ONCE(*pudp)));
>> +}
> 
> For pgd/p4d/pud_clear(), we only clear if the page table level is present
> and not folded. The memset() here overwrites the table type bits, so
> pud_clear() will not clear anything on s390 and the pud_none() check will
> fail.
> Would it be possible to OR a (larger) random value into the table, so that
> the lower 12 bits would be preserved?

So the suggestion is instead of doing memset() on entry with RANDOM_NZVALUE,
it should OR a large random value preserving lower 12 bits. Hmm, this should
still do the trick for other platforms, they just need non zero value. So on
s390, the lower 12 bits on the page table entry already has valid value while
entering this function which would make sure that pud_clear() really does
clear the entry ?

> 
>> +
>> +static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t 
>> *pmdp)
>> +{
>> +/*
>> + * This entry points to next level page table page.
>> + * Hence this must not qualify as pud_bad().
>> + */
>> +pmd_clear(pmdp);
>> +pud_clear(pudp);
>> +pud_populate(mm, pudp, pmdp);
>> +WARN_ON(pud_bad(READ_ONCE(*pudp)));
>> +}
> 
> This will populate the pud with a pmd pointer that does not point to the
> beginning of the pmd table, but to the second entry (because of how
> VADDR_TEST is constructed). This will result in failing pud_bad() check
> on s390. Not sure why/how it works on other archs, but would it be possible
> to align pmdp down to the beginning of the pmd table (and similar for the
> other pxd_populate_tests)?

Right, that was a problem. Will fix it by using the saved entries used for
freeing the page table pages at the end, which always point to the beginning
of a page table page.

> 
> [...]
>> +
>> +p4d_free(mm, saved_p4dp);
>> +pud_free(mm, saved_pudp);
>> +pmd_free(mm, saved_pmdp);
>> +pte_free(mm, (pgtable_t) virt_to_page(saved_ptep));
> 
> pgtable_t is arch-specific, and on s390 it is not a struct page pointer,
> but a pte pointer. So this will go wrong, also on all other archs (if any)
> where pgtable_t is not struct page.
> Would it be possible to use pte_free_kernel() instead, and just pass
> saved_ptep directly?

It makes sense, will change.

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc


Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-05 Thread Gerald Schaefer
On Thu, 5 Sep 2019 14:48:14 +0530
Anshuman Khandual  wrote:

> > [...]  
> >> +
> >> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
> >> +static void pud_clear_tests(pud_t *pudp)
> >> +{
> >> +  memset(pudp, RANDOM_NZVALUE, sizeof(pud_t));
> >> +  pud_clear(pudp);
> >> +  WARN_ON(!pud_none(READ_ONCE(*pudp)));
> >> +}  
> > 
> > For pgd/p4d/pud_clear(), we only clear if the page table level is present
> > and not folded. The memset() here overwrites the table type bits, so
> > pud_clear() will not clear anything on s390 and the pud_none() check will
> > fail.
> > Would it be possible to OR a (larger) random value into the table, so that
> > the lower 12 bits would be preserved?  
> 
> So the suggestion is instead of doing memset() on entry with RANDOM_NZVALUE,
> it should OR a large random value preserving lower 12 bits. Hmm, this should
> still do the trick for other platforms, they just need non zero value. So on
> s390, the lower 12 bits on the page table entry already has valid value while
> entering this function which would make sure that pud_clear() really does
> clear the entry ?

Yes, in theory the table entry on s390 would have the type set in the last
4 bits, so preserving those would be enough. If it does not conflict with
others, I would still suggest preserving all 12 bits since those would contain
arch-specific flags in general, just to be sure. For s390, the pte/pmd tests
would also work with the memset, but for consistency I think the same logic
should be used in all pxd_clear_tests.

However, there is another issue on s390 which will make this only work
for pud_clear_tests(), and not for the p4d/pgd_tests. The problem is that
mm_alloc() will only give you a 3-level page table initially on s390.
This means that pudp == p4dp == pgdp, and so the p4d/pgd_tests will
both see the pud level (of course this also affects other tests).

Not sure yet how to fix this, i.e. how to initialize/update the page table
to 5 levels. We can handle 5 level page tables, and it would be good if
all levels could be tested, but using mm_alloc() to establish the page
tables might not work on s390. One option could be to provide an arch-hook
or weak function to allocate/initialize the mm.

IIUC, the (dummy) mm is really only needed to provide an mm->pgd as starting
point, right?

Regards,
Gerald


___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc


Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-05 Thread Anshuman Khandual
On 09/05/2019 10:36 PM, Gerald Schaefer wrote:
> On Thu, 5 Sep 2019 14:48:14 +0530
> Anshuman Khandual  wrote:
> 
>>> [...]  
 +
 +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
 +static void pud_clear_tests(pud_t *pudp)
 +{
 +  memset(pudp, RANDOM_NZVALUE, sizeof(pud_t));
 +  pud_clear(pudp);
 +  WARN_ON(!pud_none(READ_ONCE(*pudp)));
 +}  
>>>
>>> For pgd/p4d/pud_clear(), we only clear if the page table level is present
>>> and not folded. The memset() here overwrites the table type bits, so
>>> pud_clear() will not clear anything on s390 and the pud_none() check will
>>> fail.
>>> Would it be possible to OR a (larger) random value into the table, so that
>>> the lower 12 bits would be preserved?  
>>
>> So the suggestion is instead of doing memset() on entry with RANDOM_NZVALUE,
>> it should OR a large random value preserving lower 12 bits. Hmm, this should
>> still do the trick for other platforms, they just need non zero value. So on
>> s390, the lower 12 bits on the page table entry already has valid value while
>> entering this function which would make sure that pud_clear() really does
>> clear the entry ?
> 
> Yes, in theory the table entry on s390 would have the type set in the last
> 4 bits, so preserving those would be enough. If it does not conflict with
> others, I would still suggest preserving all 12 bits since those would contain
> arch-specific flags in general, just to be sure. For s390, the pte/pmd tests
> would also work with the memset, but for consistency I think the same logic
> should be used in all pxd_clear_tests.

Makes sense but..

There is a small challenge with this. Modifying individual bits on a given
page table entry from generic code like this test case is bit tricky. That
is because there are not enough helpers to create entries with an absolute
value. This would have been easier if all the platforms provided functions
like __pxx() which is not the case now. Otherwise something like this should
have worked.


pud_t pud = READ_ONCE(*pudp);
pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0))
WRITE_ONCE(*pudp, pud);

But __pud() will fail to build in many platforms.

The other alternative will be to make sure memset() happens on all other
bits except the lower 12 bits which will depend on endianness. If s390
has a fixed endianness, we can still use either of them which will hold
good for others as well.

memset(pudp, RANDOM_NZVALUE, sizeof(pud_t) - 3);

OR

memset(pudp + 3, RANDOM_NZVALUE, sizeof(pud_t) - 3);

> 
> However, there is another issue on s390 which will make this only work
> for pud_clear_tests(), and not for the p4d/pgd_tests. The problem is that
> mm_alloc() will only give you a 3-level page table initially on s390.
> This means that pudp == p4dp == pgdp, and so the p4d/pgd_tests will
> both see the pud level (of course this also affects other tests).

Got it.

> 
> Not sure yet how to fix this, i.e. how to initialize/update the page table
> to 5 levels. We can handle 5 level page tables, and it would be good if
> all levels could be tested, but using mm_alloc() to establish the page
> tables might not work on s390. One option could be to provide an arch-hook
> or weak function to allocate/initialize the mm.

Sure, got it. Though I plan to do add some arch specific tests or init sequence
like the above later on but for now the idea is to get the smallest possible set
of test cases which builds and runs on all platforms without requiring any arch
specific hooks or special casing (#ifdef) to be agreed upon broadly and 
accepted.

Do you think this is absolutely necessary on s390 for the very first set of test
cases or we can add this later on as an improvement ?

> 
> IIUC, the (dummy) mm is really only needed to provide an mm->pgd as starting
> point, right?

Right.

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc