Re: [PATCH 1/4] treewide: remove unused address argument from pte_alloc functions (v2)

2018-10-24 Thread Peter Zijlstra
On Fri, Oct 12, 2018 at 06:31:57PM -0700, Joel Fernandes (Google) wrote:
> This series speeds up mremap(2) syscall by copying page tables at the
> PMD level even for non-THP systems. There is concern that the extra
> 'address' argument that mremap passes to pte_alloc may do something
> subtle architecture related in the future that may make the scheme not
> work.  Also we find that there is no point in passing the 'address' to
> pte_alloc since it's unused. So this patch therefore removes this
> argument tree-wide resulting in a nice negative diff as well. Also
> ensuring along the way that the enabled architectures do not do anything
> funky with 'address' argument that goes unnoticed by the optimization.

Did you happen to look at the history of where that address argument
came from? -- just being curious here. ISTR something vague about
architectures having different paging structure for different memory
ranges.

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc


Re: [PATCH 2/4] mm: speed up mremap by 500x on large regions (v2)

2018-10-24 Thread Kirill A. Shutemov
On Fri, Oct 12, 2018 at 06:31:58PM -0700, Joel Fernandes (Google) wrote:
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 9e68a02a52b1..2fd163cff406 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -191,6 +191,54 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t 
> *old_pmd,
>   drop_rmap_locks(vma);
>  }
>  
> +static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long 
> old_addr,
> +   unsigned long new_addr, unsigned long old_end,
> +   pmd_t *old_pmd, pmd_t *new_pmd, bool *need_flush)
> +{
> + spinlock_t *old_ptl, *new_ptl;
> + struct mm_struct *mm = vma->vm_mm;
> +
> + if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
> + || old_end - old_addr < PMD_SIZE)
> + return false;
> +
> + /*
> +  * The destination pmd shouldn't be established, free_pgtables()
> +  * should have release it.
> +  */
> + if (WARN_ON(!pmd_none(*new_pmd)))
> + return false;
> +
> + /*
> +  * We don't have to worry about the ordering of src and dst
> +  * ptlocks because exclusive mmap_sem prevents deadlock.
> +  */
> + old_ptl = pmd_lock(vma->vm_mm, old_pmd);
> + if (old_ptl) {

How can it ever be false?

> + pmd_t pmd;
> +
> + new_ptl = pmd_lockptr(mm, new_pmd);
> + if (new_ptl != old_ptl)
> + spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
> +
> + /* Clear the pmd */
> + pmd = *old_pmd;
> + pmd_clear(old_pmd);
> +
> + VM_BUG_ON(!pmd_none(*new_pmd));
> +
> + /* Set the new pmd */
> + set_pmd_at(mm, new_addr, new_pmd, pmd);
> + if (new_ptl != old_ptl)
> + spin_unlock(new_ptl);
> + spin_unlock(old_ptl);
> +
> + *need_flush = true;
> + return true;
> + }
> + return false;
> +}
> +
-- 
 Kirill A. Shutemov



Re: [PATCH 2/4] mm: speed up mremap by 500x on large regions (v2)

2018-10-24 Thread Balbir Singh
On Wed, Oct 24, 2018 at 01:12:56PM +0300, Kirill A. Shutemov wrote:
> On Fri, Oct 12, 2018 at 06:31:58PM -0700, Joel Fernandes (Google) wrote:
> > diff --git a/mm/mremap.c b/mm/mremap.c
> > index 9e68a02a52b1..2fd163cff406 100644
> > --- a/mm/mremap.c
> > +++ b/mm/mremap.c
> > @@ -191,6 +191,54 @@ static void move_ptes(struct vm_area_struct *vma, 
> > pmd_t *old_pmd,
> > drop_rmap_locks(vma);
> >  }
> >  
> > +static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long 
> > old_addr,
> > + unsigned long new_addr, unsigned long old_end,
> > + pmd_t *old_pmd, pmd_t *new_pmd, bool *need_flush)
> > +{
> > +   spinlock_t *old_ptl, *new_ptl;
> > +   struct mm_struct *mm = vma->vm_mm;
> > +
> > +   if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
> > +   || old_end - old_addr < PMD_SIZE)
> > +   return false;
> > +
> > +   /*
> > +* The destination pmd shouldn't be established, free_pgtables()
> > +* should have release it.
> > +*/
> > +   if (WARN_ON(!pmd_none(*new_pmd)))
> > +   return false;
> > +
> > +   /*
> > +* We don't have to worry about the ordering of src and dst
> > +* ptlocks because exclusive mmap_sem prevents deadlock.
> > +*/
> > +   old_ptl = pmd_lock(vma->vm_mm, old_pmd);
> > +   if (old_ptl) {
> 
> How can it ever be false?
> 
> > +   pmd_t pmd;
> > +
> > +   new_ptl = pmd_lockptr(mm, new_pmd);


This looks largely inspired by move_huge_pmd(). I guess a lot of that code
applies, so why not reuse as much of it as possible? The same comments
w.r.t. mmap_sem helping protect against lock-order issues apply as well.

> > +   if (new_ptl != old_ptl)
> > +   spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
> > +
> > +   /* Clear the pmd */
> > +   pmd = *old_pmd;
> > +   pmd_clear(old_pmd);
> > +
> > +   VM_BUG_ON(!pmd_none(*new_pmd));
> > +
> > +   /* Set the new pmd */
> > +   set_pmd_at(mm, new_addr, new_pmd, pmd);
> > +   if (new_ptl != old_ptl)
> > +   spin_unlock(new_ptl);
> > +   spin_unlock(old_ptl);
> > +
> > +   *need_flush = true;
> > +   return true;
> > +   }
> > +   return false;
> > +}
> > +
> -- 
>  Kirill A. Shutemov
> 



Re: [PATCH 2/4] mm: speed up mremap by 500x on large regions (v2)

2018-10-24 Thread Kirill A. Shutemov
On Wed, Oct 24, 2018 at 10:57:33PM +1100, Balbir Singh wrote:
> On Wed, Oct 24, 2018 at 01:12:56PM +0300, Kirill A. Shutemov wrote:
> > On Fri, Oct 12, 2018 at 06:31:58PM -0700, Joel Fernandes (Google) wrote:
> > > diff --git a/mm/mremap.c b/mm/mremap.c
> > > index 9e68a02a52b1..2fd163cff406 100644
> > > --- a/mm/mremap.c
> > > +++ b/mm/mremap.c
> > > @@ -191,6 +191,54 @@ static void move_ptes(struct vm_area_struct *vma, 
> > > pmd_t *old_pmd,
> > >   drop_rmap_locks(vma);
> > >  }
> > >  
> > > +static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long 
> > > old_addr,
> > > +   unsigned long new_addr, unsigned long old_end,
> > > +   pmd_t *old_pmd, pmd_t *new_pmd, bool *need_flush)
> > > +{
> > > + spinlock_t *old_ptl, *new_ptl;
> > > + struct mm_struct *mm = vma->vm_mm;
> > > +
> > > + if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
> > > + || old_end - old_addr < PMD_SIZE)
> > > + return false;
> > > +
> > > + /*
> > > +  * The destination pmd shouldn't be established, free_pgtables()
> > > +  * should have release it.
> > > +  */
> > > + if (WARN_ON(!pmd_none(*new_pmd)))
> > > + return false;
> > > +
> > > + /*
> > > +  * We don't have to worry about the ordering of src and dst
> > > +  * ptlocks because exclusive mmap_sem prevents deadlock.
> > > +  */
> > > + old_ptl = pmd_lock(vma->vm_mm, old_pmd);
> > > + if (old_ptl) {
> > 
> > How can it ever be false?
> > 
> > > + pmd_t pmd;
> > > +
> > > + new_ptl = pmd_lockptr(mm, new_pmd);
> 
> 
> This looks largely inspired by move_huge_pmd(). I guess a lot of that code
> applies, so why not reuse as much of it as possible? The same comments
> w.r.t. mmap_sem helping protect against lock-order issues apply as well.

pmd_lock() cannot fail, but __pmd_trans_huge_lock() can. We should not
copy the code blindly.
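
To make the difference concrete, here is a minimal user-space sketch of the
two contracts. It is not kernel code: struct model_lock, struct model_pmd,
model_pmd_lock() and model_trans_huge_lock() are hypothetical stand-ins, and
the "lock" is just a flag. It only illustrates why a NULL check makes sense
after one helper and is dead code after the other.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a page-table spinlock. */
struct model_lock {
	int held;
};

/* Hypothetical stand-in for a PMD slot: its lock and whether it is huge. */
struct model_pmd {
	struct model_lock *ptl;
	bool is_huge;
};

/*
 * Models the pmd_lock() contract: take the lock unconditionally and
 * return it. The result is never NULL, so testing it is dead code.
 */
static struct model_lock *model_pmd_lock(struct model_pmd *pmd)
{
	pmd->ptl->held = 1;
	return pmd->ptl;
}

/*
 * Models the __pmd_trans_huge_lock() contract: the lock is kept only if
 * the entry is still huge; otherwise it is dropped again and NULL is
 * returned, so callers must check the result.
 */
static struct model_lock *model_trans_huge_lock(struct model_pmd *pmd)
{
	pmd->ptl->held = 1;
	if (pmd->is_huge)
		return pmd->ptl;
	pmd->ptl->held = 0;
	return NULL;
}
```

In this model, only the second helper has a failure path, which is why
"if (old_ptl)" is meaningful in move_huge_pmd() but not in move_normal_pmd().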

-- 
 Kirill A. Shutemov



Re: [RFC] ARC: ARCv2: Introduce SmaRT support

2018-10-24 Thread Vineet Gupta


On 10/19/2018 07:27 AM, Eugeniy Paltsev wrote:
> Add compile-time 'ARC_USE_SMART' option for enabling SmaRT support.

Nice !

> Small real time trace (SmaRT) is an optional on-chip debug hardware
> component that captures instruction-trace history. It stores the
> address of the most recent non-sequential instructions executed into
> internal buffer.
>
> Usually we use MetaWare debugger to enable SmaRT and display trace
> information.
>
> This patch allows to display the decoded content of SmaRT buffer
> without MetaWare debugger. It is done by extending ordinary exception
> message with decoded SmaRT instruction-trace history.
>
> In some cases it's really useful as it allows to show pre-exception
> instruction-trace which was not tainted by exception handler code,
> printk code, etc...

So the reason is not so much the lack of mdb as reducing the trace clutter.
It's funny, because mdb goes to great lengths to generate the clutter (i.e. to
reconstruct the interim disassembly from the sparse smaRT entries).

> Nevertheless this option has negative performance impact due to
> implementation as we dump SmaRT buffer content into external memory
> buffer in the beginning of every slowpath exception handler code.
> We choose this implementation as a compromise between performance
> impact and SmaRT buffer tainting.
> Although the performance impact is not really significant (according
> to lmbench) we leave this option disabled by default.

Oh yes, this is a debug feature and intrusive even if it does not show up in
profiles, so it needs to be disabled by default.

> Here are examples of user-space and kernel-space fault messages with
> 'ARC_USE_SMART' option enabled:
>
> User-space exception:
> --->8-
> Exception: u_hell[99]: at 0x103a2 [off 0x103a2 in /root/u_hell, VMA: 0001:00012000]
>   ECR: 0x00050200 => Invalid Write @ 0x by insn @ 0x000103a2
> SmaRT (64 entries):
>  [   0]V 0x90232358 -> 0x9022ce3c [src do_page_fault+0x2c/0x2d8] [dst populate_smart+0x0/0x9c]

So I had to dig into the SmaRT spec to understand this src/dst stuff. What it
implies is that at the @src PC, a branch to @dst was taken.
Say we have samples SRC1:DST1, SRC2:DST2. All this implies is that these 4 PCs
were observed. So just flatten out the SRC/DST and print them in order, with
only 1 PC entry per line. That makes it easier to follow and comprehend.
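
A rough sketch of that flattening, as plain user-space C rather than a patch
against troubleshoot.c. struct smart_entry, smart_flatten() and
smart_print_flat() are hypothetical names, and the sketch simply assumes each
sample is a (src, dst) pair of 32-bit PCs kept in buffer order:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record for one SmaRT sample: a taken branch src -> dst. */
struct smart_entry {
	uint32_t src;
	uint32_t dst;
};

/*
 * Flatten n (src, dst) pairs into a list of PCs, keeping buffer order.
 * Returns the number of PCs written, which is at most out_len.
 */
static size_t smart_flatten(const struct smart_entry *e, size_t n,
			    uint32_t *out, size_t out_len)
{
	size_t k = 0;

	for (size_t i = 0; i < n && k + 2 <= out_len; i++) {
		out[k++] = e[i].src;
		out[k++] = e[i].dst;
	}
	return k;
}

/* Print one PC per line, as suggested above. */
static void smart_print_flat(const uint32_t *pcs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		printf(" [%3zu] 0x%08" PRIx32 "\n", i, pcs[i]);
}
```

Symbol and VMA lookup would then be done per PC instead of per pair.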

>  [   1]V 0x9022e3f8 -> 0x9023232c [src EV_TLBProtV+0xec/0xf0] [dst do_page_fault+0x0/0x2d8]
>  [   2]V 0x90233194 -> 0x9022e30c [src do_slow_path_pf+0x10/0x14] [dst EV_TLBProtV+0x0/0xf0]
>  [   3]V 0x90233120 -> 0x90233184 [src EV_TLBMissD+0x80/0xe0] [dst do_slow_path_pf+0x0/0x14]
>  [   4]  E V 0x000103a2 -> 0x902330a0 [off 0x103a2 in /root/u_hell, VMA: 0001:00012000] [dst EV_TLBMissD+0x0/0xe0]
>  [   5] U  V 0x2004f238 -> 0x00010398 [off 0x43238 in /lib/libuClibc-1.0.18.so, VMA: 2000c000:20072000] [off 0x10398 in /root/u_hell, VMA: 0001:00012000]
>  [   6] U  V 0x20049a82 -> 0x2004f214 [off 0x3da82 in /lib/libuClibc-1.0.18.so, VMA: 2000c000:20072000] [off 0x43214 in /lib/libuClibc-1.0.18.so, VMA: 2000c000:20072000]

Once we do the above, we can reduce the print clutter by only printing the vma
if it changed - again, less printing means the brain has less information to
process.
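
One possible shape for the "only print the vma if it changed" check, again
as a user-space sketch. struct vma_range and vma_changed() are hypothetical
names, and the sketch assumes a VMA can be identified by its start and end
addresses alone:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the mapping that contains a sampled PC. */
struct vma_range {
	uint32_t start;
	uint32_t end;
};

/*
 * Returns true when the VMA for the current line differs from the one
 * printed last, and remembers it for the next call. *have_last must
 * start out false so that the very first VMA is always printed.
 */
static bool vma_changed(struct vma_range *last, bool *have_last,
			const struct vma_range *cur)
{
	if (*have_last && last->start == cur->start && last->end == cur->end)
		return false;
	*last = *cur;
	*have_last = true;
	return true;
}
```

The print loop would emit the "[off ... in ..., VMA: ...]" part only when
vma_changed() returns true and print just the PC otherwise.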

> ...[snip]...
> --->8-
>
> TODO:
>   Add runtime procfs options to configure/suspend SmaRT.

Good.

>   Add SmaRT BCR encoding struct.
>   Check SmaRT version number in BCR.

Do we also need to think about how to co-exist with mdb? What if the user
enables it in mdb before hitting run, etc.?

> NOTE:
> this RFC has prerequisite:
>   http://patchwork.ozlabs.org/patch/986820/

Right, I'm still not happy with our approach there and I will respond separately
after a few trials and tribulations of my own, so please be patient with that.
See below for some coding comments.

>  
> +config ARC_USE_SMART

ARC_SMART_TRACE? I know why you picked the _USE_, but the semantics are
different here.


> + bool "Enable real time trace on-chip debug HW"

This might be confused with RTT, so keep the smaRT keyword here, with hungarian case.

> diff --git a/arch/arc/include/asm/bug.h b/arch/arc/include/asm/bug.h
>  
> +#ifdef CONFIG_ARC_USE_SMART
> +void populate_smart(void);
> +#define POPULATE_SMART() populate_smart()
> +#else
> +#define POPULATE_SMART()
> +#endif /* CONFIG_ARC_USE_SMART */
> +

Let's keep all the smart-related stuff in files of its own: smart.h and smart.c

> diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c
>  
>   arc_chk_core_config();
> +
> +#ifdef CONFIG_ARC_USE_SMART

> + if (cpuinfo_arc700[cpu_id].extn.smart)

IS_ENABLED() is better here
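
As a sketch of what that could look like: the block below is a user-space
model, not the actual setup.c change. The IS_ENABLED() here is a simplified
re-creation of the kernel macro that only handles built-in (=y) options,
SMART_CTL_EN is given a placeholder value, and smart_control and
smart_setup() are hypothetical stand-ins for the aux register write.

```c
/*
 * Simplified model of the kernel's IS_ENABLED(): evaluates to 1 if the
 * option is defined as 1, and to 0 if it is not defined at all. The real
 * macro in include/linux/kconfig.h also handles =m.
 */
#define ARG_PLACEHOLDER_1 0,
#define take_second_arg(ignored, val, ...) val
#define is_defined_3(arg1_or_junk) take_second_arg(arg1_or_junk 1, 0, 0)
#define is_defined_2(val) is_defined_3(ARG_PLACEHOLDER_##val)
#define is_defined_1(x) is_defined_2(x)
#define IS_ENABLED(option) is_defined_1(option)

#define CONFIG_ARC_USE_SMART 1	/* models the option being set to y */
#define SMART_CTL_EN 0x1u	/* placeholder value, not from the spec */

/* Hypothetical stand-in for the SmaRT control aux register. */
static unsigned int smart_control;

/*
 * Both branches are always compiled and type-checked; when the option is
 * off the condition is a constant 0 and the compiler drops the body.
 */
static int smart_setup(int hw_has_smart)
{
	if (IS_ENABLED(CONFIG_ARC_USE_SMART) && hw_has_smart) {
		smart_control = SMART_CTL_EN;
		return 1;
	}
	return 0;
}
```

Compared with the #ifdef, the disabled configuration still gets compile
coverage, so the code is less likely to bit-rot unnoticed.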

> + write_aux_reg(ARC_AUX_SMART_CONTROL, SMART_CTL_EN);
> +#endif
> +
>  }
>  
> diff --git a/arch/arc/kernel/troubleshoot.c b/arch/arc/kernel/troubleshoot.c
>  
> +#ifdef CONFIG_ARC_USE_SMART
> +#define MAX_SMART_BUFF   40



[PATCH v2 0/2] arm64: Cut rebuild time when changing CONFIG_BLK_DEV_INITRD

2018-10-24 Thread Florian Fainelli
Hi all,

While investigating why ARM64 required a ton of objects to be rebuilt
when toggling CONFIG_BLK_DEV_INITRD, it became clear that this was
because we define __early_init_dt_declare_initrd() differently and we do
that in arch/arm64/include/asm/memory.h which gets included by a fair
amount of other header files, and translation units as well.

Changing the value of CONFIG_BLK_DEV_INITRD is a common thing with build
systems that generate two kernels: one with the initramfs and one
without. buildroot is one of these build systems, OpenWrt is also
another one that does this.

This patch series proposes adding an empty initrd.h to satisfy the need
for drivers/of/fdt.c to unconditionally include that file, and moves the
custom __early_init_dt_declare_initrd() definition away from
asm/memory.h

This cuts the number of object rebuilds from 1920 down to 26, so a
factor of approximately 73.

Apologies for the long CC list, please let me know how you would go
about merging that and if another approach would be preferable, e.g:
introducing a CONFIG_ARCH_INITRD_BELOW_START_OK Kconfig option or
something like that.

Changes in v2:

- put an /* empty */ comment in the asm-generic/initrd.h file
- trim down the CC list to maximize the chances of people receiving this

Florian Fainelli (2):
  arch: Add asm-generic/initrd.h and make use of it for most
architectures
  arm64: Create asm/initrd.h

 arch/alpha/include/asm/Kbuild  |  1 +
 arch/arc/include/asm/Kbuild|  1 +
 arch/arm/include/asm/Kbuild|  1 +
 arch/arm64/include/asm/initrd.h| 13 +
 arch/arm64/include/asm/memory.h|  8 
 arch/c6x/include/asm/Kbuild|  1 +
 arch/h8300/include/asm/Kbuild  |  1 +
 arch/hexagon/include/asm/Kbuild|  1 +
 arch/ia64/include/asm/Kbuild   |  1 +
 arch/m68k/include/asm/Kbuild   |  1 +
 arch/microblaze/include/asm/Kbuild |  1 +
 arch/mips/include/asm/Kbuild   |  1 +
 arch/nds32/include/asm/Kbuild  |  1 +
 arch/nios2/include/asm/Kbuild  |  1 +
 arch/openrisc/include/asm/Kbuild   |  1 +
 arch/parisc/include/asm/Kbuild |  1 +
 arch/powerpc/include/asm/Kbuild|  1 +
 arch/riscv/include/asm/Kbuild  |  1 +
 arch/s390/include/asm/Kbuild   |  1 +
 arch/sh/include/asm/Kbuild |  1 +
 arch/sparc/include/asm/Kbuild  |  1 +
 arch/um/include/asm/Kbuild |  1 +
 arch/unicore32/include/asm/Kbuild  |  1 +
 arch/x86/include/asm/Kbuild|  1 +
 arch/xtensa/include/asm/Kbuild |  1 +
 drivers/of/fdt.c   |  1 +
 include/asm-generic/initrd.h   |  1 +
 27 files changed, 38 insertions(+), 8 deletions(-)
 create mode 100644 arch/arm64/include/asm/initrd.h
 create mode 100644 include/asm-generic/initrd.h

-- 
2.17.1




[PATCH v2 2/2] arm64: Create asm/initrd.h

2018-10-24 Thread Florian Fainelli
ARM64 is the only architecture that requires a re-definition of
__early_init_dt_declare_initrd(). Now that we added the infrastructure
in asm-generic to provide an asm/initrd.h file, properly break up that
definition from asm/memory.h and make use of that header in
drivers/of/fdt.c where this is used.

This significantly cuts the number of objects that need to be rebuilt on
ARM64 due to the repercussions of including asm/memory.h in several
places.

Signed-off-by: Florian Fainelli 
---
 arch/arm64/include/asm/initrd.h | 13 +
 arch/arm64/include/asm/memory.h |  8 
 drivers/of/fdt.c|  1 +
 3 files changed, 14 insertions(+), 8 deletions(-)
 create mode 100644 arch/arm64/include/asm/initrd.h

diff --git a/arch/arm64/include/asm/initrd.h b/arch/arm64/include/asm/initrd.h
new file mode 100644
index ..0c9572485810
--- /dev/null
+++ b/arch/arm64/include/asm/initrd.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_INITRD_H
+#define __ASM_INITRD_H
+
+#ifdef CONFIG_BLK_DEV_INITRD
+#define __early_init_dt_declare_initrd(__start, __end) \
+   do {\
+   initrd_start = (__start);   \
+   initrd_end = (__end);   \
+   } while (0)
+#endif
+
+#endif /* __ASM_INITRD_H */
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index b96442960aea..dc3ca21ba240 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -168,14 +168,6 @@
 #define IOREMAP_MAX_ORDER  (PMD_SHIFT)
 #endif
 
-#ifdef CONFIG_BLK_DEV_INITRD
-#define __early_init_dt_declare_initrd(__start, __end) \
-   do {\
-   initrd_start = (__start);   \
-   initrd_end = (__end);   \
-   } while (0)
-#endif
-
 #ifndef __ASSEMBLY__
 
 #include 
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 800ad252cf9c..4e4711af907b 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -28,6 +28,7 @@
 
 #include   /* for COMMAND_LINE_SIZE */
 #include 
+#include 
 
 #include "of_private.h"
 
-- 
2.17.1




[PATCH v2 1/2] arch: Add asm-generic/initrd.h and make use of it for most architectures

2018-10-24 Thread Florian Fainelli
In preparation for separating the definition of
__early_init_dt_declare_initrd() on ARM64 in order to cut the number of
files that require a rebuild when the value of CONFIG_BLK_DEV_INITRD is
changed, introduce an empty asm-generic initrd.h file and update all
architectures but arm64 to make use of it.

Signed-off-by: Florian Fainelli 
---
 arch/alpha/include/asm/Kbuild  | 1 +
 arch/arc/include/asm/Kbuild| 1 +
 arch/arm/include/asm/Kbuild| 1 +
 arch/c6x/include/asm/Kbuild| 1 +
 arch/h8300/include/asm/Kbuild  | 1 +
 arch/hexagon/include/asm/Kbuild| 1 +
 arch/ia64/include/asm/Kbuild   | 1 +
 arch/m68k/include/asm/Kbuild   | 1 +
 arch/microblaze/include/asm/Kbuild | 1 +
 arch/mips/include/asm/Kbuild   | 1 +
 arch/nds32/include/asm/Kbuild  | 1 +
 arch/nios2/include/asm/Kbuild  | 1 +
 arch/openrisc/include/asm/Kbuild   | 1 +
 arch/parisc/include/asm/Kbuild | 1 +
 arch/powerpc/include/asm/Kbuild| 1 +
 arch/riscv/include/asm/Kbuild  | 1 +
 arch/s390/include/asm/Kbuild   | 1 +
 arch/sh/include/asm/Kbuild | 1 +
 arch/sparc/include/asm/Kbuild  | 1 +
 arch/um/include/asm/Kbuild | 1 +
 arch/unicore32/include/asm/Kbuild  | 1 +
 arch/x86/include/asm/Kbuild| 1 +
 arch/xtensa/include/asm/Kbuild | 1 +
 include/asm-generic/initrd.h   | 1 +
 24 files changed, 24 insertions(+)
 create mode 100644 include/asm-generic/initrd.h

diff --git a/arch/alpha/include/asm/Kbuild b/arch/alpha/include/asm/Kbuild
index 0580cb8c84b2..cd6f723aed1b 100644
--- a/arch/alpha/include/asm/Kbuild
+++ b/arch/alpha/include/asm/Kbuild
@@ -5,6 +5,7 @@ generic-y += compat.h
 generic-y += exec.h
 generic-y += export.h
 generic-y += fb.h
+generic-y += initrd.h
 generic-y += irq_work.h
 generic-y += mcs_spinlock.h
 generic-y += mm-arch-hooks.h
diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild
index feed50ce89fa..ba18632aa493 100644
--- a/arch/arc/include/asm/Kbuild
+++ b/arch/arc/include/asm/Kbuild
@@ -10,6 +10,7 @@ generic-y += fb.h
 generic-y += ftrace.h
 generic-y += hardirq.h
 generic-y += hw_irq.h
+generic-y += initrd.h
 generic-y += irq_regs.h
 generic-y += irq_work.h
 generic-y += kmap_types.h
diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index 1d66db9c9db5..b91d5b32e64f 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -4,6 +4,7 @@ generic-y += early_ioremap.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
+generic-y += initrd.h
 generic-y += irq_regs.h
 generic-y += kdebug.h
 generic-y += local.h
diff --git a/arch/c6x/include/asm/Kbuild b/arch/c6x/include/asm/Kbuild
index 33a2c94fed0d..9e14cf6e89b4 100644
--- a/arch/c6x/include/asm/Kbuild
+++ b/arch/c6x/include/asm/Kbuild
@@ -13,6 +13,7 @@ generic-y += extable.h
 generic-y += fb.h
 generic-y += futex.h
 generic-y += hw_irq.h
+generic-y += initrd.h
 generic-y += io.h
 generic-y += irq_regs.h
 generic-y += irq_work.h
diff --git a/arch/h8300/include/asm/Kbuild b/arch/h8300/include/asm/Kbuild
index a5d0b2991f47..7d4e06a757c8 100644
--- a/arch/h8300/include/asm/Kbuild
+++ b/arch/h8300/include/asm/Kbuild
@@ -19,6 +19,7 @@ generic-y += futex.h
 generic-y += hardirq.h
 generic-y += hash.h
 generic-y += hw_irq.h
+generic-y += initrd.h
 generic-y += irq_regs.h
 generic-y += irq_work.h
 generic-y += kdebug.h
diff --git a/arch/hexagon/include/asm/Kbuild b/arch/hexagon/include/asm/Kbuild
index 47c4da3d64a4..0be62abf2123 100644
--- a/arch/hexagon/include/asm/Kbuild
+++ b/arch/hexagon/include/asm/Kbuild
@@ -13,6 +13,7 @@ generic-y += fb.h
 generic-y += ftrace.h
 generic-y += hardirq.h
 generic-y += hw_irq.h
+generic-y += initrd.h
 generic-y += iomap.h
 generic-y += irq_regs.h
 generic-y += irq_work.h
diff --git a/arch/ia64/include/asm/Kbuild b/arch/ia64/include/asm/Kbuild
index 557bbc8ba9f5..1a1f1e4ba0d5 100644
--- a/arch/ia64/include/asm/Kbuild
+++ b/arch/ia64/include/asm/Kbuild
@@ -1,5 +1,6 @@
 generic-y += compat.h
 generic-y += exec.h
+generic-y += initrd.h
 generic-y += irq_work.h
 generic-y += mcs_spinlock.h
 generic-y += mm-arch-hooks.h
diff --git a/arch/m68k/include/asm/Kbuild b/arch/m68k/include/asm/Kbuild
index a4b8d3331a9e..9903551e0c9c 100644
--- a/arch/m68k/include/asm/Kbuild
+++ b/arch/m68k/include/asm/Kbuild
@@ -7,6 +7,7 @@ generic-y += exec.h
 generic-y += extable.h
 generic-y += futex.h
 generic-y += hw_irq.h
+generic-y += initrd.h
 generic-y += irq_regs.h
 generic-y += irq_work.h
 generic-y += kdebug.h
diff --git a/arch/microblaze/include/asm/Kbuild 
b/arch/microblaze/include/asm/Kbuild
index 569ba9e670c1..ec37e6304be5 100644
--- a/arch/microblaze/include/asm/Kbuild
+++ b/arch/microblaze/include/asm/Kbuild
@@ -11,6 +11,7 @@ generic-y += exec.h
 generic-y += extable.h
 generic-y += fb.h
 generic-y += hardirq.h
+generic-y += initrd.h
 generic-y += irq_regs.h
 generic-y += irq_work.h
 generic-y += kdebug.h
diff --git a/arch/mips/include/asm/Kbuild b/arch/mips/include/asm/Kbuild
index 9a81e72119

Re: [PATCH v2 0/2] arm64: Cut rebuild time when changing CONFIG_BLK_DEV_INITRD

2018-10-24 Thread Rob Herring
On Wed, Oct 24, 2018 at 2:33 PM Florian Fainelli  wrote:
>
> Hi all,
>
> While investigating why ARM64 required a ton of objects to be rebuilt
> when toggling CONFIG_BLK_DEV_INITRD, it became clear that this was
> because we define __early_init_dt_declare_initrd() differently and we do
> that in arch/arm64/include/asm/memory.h which gets included by a fair
> amount of other header files, and translation units as well.

I scratch my head sometimes as to why some config options rebuild so
much stuff. One down, ? to go. :)

> Changing the value of CONFIG_BLK_DEV_INITRD is a common thing with build
> systems that generate two kernels: one with the initramfs and one
> without. buildroot is one of these build systems, OpenWrt is also
> another one that does this.
>
> This patch series proposes adding an empty initrd.h to satisfy the need
> for drivers/of/fdt.c to unconditionally include that file, and moves the
> custom __early_init_dt_declare_initrd() definition away from
> asm/memory.h
>
> This cuts the number of objects rebuilds from 1920 down to 26, so a
> factor 73 approximately.
>
> Apologies for the long CC list, please let me know how you would go
> about merging that and if another approach would be preferable, e.g:
> introducing a CONFIG_ARCH_INITRD_BELOW_START_OK Kconfig option or
> something like that.

There may be a better way as of 4.20 because bootmem is now gone and
only memblock is used. This should unify what each arch needs to do
with initrd early. We need the physical address early for memblock
reserving. Then later on we need the virtual address to access the
initrd. Perhaps we should just change initrd_start and initrd_end to
physical addresses (or add 2 new variables would be less invasive and
allow for different translation than __va()). The sanity checks and
memblock reserve could also perhaps be moved to a common location.
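
To sketch the idea of keeping physical addresses early and translating late:
the block below is a user-space model only. struct model_initrd,
model_declare_initrd() and model_map_initrd() are hypothetical names, and the
translation is modelled as adding a fixed offset, which is an assumption and
not how every arch implements __va().

```c
/* Hypothetical record of the initrd location, physical and virtual. */
struct model_initrd {
	unsigned long long phys_start;	/* known early, from the DT */
	unsigned long long phys_size;
	unsigned long long virt_start;	/* filled in later */
	unsigned long long virt_end;
};

/*
 * Early step: only remember the physical range. This is all that a
 * memblock reservation would need, and no translation happens yet.
 */
static void model_declare_initrd(struct model_initrd *rd,
				 unsigned long long start,
				 unsigned long long end)
{
	rd->phys_start = start;
	rd->phys_size = end - start;
}

/*
 * Late step: translate to virtual addresses once the mapping is usable.
 * va_offset stands in for whatever translation the arch wants.
 */
static void model_map_initrd(struct model_initrd *rd,
			     unsigned long long va_offset)
{
	rd->virt_start = rd->phys_start + va_offset;
	rd->virt_end = rd->virt_start + rd->phys_size;
}
```

With that split, the early declare step would no longer need a per-arch
override, since the arch-specific part moves to the late translation.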

Alternatively, given arm64 is the only oddball, I'd be fine with an
"if (IS_ENABLED(CONFIG_ARM64))" condition in the default
__early_init_dt_declare_initrd as long as we have a path to removing
it like the above option.

Rob



Re: [PATCH v2 0/2] arm64: Cut rebuild time when changing CONFIG_BLK_DEV_INITRD

2018-10-24 Thread Florian Fainelli
On 10/24/18 12:55 PM, Rob Herring wrote:
> On Wed, Oct 24, 2018 at 2:33 PM Florian Fainelli  wrote:
>>
>> Hi all,
>>
>> While investigating why ARM64 required a ton of objects to be rebuilt
>> when toggling CONFIG_BLK_DEV_INITRD, it became clear that this was
>> because we define __early_init_dt_declare_initrd() differently and we do
>> that in arch/arm64/include/asm/memory.h which gets included by a fair
>> amount of other header files, and translation units as well.
> 
> I scratch my head sometimes as to why some config options rebuild so
> much stuff. One down, ? to go. :)
> 

This one was by far the most invasive one due to its include chain, but
yes, there would be many more that could be optimized.

>> Changing the value of CONFIG_BLK_DEV_INITRD is a common thing with build
>> systems that generate two kernels: one with the initramfs and one
>> without. buildroot is one of these build systems, OpenWrt is also
>> another one that does this.
>>
>> This patch series proposes adding an empty initrd.h to satisfy the need
>> for drivers/of/fdt.c to unconditionally include that file, and moves the
>> custom __early_init_dt_declare_initrd() definition away from
>> asm/memory.h
>>
>> This cuts the number of objects rebuilds from 1920 down to 26, so a
>> factor 73 approximately.
>>
>> Apologies for the long CC list, please let me know how you would go
>> about merging that and if another approach would be preferable, e.g:
>> introducing a CONFIG_ARCH_INITRD_BELOW_START_OK Kconfig option or
>> something like that.
> 
> There may be a better way as of 4.20 because bootmem is now gone and
> only memblock is used. This should unify what each arch needs to do
> with initrd early. We need the physical address early for memblock
> reserving. Then later on we need the virtual address to access the
> initrd. Perhaps we should just change initrd_start and initrd_end to
> physical addresses (or add 2 new variables would be less invasive and
> allow for different translation than __va()). The sanity checks and
> memblock reserve could also perhaps be moved to a common location.
> 
> Alternatively, given arm64 is the only oddball, I'd be fine with an
> "if (IS_ENABLED(CONFIG_ARM64))" condition in the default
> __early_init_dt_declare_initrd as long as we have a path to removing
> it like the above option.

OK, let me cook a patch doing that and meanwhile I will look at how much
work is involved to implement the above option you outlined, which also
sounds entirely reasonable.

Thanks!
-- 
Florian



Re: [PATCH v2 0/2] arm64: Cut rebuild time when changing CONFIG_BLK_DEV_INITRD

2018-10-24 Thread Rob Herring
On Wed, Oct 24, 2018 at 3:01 PM Florian Fainelli  wrote:
>
> On 10/24/18 12:55 PM, Rob Herring wrote:
> > On Wed, Oct 24, 2018 at 2:33 PM Florian Fainelli  
> > wrote:
> >>
> >> Hi all,
> >>
> >> While investigating why ARM64 required a ton of objects to be rebuilt
> >> when toggling CONFIG_BLK_DEV_INITRD, it became clear that this was
> >> because we define __early_init_dt_declare_initrd() differently and we do
> >> that in arch/arm64/include/asm/memory.h which gets included by a fair
> >> amount of other header files, and translation units as well.
> >
> > I scratch my head sometimes as to why some config options rebuild so
> > much stuff. One down, ? to go. :)
> >
>
> This one was by far the most invasive one due to its include chain, but
> yes, there would be many more that could be optimized.
>
> >> Changing the value of CONFIG_BLK_DEV_INITRD is a common thing with build
> >> systems that generate two kernels: one with the initramfs and one
> >> without. buildroot is one of these build systems, OpenWrt is also
> >> another one that does this.
> >>
> >> This patch series proposes adding an empty initrd.h to satisfy the need
> >> for drivers/of/fdt.c to unconditionally include that file, and moves the
> >> custom __early_init_dt_declare_initrd() definition away from
> >> asm/memory.h
> >>
> >> This cuts the number of objects rebuilds from 1920 down to 26, so a
> >> factor 73 approximately.
> >>
> >> Apologies for the long CC list, please let me know how you would go
> >> about merging that and if another approach would be preferable, e.g:
> >> introducing a CONFIG_ARCH_INITRD_BELOW_START_OK Kconfig option or
> >> something like that.
> >
> > There may be a better way as of 4.20 because bootmem is now gone and
> > only memblock is used. This should unify what each arch needs to do
> > with initrd early. We need the physical address early for memblock
> > reserving. Then later on we need the virtual address to access the
> > initrd. Perhaps we should just change initrd_start and initrd_end to
> > physical addresses (or add 2 new variables would be less invasive and
> > allow for different translation than __va()). The sanity checks and
> > memblock reserve could also perhaps be moved to a common location.
> >
> > Alternatively, given arm64 is the only oddball, I'd be fine with an
> > "if (IS_ENABLED(CONFIG_ARM64))" condition in the default
> > __early_init_dt_declare_initrd as long as we have a path to removing
> > it like the above option.
>
> OK, let me cook a patch doing that and meanwhile I will look at how much
> work is involved to implement the above option you outlined, which also
> sounds entirely reasonable.

BTW, I would suspect that initrd_below_start_ok being 1 is not okay
for most arches. I'm not sure how that would work. min_low_pfn is
typically based on the start of memory. arm64 is not even setting it.

Rob

> --
> Florian



Re: [PATCH 2/4] mm: speed up mremap by 500x on large regions (v2)

2018-10-24 Thread Joel Fernandes
On Wed, Oct 24, 2018 at 03:57:24PM +0300, Kirill A. Shutemov wrote:
> On Wed, Oct 24, 2018 at 10:57:33PM +1100, Balbir Singh wrote:
> > On Wed, Oct 24, 2018 at 01:12:56PM +0300, Kirill A. Shutemov wrote:
> > > On Fri, Oct 12, 2018 at 06:31:58PM -0700, Joel Fernandes (Google) wrote:
> > > > diff --git a/mm/mremap.c b/mm/mremap.c
> > > > index 9e68a02a52b1..2fd163cff406 100644
> > > > --- a/mm/mremap.c
> > > > +++ b/mm/mremap.c
> > > > @@ -191,6 +191,54 @@ static void move_ptes(struct vm_area_struct *vma, 
> > > > pmd_t *old_pmd,
> > > > drop_rmap_locks(vma);
> > > >  }
> > > >  
> > > > +static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long 
> > > > old_addr,
> > > > + unsigned long new_addr, unsigned long old_end,
> > > > + pmd_t *old_pmd, pmd_t *new_pmd, bool *need_flush)
> > > > +{
> > > > +   spinlock_t *old_ptl, *new_ptl;
> > > > +   struct mm_struct *mm = vma->vm_mm;
> > > > +
> > > > +   if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
> > > > +   || old_end - old_addr < PMD_SIZE)
> > > > +   return false;
> > > > +
> > > > +   /*
> > > > +* The destination pmd shouldn't be established, free_pgtables()
> > > > +* should have release it.
> > > > +*/
> > > > +   if (WARN_ON(!pmd_none(*new_pmd)))
> > > > +   return false;
> > > > +
> > > > +   /*
> > > > +* We don't have to worry about the ordering of src and dst
> > > > +* ptlocks because exclusive mmap_sem prevents deadlock.
> > > > +*/
> > > > +   old_ptl = pmd_lock(vma->vm_mm, old_pmd);
> > > > +   if (old_ptl) {
> > > 
> > > How can it ever be false?

Kirill,
It cannot, you are right. I'll remove the test.
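
To illustrate why the check is redundant, here is a minimal user-space sketch. It assumes pmd_lock() is simply "look up the lock pointer, take the lock, return the pointer". All mock_* names below are hypothetical stand-ins, not kernel APIs, and the model is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for spinlock_t and struct mm_struct. */
struct mock_spinlock { int locked; };
struct mock_mm { struct mock_spinlock page_table_lock; };

/* Models pmd_lockptr(): always yields the address of an existing lock. */
static struct mock_spinlock *mock_pmd_lockptr(struct mock_mm *mm)
{
	return &mm->page_table_lock;
}

/* Models pmd_lock(): take the lock, then hand back the same pointer. */
static struct mock_spinlock *mock_pmd_lock(struct mock_mm *mm)
{
	struct mock_spinlock *ptl = mock_pmd_lockptr(mm);

	assert(!ptl->locked);	/* single-threaded model: no contention */
	ptl->locked = 1;
	return ptl;		/* never NULL, so "if (old_ptl)" is always true */
}

/* Models spin_unlock() for the mock lock. */
static void mock_spin_unlock(struct mock_spinlock *ptl)
{
	ptl->locked = 0;
}
```

Since the returned pointer is the address of a lock that already exists, a NULL test on it can never fail; the caller only needs the pointer for the later unlock.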

By the way, there are new changes upstream by Linus which flush the TLB
before releasing the ptlock instead of after. I'm guessing that patch came
about because of reviews of this patch and someone spotted an issue in the
existing code :)

Anyway, the patch in question is:
eb66ae030829 ("mremap: properly flush TLB before releasing the page")

I need to rebase on top of that with appropriate modifications, but I worry
that it will hurt performance, since we have to flush at every PMD/PTE move
before releasing the ptlock. Whereas with my patch, the intention is to flush
only once, at the end of move_page_tables. When I tried to flush the TLB on
every PMD move, it was quite slow on my arm64 device [2].

Further observation [1] is, it seems like the move_huge_pmd and move_ptes code
is a bit suboptimal in the sense that we are acquiring and releasing the same
ptlock for a bunch of PMDs if the said PMDs are on the same page-table page,
right? Instead we can do better by acquiring and releasing the ptlock less
often.

I think this observation [1] and the frequent TLB flush issue [2] can be solved
by acquiring the ptlock once for a bunch of PMDs, moving them all, then flushing
the TLB and then releasing the ptlock, and then proceeding to do the same thing
for the PMDs in the next page-table page. What do you think?
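
A rough user-space model of the idea, only to compare how often the lock is taken and the TLB is flushed under each scheme. It assumes 512 PMD entries share one ptlock (one page-table page); the struct and function names are hypothetical, and nothing here touches real page tables.

```c
#include <assert.h>

/* Assumption: number of PMD entries covered by a single ptlock. */
#define ENTRIES_PER_TABLE 512UL

struct move_stats {
	unsigned long lock_acquisitions;
	unsigned long tlb_flushes;
	unsigned long entries_moved;
};

/* Per-entry scheme: lock, move one entry, flush, unlock. */
static struct move_stats move_per_entry(unsigned long nr_entries)
{
	struct move_stats s = { 0, 0, 0 };
	unsigned long i;

	for (i = 0; i < nr_entries; i++) {
		s.lock_acquisitions++;	/* take the ptlock */
		s.entries_moved++;	/* move a single entry */
		s.tlb_flushes++;	/* flush before dropping the lock */
	}
	return s;
}

/* Batched scheme: one lock and one flush per page-table page touched. */
static struct move_stats move_batched(unsigned long first_index,
				      unsigned long nr_entries)
{
	struct move_stats s = { 0, 0, 0 };
	unsigned long i = first_index;
	unsigned long end = first_index + nr_entries;

	while (i < end) {
		/* First index belonging to the next page-table page. */
		unsigned long table_end =
			(i / ENTRIES_PER_TABLE + 1) * ENTRIES_PER_TABLE;
		unsigned long batch_end = table_end < end ? table_end : end;

		assert(batch_end > i);		/* every batch makes progress */
		s.lock_acquisitions++;		/* take the ptlock once */
		s.entries_moved += batch_end - i; /* move the whole batch */
		s.tlb_flushes++;		/* one flush, still under the lock */
		i = batch_end;
	}
	return s;
}
```

In this model, moving 1024 entries costs 1024 lock/flush pairs with the per-entry scheme and 2 with batching, while the flush still happens before the lock is released.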

- Joel




Re: [PATCH 2/4] mm: speed up mremap by 500x on large regions (v2)

2018-10-24 Thread Joel Fernandes
On Wed, Oct 24, 2018 at 10:57:33PM +1100, Balbir Singh wrote:
[...]
> > > + pmd_t pmd;
> > > +
> > > + new_ptl = pmd_lockptr(mm, new_pmd);
> 
> 
> Looks like this is largely inspired by move_huge_pmd(), I guess a lot of
> the code applies, why not just reuse as much as possible? The same comments
> w.r.t mmap_sem helping protect against lock order issues applies as well.

I thought about this and when I looked into it, it seemed there are subtle
differences that make such sharing not worth it (or not possible).

 - Joel




Re: [PATCH 1/4] treewide: remove unused address argument from pte_alloc functions (v2)

2018-10-24 Thread Joel Fernandes
On Wed, Oct 24, 2018 at 10:37:16AM +0200, Peter Zijlstra wrote:
> On Fri, Oct 12, 2018 at 06:31:57PM -0700, Joel Fernandes (Google) wrote:
> > This series speeds up mremap(2) syscall by copying page tables at the
> > PMD level even for non-THP systems. There is concern that the extra
> > 'address' argument that mremap passes to pte_alloc may do something
> > subtle architecture related in the future that may make the scheme not
> > work.  Also we find that there is no point in passing the 'address' to
> > pte_alloc since its unused. So this patch therefore removes this
> > argument tree-wide resulting in a nice negative diff as well. Also
> > ensuring along the way that the enabled architectures do not do anything
> > funky with 'address' argument that goes unnoticed by the optimization.
> 
> Did you happen to look at the history of where that address argument
> came from? -- just being curious here. ISTR something vague about
> architectures having different paging structure for different memory
> ranges.

I didn't look into that history, but from reading the code, no architecture
is using it. Since it's unused in the kernel, maybe such architectures don't
exist or were removed, so we don't need to bother? Could you share more about
your concern with the removal of this argument?

thanks,

 - Joel

