On 07/11, David Rientjes wrote:
>
> On Wed, 10 Jul 2013, Oleg Nesterov wrote:
>
> > +int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst)
> > +{
> > +   struct mempolicy *pol = mpol_dup(vma_policy(src));
> > +
> > +   if (IS_ERR(pol))
> > +           return PTR_ERR(pol);
>
> PTR_ERR() returns long, so vma_dup_policy() needs to return long.

I think "int" should be fine; otherwise we should fix IS_ERR/ERR_PTR. If
nothing else, the code this patch replaces did the same, and there are a
lot of other "int" functions which return PTR_ERR().

But I agree, this is only correct because vma_dup_policy() checks IS_ERR()
before PTR_ERR(), and because mpol_dup() doesn't do anything wrong with
ERR_PTR().

For example, ERR_PTR(args->err) in hw_breakpoint_handler() looks really
strange and imho should be killed. But it is correct, the value is not
actually an error.
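
A minimal userspace sketch of why the "int" return is safe once IS_ERR()
has been checked. The three helpers only approximate the ones in
include/linux/err.h; struct policy, dup_policy() and dup_policy_checked()
are hypothetical stand-ins for struct mempolicy, mpol_dup() and
vma_dup_policy(), not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace approximations of the helpers in include/linux/err.h. */
#define MAX_ERRNO 4095

static void *ERR_PTR(long error) { return (void *)error; }
static long PTR_ERR(const void *ptr) { return (long)ptr; }
static int IS_ERR(const void *ptr)
{
	/* Only the top MAX_ERRNO addresses encode an error. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct mempolicy. */
struct policy { int mode; };

/* Hypothetical stand-in for mpol_dup(): a copy, or ERR_PTR(-ENOMEM). */
static struct policy *dup_policy(const struct policy *src, int fail)
{
	static struct policy copy;

	if (fail)
		return ERR_PTR(-ENOMEM);
	copy = *src;
	return &copy;
}

/* Same shape as vma_dup_policy(): IS_ERR() first, then PTR_ERR() as int. */
int dup_policy_checked(const struct policy *src, struct policy **dst, int fail)
{
	struct policy *pol = dup_policy(src, fail);

	if (IS_ERR(pol))
		return (int)PTR_ERR(pol);	/* in [-MAX_ERRNO, -1], fits in int */
	*dst = pol;
	return 0;
}

/* Small self-check exercising both paths. */
void err_ptr_demo(void)
{
	struct policy src = { .mode = 1 };
	struct policy *dst = NULL;

	assert(dup_policy_checked(&src, &dst, 0) == 0 && dst->mode == 1);
	assert(dup_policy_checked(&src, &dst, 1) == -ENOMEM);
}
```

After a successful IS_ERR() the value is in [-MAX_ERRNO, -1], so nothing is
lost in the long to int conversion. Without that check an ordinary pointer
would be truncated.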

> > @@ -2505,12 +2504,9 @@ static int __split_vma(struct mm_struct * mm, struct vm_area_struct * vma,
> >             new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
> >     }
> >
> > -   pol = mpol_dup(vma_policy(vma));
> > -   if (IS_ERR(pol)) {
> > -           err = PTR_ERR(pol);
> > +   err = vma_dup_policy(vma, new);
> > +   if (err)
> >             goto out_free_vma;
> > -   }
> > -   vma_set_policy(new, pol);
> >
> >     if (anon_vma_clone(new, vma))
> >             goto out_free_mpol;
>
> This isn't the first occurrence in mm/mmap.c, what about vma_adjust()?
> Probably need to patch 3.10 or later.

Ah, sorry for the confusion, I forgot to mention that this is on top of
another -mm patch,

        mm-mempolicy-fix-mbind_range-vma_adjust-interaction.patch

attached below just in case.

> Otherwise looks good.

Thanks for review ;)

Oleg.

-----------------------------------------------------------------------
[PATCH] mm: mempolicy: fix mbind_range() && vma_adjust() interaction

vma_adjust() does vma_set_policy(vma, vma_policy(next)) and this
is doubly wrong:

1. This leaks vma->vm_policy if it is not NULL and not equal to
   next->vm_policy.

   This can happen if vma_merge() expands "area", not prev (case 8).

2. This sets the wrong policy if vma_merge() joins prev and area,
   area is the vma the caller needs to update and it still has the
   old policy.

Revert 1444f92c "mm: merging memory blocks resets mempolicy" which
introduced these problems.

Change mbind_range() to recheck mpol_equal() after vma_merge() to
fix the problem 1444f92c tried to address.

Signed-off-by: Oleg Nesterov <[email protected]>
Cc: <[email protected]>
---
 mm/mempolicy.c |    6 +++++-
 mm/mmap.c      |    2 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 7431001..4baf12e 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -732,7 +732,10 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
                if (prev) {
                        vma = prev;
                        next = vma->vm_next;
-                       continue;
+                       if (mpol_equal(vma_policy(vma), new_pol))
+                               continue;
+                       /* vma_merge() joined vma && vma->next, case 8 */
+                       goto replace;
                }
                if (vma->vm_start != vmstart) {
                        err = split_vma(vma->vm_mm, vma, vmstart, 1);
@@ -744,6 +747,7 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
                        if (err)
                                goto out;
                }
+ replace:
                err = vma_replace_policy(vma, new_pol);
                if (err)
                        goto out;
diff --git a/mm/mmap.c b/mm/mmap.c
index 7fe7f0b..42234b8 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -865,7 +865,7 @@ again:                      remove_next = 1 + (end > next->vm_end);
                if (next->anon_vma)
                        anon_vma_merge(vma, next);
                mm->map_count--;
-               vma_set_policy(vma, vma_policy(next));
+               mpol_put(vma_policy(next));
                kmem_cache_free(vm_area_cachep, next);
                /*
                 * In mprotect's case 6 (see comments on vma_merge),
-- 
1.5.5.1
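
The two problems listed in the changelog above can be modelled in userspace
roughly as follows. This is only a sketch: struct policy, struct area,
policy_put(), merge_buggy() and merge_fixed() are hypothetical, simplified
stand-ins for struct mempolicy, vm_area_struct, mpol_put() and the
vma_adjust() code before and after the patch, with a plain int as the
reference count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel structures. */
struct policy { int refcnt; };
struct area { struct policy *pol; };

/* Models mpol_put(): drop one reference, NULL is allowed. */
static void policy_put(struct policy *p)
{
	if (p)
		p->refcnt--;
}

/*
 * Models the old vma_set_policy(vma, vma_policy(next)): next's reference
 * moves to vma, but vma's previous policy is overwritten without a put,
 * so that reference can never be dropped (problem 1). vma also ends up
 * with next's policy instead of its own (problem 2).
 */
void merge_buggy(struct area *vma, struct area *next)
{
	vma->pol = next->pol;
	next->pol = NULL;	/* next is freed right after */
}

/* Models the patched code: vma keeps its policy, next's ref is dropped. */
void merge_fixed(struct area *vma, struct area *next)
{
	policy_put(next->pol);
	next->pol = NULL;	/* next is freed right after */
}

/* Small self-check of the fixed path. */
void merge_demo(void)
{
	struct policy a = { 1 }, b = { 1 };
	struct area vma = { &a }, next = { &b };

	merge_fixed(&vma, &next);
	assert(vma.pol == &a && a.refcnt == 1 && b.refcnt == 0);
}
```

In the buggy variant the first policy still has refcnt == 1 after the merge
although nothing points to it any more, which is the leak.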

