Hi,

On Tue, Nov 27, 2012 at 02:02:42PM +0100, Richard Biener wrote:
> On Wed, Nov 21, 2012 at 5:58 PM, Martin Jambor <mjam...@suse.cz> wrote:
> > Hi,
> >
> > On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote:
> >> The get_pointer_alignment function can indicate that it does not know
> >> what the alignment should be, and it always fills in worst-case values
> >> for that case.  We should not use these worst-case values to "optimize"
> >> the interface of a function.
> >>
> >> At minimum I think something like the following would be good.  But
> >> I'm unsure why we would *ever* want to lower the alignment at a function
> >> interface.  It seems to me that we'd simply want the caller to handle
> >> copying the data to an aligned location?
> >>
> >> What was the use case of this code in the first place?
> >
> > PR 52402, and not surprisingly, that testcase fails on x86_64-linux
> > with your patch.  Furthermore, misalignment due to MEM_REF offset has
> > to be accounted for whether we know the alignment of base or not.
> >
> > I was hoping that we could do something along the lines of
> > build_ref_for_offset tree-sra.c but that would not work for cases like
> > the one from PR 52890, comment 7 when there is no dereference in the
> > caller, we tranform:
> >
> >   D.2537 = &this->surfaces;
> >   sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 
> > (D.2537, 1);
> >
> > into
> >
> >   D.2537 = &this->surfaces;
> >   D.2554 = MEM[(const struct ggTrain *)D.2537];
> >   sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 
> > (D.2537, 1);
> >
> > At the moment I hope I'll be able to:
> >
> > 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in
> >    access_precludes_ipa_sra_p.  This way, any knowingly misaligned
> >    load in the callee should will not be transferred to the caller, if
> >    there is none.
> >
> > 2) In ipa_modify_call_arguments, be optimistic in like in your patch
> >    except for the case when we are looking at a dereference (i.e. we
> >    are turning a BLK dereference into a smaller scalar type one).  In
> >    that case, use its alignment like in build_ref_for_offset.
> >
> > But I'd like to think about this a bit more.  I believe we should ask
> > for Richi's approval when he comes back and recovers anyway.  But this
> > is now my highest priority (finally).
> 
> The issue here is that IPA SRA, when seeing
> 
> foo (T *addr)
> {
>   *addr;
> }
> 
>  foo (&mem);
> 
> wanting to transform it to
> 
> foo' (T value)
> {
>   value;
> }
> 
>  value = *(T *)mem;
>  foo (value);
> 
> has to use the _same_ alignment for the access to mem as it was used
> inside foo.  That's because of the fact that alignment information in GIMPLE
> is solely present at memory accesses and _not_ in any way associated
> with pointer types (as opposed to more strict rules imposed by some languages
> such as C, often enough violated by users, *(char *)(int *)p is an access
> aligned to at least 4 bytes in C).
> 
> IPA SRA can use bigger alignment if it knows that is safe (thus
> get_pointer_alignment returns something larger than what the actual
> access in foo was using).  What IPA SRA at the moment cannot do
> is "propagate" larger alignment used in foo to the call site (AFAIK).
> Thus, technically IPA SRA can use MAX (get_pointer_alignment (call site),
> get_object_alignment (original dereference site)).
> 
> For fixing pessimizations caused by IPA SRA I suggest to simply punt
> (not do the transform) if get_pointer_alignment_1 returns false for the
> actual call argument.  Or implement the propagation.

the patch below punts if get_object_alignment on any dereference in
the callee returns something smaller than the alignment of the type of
the would-be replacement (and the type of the load we would introduce
in the caller).  This is essentially an implicit propagation of
alignment from callees to callers.  I have added a new testcase that
checks this which, when not handled properly, fails on sparc64 and
produces wrong code on arm (which I however only checked by looking at
cross-compiler assembly output).

Nevertheless, there is another case that must be taken care of (you
already added a x86_64 testcase: gcc.dg/torture/pr52402.c) when there
already is a dereference in the caller but it is a BLK-mode one and
IPA-SRA changes it to a scalar load - this happens when an aggregate
by-value parameter is turned into a scalar one.  In that case, we
should look at the dereference in the caller and derive alignment from
there.

I do not do MAX, I assume that when get_pointer_alignment returns
true, it can be trusted.  If it returns some smaller alignment than
the alignment of the type (propagated from the callee), then I suppose
there is an alignment attribute missing in the callee or some such
mistake in the original program.

So far I have successfully bootstrapped and tested this on
x86_64-linux and have run parts of the testsuite on sparc64-linux and
hppa-linux.  Full bootstrap on sparc64-linux and ppc64-linux are
underway.

I'm going to add a testcase for PR 55448 that will scan x86_64
assembly output.

What do you think?

Thanks,

Martin


2012-11-27  Martin Jambor  <mjam...@suse.cz>

        PR middle-end/52890
        PR tree-optimization/55415
        PR tree-optimization/54386
        PR target/55448
        * ipa-prop.c (ipa_modify_call_arguments): Be optimistic when
        get_pointer_alignment_1 returns false and the base was not a
        dereference.
        * tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align,
        added check for required alignment.  Update the user.

        * testsuite/gcc.dg/ipa/ipa-sra-7.c: New test.

*** /tmp/5XB8rd_ipa-prop.c      Tue Nov 27 21:34:54 2012
--- gcc/ipa-prop.c      Tue Nov 27 17:04:20 2012
*************** ipa_modify_call_arguments (struct cgraph
*** 2888,2893 ****
--- 2888,2894 ----
        {
          tree expr, base, off;
          location_t loc;
+         tree prev_base = NULL_TREE;
  
          /* We create a new parameter out of the value of the old one, we can
             do the following kind of transformations:
*************** ipa_modify_call_arguments (struct cgraph
*** 2919,2925 ****
          else
            {
              HOST_WIDE_INT base_offset;
-             tree prev_base;
  
              if (TREE_CODE (base) == ADDR_EXPR)
                base = TREE_OPERAND (base, 0);
--- 2920,2925 ----
*************** ipa_modify_call_arguments (struct cgraph
*** 2956,2962 ****
              unsigned int align;
              unsigned HOST_WIDE_INT misalign;
  
!             get_pointer_alignment_1 (base, &align, &misalign);
              misalign += (tree_to_double_int (off)
                           .sext (TYPE_PRECISION (TREE_TYPE (off))).low
                           * BITS_PER_UNIT);
--- 2956,2973 ----
              unsigned int align;
              unsigned HOST_WIDE_INT misalign;
  
!             if (!get_pointer_alignment_1 (base, &align, &misalign))
!               {
!                 if (prev_base
!                     && (TREE_CODE (prev_base) == MEM_REF
!                         || TREE_CODE (prev_base) == TARGET_MEM_REF))
!                   {
!                     gcc_assert (misalign == 0);
!                     align = TYPE_ALIGN (TREE_TYPE (prev_base));
!                   }
!                 else
!                   align = TYPE_ALIGN (type);
!               }
              misalign += (tree_to_double_int (off)
                           .sext (TYPE_PRECISION (TREE_TYPE (off))).low
                           * BITS_PER_UNIT);
*** /dev/null   Thu Oct 25 13:49:30 2012
--- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c        Tue Nov 27 18:48:45 2012
***************
*** 0 ****
--- 1,42 ----
+ /* { dg-do run } */
+ /* { dg-options "-O3" } */
+ 
+ typedef unsigned int myint __attribute__((aligned(1)));
+ 
+ typedef struct __attribute__((packed)) S {
+   unsigned a, b, c;
+ } SS;
+ 
+ typedef SS __attribute__((aligned(1))) SSS;
+ 
+ 
+ static unsigned int __attribute__ ((noinline))
+ get_a (SSS *p)
+ {
+   return p->a;
+ };
+ 
+ static int __attribute__ ((noinline, noclone))
+ foo (SS *p)
+ {
+   int r = (int) get_a(p) + 2;
+   return r;
+ }
+ 
+ char buf[512];
+ 
+ static SSS * __attribute__ ((noinline, noclone))
+ get_sss (void)
+ {
+   return (SSS *)(buf + 1);
+ }
+ 
+ 
+ int
+ main(int argc, char *argv[])
+ {
+   SSS *p = get_sss();
+   if (foo(p) != 2)
+     __builtin_abort ();
+   return 0;
+ }
*** /tmp/Hp0Wyc_tree-sra.c      Tue Nov 27 21:34:54 2012
--- gcc/tree-sra.c      Tue Nov 27 21:28:53 2012
*************** unmodified_by_ref_scalar_representative
*** 3891,3902 ****
    return repr;
  }
  
! /* Return true iff this access precludes IPA-SRA of the parameter it is
!    associated with. */
  
  static bool
! access_precludes_ipa_sra_p (struct access *access)
  {
    /* Avoid issues such as the second simple testcase in PR 42025.  The problem
       is incompatible assign in a call statement (and possibly even in asm
       statements).  This can be relaxed by using a new temporary but only for
--- 3891,3903 ----
    return repr;
  }
  
! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is
!    associated with.  REQ_ALIGN is the minimum required alignment.  */
  
  static bool
! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align)
  {
+   unsigned int exp_align;
    /* Avoid issues such as the second simple testcase in PR 42025.  The problem
       is incompatible assign in a call statement (and possibly even in asm
       statements).  This can be relaxed by using a new temporary but only for
*************** access_precludes_ipa_sra_p (struct acces
*** 3908,3913 ****
--- 3909,3918 ----
          || gimple_code (access->stmt) == GIMPLE_ASM))
      return true;
  
+   exp_align = get_object_alignment (access->expr);
+   if (exp_align < req_align)
+     return true;
+ 
    return false;
  }
  
*************** splice_param_accesses (tree parm, bool *
*** 3943,3949 ****
        tree a1_alias_type;
        access = (*access_vec)[i];
        modification = access->write;
!       if (access_precludes_ipa_sra_p (access))
        return NULL;
        a1_alias_type = reference_alias_ptr_type (access->expr);
  
--- 3948,3954 ----
        tree a1_alias_type;
        access = (*access_vec)[i];
        modification = access->write;
!       if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type)))
        return NULL;
        a1_alias_type = reference_alias_ptr_type (access->expr);
  
*************** splice_param_accesses (tree parm, bool *
*** 3966,3972 ****
          else if (ac2->size != access->size)
            return NULL;
  
!         if (access_precludes_ipa_sra_p (ac2)
              || (ac2->type != access->type
                  && (TREE_ADDRESSABLE (ac2->type)
                      || TREE_ADDRESSABLE (access->type)))
--- 3971,3977 ----
          else if (ac2->size != access->size)
            return NULL;
  
!         if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type))
              || (ac2->type != access->type
                  && (TREE_ADDRESSABLE (ac2->type)
                      || TREE_ADDRESSABLE (access->type)))

Reply via email to