Hi, On Tue, Nov 27, 2012 at 02:02:42PM +0100, Richard Biener wrote: > On Wed, Nov 21, 2012 at 5:58 PM, Martin Jambor <mjam...@suse.cz> wrote: > > Hi, > > > > On Tue, Nov 20, 2012 at 09:24:20AM -0800, Richard Henderson wrote: > >> The get_pointer_alignment function can indicate that it does not know > >> what the alignment should be, and it always fills in worst-case values > >> for that case. We should not use these worst-case values to "optimize" > >> the interface of a function. > >> > >> At minimum I think something like the following would be good. But > >> I'm unsure why we would *ever* want to lower the alignment at a function > >> interface. It seems to me that we'd simply want the caller to handle > >> copying the data to an aligned location? > >> > >> What was the use case of this code in the first place? > > > > PR 52402, and not surprisingly, that testcase fails on x86_64-linux > > with your patch. Furthermore, misalignment due to MEM_REF offset has > > to be accounted for whether we know the alignment of base or not. > > > > I was hoping that we could do something along the lines of > > build_ref_for_offset tree-sra.c but that would not work for cases like > > the one from PR 52890, comment 7 when there is no dereference in the > > caller, we tranform: > > > > D.2537 = &this->surfaces; > > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 > > (D.2537, 1); > > > > into > > > > D.2537 = &this->surfaces; > > D.2554 = MEM[(const struct ggTrain *)D.2537]; > > sPtr.0 = ggTrain<mrSurface*>::_ZNK7ggTrainIP9mrSurfaceEixEi.isra.0 > > (D.2537, 1); > > > > At the moment I hope I'll be able to: > > > > 1) Bring back tree_non_aligned_mem_p from 4.6 to be used in > > access_precludes_ipa_sra_p. This way, any knowingly misaligned > > load in the callee should will not be transferred to the caller, if > > there is none. > > > > 2) In ipa_modify_call_arguments, be optimistic in like in your patch > > except for the case when we are looking at a dereference (i.e. we > > are turning a BLK dereference into a smaller scalar type one). In > > that case, use its alignment like in build_ref_for_offset. > > > > But I'd like to think about this a bit more. I believe we should ask > > for Richi's approval when he comes back and recovers anyway. But this > > is now my highest priority (finally). > > The issue here is that IPA SRA, when seeing > > foo (T *addr) > { > *addr; > } > > foo (&mem); > > wanting to transform it to > > foo' (T value) > { > value; > } > > value = *(T *)mem; > foo (value); > > has to use the _same_ alignment for the access to mem as it was used > inside foo. That's because of the fact that alignment information in GIMPLE > is solely present at memory accesses and _not_ in any way associated > with pointer types (as opposed to more strict rules imposed by some languages > such as C, often enough violated by users, *(char *)(int *)p is an access > aligned to at least 4 bytes in C). > > IPA SRA can use bigger alignment if it knows that is safe (thus > get_pointer_alignment returns something larger than what the actual > access in foo was using). What IPA SRA at the moment cannot do > is "propagate" larger alignment used in foo to the call site (AFAIK). > Thus, technically IPA SRA can use MAX (get_pointer_alignment (call site), > get_object_alignment (original dereference site)). > > For fixing pessimizations caused by IPA SRA I suggest to simply punt > (not do the transform) if get_pointer_alignment_1 returns false for the > actual call argument. Or implement the propagation.
the patch below punts if get_object_alignment on any dereference in the callee returns something smaller than the alignment of the type of the would-be replacement (and the type of the load we would introduce in the caller). This is essentially an implicit propagation of alignment from callees to callers. I have added a new testcase that checks this which, when not handled properly, fails on sparc64 and produces wrong code on arm (which I however only checked by looking at cross-compiler assembly output). Nevertheless, there is another case that must be taken care of (you already added a x86_64 testcase: gcc.dg/torture/pr52402.c) when there already is a dereference in the caller but it is a BLK-mode one and IPA-SRA changes it to a scalar load - this happens when an aggregate by-value parameter is turned into a scalar one. In that case, we should look at the dereference in the caller and derive alignment from there. I do not do MAX, I assume that when get_pointer_alignment returns true, it can be trusted. If it returns some smaller alignment than the alignment of the type (propagated from the callee), then I suppose there is an alignment attribute missing in the callee or some such mistake in the original program. So far I have successfully bootstrapped and tested this on x86_64-linux and have run parts of the testsuite on sparc64-linux and hppa-linux. Full bootstrap on sparc64-linux and ppc64-linux are underway. I'm going to add a testcase for PR 55448 that will scan x86_64 assembly output. What do you think? Thanks, Martin 2012-11-27 Martin Jambor <mjam...@suse.cz> PR middle-end/52890 PR tree-optimization/55415 PR tree-optimization/54386 PR target/55448 * ipa-prop.c (ipa_modify_call_arguments): Be optimistic when get_pointer_alignment_1 returns false and the base was not a dereference. * tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align, added check for required alignment. Update the user. * testsuite/gcc.dg/ipa/ipa-sra-7.c: New test. *** /tmp/5XB8rd_ipa-prop.c Tue Nov 27 21:34:54 2012 --- gcc/ipa-prop.c Tue Nov 27 17:04:20 2012 *************** ipa_modify_call_arguments (struct cgraph *** 2888,2893 **** --- 2888,2894 ---- { tree expr, base, off; location_t loc; + tree prev_base = NULL_TREE; /* We create a new parameter out of the value of the old one, we can do the following kind of transformations: *************** ipa_modify_call_arguments (struct cgraph *** 2919,2925 **** else { HOST_WIDE_INT base_offset; - tree prev_base; if (TREE_CODE (base) == ADDR_EXPR) base = TREE_OPERAND (base, 0); --- 2920,2925 ---- *************** ipa_modify_call_arguments (struct cgraph *** 2956,2962 **** unsigned int align; unsigned HOST_WIDE_INT misalign; ! get_pointer_alignment_1 (base, &align, &misalign); misalign += (tree_to_double_int (off) .sext (TYPE_PRECISION (TREE_TYPE (off))).low * BITS_PER_UNIT); --- 2956,2973 ---- unsigned int align; unsigned HOST_WIDE_INT misalign; ! if (!get_pointer_alignment_1 (base, &align, &misalign)) ! { ! if (prev_base ! && (TREE_CODE (prev_base) == MEM_REF ! || TREE_CODE (prev_base) == TARGET_MEM_REF)) ! { ! gcc_assert (misalign == 0); ! align = TYPE_ALIGN (TREE_TYPE (prev_base)); ! } ! else ! align = TYPE_ALIGN (type); ! } misalign += (tree_to_double_int (off) .sext (TYPE_PRECISION (TREE_TYPE (off))).low * BITS_PER_UNIT); *** /dev/null Thu Oct 25 13:49:30 2012 --- gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c Tue Nov 27 18:48:45 2012 *************** *** 0 **** --- 1,42 ---- + /* { dg-do run } */ + /* { dg-options "-O3" } */ + + typedef unsigned int myint __attribute__((aligned(1))); + + typedef struct __attribute__((packed)) S { + unsigned a, b, c; + } SS; + + typedef SS __attribute__((aligned(1))) SSS; + + + static unsigned int __attribute__ ((noinline)) + get_a (SSS *p) + { + return p->a; + }; + + static int __attribute__ ((noinline, noclone)) + foo (SS *p) + { + int r = (int) get_a(p) + 2; + return r; + } + + char buf[512]; + + static SSS * __attribute__ ((noinline, noclone)) + get_sss (void) + { + return (SSS *)(buf + 1); + } + + + int + main(int argc, char *argv[]) + { + SSS *p = get_sss(); + if (foo(p) != 2) + __builtin_abort (); + return 0; + } *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 *************** unmodified_by_ref_scalar_representative *** 3891,3902 **** return repr; } ! /* Return true iff this access precludes IPA-SRA of the parameter it is ! associated with. */ static bool ! access_precludes_ipa_sra_p (struct access *access) { /* Avoid issues such as the second simple testcase in PR 42025. The problem is incompatible assign in a call statement (and possibly even in asm statements). This can be relaxed by using a new temporary but only for --- 3891,3903 ---- return repr; } ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is ! associated with. REQ_ALIGN is the minimum required alignment. */ static bool ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) { + unsigned int exp_align; /* Avoid issues such as the second simple testcase in PR 42025. The problem is incompatible assign in a call statement (and possibly even in asm statements). This can be relaxed by using a new temporary but only for *************** access_precludes_ipa_sra_p (struct acces *** 3908,3913 **** --- 3909,3918 ---- || gimple_code (access->stmt) == GIMPLE_ASM)) return true; + exp_align = get_object_alignment (access->expr); + if (exp_align < req_align) + return true; + return false; } *************** splice_param_accesses (tree parm, bool * *** 3943,3949 **** tree a1_alias_type; access = (*access_vec)[i]; modification = access->write; ! if (access_precludes_ipa_sra_p (access)) return NULL; a1_alias_type = reference_alias_ptr_type (access->expr); --- 3948,3954 ---- tree a1_alias_type; access = (*access_vec)[i]; modification = access->write; ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) return NULL; a1_alias_type = reference_alias_ptr_type (access->expr); *************** splice_param_accesses (tree parm, bool * *** 3966,3972 **** else if (ac2->size != access->size) return NULL; ! if (access_precludes_ipa_sra_p (ac2) || (ac2->type != access->type && (TREE_ADDRESSABLE (ac2->type) || TREE_ADDRESSABLE (access->type))) --- 3971,3977 ---- else if (ac2->size != access->size) return NULL; ! if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type)) || (ac2->type != access->type && (TREE_ADDRESSABLE (ac2->type) || TREE_ADDRESSABLE (access->type)))