http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50067
Richard Guenther <rguenth at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |spop at gcc dot gnu.org --- Comment #6 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-08-19 11:16:41 UTC --- I think the bug is that we have #(Data Ref: # bb: 3 # stmt: D.2736_10 = MEM[(int *)&a + 16B]; # ref: MEM[(int *)&a + 16B]; # base_object: a # Access function 0: 16B thus, an access function with a byte offset. I don't see how this couldn't happen with the 4.5 code-base and the indirect-ref handling as well. Thus, the code in dr_analyze_indices that pushes an access-function for if (nest && (INDIRECT_REF_P (aref) || TREE_CODE (aref) == MEM_REF)) { probably shouldn't do so if the base analysis op = TREE_OPERAND (aref, 0); access_fn = analyze_scalar_evolution (loop, op); access_fn = instantiate_scev (before_loop, loop, access_fn); isn't a POLYNOMIAL_CHREC. Because then we have no idea if the steps / element sizes are compatible. For example for int a[256] = { 0, 0, 0, 0, 7, 9, 0, 0, }; int main() { int i; for (i = 0; i < 256; ++i) { a[i] = (*((char(*)[8])&a[4]))[i]; } } we create Creating dr for MEM[(char[8] *)&a + 16B][i_12] analyze_innermost: success. base_address: &a offset from base address: 0 constant offset from base address: 16 step: 1 aligned to: 128 base_object: MEM[(char[8] *)&a] Creating dr for a[i_12] analyze_innermost: success. base_address: &a offset from base address: 0 constant offset from base address: 0 step: 4 aligned to: 128 base_object: a (compute_affine_dependence (stmt_a = D.2727_4 = MEM[(char[8] *)&a + 16B][i_12]; ) (stmt_b = a[i_12] = D.2728_5; ) ) (Data Dep: #(Data Ref: # bb: 3 # stmt: D.2727_4 = MEM[(char[8] *)&a + 16B][i_12]; # ref: MEM[(char[8] *)&a + 16B][i_12]; # base_object: MEM[(char[8] *)&a]; # Access function 0: {0, +, 1}_1 # Access function 1: 16B #) #(Data Ref: # bb: 3 # stmt: a[i_12] = D.2728_5; # ref: a[i_12]; # base_object: a # Access function 0: {0, +, 1}_1 #) (don't know) with a mismatched number of access functions. On the 4.5 branch we seem to do the same and we seem to rely on the mismatch in the number of access functions to say "don't know" for dependence analysis. Testcase: extern void abort (void); int a[6] = { 0, 0, 0, 0, 7, 0 }; int main () { int i; int (*p)[1] = &a[4]; for (i = 0; i < 4; ++i) { a[i + 1] = a[i + 2] > i; (*p)[i] &= ~1; } if (a[4] != 0) abort (); return 0; } #(Data Ref: # stmt: D.2727_9 = (*p_3)[i_19]; # ref: (*p_3)[i_19]; # base_object: (*(int[1] *) &a)[0]; # Access function 0: {0, +, 1}_1 # Access function 1: 16B so the new "issue" seems to be that with invariant DRs now allowed we have a spurious match in access functions, because with a non-indirect ref base we do not add a access function for the base (which would be simply 0B). Of course then still (Data Dep: #(Data Ref: # bb: 3 # stmt: D.2727_4 = MEM[(char[8] *)&a + 16B][i_12]; # ref: MEM[(char[8] *)&a + 16B][i_12]; # base_object: a # Access function 0: {0, +, 1}_1 # Access function 1: 16B #) #(Data Ref: # bb: 3 # stmt: a[i_12] = D.2728_5; # ref: a[i_12]; # base_object: a # Access function 0: {0, +, 1}_1 # Access function 1: 0 #) (no dependence) because the code assumes that access functions affect "independent" index spaces, if you look at int a[256] = { 0, 0, 0, 0, 7, 9, 0, 0, }; int main() { int i; for (i = 0; i < 256; ++i) { a[i] = (*((char(*)[8])&a[4]))[i]; } } so it seems that rather than adding a separate access function for the base pointer object (which could be dependent on i as well!) we shouldn't do that at all. Sebastian - do you remember anything about this code? I'd simply do Index: tree-data-ref.c =================================================================== --- tree-data-ref.c (revision 177891) +++ tree-data-ref.c (working copy) @@ -862,34 +862,6 @@ dr_analyze_indices (struct data_referenc aref = TREE_OPERAND (aref, 0); } - if (nest - && (INDIRECT_REF_P (aref) - || TREE_CODE (aref) == MEM_REF)) - { - op = TREE_OPERAND (aref, 0); - access_fn = analyze_scalar_evolution (loop, op); - access_fn = instantiate_scev (before_loop, loop, access_fn); - base = initial_condition (access_fn); - split_constant_offset (base, &base, &off); - if (TREE_CODE (aref) == MEM_REF) - off = size_binop (PLUS_EXPR, off, - fold_convert (ssizetype, TREE_OPERAND (aref, 1))); - access_fn = chrec_replace_initial_condition (access_fn, - fold_convert (TREE_TYPE (base), off)); - - TREE_OPERAND (aref, 0) = base; - VEC_safe_push (tree, heap, access_fns, access_fn); - } - - if (TREE_CODE (aref) == MEM_REF) - TREE_OPERAND (aref, 1) - = build_int_cst (TREE_TYPE (TREE_OPERAND (aref, 1)), 0); as that seems what is conservatively correct. Now that will loose any index analysis on pointers ... but with some obfuscation we can arbitrarily mix array and pointer-based code, so ... Relying on SCEV or access function count mismatches looks incredibly fragile ... For different array-views on the same array we also get bogus dependences: int a[256] = { 0, 0, 0, 0, 7, 9, 0, 0, }; int main() { int i; for (i = 0; i < 128; ++i) { a[i] = (*((char(*)[128])&a[0]))[i+128]; } } (Data Dep: #(Data Ref: # bb: 3 # stmt: D.2728_5 = MEM[(char[128] *)&a][D.2727_4]; # ref: MEM[(char[128] *)&a][D.2727_4]; # base_object: a # Access function 0: {128, +, 1}_1 #) #(Data Ref: # bb: 3 # stmt: a[i_13] = D.2729_6; # ref: a[i_13]; # base_object: a # Access function 0: {0, +, 1}_1 #) (no dependence) so what the access_fns assume is that if ever two base objects are the same then the access functions, in order of appearance in the vector, are compatible (iff there is the same number of access functions).