http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60042
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
With some more dumping I see
himenobmtxpa.c:296:9: note: === vect_prune_runtime_alias_test_list ===
himenobmtxpa.c:296:9: note: merging ranges for *_205, *_324 and *_49, *_324
himenobmtxpa.c:296:9: note: merging ranges for *_205, *_324 and *_192, *_324
himenobmtxpa.c:296:9: note: merging ranges for *_168, *_324 and *_69, *_324
himenobmtxpa.c:296:9: note: merging ranges for *_168, *_324 and *_154, *_324
himenobmtxpa.c:296:9: note: merging ranges for *_265, *_324 and *_296, *_324
himenobmtxpa.c:296:9: note: merging ranges for *_265, *_324 and *_89, *_324
himenobmtxpa.c:296:9: note: merging ranges for *_174, *_324 and *_248, *_324
himenobmtxpa.c:296:9: note: merging ranges for *_174, *_324 and *_161, *_324
himenobmtxpa.c:296:9: note: merging ranges for *_211, *_324 and *_231, *_324
himenobmtxpa.c:296:9: note: merging ranges for *_211, *_324 and *_199, *_324
himenobmtxpa.c:296:9: note: improved number of alias checks from 31 to 21
and
Creating dr for *_205
analyze_innermost: success.
base_address: pretmp_1004 + (sizetype) ((long unsigned int) pretmp_1009 * 4)
offset from base address: 0
constant offset from base address: 0
step: 4
aligned to: 128
base_object: *pretmp_1004 + (sizetype) ((long unsigned int) pretmp_1009 * 4)
Access function 0: {0B, +, 4}_7
Creating dr for *_168
analyze_innermost: success.
base_address: pretmp_1004 + (sizetype) ((long unsigned int) pretmp_1023 * 4)
offset from base address: 0
constant offset from base address: 0
step: 4
aligned to: 128
base_object: *pretmp_1004 + (sizetype) ((long unsigned int) pretmp_1023 * 4)
Access function 0: {0B, +, 4}_7
Creating dr for *_265
analyze_innermost: success.
base_address: pretmp_1004 + (sizetype) ((long unsigned int) pretmp_1034 * 4)
offset from base address: 0
constant offset from base address: 0
step: 4
aligned to: 128
base_object: *pretmp_1004 + (sizetype) ((long unsigned int) pretmp_1034 * 4)
Access function 0: {0B, +, 4}_7
Creating dr for *_174
analyze_innermost: success.
base_address: pretmp_1004 + (sizetype) ((long unsigned int) pretmp_1063 * 4)
offset from base address: 0
constant offset from base address: 0
step: 4
aligned to: 128
base_object: *pretmp_1004 + (sizetype) ((long unsigned int) pretmp_1063 * 4)
Access function 0: {0B, +, 4}_7
...
so the remaining DDRs against *_324 all look related.
pretmp_1062 = pretmp_1020 + pretmp_1047;
pretmp_1063 = _25 * pretmp_1062;
pretmp_1033 = j_380 + pretmp_1020;
pretmp_1034 = _25 * pretmp_1033;
pretmp_1022 = pretmp_1020 + pretmp_1021;
pretmp_1023 = _25 * pretmp_1022;
but SCEV doesn't expand stmts before the loop and thus doesn't see this.
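To make the "look related" point concrete, here is a small sketch of what the preheader statements imply for the base addresses. The struct and function names are hypothetical, and signed `long` is used instead of the unsigned arithmetic in the dump to keep it simple. The point is only that the common term pretmp_1020 cancels, so two bases differ by 4 * _25 times a difference of loop-invariant values.

```c
#include <assert.h>

/* Hypothetical stand-ins for the SSA names in the dump.  */
struct preheader_values {
    long stride;   /* _25 */
    long common;   /* pretmp_1020 */
    long j;        /* j_380 */
    long t1021;    /* pretmp_1021 */
    long t1047;    /* pretmp_1047 */
};

/* Element offsets from pretmp_1004, mirroring the preheader statements.  */
static long elt_1063(const struct preheader_values *v)
{
    return v->stride * (v->common + v->t1047);
}

static long elt_1034(const struct preheader_values *v)
{
    return v->stride * (v->j + v->common);
}

static long elt_1023(const struct preheader_values *v)
{
    return v->stride * (v->common + v->t1021);
}

/* Byte distance between the bases of *_174 and *_265.
   Algebraically this is 4 * _25 * (pretmp_1047 - j_380).  */
static long byte_distance_174_265(const struct preheader_values *v)
{
    assert(v->stride > 0);
    return 4 * (elt_1063(v) - elt_1034(v));
}

/* Byte distance between the bases of *_265 and *_168.
   Algebraically this is 4 * _25 * (j_380 - pretmp_1021).  */
static long byte_distance_265_168(const struct preheader_values *v)
{
    assert(v->stride > 0);
    return 4 * (elt_1034(v) - elt_1023(v));
}
```

Whether the differences are compile-time constants depends on how pretmp_1021 and pretmp_1047 relate to j_380, which is not visible in the dump above.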
It's obviously far from trivial to merge segments with symbolic start
addresses ... these are multi-dimensional accesses:
for(k=1 ; k<kmax ; k++){
  s0= MR(a,0,i,j,k)*MR(p,0,i+1,j,  k)
    + MR(a,1,i,j,k)*MR(p,0,i,  j+1,k)
    + MR(a,2,i,j,k)*MR(p,0,i,  j,  k+1)
    + MR(b,0,i,j,k)
     *( MR(p,0,i+1,j+1,k) - MR(p,0,i+1,j-1,k)
      - MR(p,0,i-1,j+1,k) + MR(p,0,i-1,j-1,k) )
    + MR(b,1,i,j,k)
     *( MR(p,0,i,j+1,k+1) - MR(p,0,i,j-1,k+1)
      - MR(p,0,i,j+1,k-1) + MR(p,0,i,j-1,k-1) )
    + MR(b,2,i,j,k)
     *( MR(p,0,i+1,j,k+1) - MR(p,0,i-1,j,k+1)
      - MR(p,0,i+1,j,k-1) + MR(p,0,i-1,j,k-1) )
    + MR(c,0,i,j,k) * MR(p,0,i-1,j,  k)
    + MR(c,1,i,j,k) * MR(p,0,i,  j-1,k)
    + MR(c,2,i,j,k) * MR(p,0,i,  j,  k-1)
    + MR(wrk1,0,i,j,k);
  ss= (s0*MR(a,3,i,j,k) - MR(p,0,i,j,k))*MR(bnd,0,i,j,k);
  gosa+= ss*ss;
  MR(wrk2,0,i,j,k)= MR(p,0,i,j,k) + omega*ss;
}
and AFAIK we manage to merge the accesses that differ only by +-1 in the
fastest-varying dimension, but not, for example, the ones for
MR(p,0,i+1,j+1,k) and MR(p,0,i+1,j-1,k).
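A rough sketch of why the two cases differ, assuming MR linearizes its indices in row-major order with the last index fastest. The struct and field names below are assumptions for illustration, not taken from this report. Accesses that differ only in k are a constant number of elements apart, while accesses that differ in j are apart by a multiple of the runtime extent of the last dimension.

```c
#include <assert.h>

/* Assumed matrix extents; the field names are hypothetical.  */
struct mat_dims {
    long mnums;  /* number of sub-matrices (first MR index) */
    long mrows;  /* extent of the i dimension */
    long mcols;  /* extent of the j dimension */
    long mdeps;  /* extent of the k dimension, fastest varying */
};

/* Assumed row-major linearization of MR(m, n, r, c, k).  */
static long mr_index(const struct mat_dims *d, long n, long r, long c, long k)
{
    assert(d->mrows > 0 && d->mcols > 0 && d->mdeps > 0);
    return ((n * d->mrows + r) * d->mcols + c) * d->mdeps + k;
}

/* Distance in elements between MR(p,0,i,j,k+1) and MR(p,0,i,j,k-1):
   a compile-time constant (2), so the segments are easy to merge.  */
static long dist_k(const struct mat_dims *d, long i, long j, long k)
{
    return mr_index(d, 0, i, j, k + 1) - mr_index(d, 0, i, j, k - 1);
}

/* Distance in elements between MR(p,0,i+1,j+1,k) and MR(p,0,i+1,j-1,k):
   2 * mdeps, which is only known at run time.  */
static long dist_j(const struct mat_dims *d, long i, long j, long k)
{
    return mr_index(d, 0, i + 1, j + 1, k) - mr_index(d, 0, i + 1, j - 1, k);
}
```

Under this assumed layout the second distance is symbolic but still loop invariant in the k loop, which is what makes a merged check conceivable in principle.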
Ideally we would be able to derive a single check for each array
(which would require analyzing the DRs in the outer loops as well to
gather info about the other dimensions).
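As a sketch of what a single check per array could look like at run time: take a conservative hull over all segments accessed through one array and test that hull once against the written segment. This is not how the vectorizer represents segments; the type and function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open byte range [lo, hi) touched by one access over the loop.  */
struct byte_range {
    uintptr_t lo;
    uintptr_t hi;
};

/* Smallest range covering both a and b.  */
static struct byte_range range_hull(struct byte_range a, struct byte_range b)
{
    struct byte_range h;
    h.lo = a.lo < b.lo ? a.lo : b.lo;
    h.hi = a.hi > b.hi ? a.hi : b.hi;
    return h;
}

/* Conservative hull of n ranges, n > 0.  */
static struct byte_range hull_of(const struct byte_range *r, int n)
{
    assert(n > 0);
    struct byte_range h = r[0];
    for (int i = 1; i < n; i++)
        h = range_hull(h, r[i]);
    return h;
}

/* True if the two ranges share no byte.  */
static bool ranges_disjoint(struct byte_range a, struct byte_range b)
{
    return a.hi <= b.lo || b.hi <= a.lo;
}
```

The hull is conservative: if the store lands in a gap between two read segments, the single check fails even though every pairwise check would pass, so the loop would take the unvectorized path more often than strictly necessary.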