On Mon, May 8, 2017 at 6:11 PM, Robin Dapp <rd...@linux.vnet.ibm.com> wrote:
>> So the new part is the last point? There's a lot of refactoring in 3/3 that
>> makes it hard to see what is actually changed ... you need to resist
>> in doing this, it makes review very hard.
>
> The new part is actually spread across the three last "-"s. Attached is
> a new version of [3/3] split up into two patches with hopefully less
> blending of refactoring and new functionality.
>
> [3/4] Computes full costs when peeling for unknown alignment, uses
> either read or write and compares the better one with the peeling costs
> for known alignment. If the peeling for unknown alignment "aligns" more
> than twice the number of datarefs, it is preferred over the peeling for
> known alignment.
>
> [4/4] Computes the costs for no peeling and compares them with the costs
> of the best peeling so far. If it is not more expensive, no peeling
> will be performed.
>
>> I think it's always best to align a ref with known alignment as that simplifies
>> conditions and allows followup optimizations (unrolling of the
>> prologue / epilogue).
>> I think for this it's better to also compute full costs rather than relying on
>> sth as simple as "number of same aligned refs".
>>
>> Does the code ever end up misaligning a previously known aligned ref?
>
> The following case used to get aligned via the known alignment of dd but
> would not anymore since peeling for unknown alignment aligns two
> accesses. I guess the determining factor is still up for scrutiny and
> should probably > 2. Still, on e.g. s390x no peeling is performed due
> to costs.

Ok, in principle this makes sense if we manage to correctly compute
the costs. What exactly is profitable or not is of course subject to
the target costs.

Richard.

> void foo(int *restrict a, int *restrict b, int *restrict c,
>          int *restrict d, unsigned int n)
> {
>   int *restrict dd = __builtin_assume_aligned (d, 8);
>   for (unsigned int i = 0; i < n; i++)
>     {
>       b[i] = b[i] + a[i];
>       c[i] = c[i] + b[i];
>       dd[i] = a[i];
>     }
> }
>
> Regards
>  Robin
>
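
For readers following the thread, the decision order described for [3/4] and [4/4] can be sketched roughly as below. This is only one reading of the prose, not code from the patches: struct peel_option, enum peel_choice, choose_peeling and the factor parameter are hypothetical names, and the strict "more than factor times" comparison is an assumption, since the thread itself says the determining factor is still up for scrutiny.

```c
#include <assert.h>

/* Hypothetical summary of one peeling option; not a GCC data structure.  */
struct peel_option
{
  unsigned aligned_refs; /* data references aligned by this option */
  unsigned cost;         /* estimated total cost with this option */
};

enum peel_choice { PEEL_NONE, PEEL_KNOWN, PEEL_UNKNOWN };

/* Assumed reading of the description:
   1. take the cheaper of the read and write unknown-alignment options,
   2. prefer it over known alignment when it aligns more than FACTOR
      times as many data references,
   3. drop peeling altogether if not peeling is not more expensive.  */
static enum peel_choice
choose_peeling (struct peel_option known, struct peel_option unknown_read,
                struct peel_option unknown_write, unsigned no_peel_cost,
                unsigned factor)
{
  assert (factor > 0);

  struct peel_option unknown
    = unknown_read.cost <= unknown_write.cost ? unknown_read : unknown_write;

  enum peel_choice best = PEEL_KNOWN;
  struct peel_option best_opt = known;
  if (unknown.aligned_refs > factor * known.aligned_refs)
    {
      best = PEEL_UNKNOWN;
      best_opt = unknown;
    }

  /* "If it is not more expensive, no peeling will be performed."  */
  if (no_peel_cost <= best_opt.cost)
    return PEEL_NONE;
  return best;
}
```

With factor set to 2 this mirrors the "more than twice" wording; the cost values themselves would come from the target cost model, which the sketch does not attempt to model.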