On 27/11/15 14:13, Richard Biener wrote:

The following fixes the excessive peeling for gaps we do when doing
SLP now that I removed most of the restrictions on having gaps in
the first place.

This should make low-trip vectorized loops more efficient (something
the combine-epilogue-with-vectorized-body-by-masking patches also
claim to do).
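
For readers unfamiliar with the term, a load group with a gap might look
like the sketch below.  It is illustrative only: the function name and the
loop are assumptions, not taken from the patch or from the testcases named
in the ChangeLog.  A full-width vector load covering in[4*i] .. in[4*i+3]
in the last iteration would touch an element the scalar code never reads,
possibly past the end of the buffer, which is the kind of situation where
the vectorizer falls back to a scalar epilogue ("peeling for gaps").

```c
#include <stddef.h>

/* Hypothetical example, not part of the patch or the testsuite.
   Each iteration reads in[4*i], in[4*i+1] and in[4*i+2]; in[4*i+3] is
   never read, so the load group of size 4 has a one-element gap at
   its end.  */
void
copy_with_gap (int *restrict out, const int *restrict in, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      out[3 * i + 0] = in[4 * i + 0];
      out[3 * i + 1] = in[4 * i + 1];
      out[3 * i + 2] = in[4 * i + 2];
    }
}
```

With n == 2 the scalar loop reads in[0] .. in[6] only, so a caller may
legitimately pass a 7-element array.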

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-11-27  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/68559
        * tree-vect-data-refs.c (vect_analyze_group_access_1): Move
        peeling for gap checks ...
        * tree-vect-stmts.c (vectorizable_load): ... here and relax
        for SLP.
        * tree-vect-loop.c (vect_analyze_loop_2): Re-set
        LOOP_VINFO_PEELING_FOR_GAPS before re-trying without SLP.

        * gcc.dg/vect/slp-perm-4.c: Adjust again.
        * gcc.dg/vect/pr45752.c: Likewise.

Since this change, we have

FAIL: gcc.dg/vect/pr45752.c -flto -ffat-lto-objects scan-tree-dump-times vect "gaps requires scalar epilogue loop" 0
FAIL: gcc.dg/vect/pr45752.c scan-tree-dump-times vect "gaps requires scalar epilogue loop" 0

on aarch64 platforms (aarch64-none-linux-gnu, aarch64-none-elf, aarch64_be-none-elf).


Thanks, Alan
