On 10/28/2015 11:45 AM, Yuri Rumyantsev wrote:
> Hi All,
>
> Here is a preliminary patch to combine vectorized loop with its scalar
> remainder, a draft of which was proposed by Kirill Yukhin a month ago:
> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01435.html
> It was tested with the '-mavx2' option to run on a Haswell processor.
> The main goal of it is to improve performance of vectorized loops for AVX512.
Ought this really be enabled for avx2? While it's nice to be able to use the
normal vcond patterns so that this can be tested on current hardware, I have
trouble imagining that it's an improvement without the real masked operations.
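For anyone following along, here is a rough scalar C sketch of what "combining the vector loop with its remainder" means: every iteration covers a full vector's worth of lanes, and a per-lane mask (lane index < n) switches off the lanes that run past the end, so no separate scalar tail loop is needed. The name foo_masked_sketch, the VF constant and the mask array are made up for illustration; this is not the code the patch generates, only a model of the idea.

```c
#define VF 8  /* assumed vectorization factor: 8 floats per 256-bit vector */

/* Scalar emulation of a vector loop whose tail is handled by a mask
   rather than by a separate scalar remainder loop.  Hypothetical helper,
   not taken from the patch.  */
void foo_masked_sketch(float *a, const float *b, int n)
{
    for (int i = 0; i < n; i += VF) {
        int mask[VF];

        /* Models the vector comparison "induction variable < bound".  */
        for (int lane = 0; lane < VF; ++lane)
            mask[lane] = (i + lane) < n;

        /* Masked load/add/store: inactive lanes touch no memory.  */
        for (int lane = 0; lane < VF; ++lane)
            if (mask[lane])
                a[i + lane] += b[i + lane];
    }
}
```

With real masked loads and stores the inner lane loops collapse into single instructions; without them the mask has to be emulated with compare-and-select sequences, which is the cost I'm worried about on avx2.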
I tried to have a look myself at what kind of output we'd be getting, but the
very first test I tried produced an ICE:
void foo(float *a, float *b, int n)
{
  int i;
  for (i = 0; i < n; ++i)
    a[i] += b[i];
}
$ ./cc1 -O3 -mavx2 z.c
foo
Analyzing compilation unit
Performing interprocedural optimizations
<*free_lang_data> <visibility> <build_ssa_passes> <opt_local_passes>
<free-inline-summary> <whole-program> <profile_estimate> <icf> <devirt> <cp>
<targetclone> <inline> <pure-const> <static-var> <single-use>
<comdats>Assembling functions:
<dispachercalls> foo
z.c: In function ‘foo’:
z.c:1:6: error: bogus comparison result type
void foo(float *a, float *b, int n)
^
vector(8) signed int
vect_vec_mask_.24_116 = vect_vec_iv_.22_112 < vect_cst_.23_115;
z.c:1:6: internal compiler error: verify_gimple failed
0xb20d17 verify_gimple_in_cfg(function*, bool)
../../git-master/gcc/tree-cfg.c:5082
0xa16d77 execute_function_todo
../../git-master/gcc/passes.c:1940
0xa1769b execute_todo
../../git-master/gcc/passes.c:1995
Please submit a full bug report,
r~