
Alan Lawrence Thu, 18 Sep 2014 04:41:20 -0700

The end goal here is to remove this code from tree-vect-loop.c (vect_create_epilog_for_reduction):


      if (BYTES_BIG_ENDIAN)
        bitpos = size_binop (MULT_EXPR,
                             bitsize_int (TYPE_VECTOR_SUBPARTS (vectype) - 1),
                             TYPE_SIZE (scalar_type));
      else

as this is the root cause of PR/61114 (see testcase there, failing on all bigendian targets supporting reduc_[us]plus_optab). Quoting Richard Biener, "all code conditional on BYTES/WORDS_BIG_ENDIAN in tree-vect* is suspicious". The code snippet above is used on two paths:

(Path 1) (patches 1-6) Reductions using REDUC_(PLUS|MIN|MAX)_EXPR = reduc_[us](plus|min|max)_optab. The optab is documented as "the scalar result is stored in the least significant bits of operand 0", but the tree code as "the first element in the vector holding the result of the reduction of all elements of the operand". This mismatch means that when the tree code is folded, the code snippet above reads the result from the wrong end of the vector.
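To make the mismatch concrete, here is a minimal standalone C sketch; it is not GCC code. The names model_reduc_plus and model_extract, the 4 x 32-bit vector shape, the zeroing of the non-result elements, and the use of bit position 0 on little-endian are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

enum { NUNITS = 4, ELT_BITS = 32 };

/* Hypothetical model of the tree-code wording: the reduction result is
   placed in the first element of the vector.  Zeroing the remaining
   elements is an assumption made only for this sketch.  */
static void
model_reduc_plus (const int in[NUNITS], int out[NUNITS])
{
  int sum = 0;
  for (size_t i = 0; i < NUNITS; i++)
    {
      sum += in[i];
      out[i] = 0;
    }
  out[0] = sum;
}

/* Hypothetical model of the epilogue extraction: on big-endian the
   bit position is (NUNITS - 1) * element size, as in the snippet
   quoted above; position 0 for little-endian is assumed here.  */
static int
model_extract (const int v[NUNITS], int bytes_big_endian)
{
  size_t bitpos = bytes_big_endian ? (size_t) (NUNITS - 1) * ELT_BITS : 0;
  assert (bitpos % ELT_BITS == 0);
  return v[bitpos / ELT_BITS];
}
```

Under these assumptions, reducing {1, 2, 3, 4} puts 10 in element 0; the little-endian extraction reads it back, while the big-endian extraction reads the last element and misses it.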

The strategy (as per https://gcc.gnu.org/ml/gcc-patches/2014-08/msg00041.html) is to define new tree codes and optabs that produce scalar results directly; this seems better than tying (the element of the vector into which the result is placed) to (the endianness of the target), and avoids generating extra moves on current bigendian targets. However, the previous optabs are retained for now as a migration strategy so as not to break existing backends; moving individual platforms over will follow.

A complication here is on AArch64, where we directly generate REDUC_PLUS_EXPRs from intrinsics in gimple_fold_builtin; I temporarily remove this folding in order to decouple the midend and AArch64 backend.

(Path 2) (patches 7-13) Reductions using whole-vector-shifts, i.e. VEC_RSHIFT_EXPR and vec_shr_optab. Here the tree code as well as the optab is defined in an endianness-dependent way, leading to significant complication in fold-const.c. (Moreover, the "equivalent" vec_shl_optab is never used!). Few platforms appear to handle vec_shr_optab (and fewer bigendian - I see only PowerPC and MIPS), so it seems pertinent to change the existing optab to be endianness-neutral.
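As a rough illustration of a reduction by whole-vector shifts, the following standalone C sketch models one possible endianness-neutral reading, in which the shift always moves elements towards element 0 and fills with zeros. That reading, the names model_vec_shr and model_shift_reduce_plus, and the 4-element vector are assumptions for this sketch, not the text of the patches.

```c
#include <assert.h>
#include <stddef.h>

enum { VEC_ELTS = 4 };

/* Hypothetical endianness-neutral whole-vector shift: move every
   element K positions towards element 0, filling the vacated high
   elements with zero.  Defined on element indices only, so it does
   not depend on the byte order of the target.  */
static void
model_vec_shr (const int in[VEC_ELTS], size_t k, int out[VEC_ELTS])
{
  assert (k <= VEC_ELTS);
  for (size_t i = 0; i < VEC_ELTS; i++)
    out[i] = (i + k < VEC_ELTS) ? in[i + k] : 0;
}

/* Sum all elements in log2 (VEC_ELTS) shift-and-add steps; the final
   result always ends up in element 0.  */
static int
model_shift_reduce_plus (const int in[VEC_ELTS])
{
  int acc[VEC_ELTS], tmp[VEC_ELTS];
  for (size_t i = 0; i < VEC_ELTS; i++)
    acc[i] = in[i];
  for (size_t k = VEC_ELTS / 2; k >= 1; k /= 2)
    {
      model_vec_shr (acc, k, tmp);
      for (size_t i = 0; i < VEC_ELTS; i++)
        acc[i] += tmp[i];
    }
  return acc[0];
}
```

With the shift defined this way, the scalar can always be read from element 0, so the extraction step would need no BYTES_BIG_ENDIAN conditional.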

Patch 10 defines vec_shr for AArch64, for the old specification; patch 13 updates that implementation to fit the new endianness-neutral specification, serving as a guide for other existing backends. Patches/RFCs 15 and 16 are equivalents for MIPS and PowerPC; I haven't tested these but hope they act as useful pointers for the port maintainers.

Finally, patch 14 cleans up the affected part of tree-vect-loop.c (vect_create_epilog_for_reduction).


--Alan

[PATCH 0/14+2][Vectorizer] Made reductions endianness-neutral, fixes PR/61114
