https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92884

            Bug ID: 92884
           Summary: [SVE] Add support for chained extract-last reductions
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---

Extract-last (i.e. CLASTB) reductions can't yet handle chained
conditions, such as those seen in gcc.dg/vect/vect-cond-reduc-5.c.
We just fall back to the normal COND_REDUCTION handling instead.
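
For illustration, a chained conditional reduction at the source level might
look like the sketch below. This is not the contents of
vect-cond-reduc-5.c; the function name, its parameters and the two
comparisons are hypothetical, chosen only to show two conditions feeding
one result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example of a chained conditional reduction: RES is
   overwritten whenever either comparison holds, so the final value comes
   from the last iteration in which at least one of them was true.  */
int
chained_cond_reduc (const int *a, const int *x, const int *y,
                    size_t n, int lo, int hi, int init)
{
  assert (n == 0 || (a && x && y));
  int res = init;
  for (size_t i = 0; i < n; ++i)
    {
      if (a[i] < lo)   /* first link in the chain */
        res = x[i];
      if (a[i] > hi)   /* second link; wins over the first if both hold */
        res = y[i];
    }
  return res;
}
```

If no iteration satisfies either comparison, the function returns init
unchanged.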

If we have:

    res_0 = PHI <res_n(latch), init(entry)>;
    res_1 = COND_EXPR <cond_1, res_0, val_1>;
    res_2 = COND_EXPR <cond_2, res_1, val_2>;
    ...
    res_n = COND_EXPR <cond_n, res_{n-1}, val_n>;

one alternative would be (pseudo-code):

    res_0 = PHI <res_n(latch), init(entry)>;
    vec.res_1 = vec.val_1;
    vec.res_2 = VEC_COND_EXPR <vec.cond_2, vec.res_1, vec.val_2>;
    ...
    vec.res_n = VEC_COND_EXPR <vec.cond_n, vec.res_{n-1}, vec.val_n>;
    vec.cond_any = IOR_EXPR <vec.cond_1, ..., vec.cond_n>;
    res_n = .EXTRACT_LAST (res_0, vec.cond_any, vec.res_n);
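
The scheme above can be modelled in plain C to check the idea, as in the
rough sketch below. It rests on several assumptions that are not part of
this report: a fixed block size VL stands in for the vector length, the
two conditions are written as "update" predicates (true means the link
takes its val_k), and extract_last is a scalar stand-in for
.EXTRACT_LAST that returns the value in the last active lane, or the
fallback if no lane is active. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VL 4   /* hypothetical number of lanes per block */

/* Scalar stand-in for .EXTRACT_LAST (fallback, mask, vec).  */
static int
extract_last (int fallback, const bool *mask, const int *vec, size_t lanes)
{
  int res = fallback;
  for (size_t l = 0; l < lanes; ++l)
    if (mask[l])
      res = vec[l];
  return res;
}

/* Block-wise model of a two-link chain.  The running result stays
   scalar; only the per-lane candidate and the combined mask are
   "vector" values.  */
int
chained_extract_last (const int *a, const int *x, const int *y,
                      size_t n, int lo, int hi, int init)
{
  assert (n == 0 || (a && x && y));
  int res = init;
  for (size_t i = 0; i < n; i += VL)
    {
      size_t lanes = n - i < VL ? n - i : VL;
      bool upd_any[VL];
      int vec_res[VL];
      for (size_t l = 0; l < lanes; ++l)
        {
          bool upd_1 = a[i + l] < lo;
          bool upd_2 = a[i + l] > hi;
          int r = x[i + l];           /* vec.res_1 = vec.val_1 */
          r = upd_2 ? y[i + l] : r;   /* later link overrides earlier */
          vec_res[l] = r;
          upd_any[l] = upd_1 || upd_2;
        }
      res = extract_last (res, upd_any, vec_res, lanes);
    }
  return res;
}
```

A lane in which only the first predicate holds keeps val_1, which is why
the first link needs no select against the previous result. Lanes in
which no predicate holds are masked out by the combined condition, so
their candidate value is never observed.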

Perhaps it would make sense to move the IFN_EXTRACT_LAST generation
from vectorizable_condition to vect_create_epilog_for_reduction.
The only thing vectorizable_condition would then need to do differently
from COND_REDUCTION is handle the special case of:

    vec.res_1 = vec.val_1;

instead of using a VEC_COND_EXPR between vec.val_1 and vec.res_0.
(res_0 isn't vectorised for EXTRACT_LAST_REDUCTION.)
