https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92884
Bug ID: 92884
Summary: [SVE] Add support for chained extract-last reductions
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---

Extract-last (i.e. CLASTB) reductions can't yet handle chained
conditions, such as those seen in gcc.dg/vect/vect-cond-reduc-5.c.
We just fall back to the normal COND_REDUCTION handling instead.

If we have:

  res_0 = PHI <res_n(latch), init(entry)>;
  res_1 = COND_EXPR <cond_1, res_0, val_1>;
  res_2 = COND_EXPR <cond_2, res_1, val_2>;
  ...
  res_n = COND_EXPR <cond_n, res_{n-1}, val_n>;

one alternative would be (pseudo-code):

  res_0 = PHI <res_n(latch), init(entry)>;
  vec.res_1 = vec.val_1;
  vec.res_2 = VEC_COND_EXPR <vec.cond_2, vec.res_1, vec.val_2>;
  ...
  vec.res_n = VEC_COND_EXPR <vec.cond_n, vec.res_{n-1}, vec.val_n>;
  vec.cond_any = IOR_EXPR <vec.cond_1, ..., vec.cond_n>;
  res_n = .EXTRACT_LAST (res_0, vec.cond_any, vec.res_n);

Perhaps it would make sense to move the IFN_EXTRACT_LAST generation
from vectorizable_condition to vect_create_epilog_for_reduction.
All vectorizable_condition would need to do differently from
COND_REDUCTION is to handle the special case of:

  vec.res_1 = vec.val_1;

instead of using a VEC_COND_EXPR between vec.val_1 and vec.res_0.
(res_0 isn't vectorised for EXTRACT_LAST_REDUCTION.)
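
To make the proposed scheme concrete, here is a rough scalar emulation in C
for the n = 2 case. It is only a sketch under stated assumptions, not GCC
code: the function names, the fixed lane count VL, and the two example
conditions (a[i] > 10, b[i] < 0) are all hypothetical. The sketch is written
in terms of "update" predicates, i.e. upd_k is true in the lanes where val_k
replaces the running result; how those map onto the polarity of cond_k in
the COND_EXPR operand order above is deliberately glossed over.

```c
#include <stdbool.h>
#include <stddef.h>

#define VL 4 /* hypothetical lane count, used only for this emulation */

/* Scalar reference loop: two chained conditional updates per iteration. */
int chained_scalar(const int *a, const int *b, size_t n, int init)
{
    int res = init;                       /* res_0 */
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 10) res = a[i];        /* upd_1 selects val_1 */
        if (b[i] < 0)  res = b[i];        /* upd_2 selects val_2 */
    }
    return res;
}

/* Emulates .EXTRACT_LAST (fallback, mask, vec): the value in the last
   active lane of vec, or fallback if no lane is active. */
static int extract_last(int fallback, const bool *mask, const int *vec,
                        size_t lanes)
{
    int r = fallback;
    for (size_t l = 0; l < lanes; l++)
        if (mask[l])
            r = vec[l];
    return r;
}

/* Emulates the proposed vector form: build vec.res_n without reading
   res_0, OR the update masks together, then do one extract-last. */
int chained_emulated(const int *a, const int *b, size_t n, int init)
{
    int res = init;                       /* res_0 stays scalar */
    for (size_t i = 0; i < n; i += VL) {
        size_t lanes = (n - i < VL) ? n - i : VL;
        int vec_res[VL];
        bool any[VL];
        for (size_t l = 0; l < lanes; l++) {
            bool upd1 = a[i + l] > 10;
            bool upd2 = b[i + l] < 0;
            int res1 = a[i + l];                  /* vec.res_1 = vec.val_1 */
            vec_res[l] = upd2 ? b[i + l] : res1;  /* vec.res_2 */
            any[l] = upd1 || upd2;                /* vec.cond_any */
        }
        res = extract_last(res, any, vec_res, lanes);
    }
    return res;
}
```

Lanes in which no update fires still carry vec.val_1 in vec_res, but the
combined mask leaves them inactive, so they never reach the result. That is
why the first link in the chain can be a plain copy of vec.val_1 rather than
a select against a vectorised res_0.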