https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122723
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |x86_64-*-*
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> > that's from
> >
> > t.c:4:21: note: vect_recog_mask_conversion_pattern: detected: _ifc__32 =
> > .COND_ADD (_26, sum_18, val_13, sum_18);
> > t.c:4:21: note: mask_conversion pattern recognized: patt_40 = .COND_ADD
> > (patt_39, sum_18, val_13, sum_18);
> > t.c:4:21: note: extra pattern stmt: patt_39 = (<signed-boolean:1>) _26;
> >
[...]
>
> Hmm, no, for V8DFmode we unpack the 64bit mask to 8 8bit masks, but for the
> V4DFmode epilog we'd have a 32bit mask and require unpacking to 8 4bit masks,
> something we fail to do because we're not really special-casing mask
> vectors in vectorizable_conversion. We have vec_unpacks_sbool_{hi,lo} for
> this but for a multi-step conversion we're not considering this only
> for the second step.
      if (VECTOR_BOOLEAN_TYPE_P (intermediate_type)
          && VECTOR_BOOLEAN_TYPE_P (prev_type)
          && intermediate_mode == prev_mode
          && SCALAR_INT_MODE_P (prev_mode))
        {
          /* If the input and result modes are the same, a different optab
             is needed where we pass in the number of units in vectype.  */
          optab3 = vec_unpacks_sbool_lo_optab;
          optab4 = vec_unpacks_sbool_hi_optab;
        }
      else
        {
          optab3 = optab_for_tree_code (c1, intermediate_type, optab_default);
          optab4 = optab_for_tree_code (c2, intermediate_type, optab_default);
        }

      if (!optab3 || !optab4
          || (icode1 = optab_handler (optab1, prev_mode)) == CODE_FOR_nothing
          || insn_data[icode1].operand[0].mode != intermediate_mode
          || (icode2 = optab_handler (optab2, prev_mode)) == CODE_FOR_nothing
          || insn_data[icode2].operand[0].mode != intermediate_mode
          || ((icode1 = optab_handler (optab3, intermediate_mode))
              == CODE_FOR_nothing)
          || ((icode2 = optab_handler (optab4, intermediate_mode))
              == CODE_FOR_nothing))
        break;

The optab3/optab4 selection and the handler checks above do not mix
together.  For prev_mode == HImode, intermediate_mode == QImode we get
optab3/4 to be vec_unpacks_{hi,lo}, but for QImode the unpack again results
in QImode, so we need to use vec_unpacks_sbool_{lo,hi} there.
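
To illustrate why the sbool variants take the number of units, here is a
minimal sketch in plain C.  It is not GCC code: mask8, unpack_sbool_lo and
unpack_sbool_hi are hypothetical names, and it assumes one mask bit per lane
with lane 0 in the least significant bit.  An 8-lane mask and a 4-lane mask
both live in the same 8-bit container (think QImode), so the container alone
cannot tell the unpack where the halves split.

```c
#include <stdint.h>

/* Hypothetical model of an integer-mode mask: one bit per lane, lane 0 in
   the least significant bit, stored in an 8-bit container.  */
typedef uint8_t mask8;

/* Low half of an NUNITS-lane mask, i.e. lanes 0 .. NUNITS/2 - 1.
   NUNITS is the lane count of the input; the result has NUNITS/2 lanes
   but is still held in the same container type.  */
static mask8
unpack_sbool_lo (mask8 m, unsigned nunits)
{
  unsigned half = nunits / 2;
  return (mask8) (m & ((1u << half) - 1));
}

/* High half of an NUNITS-lane mask, i.e. lanes NUNITS/2 .. NUNITS - 1,
   shifted down so that the first of them becomes lane 0.  */
static mask8
unpack_sbool_hi (mask8 m, unsigned nunits)
{
  unsigned half = nunits / 2;
  return (mask8) ((m >> half) & ((1u << half) - 1));
}
```

With the 8-lane mask 0xB4 the halves are 0x4 (lo) and 0xB (hi).  Treating
the low four bits, 0x4, as a 4-lane mask instead gives 0x0 (lo) and 0x1
(hi), although the container type is unchanged.  In this model that is the
case where the input and result modes are the same and only the lane count
distinguishes the two unpacks.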