https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110897

--- Comment #14 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to rsand...@gcc.gnu.org from comment #12)
> (In reply to JuzheZhong from comment #11)
> > You can see "_9 = _5 >> _8;". We should vectorize SImode instead of HImode.
> > The correct follow should be first extend HI -> SImode, Then vectorize
> > logical shift right for SImode, and finally truncate SImode to HImode.
> The point of vect_recog_over_widening_pattern is to avoid the extension and
> truncation.  So this is working as expected.  The question is why doing the
> optimisation prevents vectorisation, given that the target apparently
> provides HImode shifts right.

Oh, thanks Richard.

After a deeper analysis, I found that this code makes vectorization fail:

      incompatible_op1_vectype_p
        = (op1_vectype == NULL_TREE
           || maybe_ne (TYPE_VECTOR_SUBPARTS (op1_vectype),
                        TYPE_VECTOR_SUBPARTS (vectype))
           || TYPE_MODE (op1_vectype) != TYPE_MODE (vectype));
      if (incompatible_op1_vectype_p
          && (!slp_node
              || SLP_TREE_DEF_TYPE (slp_op1) != vect_constant_def
              || slp_op1->refcnt != 1))
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                             "unusable type for last operand in"
                             " vector/vector shift/rotate.\n");
          return false;
        }

incompatible_op1_vectype_p is true.

It becomes true because op1_vectype has a different NUNITS from
vectype.

The NUNITS differ because:

op1_vectype = get_vectype_for_scalar_type = RVVM1SImode.
vectype = STMT_VINFO_VECTYPE (stmt_info) = RVVMF2SImode.

That mismatch is what makes vectorization fail.
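
To illustrate the check in isolation, here is a minimal toy model, not GCC
code. ToyVectype, incompatible_op1_vectype and check_examples are
hypothetical names, and the lane counts (4 for RVVM1SI, 2 for RVVMF2SI) are
assumed values for illustration only; real GCC compares poly_int lane counts
with maybe_ne.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Toy stand-in for a GCC vector type: a mode name plus a lane count.
// Assumption: a plain integer replaces GCC's poly_uint64 NUNITS.
struct ToyVectype {
  std::string mode;
  unsigned lanes;
};

// Same shape as the quoted condition: op1 is unusable if it has no
// vector type, a different lane count, or a different mode.
bool incompatible_op1_vectype(const std::optional<ToyVectype> &op1,
                              const ToyVectype &vectype) {
  return !op1.has_value()
         || op1->lanes != vectype.lanes
         || op1->mode != vectype.mode;
}

// Hypothetical helper exercising the cases described in this comment.
void check_examples() {
  const ToyVectype m1{"RVVM1SI", 4};   // assumed lane count
  const ToyVectype mf2{"RVVMF2SI", 2}; // assumed lane count

  // Shift amount in RVVM1SI, statement vectype RVVMF2SI: mismatch.
  assert(incompatible_op1_vectype(m1, mf2));
  // Both RVVM1SI: compatible.
  assert(!incompatible_op1_vectype(m1, m1));
  // No vector type for op1 at all: also incompatible.
  assert(incompatible_op1_vectype(std::nullopt, mf2));
}
```

Calling check_examples() from a main function (C++17 or later) runs the three
cases; the first one corresponds to the failing situation described above.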

To make this easier to follow in ARM SVE terms, I believe the equivalent is:

op1_vectype = get_vectype_for_scalar_type = VNx4SImode.
vectype = STMT_VINFO_VECTYPE (stmt_info) = VNx2SImode.

So ARM SVE would fail in the same way.

With that commit reverted, they are the same (both are RVVM1SImode for RISC-V
or VNx4SImode for ARM SVE).

Could you tell me how to fix that?

Thanks.
