https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116275

--- Comment #4 from Roger Sayle <roger at nextmovesoftware dot com> ---
Created attachment 58868
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58868&action=edit
proposed patch

Here's my proposed fix (the first of two patches) that resolves the ICE with
the testcase.  The problem is that i386.md has a *extenddi2_doubleword_highpart
pattern on !TARGET_64BIT for noticing that sign-extending a doubleword can be
optimized to just manipulating the highpart.  The STV pass recognizes this as a
candidate; previously ASHIFTRT of V2DI required AVX512VL.  This runs into
problems because the post-reload splitter *extendv2di2_highpart_stv is
conditional on AVX512VL.

The first patch resolves this problem by adding a pre-reload splitter for this
case, imaginatively called *extendv2di2_highpart_stv_noavx512vl.  This
relies on split1's recursive splitting, so that the generated code for
(x<<8)>>8 in the testcase becomes:

        vpsllq  $8, %xmm0, %xmm0
        vpsrad  $8, %xmm0, %xmm1
        vpsrlq  $8, %xmm0, %xmm0
        vpblendd        $5, %xmm0, %xmm1, %xmm0

I'll post a follow-up patch (part 2) that provides a better implementation for
this V2DI highpart sign extension, which can be done in 3 insns (taking
advantage of SSE's ability to arithmetic shift right the "highpart"):

        vpsllq  $8, %xmm0, %xmm1
        vpsrad  $8, %xmm1, %xmm1
        vpblendd        $5, %xmm0, %xmm1, %xmm0
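
To see why both sequences compute the same value, here is a minimal scalar
sketch in C that models a single 64-bit lane.  It is not code from the patch:
the function names (sra32, ref_shl_sra, model_four_insn, model_three_insn) are
hypothetical, and it assumes a shift count between 1 and 31, so that the
"highpart" dword can be shifted on its own.  It also assumes that vpblendd $5
takes the low dword of each qword from the %xmm0 operand and the high dword
from %xmm1.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift of one 32-bit lane, modelling vpsrad.
   Written without signed shifts so the result is portable. */
static uint32_t sra32(uint32_t v, unsigned n)
{
    uint32_t sign_fill = (v & 0x80000000u) ? ~(0xFFFFFFFFu >> n) : 0u;
    return (v >> n) | sign_fill;
}

/* Reference: (x << n) >> n with an arithmetic 64-bit right shift. */
static uint64_t ref_shl_sra(uint64_t x, unsigned n)
{
    uint64_t t = x << n;
    uint64_t sign_fill = (t >> 63) ? ~(UINT64_MAX >> n) : 0u;
    return (t >> n) | sign_fill;
}

/* Model of the four-instruction sequence (hypothetical name). */
static uint64_t model_four_insn(uint64_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    uint64_t t = x << n;                          /* vpsllq   */
    uint32_t hi = sra32((uint32_t)(t >> 32), n);  /* vpsrad, high dword */
    uint64_t lo = t >> n;                         /* vpsrlq   */
    return ((uint64_t)hi << 32) | (uint32_t)lo;   /* vpblendd $5 */
}

/* Model of the three-instruction sequence (hypothetical name).
   The low dword of (x << n) >> n equals the low dword of x,
   so the logical shift can be dropped. */
static uint64_t model_three_insn(uint64_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    uint64_t t = x << n;                          /* vpsllq   */
    uint32_t hi = sra32((uint32_t)(t >> 32), n);  /* vpsrad, high dword */
    return ((uint64_t)hi << 32) | (uint32_t)x;    /* vpblendd $5 */
}
```

For any x and any n in that range, both models should agree with
ref_shl_sra(x, n).  The three-instruction form works because shifting left and
then right by the same count of less than 32 leaves the low 32 bits unchanged,
so they can be blended in directly from the original register.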

Sorry for the inconvenience.  In the meantime, -mno-stv can be used as a
workaround.  The bootstrap and regression testing should finish in a little
while.
