On Thu, Aug 14, 2014 at 1:08 AM, Evgeny Stupachenko <[email protected]> wrote: > Ping. > > On Thu, Jul 10, 2014 at 7:29 PM, Evgeny Stupachenko <[email protected]> > wrote: >> On Mon, Jul 7, 2014 at 6:40 PM, Richard Henderson <[email protected]> wrote: >>> On 07/03/2014 02:53 AM, Evgeny Stupachenko wrote: >>>> -expand_vec_perm_palignr (struct expand_vec_perm_d *d) >>>> +expand_vec_perm_palignr (struct expand_vec_perm_d *d, int insn_num) >>> >>> insn_num might as well be "bool avx2", since it's only ever set to two >>> values. >> >> Agree. However: >> after the alignment, one operand permutation could be just move and >> take 2 instructions for AVX2 as well >> for AVX2 there could be other cases when the scheme takes 4 or 5 >> instructions >> we can leave it for potential avx512 extension >> >>> >>>> - /* Even with AVX, palignr only operates on 128-bit vectors. */ >>>> - if (!TARGET_SSSE3 || GET_MODE_SIZE (d->vmode) != 16) >>>> + /* SSSE3 is required to apply PALIGNR on 16 bytes operands. */ >>>> + if (GET_MODE_SIZE (d->vmode) == 16) >>>> + { >>>> + if (!TARGET_SSSE3) >>>> + return false; >>>> + } >>>> + /* AVX2 is required to apply PALIGNR on 32 bytes operands. */ >>>> + else if (GET_MODE_SIZE (d->vmode) == 32) >>>> + { >>>> + if (!TARGET_AVX2) >>>> + return false; >>>> + } >>>> + /* Other sizes are not supported. */ >>>> + else >>>> return false; >>> >>> And you'd better check it up here because... >>> >> >> Correct. The following should resolve the issue: >> /* For AVX2 we need more than 2 instructions when the alignment >> by itself does not produce the desired permutation. */ >> if (TARGET_AVX2 && insn_num <= 2) >> return false; >> >>>> + /* For SSSE3 we need 1 instruction for palignr plus 1 for one >>>> + operand permutaoin. */ >>>> + if (insn_num == 2) >>>> + { >>>> + ok = expand_vec_perm_1 (&dcopy); >>>> + gcc_assert (ok); >>>> + } >>>> + /* For AVX2 we need 2 instructions for the shift: vpalignr and >>>> + vperm plus 4 instructions for one operand permutation. */ >>>> + else if (insn_num == 6) >>>> + { >>>> + ok = expand_vec_perm_vpshufb2_vpermq (&dcopy); >>>> + gcc_assert (ok); >>>> + } >>>> + else >>>> + ok = false; >>>> return ok; >>> >>> ... down here you'll simply ICE from the gcc_assert. >>
Can you modify your patch to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62128 with a testcase? -- H.J.
