https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856

--- Comment #36 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #35)
> (In reply to Richard Biener from comment #33)
> > Created attachment 50308 [details]
> > patch
> > 
> > I am testing the following.
> 
> It FAILs
> 
> FAIL: gcc.target/i386/avx512dq-concatv2di-1.c scan-assembler
> vpinsrq[^\\n\\r]*\\\\\$1[^\\n\\r]*%[re]si[^\\n\\r]*%xmm18[^\\n\\r]*%xmm19

That's exactly the case we're after: a V2DI concat from two GPRs.

> FAIL: gcc.target/i386/avx512dq-concatv2di-1.c scan-assembler
> vpinsrq[^\\n\\r]*\\\\\$1[^\\n\\r]*%rsi[^\\n\\r]*%xmm16[^\\n\\r]*%xmm17

This is, like below, a MEM case.

> FAIL: gcc.target/i386/avx512vl-concatv2di-1.c scan-assembler
> vmovhps[^\\n\\r]*%[re]si[^\\n\\r]*%xmm18[^\\n\\r]*%xmm19

This one fails because nonimmediate_gr_operand also matches a MEM; in this
case we apply the peephole to

(insn 12 11 13 2 (set (reg/v:V2DI 55 xmm19 [ c ])
        (vec_concat:V2DI (reg:DI 54 xmm18 [91]) 
            (mem:DI (reg/v/f:DI 4 si [orig:86 y ] [86]) [1 *y_8(D)+0 S8 A64]))) 

Latency-wise, memory isn't any better than a GPR, so the decision to split
is reasonable.
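
For reference, a minimal sketch of the two source shapes discussed above,
written with GNU C vector extensions.  This is not the testsuite file itself
(which, judging by the scan patterns, places operands in xmm16 and up); the
names v2di, concat_gprs and concat_mem are made up for illustration.

```c
/* Two 64-bit lanes, i.e. V2DImode on x86-64 (GNU C vector extension).  */
typedef long long v2di __attribute__ ((vector_size (16)));

/* Both halves arrive in GPRs: the "V2DI concat from two GPRs" case.  */
v2di
concat_gprs (long long x, long long y)
{
  return (v2di) { x, y };
}

/* High half loaded from memory: the MEM case, where the second
   vec_concat operand is (mem:DI ...).  */
v2di
concat_mem (long long x, const long long *y)
{
  return (v2di) { x, *y };
}
```

Which instructions come out (vpinsrq, vmovhps or vpunpcklqdq) depends on the
target flags and on register allocation, so treat this only as an
illustration of the RTL shape, not of the expected assembly.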

> I'll see how to update those next week.

So I updated the above to check for vpunpcklqdq instead.
