On Mon, Jul 14, 2025 at 3:11 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > On Mon, Jul 14, 2025 at 5:32 AM Uros Bizjak <ubiz...@gmail.com> wrote: > > > > On Mon, Jul 14, 2025 at 2:14 AM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > > > On Sat, Jul 12, 2025 at 7:51 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > > > > > > > On Sat, Jul 12, 2025 at 1:41 PM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > > > > > > > On Sat, Jul 12, 2025 at 5:58 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > > > > > > > > > > > On Sat, Jul 12, 2025 at 11:52 AM H.J. Lu <hjl.to...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > On Sat, Jul 12, 2025 at 5:03 PM Uros Bizjak <ubiz...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > > On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu <hjl.to...@gmail.com> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > commit 77473a27bae04da99d6979d43e7bd0a8106f4557 > > > > > > > > > Author: H.J. Lu <hjl.to...@gmail.com> > > > > > > > > > Date: Thu Jun 26 06:08:51 2025 +0800 > > > > > > > > > > > > > > > > > > x86: Also handle all 1s float vector constant > > > > > > > > > > > > > > > > > > replaces > > > > > > > > > > > > > > > > > > (insn 29 28 30 5 (set (reg:V2SF 107) > > > > > > > > > (mem/u/c:V2SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) > > > > > > > > > [0 S8 A64])) 2031 > > > > > > > > > {*movv2sf_internal} > > > > > > > > > (expr_list:REG_EQUAL (const_vector:V2SF [ > > > > > > > > > (const_double:SF -QNaN [-QNaN]) repeated x2 > > > > > > > > > ]) > > > > > > > > > (nil))) > > > > > > > > > > > > > > > > > > with > > > > > > > > > > > > > > > > > > (insn 98 13 14 3 (set (reg:V8QI 112) > > > > > > > > > (const_vector:V8QI [ > > > > > > > > > (const_int -1 [0xffffffffffffffff]) repeated > > > > > > > > > x8 > > > > > > > > > ])) -1 > > > > > > > > > (nil)) > > > > > > > > > ... 
> > > > > > > > > (insn 29 28 30 5 (set (reg:V2SF 107) > > > > > > > > > (subreg:V2SF (reg:V8QI 112) 0)) 2031 > > > > > > > > > {*movv2sf_internal} > > > > > > > > > (expr_list:REG_EQUAL (const_vector:V2SF [ > > > > > > > > > (const_double:SF -QNaN [-QNaN]) repeated x2 > > > > > > > > > ]) > > > > > > > > > (nil))) > > > > > > > > > > > > > > > > > > which leads to > > > > > > > > > > > > > > > > > > pr121015.c: In function ‘render_result_from_bake_h’: > > > > > > > > > pr121015.c:34:1: error: unrecognizable insn: > > > > > > > > > 34 | } > > > > > > > > > | ^ > > > > > > > > > (insn 98 13 14 3 (set (reg:V8QI 112) > > > > > > > > > (const_vector:V8QI [ > > > > > > > > > (const_int -1 [0xffffffffffffffff]) repeated > > > > > > > > > x8 > > > > > > > > > ])) -1 > > > > > > > > > (expr_list:REG_EQUIV (const_vector:V8QI [ > > > > > > > > > (const_int -1 [0xffffffffffffffff]) repeated > > > > > > > > > x8 > > > > > > > > > ]) > > > > > > > > > (nil))) > > > > > > > > > during RTL pass: ira > > > > > > > > > > > > > > > > > > 1. Add vector_const0_or_m1_operand for vector 0 or integer > > > > > > > > > vector -1. > > > > > > > > > 2. Add nonimm_or_vector_const0_or_m1_operand for > > > > > > > > > nonimmediate, vector 0 > > > > > > > > > or integer vector -1 operand. > > > > > > > > > 3. Add BX constraint for MMX vector constant all 0s/1s > > > > > > > > > operand. > > > > > > > > > 4. Update MMXMODE:*mov<mode>_internal to support integer all > > > > > > > > > 1s vectors. > > > > > > > > > Replace <v,C> with <v,BX> to generate > > > > > > > > > > > > > > > > > > pcmpeqd %xmm0, %xmm0 > > > > > > > > > > > > > > > > > > for > > > > > > > > > > > > > > > > > > (set (reg/i:V8QI 20 xmm0) > > > > > > > > > (const_vector:V8QI [(const_int -1 [0xffffffffffffffff]) > > > > > > > > > repeated x8])) > > > > > > > > > > > > > > > > > > NB: The upper 64 bits in XMM0 are all 1s, instead of all 0s. 
> > > > > > > > > > > > > > > > Actually, we don't want this, we should keep the top 64 bits > > > > > > > > zero, > > > > > > > > especially for floating point, where the pattern represents NaN. > > > > > > > > > > > > > > > > So, I think the correct way is to avoid the transformation for > > > > > > > > narrower modes in the first place. > > > > > > > > > > > > > > > > > > > > > > How does your latest patch handle this? > > > > > > > > > > > > > > typedef char __v8qi __attribute__ ((__vector_size__ (8))); > > > > > > > > > > > > > > __v8qi > > > > > > > m1 (void) > > > > > > > { > > > > > > > return __extension__(__v8qi){-1, -1, -1, -1, -1, -1, -1, -1}; > > > > > > > } > > > > > > > > > > > > No, my patch is also not appropriate, because it also introduces > > > > > > "pcmpeq %xmm, %xmm". We should not generate 8-byte all-ones load > > > > > > using > > > > > > pcmpeq, because upper 64 bits are also all 1s. > > > > > > > > > > > > The correct way is to avoid generating 64 bit all-ones, because this > > > > > > constant is not supported and standard_sse_constant_p () correctly > > > > > > reports this. > > > > > > > > > > We can generate > > > > > > > > > > pcmpeqd %xmm0, %xmm0 > > > > > movq %xmm0, %xmm0 > > > > > > > > > > for V8QI and > > > > > > > > > > pcmpeqd %xmm0, %xmm0 > > > > > movd %xmm0, %xmm0 > > > > > > > > > > for V4QI. > > > > > > > > I don't think this is better than skipping the transformation for > > > > instructions that we in fact emulate altogether. While loading > > > > all-zero is OK in any mode, loading all-one is not OK for narrow > > > > modes. So, this transformation should simply be skipped for all-one in > > > > narrow modes. > > > > > > Here is the v3 patch, which allows 4-byte/8-byte all 1s in mmx.md > > > and split to load from memory if the destination is an XMM register. > > > > Why don't we just skip the generation of narrow-mode all-ones vector > > constants in the new pass altogether? 
> > It is not worth complicating
> > move patterns for a very seldom used feature and for very small (if at
> > all) gain.
> >
> > Please just change the pass to not generate vector all-ones in 64bit or
> > narrower modes.
>
> I'm not familiar with the pass, but IMO the attached patch should be a
> good starting point. We don't want to CSE narrow all-ones with their
> wide counterparts, because we want zeros in top bytes of the narrow
> all-ones operands.
>
> Uros.

I am testing this.

-- 
H.J.
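To make the concern about the upper bits concrete, here is a minimal sketch in portable C. It models an XMM register as two 64-bit halves; the type xmm_model and both helper functions are hypothetical names made up for illustration, not GCC internals or intrinsics. It only shows that an all-ones idiom such as "pcmpeqd %xmm0, %xmm0" fills all 128 bits, while a 64-bit "movq" load from memory leaves the upper half zero, which is what an 8-byte all-ones vector is expected to look like in an XMM register.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit XMM register as two 64-bit halves.  */
typedef struct
{
  uint64_t lo;
  uint64_t hi;
} xmm_model;

/* Models "pcmpeqd %xmm0, %xmm0": comparing a register with itself
   sets every bit of all 128 bits.  */
static xmm_model
model_pcmpeqd_self (void)
{
  xmm_model r = { UINT64_MAX, UINT64_MAX };
  return r;
}

/* Models "movq mem, %xmm0": loads 64 bits into the low half and
   zeroes the upper half.  */
static xmm_model
model_movq_load (uint64_t mem)
{
  xmm_model r = { mem, 0 };
  return r;
}
```

Under this model, model_pcmpeqd_self () and model_movq_load (UINT64_MAX) agree in the low 64 bits but differ in the high 64 bits, which is why the two cannot be shared by CSE.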
From de1cc2ee480483d03170e75b8b189f340bc71154 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.to...@gmail.com>
Date: Sun, 13 Jul 2025 08:59:34 +0800
Subject: [PATCH v4] x86: Skip all 1s vector constant narrower than 16 bytes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

commit 77473a27bae04da99d6979d43e7bd0a8106f4557
Author: H.J. Lu <hjl.to...@gmail.com>
Date:   Thu Jun 26 06:08:51 2025 +0800

    x86: Also handle all 1s float vector constant

replaces

(insn 29 28 30 5 (set (reg:V2SF 107)
        (mem/u/c:V2SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 S8 A64])) 2031 {*movv2sf_internal}
     (expr_list:REG_EQUAL (const_vector:V2SF [
                (const_double:SF -QNaN [-QNaN]) repeated x2
            ])
        (nil)))

with

(insn 98 13 14 3 (set (reg:V8QI 112)
        (const_vector:V8QI [
                (const_int -1 [0xffffffffffffffff]) repeated x8
            ])) -1
     (nil))
...
(insn 29 28 30 5 (set (reg:V2SF 107)
        (subreg:V2SF (reg:V8QI 112) 0)) 2031 {*movv2sf_internal}
     (expr_list:REG_EQUAL (const_vector:V2SF [
                (const_double:SF -QNaN [-QNaN]) repeated x2
            ])
        (nil)))

which leads to

pr121015.c: In function ‘render_result_from_bake_h’:
pr121015.c:34:1: error: unrecognizable insn:
   34 | }
      | ^
(insn 98 13 14 3 (set (reg:V8QI 112)
        (const_vector:V8QI [
                (const_int -1 [0xffffffffffffffff]) repeated x8
            ])) -1
     (expr_list:REG_EQUIV (const_vector:V8QI [
                (const_int -1 [0xffffffffffffffff]) repeated x8
            ])
        (nil)))
during RTL pass: ira

1. Update the remove_redundant_vector pass to skip all 1s vector constant
narrower than 16 bytes.
2. Convert integer register loads from CONSTM1_RTX in memory to
constm1_rtx move.

gcc/

	PR target/121015
	* config/i386/i386-features.cc (ix86_broadcast_inner): Skip all
	1s vector constant narrower than 16 bytes.
	* config/i386/mmx.md: Add MMXMODE and V_32 splitters to convert
	integer register loads from CONSTM1_RTX in memory to constm1_rtx
	move.

gcc/testsuite/

	PR target/121015
	* gcc.target/i386/pr121015-1.c: New test.
	* gcc.target/i386/pr121015-2a.c: Likewise.
	* gcc.target/i386/pr121015-2b.c: Likewise.
	* gcc.target/i386/pr121015-3.c: Likewise.
	* gcc.target/i386/pr121015-4.c: Likewise.
	* gcc.target/i386/pr121015-5a.c: Likewise.
	* gcc.target/i386/pr121015-5b.c: Likewise.
	* gcc.target/i386/pr121015-5c.c: Likewise.
	* gcc.target/i386/pr121015-6.c: Likewise.
	* gcc.target/i386/pr121015-7a.c: Likewise.
	* gcc.target/i386/pr121015-7b.c: Likewise.
	* gcc.target/i386/pr121015-7c.c: Likewise.
	* gcc.target/i386/pr121015-8.c: Likewise.
	* gcc.target/i386/pr121015-9.c: Likewise.
	* gcc.target/i386/pr121015-10a.c: Likewise.
	* gcc.target/i386/pr121015-10b.c: Likewise.
	* gcc.target/i386/pr121015-10c.c: Likewise.
	* gcc.target/i386/pr121015-11a.c: Likewise.
	* gcc.target/i386/pr121015-11b.c: Likewise.
	* gcc.target/i386/pr121015-11c.c: Likewise.

Signed-off-by: H.J. Lu <hjl.to...@gmail.com>
---
 gcc/config/i386/i386-features.cc             |  5 ++
 gcc/config/i386/mmx.md                       | 48 ++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr121015-1.c   | 32 +++++++++++++
 gcc/testsuite/gcc.target/i386/pr121015-10a.c | 32 +++++++++++++
 gcc/testsuite/gcc.target/i386/pr121015-10b.c | 16 +++++++
 gcc/testsuite/gcc.target/i386/pr121015-10c.c | 21 +++++++++
 gcc/testsuite/gcc.target/i386/pr121015-11a.c | 21 +++++++++
 gcc/testsuite/gcc.target/i386/pr121015-11b.c | 13 ++++++
 gcc/testsuite/gcc.target/i386/pr121015-11c.c | 17 +++++++
 gcc/testsuite/gcc.target/i386/pr121015-2a.c  | 23 ++++++++++
 gcc/testsuite/gcc.target/i386/pr121015-2b.c  |  6 +++
 gcc/testsuite/gcc.target/i386/pr121015-3.c   | 35 ++++++++++++++
 gcc/testsuite/gcc.target/i386/pr121015-4.c   | 22 +++++++++
 gcc/testsuite/gcc.target/i386/pr121015-5a.c  | 21 +++++++++
 gcc/testsuite/gcc.target/i386/pr121015-5b.c  | 16 +++++++
 gcc/testsuite/gcc.target/i386/pr121015-5c.c  | 20 ++++++++
 gcc/testsuite/gcc.target/i386/pr121015-6.c   | 23 ++++++++++
 gcc/testsuite/gcc.target/i386/pr121015-7a.c  | 23 ++++++++++
 gcc/testsuite/gcc.target/i386/pr121015-7b.c  |  6 +++
 gcc/testsuite/gcc.target/i386/pr121015-7c.c  |  8 ++++
 gcc/testsuite/gcc.target/i386/pr121015-8.c   | 13 ++++++
 gcc/testsuite/gcc.target/i386/pr121015-9.c   | 14 ++++++
 22 files changed, 435 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-10a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-10b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-10c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-11a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-11b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-11c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-2b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-5a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-5b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-5c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-7a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-7b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-7c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121015-9.c

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 054f8d5ddc8..20b15544408 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -3546,6 +3546,11 @@ ix86_broadcast_inner (rtx op, machine_mode mode,
            || (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
                && float_vector_all_ones_operand (op, mode)))
     {
+      /* Skip if vector size is less than 16 bytes since all 1s SSE
+         constants must be at least 16 bytes.  */
+      if (GET_MODE_SIZE (mode) < 16)
+        return nullptr;
+
       *scalar_mode_p = QImode;
       *kind_p = X86_CSE_CONSTM1_VECTOR;
       *insn_p = nullptr;
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 29a8cb599a7..00f3657f796 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -304,6 +304,30 @@ (define_insn "*mov<mode>_internal"
            ]
            (symbol_ref "true")))])
 
+(define_split
+  [(set (match_operand:MMXMODE 0 "register_operand")
+        (match_operand:MMXMODE 1 "memory_operand"))]
+  "TARGET_64BIT && reload_completed && GENERAL_REG_P (operands[0])"
+  [(const_int 0)]
+{
+  rtx op1 = operands[1];
+  rtx op = find_reg_note (curr_insn, REG_EQUAL, nullptr);
+  if (!op)
+    op = find_reg_note (curr_insn, REG_EQUIV, nullptr);
+  if (op)
+    {
+      op = XEXP (op, 0);
+      if (int_float_vector_all_ones_operand (op, <MODE>mode))
+        {
+          rtx reg = gen_rtx_REG (DImode, REGNO (operands[0]));
+          emit_move_insn (reg, constm1_rtx);
+          op1 = gen_rtx_SUBREG (<MODE>mode, reg, 0);
+        }
+    }
+  emit_move_insn (operands[0], op1);
+  DONE;
+})
+
 (define_split
   [(set (match_operand:MMXMODE 0 "nonimmediate_gr_operand")
         (match_operand:MMXMODE 1 "nonimmediate_gr_operand"))]
@@ -407,6 +431,30 @@ (define_insn "*mov<mode>_internal"
            ]
            (symbol_ref "true")))])
 
+(define_split
+  [(set (match_operand:V_32 0 "register_operand")
+        (match_operand:V_32 1 "memory_operand"))]
+  "reload_completed && GENERAL_REG_P (operands[0])"
+  [(const_int 0)]
+{
+  rtx op1 = operands[1];
+  rtx op = find_reg_note (curr_insn, REG_EQUAL, nullptr);
+  if (!op)
+    op = find_reg_note (curr_insn, REG_EQUIV, nullptr);
+  if (op)
+    {
+      op = XEXP (op, 0);
+      if (int_float_vector_all_ones_operand (op, <MODE>mode))
+        {
+          rtx reg = gen_rtx_REG (SImode, REGNO (operands[0]));
+          emit_move_insn (reg, constm1_rtx);
+          op1 = gen_rtx_SUBREG (<MODE>mode, reg, 0);
+        }
+    }
+  emit_move_insn (operands[0], op1);
+  DONE;
+})
+
 ;; 16-bit, 32-bit and 64-bit constant vector stores.  After reload,
 ;; convert them to immediate integer stores.
(define_insn_and_split "*mov<mode>_imm" diff --git a/gcc/testsuite/gcc.target/i386/pr121015-1.c b/gcc/testsuite/gcc.target/i386/pr121015-1.c new file mode 100644 index 00000000000..57c8bff14ef --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-1.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64-v3" } */ + +extern union { + int i; + float f; +} int_as_float_u; + +extern int render_result_from_bake_w; +extern int render_result_from_bake_h_seed_pass; +extern float *render_result_from_bake_h_primitive; +extern float *render_result_from_bake_h_seed; + +float +int_as_float(int i) +{ + int_as_float_u.i = i; + return int_as_float_u.f; +} + +void +render_result_from_bake_h(int tx) +{ + while (render_result_from_bake_w) { + for (; tx < render_result_from_bake_w; tx++) + render_result_from_bake_h_primitive[1] = + render_result_from_bake_h_primitive[2] = int_as_float(-1); + if (render_result_from_bake_h_seed_pass) { + *render_result_from_bake_h_seed = 0; + } + } +} diff --git a/gcc/testsuite/gcc.target/i386/pr121015-10a.c b/gcc/testsuite/gcc.target/i386/pr121015-10a.c new file mode 100644 index 00000000000..67b574cc837 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-10a.c @@ -0,0 +1,32 @@ +/* { dg-do compile { target fpic } } */ +/* { dg-options "-O2 -march=x86-64 -fpic" } */ +/* Keep labels and directives ('.cfi_startproc', '.cfi_endproc'). */ +/* { dg-final { check-function-bodies "**" "" "" { target { ! ia32 } } {^\t?\.} } } */ + +/* +**__bid64_to_binary80: +**.LFB[0-9]+: +** .cfi_startproc +** mov(l|q) __bid64_to_binary80_x_out@GOTPCREL\(%rip\), %(r|e)ax +** movq \$-1, \(%(r|e)ax\) +** ret +**... 
+*/ + +typedef struct { + struct { + unsigned short lo4; + unsigned short lo3; + unsigned short lo2; + unsigned short lo1; + } i; +} BID_BINARY80LDOUBLE; +extern BID_BINARY80LDOUBLE __bid64_to_binary80_x_out; +void +__bid64_to_binary80 (void) +{ + __bid64_to_binary80_x_out.i.lo4 + = __bid64_to_binary80_x_out.i.lo3 + = __bid64_to_binary80_x_out.i.lo2 + = __bid64_to_binary80_x_out.i.lo1 = 65535; +} diff --git a/gcc/testsuite/gcc.target/i386/pr121015-10b.c b/gcc/testsuite/gcc.target/i386/pr121015-10b.c new file mode 100644 index 00000000000..06cb58f702d --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-10b.c @@ -0,0 +1,16 @@ +/* { dg-do compile { target { fpic && lp64 } } } */ +/* { dg-options "-O2 -march=x86-64 -fno-pic -mcmodel=large" } */ + +/* +**__bid64_to_binary80: +**.LFB[0-9]+: +** .cfi_startproc +** movabsq \$.LC0, %rax +** movq \(%rax\), %rdx +** movabsq \$__bid64_to_binary80_x_out, %rax +** movq %rdx, \(%rax\) +** ret +**... +*/ + +#include "pr121015-10a.c" diff --git a/gcc/testsuite/gcc.target/i386/pr121015-10c.c b/gcc/testsuite/gcc.target/i386/pr121015-10c.c new file mode 100644 index 00000000000..573a1562883 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-10c.c @@ -0,0 +1,21 @@ +/* { dg-do compile { target { fpic && lp64 } } } */ +/* { dg-options "-O2 -march=x86-64 -fpic -mcmodel=large" } */ + +/* +**__bid64_to_binary80: +**.LFB[0-9]+: +** .cfi_startproc +**.L2: +** leaq .L2\(%rip\), %rax +** movabsq \$_GLOBAL_OFFSET_TABLE_-.L2, %r11 +** movabsq \$__bid64_to_binary80_x_out@GOT, %rdx +** movabsq \$.LC0@GOTOFF, %rcx +** addq %r11, %rax +** movq \(%rax,%rdx\), %rdx +** movq \(%rax,%rcx\), %rax +** movq %rax, \(%rdx\) +** ret +**... 
+*/ + +#include "pr121015-10a.c" diff --git a/gcc/testsuite/gcc.target/i386/pr121015-11a.c b/gcc/testsuite/gcc.target/i386/pr121015-11a.c new file mode 100644 index 00000000000..b8bb3849fb7 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-11a.c @@ -0,0 +1,21 @@ +/* { dg-do compile { target fpic } } */ +/* { dg-options "-O2 -march=x86-64 -fpic" } */ +/* Keep labels and directives ('.cfi_startproc', '.cfi_endproc'). */ +/* { dg-final { check-function-bodies "**" "" "" { target { ! ia32 } } {^\t?\.} } } */ + +/* +**foo: +**.LFB[0-9]+: +** .cfi_startproc +** movd .LC0\(%rip\), %xmm0 +**... +*/ + +typedef char __v4qi __attribute__ ((__vector_size__ (4))); + +void +foo (void) +{ + __v4qi x = __extension__(__v4qi){-1, -1, -1, -1}; + asm ("reg %0" : : "v" (x)); +} diff --git a/gcc/testsuite/gcc.target/i386/pr121015-11b.c b/gcc/testsuite/gcc.target/i386/pr121015-11b.c new file mode 100644 index 00000000000..9ff2908829b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-11b.c @@ -0,0 +1,13 @@ +/* { dg-do compile { target { fpic && lp64 } } } */ +/* { dg-options "-O2 -march=x86-64 -fno-pic -mcmodel=large" } */ + +/* +**foo: +**.LFB[0-9]+: +** .cfi_startproc +** movabsq \$.LC0, %rax +** movd \(%rax\), %xmm0 +**... +*/ + +#include "pr121015-11a.c" diff --git a/gcc/testsuite/gcc.target/i386/pr121015-11c.c b/gcc/testsuite/gcc.target/i386/pr121015-11c.c new file mode 100644 index 00000000000..f0e6ccb2b92 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-11c.c @@ -0,0 +1,17 @@ +/* { dg-do compile { target { fpic && lp64 } } } */ +/* { dg-options "-O2 -march=x86-64 -fpic -mcmodel=large" } */ + +/* +**foo: +**.LFB[0-9]+: +** .cfi_startproc +**.L2: +** movabsq \$_GLOBAL_OFFSET_TABLE_-.L2, %r11 +** leaq .L2\(%rip\), %rax +** movabsq \$.LC0@GOTOFF, %rdx +** addq %r11, %rax +** movd \(%rax,%rdx\), %xmm0 +**... 
+*/ + +#include "pr121015-11a.c" diff --git a/gcc/testsuite/gcc.target/i386/pr121015-2a.c b/gcc/testsuite/gcc.target/i386/pr121015-2a.c new file mode 100644 index 00000000000..f94848023da --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-2a.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64" } */ + +void +foo (int *c1, int *c2) +{ + if (c1) + { + c1 = __builtin_assume_aligned (c1, 16); + c1[0] = 0; + c1[1] = 0; + } + if (c2) + { + c2 = __builtin_assume_aligned (c2, 16); + c2[0] = 0; + c2[1] = 0; + } +} + +/* { dg-final { scan-assembler-times "movl\[ \\t\]+\\\$0," 4 { target ia32 } } } */ +/* { dg-final { scan-assembler-times "movq\[ \\t\]+\\\$0," 2 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-not "xmm" { target { ! ia32 } } } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr121015-2b.c b/gcc/testsuite/gcc.target/i386/pr121015-2b.c new file mode 100644 index 00000000000..9df2766c612 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-2b.c @@ -0,0 +1,6 @@ +/* { dg-do compile { target ia32 } } */ +/* { dg-options "-O2 -mno-sse" } */ + +#include "pr121015-2a.c" + +/* { dg-final { scan-assembler-times "movl\[ \\t\]+\\\$0," 4 } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr121015-3.c b/gcc/testsuite/gcc.target/i386/pr121015-3.c new file mode 100644 index 00000000000..44bf63c73e6 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-3.c @@ -0,0 +1,35 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64" } */ + +typedef enum { CPP_NUMBER } cpp_ttype; +typedef struct { + bool unsignedp; + bool overflow; +} cpp_num; +extern cpp_num value, __trans_tmp_1; +extern cpp_ttype eval_token_token_0; +extern int eval_token_temp; +static cpp_num +eval_token(void) +{ + cpp_num __trans_tmp_2, result; + result.overflow = false; + switch (eval_token_token_0) + { + case CPP_NUMBER: + switch (eval_token_temp) + { + case 1: + return __trans_tmp_1; + } + result.unsignedp = false; + __trans_tmp_2 = result; + 
return __trans_tmp_2; + } + return result; +} +void +_cpp_parse_expr_pfile(void) +{ + value = eval_token(); +} diff --git a/gcc/testsuite/gcc.target/i386/pr121015-4.c b/gcc/testsuite/gcc.target/i386/pr121015-4.c new file mode 100644 index 00000000000..2848a946dd1 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-4.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64" } */ +/* Keep labels and directives ('.cfi_startproc', '.cfi_endproc'). */ +/* { dg-final { check-function-bodies "**" "" "" { target { ! ia32 } } {^\t?\.} } } */ + +/* +**zero: +**.LFB0: +** .cfi_startproc +** xorps %xmm0, %xmm0 +** ret +**... +*/ + +typedef float __v2sf __attribute__ ((__vector_size__ (8))); +extern __v2sf f1; + +__v2sf +zero (void) +{ + return __extension__(__v2sf){0, 0}; +} diff --git a/gcc/testsuite/gcc.target/i386/pr121015-5a.c b/gcc/testsuite/gcc.target/i386/pr121015-5a.c new file mode 100644 index 00000000000..605a87db1fc --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-5a.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64" } */ +/* Keep labels and directives ('.cfi_startproc', '.cfi_endproc'). */ +/* { dg-final { check-function-bodies "**" "" "" { target { ! ia32 } } {^\t?\.} } } */ + +/* +**m1: +**.LFB[0-9]+: +** .cfi_startproc +** movq .LC[0-9]+\(%rip\), %xmm0 +** ret +**... +*/ + +typedef char __v8qi __attribute__ ((__vector_size__ (8))); + +__v8qi +m1 (void) +{ + return __extension__(__v8qi){-1, -1, -1, -1, -1, -1, -1, -1}; +} diff --git a/gcc/testsuite/gcc.target/i386/pr121015-5b.c b/gcc/testsuite/gcc.target/i386/pr121015-5b.c new file mode 100644 index 00000000000..22d51fd33ef --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-5b.c @@ -0,0 +1,16 @@ +/* { dg-do compile { target { fpic && lp64 } } } */ +/* { dg-options "-O2 -march=x86-64 -fno-pic -mcmodel=large" } */ +/* Keep labels and directives ('.cfi_startproc', '.cfi_endproc'). 
*/ +/* { dg-final { check-function-bodies "**" "" "" { target "*-*-*" } {^\t?\.} } } */ + +/* +**m1: +**.LFB[0-9]+: +** .cfi_startproc +** movabsq \$.LC0, %rax +** movq \(%rax\), %xmm0 +** ret +**... +*/ + +#include "pr121015-5a.c" diff --git a/gcc/testsuite/gcc.target/i386/pr121015-5c.c b/gcc/testsuite/gcc.target/i386/pr121015-5c.c new file mode 100644 index 00000000000..bb210fa71ff --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-5c.c @@ -0,0 +1,20 @@ +/* { dg-do compile { target { fpic && lp64 } } } */ +/* { dg-options "-O2 -march=x86-64 -fpic -mcmodel=large" } */ +/* Keep labels and directives ('.cfi_startproc', '.cfi_endproc'). */ +/* { dg-final { check-function-bodies "**" "" "" { target "*-*-*" } {^\t?\.} } } */ + +/* +**m1: +**.LFB[0-9]+: +** .cfi_startproc +**.L2: +** movabsq \$_GLOBAL_OFFSET_TABLE_-.L2, %r11 +** leaq .L2\(%rip\), %rax +** movabsq \$.LC0@GOTOFF, %rdx +** addq %r11, %rax +** movq \(%rax,%rdx\), %xmm0 +** ret +**... +*/ + +#include "pr121015-5a.c" diff --git a/gcc/testsuite/gcc.target/i386/pr121015-6.c b/gcc/testsuite/gcc.target/i386/pr121015-6.c new file mode 100644 index 00000000000..daebcb0acc5 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-6.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64" } */ +/* Keep labels and directives ('.cfi_startproc', '.cfi_endproc'). */ +/* { dg-final { check-function-bodies "**" "" "" { target { ! ia32 } } {^\t?\.} } } */ + +/* +**m1: +**.LFB[0-9]+: +** .cfi_startproc +** pcmpeqd %xmm0, %xmm0 +** ret +**... 
+*/ + +#include <x86intrin.h> + +__m128i +m1 (void) +{ + __m64 x = _mm_set1_pi8 (-1); + __m128i y = _mm_set1_epi64 (x); + return y; +} diff --git a/gcc/testsuite/gcc.target/i386/pr121015-7a.c b/gcc/testsuite/gcc.target/i386/pr121015-7a.c new file mode 100644 index 00000000000..94037e33d81 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-7a.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64" } */ + +void +foo (int *c1, int *c2) +{ + if (c1) + { + c1 = __builtin_assume_aligned (c1, 16); + c1[0] = -1; + c1[1] = -1; + } + if (c2) + { + c2 = __builtin_assume_aligned (c2, 16); + c2[0] = -1; + c2[1] = -1; + } +} + +/* { dg-final { scan-assembler-times "movq\[ \\t\]+\[^\n\]*%xmm" 4 { target ia32 } } } */ +/* { dg-final { scan-assembler-times "movq\[ \\t\]+\\\$-1," 2 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-not "xmm" { target { ! ia32 } } } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr121015-7b.c b/gcc/testsuite/gcc.target/i386/pr121015-7b.c new file mode 100644 index 00000000000..3784ce0dfed --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-7b.c @@ -0,0 +1,6 @@ +/* { dg-do compile { target ia32 } } */ +/* { dg-options "-O2 -mno-sse" } */ + +#include "pr121015-7a.c" + +/* { dg-final { scan-assembler-times "movl\[ \\t\]+\\\$-1," 4 } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr121015-7c.c b/gcc/testsuite/gcc.target/i386/pr121015-7c.c new file mode 100644 index 00000000000..33b2df3ac9e --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-7c.c @@ -0,0 +1,8 @@ +/* { dg-do compile { target fpic } } */ +/* { dg-options "-O2 -march=x86-64 -fpic" } */ + +#include "pr121015-7a.c" + +/* { dg-final { scan-assembler-times "movq\[ \\t\]+\[^\n\]*%xmm" 4 { target ia32 } } } */ +/* { dg-final { scan-assembler-times "movq\[ \\t\]+\\\$-1," 2 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-not "xmm" { target { ! 
ia32 } } } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr121015-8.c b/gcc/testsuite/gcc.target/i386/pr121015-8.c new file mode 100644 index 00000000000..f911ecc0fc9 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-8.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-Og -fno-dce -mtune=generic" } */ + +typedef int __attribute__((__vector_size__ (4))) S; +extern int bar (S); + +int +foo () +{ + return bar ((S){-1}); +} + +/* { dg-final { scan-assembler-times "movl\[ \\t\]+\\\$-1, " 1 } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr121015-9.c b/gcc/testsuite/gcc.target/i386/pr121015-9.c new file mode 100644 index 00000000000..05c2021ba05 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr121015-9.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-Og -fno-dce -mtune=generic" } */ + +typedef int __attribute__((__vector_size__ (4))) S; +extern int bar (S); + +int +foo () +{ + return bar ((S){0}); +} + +/* { dg-final { scan-assembler-times "movl\[ \\t\]+\\\$0, \\(%esp\\)" 1 { target ia32 } } } */ +/* { dg-final { scan-assembler-times "movl\[ \\t\]+\\\$0, %edi" 1 { target { ! ia32 } } } } */ -- 2.50.1
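
For readers who want the two changes in the patch above in a more familiar form, here is a rough sketch in plain C. All three helper names are hypothetical and exist only for this illustration. The first helper merely mirrors the "GET_MODE_SIZE (mode) < 16" check and is not the pass itself. The other two show why the new splitters can replace a constant-pool load into a general register with a move of -1: an 8-byte or 4-byte all-ones vector image has the same bit pattern as the integer -1 of that width.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the new check: only all-ones vectors of
   at least 16 bytes are candidates for the shared all-ones constant.  */
static int
all_ones_cse_allowed (size_t mode_size_in_bytes)
{
  return mode_size_in_bytes >= 16;
}

/* Reinterpret an 8-byte vector image as a 64-bit integer, as a load
   into a 64-bit general register would see it.  */
static int64_t
load_v8qi_as_di (const unsigned char bytes[8])
{
  int64_t v;
  memcpy (&v, bytes, sizeof v);
  return v;
}

/* Same for a 4-byte vector image and a 32-bit general register.  */
static int32_t
load_v4qi_as_si (const unsigned char bytes[4])
{
  int32_t v;
  memcpy (&v, bytes, sizeof v);
  return v;
}
```

With a buffer filled with 0xff bytes, both load helpers return -1, which matches the "movq $-1" and "movl $-1" expectations in the 7a, 8 and 10a tests.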