Hi RuoYao
It’s probably because loongarch64 doesn’t support
can_vec_perm_const_p(result_mode, op_mode, sel2, false)
I’m not sure whether loongarch will support it, or whether I should just limit the
test target for pr54346.c.
Best Regards
Levy
> On 12 Oct 2022, at 9:51 pm, Xi Ruoyao wr
Added implementation for builtin overflow detection, new patterns are listed
below.
signed addition:
add t0, t1, t2
slti t3, t2, 0
slt t4, t0, t1
bne t3, t4, overflow
unsigned addition:
add t0, t1, t2
bltu t0, t1, overflow
sig
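The two addition patterns quoted above can be modelled in C as a rough sketch. The function names here (sadd32_overflows, uadd32_overflows, check_add_overflow) are hypothetical and are not taken from the patch; the sketch assumes 32-bit operands and relies on GCC's modulo behaviour when converting an out-of-range unsigned value to int32_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signed addition check, mirroring add / slti / slt / bne.
   The wrapping add is done in unsigned arithmetic so the C code itself
   has no undefined behaviour; the conversion back to int32_t is
   implementation-defined (modulo 2^32 with GCC). */
bool sadd32_overflows(int32_t a, int32_t b, int32_t *sum)
{
    *sum = (int32_t)((uint32_t)a + (uint32_t)b); /* add  t0, t1, t2 */
    bool b_negative = b < 0;                     /* slti t3, t2, 0  */
    bool sum_less = *sum < a;                    /* slt  t4, t0, t1 */
    /* Adding a negative value must decrease the sum, and adding a
       non-negative value must not; a mismatch means overflow. */
    return b_negative != sum_less;               /* bne  t3, t4, overflow */
}

/* Unsigned addition check, mirroring add / bltu.
   The sum wrapped around exactly when it is smaller than an operand. */
bool uadd32_overflows(uint32_t a, uint32_t b, uint32_t *sum)
{
    *sum = a + b;     /* add  t0, t1, t2 */
    return *sum < a;  /* bltu t0, t1, overflow */
}

/* Small self-check for the two helpers above. */
void check_add_overflow(void)
{
    int32_t s;
    uint32_t u;
    assert(sadd32_overflows(INT32_MAX, 1, &s));
    assert(sadd32_overflows(INT32_MIN, -1, &s));
    assert(!sadd32_overflows(-5, 3, &s) && s == -2);
    assert(uadd32_overflows(UINT32_MAX, 1u, &u) && u == 0);
    assert(!uadd32_overflows(1u, 2u, &u) && u == 3);
}
```

In real code, __builtin_add_overflow is the interface these patterns are meant to serve; the helpers only spell out the comparison logic.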
From: LevyHsu
Added implementation for builtin overflow detection, new patterns are listed
below.
---
Addition:
signed addition (SImode with RV32 || DImode with RV64):
add t0, t1, t2
slti t3, t2, 0
slt
From: LevyHsu
Added implementation for builtin overflow detection, new patterns are listed
below.
---
Addition:
signed addition (SImode in RV32 || DImode in RV64):
add t0, t1, t2
slti t3, t2, 0
slt t
From: Liwei Xu
This patch optimizes byte swaps in vectors using SSE2 instructions.
It targets 8-byte and 16-byte vectors, efficiently handling patterns like
__builtin_shufflevector(v, v, 1, 0, 3, 2, ...).
PR target/107563
gcc/ChangeLog:
* config/i386/i386-expand.cc (expand_vec
This patch extends support for BF16 vector operations in GCC, including bitwise
AND, ANDNOT, ABS, NEG, COPYSIGN, and XORSIGN for V8BF, V16BF, and V32BF modes.
Bootstrapped and tested on x86_64-linux-gnu. ok for trunk?
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_fp_absneg_ope
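For readers unfamiliar with why ABS, NEG, COPYSIGN, and XORSIGN reduce to bitwise operations, here is a scalar sketch on raw BF16 bit patterns. It is an illustration only, not the patch's implementation: the helper names are hypothetical, and it assumes the standard BF16 layout (1 sign bit, 8 exponent bits, 7 mantissa bits) stored in a uint16_t.

```c
#include <assert.h>
#include <stdint.h>

#define BF16_SIGN_MASK 0x8000u /* bit 15 holds the sign */

/* abs: clear the sign bit (AND with the inverted sign mask). */
uint16_t bf16_abs(uint16_t x)
{
    return (uint16_t)(x & ~BF16_SIGN_MASK);
}

/* neg: flip the sign bit (XOR with the sign mask). */
uint16_t bf16_neg(uint16_t x)
{
    return (uint16_t)(x ^ BF16_SIGN_MASK);
}

/* copysign: magnitude of x combined with the sign of y. */
uint16_t bf16_copysign(uint16_t x, uint16_t y)
{
    return (uint16_t)((x & ~BF16_SIGN_MASK) | (y & BF16_SIGN_MASK));
}

/* xorsign: x with its sign flipped when y is negative. */
uint16_t bf16_xorsign(uint16_t x, uint16_t y)
{
    return (uint16_t)(x ^ (y & BF16_SIGN_MASK));
}

/* Self-check using 1.0 (0x3f80), -1.0 (0xbf80) and -2.0 (0xc000). */
void check_bf16_sign_ops(void)
{
    assert(bf16_abs(0xbf80) == 0x3f80);
    assert(bf16_neg(0x3f80) == 0xbf80);
    assert(bf16_copysign(0x3f80, 0xc000) == 0xbf80);
    assert(bf16_xorsign(0xbf80, 0xc000) == 0x3f80);
}
```

A vector version applies the same mask to every 16-bit lane, which is why these operations map naturally onto vector AND, ANDNOT, OR, and XOR.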
embly code generation for configurations
supporting SSE2.
Bootstrapped and tested on x86_64-linux-gnu, OK for trunk?
Best
Levy
gcc/ChangeLog:
PR target/107563
* config/i386/i386-expand.cc (expand_vec_perm_psrlw_psllw_por): New
subroutine.
(ix86_expand_vec_perm_co
Replaced arithmetic shifts with logical shifts in
expand_vec_perm_psrlw_psllw_por to avoid sign bit extension issues. Also
corrected gen_vlshrv8hi3 to gen_lshrv8hi3 and gen_vashlv8hi3 to gen_ashlv8hi3.
Bootstrapped and tested on x86_64-linux-gnu, OK for trunk?
Co-authored-by: H.J. Lu
gcc/Chan
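The sign-extension issue mentioned above can be seen in a scalar model of one 16-bit lane. This is only a sketch: the function names are hypothetical, and the mapping to psrlw/psllw/por is inferred from the name expand_vec_perm_psrlw_psllw_por rather than from the patch body. The arithmetic shift is emulated explicitly so the example does not depend on implementation-defined behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Byte swap within a 16-bit lane using logical shifts:
   (x >> 8) moves the high byte down and fills with zeros,
   (x << 8) moves the low byte up, and OR combines them. */
uint16_t swap_bytes_logical(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}

/* The same sequence with an arithmetic right shift, emulated by
   copying the sign bit into the vacated upper byte. */
uint16_t swap_bytes_arith_model(uint16_t x)
{
    uint16_t fill = (x & 0x8000u) ? 0xff00u : 0x0000u;
    uint16_t asr8 = (uint16_t)((x >> 8) | fill);
    return (uint16_t)(asr8 | (uint16_t)(x << 8));
}

/* Apply the logical-shift swap to all eight lanes of a V8HI-sized array. */
void swap_bytes_v8hi(uint16_t lanes[8])
{
    for (int i = 0; i < 8; i++)
        lanes[i] = swap_bytes_logical(lanes[i]);
}

/* Self-check: both variants agree while bit 15 is clear, and the
   arithmetic variant corrupts the result once bit 15 is set. */
void check_byte_swap(void)
{
    assert(swap_bytes_logical(0x1234) == 0x3412);
    assert(swap_bytes_arith_model(0x1234) == 0x3412);
    assert(swap_bytes_logical(0x8012) == 0x1280);
    assert(swap_bytes_arith_model(0x8012) == 0xff80);
}
```

With the arithmetic shift, the copies of the sign bit land in the upper byte and overwrite the byte that the left shift placed there, so any lane with bit 15 set comes out wrong.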
gcc/ChangeLog:
* config/i386/i386-expand.cc
(ix86_vectorize_vec_perm_const): Convert BF to HI using subreg.
* config/i386/predicates.md
(vcvtne2ps2bf_parallel): New define_insn_and_split.
* config/i386/sse.md
(vpermt2_sepcial_bf16_shuffle_): New pred
This patch updates the GCC x86 backend to efficiently handle
odd, incrementally increasing permutations of BF16 vectors
using the cvtne2ps2bf16 instruction.
It modifies ix86_vectorize_vec_perm_const to support these operations
and adds a specific predicate to ensure proper sequence handling.
Boots
PR target/107563
gcc/ChangeLog:
* config/i386/i386-expand.cc (expand_vec_perm_psrlw_psllw_por): New
subroutine.
(ix86_expand_vec_perm_const_1): New Entry.
gcc/testsuite/ChangeLog:
* g++.target/i386/pr107563.C: New test.
---
gcc/config/i386/i386-expand.cc
embly code generation for configurations
supporting SSE2.
This update addresses the issue detailed in Bugzilla report 107563.
Bootstrapped and tested on x86_64-linux-gnu, OK for trunk?
BRs,
Levy
gcc/ChangeLog:
* config/i386/i386-expand.cc (expand_vec_perm_psrlw_psllw_por)
handling.
Bootstrapped and tested on x86_64-linux-gnu, OK for trunk?
BRs,
Levy
gcc/ChangeLog:
* config/i386/i386-expand.cc
(ix86_vectorize_vec_perm_const): Convert BF to HI using subreg.
* config/i386/predicates.md
(vcvtne2ps2bf_parallel): New define_insn_and_split
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
This patch introduces new mode iterators and expands for the i386 architecture
to support partial vectorization of bf16 operations using AVX10.2 instructions.
These operations include addition, subtraction, multiplication, d
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
This patch supports sminmax for partial vectorized V2BF/V4BF.
gcc/ChangeLog:
* config/i386/mmx.md (3): New define_expand for
V2BF/V4BF smaxmin
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-partial-bf-v
Hi
This change adds BFmode support to the ix86_preferred_simd_mode function,
enhancing SIMD vectorization for BF16 operations. The update ensures
optimized usage of SIMD capabilities, improving performance and aligning
vector sizes with processor capabilities.
Bootstrapped and tested on x86-64-pc-l
Hi
This patch adds support for bf16 operations in V2BF and V4BF modes on i386,
handling signbit, xorsign, copysign, abs, neg, and various logical operations.
Bootstrapped and tested on x86-64-pc-linux-gnu.
Ok for trunk?
gcc/ChangeLog:
* config/i386/i386.cc (ix86_build_const_vector): Ad
Hi
Bootstrapped and tested on x86-64-pc-linux-gnu.
Ok for trunk?
This patch introduces support for vectorized FMA operations for bf16 types in
V2BF and V4BF modes on the i386 architecture. New mode iterators and
define_expand entries for fma, fnma, fms, and fnms operations are added in
mmx.md, e
Simple testcase fix, ok for trunk?
This patch removes specific register checks to account for possible
register spills and disables tests in 32-bit mode. This adjustment
is necessary because V4BF operations in 32-bit mode require duplicating
instructions, which leads to unintended test failures. It
gcc/ChangeLog:
* config/i386/i386.cc (ix86_get_mask_mode):
Enable BFmode for targetm.vectorize.get_mask_mode with AVX10.2.
* config/i386/mmx.md (vec_cmpqi):
Implement vec_cmpv2bfqi and vec_cmpv4bfqi.
gcc/testsuite/ChangeLog:
* gcc.target/i386/part-vect-vec
Simple testcase fix, ok for trunk?
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-partial-bf-vector-fma-1.c: Separated 32-bit
scan and removed register checks in spill situations.
---
.../i386/avx10_2-partial-bf-vector-fma-1.c | 12
1 file changed, 8 i
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
gcc/ChangeLog:
* config/i386/i386.cc (ix86_get_mask_mode):
Enable BFmode for targetm.vectorize.get_mask_mode with AVX10.2.
* config/i386/mmx.md (vec_cmpqi):
Implement vec_cmpv2bfqi and vec_cmpv
Bootstrapped and tested on x86_64-linux-gnu, OK for trunk?
gcc/ChangeLog:
* config/i386/i386.md: Rewrite insn truncsfbf2.
gcc/testsuite/ChangeLog:
* gcc.target/i386/truncsfbf-1.c: New test.
* gcc.target/i386/truncsfbf-2.c: New test.
---
gcc/config/i386/i386.md
This patch enables vectorization of the popcount operation for V2QI, V4QI,
V8QI, V2HI, V4HI, and V2SI modes.
gcc/ChangeLog:
* config/i386/mmx.md:
(VQI_16_32_64): New mode iterator for 8-byte, 4-byte, and 2-byte QImode.
(popcount2): New pattern for popcount of V2QI/V4QI/V8Q
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m64}.
Ok for trunk?
This patch enables the use of the VCOMSBF16 instruction from AVX10.2 for
efficient BF16 comparisons.
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_branch): Handle BFmode
when TARGET_AVX10_2_256 is e