Since there's no evex version for vpcmpeq ymm, ymm, ymm.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk and backport to GCC13.
gcc/ChangeLog:
PR target/110227
* config/i386/sse.md (mov_internal>): Use x instead of v
for alter
t for excess
errors)
FAIL: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings,
line 45)
with GCC configured with
../../gcc/configure
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-1805/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with
apped and tested on powerpc64-linux BE and LE with no regressions.
>
> Thanks
> Gui Haochen
>
> ChangeLog
> 2023-05-26 Haochen Gui
>
> gcc/
> PR target/104124
> * config/rs6000/altivec.md (*altivec_vupkhs_direct): Rename
> to...
> (
Hi Carl,
on 2023/6/15 04:37, Carl Love wrote:
> Kewen, GCC maintainers:
>
> Version 4, added missing cases for new xxexpqp, xsxexpdp and xsxsigqp
> cases to rs6000_expand_builtin. Merged the new define_insn definitions
> with the existing definitions. Renamed the builtins
Hi,
Gentle ping this series:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html
BR,
Kewen
>
>>
>> on 2022/11/24 17:15, Kewen Lin wrote:
>>> Hi,
>>>
>>> Following Segher's suggestion, this patch series is to rework
>>>
Hi,
I'd like to gentle ping this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614818.html
BR,
Kewen
> on 2023/3/29 15:18, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> By addressing Alexander's comments, against v1 this
>> patch v2 mainly
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html
BR,
Kewen
on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> As Honza pointed out in [1], the current uses of function
> optimize_function_for_speed_p in rs6000_option_override_in
On 2023-06-13 17:18, Jiufu Guo via Gcc-patches wrote:
Hi David,
Thanks for your valuable comments!
David Edelsohn writes:
...
Do you have any measurement of how expensive it is to test all of
these additional methods to generate a constant? How much does this
affect the
compile time
-2.c execution test
FAIL: gcc.target/i386/sse2-packuswb-1.c execution test
Bootstrapped and regtested on x86_64-pc-linux-gnu.
Ok for trunk?
gcc/ChangeLog:
PR target/110235
* config/i386/i386-expand.cc (ix86_split_mmx_pack): Use
UNSPEC_US_TRUNCATE instead of original
test.
Bootstrapped and regtested on x86_64-pc-linux-gnu.
Ok for trunk?
gcc/ChangeLog:
PR target/110235
* config/i386/sse.md (_packsswb): Split
to below 3 new define_insns.
(sse2_packsswb): New define_insn.
(avx2_packsswb): Ditto.
(avx512bw_pac
Thanks Richard.
I have addressed all comments on V7 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624220.html
Drop vlse/vsse codegen optimization in RISC-V backend, instead I will support
LEN_MASK_STRIDED_LOAD/LEN_MASK_STRIDE_STORE
in the future.
Thanks.
juzhe.zh...@rivai.ai
_p (, ) is always true, so
ira_reg_class_subset[ALL_REGS][NO_REGS] ends up being set to
cl3 = NO_LD_REGS. Adding a continue if hard_reg_set_empty_p (temp_hard_regset)
fixes the problem for me.
Does the below patch look ok? Bootstrapping and regression
testing passed on x86_64.
Regards
Se
.cc -std=c++14 (test for excess errors)
FAIL: g++.dg/vect/pr110557.cc -std=c++17 (test for excess errors)
FAIL: g++.dg/vect/pr110557.cc -std=c++20 (test for excess errors)
FAIL: g++.dg/vect/pr110557.cc -std=c++98 (test for excess errors)
with GCC configured with
../../gcc/configure
--prefix
From: Yanzhang Wang
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf
when enabling -mno-omit-leaf-frame-pointer
(riscv_option_override): Override omit-frame-pointer.
(riscv_frame_pointer_required): Save s0 for non-leaf function
Hi Carl,
on 2023/7/12 02:06, Carl Love wrote:
> GCC maintainers:
>
> Ver 4, Removed extra space in subject line. Added comment to commit
> log comments about new __SET_FPSCR_RN_RETURNS_FPSCR__ define. Changed
> Added to Add and Renamed to Rename in ChangeLog. Update
Hi Carl,
on 2023/7/8 04:18, Carl Love wrote:
>
> GCC maintainers:
>
> Version 3, added code to altivec_resolve_overloaded_builtin so the
> correct instruction is selected for the size of the second argument.
> This restores the instruction counts to the original values w
From: Naveen H S
This patch adds lowering bit-field and opposite endian accesses pass.
The patch addresses many issues in:-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19466
2023-07-14 Andrew Pinski
Co-authored-by: Naveen H S
gcc/ChangeLog:
* Makefile.in (OBJS): Add gimple-lower
ported variable 'new_temp'
as NULL_TREE.
Confirmed this patch fixed the reported issue in PR110652
(with the same configuration).
Is it ok for trunk?
BR,
Kewen
-
PR tree-optimization/110652
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_load): Initialize new_
-gnu{-m32,}
Ready to push to trunk.
libpng/pngread.c: In function ‘png_read_image’:
libpng/pngread.c:786:1: internal compiler error: in final_scan_insn_1, at
final.cc:2813
786 | }
| ^
0x73ac3d final_scan_insn_1
../../gcc/final.cc:2813
0xb3420b final_scan_insn(rtx_insn*, _IO_FILE*, int
on 2023/7/17 14:39, Richard Biener wrote:
> On Mon, Jul 17, 2023 at 4:22 AM Kewen.Lin wrote:
>>
>> Hi,
>>
>> As PR110652 and its duplicate PRs show, there could be one
>> build error
>>
>> error: 'new_temp' may be used uninitialized
>>
>> for some build configurations. It's a false positive war
From: Yanzhang Wang
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf
when enabling -mno-omit-leaf-frame-pointer
(riscv_option_override): Override omit-frame-pointer.
(riscv_frame_pointer_required): Save s0 for non-leaf function
vsrwz has lower latency than xxextractuw. So it should be generated
Nice, it also has lower latency than vextuw[lr]x.
> even with p9 vector enabled if possible. Also the instruction is
> already zero extended. A combine pattern is needed to eliminate
> redundant zero extend instruct
needs some justification why
it changes like that and the change is expected.
BR,
Kewen
on 2023/7/18 23:39, Carl Love wrote:
> Ping
>
> On Thu, 2023-06-01 at 16:11 -0700, Carl Love wrote:
>> GCC maintainers:
>>
>> The following patch updates the expected instruction coun
Any recommendations? Thanks a lot.
Sorry for the late review, this patch is okay for trunk with the below
nit tweaked or not. Thanks!
>
> ChangeLog
> 2022-09-26 Haochen Gui
>
> gcc/
> PR target/103605
> * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fol
it is
> not
> PCREL and also when the user explicitly requests TOC or fixed. If the register
> r2 is fixed, it is made as non-volatile. Changes in register preservation
> roles
> can be accomplished with the help of available target hooks
> (TARGET_CONDITIONAL_REGISTER_USAGE).
Hi Fangrui,
on 2023/7/19 14:33, Fangrui Song wrote:
> On Thu, Nov 24, 2022 at 7:26 PM Kewen.Lin via Gcc-patches
> wrote:
>>
>> Hi Richard,
>>
>> on 2022/11/23 00:08, Richard Sandiford wrote:
>>> "Kewen.Lin" writes:
>>>> Hi Richard,
>
and
movxo pattern to disallow these types of addresses, which assists LRA in
resolving this issue. Furthermore, the mode size 16 check has been
removed in vsx_quad_dform_memory_operand to allow OOmode and
quad_address_p already handles less than size 16.
2023-07-19 Jeevitha Palanisamy
gcc
Palanisamy
gcc/
PR target/110411
* config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add fields
to hold PTImode type.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Add node
for PTImode type.
gcc/testsuite/
PR target/106895
utable
> +FAIL: gcc.target/i386/float16-7.c (test for errors, line 7)
>
> Perhaps we need to tweak
> gcc/testsuite/lib/target-supports.exp (add_options_for_float16)
> so that it adds -msse2 for i?86-*-* x86_64-*-* (that would likely
> fix up floatn-convert) and for the others
rapped and regtested on x86_64-pc-linux-gnu{-m32,}.
If AMD also likes such an optimization, OK for trunk?
gcc/ChangeLog:
* config/i386/sse.md (_lddqu): Change to
define_expand, expand as simple move when TARGET_AVX
&& ( == 16 || !TARGET_AVX256_SPLIT_UNALIGNED_LOAD).
the function_section.
As Fangrui suggested[1], this patch is to add a bit more test
coverage. I didn't find a good way to check all linked_to
symbols are different, so I checked for LPFE[012] here.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624866.html
Tested well on x86_64-r
110744. This patch is to fix the related handlings with
the correct index.
Bootstrapped and regress-tested on x86_64-redhat-linux,
powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9/P10.
Is it ok for trunk?
BR,
Kewen
-
PR tree-optimization/110744
gcc/ChangeLog:
* tree-ssa
: g++.dg/cpp0x/udlit-extended-id-3.C -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp0x/udlit-extended-id-3.C -std=c++20 (test for excess errors)
with GCC configured with
../../gcc/configure
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2629/usr
--enable-clocale=gnu
-pr95839-v8.c -flto -ffat-lto-objects scan-tree-dump
slp2 "optimized: basic block"
FAIL: gcc.dg/vect/bb-slp-pr95839-v8.c scan-tree-dump slp2 "optimized: basic
block"
with GCC configured with
../../gcc/configure
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r1
Hi Carl,
on 2023/7/18 03:19, Carl Love wrote:
>
> GCC maintainers:
>
> The rs6000 function find_instance assumes that it is called for built-
> ins with only two arguments. There is no checking for the actual
> number of arguments used in the built-in. This patch ad
Hi Carl,
on 2023/7/18 03:20, Carl Love wrote:
> GCC maintainers:
>
> Version 4, changed the new RS6000_OVLD_VEC_REPLACE_UN case statement
> rs6000/rs6000-c.cc. The existing REPLACE_ELT iterator name was changed
> to REPLACE_ELT_V along with the associated define_mode_attr. Rena
way to check all linked_to
>> symbols are different, so I checked for LPFE[012] here.
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624866.html
>>
>> Tested well on x86_64-redhat-linux, powerpc64-linux-gnu
>> P7/P8/P9 and powerpc64le-linux-gnu
-linux,
>> powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9/P10.
>>
>> Is it ok for trunk?
>>
>> BR,
>> Kewen
>> -
>> PR tree-optimization/110744
>>
>> gcc/ChangeLog:
>>
>> * tree-ssa-sccvn.cc (vn_reference_lookup_3): Correct the index of bias
>> operand for ifn IFN_LEN_STORE.
>
> OK, thanks.
>
Thanks Richard! Pushed as r14-2694.
BR,
Kewen
cc.dg/vect/slp-perm-{1,5,6,7}.c
Bootstrapped and regtested on x86_64-redhat-linux,
aarch64-linux-gnu, powerpc64-linux-gnu P8/P9 and
powerpc64le-linux-gnu P9/P10.
Is it ok for trunk?
BR,
Kewen
-
PR tree-optimization/110740
gcc/ChangeLog:
* tree-vect-lo
(test for excess errors)
FAIL: g++.dg/gomp/pr58567.C -std=c++17 (test for excess errors)
FAIL: g++.dg/gomp/pr58567.C -std=c++20 (test for excess errors)
FAIL: g++.dg/gomp/pr58567.C -std=c++98 (test for excess errors)
with GCC configured with
../../gcc/configure
--prefix=/export/users/haochenj
]*zmm
FAIL: gcc.target/i386/pr93089-3.c scan-assembler vmulps[^\n\r]*zmm
with GCC configured with
../../gcc/configure
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2709/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable-languages=c,c
Hi Iain,
on 2023/7/22 23:58, Iain Sandoe wrote:
> Hi Kewen,
>
> This patch breaks bootstrap on powerpc-darwin (which has Altivec, but not
> VSX) while building libgfortran.
>
>> On 3 Jul 2023, at 04:19, Kewen.Lin via Gcc-patches
>> wrote:
>
> Please
Hi Carl,
on 2023/7/22 07:38, Carl Love wrote:
> GCC maintainers:
>
> Version 2: Updated a number of formatting and spacing issues. Added
> the NARGS description to the header comment for function find_instance.
> This patch was tested on Power 8 LE/BE, Power 9 LE/BE and Power
Hi Carl,
on 2023/7/22 07:38, Carl Love wrote:
> GCC maintainers:
>
> Version 5, Fixed patch description, the first argument should be of
> type vector. Fixed comment in vsx.md to say "Vector and scalar
> extract_elt iterator/attr ". Removed a few of t
> which can help eliminate redundant zero extend.
>
> Compared to last version, the main change is to add a new expand for V4SI
> and separate "vsx_extract_si" to 2 insn patterns.
> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622101.html
>
> Bootstrap
on 2023/7/21 19:49, Richard Biener wrote:
> On Fri, Jul 21, 2023 at 8:08 AM Kewen.Lin wrote:
>>
>> Hi,
>>
>> The function vect_update_epilogue_niters which has been
>> removed by r14-2281 has some code taking care of that if
>> there is only one scalar iteration left for epilogue then
>> we won't
AIL: gfortran.dg/gomp/pr99226.f90 -O (test for excess errors)
with GCC configured with
../../gcc/configure
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2754/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable-languages=c,c++,fort
9/P10.
Is it ok for trunk?
BR,
Kewen
-
Co-authored-by: Richard Biener
PR tree-optimization/110776
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_load): Always cost VMAT_ELEMENTWISE
as scalar load.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr110776.c
nu P9/P10.
I'll push this soon and backport to release branches after
a week or so.
BR,
Kewen
-
PR target/110741
gcc/ChangeLog:
* config/rs6000/vsx.md (define_insn xxeval): Correct vsx
operands output with "x".
gcc/testsuite/ChangeLog:
, IBM 128-bit long double
> * Power9, LE, --with-cpu=power9, IEEE 128-bit long double
> * Power9, LE, --with-cpu=power9, 64-bit default long double
> * Power9, BE, --with-cpu=power9, IBM 128-bit long double
> * Power8, BE, --with-cpu=power8, IBM 128-bit long d
on 2023/7/26 18:02, Richard Biener wrote:
> On Wed, Jul 26, 2023 at 4:52 AM Kewen.Lin wrote:
>>
>> Hi,
>>
>> PR110776 exposes one issue that we could query unaligned
>> load for vector type but actually no unaligned vector load
>> is supported there. The reason is that the costed load is
>> with
cmpltps 3
FAIL: g++.target/i386/pr98218-1.C -std=gnu++98 scan-assembler-times pcmpgtd 2
with GCC configured with
../../gcc/configure
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2786/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable
Prevent rtl optimization of vec_duplicate + zero_extend to
vpbroadcastm since there could be an extra kmov after RA.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
Ready to push to trunk.
gcc/ChangeLog:
PR target/110788
* config/i386/sse.md (avx512cd_maskb_vec_dup
/vector/bool/110807.cc (test for excess errors)
with GCC configured with
../../gcc/configure
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2797/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet
;
> vector long long
> splat_dup_l_0 (vector long long v)
> {
> return __builtin_vec_splats (__builtin_vec_extract (v, 0));
> }
>
> would generate:
>
> mfvsrld 9,34
> mtvsrdd 34,9,9
> blr
>
> With this patch, GC
't need a redundant vector extraction at all.
> is a memory operand. Only one 'stxsi[hb]x' instruction is enough.
>
> The V4SImode is fixed in a previous patch.
> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622101.html
>
> Bootstrapped and tested on powe
below vmv.
vsetvli zero,a2,e32,m1,ta,ma
vmv.v.i v1,0
vs1r.v v1,0(a0)
It will reduce the mul with const 0 instruction to a simple mov
instruction.
Signed-off-by: Yanzhang Wang
gcc/ChangeLog:
* config/riscv/autovec-opt.md: Add a split pattern.
gcc/testsuite/ChangeLog
[^\n\r]*xmm[0-9] 1
FAIL: gcc.target/i386/pr87007-5.c scan-assembler-times vxorps[^\n\r]*xmm[0-9] 1
with GCC configured with
../../gcc/configure
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2834/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath
srwz
> which helps eliminate redundant zero extend.
>
> Compared to last version, the main change is to move "vsx_extract_v4si_w1"
> and "*mfvsrwz" to the front of "*vsx_extract__di_p9". Also some insn
> conditions are changed to assertions.
> https://gc
Hi Carl,
on 2023/7/28 23:00, Carl Love wrote:
> GCC maintainers:
>
> The following patch cleans up the definition for the
> __builtin_altivec_vcmpnet. The current implementation implies that the
s/__builtin_altivec_vcmpnet/__builtin_altivec_vcmpne[bhw]/
> built-in is only suppo
vpextrd $3, %xmm0, %eax
vmovddup %xmm3, %xmm0
vrndscalepd $9, %xmm0, %xmm0
vunpckhpd %xmm0, %xmm0, %xmm3
for vrndscalepd, no need to insert pxor since it reuses input register
xmm0 to avoid partial sse dependence.
Pushed to trunk.
gcc/testsuite/ChangeLog:
* gcc.target/i386
AVX512FP16 supports vfmaddsubXXXph and vfmsubaddXXXph.
Also remove scalar mode from fmaddsub/fmsubadd pattern since there's
no scalar instruction for that.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.
gcc/ChangeLog:
PR target/81904
* c
to load port
compared to vbroadcasti128. From a latency perspective, vbroadcasti is no
worse than vlddqu + vinserti128.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625122.html
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
Ok for trunk?
gcc/ChangeLog:
* config/i386/sse.md (*avx
-std=gnu++98 pr102690 (test for bogus
messages, line 22)
with GCC configured with
../../gcc/configure
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1357/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable-languages=c,c
Hi Richi,
Thanks for the insightful comments!
on 2022/7/1 16:40, Richard Biener wrote:
> On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin wrote:
>>
>> Hi,
>>
>> Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html
>>
>> BR,
>> Kew
.f90 -O1 output pattern test
with GCC configured with
../../gcc/configure
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1395/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl
at line 32 (test for warnings,
line 28)
FAIL: gcc.dg/analyzer/allocation-size-4.c note at line 33 (test for warnings,
line 28)
FAIL: gcc.dg/analyzer/allocation-size-4.c warning at line 31 (test for
warnings, line 28)
with GCC configured with
../../gcc/configure
--prefix=/local/skpandey/gccwork
(test for excess errors)
FAIL: gcc.dg/auto-init-uninit-4.c (test for excess errors)
with GCC configured with
../../gcc/configure
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1450/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable
-access-path-13.c scan-tree-dump-times fre1 "return
123" 1
with GCC configured with
../../gcc/configure
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1460/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable-languages=c,c
.c scan-assembler pandn
FAIL: gcc.target/i386/pr65105-5.c scan-assembler ptest
with GCC configured with
../../gcc/configure
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1509/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable
f it looks good to you. Thanks!
-
PR target/106091
gcc/ChangeLog:
* config/rs6000/rs6000-p8swap.cc (replace_swapped_aligned_store): Copy
REG_EH_REGION when replacing one store insn having it.
(replace_swapped_aligned_load): Likewise.
gcc/t
the question, I'm not sure :(,
when I was drafting this patch, I wondered if there is one function
passing/copying reg_note REG_EH_REGION for this kind of need,
so I went through almost all the places related to REG_EH_REGION,
but nothing desired was found (though I may miss sth
the statement was only defining one ssa name.
OK? Bootstrapped and tested on x86_64 with no regressions.
PR tree-optimization/106087
gcc/ChangeLog:
* tree-ssa-dce.cc (simple_dce_from_worklist): Check
to make sure the statement is only defining one operand.
gcc/testsuite
ailed. I checked the unrecognizable
pattern and the original patch, I guessed it needs a tiny adjustment
like below:
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index dde123e87b8..0a089f12510 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -42
xcess errors)
with GCC configured with
../../gcc/configure
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1573/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl
--enable-l
the statement was only defining one ssa name.
Committed as approved after being bootstrapped and tested on x86_64 with no
regressions.
PR tree-optimization/106087
gcc/ChangeLog:
* tree-ssa-dce.cc (simple_dce_from_worklist): Check
to make sure the statement is only defining one
)
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
gcc/ChangeLog:
PR target/106038
* config/i386/mmx.md (3): Expand
with (clobber (reg:CC flags_reg)) under TARGET_64BIT
(mmx_code>3): Ditto.
(*mmx_3_1): New define_insn, add post_rel
pproved it. I
guessed that
thread escaped from your mail radar somehow, it started from [1].
[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-July/597595.html
BR,
Kewen
cgraph node for it, w/o this patch function
>>> optimize_function_for_speed_p returns true eventually, while it
>>> returns false with this patch. Since the command line option -Os
>>> is specified, there is no reason to interpret it as "for speed".
>>>
x-gnu{-m32,}.
Also tested the patch on SPEC2017 and found complex type vectorization
in 510/549 (but no performance impact).
Any comments?
gcc/ChangeLog:
PR tree-optimization/106010
* tree-vect-data-refs.cc (vect_get_data_access_cost):
Pass complex_p to vect_get_
Hi Jeff,
Thanks for the patch, one question is inlined below.
on 2022/7/4 14:58, Jiufu Guo wrote:
> The high part of the symbol address is invalid for the constant pool. In
> function rs6000_cannot_force_const_mem, we already return true for
> "HIGH with UNSPEC" rtx. During
at function can also be safely removed.
>
The TargetVariable rs6000_builtin_mask in rs6000.opt is useless, it seems
it can be removed together?
> I have tested this on current systems (P8,P9,P10) without regressions.
>
> OK for trunk?
>
>
> Thanks,
> -
inux-gnu{-m32,}.
No big impact on SPEC2017 (mostly the same binaries).
Ok for trunk?
gcc/ChangeLog:
PR target/106038
* config/i386/mmx.md (3): Expand
with (clobber (reg:CC flags_reg)) under TARGET_64BIT
(mmx_code>3): Ditto.
(*mmx_3_gpr): New define_insn, add po
And split it after reload.
>IMO, the only case it is worth adding is a direct immediate store to
>memory, which HJ recently added.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
gcc/ChangeLog:
PR target/106038
* config/i386/mmx.md (3): Extend to A
ore).
Any comments?
gcc/ChangeLog:
PR tree-optimization/106010
* tree-vect-data-refs.cc (vect_get_data_access_cost):
Pass complex_p to vect_get_num_copies to avoid ICE.
(vect_analyze_data_refs): Support vectorization for Complex
type with vector sc
e combine/forwprop will do optimization.
> Please use if (!register_operand (operands[2], mode)) instead.
Changed.
Update patch.
gcc/ChangeLog:
PR target/106038
* config/i386/mmx.md (3): New define_expand, it's
original "3".
(*3): New define_insn, i
as been tested as before, so this patch is OK. Thanks!
> gcc/
> * config/rs6000/rs6000-c.cc: Update comments.
> (rs6000_target_modify_macros): Remove bu_mask references.
> (rs6000_define_or_undefine_macro): Replace bu_mask reference
> wit
_sincos additionally expands pow&cabs, this patch
splits that part into a separate pass named pass_expand_powcabs which
remains at the old pass position.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Observed more libmvec sin/cos vectorization in specfp, but no big performance impact.
Ok fo
PLEX_CST
for rhs. And it will enable vectorization for pr106010-8a.c.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
2022-07-20 Richard Biener
Hongtao Liu
gcc/ChangeLog:
PR tree-optimization/106010
* tree-complex.cc (init_dont_simulate_a
From: Sören Tempel
The `(( expression ))` syntax is a Bash extension and not supported by
POSIX shell [1]. However, the arithmetic expressions used by the
gobuild() function can also be expressed using arithmetic POSIX
expansions with `$(( expression ))` [2].
Contrary to the Bash extension, arit
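The difference described in the message above can be sketched roughly as follows. This is only an illustration: the variable name `count` and the values are hypothetical and are not taken from the gobuild() function. It assumes any POSIX-conforming sh.

```sh
#!/bin/sh
# Hypothetical counter; not taken from the gobuild() function itself.
count=3

# Bash-only arithmetic command (not POSIX):
#   (( count = count + 1 ))
# POSIX arithmetic expansion: the result is substituted as a string,
# so it has to be assigned or otherwise used explicitly.
count=$(( count + 1 ))
echo "$count"

# The Bash (( )) command also sets an exit status. With the POSIX form,
# an explicit test on the expanded value (1 for true, 0 for false)
# takes its place.
if [ "$(( count > 3 ))" -eq 1 ]; then
    echo "greater"
fi
```

The assignment and the explicit `[ ... -eq 1 ]` test are one possible way to express what the Bash form does implicitly; the actual patch may structure this differently.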
-linux-gnu P7 and P8,
and powerpc64le-linux-gnu P9. Bootstrapped on
powerpc64le-linux-gnu P10, but one failure was exposed during
regression testing there, it's identified as a missed
optimization and can be reproduced without this support,
PR106365 was opened for further tracking.
Is it fo
ailures
on gcc.target/powerpc/pr92398.p9-.c fixed, I can see it helps to
bring back some testing coverage like:
NA->PASS: gcc.target/powerpc/pr92398.p9+.c
NA->PASS: gcc.target/powerpc/pr93453-1.c
I'll push this soon if no objections.
BR,
Kewen
-
PR testsuite/106345
gc
original -mdejagnu-cpu
when it's required) accordingly.
Tested on powerpc64-linux-gnu P7 and P8 and
powerpc64le-linux-gnu P9 and P10, also with explicit p10
tune setting for configuration.
I'll push this soon if no objections.
BR,
Kewen
-
PR testsuite/106345
gcc/testsuite
c.dg/pr23911.c scan-tree-dump-times dce3 "__complex__ \\(1.0e\\+0,
0.0\\)" 2
FAIL: gcc.dg/pr56837.c scan-tree-dump-times optimized "memset ..c, 68, 16384.;"
1
with GCC configured with
../../gcc/configure
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1762/u
And split it after reload.
gcc/ChangeLog:
PR target/106038
* config/i386/mmx.md (3): New define_expand, it's
original "3".
(*3): New define_insn, it's original
"3" be extended to handle memory and immediate
operan
Hi Richi,
on 2022/7/21 17:01, Richard Biener via Gcc-patches wrote:
> The following teaches VN to handle reads from .MASK_STORE and
> .LEN_STORE. For this push_partial_def is extended first for
> convenience so we don't have to handle the full def case in the
> caller (possibl
Hi Segher,
Thanks for the comments!
on 2022/7/22 06:09, Segher Boessenkool wrote:
> On Wed, Jul 20, 2022 at 05:32:01PM +0800, Kewen.Lin wrote:
>> As the failure of test case gcc.target/powerpc/pr92398.p9-.c in
>> PR106345 shows, some test sources for some powerpc effective
>> targets use empty tr
Hi!
on 2022/7/22 09:02, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Jul 22, 2022 at 08:41:43AM +0800, Kewen.Lin wrote:
>> Hi Segher,
>>
>> Thanks for the comments!
>
> Always.
>
This patch is to fix empty TUs with one dummy variable definition
accordingly.
>>>
>>> You can also use
>>>
FAIL: gcc.dg/analyzer/stdarg-3.c (test for excess errors)
with GCC configured with
../../gcc/configure
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1786/usr
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
--with-fpmath=sse --enable-languages=c,c++,fortran
r13-1762-gf9d4c3b45c5ed5f45c8089c990dbd4e181929c3d lower complex type
move to scalars, but testcase pr23911 is supposed to scan __complex__
constant which is never available, so adjust testcase to scan
IMAGPART/REALPART_EXPR constants separately.
Pushed as obvious patch.
gcc/testsuite/ChangeLog