Re: [PATCH v7] Implement new RTL optimizations pass: fold-mem-offsets.

2023-11-27 Thread Jakub Jelinek
On Mon, Oct 16, 2023 at 01:11:01PM -0600, Jeff Law wrote: > > gcc/ChangeLog: > > > > * Makefile.in: Add fold-mem-offsets.o. > > * passes.def: Schedule a new pass. > > * tree-pass.h (make_pass_fold_mem_offsets): Declare. > > * common.opt: New options. > > * doc/invoke.texi: Docu

[PATCH]middle-end: prevent LIM from hoisting vector compares from gconds if target does not support it.

2023-11-27 Thread Tamar Christina
Hi All, LIM notices that in some cases the condition and the results are loop invariant and tries to move them out of the loop. While the resulting code is operationally sound, moving the compare out of the gcond results in generating code that no longer branches, so cbranch is no longer applicab

[PATCH]middle-end: refactor vectorizable_live_operation into helper method for codegen

2023-11-27 Thread Tamar Christina
Hi All, To make code review of the updates to add multiple exit supports to vectorizable_live_operation easier I've extracted the refactoring part to its own patch. This patch is a straight extract of the function with no functional changes. Bootstrapped Regtested on aarch64-none-linux-gnu and n

RE: [PATCH 8/21]middle-end: update vectorizable_live_reduction with support for multiple exits and different exits

2023-11-27 Thread Tamar Christina
> > > This is a respun patch with a fix for VLA. > > > > This adds support to vectorizable_live_reduction to handle multiple > > exits by doing a search for which exit the live value should be > > materialized in. > > > > Additionally which value in the index we're after depends on whether > > th

RE: [PATCH 13/21]middle-end: Update loop form analysis to support early break

2023-11-27 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:41 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 13/21]middle-end: Update loop form analysis to support > early break > > Hi All, > > This sets L

RE: [PATCH 12/21]middle-end: Add remaining changes to peeling and vectorizer to support early breaks

2023-11-27 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:41 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 12/21]middle-end: Add remaining changes to peeling and > vectorizer to support early breaks > > H

RE: [PATCH 10/21]middle-end: implement relevancy analysis support for control flow

2023-11-27 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:40 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 10/21]middle-end: implement relevancy analysis support for > control flow > > Hi All, > > This u

RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

2023-11-27 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:40 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 9/21]middle-end: implement vectorizable_early_exit for > codegen of exit code > > Hi All, > > Th

[COMMITTED] Fix time-profiler-3.c after r14-5628-g53ba8d669550d3

2023-11-27 Thread Andrew Pinski
This testcase started to fail after r14-5628-g53ba8d669550d3 because IPA-VRP can now figure out that the functions return a constant value, so there was nothing left for profiling to profile. This disables IPA-VRP for this testcase so that it can profile again. Bootstrapped/tested on

[PATCH] aarch64: Improve cost of `a ? {-,}1 : b`

2023-11-27 Thread Andrew Pinski
While looking into PR 112454, I found the cost for `(if_then_else (cmp) (const_int 1) (reg))` was being recorded as 8 (or `COSTS_N_INSNS (2)`) but it should have been 4 (or `COSTS_N_INSNS (1)`). This improves the cost by not adding the cost of `(const_int 1)` to the total cost. It does not does no

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-27 Thread Michael Meissner
On Fri, Nov 10, 2023 at 06:03:40PM -0600, Peter Bergner wrote: > On 8/25/23 6:20 AM, Kewen.Lin wrote: > > btw, I was also expecting that we don't implicitly set > > OPTION_MASK_PCREL any more for Power10, that is to remove > > OPTION_MASK_PCREL from OTHER_POWER10_MASKS. > > So my patch removes the

[PATCH] fold-mem-offsets: Fix powerpc64le-linux profiledbootstrap [PR111601]

2023-11-27 Thread Jakub Jelinek
On Mon, Nov 27, 2023 at 09:52:14PM +0100, Jakub Jelinek wrote: > On Mon, Oct 16, 2023 at 01:11:01PM -0600, Jeff Law wrote: > > > gcc/ChangeLog: > > > > > > * Makefile.in: Add fold-mem-offsets.o. > > > * passes.def: Schedule a new pass. > > > * tree-pass.h (make_pass_fold_mem_offsets): Declar

Re: [PATCH] fold-mem-offsets: Fix powerpc64le-linux profiledbootstrap [PR111601]

2023-11-27 Thread Andrew Pinski
On Mon, Nov 27, 2023 at 3:51 PM Jakub Jelinek wrote: > > On Mon, Nov 27, 2023 at 09:52:14PM +0100, Jakub Jelinek wrote: > > On Mon, Oct 16, 2023 at 01:11:01PM -0600, Jeff Law wrote: > > > > gcc/ChangeLog: > > > > > > > > * Makefile.in: Add fold-mem-offsets.o. > > > > * passes.def: Schedule a n

Re: [committed v2] libstdc++: Define std::ranges::to for C++23 (P1206R7) [PR111055]

2023-11-27 Thread Hans-Peter Nilsson
> From: Jonathan Wakely > Date: Thu, 23 Nov 2023 17:51:38 + > libstdc++-v3/ChangeLog: > > PR libstdc++/111055 > * include/bits/ranges_base.h (from_range_t): Define new tag > type. > (from_range): Define new tag object. > * include/bits/version.def (ranges_to_con

[PATCH] libcpp: Fix unsigned promotion for unevaluated divide by zero [PR112701]

2023-11-27 Thread Lewis Hyatt
Hello- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112701 Here is a one-line fix to an edge case in libcpp's expression evaluator noted in the PR. Bootstrap + regtest all languages on x86-64 Linux. Is it OK please? Thanks! -Lewis -- >8 -- When libcpp encounters a divide by zero while processi

Re: c: tree: target: C2x (...) function prototypes and va_start relaxation

2023-11-27 Thread Joseph Myers
On Sat, 25 Nov 2023, Gerald Pfeifer wrote: > On Fri, 21 Oct 2022, Joseph Myers wrote: > > C2x allows function prototypes to be given as (...), a prototype > > meaning a variable-argument function with no named arguments. > > I noticed this did not make it into gcc-13/changes.html ? Was that > in

Re: [PATCH 3/4] c23: aliasing of compatible tagged types

2023-11-27 Thread Joseph Myers
On Sun, 26 Nov 2023, Martin Uecker wrote: > My understand is that it is used for aliasing analysis and also > checking of conversions. TYPE_CANONICAL must be consistent with > the idea the middle-end has about type conversions. But as long > as we do not give the same TYPE_CANONICAL to types the

Re: [PATCH] libcpp: Fix unsigned promotion for unevaluated divide by zero [PR112701]

2023-11-27 Thread Joseph Myers
On Mon, 27 Nov 2023, Lewis Hyatt wrote: > Hello- > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112701 > > Here is a one-line fix to an edge case in libcpp's expression evaluator > noted in the PR. Bootstrap + regtest all languages on x86-64 Linux. Is it OK > please? Thanks! OK. -- Joseph S

Re: [PATCH 4/5] diagnostics: add diagnostic_context::get_location_text

2023-11-27 Thread David Malcolm
On Tue, 2023-11-21 at 17:20 -0500, David Malcolm wrote: > No functional change intended. > > gcc/ChangeLog: > * diagnostic.cc (diagnostic_get_location_text): Convert to... > (diagnostic_context::get_location_text): ...this, and convert > return type from char * to label_tex

Re: [PATCH 5/5] diagnostics: don't print annotation lines when there's no column info

2023-11-27 Thread David Malcolm
On Tue, 2023-11-21 at 17:20 -0500, David Malcolm wrote: > gcc/ChangeLog: > * diagnostic-show-locus.cc > (layout::maybe_add_location_range): > Don't print annotation lines for ranges when there's no > column > info. > (selftest::test_one_liner_no_column): New. >  

Re: [PATCH] RISC-V: Fix VSETVL PASS regression

2023-11-27 Thread juzhe.zhong
committed as it passed zvl128/256/512/1024 no regression. Replied Message: From: Juzhe-Zhong; Date: 11/27/2023 21:24; To: gcc-patches@gcc.gnu.org; Cc: kito.ch...@gmail.com, kito.ch...@sifive.com, jeffreya...@gmail.com, rdapp@gmail.com, Juzhe-Zhong; Subject: [PATCH] RISC-V: Fix VSETVL PASS regression

[PATCH 1/4] [RISC-V] prefer Zicond primitive semantics to SFB

2023-11-27 Thread Fei Gao
Move Zicond md files ahead of SFB to recognize Zicond first. Take the following case for example. CFLAGS: -mtune=sifive-7-series -march=rv64gc_zicond -mabi=lp64d long primitiveSemantics_00(long a, long b) { return a == 0 ? 0 : b; } before patch: primitiveSemantics_00: bne a0,zero,1f

[PATCH 3/4] [ifcvt] optimize x=c ? (y op const_int) : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
op=[PLUS, MINUS, IOR, XOR, ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT] Co-authored-by: Xiao Zeng gcc/ChangeLog: * ifcvt.cc (noce_cond_zero_shift_op_supported): check if OP is shift like operation (noce_cond_zero_binary_op_supported): restructure & call noce_cond_zero_shift_op

[PATCH 4/4] [V2] [ifcvt] prefer SFB to Zicond for x=c ? (y op CONST) : y.

2023-11-27 Thread Fei Gao
In x=c ? (y op CONST) : y cases, Zicond based czero ifcvt generates more true dependency in code sequence than SFB based movcc. So exit noce_try_cond_zero_arith in such cases to have a better code sequence generated by noce_try_cmove_arith. Take the following case for example. CFLAGS: -mtune=sifi

[PATCH 2/4] [ifcvt] optimize x=c ? (y op z) : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
op=[PLUS, MINUS, IOR, XOR, ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT] SIGN_EXTEND, ZERO_EXTEND and SUBREG has been considered to support SImode in 64-bit machine. Conditional op, if zero rd = (rc == 0) ? (rs1 op rs2) : rs1 --> czero.nez rd, rs2, rc op rd, rs1, rd Conditional op, if non-zero r
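For readers skimming the index, the transformation described in this preview can be modelled in plain C. This is only a sketch: `czero_nez` and `czero_eqz` model the Zicond instructions of the same name, while `cond_add_if_zero` and `cond_add_if_nonzero` are hypothetical helper names chosen for illustration and do not come from the patch. Addition stands in for `op`; the same shape works for any operation where a zero second operand leaves rs1 unchanged.

```c
#include <stdint.h>

/* Model of czero.nez: yields 0 when cond is non-zero, otherwise val. */
int64_t czero_nez(int64_t val, int64_t cond) { return cond != 0 ? 0 : val; }

/* Model of czero.eqz: yields 0 when cond is zero, otherwise val. */
int64_t czero_eqz(int64_t val, int64_t cond) { return cond == 0 ? 0 : val; }

/* rd = (rc == 0) ? (rs1 + rs2) : rs1
   --> czero.nez rd, rs2, rc ; add rd, rs1, rd  */
int64_t cond_add_if_zero(int64_t rs1, int64_t rs2, int64_t rc)
{
  int64_t rd = czero_nez(rs2, rc); /* rs2 survives only when rc == 0 */
  return rs1 + rd;                 /* adding 0 leaves rs1 unchanged */
}

/* rd = (rc != 0) ? (rs1 + rs2) : rs1
   --> czero.eqz rd, rs2, rc ; add rd, rs1, rd  */
int64_t cond_add_if_nonzero(int64_t rs1, int64_t rs2, int64_t rc)
{
  int64_t rd = czero_eqz(rs2, rc); /* rs2 survives only when rc != 0 */
  return rs1 + rd;
}
```

The branch is replaced by zeroing the second operand, which is why the listed operations are ones for which zero acts as a right identity.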

Re: Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
On 2023-11-20 14:46  Jeff Law wrote: > > > >On 10/30/23 21:35, Fei Gao wrote: > >>> So just a few notes to further illustrate why I'm currently looking to >>> take the VRULL+Ventana implementation.  The code above would be much >>> better handled by just calling noce_emit_cmove.  noce_emit_cmove w

[PATCH v2] gimple-match.pd Add more optimization for gimple_cond

2023-11-27 Thread Feng Wang
Link to PATCH v1: https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg326661.html This patch adds another condition for gimple-cond optimization. Refer to the following test case. int foo1 (int data, int res) { res = data & 0xf; res |= res << 4; if (res < 0x22) return 0x22; retu

Re: Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
On 2023-11-20 14:59  Jeff Law wrote: > > > >On 10/30/23 01:25, Fei Gao wrote: >> Conditional add, if zero >> rd = (rc == 0) ? (rs1 + rs2) : rs1 >> --> >> czero.nez rd, rs2, rc >> add rd, rs1, rd >> >> Conditional add, if non-zero >> rd = (rc != 0) ? (rs1 + rs2) : rs1 >> --> >> czero.eqz rd, rs2, r

Re: [PATCH V2] introduce light expander sra

2023-11-27 Thread Jiufu Guo
Hi, Thanks so much for your helpful review! Richard Biener writes: > On Fri, Oct 27, 2023 at 3:51 AM Jiufu Guo wrote: >> >> Hi, >> >> Compared with the previous version: >> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632399.html >> This version supports TI/VEC mode of the access. >> >>

Re: Re: [PATCH 4/4] [ifcvt] if convert x=c ? y&z : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
On 2023-11-20 15:10  Jeff Law wrote: > > > >On 10/30/23 01:25, Fei Gao wrote: > >> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc >> index 6e341fc4d4b..cfa9bc4b850 100644 >> --- a/gcc/ifcvt.cc >> +++ b/gcc/ifcvt.cc >> @@ -2911,7 +2911,7 @@ noce_try_sign_mask (struct noce_if_info *if_info) >>   static bo

Re: [PATCH v2] gimple-match.pd Add more optimization for gimple_cond

2023-11-27 Thread Andrew Pinski
On Mon, Nov 27, 2023 at 6:56 PM Feng Wang wrote: > > The link of PATCH v1: > https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg326661.html > This patch add another condition for gimple-cond optimization. Refer to > the following test case. > int foo1 (int data, int res) > { > res = data &

Re: [PATCH 1/4] [RISC-V] prefer Zicond primitive semantics to SFB

2023-11-27 Thread Kito Cheng
Personally I don't like to play with the pattern order to tweak the code gen since it kinda introduces implicit relation/rule here, but I guess the only way to prevent that is to duplicate the pattern for SFB again, which is not an ideal solution... Anyway, it's obviously a better code gen, so LGT

[PATCH 2/5] LoongArch: Use standard pattern name for xvfrsqrt/vfrsqrt instructions.

2023-11-27 Thread Jiahao Xu
Rename lasx_xvfrsqrt*/lsx_vfrsqrt* to rsqrt2 to align with standard pattern name. gcc/ChangeLog: * config/loongarch/lasx.md (lasx_xvfrsqrt_): Renamed to .. (*rsqrt2): .. this. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vfrsqrt_d): Redefine to standard p

[PATCH 1/5] LoongArch: Add support for approximate instructions.

2023-11-27 Thread Jiahao Xu
LA664 introduces new instructions for reciprocal approximation and reciprocal square root approximation. It includes the scalar instructions frecipe and frsrte, as well as their corresponding vector instructions [x]vfrecipe and [x]vfrsqrte. This patch adds define_insn/builtins/intrinsics for the

[PATCH 4/5] LoongArch: New options -mrecip and -mrecip= with ffast-math.

2023-11-27 Thread Jiahao Xu
When -mrecip option is turned on, use approximate reciprocal instructions and approximate reciprocal square root instructions with additional Newton-Raphson steps to implement single precision floating-point division, square root and reciprocal square root operations for better throughput. gcc/
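The refinement mentioned in this preview is the textbook Newton-Raphson iteration, sketched below in C. This is an illustration under assumptions, not the code the patch emits: the function names are hypothetical, the starting estimate is passed in as a parameter where the hardware would supply it from an approximate instruction, and the number of steps the compiler actually uses is not stated in the preview.

```c
/* One Newton-Raphson step towards 1/a: x' = x * (2 - a * x). */
float recip_step(float a, float x)
{
  return x * (2.0f - a * x);
}

/* One Newton-Raphson step towards 1/sqrt(a): y' = y * (1.5 - 0.5 * a * y * y). */
float rsqrt_step(float a, float y)
{
  return y * (1.5f - 0.5f * a * y * y);
}

/* Refine a rough reciprocal estimate with a given number of steps. */
float refine_recip(float a, float estimate, int steps)
{
  for (int i = 0; i < steps; i++)
    estimate = recip_step(a, estimate);
  return estimate;
}

/* Refine a rough reciprocal-square-root estimate the same way. */
float refine_rsqrt(float a, float estimate, int steps)
{
  for (int i = 0; i < steps; i++)
    estimate = rsqrt_step(a, estimate);
  return estimate;
}

/* Division n / d expressed as n * (1 / d) using the refined reciprocal. */
float approx_div(float n, float d, float recip_estimate, int steps)
{
  return n * refine_recip(d, recip_estimate, steps);
}
```

Each step roughly doubles the number of correct bits, so a reasonably accurate hardware estimate needs only a small number of multiply-add steps, which is where the throughput gain over a full-precision divide or square root comes from.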

[PATCH 5/5] LoongArch: Vectorized loop unrolling is not performed on divf/sqrtf/rsqrtf with turns on -mrecip.

2023-11-27 Thread Jiahao Xu
Using -mrecip generates a sequence of instructions to replace divf, sqrtf and rsqrtf. The number of generated instructions is close to or exceeds the maximum issue of the LoongArch, so vectorized loop unrolling is not performed on them. gcc/ChangeLog: * config/loongarch/loongarch.cc (l

[PATCH 3/5] LoongArch: Redefine pattern for xvfrecip/vfrecip instructions.

2023-11-27 Thread Jiahao Xu
Redefine pattern for [x]vfrecip instructions use rtx code instead of unspec, and enable [x]vfrecip instructions to be generated during auto-vectorization. gcc/ChangeLog: * config/loongarch/lasx.md (lasx_xvfrecip_): Renamed to .. (recip3): .. this. * config/loongarch/loong

Re: [PATCH v6 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-27 Thread waffl3x
On Sunday, November 26th, 2023 at 7:40 PM, Jason Merrill wrote: > > > On 11/26/23 20:44, waffl3x wrote: > > > > > > > The other problem I'm having is > > > > > > > > > > > > auto f0 = [n = 5, &m](this auto const&){ n = 10; }; > > > > > > This errors just fine, the lambda is unconditionally

[PATCH 0/5] LoongArch: Add -mrecip option support

2023-11-27 Thread Jiahao Xu
LoongArch V1.1 adds approximate instructions, which are used along with additional Newton-Raphson steps to implement single-precision floating-point division, square root and reciprocal square root operations for better throughput. Control the generation of approximate

Re: [PATCH 0/4] Add vector pair support to PowerPC attribute((vector_size(32)))

2023-11-27 Thread Michael Meissner
On Fri, Nov 24, 2023 at 05:41:02PM +0800, Kewen.Lin wrote: > on 2023/11/20 16:56, Michael Meissner wrote: > > On Mon, Nov 20, 2023 at 08:24:35AM +0100, Richard Biener wrote: > >> I wouldn't expose the "fake" larger modes to the vectorizer but rather > >> adjust m_suggested_unroll_factor (which you

Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp

2023-11-27 Thread Michael Meissner
On Fri, Nov 24, 2023 at 05:31:20PM +0800, Kewen.Lin wrote: > Hi Ajit, > > Don't forget to CC David (CC-ed) :), some comments are inlined below. > > on 2023/10/8 03:04, Ajit Agarwal wrote: > > Hello All: > > > > This patch add new pass to replace contiguous addresses vector load lxv > > with mma

[PATCH] MATCH: Fix invalid signed boolean type usage

2023-11-27 Thread Andrew Pinski
This fixes the incorrect assumption made in r14-3721-ge6bcf839894783 that doing the negation after the conversion would be valid; it is not valid for boolean types. OK? Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: PR tree-optimiza

Re: [PATCH 4/4] [V2] [ifcvt] prefer SFB to Zicond for x=c ? (y op CONST) : y.

2023-11-27 Thread Jeff Law
On 11/27/23 19:32, Fei Gao wrote: In x=c ? (y op CONST) : y cases, Zicond based czero ifcvt generates more true dependency in code sequence than SFB based movcc. So exit noce_try_cond_zero_arith in such cases to have a better code sequence generated by noce_try_cmove_arith. Take the following

Re: [PATCH 1/4] [RISC-V] prefer Zicond primitive semantics to SFB

2023-11-27 Thread Jeff Law
On 11/27/23 20:09, Kito Cheng wrote: Personally I don't like to play with the pattern order to tweak the code gen since it kinda introduces implicit relation/rule here, but I guess the only way to prevent that is to duplicate the pattern for SFB again, which is not an ideal solution... I won'

Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns

2023-11-27 Thread Jeff Law
On 11/27/23 19:46, Fei Gao wrote: On 2023-11-20 14:46  Jeff Law wrote: On 10/30/23 21:35, Fei Gao wrote: So just a few notes to further illustrate why I'm currently looking to take the VRULL+Ventana implementation.  The code above would be much better handled by just calling noce_emit_c

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Jeff Law
On 11/27/23 10:36, Joern Rennecke wrote: On 11/20/23 11:26, Richard Sandiford wrote: + + mask = GET_MODE_MASK (GET_MODE (SUBREG_REG (x))) << bit; + if (!mask) + mask = -0x1ULL; Not sure I follow this. What does the -0x1ULL constant indicate? Also, isn't it the ma

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Jeff Law
On 11/27/23 11:19, Joern Rennecke wrote: You are applying PATTERN to an INSN_LIST. I know :-) That was the late change to clean up some of the horrific control flow in the code. jeff

Re: Re: [PATCH v2] gimple-match.pd Add more optimization for gimple_cond

2023-11-27 Thread Feng Wang
On 2023-11-28 11:06  Andrew Pinski wrote: >On Mon, Nov 27, 2023 at 6:56 PM Feng Wang wrote: >> >> The link of PATCH v1: >> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg326661.html >> This patch add another condition for gimple-cond optimization. Refer to >> the following test case. >>

[V2] New pass for sign/zero extension elimination -- not ready for "final" review

2023-11-27 Thread Jeff Law
I've still got some comments from Richard S to work through, but some folks are trying to play with this and thus I want to get the fixes to date in their hands. Changes since V1: - Fix handling of CALL_INSN_FUNCTION_USAGE so we don't apply PATTERN to an EXPR_LIST. - Various comments and

Re: Re: [PATCH v2] gimple-match.pd Add more optimization for gimple_cond

2023-11-27 Thread Andrew Pinski
On Mon, Nov 27, 2023 at 10:04 PM Feng Wang wrote: > > On 2023-11-28 11:06 Andrew Pinski wrote: > >On Mon, Nov 27, 2023 at 6:56 PM Feng Wang > >wrote: > >> > >> The link of PATCH v1: > >> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg326661.html > >> This patch add another condition

Re: [PATCH 3/4] c23: aliasing of compatible tagged types

2023-11-27 Thread Martin Uecker
Am Dienstag, dem 28.11.2023 um 01:00 + schrieb Joseph Myers: > On Sun, 26 Nov 2023, Martin Uecker wrote: > > > My understand is that it is used for aliasing analysis and also > > checking of conversions. TYPE_CANONICAL must be consistent with > > the idea the middle-end has about type convers

Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp

2023-11-27 Thread Michael Meissner
I tried using this patch to compare with the vector size attribute patch I posted. I could not build it as a cross compiler on my x86_64 because the assembler gives the following error: Error: operand out of domain (11 is not a multiple of 2) for std_stacktrace-elf.o. If you look at the assemble

[PATCH v1 1/2] LoongArch: Accelerate optimization of scalar signed/unsigned popcount.

2023-11-27 Thread Li Wei
In LoongArch, the vector popcount has corresponding instructions, while the scalar does not. Currently, the scalar popcount is calculated through a loop, and the value of a non-power of two needs to be iterated several times, so the vector popcount instruction is considered for optimization. gcc/C
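As a point of reference for the loop-based scalar popcount this preview mentions, one common formulation is sketched below. It is an assumption that the fallback looks like this; the preview only says that values which are not powers of two need several iterations, which matches a loop that runs once per set bit. The name `popcount_loop` is hypothetical.

```c
#include <stdint.h>

/* Count set bits by clearing the lowest set bit on each iteration.
   The loop runs once per set bit, so a power of two takes one
   iteration and denser values take more. */
unsigned popcount_loop(uint64_t x)
{
  unsigned count = 0;
  while (x != 0)
    {
      x &= x - 1; /* clear the lowest set bit */
      count++;
    }
  return count;
}
```

A single popcount instruction replaces this data-dependent loop with a fixed-latency operation, which is the motivation given for routing the scalar case through the vector instruction.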

[PATCH v1 2/2] LoongArch: Optimize vector constant extract-{even/odd} permutation.

2023-11-27 Thread Li Wei
For vector constant extract-{even/odd} permutation, replace the default [x]vshuf instruction combination with the [x]vilv{l/h} instruction, which reduces the instruction count and improves performance. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_is_odd_extraction): Supplement

[PATCH] Expand: Pass down equality only flag to cmpmem expand

2023-11-27 Thread HAO CHEN GUI
Hi, This patch passes down the equality only flags from emit_block_cmp_hints to cmpmem optab so that the target specific expand can generate optimized insns for equality only compare. Targets (e.g. rs6000) can generate more efficient insn sequence if the block compare is equality only. Bootstr

Re: [PATCH][RFC] middle-end/110237 - wrong MEM_ATTRs for partial loads/stores

2023-11-27 Thread Richard Biener
On Mon, 27 Nov 2023, Jeff Law wrote: > > > On 11/27/23 05:39, Robin Dapp wrote: > >> The easiest way to avoid running into the alias analysis problem is > >> to scrap the MEM_EXPR when we expand the internal functions for > >> partial loads/stores. That avoids the disambiguation we run into > >

[PATCH] Take register pressure into account for vec_construct when the components are not loaded from memory.

2023-11-27 Thread liuhongt
For vec_construct, the components must be live at the same time if they're not loaded from memory; when the number of those components exceeds the available registers, spilling happens. Try to account for that with a rough estimation. ??? Ideally, we should have an overall estimation of register pressure if we

[PATCH v1] LoongArch: Remove duplicate definition of CLZ_DEFINED_VALUE_AT_ZERO.

2023-11-27 Thread Li Wei
In the r14-5547 commit, C[LT]Z_DEFINED_VALUE_AT_ZERO were defined at the same time, but in fact, CLZ_DEFINED_VALUE_AT_ZERO has already been defined, so remove the duplicate definition. gcc/ChangeLog: * config/loongarch/loongarch.h (CTZ_DEFINED_VALUE_AT_ZERO): Add description.

PR111754

2023-11-27 Thread juzhe.zh...@rivai.ai
Hi, there is a regression in RISC-V caused by this patch: FAIL: gcc.dg/vect/pr111754.c -flto -ffat-lto-objects scan-tree-dump optimized "return { 0.0, 9.0e\\+0, 0.0, 0.0 }" FAIL: gcc.dg/vect/pr111754.c scan-tree-dump optimized "return { 0.0, 9.0e\\+0, 0.0, 0.0 }" I have checked the dump is : F
