RE: [PATCH]middle-end: fix masking for partial vectors and early break [PR119351]

2025-04-16 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, April 16, 2025 9:57 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH]middle-end: fix masking for partial vectors and early > break > [PR119351] > > On Wed, 16 A

[PATCH] testsuite: force AMDGCN test for vect-early-break_18.c to consistent architecture [PR119286]

2025-04-16 Thread Tamar Christina
Hi All, The given test is intended to test vectorization of a strided access done by having a step of > 1. GCN target doesn't support load lanes, so the testcase is expected to fail; other targets create a permuted load here which we then reject. However some GCN arch don't seem to support

[PATCH]middle-end: fix masking for partial vectors and early break [PR119351]

2025-04-16 Thread Tamar Christina
Hi All, The following testcase shows an incorrect masked codegen: #define N 512 #define START 1 #define END 505 int x[N] __attribute__((aligned(32))); int __attribute__((noipa)) foo (void) { int z = 0; for (unsigned int i = START; i < END; ++i) { z++; if (x[i] > 0)

RE: [PATCH]middle-end: Fix incorrect codegen with PFA and VLS [PR119351]

2025-04-15 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, April 15, 2025 12:50 PM > To: Tamar Christina > Cc: Richard Sandiford ; gcc-patches@gcc.gnu.org; > nd > Subject: RE: [PATCH]middle-end: Fix incorrect codegen with PFA and VLS > [PR119351] > &

RE: [PATCH]middle-end: Fix incorrect codegen with PFA and VLS [PR119351]

2025-04-15 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, April 15, 2025 12:49 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH]middle-end: Fix incorrect codegen with PFA and VLS > [PR119351] > > On Tue, 15 Apr 2025, Tamar

RE: [PATCH]middle-end: Fix incorrect codegen with PFA and VLS [PR119351]

2025-04-15 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Tuesday, April 15, 2025 10:52 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de > Subject: Re: [PATCH]middle-end: Fix incorrect codegen with PFA and VLS > [PR119351] > > Tamar

[PATCH]middle-end: Fix incorrect codegen with PFA and VLS [PR119351]

2025-04-15 Thread Tamar Christina
Hi All, The following example: #define N 512 #define START 2 #define END 505 int x[N] __attribute__((aligned(32))); int __attribute__((noipa)) foo (void) { for (signed int i = START; i < END; ++i) { if (x[i] == 0) return i; } return -1; } generates incorrect code with

RE: [PATCH][contrib]: support json output from check_GNU_style_lib.py

2025-04-09 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Tuesday, July 23, 2024 3:30 PM > To: Jonathan Wakely ; Filip Kastl > Cc: gcc-patches@gcc.gnu.org; nd > Subject: RE: [PATCH][contrib]: support json output from check_GNU_style_lib.py > > Hi Both, >

RE: [PATCH v2] aarch64, Darwin: Initial implementation of Apple cores [PR113257].

2025-04-07 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Monday, March 31, 2025 1:43 PM > To: i...@sandoe.co.uk > Cc: Tamar Christina ; GCC Patches patc...@gcc.gnu.org>; Alice Carlotti ; Richard > Sandiford > ; s...@gentoo.org > Subject: Re: [PATCH v2] aarch64, Dar

RE: [PATCH] testsuite: update early-break tests for non-load-lanes targets [PR119286]

2025-03-18 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, March 18, 2025 10:48 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH] testsuite: update early-break tests for non-load-lanes > targets > [PR119286] > > On Mon

[PATCH] testsuite: update early-break tests for non-load-lanes targets [PR119286]

2025-03-17 Thread Tamar Christina
Hi All, Broadly speaking, these tests were failing because the BB limitation for SLP'ing loads in an || in an early break makes the loads end up in different BBs and so today we can't SLP them. This results in load_lanes being required to vectorize them because the alternative is loads with permu

RE: [1/3 PATCH]AArch64: add support for partial modes to last extractions [PR118464]

2025-03-11 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, March 5, 2025 11:27 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [1/3 PATCH]AArch64: add support for partial modes to last &g

RE: [1/3 PATCH]AArch64: add support for partial modes to last extractions [PR118464]

2025-03-06 Thread Tamar Christina
> > diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64- > sve.md > > index > a93bc463a909ea28460cc7877275fce16e05f7e6..205eeec2e35544de848e0dbb > 48e3f5ae59391a88 100644 > > --- a/gcc/config/aarch64/aarch64-sve.md > > +++ b/gcc/config/aarch64/aarch64-sve.md > > @@ -3107,12

RE: [1/3 PATCH]AArch64: add support for partial modes to last extractions [PR118464]

2025-03-06 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, March 6, 2025 10:40 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [1/3 PATCH]AArch64: add support for partial modes to last &g

RE: [1/3 PATCH]AArch64: add support for partial modes to last extractions [PR118464]

2025-03-05 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, March 3, 2025 11:53 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [1/3 PATCH]AArch64: add support for partial modes to last &g

RE: [3/3 PATCH v4]middle-end: delay checking for alignment to load [PR118464]

2025-03-03 Thread Tamar Christina
> >/* For now assume all conditional loads/stores support unaligned > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > > index > 6bbb16beff2c627fca11a7403ba5ee3a5faa21c1..b661dd400e5826fc1c4f70 > 957b335d1741fa 100644 > > --- a/gcc/tree-vect-stmts.cc > > +++ b/gcc/tree-vect-

RE: [PATCH]AArch64: force operand to fresh register to avoid subreg issues [PR118892]

2025-03-03 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, March 3, 2025 10:12 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: force operand to fresh register to avoid subr

[PATCH]AArch64: force operand to fresh register to avoid subreg issues [PR118892]

2025-02-27 Thread Tamar Christina
Hi All, When the input is already a subreg and we try to make a paradoxical subreg out of it for copysign this can fail if it violates the subreg relationship. Use force_lowpart_subreg instead of lowpart_subreg to then force the results to a register instead of ICEing. Bootstrapped Regtested on

RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-26 Thread Tamar Christina
> > > > > > No, I don't think so. The code that eventually performs a > > > contiguous sub-group access directly should never extend > > > the load beyond GROUP_SIZE - or should be gated on the DR > > > not executed speculatively. That is, we should "fix" this > > > elsewhere. > > > > > > > It do

RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-26 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, February 26, 2025 1:52 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load > [PR118464] > > On Wed, 26 Feb

RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-26 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, February 26, 2025 12:30 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [3/3 PATCH v3]middle-end: delay checking for alignment to load > [PR118464] > > On Tue, 25 Feb

[3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-25 Thread Tamar Christina
Hi All, This fixes two PRs on Early break vectorization by delaying the safety checks to vectorizable_load when the VF, VMAT and vectype are all known. This patch does add two new restrictions: 1. On LOAD_LANES targets, where the buffer size is known, we reject uneven group sizes, as they are

[2/3 PATCH][committed] testsuite: Add pragma novector to more tests [PR118464]

2025-02-25 Thread Tamar Christina
Hi All, These loops will now vectorize the entry finding loops. As such we get more failures because they were not expecting to be vectorized. Fixed by adding #pragma GCC novector. Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf, x86_64-pc-linux-gnu -m32, -m64 and no

[1/3 PATCH]AArch64: add support for partial modes to last extractions [PR118464]

2025-02-25 Thread Tamar Christina
Hi All, The last extraction instructions work for both full and partial SVE vectors, however we currently only define them for FULL vectors. Early break code for VLA now however requires partial vector support, which relies on extract_last support. I have not added any new testcases as they ov

RE: [PATCH v2]middle-end: delay checking for alignment to load [PR118464]

2025-02-13 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, February 13, 2025 4:55 PM > To: Tamar Christina > Cc: Richard Biener ; gcc-patches@gcc.gnu.org; nd > > Subject: Re: [PATCH v2]middle-end: delay checking for alignment to load > [PR118464] >

RE: [PATCH v2]middle-end: delay checking for alignment to load [PR118464]

2025-02-12 Thread Tamar Christina
> -Original Message- > From: Tamar Christina > Sent: Wednesday, February 12, 2025 3:20 PM > To: Richard Biener > Cc: gcc-patches@gcc.gnu.org; nd > Subject: RE: [PATCH v2]middle-end: delay checking for alignment to load > [PR118464] > > > -Original

RE: [PATCH v2]middle-end: delay checking for alignment to load [PR118464]

2025-02-12 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, February 12, 2025 2:58 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH v2]middle-end: delay checking for alignment to load > [PR118464] > > On Tue, 11 Feb

[PATCH]AArch64: Fix GCC 13 backport of big.Little CPU detection [PR118800]

2025-02-10 Thread Tamar Christina
Hi All, It seems I ran regressions but forgot to check them last time `(*>?<*)? On the GCC-13 branch the backport caused a failure due to the branch not having generic-armv8-a and also it still treating the generic cpu special. This made it return NULL when trying to find the default CPU. In GC

[PATCH]middle-end: Fix two testisms on x86 after PFA [PR118754]

2025-02-10 Thread Tamar Christina
Hi All, These two tests now vectorize the result finding loop with PFA and so the number of loops checked fails. This fixes them by adding #pragma GCC novector to the testcases. Regtested on x86_64-pc-linux-gnu on an AVX512 machine with -m32, -m64 and the tests pass again. Ok for master? Thanks, Ta

RE: [PATCH]middle-end: delay checking for alignment to load [PR118464]

2025-02-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, February 5, 2025 1:15 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: RE: [PATCH]middle-end: delay checking for alignment to load > [PR118464] > > On Wed, 5 Feb

[PATCH]middle-end: Remove unused internal function after IVopts cleanup [PR118756]

2025-02-06 Thread Tamar Christina
Hi All, It seems that after my IVopts patches the function contain_complex_addr_expr became unused and clang is rightfully complaining about it. This removes the unused internal function. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLo

RE: [PATCH]middle-end: delay checking for alignment to load [PR118464]

2025-02-05 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, February 5, 2025 10:16 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: RE: [PATCH]middle-end: delay checking for alignment to load > [PR118464] > > On Wed, 5 Feb

RE: [PATCH]middle-end: delay checking for alignment to load [PR118464]

2025-02-05 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, February 4, 2025 12:49 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: RE: [PATCH]middle-end: delay checking for alignment to load > [PR118464] > > On Tue, 4 Feb

RE: [PATCH 1/4] vect: Set counts of early break exit blocks correctly [PR117790]

2025-02-05 Thread Tamar Christina
> -Original Message- > From: Jan Hubicka > Sent: Tuesday, February 4, 2025 4:25 PM > To: Alex Coplan > Cc: gcc-patches@gcc.gnu.org; Richard Biener ; Tamar > Christina > Subject: Re: [PATCH 1/4] vect: Set counts of early break exit blocks correctly > [PR

RE: [PATCH]middle-end: delay checking for alignment to load [PR118464]

2025-02-03 Thread Tamar Christina
Looks like a last minute change I made accidentally blocked SVE. Fixed and re-sending: Hi All, This fixes two PRs on Early break vectorization by delaying the safety checks to vectorizable_load when the VF, VMAT and vectype are all known. This patch does add two new restrictions: 1. On LOAD_LA

RE: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple exits [PR117790]

2025-02-03 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Friday, January 24, 2025 9:18 AM > To: Alex Coplan ; gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka > Subject: RE: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple > exits &

RE: [PATCH 3/4] vect: Ensure profile consistency when adding epilog guard [PR117790]

2025-02-03 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Friday, January 24, 2025 9:18 AM > To: Alex Coplan ; gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka > Subject: RE: [PATCH 3/4] vect: Ensure profile consistency when adding epilog > guard

[PATCH]middle-end: delay checking for alignment to load [PR118464]

2025-02-03 Thread Tamar Christina
Hi All, This fixes two PRs on Early break vectorization by delaying the safety checks to vectorizable_load when the VF, VMAT and vectype are all known. This patch does add two new restrictions: 1. On LOAD_LANES targets, where the buffer size is known, we reject uneven group sizes, as they are

RE: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of multi-exit loops [PR117790]

2025-02-03 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Friday, January 24, 2025 9:18 AM > To: Alex Coplan ; gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka > Subject: RE: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of > multi-exit > lo

RE: [PATCH 1/4] vect: Set counts of early break exit blocks correctly [PR117790]

2025-02-03 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Friday, January 24, 2025 9:17 AM > To: Alex Coplan ; 'gcc-patches@gcc.gnu.org' patc...@gcc.gnu.org> > Cc: 'Richard Biener' ; 'Jan Hubicka' > Subject: RE: [PATCH 1/4] ve

RE: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple exits [PR117790]

2025-01-24 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, January 15, 2025 2:08 PM > To: Alex Coplan ; gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka > Subject: RE: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple > exits &

RE: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of multi-exit loops [PR117790]

2025-01-24 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, January 15, 2025 2:08 PM > To: Alex Coplan ; gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka > Subject: RE: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of > multi-e

RE: [PATCH 1/4] vect: Set counts of early break exit blocks correctly [PR117790]

2025-01-24 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, January 15, 2025 2:07 PM > To: Alex Coplan ; gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka > Subject: RE: [PATCH 1/4] vect: Set counts of early break exit blocks correctly >

RE: [PATCH 3/4] vect: Ensure profile consistency when adding epilog guard [PR117790]

2025-01-24 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, January 15, 2025 2:08 PM > To: Alex Coplan ; gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka > Subject: RE: [PATCH 3/4] vect: Ensure profile consistency when adding epilog > guard

RE: [PATCH]AArch64: Drop ILP32 from default elf multilibs after deprecation

2025-01-20 Thread Tamar Christina
> -Original Message- > From: Tamar Christina > Sent: Friday, January 17, 2025 5:07 PM > To: Kyrylo Tkachov ; Richard Sandiford > > Cc: GCC Patches ; nd ; Richard > Earnshaw ; ktkac...@gcc.gnu.org > Subject: RE: [PATCH]AArch64: Drop ILP32 from default elf multi

RE: [PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-20 Thread Tamar Christina
> -Original Message- > From: Iain Sandoe > Sent: Monday, January 20, 2025 6:15 PM > To: Andrew Carlotti > Cc: Kyrylo Tkachov ; GCC Patches patc...@gcc.gnu.org>; Tamar Christina ; Richard > Sandiford ; Sam James > Subject: Re: [PATCH] aarch64: Provide initial

[PATCH]middle-end: use ncopies both when registering and reading masks [PR118273]

2025-01-20 Thread Tamar Christina
Hi All, When registering masks for SIMD clone we end up using nmasks instead of nvectors where nmasks seems to compute the number of input masks required for the call given the current simdlen. This is however wrong as vect_record_loop_mask wants to know how many masks you want to create from the

RE: [gcc r15-6807] vect: Force alignment peeling to vectorize more early break loops [PR118211]

2025-01-20 Thread Tamar Christina
> -Original Message- > From: Thomas Schwinge > Sent: Monday, January 13, 2025 9:54 AM > To: Tamar Christina ; Alex Coplan > ; gcc-patches@gcc.gnu.org > Cc: Andrew Stubbs > Subject: Re: [gcc r15-6807] vect: Force alignment peeling to vectorize more > early

RE: [PATCH]AArch64: Drop ILP32 from default elf multilibs after deprecation

2025-01-17 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, January 17, 2025 3:10 PM > To: Richard Sandiford > Cc: Tamar Christina ; GCC Patches patc...@gcc.gnu.org>; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Drop ILP32 fr

RE: [PATCH]AArch64: Drop ILP32 from default elf multilibs after deprecation

2025-01-17 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, January 17, 2025 1:22 PM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard > Earnshaw ; ktkac...@gcc.gnu.org; Richard > Sandiford > Subject: Re: [PATCH]AArch64: Drop ILP32 from default elf multi

RE: [PATCH]AArch64: Drop ILP32 from default elf multilibs after deprecation

2025-01-17 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, January 17, 2025 1:04 PM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard > Earnshaw ; ktkac...@gcc.gnu.org; Richard > Sandiford > Subject: Re: [PATCH]AArch64: Drop ILP32 from default elf multi

RE: [PATCH v3 1/2] aarch64: Use standard names for saturating arithmetic

2025-01-17 Thread Tamar Christina
16-bit tests. * gcc.target/aarch64/saturating_arithmetic_3.c: 32-bit tests. * gcc.target/aarch64/saturating_arithmetic_4.c: 64-bit tests. Co-authored-by: Tamar Christina -- inline copy -- diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/

[PATCH]AArch64: Drop ILP32 from default elf multilibs after deprecation

2025-01-17 Thread Tamar Christina
Hi All, Following the deprecation of ILP32, *-elf builds now fail due to -Werror on the deprecation warning. This is because on embedded builds ILP32 is part of the default multilib. This patch removes it from the default target as the build would fail anyway. Cross compiled on aarch64-none-elf

RE: [PATCH] AArch64: Deprecate -mabi=ilp32

2025-01-17 Thread Tamar Christina
> -Original Message- > From: Wilco Dijkstra > Sent: Tuesday, January 14, 2025 5:30 PM > To: Richard Sandiford > Cc: Richard Earnshaw ; ktkac...@nvidia.com; GCC > Patches ; sch...@linux-m68k.org > Subject: Re: [PATCH] AArch64: Deprecate -mabi=ilp32 > > Hi Richard, > > >> +  if (TARGET_IL

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-16 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 16, 2025 7:11 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi

[PATCH]middle-end: Add early break conditions to vect-switch-search-line-fast.c [PR118451]

2025-01-16 Thread Tamar Christina
Hi All, When this test was added initially it didn't add the early break effective target tests. This means that the test was "passing" (as in, it was failing to vectorize) because many targets don't support early break. But the test should not have been run for these targets. When the vectoriz

Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
Re-reading again I realize I confused the cache size in your question with the cache line size. Cache size can be whatever yes. Cache line size must match. But that doesn't change the fact that this patch is correct. Thanks, Tamar From: Tamar Christina

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, January 15, 2025 3:23 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message- > From: Xi Ruoyao > Sent: Wednesday, January 15, 2025 1:40 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architec

RE: [PATCH 3/4] vect: Ensure profile consistency when adding epilog guard [PR117790]

2025-01-15 Thread Tamar Christina
Ping > -Original Message- > From: Alex Coplan > Sent: Monday, January 6, 2025 11:35 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka ; Tamar > Christina > Subject: [PATCH 3/4] vect: Ensure profile consistency when adding epilog guard > [PR11779

RE: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of multi-exit loops [PR117790]

2025-01-15 Thread Tamar Christina
Ping > -Original Message- > From: Alex Coplan > Sent: Monday, January 6, 2025 11:35 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka ; Tamar > Christina > Subject: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of > multi-exit >

RE: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple exits [PR117790]

2025-01-15 Thread Tamar Christina
Ping > -Original Message- > From: Alex Coplan > Sent: Monday, January 6, 2025 11:36 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka ; Tamar > Christina > Subject: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple exits > [PR1

RE: [PATCH 1/4] vect: Set counts of early break exit blocks correctly [PR117790]

2025-01-15 Thread Tamar Christina
Ping > -Original Message- > From: Alex Coplan > Sent: Monday, January 6, 2025 11:34 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka ; Tamar > Christina > Subject: [PATCH 1/4] vect: Set counts of early break exit blocks correctly > [PR117790

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message- > From: Xi Ruoyao > Sent: Wednesday, January 15, 2025 1:29 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: Re: [PATCH]AArch64: have -mcpu=native detect ar

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, January 13, 2025 8:55 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi

[PATCH]middle-end: Fix incorrect type replacement in operands_equals [PR118472]

2025-01-15 Thread Tamar Christina
Hi All, In g:3c32575e5b6370270d38a80a7fa8eaa144e083d0 I made a mistake and incorrectly replaced the type of the arguments of an expression with the type of the expression. This is of course wrong. This reverts that change and I have also double checked the other replacements and they are fine.

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-13 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, January 13, 2025 6:35 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi

[PATCH]AArch64: don't override march to assembler with mcpu if march is specified [PR110901]

2025-01-11 Thread Tamar Christina
Hi All, When both -mcpu and -march are specified, the value of -march wins out. This is done correctly for the calls to cc1 and for the assembler directives we put out in assembly files. However in the call to as we don't do this and instead use the arch from the cpu. This leads to a situation

[PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-11 Thread Tamar Christina
Hi All, In g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for -mcpu=native on unknown CPUs to still enable architecture extensions. This has worked great but was only added for homogenous systems. However the same thing works for big.LITTLE as in such systems the cores must have

RE: [PATCH][libstdc++]: backport inline keyword on std::find

2025-01-10 Thread Tamar Christina
> -Original Message- > From: Jonathan Wakely > Sent: Friday, January 10, 2025 2:36 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; libstd...@gcc.gnu.org > Subject: Re: [PATCH][libstdc++]: backport inline keyword on std::find > > On Fri, 10

[PATCH][libstdc++]: backport inline keyword on std::find

2025-01-10 Thread Tamar Christina
Hi All, This is a backport version of the same patch as https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671618.html for the release branches. I'd like to backport this to GCC 14, 13 and 12 where the first regression showed up. I am however aware that GCC 12 is going to get its last rele

[PATCH]AArch64: correct Cortex-X4 MIDR

2025-01-09 Thread Tamar Christina
Hi All, The Parts Num field for the MIDR for Cortex-X4 is wrong. It's currently the parts number for a Cortex-A720 (which does have the right number). The correct number can be found in the Cortex-X4 Technical Reference Manual [1] on page 382 in Issue Number 5. [1] https://developer.arm.com/doc

RE: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2025-01-09 Thread Tamar Christina
or the element is placed in what I assume to be a crowded bucket. It does seem to be beneficial for some user defined datatypes, I assume due to some IPA shenanigans. But overall there were more and larger wins using probability of 0 rather than 1. Kind regards, Tamar From: Tamar Christina

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-09 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 9, 2025 3:09 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters &g

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-08 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, January 8, 2025 10:30 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters &g

RE: [PATCH 3/4] arm, testsuite: fix arm_v8_3a_fp16_complex_neon_ok

2025-01-08 Thread Tamar Christina
> -Original Message- > From: Richard Earnshaw (lists) > Sent: Wednesday, January 8, 2025 1:18 PM > To: Christophe Lyon ; gcc-patches@gcc.gnu.org; > Richard Sandiford ; Tamar Christina > ; Andre Simoes Dias Vieira > ; ktkac...@nvidia.com; > raman...@nvidia.com &

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-07 Thread Tamar Christina
> >> i.e. we use separate address arithmetic and avoid UMOVs. Counting > >> two loads and one store for each element of the scatter store seems > >> like overkill for that. > > > > Hmm agreed.. > > > > How about for stores we increase the load counts by count / 2? > > > > This would account for th

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2025-01-06 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, January 6, 2025 5:54 PM > To: Jennifer Schmitz > Cc: Richard Biener ; Richard Biener > ; Tamar Christina ; > gcc-patches@gcc.gnu.org; Kyrylo Tkachov > Subject: Re: [RFC

RE: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use cache and look back further [PR111422]

2025-01-06 Thread Tamar Christina
> -Original Message- > From: Tamar Christina > Sent: Tuesday, December 31, 2024 1:04 PM > To: Richard Biener ; Andrew Pinski > > Cc: gcc-patches@gcc.gnu.org > Subject: RE: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use > cache and loo

RE: [PATCH]AArch64: Implement four and eight chunk VLA concats [PR118272]

2025-01-03 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, January 3, 2025 10:59 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Implement four and eight chunk VLA concats &g

RE: [PATCH]AArch64: Implement four and eight chunk VLA concats [PR118272]

2025-01-03 Thread Tamar Christina
> > > > How about instead doing something like: > > > > worklist.reserve (nelts); > > for (int i = 0; i < nelts; ++i) > > worklist.quick_push (force_reg (elem_mode, XVECEXP (vals, 0, i))); > > > > while (nelts > 2) > > { > > for (int i = 0; i < nelts; i += 2) > > { > >

RE: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2025-01-02 Thread Tamar Christina
I’ll run the numbers with this change. Thanks, Tamar From: François Dumont Sent: Monday, December 30, 2024 5:08 PM To: Jonathan Wakely Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; nd ; libstd...@gcc.gnu.org Subject: Re: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop condition

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-02 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 2, 2025 5:54 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters &g

RE: [PATCH]AArch64: Implement four and eight chunk VLA concats [PR118272]

2025-01-02 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 2, 2025 5:19 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Implement four and eight chunk VLA concats

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-02 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 2, 2025 4:52 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters

[PATCH]AArch64: Implement four and eight chunk VLA concats [PR118272]

2025-01-02 Thread Tamar Christina
Hi All, The following testcase #pragma GCC target ("+sve") extern char __attribute__ ((simd, const)) fn3 (int, short); void test_fn3 (float *a, float *b, double *c, int n) { for (int i = 0; i < n; ++i) a[i] = fn3 (b[i], c[i]); } at -Ofast ICEs because my previous patch only a

[PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-02 Thread Tamar Christina
Hi All, When a target does not support gathers and scatters the vectorizer tries to emulate these using scalar loads/stores and a reconstruction of vectors from scalar. The loads are still marked with VMAT_GATHER_SCATTER to indicate that they are gather/scatters, however the vectorizer also asks
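As a rough illustration of the emulation described in the message above, the sketch below performs a gather as one scalar load per lane followed by rebuilding the vector, and a scatter as one scalar store per lane. The names `vec4i`, `LANES`, `emulated_gather` and `emulated_scatter` are hypothetical and exist only for this example; they are not GCC internals, and the sketch says nothing about how the vectorizer costs these operations.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4

/* Hypothetical 4-lane integer vector, modelled as a plain struct.  */
typedef struct { int lane[LANES]; } vec4i;

/* Emulated gather: one scalar load per lane, after which the vector
   is reconstructed from the loaded scalars.  */
static vec4i
emulated_gather (const int *base, const int *idx)
{
  assert (base != NULL && idx != NULL);
  vec4i v;
  for (int i = 0; i < LANES; ++i)
    v.lane[i] = base[idx[i]];
  return v;
}

/* Emulated scatter: each lane is extracted and written with its own
   scalar store.  With duplicate indices the highest lane wins.  */
static void
emulated_scatter (int *base, const int *idx, vec4i v)
{
  assert (base != NULL && idx != NULL);
  for (int i = 0; i < LANES; ++i)
    base[idx[i]] = v.lane[i];
}
```

In this model every lane costs a separate scalar memory access plus the work of assembling or splitting the vector, which is why such accesses would be expected to be costed differently from a native gather or scatter instruction.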

RE: [PATCH v3] LoongArch: Implement vector cbranch optab for LSX and LASX

2024-12-31 Thread Tamar Christina
Hi, > -Original Message- > From: Jiahao Xu > Sent: Wednesday, December 25, 2024 10:00 AM > To: gcc-patches@gcc.gnu.org > Cc: xry...@xry111.site; i...@xen0n.name; chengl...@loongson.cn; > xucheng...@loongson.cn; dengjia...@loongson.cn; Jiahao Xu > > Subject: [PATCH v3] LoongArch: Implemen

RE: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use cache and look back further [PR111422]

2024-12-31 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 20, 2024 11:28 AM > To: Andrew Pinski > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use > cache and look back further [PR111422] > > On Sat, Nov 16, 2024 at 5

RE: [PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-19 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, December 19, 2024 11:03 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 7/7]AArch64: Implement vector concat of partial SVE

RE: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2024-12-18 Thread Tamar Christina
> e791e52ec329277474f3218d8a44cd37ded14ac3..8101d868d0c5f7ac4f97931a > > ffcf71d826c88094 100644 > > > --- a/libstdc++-v3/include/bits/hashtable.h > > > +++ b/libstdc++-v3/include/bits/hashtable.h > > > @@ -2171,7 +2171,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > > if (this->_M_equals(__k,

RE: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2024-12-17 Thread Tamar Christina
> On Fri, 13 Dec 2024 at 17:13, Tamar Christina wrote: > > > > Hi All, > > > > We are currently generating a loop which has more comparisons than you'd > > typically need as the probabilities on the small size loop are such that it > > assumes the

RE: [PATCH 2/7]AArch64: Add SVE support for simd clones [PR96342]

2024-12-17 Thread Tamar Christina
imd_clone_adjust): Adapt safelen check to be compatible with VLA simdlen. gcc/testsuite/ChangeLog: PR target/96342 * gcc.target/aarch64/declare-simd-2.c: Add SVE clone scan. * gcc.target/aarch64/vect-simd-clone-1.c: New test. * g++.target/aarch64/vect-simd

[PATCH]Arm: [committed] fix bootstrap after MVE changes

2024-12-15 Thread Tamar Christina
Hi All, The recent commits for MVE on Saturday have broken armhf bootstrap due to a -Werror false positive: inlined from 'virtual rtx_def* {anonymous}::vstrq_scatter_base_impl::expand(arm_mve::function_expander&) const' at /gcc/config/arm/arm-mve-builtins-base.cc:352:17: ./genrtl.h:38:16: e

RE: [PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-13 Thread Tamar Christina
> > ;; 2 element quad vector modes. > > (define_mode_iterator VQ_2E [V2DI V2DF]) > > > > @@ -1678,7 +1686,15 @@ (define_mode_attr VHALF [(V8QI "V4QI") (V16QI > "V8QI") > > (V2DI "DI")(V2SF "SF") > > (V4SF "V2SF") (V4HF "V2HF") > >

[PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2024-12-13 Thread Tamar Christina
Hi All, We are currently generating a loop which has more comparisons than you'd typically need as the probabilities on the small size loop are such that it assumes the likely case is that an element is not found. This again generates a pattern that's harder for branch predictors to follow, but al

[PATCH 1/2][libstdc++]: Add inline keyword to _M_locate

2024-12-13 Thread Tamar Christina
Hi All, In GCC 12 there was a ~40% regression in the performance of hashmap->find. This regression came about accidentally: Before GCC 12 the find function was small enough that IPA would inline it even though it wasn't marked inline. In GCC 12 an optimization was added to perform a linear sear

RE: [PATCH 2/7]AArch64: Add SVE support for simd clones [PR96342]

2024-12-11 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, December 4, 2024 12:17 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: [PATCH 2/7]AArch64: Add SVE support for simd clones

RE: [PATCH 3/7]AArch64: Disable `omp declare variant' tests for aarch64 [PR96342]

2024-12-11 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, December 4, 2024 12:17 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: [PATCH 3/7]AArch64: Disable `omp declare variant' te

RE: [PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-11 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, December 4, 2024 12:18 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: [PATCH 7/7]AArch64: Implement vector concat of partial SV
