Re: [PATCH] [GCCJIT] support dynamic alloca stub

2024-11-09 Thread Schrodinger ZHU Yifan
Thank you for the quick review. Indeed, I reverted the changes to tree-ssa-ccp.cc and applied your changes. All tests still pass.

Re: [PATCH] [GCCJIT] support dynamic alloca stub

2024-11-09 Thread Andrew Pinski
On Sat, Nov 9, 2024 at 8:30 PM Schrodinger ZHU Yifan wrote: > > This patch adds dynamic alloca stubs support to GCCJIT. > > DEF_BUILTIN_STUB only define the enum for builtins instead of > providing the type. Therefore, builtins with stub will lead to > ICE before this patch. This applies to `alloc

[PATCH] [GCCJIT] support dynamic alloca stub

2024-11-09 Thread Schrodinger ZHU Yifan
This patch adds dynamic alloca stub support to GCCJIT. DEF_BUILTIN_STUB only defines the enum for builtins instead of providing the type. Therefore, builtins with stubs led to an ICE before this patch. This applies to `alloca_with_align`, `stack_save` and `stack_restore`. This patch adds special
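For readers unfamiliar with the three builtins named here, the sketch below shows the ordinary C construct that, as far as I understand GCC's lowering, ends up using them: a variable-length array in a block scope. Block entry saves the stack pointer (`stack_save`), the array is a dynamic stack allocation (`alloca_with_align`), and block exit restores the pointer (`stack_restore`). This is plain C, not libgccjit code. `sum_squares` and the 1024-element cap are hypothetical, and the patch's own API usage is not shown in this excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums i*i for i in [0, n) through a scratch VLA. */
unsigned long sum_squares(size_t n)
{
    unsigned long total = 0;

    /* Arbitrary cap so the run-time stack allocation stays small. */
    assert(n <= 1024);
    if (n == 0)
        return 0;
    {
        /* VLA: n elements are allocated on the stack at run time and
           released again when this block ends. */
        unsigned long scratch[n];
        for (size_t i = 0; i < n; i++)
            scratch[i] = (unsigned long)i * i;
        for (size_t i = 0; i < n; i++)
            total += scratch[i];
    }
    return total;
}
```

Calling `sum_squares(4)` returns 14 (0 + 1 + 4 + 9).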

[RFC/RFA] [PATCH v7 11/12] Replace the original CRC loops with a faster CRC calculation.

2024-11-09 Thread Mariam Arutunian
After the loop exit an internal function call (CRC, CRC_REV) is added, and its result is assigned to the output CRC variable (the variable where the calculated CRC is stored after the loop execution). The removal of the loop is left to CFG cleanup and DCE. gcc/ * gimple-crc-optimization.cc

[RFC/RFA] [PATCH v7 08/12] Add a new pass for naive CRC loops detection.

2024-11-09 Thread Mariam Arutunian
This patch adds a new compiler pass aimed at identifying naive CRC implementations, characterized by the presence of a loop calculating a CRC (polynomial long division). Upon detection of a potential CRC, the pass prints an informational message. Performs CRC optimization if optimization level is
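For context, this is the kind of loop the description refers to: a bit-by-bit CRC that performs polynomial long division over GF(2). The sketch uses the common reflected CRC-32 parameters (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF), which I chose for illustration. Whether the new pass recognizes this exact shape is not stated in the excerpt, and `crc32_naive` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reversed (reflected) CRC-32, computed one bit at a time. */
uint32_t crc32_naive(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial
               (one step of the long division); otherwise just shift. */
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

With the standard check input "123456789" (9 bytes) this returns 0xCBF43926.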

[RFC/RFA] [PATCH v7 07/12] aarch64: Add CRC built-ins test for the target AES.

2024-11-09 Thread Mariam Arutunian
gcc/testsuite/gcc.target/aarch64/ * crc-builtin-pmul64.c: New test. Signed-off-by: Mariam Arutunian --- .../gcc.target/aarch64/crc-builtin-pmul64.c | 61 +++ 1 file changed, 61 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-builtin-pmul64.c diff --

[RFC/RFA][PATCH v7 06/12] aarch64: Implement new expander for efficient CRC computation.

2024-11-09 Thread Mariam Arutunian
This patch introduces two new expanders for the aarch64 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the crc32, crc32c and pmull instructions when

[RFC/RFA] [PATCH v7 04/12] RISC-V: Add CRC built-ins tests for the target ZBC.

2024-11-09 Thread Mariam Arutunian
gcc/testsuite/gcc.target/riscv/ * crc-builtin-zbc32.c: New file. * crc-builtin-zbc64.c: Likewise. Signed-off-by: Mariam Arutunian Mentored-by: Jeff Law --- .../gcc.target/riscv/crc-builtin-zbc32.c | 21 ++ .../gcc.target/riscv/crc-builtin-zbc64.c | 66 +++

[RFC/RFA] [PATCH v7 10/12] Verify detected CRC loop with symbolic execution and LFSR matching.

2024-11-09 Thread Mariam Arutunian
Symbolically execute potential CRC loops and check whether the loop actually calculates CRC (uses LFSR matching). Calculated CRC and created LFSR are compared on each iteration of the potential CRC loop. gcc/ * Makefile.in (OBJS): Add crc-verification.o. * crc-verification.cc: New file.

[RFC/RFA][PATCH v7 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-11-09 Thread Mariam Arutunian
If the target is ZBC or ZBKC, the expander uses the clmul instruction for the CRC calculation. Otherwise, if the target is ZBKB, it generates a table-based CRC but uses the bswap and brev8 instructions to reverse the inputs and the output. Add new tests to check CRC generation for ZBC, ZBKC and ZBKB targets. gcc/

[RFC/RFA][PATCH v7 05/12] i386: Implement new expander for efficient CRC computation.

2024-11-09 Thread Mariam Arutunian
This patch introduces two new expanders for the i386 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the pclmulqdq or crc32 instructions when suppor

[RFC/RFA][PATCH v7 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs.

2024-11-09 Thread Mariam Arutunian
This patch introduces new built-in functions to GCC for computing bit-forward and bit-reversed CRCs. These builtins aim to provide efficient CRC calculation capabilities. When the target architecture supports CRC operations (as indicated by the presence of a CRC optab), the builtins will utilize th

[RFC/RFA] [PATCH v7 01/12] Implement internal functions for efficient CRC computation.

2024-11-09 Thread Mariam Arutunian
Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster CRC generation. One performs bit-forward and the other bit-reversed CRC computation. If CRC optabs are supported, they are used for the CRC computation. Otherwise, table-based CRC is generated. The supported data and CRC sizes
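To illustrate the table-based fallback mentioned here, the following is a minimal sketch of a table-driven bit-forward CRC next to the bitwise version it is equivalent to. The parameters (CRC-32/MPEG-2: polynomial 0x04C11DB7, initial value 0xFFFFFFFF, no final XOR) and all names are my own choices for illustration. The code GCC actually emits for IFN_CRC is not shown in this excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC_POLY 0x04C11DB7u

/* Bit-forward reference: each byte is processed MSB first, one bit per step. */
uint32_t crc32_fwd_bitwise(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint32_t)data[i] << 24;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x80000000u) ? (crc << 1) ^ CRC_POLY : crc << 1;
    }
    return crc;
}

static uint32_t crc_table[256];
static int crc_table_ready;

/* Precompute the effect of eight shift steps for every possible top byte. */
static void crc_build_table(void)
{
    for (uint32_t b = 0; b < 256; b++) {
        uint32_t r = b << 24;
        for (int bit = 0; bit < 8; bit++)
            r = (r & 0x80000000u) ? (r << 1) ^ CRC_POLY : r << 1;
        crc_table[b] = r;
    }
    crc_table_ready = 1;
}

/* Table-based version: one lookup per input byte instead of eight steps. */
uint32_t crc32_fwd_table(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    if (!crc_table_ready)
        crc_build_table();

    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = (crc << 8) ^ crc_table[(crc >> 24) ^ data[i]];
    return crc;
}
```

Both functions return 0x0376E6E7 for the check input "123456789". The table trades 1 KiB of data for eight fewer shift-and-test steps per byte.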

[RFC/RFA][PATCH v7 00/12] CRC optimization.

2024-11-09 Thread Mariam Arutunian
Hello, This patch series is a revised version of the following: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668229.html. In this version: - Patch 09/12 has been updated with comments provided by Matevos, which were missing in the previously submitted series. - Patch 06/12 in

Re: [PATCH 10/15] Support for 64-bit location_t: C++ modules parts

2024-11-09 Thread Lewis Hyatt
On Tue, Nov 5, 2024 at 10:57 AM Jason Merrill wrote: > > OK. > > On 11/3/24 5:22 PM, Lewis Hyatt wrote: > > The modules implementation is necessarily sensitive to the internal workings > > of class line_map, and so it needed changes in order to handle a 64-bit > > location_t. The changes mostly bo

[RFC/RFA] [PATCH v6 09/12] Add symbolic execution support.

2024-11-09 Thread Mariam Arutunian
Gives an opportunity to execute the code on bit level, assigning symbolic values to the variables which don't have initial values. Supports only CRC specific operations. Example: uint8_t crc; uint8_t pol = 1; crc = crc ^ pol; during symbolic execution crc's value will be: crc(8), crc(7), ... crc

[RFC/RFA][PATCH v6 06/12] aarch64: Implement new expander for efficient CRC computation.

2024-11-09 Thread Mariam Arutunian
This patch introduces two new expanders for the aarch64 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the crc32, crc32c and pmull instructions when

[RFC/RFA] [PATCH v6 11/12] Replace the original CRC loops with a faster CRC calculation.

2024-11-09 Thread Mariam Arutunian
After the loop exit an internal function call (CRC, CRC_REV) is added, and its result is assigned to the output CRC variable (the variable where the calculated CRC is stored after the loop execution). The removal of the loop is left to CFG cleanup and DCE. gcc/ * gimple-crc-optimization.cc

[RFC/RFA] [PATCH v6 08/12] Add a new pass for naive CRC loops detection.

2024-11-09 Thread Mariam Arutunian
This patch adds a new compiler pass aimed at identifying naive CRC implementations, characterized by the presence of a loop calculating a CRC (polynomial long division). Upon detection of a potential CRC, the pass prints an informational message. Performs CRC optimization if optimization level is

[RFC/RFA][PATCH v6 05/12] i386: Implement new expander for efficient CRC computation.

2024-11-09 Thread Mariam Arutunian
This patch introduces two new expanders for the i386 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the pclmulqdq or crc32 instructions when suppor

[RFC/RFA] [PATCH v6 10/12] Verify detected CRC loop with symbolic execution and LFSR matching.

2024-11-09 Thread Mariam Arutunian
Symbolically execute potential CRC loops and check whether the loop actually calculates CRC (uses LFSR matching). Calculated CRC and created LFSR are compared on each iteration of the potential CRC loop. gcc/ * Makefile.in (OBJS): Add crc-verification.o. * crc-verification.cc: New file.

[RFC/RFA] [PATCH v6 07/12] aarch64: Add CRC built-ins test for the target AES.

2024-11-09 Thread Mariam Arutunian
gcc/testsuite/gcc.target/aarch64/ * crc-builtin-pmul64.c: New test. Signed-off-by: Mariam Arutunian --- .../gcc.target/aarch64/crc-builtin-pmul64.c | 61 +++ 1 file changed, 61 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-builtin-pmul64.c diff --

[RFC/RFA] [PATCH v6 04/12] RISC-V: Add CRC built-ins tests for the target ZBC.

2024-11-09 Thread Mariam Arutunian
gcc/testsuite/gcc.target/riscv/ * crc-builtin-zbc32.c: New file. * crc-builtin-zbc64.c: Likewise. Signed-off-by: Mariam Arutunian Mentored-by: Jeff Law --- .../gcc.target/riscv/crc-builtin-zbc32.c | 21 ++ .../gcc.target/riscv/crc-builtin-zbc64.c | 66 +++

[RFC/RFA][PATCH v6 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-11-09 Thread Mariam Arutunian
If the target is ZBC or ZBKC, the expander uses the clmul instruction for the CRC calculation. Otherwise, if the target is ZBKB, it generates a table-based CRC but uses the bswap and brev8 instructions to reverse the inputs and the output. Add new tests to check CRC generation for ZBC, ZBKC and ZBKB targets. gcc/

[RFC/RFA][PATCH v6 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs.

2024-11-09 Thread Mariam Arutunian
This patch introduces new built-in functions to GCC for computing bit-forward and bit-reversed CRCs. These builtins aim to provide efficient CRC calculation capabilities. When the target architecture supports CRC operations (as indicated by the presence of a CRC optab), the builtins will utilize th

[RFC/RFA] [PATCH v6 01/12] Implement internal functions for efficient CRC computation.

2024-11-09 Thread Mariam Arutunian
Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster CRC generation. One performs bit-forward and the other bit-reversed CRC computation. If CRC optabs are supported, they are used for the CRC computation. Otherwise, table-based CRC is generated. The supported data and CRC sizes

[RFC/RFA][PATCH v6 00/12] CRC optimization.

2024-11-09 Thread Mariam Arutunian
Hello, This patch series is a revised version of the following: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665855.html . I have addressed the feedback on the emit_crc function in patch 01/12, and Matevos has provided additional comments for patch 09/12. Thanks, Mariam

Re: [PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-09 Thread Peter Bergner
On 11/8/24 5:12 PM, Segher Boessenkool wrote: > On Fri, Nov 08, 2024 at 02:28:11PM -0600, Peter Bergner wrote: >> On 11/8/24 1:44 PM, Michael Meissner wrote: >>> diff --git a/gcc/config/rs6000/rs6000-arch.def >>> b/gcc/config/rs6000/rs6000-arch.def >>> new file mode 100644 >>> index 000..e

Re: [PATCH v2] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-09 Thread Alejandro Colomar
Hi Martin, On Sat, Nov 09, 2024 at 06:15:35PM GMT, Martin Uecker wrote: > > This patch enables the Wzero-as-null-pointer-constant for C. > The second version added more tests and fixes one condition > to not incorrectly warn for nullptr. > > > Bootstrapped and regression tested on x86_64. > >

[PATCH v2] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-09 Thread Martin Uecker
This patch enables the Wzero-as-null-pointer-constant for C. The second version added more tests and fixes one condition to not incorrectly warn for nullptr. Bootstrapped and regression tested on x86_64. c: add Wzero-as-null-pointer-constant [PR117059] Add warnings for the use o
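As a small illustration of what the option is about, the sketch below puts the two spellings of a null pointer side by side. Which lines get diagnosed is my reading of the test quoted elsewhere in this thread (literal `0` warns, `NULL` does not), not something I verified against the patch. `is_null` and `null_constant_demo` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function taking a pointer argument. */
static int is_null(const void *p)
{
    return p == NULL;
}

int null_constant_demo(void)
{
    const void *a = 0;     /* literal zero used as a null pointer constant */
    const void *b = NULL;  /* the spelling the warning steers towards */

    /* Both spellings produce the same null pointer value. */
    assert(a == b);

    /* The first call passes a literal zero, the other two do not. */
    return is_null(0) && is_null(NULL) && is_null(a);
}
```

The program behaves identically either way. The warning is purely about making the pointer intent visible in the source.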

[patch, Fortran] Reject UNSIGNED for COMPLEX

2024-11-09 Thread Thomas Koenig
Hello world, the attached patch rejects UNSIGNED arguments for the COMPLEX function, which is an extension. It also documents CMPLX, INT and REAL as taking UNSIGNED arguments. Regression-tested. OK for trunk? Best regards Thomas gcc/fortran/ChangeLog: * check.cc (gfc_check_c

[pushed] Darwin: Support '-ObjC{, ++}' as shorthand for -xobjective-c{, ++} [PR117478].

2024-11-09 Thread Iain Sandoe
Tested on x86_64-darwin19, 21 and on x86_64-linux, pushed to trunk, thanks, Iain --- 8< --- This improves compatibility with clang, and is used by some projects. PR target/117478 gcc/ChangeLog: * config/darwin-driver.cc (darwin_driver_init): Handle ObjC/ObjC++ * config/

[committed] contrib: Add 2 further ignored commits

2024-11-09 Thread Jakub Jelinek
Hi! r15-4998 and r15-5004 had wrong commit message, add those to ignored commits. ChangeLog will need to be added manually. The patch below is what I've committed before gcc_update_version and attached patch after it to fix this up. 2024-11-09 Jakub Jelinek * gcc-changelog/git_updat

Re: [PATCH] m2: Fix up dependencies some more

2024-11-09 Thread Gaius Mulley
Jakub Jelinek writes: > Hi! > > Anyway, bootstrapped/regtested successfully on x86_64-linux and i686-linux, > ok for trunk? That doesn't mean all dependencies are correct, just that > this change didn't make things worse. > > 2024-11-06 Jakub Jelinek > > gcc/m2/ > * Make-lang.in (m2_OBJ

Re: [PATCH] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-09 Thread Alejandro Colomar
Hi Martin, > +void foo(void*); > +void bar() > +{ > + foo(0); /* { dg-warning "zero as null pointer > constant" } */ > + foo(NULL); > + > + void *p = 0;/* { dg-warning "zero as null pointer > constant" } */ > + void *r = NULL; > +

[PATCH] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-09 Thread Martin Uecker
This patch enables the Wzero-as-null-pointer-constant for C. Bootstrapped and regression tested on x86_64. c: add Wzero-as-null-pointer-constant [PR117059] Add warnings for the use of zero as a null pointer constant to the C FE. PR c/117059 gcc/c-family/

Re: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-09 Thread Jeff Law
On 11/7/24 4:34 PM, Li, Pan2 wrote: Thanks Tamar and Jeff for comments. I'm not sure it's that simple. It'll depend on the micro-architecture. So things like strength of the branch predictors, how fetch blocks are handled (can you have embedded not-taken branches, short-forward-branch optim
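For readers who only see the subject line, the sketch below contrasts a branchy unsigned saturating add with a branchless equivalent. It is one possible pair of forms and not necessarily the "form 4" the patch matches. All names are hypothetical. Which version is faster depends on the micro-architecture, which is the point under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Branchy form: test for wrap-around, then pick the result. */
uint32_t sat_add_branch(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;
    if (sum < x)
        return UINT32_MAX;
    return sum;
}

/* Branchless form: -(sum < x) is all-ones on overflow and zero otherwise,
   so the OR either saturates the result or leaves it unchanged. */
uint32_t sat_add_branchless(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;
    return sum | -(uint32_t)(sum < x);
}

/* Checks that both forms agree on a handful of boundary values. */
void check_sat_add_forms_agree(void)
{
    static const uint32_t v[] = { 0u, 1u, 2u, 0x7FFFFFFFu, 0x80000000u,
                                  UINT32_MAX - 1u, UINT32_MAX };
    const unsigned n = sizeof v / sizeof v[0];

    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            assert(sat_add_branch(v[i], v[j]) == sat_add_branchless(v[i], v[j]));
}
```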

Re: [PATCH 1/4] openmp: Tune omp_max_vf for offload targets

2024-11-09 Thread Thomas Schwinge
Hi Andrew! On 2024-11-06T15:27:19+, Andrew Stubbs wrote: > If requested, return the vectorization factor appropriate for the offload > device, if any. > --- a/gcc/omp-general.cc > +++ b/gcc/omp-general.cc > @@ -987,10 +987,11 @@ find_combined_omp_for (tree *tp, int *walk_subtrees, > void *d

Re: [PATCH v2] arm: Don't ICE on arm_mve.h pragma without MVE types [PR117408]

2024-11-09 Thread Torbjorn SVENSSON
On 2024-11-08 18:44, Christophe Lyon wrote: On Thu, 7 Nov 2024 at 18:05, Torbjörn SVENSSON wrote: Changes since v1: - Updated the error message to mention that arm_mve_types.h needs to be included. - Corrected some spelling errors in commit message. As the warning for pure functions re

[patch,avr] Fix PR117500: Don't ICE on invalid asm operands.

2024-11-09 Thread Georg-Johann Lay
This patch avoids an internal compiler error when a %i gets an operand that's not valid for %i. It uses output_operand_lossage that outputs an ordinary error. Ok to apply? Johann -- AVR: target/117500 - Use output_operand_lossage in avr_print_operand. PR target/117500 gcc/ *

Re: [PATCH v7] Target-independent store forwarding avoidance.

2024-11-09 Thread Konstantinos Eleftheriou
Hi, thanks for the feedback! It turned out to be an endianness issue and we needed to treat the call to `store_bit_field` differently for machines with BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN, like H8. I've sent a new version (https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668214.html). Than

[PATCH v8] Target-independent store forwarding avoidance.

2024-11-09 Thread Konstantinos Eleftheriou
From: kelefth This pass detects cases of expensive store forwarding and tries to avoid them by reordering the stores and using suitable bit insertion sequences. For example it can transform this: strb w2, [x1, 1] ldr x0, [x1] # Expensive store forwarding to larger load. To

[PATCH v2 3/3] c++/modules: Prevent ICE when writing class-scope lambdas without mangling scope [PR116568]

2024-11-09 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? Alternatively, after this I'll work on an update of my P1815 (TU-local entities) patch series [1] which would also solve this ICE by erroring early due to attempting to emit a TU-local entity. As discussed in patch #1 I believe this

[PATCH v2 2/3] c++: Update mangling of lambdas in expressions

2024-11-09 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? -- >8 -- https://github.com/itanium-cxx-abi/cxx-abi/pull/85 clarifies that mangling a lambda expression should use 'L' rather than "tl". gcc/cp/ChangeLog: * mangle.cc (write_expression): Update mangling for lambdas. gcc/t

[PATCH v2 1/3] c++: Fix mangling of otherwise unattached class-scope lambdas [PR107741]

2024-11-09 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? Given that this doesn't actually fix the modules PR c++/116568 anymore I've pulled my workaround for that out as a separate patch (3/3). -- >8 -- This is a step closer to implementing the suggested changes for https://github.com/it