Re: [PATCH] i386: Utilize VCOMSBF16 for BF16 Comparisons with AVX10.2

2024-11-03 Thread Hongtao Liu
On Fri, Nov 1, 2024 at 8:33 AM Hongyu Wang wrote: > > From: Levy Hsu > > This patch enables the use of the VCOMSBF16 instruction from AVX10.2 for > efficient BF16 comparisons. > > Bootstrapped & regtested on x86-64-pc-linux-gnu. > Ok for trunk? Ok. > > gcc/ChangeLog: > > * config/i386/i38

Re: [PATCH v3 7/8] i386: Add else operand to masked loads.

2024-11-03 Thread Hongtao Liu
On Sat, Nov 2, 2024 at 8:58 PM Robin Dapp wrote: > > From: Robin Dapp > > This patch adds a zero else operand to masked loads, in particular the > masked gather load builtins that are used for gather vectorization. > > gcc/ChangeLog: > > * config/i386/i386-expand.cc (ix86_expand_special_a

[RFC][PATCH] RISC-V: Support TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN

2024-11-03 Thread Zhijin Zeng
I can't find the vector function name mangling for RISC-V, so in order to support TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN, TARGET_SIMD_CLONE_ADJUST and TARGET_SIMD_CLONE_USABLE, I added RISC-V vector function mangling rules as follows:     _ZGVNv_     'x' is the LMUL, if the LMUL is 1/2/4/

Re: [Ping][PATCH v2 00/12] AArch64/OpenMP: Test SVE ACLE types with various OpenMP constructs.

2024-11-03 Thread Tejas Belagod
Ping. Thanks, Tejas. On 10/18/24 11:59 AM, Tejas Belagod wrote: Hi Jakub, Just wanted to add that I'm sorry for the delay in respinning the patchset - I was caught up with another piece of work. Thanks for the reviews so far and thank you for your patience. Thanks, Tejas. On 10/18/24 11:5

[PATCH v2] aarch64: Optimise calls to ldexp with SVE FSCALE instruction [PR111733]

2024-11-03 Thread Soumya AR
Changes since v1: This revision makes use of the extended definition of aarch64_ptrue_reg to generate predicate registers with the appropriate set bits. Earlier, there was a suggestion to add support for half floats as well. I extended the patch to include HFs but GCC still emits a libcall for ld

Re: [PATCH 00/15] Support for 64-bit location_t

2024-11-03 Thread Nathaniel Shead
On Sun, Nov 03, 2024 at 05:22:05PM -0500, Lewis Hyatt wrote: > Hello- > > There is no shortage of PRs complaining about things that go wrong when the > line_maps data structure in libcpp starts to run into its limits. Being > restricted to a 32-bit location_t to cover all source locations means it

[PATCH 15/15] Support for 64-bit location_t: Configury parts

2024-11-03 Thread Lewis Hyatt
Add --enable-large-source-locations (off by default for now) to enable 64-bit location_t. gcc/ChangeLog: * configure.ac: Add --enable-large-source-locations. * config.in: Regenerate. * configure: Regenerate. * doc/install.texi: Document the new option. libcpp/Chan

[PATCH 13/15] Support for 64-bit location_t: Internal parts

2024-11-03 Thread Lewis Hyatt
Several of the selftests in diagnostic-show-locus.cc and input.cc are sensitive to linemap internals. Adjust them here so they will support 64-bit location_t if configured. Likewise, handle 64-bit location_t in the support for -fdump-internal-locations. As was done with the analyzer, convert to (u

[PATCH 11/15] Support for 64-bit location_t: RTL parts

2024-11-03 Thread Lewis Hyatt
Some RTL objects need to store a location_t. Currently, they store it in the rt_int field of union rtunion, but in a world where location_t could be 64-bit, they need to store it in a larger variable. Unfortunately, rtunion does not currently have a 64-bit int type for that purpose, so add one. In

[PATCH 07/15] Support for 64-bit location_t: toplev parts

2024-11-03 Thread Lewis Hyatt
The recommended bits reserved in a line_map to store ranges has always been 5, meaning that identifiers up to length 32 can be stored without generating an ad-hoc location. When 64-bit location_t is configured, there are plenty of bits to go around, and so the recommended default is larger. line-ma
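The arithmetic behind "5 bits, length 32" can be sketched in a few lines of C. The function name range_slots is hypothetical and is not part of libcpp; it only shows how the number of values that fit in the reserved bits grows with the bit count.

```c
/* Hypothetical helper, not a libcpp function: the number of distinct
   values that can be encoded in BITS reserved range bits.  */
static unsigned long long
range_slots (unsigned bits)
{
  return 1ULL << bits;
}
```

With 5 bits this gives 32, matching the figure quoted above; each additional reserved bit doubles it.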

[PATCH 10/15] Support for 64-bit location_t: C++ modules parts

2024-11-03 Thread Lewis Hyatt
The modules implementation is necessarily sensitive to the internal workings of class line_map, and so it needed changes in order to handle a 64-bit location_t. The changes mostly boil down to supporting that in the debug dumping routines (which is accomplished by using a new custom code %K for tha

[PATCH 04/15] tree-phinodes: Use 4 instead of 2 as the minimum number of phi args

2024-11-03 Thread Lewis Hyatt
Currently, when we allocate a gphi object, we round up the capacity for the trailing arguments array such that it will make full use of the page size that ggc will allocate. While there is also an explicit minimum of 2 arguments, in practice after rounding to the ggc page size there is always room

[PATCH 14/15] Support for 64-bit location_t: Testsuite parts

2024-11-03 Thread Lewis Hyatt
Add support to the testsuite for effective target "large_location_t" indicating if 64-bit location support has been configured. Adjust the tests that are sensitive to location_t internals so they can test large locations too. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_no_co

[PATCH 01/15] Support for 64-bit location_t: libcpp parts

2024-11-03 Thread Lewis Hyatt
This patch adds support in the libcpp line-maps infrastructure for a 64-bit location_t type that is capable of representing more source locations than the current 32-bit location_t. The support will be made configurable at build time in a subsequent patch; for now it will be off by default. libcpp

[PATCH 09/15] Support for 64-bit location_t: Frontend parts

2024-11-03 Thread Lewis Hyatt
The C/C++ frontend code contains a couple instances where a callback receiving a "location_t" argument is prototyped to take "unsigned int" instead. This will make a difference once location_t can be configured to a different type, so adjust that now. gcc/c-family/ChangeLog: * c-lex.cc (c

[PATCH 12/15] Support for 64-bit location_t: Backend parts

2024-11-03 Thread Lewis Hyatt
A few targets have been using "unsigned int" function arguments that need to receive a "location_t". Change to "location_t" to prepare for the possibility that location_t can be configured to be a different type. gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch64_resolve_overloaded_bui

[PATCH 06/15] gimple: Handle tail padding when computing gimple_ops_offset

2024-11-03 Thread Lewis Hyatt
The array gimple_ops_offset_[], which is used to find the trailing op[] array for a given gimple struct, is computed assuming that op[] will be found at sizeof(tree) bytes away from the end of the struct. This is only correct if the alignment requirement of a pointer is the same as the alignment re
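A minimal sketch of the tail-padding issue described above, using a hypothetical struct rather than any real gimple type. A uint32_t stands in for the pointer-sized operand so that the effect can show up on a 64-bit host as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a statement struct with a trailing operand
   array.  The 64-bit member can raise the struct's alignment above that
   of the operand type, which may add padding after op[0].  */
struct stmt_like
{
  uint64_t wide_field;
  uint32_t op[1];
};

/* Fragile: assumes op[] ends exactly where the struct ends.  */
static size_t
ops_offset_naive (void)
{
  return sizeof (struct stmt_like) - sizeof (uint32_t);
}

/* Robust: asks the compiler where op[] actually starts.  */
static size_t
ops_offset_exact (void)
{
  return offsetof (struct stmt_like, op);
}
```

On ABIs where uint64_t is 8-byte aligned, the struct is padded to 16 bytes, so the naive computation points past the real start of op[]. Where no tail padding is needed, the two results agree.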

[PATCH 05/15] c++: Fix tree_contains_struct for TRAIT_EXPR

2024-11-03 Thread Lewis Hyatt
CODE_CONTAINS_STRUCT () currently reports that a TRAIT_EXPR contains a TS_EXP struct, but it does not actually start with a TS_EXP as an initial sequence. In modules.cc, when we stream out a tree, we explicitly check for the TS_EXP case and call note_location(t->exp.locus) if so. Currently, this ac

[PATCH 08/15] Support for 64-bit location_t: Analyzer parts

2024-11-03 Thread Lewis Hyatt
The analyzer occasionally prints internal location_t values for debugging; adjust those parts so they will work if location_t is 64-bit. For simplicity, to avoid hassling with the printf format string, just convert to (unsigned long long) in either case. gcc/analyzer/ChangeLog: * checker-
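The cast described above can be sketched as follows. The typedef and the function format_location are assumptions made for illustration; they are not the analyzer's actual code.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the real typedef, chosen as 64-bit here.  */
typedef uint64_t location_t;

/* Format LOC for a debug dump.  Casting to unsigned long long lets a
   single "%llu" format work whether location_t is 32 or 64 bits wide.  */
static int
format_location (char *buf, size_t len, location_t loc)
{
  return snprintf (buf, len, "loc=%llu", (unsigned long long) loc);
}
```

The same call compiles unchanged if the typedef is switched to a 32-bit type, which is the point of casting in either case.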

Re: [PATCH] cgraph: remove dead if stmt in build_cgraph_edges pass

2024-11-03 Thread Josef Melcr
Hi, I've looked at the statements slipping through and they are all atomic loads and stores. What I find strange is these statements don't get expanded only in tests for errors, when the code is not supposed to compile. In all other cases all statements get expanded just fine, including atomi

[PATCH 03/15] tree-cfg: Fix call to next_discriminator_for_locus()

2024-11-03 Thread Lewis Hyatt
While testing 64-bit location_t support, I ran into an -fcompare-debug issue that was traced back here. Despite the name, next_discriminator_for_locus() is meant to take an integer line number argument, not a location_t. There is one call site which has been passing a location_t instead. For the mo

[PATCH 02/15] libcpp: Fix potential unaligned access in cpp_buffer

2024-11-03 Thread Lewis Hyatt
libcpp makes use of the cpp_buffer pfile->a_buff to store things while it is handling macros. It uses it to store pointers (cpp_hashnode*, for macro arguments) and cpp_macro objects. This works fine because a cpp_hashnode* and a cpp_macro have the same alignment requirement on either 32-bit or 64-b

[PATCH 00/15] Support for 64-bit location_t

2024-11-03 Thread Lewis Hyatt
Hello- There is no shortage of PRs complaining about things that go wrong when the line_maps data structure in libcpp starts to run into its limits. Being restricted to a 32-bit location_t to cover all source locations means it has the following limitations, among others: -Column numbers larg

Re: [PATCH] match: Fix `a != 0 ? a - 1 : 0` pattern [PR117363]

2024-11-03 Thread Andrew Pinski
On Sun, Nov 3, 2024 at 12:01 PM Eric Botcazou wrote: > > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -3396,10 +3396,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > simplify (X != 0 ? X + ~0 : 0) to (X - X != 0). */ > > The rightmost ( in the comment should be moved 2 tokens right. Yes

Re: [PATCH] match: Fix `a != 0 ? a - 1 : 0` pattern [PR117363]

2024-11-03 Thread Eric Botcazou
> --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -3396,10 +3396,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > simplify (X != 0 ? X + ~0 : 0) to (X - X != 0). */ The rightmost ( in the comment should be moved 2 tokens right. -- Eric Botcazou
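For reference, the equivalence the comment is trying to state, with the parenthesis where the review asks for it, i.e. X - (X != 0), can be checked with a small C sketch. The function names are made up for illustration and unsigned operands are assumed.

```c
#include <assert.h>
#include <limits.h>

/* Source form: decrement, except that zero stays zero.  */
static unsigned
dec_cond (unsigned x)
{
  return x != 0 ? x - 1 : 0;
}

/* Branch-free form: subtract the 0/1 result of the comparison.  */
static unsigned
dec_branchless (unsigned x)
{
  return x - (x != 0);
}

/* Assert that both forms agree on a handful of boundary values.  */
static void
check_forms_agree (void)
{
  const unsigned samples[] = { 0u, 1u, 2u, 255u, UINT_MAX - 1u, UINT_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (dec_cond (samples[i]) == dec_branchless (samples[i]));
}
```

Read without the parentheses, X - X != 0 parses as (X - X) != 0, which is always false, hence the request to move the parenthesis.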

Re: [PATCH] Add COBOL to gcc (was: Add 'cobol' to Makefile.def)

2024-11-03 Thread James K. Lowden
On Fri, 1 Nov 2024 20:27:21 +0100 Jakub Jelinek wrote: > The second (which can only be committed when the first one is manually > picked by one of gccadmins) should be just adding the ChangeLog. ... > see e.g. https://gcc.gnu.org/r13-4591 The technique here is novel to me. Rather than a diff from

Re: [PATCH] Add COBOL to gcc

2024-11-03 Thread James K. Lowden
On Fri, 1 Nov 2024 20:27:21 +0100 Jakub Jelinek wrote: > the first patch should be like > https://gcc.gnu.org/r13-4588 > or > https://gcc.gnu.org/r13-4589 > adding support to contrib/gcc-changelog/git_commit.py > This commit should have a ChangeLog entry in the commit message, see > the above two

Re: [PATCH 1/1] Unify registered_pp_pragmas and registered_pragmas

2024-11-03 Thread Paul Iannetta
On Fri, Nov 01, 2024 at 11:45:07AM -0400, Jason Merrill wrote: > On 10/31/24 6:43 AM, Paul Iannetta wrote: > > gcc/c-family/ChangeLog: > > > > * c-pragma.cc (struct pragma_pp_data): Use (struct > > internal_pragma_handler); > > (c_register_pragma_1): Always register name and space for all

[PATCHv2 0/3] ada: Add GNU/Hurd x86_64 support

2024-11-03 Thread Samuel Thibault
I reworked the patch to factorize the bsd signal definitions. I have split off the system definitions because the priority range of GNU/Mach has diverged from the original BSD kernels. Samuel Thibault (3): ada: Factorize bsd signal definitions ada: Fix GNU/Hurd priority range ada: Add GNU/Hu

[PATCHv2 2/3] ada: Fix GNU/Hurd priority range

2024-11-03 Thread Samuel Thibault
GNU/Mach currently uses a 0..63 range. gcc/ada/ChangeLog: * libgnat/system-gnu.ads: New file. * Makefile.rtl (x86-gnuhurd): Use libgnat/system-gnu.ads instead of libgnat/system-freebsd.ads. Signed-off-by: Samuel Thibault --- gcc/ada/ChangeLog | 4 + gcc/a

[PATCHv2 3/3] ada: Add GNU/Hurd x86_64 support

2024-11-03 Thread Samuel Thibault
This is essentially the same as the i386-pc-gnu section, the differences are the same as between freebsd i386 and freebsd x86_64. gcc/ada/ChangeLog: * Makefile.rtl: Add x86_64-pc-gnu section. Signed-off-by: Samuel Thibault --- gcc/ada/ChangeLog| 2 ++ gcc/ada/Makefile.rtl | 32 +++

[PATCHv2 1/3] ada: Factorize bsd signal definitions

2024-11-03 Thread Samuel Thibault
They are all the same on all BSD-like systems (including GNU/Hurd). gcc/ada/ChangeLog: * libgnarl/a-intnam__freebsd.ads: Rename to... * libgnarl/a-intnam__bsd.ads: ... new file. * libgnarl/a-intnam__dragonfly.ads: Remove file. * Makefile.rtl (x86-kfreebsd, x86-gnuh

Re: [PATCH] match: Fix `a != 0 ? a - 1 : 0` pattern [PR117363]

2024-11-03 Thread Richard Biener
On Thu, Oct 31, 2024 at 4:44 PM Andrew Pinski wrote: > > There are a couple of things wrong with this pattern which > I missed during the review. First each nop_convert should > be nop_convert1 or nop_convert2. > Second is we need to do the minus in the same type as the minus > was originally so we d

Re: [PATCH 2/3] Add one more argument to simulate_builtin_function_decl.

2024-11-03 Thread Richard Biener
On Fri, Nov 1, 2024 at 9:22 AM KuanLin Chen wrote: > > simulate_builtin_function_decl may return a decl that has already been ggc_freed > in pushdecl when duplicate_decls is true. It shouldn't do that. It should either return the duplicate or NULL, so this is definitely not a good fix - well, it's to th

Re: [PATCH] docs: Document that __builtin_assoc_barrier also can be used for FMAs [PR115023]

2024-11-03 Thread Richard Biener
On Sun, Nov 3, 2024 at 7:41 AM Andrew Pinski wrote: > > I noticed that __builtin_assoc_barrier makes a difference for FMA formation > but it was not documented. This adds that documentation even with a small > example. > > Build the HTML documents to make sure everything looks correct. OK. Rich

Re: [PATCH #6/7] ifcombine across noncontiguous blocks

2024-11-03 Thread Richard Biener
On Sat, Nov 2, 2024 at 8:39 AM Alexandre Oliva wrote: > > On Oct 30, 2024, Richard Biener wrote: > > > I think since you make the outer condition the short-circuiting one what's > > in > > the inner block isn't executed when it wasn't before the transform? So in > > fact you shouldn't need to p