On Thu, Nov 7, 2024 at 1:40 PM H.J. Lu wrote:
>
> On Sat, Nov 2, 2024 at 6:48 AM H.J. Lu wrote:
> >
> > On Sat, Oct 26, 2024 at 7:25 AM H.J. Lu wrote:
> > >
> > > On Sun, Oct 20, 2024 at 6:42 AM H.J. Lu wrote:
> > > >
> > > > On Sun, Oct 13, 2024, 10:07 AM H.J. Lu wrote:
> > > >>
> > > >> Adju
Pre-approved for that change, so you don't need to wait for another response :)
Just a reminder that this requires either adding a new exp file or
adding a few new lines in riscv.exp.
On Thu, Nov 14, 2024 at 3:28 PM Li, Pan2 wrote:
>
> Make sense and sure thing, let me file another patch for thi
Make sense and sure thing, let me file another patch for this.
Pan
-Original Message-
From: Kito Cheng
Sent: Thursday, November 14, 2024 3:22 PM
To: 钟居哲
Cc: Li, Pan2 ; gcc-patches ;
jeffreyalaw ; Robin Dapp
Subject: Re: [PATCH v1] RISC-V: Rearrange the test files for scalar SAT_ADD
Hi Pan:
Could you create a subfolder under RISC-V to contain all saturation-related
testcases?
e.g. gcc/testsuite/gcc.target/riscv/sat/
On Thu, Nov 14, 2024 at 2:48 PM 钟居哲 wrote:
>
> LGTM
>
>
> juzhe.zh...@rivai.ai
>
>
> From: pan2.li
> Date: 2024-11-14 14:42
> To:
From: Pan Li
The test files of scalar SAT_ADD only have numbers as the suffix.
Rearrange the file name to -{form number}-{target-type}. For example,
test form 3 for uint32_t SAT_ADD will have -3-u32.c for asm check and
-run-3-u32.c for the run test.
The below test suites are passed for this patc
Sounds like a very good idea.
Moreover friend declaration could be limited to another _Hashtable<>
type with same _Key, _Value and _Alloc types to be compatible.
On 08/11/2024 11:33, Jonathan Wakely wrote:
On Thu, 7 Nov 2024 at 22:18, Jonathan Wakely wrote:
I realised that _M_merge_unique an
On Wed, Nov 13, 2024 at 10:00 AM Hongyu Wang wrote:
>
> Hi,
>
> For cstorebf4 it uses comparison_operator for BFmode compare, which is
> incorrect when directly uses ix86_expand_setcc as it does not canonicalize
> the input comparison to correct the compare code by swapping operands.
> Since the o
Hi all,
In GCC13, the error for GCC14+ is actually a warning for the pointer type.
Correct that in testcase.
Commit as obvious.
Thx,
Haochen
gcc/testsuite/ChangeLog:
* gcc.target/i386/cmpccxadd-1b.c: Change to dg-warning.
---
gcc/testsuite/gcc.target/i386/cmpccxadd-1b.c | 4 ++--
1 fi
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_int_cfmovcc): Expand
to cfcmov pattern.
* config/i386/i386-opts.h (enum apx_features): New.
* config/i386/i386-protos.h (ix86_expand_int_cfmovcc): Define.
* config/i386/i386.cc (ix86_rtx_costs): Add U
Hi,
Many thanks to Richard for the suggestion that a conditional load is like a
scalar instance of maskload_optab. So this version uses the maskload and
maskstore optabs to expand and generate cfcmov in the ifcvt pass.
All the changes passed bootstrap & regtest on x86-64-pc-linux-gnu.
We also tested spec
LGTM.
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-11-14 00:57
To: gcc-patches
CC: pal...@dabbelt.com; kito.ch...@gmail.com; juzhe.zh...@rivai.ai;
jeffreya...@gmail.com; pan2...@intel.com; rdapp@gmail.com
Subject: [PATCH] RISC-V: Add VLS modes to strided loads.
Hi,
this patch adds V
On Wed, Nov 13, 2024 at 5:06 AM Jovan Vukic wrote:
>
> The patch simplifies expressions (a - 1) & -a, (a - 1) | -a, and (a - 1) ^ -a
> to the constants 0, -1, and -1, respectively.
>
> Currently, GCC does not perform these simplifications.
>
> Bootstrapped and tested on x86-linux-gnu with no regre
On 11/13/24 2:26 PM, Harald Anlauf wrote:
Dear all,
the attached patch is the third part of a series to fix the handling of
NULL() passed to pointer dummy arguments. This one addresses character
dummy arguments (scalar, assumed-shape, assumed-rank) for various
uses in the caller.
The patch is
On Wed, Nov 13, 2024 at 07:03:44PM +, Richard Sandiford wrote:
> Andrew Carlotti writes:
> > On Tue, Nov 12, 2024 at 10:42:50PM +, Richard Sandiford wrote:
> >> Sorry for the slow review. I think Jeff's much better placed to comment
> >> on this than I am, but here's a stab. Mostly it lo
On Wed, Nov 13, 2024 at 5:14 AM Jovan Vukic wrote:
>
> The patch makes the following simplifications:
> ((X - 1) & ~X) < 0 -> X == 0
> ((X - 1) & ~X) >= 0 -> X != 0
>
> On x86, the number of instructions is reduced from 4 to 3,
> but on platforms like RISC-V, it reduces to a single instruction.
>
Indu Bhagat writes:
> Store Allocation Tags (st2g) is an Armv8.5-A memory tagging (MTE)
> instruction. It stores an allocation tag to two tag granules of memory.
>
> TBD:
> - Not too sure what is the best way to generate the st2g yet; A
> subsequent patch will emit them in one of the target
On Thu, Nov 7, 2024 at 1:41 PM Indu Bhagat wrote:
>
> subg (Subtract with Tag) is an Armv8.5-A memory tagging (MTE)
> instruction. It can be used to subtract an immediate value scaled by
> the tag granule from the address in the source register.
>
> gcc/ChangeLog:
>
> * config/aarch64/aar
Indu Bhagat writes:
> subg (Subtract with Tag) is an Armv8.5-A memory tagging (MTE)
> instruction. It can be used to subtract an immediate value scaled by
> the tag granule from the address in the source register.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.md (subg): New definition.
T
This patch seems to have been overlooked.
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663101.html
I ran a set of spec 2017 benchmarks with this patch applied and compared it to
a run without the patch applied. There were no regressions, but 3 benchmarks
had slight improvement in ru
Dear all,
the attached patch is the third part of a series to fix the handling of
NULL() passed to pointer dummy arguments. This one addresses character
dummy arguments (scalar, assumed-shape, assumed-rank) for various
uses in the caller.
The patch is a little larger than I expected, due to corn
Note, in the V2 patch series, I forgot to post this patch.
As part of the architecture flags patches, this patch changes the use of
TARGET_CMPB to TARGET_POWER6. The CMPB instruction was added in power6 (ISA
2.05).
I have built both big endian and little endian bootstrap compilers and there
were
We recently forced -Werror when building libgcc for aarch64, to make
sure we'd catch and fix the kind of problem described in the PR.
In this case, when building for aarch64_be (so, big endian), gcc emits
this warning/error:
libgcc/config/libbid/bid_conf.h:847:25: error: missing braces around
ini
On 10/30/24 11:31 AM, David Faust wrote:
Translate DW_TAG_GNU_annotation DIEs created for C attributes
btf_decl_tag and btf_type_tag into an in-memory representation in the
CTF/BTF container. They will be output in BTF as BTF_KIND_DECL_TAG and
BTF_KIND_TYPE_TAG records.
The new CTF kinds used t
This patch seems to have been overlooked:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/666393.html
--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
On Tue, 12 Nov 2024, Richard Sandiford wrote:
Evgeny Karpov writes:
Hello,
Thank you for reviewing v2!
v3 addresses all comments on v2.
Changes in v3:
- Refactor implementation for the offset limit extension in
"symbol + offset" from 1MB to 16MB.
- Apply HOST_WIDE_INT_PRINT_UNSIGNED in ASM
I've pushed this now.
On Wed, 6 Nov 2024 at 15:50, Jonathan Wakely wrote:
>
> This attempts to simplify and clean up our std::hash code. The primary
> benefit is improved diagnostics for users when they do something wrong
> involving std::hash or unordered containers. An additional benefit is
> t
I've pushed this series now.
On Fri, 8 Nov 2024 at 15:46, Jonathan Wakely wrote:
>
> This patch series attempts to remove some unnecessary complexity in the
> internals of std::unordered_xxx containers. There is a lot of overloading, tag
> dispatching, and inheritance that can be removed by using
This patch makes -mtune=future use the same tuning decision as -mtune=power11.
2024-11-13 Michael Meissner
gcc/
* config/rs6000/power10.md (all reservations): Add future as an
alternative to power10 and power11.
---
gcc/config/rs6000/power10.md | 144 +-
Ping the following patch series to add PowerPC Future support for Dense Math
Registers:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/62.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/63.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/64.html
https://g
The system_time() function used the wrong element of the splits array.
Also add a comment about the units for time measurements.
libstdc++-v3/ChangeLog:
* testsuite/util/testsuite_performance.h (time_counter): Add
comment about times.
(time_counter::system_time): Use corr
This fixes some -Wdeprecated-declarations warnings.
libstdc++-v3/ChangeLog:
* testsuite/performance/ext/pb_ds/hash_int_erase_mem.cc: Replace
std::unary_function with result_type and argument_type typedefs.
* testsuite/util/performance/assoc/multimap_common_type.hpp:
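For readers unfamiliar with the std::unary_function deprecation, here is a minimal sketch of the kind of replacement the ChangeLog entry describes. The functor name int_hash and its body are hypothetical and are not taken from the testsuite files; only the idea of spelling out the two typedefs by hand comes from the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical functor for illustration only.
// Formerly it might have been written as:
//   struct int_hash : std::unary_function<int, std::size_t> { ... };
// which triggers -Wdeprecated-declarations with newer standards.
struct int_hash {
    // The two typedefs that std::unary_function used to provide.
    typedef int argument_type;
    typedef std::size_t result_type;

    result_type operator()(argument_type i) const {
        return static_cast<result_type>(i);
    }
};

// Code that relied on the nested typedefs keeps working.
static_assert(std::is_same<int_hash::argument_type, int>::value,
              "argument_type is still available");
static_assert(std::is_same<int_hash::result_type, std::size_t>::value,
              "result_type is still available");

// Small runtime check of the call operator.
void check_int_hash() {
    assert(int_hash()(7) == 7u);
}
```

Declaring the typedefs directly keeps the functor usable with any code that still inspects argument_type or result_type, without depending on the deprecated base class.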
The results of 'make check-performance' are appended to the .sum file,
with no indication where one set of results ends and the next begins. We
could just remove the file when starting a new run, but appending makes
it a little easier to compare with previous runs, without having to copy
and store
With recent glibc releases the __gthread_active_p() function is always
true, so we always append "-thread" onto performance benchmark names.
Use the __gnu_cxx::__is_single_threaded() function instead.
libstdc++-v3/ChangeLog:
* testsuite/util/testsuite_performance.h: Use
__gnu_cxx
The use of unnamed std::lock_guard temporaries was intentional here, as
they were used like barriers (but std::barrier isn't available until
C++20). But that gives nodiscard warnings, because unnamed temporary
locks are usually unintentional. Use named variables in new block scopes
instead.
libstd
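As a rough illustration of the change described above, the sketch below uses a named std::lock_guard in its own block scope as a barrier-like wait. The function pass_gate_then and the mutex name gate are hypothetical and do not come from the libstdc++ sources.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical helper: wait until `gate` can be acquired, release it
// again straight away, then carry on with the work.
int pass_gate_then(int value, std::mutex& gate) {
    // An unnamed temporary such as `std::lock_guard<std::mutex>{gate};`
    // would also lock and unlock immediately, but it looks like a mistake
    // and, per the message above, draws a nodiscard warning.
    {
        std::lock_guard<std::mutex> wait_for_gate(gate);
        // The lock is released at the closing brace of this block.
    }
    return value + 1;
}

// Small usage check: the mutex must be free again after the call.
void check_pass_gate() {
    std::mutex gate;
    assert(pass_gate_then(41, gate) == 42);
    assert(gate.try_lock());
    gate.unlock();
}
```

The extra block scope keeps the lock's lifetime as short as the unnamed temporary's was, so the behaviour should be unchanged while the intent is explicit.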
This patch appears to be overlooked:
The first link is the long explanation of the patch, and the second link is the
patch itself.
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667451.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667452.html
--
Michael Meissner, IBM
PO
On Fri, Nov 08, 2024 at 02:28:11PM -0600, Peter Bergner wrote:
> On 11/8/24 1:44 PM, Michael Meissner wrote:
> > diff --git a/gcc/config/rs6000/rs6000-arch.def
> > b/gcc/config/rs6000/rs6000-arch.def
> > new file mode 100644
> > index 000..e5b6e958133
> > --- /dev/null
> > +++ b/gcc/config
This patch adds the support that can be used in developing GCC support for
future PowerPC processors.
2024-11-13 Michael Meissner
* config.gcc (powerpc*-*-*): Add support for --with-cpu=future.
* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future.
* conf
Two tests used -mvsx to raise the processor level to at least power7. These
tests were rewritten to add cpu=power7 support.
I have built both big endian and little endian bootstrap compilers and there
were no regressions.
In addition, I constructed a test case that used every architecture define
As part of the architecture flags patches, this patch changes the use of
TARGET_MODULO to TARGET_POWER9. The modulo instructions were added in power9
(ISA 3.0). Note, I did not change the uses of TARGET_MODULO where it was explicitly
generating different code if the machine had a modulo instruct
This patch restructures the code so that -mvsx for example will not silently
convert the processor to power7. The user must now use -mcpu=power7 or higher.
This means if the user does -mvsx and the default processor does not have VSX
support, it will be an error.
I have built both big endian and
As part of the architecture flags patches, this patch changes the use of
TARGET_POPCNTD to TARGET_POWER7. The POPCNTD instruction was added in power7
(ISA 2.06).
I have built both big endian and little endian bootstrap compilers and there
were no regressions.
In addition, I constructed a test ca
As part of the architecture flags patches, this patch changes the use of
TARGET_FPRND to TARGET_POWER5X. The FPRND instruction was added in power5+.
I have built both big endian and little endian bootstrap compilers and there
were no regressions.
In addition, I constructed a test case that used
As part of the architecture flags patches, this patch changes the use of
TARGET_POPCNTB to TARGET_POWER5. The POPCNTB instruction was added in ISA 2.02
(power5).
I have built both big endian and little endian bootstrap compilers and there
were no regressions.
In addition, I constructed a test ca
These patches replace the first patch in the 11 patch set that separates
PowerPC architecture bits from ISA flags that use command line options.
The V2 patch thread starts at:
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668177.html
There are two differences from the previous patches:
For the newer architectures, this patch changes GCC to define the _ARCH_PWR
macros using the new architecture flags instead of relying on isa options like
-mpower10.
The -mpower8-internal, -mpower10, and -mpower11 options were removed. The
-mpower11 option was removed completely, since it was jus
Note, this patch fixes the attribution and the copyright year from the previous
V2 page.
This patch begins the journey to move architecture bits that are not user ISA
options from rs6000_isa_flags to a new target variable rs6000_arch_flags. The
intention is to remove switches that are currently is
Unchanged in v4
Changed in v3:
* Moved the testsuite to a separate patch
* Updated copyright year
* class text_sink: New.
* class file: Add default ctor, copy ctor, move ctor; make m_inner
non-const
* class physical_location: Add default ctor
* class logical_location: Make m_inner non-const
* cl
Changed in v4:
* Fix SARIF schema URL
* Various changes to help with API docs
Changed in v3:
* split out the C and C++ API tests into this patch
* heavily rewritten libdiagnostics.exp; added support for Python tests
* tests updated for API changes, rewritten and extended
gcc/testsuite/ChangeLog:
This patch implements JSON parsing support.
It's based on the parsing parts of the patch I posted here:
https://gcc.gnu.org/legacy-ml/gcc-patches/2017-08/msg00417.html
with the parsing moved to a separate source file and header, heavily
rewritten to capture source location information for JSON val
Changed in v4:
* added DIAGNOSTIC_SARIF_VERSION_2_2_PRERELEASE
Changed in v3:
* Added support for execution paths
* Moved the test cases to another patch
* diagnostic_manager_add_sarif_sink: add param "main_input_file"
* Added diagnostic_text_sink_set_colorize
* Added DIAGNOSTIC_LEVEL_SORRY
* Upda
I need to use this cleanup logic for the testsuite for libdiagnostics
where it's too awkward to directly use gcc-dg.exp itself.
No functional change intended.
gcc/testsuite/ChangeLog:
* lib/dg-test-cleanup.exp: New file, from material moved from
lib/gcc-dg.exp.
* lib/gcc-d
gcc/ChangeLog:
* doc/libdiagnostics/Makefile: New file.
* doc/libdiagnostics/conf.py: New file.
* doc/libdiagnostics/index.rst: New file.
* doc/libdiagnostics/make.bat: New file.
* doc/libdiagnostics/topics/diagnostic-manager.rst: New file.
* doc/libd
Changed in v4:
* Updated for the various changes to diagnostics in trunk
* Reimplement FAIL_IF_NULL to stop checks being optimized away
Changed in v3:
* Added a --enable-libdiagnostics to configure.ac. It is disabled
by default, and requires --enable-host-shared.
* Split out gcc/testsuite/libdi
Here's v4 of my patch kit for "libdiagnostics", which makes GCC's
diagnostics subsystem available as a shared library; see:
https://gcc.gnu.org/wiki/libdiagnostics
New in v4:
* tutorial and API documentation (see patch 4)
* added DIAGNOSTIC_SARIF_VERSION_2_2_PRERELEASE
* reimplemented FAIL_IF_NU
The fix for PR117191
Wrong code appears after the dse2 pass because it removes necessary insns
(i.e. insn 554, a store to a frame spill slot).
This happens because the LRA pass doesn't clean up the code exactly like reload
does.
The reload1.c has a special pass for such cleanup.
The reload removes CLOBBER in
Andrew Carlotti writes:
> On Tue, Nov 12, 2024 at 10:42:50PM +, Richard Sandiford wrote:
>> Sorry for the slow review. I think Jeff's much better placed to comment
>> on this than I am, but here's a stab. Mostly it looks really good to me
>> FWIW.
>>
>> Andrew Carlotti writes:
>> > This pa
On Wed, 2024-08-21 at 10:34 +0200, Richard Biener wrote:
> On Wed, Aug 21, 2024 at 2:01 AM David Malcolm
> wrote:
> >
> > On Tue, 2024-08-20 at 11:49 +0200, Richard Biener wrote:
> > > On Thu, Aug 15, 2024 at 8:13 PM David Malcolm
> > >
> > > wrote:
> > > >
> > > > Here's v3 of my patch kit for
Richard Biener writes:
> On Tue, 12 Nov 2024, Richard Sandiford wrote:
>
>> Sorry for the slow review. I think Jeff's much better placed to comment
>> on this than I am, but here's a stab. Mostly it looks really good to me
>> FWIW.
>>
>> Andrew Carlotti writes:
>> > This pass is used to optimi
On Tue, Nov 12, 2024 at 10:42:50PM +, Richard Sandiford wrote:
> Sorry for the slow review. I think Jeff's much better placed to comment
> on this than I am, but here's a stab. Mostly it looks really good to me
> FWIW.
>
> Andrew Carlotti writes:
> > This pass is used to optimise assignment
There are some SVE intrinsics that support one set of suffixes for
one extension (E1, say) and another set of suffixes for another
extension (E2, say). It is usually the case that, mutatis mutandis,
E2 extends E1. Listing E1 first would then ensure that the manual
C overload would also require E1
This patch adds support for the following intrinsics:
- svmlalb[_f16_mf8]_fpm
- svmlalb[_n_f16_mf8]_fpm
- svmlalt[_f16_mf8]_fpm
- svmlalt[_n_f16_mf8]_fpm
- svmlalb_lane[_f16_mf8]_fpm
- svmlalt_lane[_f16_mf8]_fpm
- svmlallbb[_f32_mf8]_fpm
- svmlallbb[_n_f32_mf8]_fpm
- svmlallbt[_f32_mf8]_fpm
- svml
This is a small speed up. If there is only one known stack variable, there
is no reason to figure out the scope conflicts, as there are none. So don't
go through all the live range calculations just to see there are none.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
If s-trasym.adb (System.Traceback.Symbolic, used as a renaming by
GNAT.Traceback.Symbolic) is given a traceback from a
position-independent executable, it does not include the executable's
load address in the report. The load address is needed in order to decode
the traceback report.
Note, this has already
Hi,
this patch adds VLS modes to the strided load expanders.
Regtested on rv64gcv and handing it over to the CI.
Regards
Robin
gcc/ChangeLog:
* config/riscv/autovec.md: Add VLS modes.
* config/riscv/vector-iterators.md: Ditto.
* config/riscv/vector.md: Ditto.
---
gcc/
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
The ACLE defines a new set of fp8 vector types and intrinsics that operate on
these, some of them operating on the vectors as if they were bags of bits and
some requiring an additional argument of type fpm_t.
The following patches introduce:
- the types
- intrinsics that operate without the fpm_
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
- svcvtlt2_f16[_mf8]_fpm
- svcvtn_mf8[_f16_x2]_fpm (unpredicated)
- svcvtnb_mf8[_f32_
When configuring GCC for RV32EC with:
./configure \
--target=riscv32-none-elf \
--with-multilib-generator="rv32ec-ilp32e--" \
--with-abi=ilp32e \
--with-arch=rv32ec
Then the build fails becaus
This patch adds support for the following intrinsics:
- svmlalb[_f16_mf8]_fpm
- svmlalb[_n_f16_mf8]_fpm
- svmlalt[_f16_mf8]_fpm
- svmlalt[_n_f16_mf8]_fpm
- svmlalb_lane[_f16_mf8]_fpm
- svmlalt_lane[_f16_mf8]_fpm
- svmlallbb[_f32_mf8]_fpm
- svmlallbb[_n_f32_mf8]_fpm
- svmlallbt[_f32_mf8]_fpm
- svml
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
- svcvtlt2_f16[_mf8]_fpm
- svcvtn_mf8[_f16_x2]_fpm (unpredicated)
- svcvtnb_mf8[_f32_
The ACLE defines a new set of fp8 vector types and intrinsics that operate on
these, some of them operating on the vectors as if they were bags of bits and
some requiring an additional argument of type fpm_t.
The following patches introduce:
- the types
- intrinsics that operate without the fpm_
Please disregard this series, posted as v2 by mistake.
Cheers,
Claudio
On 11/13/2024 4:34 PM, Claudio Bantaloukas wrote:
The ACLE defines a new set of fp8 vector types and intrinsics that operate on
these, some of them operating on the vectors as if they were bags of bits and
some requiring an
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
Takayuki, thank you for the quick fix!
It seems to work well now, except for one degradation. Instead of generating two
instructions:
7 ptr += (i & 1);
0x40078564 <+12>: extui a9, a8, 0, 1
0x40078567 <+15>: addx2 a2, a9, a2
Now it generates three:
7 ptr
Hi!
I'd like to add selftests for an aspect of the GCC/nvptx back end's
multilib configuration, outside of the language front ends: at
Makefile/shell level. Looking into GCC's selftest implementation,
I found one issue to potentially refactor:
On 2018-10-13T09:12:03-0400, David Malcolm wrote:
>
On Nov 13 2024, Michael Matz wrote:
> @@ -31658,6 +31660,17 @@ requires @code{.plt} and @code{.got}
> sections that are both writable and executable.
> This is a PowerPC 32-bit SYSV ABI option.
>
> +@opindex msplit-patch-nops
> +@item -msplit-patch-nops
> +When adding NOPs for a patchable area
Hello,
this is essentially
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651025.html
from Kewen in functionality. When discussing this with Segher at the
Cauldron he expressed reservations about changing the default
implementation of -fpatchable-function-entry. So, to move forward, l
On Wed 2024-11-13 15:18:32, Jan Hubicka wrote:
> > - sincos and all functions working with arrays ... Because these
> > functions have pointer arguments and that would require a bigger
> > rework of ix86_veclibabi_aocl(). Also, I'm not sure if GCC even ever
> > generates calls to these funct
Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11. Committed
to all active branches.
Dave
---
hppa: Remove inner `fix:SF/DF` from fixed-point patterns
2024-11-13 John David Anglin
gcc/ChangeLog:
PR target/117525
* config/pa/pa.md (fix_truncsfsi2): Remove inner `fix:S
OK.
For your other patch I suggest you resubmit with the RISC-V typo fixed so the
CI can pick it up. Generally, it looks reasonable.
--
Regards
Robin
Yury Khrustalev writes:
> From: Szabolcs Nagy
>
> Nonlocal stack save and restore has to also save and restore the GCS
> pointer. This is used in __builtin_setjmp/longjmp and nonlocal goto.
>
> The GCS specific code is only emitted if GCS branch-protection is
> enabled and the code always checks
On Wed, 6 Nov 2024, Jan Hubicka wrote:
> Hi,
> this is updated patch which adds -fmalloc-dce flag to control malloc/free
> removal. I ended up copying what -fallocation-dse does so -fmalloc-dce=1
> enables malloc/free removal provided return value is unused otherwise and
> -fmalloc-dce=2 allows a
> - sincos and all functions working with arrays ... Because these
> functions have pointer arguments and that would require a bigger
> rework of ix86_veclibabi_aocl(). Also, I'm not sure if GCC even ever
> generates calls to these functions.
GCC is able to recognize sin and cos calls and tu
> On Tue, Nov 12, 2024 at 04:00:03PM +0100, Jan Hubicka wrote:
> > Hi,
> > with __builtin_operator_new we now can optimize away unused std::vectors.
> > This adds testcases mentioned in the PR.
> >
> > Regtested x86_64-linux and committed.
> >
> > PR tree-optimization/96945
> >
> > gcc/testsu
On Tue, Nov 12, 2024 at 2:15 AM Jeff Law wrote:
> > +
> > +
> > +/* Generate assembly to calculate CRC using clmul instruction.
> > + The following code will be generated when the CRC and data sizes are
> equal:
> > + li a4,quotient
> > + li a5,polynomial
> > + xor a0,
In addition to a single DR we also require a single lane, not a splat.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
PR tree-optimization/117554
* tree-vect-stmts.cc (get_group_load_store_type): We can
use gather/scatter only for a single-lane single element gr
The following fixes SLP live lane generation for load-lanes which
fails to analyze for gcc.dg/vect/vect-live-slp-3.c because the
VLA division doesn't work out but it would also wrongly use the
transposed vector defs I think. The following properly disables
the actual load-lanes SLP node from live
> Hi!
>
> clang++ adds __builtin_operator_{new,delete} builtins which as documented
> work similarly to ::operator {new,delete}, except that it is an error
> if the called ::operator {new,delete} is not a replaceable global operator
> and allow optimizations which C++ normally allows just when tho
Hi PA,
thanks for the updated patch!
Paul-Antoine Arras wrote:
OpenMP: C++ front-end support for dispatch + adjust_args
This patch adds C++ support for the `dispatch` construct and the `adjust_args`
clause. It relies on the c-family bits comprised in the corresponding C
f
Hi Richard,
> ...I still think we should avoid testing can_create_pseudo_p.
> Does it work with the last part replaced by:
>
> if (!DECIMAL_FLOAT_MODE_P (mode))
> {
> if (aarch64_can_const_movi_rtx_p (src, mode)
> || aarch64_float_const_representable_p (src)
> || aarch64
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r15-5202-g5ace2b23199f42.
gcc/analyzer/ChangeLog:
* checker-path.cc (checker_path::debug): Explicitly use
global_dc's reference printer.
* diagnostic-manager.cc
(diagnostic_manager::pr
Richard Sandiford writes:
> Yury Khrustalev writes:
>> From: Szabolcs Nagy
>>
>> Add new builtins for GCS:
>>
>> void *__builtin_aarch64_gcspr (void)
>> uint64_t __builtin_aarch64_gcspopm (void)
>> void *__builtin_aarch64_gcsss (void *)
>>
>> The builtins are always enabled, but should be
The patch makes the following simplifications:
((X - 1) & ~X) < 0 -> X == 0
((X - 1) & ~X) >= 0 -> X != 0
On x86, the number of instructions is reduced from 4 to 3,
but on platforms like RISC-V, it reduces to a single instruction.
Bootstrapped and tested on x86-linux-gnu with no regressions.
gcc
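To see why the two rewrites in this message hold, the sketch below checks them for 32-bit values. It is only an illustration and not the patch's match.pd pattern: the helper names are made up, and the arithmetic is done on unsigned values so that X - 1 does not overflow for INT32_MIN. The key observation is that (X - 1) & ~X is a mask of the trailing zero bits of X, so its sign bit is set only when X is 0.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when ((x - 1) & ~x) has its sign bit set, i.e. when the
// expression would compare "< 0" as a signed 32-bit value.
// Computed in unsigned arithmetic to avoid signed-overflow UB.
bool lhs_is_negative(std::int32_t x) {
    const std::uint32_t ux = static_cast<std::uint32_t>(x);
    const std::uint32_t masked = static_cast<std::uint32_t>((ux - 1u) & ~ux);
    return (masked >> 31) != 0u;
}

// ((X - 1) & ~X) < 0   is claimed to be equivalent to   X == 0
// ((X - 1) & ~X) >= 0  is claimed to be equivalent to   X != 0
bool matches_simplification(std::int32_t x) {
    return lhs_is_negative(x) == (x == 0);
}

// Spot-check a few values, including the extremes.
void check_simplification() {
    const std::int32_t samples[] = {0, 1, -1, 2, 8, 12345, INT32_MAX, INT32_MIN};
    for (std::int32_t x : samples)
        assert(matches_simplification(x));
}
```

For X == 0 both X - 1 and ~X are all ones, so the sign bit is set. For any other X the lowest set bit of X clears every higher bit of the mask, including the sign bit.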
Hybrid analysis is confused by the mask_conversion pattern making a
uniform mask non-uniform. As load/store lanes only uses a single
lane to mask all data lanes the SLP graph doesn't cover the alternate
(redundant) mask lanes and thus their pattern defs. The following adds
a hack to mark them cov
The patch simplifies expressions (a - 1) & -a, (a - 1) | -a, and (a - 1) ^ -a
to the constants 0, -1, and -1, respectively.
Currently, GCC does not perform these simplifications.
Bootstrapped and tested on x86-linux-gnu with no regressions.
gcc/ChangeLog:
* match.pd: New pattern.
gcc/t
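The three identities above follow from -a == ~(a - 1) in two's complement, so each expression combines a value with its own complement. The sketch below checks this for unsigned int, where the wrap-around is well defined. The function names are hypothetical and the code is not taken from the patch or its testcases.

```cpp
#include <cassert>
#include <climits>

// Each form combines (a - 1) with -a, which equals ~(a - 1).
unsigned and_form(unsigned a) { return (a - 1u) & -a; }  // expected: 0
unsigned or_form(unsigned a)  { return (a - 1u) | -a; }  // expected: all ones (-1)
unsigned xor_form(unsigned a) { return (a - 1u) ^ -a; }  // expected: all ones (-1)

bool identities_hold(unsigned a) {
    return and_form(a) == 0u && or_form(a) == ~0u && xor_form(a) == ~0u;
}

// Spot-check a few values, including the wrap-around cases 0 and UINT_MAX.
void check_identities() {
    const unsigned samples[] = {0u, 1u, 2u, 6u, 12345u, UINT_MAX};
    for (unsigned a : samples)
        assert(identities_hold(a));
}
```

Since v & ~v is 0 while v | ~v and v ^ ~v are all ones for any v, the results do not depend on a, which is what allows folding to constants.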
Hi Honza,
Here is the second version of the patch.
On Mon 2024-11-11 18:31:47, Jan Hubicka wrote:
> > We currently support generating vectorized math calls to the AMD core
> > math library (ACML) (-mveclibabi=acml). That library is end-of-life and
> > its successor is the math library from AMD O
Yury Khrustalev writes:
> From: Richard Ball
>
> This patch adds a new testcase and docs for indirect_return
> attribute.
>
> gcc/ChangeLog:
>
> * doc/extend.texi: Add AArch64 docs for indirect_return
> attribute.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/indirect_re
Yury Khrustalev writes:
> From: Szabolcs Nagy
>
> Tail calls of indirect_return functions from non-indirect_return
> functions are disallowed even if BTI is disabled, since the call
> site may have BTI enabled.
>
> Following x86, mismatching attribute on function pointers is not
> a type error ev
Jennifer Schmitz writes:
> As follow-up to
> https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665472.html,
> this patch implements folding of svmul and svdiv by -1 to svneg for
> unsigned SVE vector types. The key idea is to reuse the existing code that
> does this fold for signed types and
Hi Eric,
On Thu, Oct 17, 2024 at 03:20:11PM GMT, Eric Gallager wrote:
> On Thu, Oct 17, 2024 at 10:54 AM Alejandro Colomar wrote:
> >
> > Just like we already do for git-send-email(1). In some cases, patches
> > are prepared with git-format-patch(1), but are sent with a different
> > program, or