On Wed, 13 Nov 2024, Soumya AR wrote:
>
>
> > On 12 Nov 2024, at 6:19 PM, Richard Biener wrote:
> >
> > On Mon, 11 Nov 2024, Soumya AR wrote:
> >
> >> Hi Richard,
> >>
> >>> On 7 Nov 2024, at 6:10 PM, Richard Biener wrot
On Wed, 13 Nov 2024, Tamar Christina wrote:
> > -Original Message-
> > From: Richard Biener
> > Sent: Monday, November 11, 2024 12:17 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Richard Sandiford ; Tamar Christina
> >
> > Subject: [PATCH 1/2] Add suggested_epilogue_mode to vector costs
>
On Tue, 12 Nov 2024, Jakub Jelinek wrote:
> Hi!
>
> As mentioned in an earlier thread, C2Y voted in a change which made
> various library APIs callable with NULL arguments in certain cases,
> e.g.
> memcpy (NULL, NULL, 0);
> is now valid, although
> memcpy (NULL, NULL, 1);
> remains invalid. Thi
error: unrecognizable insn:
(insn 35 34 36 2 (set (subreg:RVVM1SF (reg/v:RVVM1x4SF 142 [ _r ]) 0)
(unspec:RVVM1SF [
(const_vector:RVVM1SF repeat [
(const_double:SF 0.0 [0x0.0p+0])
])
(reg:DI 0 zero)
On Wed, Nov 13, 2024 at 10:13:23AM +0100, Richard Biener wrote:
> > /* Return true if OP can be inferred to be a non-NULL after STMT
> > - executes by using attributes. */
> > + executes by using attributes. If OP2 is non-NULL and nonnull_if_nonzero
> > + is the only attribute implying OP
Hi Eric,
On Thu, Oct 17, 2024 at 03:20:11PM GMT, Eric Gallager wrote:
> On Thu, Oct 17, 2024 at 10:54 AM Alejandro Colomar wrote:
> >
> > Just like we already do for git-send-email(1). In some cases, patches
> > are prepared with git-format-patch(1), but are sent with a different
> > program, or
Hi Honza,
Here is the second version of the patch.
On Mon 2024-11-11 18:31:47, Jan Hubicka wrote:
> > We currently support generating vectorized math calls to the AMD core
> > math library (ACML) (-mveclibabi=acml). That library is end-of-life and
> > its successor is the math library from AMD O
The patch simplifies expressions (a - 1) & -a, (a - 1) | -a, and (a - 1) ^ -a
to the constants 0, -1, and -1, respectively.
Currently, GCC does not perform these simplifications.
Bootstrapped and tested on x86-linux-gnu with no regressions.
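The three results follow from the two's-complement identity -a == ~(a - 1): an AND of a value with its complement is 0, while OR and XOR give all ones. A minimal sketch for checking this on sample inputs is below. The function name identities_hold is made up for illustration and is not part of the patch; unsigned arithmetic is an assumption made here so that a - 1 and -a wrap rather than overflow.

```c
#include <stdint.h>

/* Illustrative helper (not from the patch): returns 1 if the three
   identities hold for A, 0 otherwise.  Unsigned arithmetic wraps, so
   the bit patterns match the two's-complement signed case.  */
int identities_hold(uint32_t a)
{
    uint32_t m = a - 1u;   /* a - 1 */
    uint32_t n = 0u - a;   /* -a, which equals ~(a - 1) */

    return (m & n) == 0u             /* (a - 1) & -a ==  0 */
        && (m | n) == UINT32_MAX     /* (a - 1) | -a == -1 (all ones) */
        && (m ^ n) == UINT32_MAX;    /* (a - 1) ^ -a == -1 (all ones) */
}
```

Calling identities_hold on edge values such as 0, 1, 0x80000000 and UINT32_MAX is a quick sanity check; it says nothing about how match.pd expresses the pattern.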
gcc/ChangeLog:
* match.pd: New pattern.
gcc/t
Hybrid analysis is confused by the mask_conversion pattern making a
uniform mask non-uniform. As load/store lanes only uses a single
lane to mask all data lanes the SLP graph doesn't cover the alternate
(redundant) mask lanes and thus their pattern defs. The following adds
a hack to mark them cov
On Wed, Nov 13, 2024 at 09:22:45AM +0100, Richard Biener wrote:
> While I'm far from an expert here this doesn't look right and instead the
> const_0_to_255_operand looks bogus to me in not properly taking into
> account 'mode'.
I think the bug is in use of const_0_to_255* predicates with QImode o
Jennifer Schmitz writes:
> As follow-up to
> https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665472.html,
> this patch implements folding of svmul and svdiv by -1 to svneg for
> unsigned SVE vector types. The key idea is to reuse the existing code that
> does this fold for signed types and
Yury Khrustalev writes:
> From: Szabolcs Nagy
>
> Tail calls of indirect_return functions from non-indirect_return
> functions are disallowed even if BTI is disabled, since the call
> site may have BTI enabled.
>
> Following x86, mismatching attribute on function pointers is not
> a type error ev
Yury Khrustalev writes:
> From: Richard Ball
>
> This patch adds a new testcase and docs for indirect_return
> attribute.
>
> gcc/ChangeLog:
>
> * doc/extend.texi: Add AArch64 docs for indirect_return
> attribute.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/indirect_re
The patch makes the following simplifications:
((X - 1) & ~X) < 0 -> X == 0
((X - 1) & ~X) >= 0 -> X != 0
On x86, the number of instructions is reduced from 4 to 3,
but on platforms like RISC-V, it reduces to a single instruction.
Bootstrapped and tested on x86-linux-gnu with no regressions.
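The reasoning behind the fold, as far as I can tell: (X - 1) & ~X is the mask of the trailing zero bits of X, so its sign bit can only be set when every bit of X is zero. A small sketch that checks the sign bit on the unsigned bit pattern is below. The name lhs_is_negative is hypothetical, and the 32-bit width is an assumption for the example only.

```c
#include <stdint.h>

/* Illustrative helper (not from the patch): returns the sign bit of
   (x - 1) & ~x, i.e. 1 exactly when the signed comparison
   ((X - 1) & ~X) < 0 would be true.  Unsigned arithmetic is used so
   that x - 1 wraps for x == 0 instead of overflowing.  */
int lhs_is_negative(uint32_t x)
{
    uint32_t r = (x - 1u) & ~x;   /* mask of the trailing zeros of x */
    return (int)(r >> 31);        /* top bit of the 32-bit result */
}
```

Under these assumptions lhs_is_negative(x) equals (x == 0) for every input, which is the first simplification; the >= 0 form is its negation.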
gcc
Richard Sandiford writes:
> Yury Khrustalev writes:
>> From: Szabolcs Nagy
>>
>> Add new builtins for GCS:
>>
>> void *__builtin_aarch64_gcspr (void)
>> uint64_t __builtin_aarch64_gcspopm (void)
>> void *__builtin_aarch64_gcsss (void *)
>>
>> The builtins are always enabled, but should be
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r15-5202-g5ace2b23199f42.
gcc/analyzer/ChangeLog:
* checker-path.cc (checker_path::debug): Explicitly use
global_dc's reference printer.
* diagnostic-manager.cc
(diagnostic_manager::pr
Hi Richard,
> ...I still think we should avoid testing can_create_pseudo_p.
> Does it work with the last part replaced by:
>
> if (!DECIMAL_FLOAT_MODE_P (mode))
> {
> if (aarch64_can_const_movi_rtx_p (src, mode)
> || aarch64_float_const_representable_p (src)
> || aarch64
> On 12 Nov 2024, at 6:19 PM, Richard Biener wrote:
>
> On Mon, 11 Nov 2024, Soumya AR wrote:
>
>> Hi Richard,
>>
>>> On 7 Nov 2024, at 6:10 PM, Richard Biener wrote:
>>>
On Tue, Nov 12, 2024 at 06:34:39PM +0100, Jakub Jelinek wrote:
> What do you think about this? So far lightly tested.
Unfortunately bootstrap/regtest revealed some issues in the patch,
the tree-ssa-ccp.cc changes break bootstrap because fntype in there
may be NULL and that is what get_nonnull_arg
> -Original Message-
> From: Richard Biener
> Sent: Monday, November 11, 2024 12:17 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford ; Tamar Christina
>
> Subject: [PATCH 1/2] Add suggested_epilogue_mode to vector costs
>
> The following enables targets to suggest the vector mode
On 10/30/24 11:31 AM, David Faust wrote:
gcc/
* doc/extend.texi (Common Variable Attributes): Document new
btf_decl_tag attribute.
(Common Type Attributes): Document new btf_type_tag attribute.
---
gcc/doc/extend.texi | 68 +
On 10/30/24 11:31 AM, David Faust wrote:
Support the btf_decl_tag and btf_type_tag attributes in BTF by creating
and emitting BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records,
respectively, for them.
Some care is required when -gprune-btf is in effect to avoid emitting
decl or type tags for decla
On Tue, Nov 12, 2024 at 2:15 AM Jeff Law wrote:
> > +
> > +
> > +/* Generate assembly to calculate CRC using clmul instruction.
> > + The following code will be generated when the CRC and data sizes are
> equal:
> > + li a4,quotient
> > + li a5,polynomial
> > + xor a0,
> On Tue, Nov 12, 2024 at 04:00:03PM +0100, Jan Hubicka wrote:
> > Hi,
> > with __builtin_operator_new we now can optimize away unused std::vectors.
> > This adds testcases mentioned in the PR.
> >
> > Regtested x86_64-linux and committed.
> >
> > PR tree-optimization/96945
> >
> > gcc/testsu
> - sincos and all functions working with arrays ... Because these
> functions have pointer arguments and that would require a bigger
> rework of ix86_veclibabi_aocl(). Also, I'm not sure if GCC even ever
> generates calls to these functions.
GCC is able to recognize sin and cos calls and tu
Yury Khrustalev writes:
> From: Szabolcs Nagy
>
> Nonlocal stack save and restore has to also save and restore the GCS
> pointer. This is used in __builtin_setjmp/longjmp and nonlocal goto.
>
> The GCS specific code is only emitted if GCS branch-protection is
> enabled and the code always checks
Hello,
this is essentially
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651025.html
from Kewen in functionality. When discussing this with Segher at the
Cauldron he expressed reservations about changing the default
implementation of -fpatchable-function-entry. So, to move forward, l
On Nov 13 2024, Michael Matz wrote:
> @@ -31658,6 +31660,17 @@ requires @code{.plt} and @code{.got}
> sections that are both writable and executable.
> This is a PowerPC 32-bit SYSV ABI option.
>
> +@opindex msplit-patch-nops
> +@item -msplit-patch-nops
> +When adding NOPs for a patchable area
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
Please disregard this series, posted as v2 by mistake.
Cheers,
Claudio
On 11/13/2024 4:34 PM, Claudio Bantaloukas wrote:
The ACLE defines a new set of fp8 vector types and intrinsics that operate on
these, some of them operating on the vectors as if they were bags of bits and
some requiring an
The ACLE defines a new set of fp8 vector types and intrinsics that operate on
these, some of them operating on the vectors as if they were bags of bits and
some requiring an additional argument of type fpm_t.
The following patches introduce:
- the types
- intrinsics that operate without the fpm_
Some intrinsics require setting the fpm register before calling the
specific asm opcode required.
In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
function_instance objects
- updates existing initialisations and call sites.
- updates equalit
The ACLE defines a new set of fp8 vector types and intrinsics that operate on
these, some of them operating on the vectors as if they were bags of bits and
some requiring an additional argument of type fpm_t.
The following patches introduce:
- the types
- intrinsics that operate without the fpm_
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
- svcvtlt2_f16[_mf8]_fpm
- svcvtn_mf8[_f16_x2]_fpm (unpredicated)
- svcvtnb_mf8[_f32_
When configuring GCC for RV32EC with:
./configure \
--target=riscv32-none-elf \
--with-multilib-generator="rv32ec-ilp32e--" \
--with-abi=ilp32e \
--with-arch=rv32ec
Then the build fails becaus
This patch adds the following intrinsics:
- svcvt1_bf16[_mf8]_fpm
- svcvt1_f16[_mf8]_fpm
- svcvt2_bf16[_mf8]_fpm
- svcvt2_f16[_mf8]_fpm
- svcvtlt1_bf16[_mf8]_fpm
- svcvtlt1_f16[_mf8]_fpm
- svcvtlt2_bf16[_mf8]_fpm
- svcvtlt2_f16[_mf8]_fpm
- svcvtn_mf8[_f16_x2]_fpm (unpredicated)
- svcvtnb_mf8[_f32_
This patch adds support for the following intrinsics:
- svmlalb[_f16_mf8]_fpm
- svmlalb[_n_f16_mf8]_fpm
- svmlalt[_f16_mf8]_fpm
- svmlalt[_n_f16_mf8]_fpm
- svmlalb_lane[_f16_mf8]_fpm
- svmlalt_lane[_f16_mf8]_fpm
- svmlallbb[_f32_mf8]_fpm
- svmlallbb[_n_f32_mf8]_fpm
- svmlallbt[_f32_mf8]_fpm
- svml
Hi,
this patch adds VLS modes to the strided load expanders.
Regtested on rv64gcv and handing it over to the CI.
Regards
Robin
gcc/ChangeLog:
* config/riscv/autovec.md: Add VLS modes.
* config/riscv/vector-iterators.md: Ditto.
* config/riscv/vector.md: Ditto.
---
gcc/
If s-trasym.adb (System.Traceback.Symbolic, used as a renaming by
GNAT.Traceback.Symbolic) is given a traceback from a
position-independent executable, it does not include the executable's
load address in the report. That address is needed in order to decode the
traceback report.
Note, this has already
This is a small speed up. If there is only one known stack variable, there
is no reason to figure out the scope conflicts as there are none. So don't
go through all the live range calculations just to see there are none.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
This patch adds support for the following intrinsics:
- svmlalb[_f16_mf8]_fpm
- svmlalb[_n_f16_mf8]_fpm
- svmlalt[_f16_mf8]_fpm
- svmlalt[_n_f16_mf8]_fpm
- svmlalb_lane[_f16_mf8]_fpm
- svmlalt_lane[_f16_mf8]_fpm
- svmlallbb[_f32_mf8]_fpm
- svmlallbb[_n_f32_mf8]_fpm
- svmlallbt[_f32_mf8]_fpm
- svml
There are some SVE intrinsics that support one set of suffixes for
one extension (E1, say) and another set of suffixes for another
extension (E2, say). It is usually the case that, mutatis mutandis,
E2 extends E1. Listing E1 first would then ensure that the manual
C overload would also require E1
> On 13 Nov 2024, at 2:49 PM, Richard Biener wrote:
>
> On Wed, 13 Nov 2024, Soumya AR wrote:
>
>>
>>
>>> On 12 Nov 2024, at 6:19 PM, Richard Biener wrote:
>>>
The early-ra pass often didn't print a dump message when aborting the
allocation. This patch uses a similar helper to the previous patch.
gcc/
* config/aarch64/aarch64-early-ra.cc
(early_ra::record_allocation_failure): New member function.
(early_ra::get_allocno_subgroup):
So far, early_ra has used a single m_allocation_successful bool
to record whether the current region is still being allocated.
But there are (at least) two reasons why we might pull out of
attempting an allocation:
(1) We can't track the liveness of individual FP allocnos,
due to some awkward
At the moment, early-ra ducks out of allocating any region
that contains a register with both a strong FPR affinity and
a strong GPR affinity. The proper allocators are much better
at handling that situation.
But this means that early-ra tends not to allocate a region
of vector code that ends in
ISEL was introduced to translate vector comparison and vector
condition combinations back to internal function calls mapping to
one of the vcond[u][_eq][_mask] and vec_cmp[_eq] optabs. With
removing the legacy non-mask vcond expanders we now rely on all
vector comparisons and vector conditions to
Yury Khrustalev writes:
> Note that compared to __builtin_aarch64_chkfeat (x) the ACLE __chkfeat(x)
> flips the bits to be more intuitive (xor the input to output).
>
> gcc/ChangeLog:
> * config/aarch64/arm_acle.h (__chkfeat): New.
> ---
> gcc/config/aarch64/arm_acle.h | 13 +
>
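To illustrate the bit flip described in the quoted text, here is a rough sketch that runs on any host. It assumes the builtin returns the query with the bits of enabled features cleared, and it models that with a stand-in function. Both function names and the explicit "enabled" parameter are hypothetical; only the xor of input and output comes from the quoted description.

```c
#include <stdint.h>

/* Hypothetical stand-in for __builtin_aarch64_chkfeat.  Assumption:
   bits of QUERY that correspond to enabled features come back cleared.
   ENABLED is a made-up parameter standing in for the hardware state.  */
uint64_t stub_builtin_chkfeat(uint64_t query, uint64_t enabled)
{
    return query & ~enabled;
}

/* Sketch of the ACLE-style wrapper: xor the input into the output so
   that a set bit in the result means "feature enabled".  */
uint64_t stub_acle_chkfeat(uint64_t query, uint64_t enabled)
{
    return stub_builtin_chkfeat(query, enabled) ^ query;
}
```

With this model the wrapper reduces to query & enabled, which is the more intuitive polarity the quoted text refers to.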
This removes the obsolete API use by vector divmod lowering.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
* tree-vect-generic.cc (expand_vector_divmod): Query vector
comparison and vec_cond_mask capability.
---
gcc/tree-vect-generic.cc | 4 +++-
1 file changed, 3 insertio
The following refactors the check with the last remaining
expand_vec_cond_expr_p call with a comparison code to make it
obvious we are not relying on those anymore.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
* tree-vect-stmts.cc (vectorizable_condition): Refactor
target
This series makes some minor tweaks to early-ra. The main patch is
really the last one, which tries to apply early-ra to a situation
that it currently avoids handling. It removes some MOVs from x264,
for a very minor speed improvement.
Bootstrapped & regression-tested on aarch64-linux-gnu. Also
At least on aarch64, modes_tieable_p is a stricter condition than
can_change_mode_class. can_change_mode_class tells us whether the
subreg rules produce a sensible result for a particular mode change.
modes_tieable_p in addition tells us whether a mode change is
reasonable for optimisation purpose
record_insn_refs has three distinct phases: model the definitions,
model any call, and model the uses. This patch splits each phase
out into its own function.
This isn't beneficial on its own, but it helps with later patches.
gcc/
* config/aarch64/aarch64-early-ra.cc
(early_ra::r
When early-ra treats a block as an isolated allocation region,
it opportunistically splits the block into smaller regions
at points where no FPRs or FPR allocnos are live. Previously
it only did this if m_allocation_successful, since the contrary
included cases in which the live range information
When we classify an SLP access as VMAT_ELEMENTWISE we still consider
overrun - the reset of it is later overwritten. The following fixes
this, resolving a few RISC-V FAILs with --param vect-force-slp=1.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
* tree-vect-stmts.cc (get_
Yury Khrustalev writes:
> From: Szabolcs Nagy
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/acle/chkfeat-1.c: New test.
> * gcc.target/aarch64/chkfeat-1.c: New test.
> * gcc.target/aarch64/chkfeat-2.c: New test.
>
> Co-authored-by: Yury Khrustalev
> Co-authored-by: Rich
Yury Khrustalev writes:
> From: Szabolcs Nagy
>
> Add new builtins for GCS:
>
> void *__builtin_aarch64_gcspr (void)
> uint64_t __builtin_aarch64_gcspopm (void)
> void *__builtin_aarch64_gcsss (void *)
>
> The builtins are always enabled, but should be used behind runtime
> checks in case t
The following makes sure to lower all VEC_COND_EXPRs that we cannot
trivially expand.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
* tree-vect-generic.cc (expand_vector_condition): Lower
vector conditions that we cannot trivially expand.
---
gcc/tree-vect-generic.cc | 28
Here is an updated version of the patch following earlier reviews in the
series.
--
PA
commit 8f67de476decf151f853d68eb26223200535cc57
Author: Paul-Antoine Arras
Date: Fri May 24 19:04:35 2024 +0200
OpenMP: common C/C++ testcases for dispatch + adjust_args
gcc/testsuite/ChangeLog:
Since XTheadvector does not support vsetivli, vl needs to be put into
registers during the expand phase.
PR 116593
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins.cc
(function_expander::add_input_operand):
Put const to GPR for vl
* config/riscv/thead-vector.m
On Tue, Nov 12, 2024 at 12:31 AM Jeff Law wrote:
>
>
> On 11/9/24 12:43 PM, Mariam Arutunian wrote:
> > Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster
> > CRC generation.
> > One performs bit-forward and the other bit-reversed CRC computation.
> > If CRC optabs are suppo
On Tue, 12 Nov 2024, Richard Sandiford wrote:
> Sorry for the slow review. I think Jeff's much better placed to comment
> on this than I am, but here's a stab. Mostly it looks really good to me
> FWIW.
>
> Andrew Carlotti writes:
> > This pass is used to optimise assignments to the FPMR regist
> On 12 Nov 2024, at 18:55, Richard Sandiford wrote:
>
> Wilco Dijkstra writes:
>> Hi,
>>
> What do you think about disabling late scheduling as well?
I think this would definitely need separate consideration and evaluation
given the above.
Another thing to con
Pushed, thanks!
On Tue, Nov 5, 2024 at 11:21 AM Yangyu Chen wrote:
>
> This patch series adds support for Function Multi-Versioning (FMV) to
> RISC-V. The FMV feature allows users to specify multiple versions of a
> function and select the version at runtime based on the target_clones
> and targe
On Wed, Nov 13, 2024 at 6:22 AM H.J. Lu wrote:
>
> On Wed, Nov 13, 2024 at 11:25 AM H.J. Lu wrote:
> >
> > On Wed, Nov 13, 2024 at 10:23 AM Hongtao Liu wrote:
> > >
> > > On Wed, Nov 13, 2024 at 8:29 AM H.J. Lu wrote:
> > > >
> > > > On Wed, Nov 13, 2024 at 5:57 AM H.J. Lu wrote:
> > > > >
> >
On Tue, Nov 12, 2024 at 10:42:50PM +, Richard Sandiford wrote:
> Sorry for the slow review. I think Jeff's much better placed to comment
> on this than I am, but here's a stab. Mostly it looks really good to me
> FWIW.
>
> Andrew Carlotti writes:
> > This pass is used to optimise assignment
On Wed, 2024-08-21 at 10:34 +0200, Richard Biener wrote:
> On Wed, Aug 21, 2024 at 2:01 AM David Malcolm
> wrote:
> >
> > On Tue, 2024-08-20 at 11:49 +0200, Richard Biener wrote:
> > > On Thu, Aug 15, 2024 at 8:13 PM David Malcolm
> > >
> > > wrote:
> > > >
> > > > Here's v3 of my patch kit for
Andrew Carlotti writes:
> On Tue, Nov 12, 2024 at 10:42:50PM +, Richard Sandiford wrote:
>> Sorry for the slow review. I think Jeff's much better placed to comment
>> on this than I am, but here's a stab. Mostly it looks really good to me
>> FWIW.
>>
>> Andrew Carlotti writes:
>> > This pa
Here's v4 of my patch kit for "libdiagnostics", which makes GCC's
diagnostics subsystem available as a shared library; see:
https://gcc.gnu.org/wiki/libdiagnostics
New in v4:
* tutorial and API documentation (see patch 4)
* added DIAGNOSTIC_SARIF_VERSION_2_2_PRERELEASE
* reimplemented FAIL_IF_NU
Changed in v4:
* Updated for the various changes to diagnostics in trunk
* Reimplement FAIL_IF_NULL to stop checks being optimized away
Changed in v3:
* Added a --enable-libdiagnostics to configure.ac. It is disabled
by default, and requires --enable-host-shared.
* Split out gcc/testsuite/libdi
The fix for PR117191.
Wrong code appears after the dse2 pass because it removes necessary insns
(i.e. insn 554, a store to a frame spill slot).
This happens because the LRA pass doesn't clean up the code exactly as reload
does.
reload1.c has a special pass for such cleanup.
The reload removes CLOBBER in
gcc/ChangeLog:
* doc/libdiagnostics/Makefile: New file.
* doc/libdiagnostics/conf.py: New file.
* doc/libdiagnostics/index.rst: New file.
* doc/libdiagnostics/make.bat: New file.
* doc/libdiagnostics/topics/diagnostic-manager.rst: New file.
* doc/libd
Richard Biener writes:
> On Tue, 12 Nov 2024, Richard Sandiford wrote:
>
>> Sorry for the slow review. I think Jeff's much better placed to comment
>> on this than I am, but here's a stab. Mostly it looks really good to me
>> FWIW.
>>
>> Andrew Carlotti writes:
>> > This pass is used to optimi
Changed in v4:
* added DIAGNOSTIC_SARIF_VERSION_2_2_PRERELEASE
Changed in v3:
* Added support for execution paths
* Moved the test cases to another patch
* diagnostic_manager_add_sarif_sink: add param "main_input_file"
* Added diagnostic_text_sink_set_colorize
* Added DIAGNOSTIC_LEVEL_SORRY
* Upda
I need to use this cleanup logic for the testsuite for libdiagnostics
where it's too awkward to directly use gcc-dg.exp itself.
No functional change intended.
gcc/testsuite/ChangeLog:
* lib/dg-test-cleanup.exp: New file, from material moved from
lib/gcc-dg.exp.
* lib/gcc-d
Changed in v4:
* Fix SARIF schema URL
* Various changes to help with API docs
Changed in v3:
* split out the C and C++ API tests into this patch
* heavily rewritten libdiagnostics.exp; added support for Python tests
* tests updated for API changes, rewritten and extended
gcc/testsuite/ChangeLog:
This patch implements JSON parsing support.
It's based on the parsing parts of the patch I posted here:
https://gcc.gnu.org/legacy-ml/gcc-patches/2017-08/msg00417.html
with the parsing moved to a separate source file and header, heavily
rewritten to capture source location information for JSON val
Unchanged in v4
Changed in v3:
* Moved the testsuite to a separate patch
* Updated copyright year
* class text_sink: New.
* class file: Add default ctor, copy ctor, move ctor; make m_inner
non-const
* class physical_location: Add default ctor
* class logical_location: Make m_inner non-const
* cl
As part of the architecture flags patches, this patch changes the use of
TARGET_POPCNTB to TARGET_POWER5. The POPCNTB instruction was added in ISA 2.02
(power5).
I have built both big endian and little endian bootstrap compilers and there
were no regressions.
In addition, I constructed a test ca
As part of the architecture flags patches, this patch changes the use of
TARGET_FPRND to TARGET_POWER5X. The FPRND instruction was added in power5+.
I have built both big endian and little endian bootstrap compilers and there
were no regressions.
In addition, I constructed a test case that used
As part of the architecture flags patches, this patch changes the use of
TARGET_POPCNTD to TARGET_POWER7. The POPCNTD instruction was added in power7
(ISA 2.06).
I have built both big endian and little endian bootstrap compilers and there
were no regressions.
In addition, I constructed a test ca
This patch restructures the code so that -mvsx for example will not silently
convert the processor to power7. The user must now use -mcpu=power7 or higher.
This means if the user does -mvsx and the default processor does not have VSX
support, it will be an error.
I have built both big endian and
For the newer architectures, this patch changes GCC to define the _ARCH_PWR
macros using the new architecture flags instead of relying on isa options like
-mpower10.
The -mpower8-internal, -mpower10, and -mpower11 options were removed. The
-mpower11 option was removed completely, since it was jus
Note, this patch fixes the attribution and the copyright year from the previous
V2 page.
This patch begins the journey to move architecture bits that are not user ISA
options from rs6000_isa_flags to a new target variable rs6000_arch_flags. The
intention is to remove switches that are currently is
These patches replace the first patch in the 11 patch set that separates
PowerPC architecture bits from ISA flags that use command line options.
The V2 patch thread starts at:
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668177.html
There are two differences from the previous patches:
As part of the architecture flags patches, this patch changes the use of
TARGET_MODULO to TARGET_POWER9. The modulo instructions were added in power9
(ISA 3.0). Note, I did not change the uses of TARGET_MODULO where it was explicitly
generating different code if the machine had a modulo instruct
Two tests used -mvsx to raise the processor level to at least power7. These
tests were rewritten to add cpu=power7 support.
I have built both big endian and little endian bootstrap compilers and there
were no regressions.
In addition, I constructed a test case that used every architecture define
This patch adds the support that can be used in developing GCC support for
future PowerPC processors.
2024-11-13 Michael Meissner
* config.gcc (powerpc*-*-*): Add support for --with-cpu=future.
* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future.
* conf
Hi PA,
thanks for the updated patch!
Paul-Antoine Arras wrote:
OpenMP: C++ front-end support for dispatch + adjust_args
This patch adds C++ support for the `dispatch` construct and the `adjust_args`
clause. It relies on the c-family bits comprised in the corresponding C
f
> Hi!
>
> clang++ adds __builtin_operator_{new,delete} builtins which as documented
> work similarly to ::operator {new,delete}, except that it is an error
> if the called ::operator {new,delete} is not a replaceable global operator
> and allow optimizations which C++ normally allows just when tho
The following fixes SLP live lane generation for load-lanes which
fails to analyze for gcc.dg/vect/vect-live-slp-3.c because the
VLA division doesn't work out but it would also wrongly use the
transposed vector defs I think. The following properly disables
the actual load-lanes SLP node from live
In addition to a single DR we also require a single lane, not a splat.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
PR tree-optimization/117554
* tree-vect-stmts.cc (get_group_load_store_type): We can
use gather/scatter only for a single-lane single element gr
On Wed 2024-11-13 15:18:32, Jan Hubicka wrote:
> > - sincos and all functions working with arrays ... Because these
> > functions have pointer arguments and that would require a bigger
> > rework of ix86_veclibabi_aocl(). Also, I'm not sure if GCC even ever
> > generates calls to these funct
OK.
For your other patch I suggest you resubmit with the RISC-V typo fixed so the
CI can pick it up. Generally, it looks reasonable.
--
Regards
Robin
Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11. Committed
to all active branches.
Dave
---
hppa: Remove inner `fix:SF/DF` from fixed-point patterns
2024-11-13 John David Anglin
gcc/ChangeLog:
PR target/117525
* config/pa/pa.md (fix_truncsfsi2): Remove inner `fix:S
On Wed, 6 Nov 2024, Jan Hubicka wrote:
> Hi,
> this is updated patch which adds -fmalloc-dce flag to control malloc/free
> removal. I ended up copying what -fallocation-dse does so -fmalloc-dce=1
> enables malloc/free removal provided return value is unused otherwise and
> -fmalloc-dce=2 allows a
Hi!
I'd like to add selftests for an aspect of the GCC/nvptx back end's
multilib configuration, outside of the language front ends: at
Makefile/shell level. Looking into GCC's selftest implementation,
I found one issue to potentially refactor:
On 2018-10-13T09:12:03-0400, David Malcolm wrote:
>
Takayuki, thank you for the quick fix!
It seems to work well now, except for one degradation. Instead of generating two
instructions:
7 ptr += (i & 1);
0x40078564 <+12>: extui a9, a8, 0, 1
0x40078567 <+15>: addx2 a2, a9, a2
Now it generates three:
7 ptr
The system_time() function used the wrong element of the splits array.
Also add a comment about the units for time measurements.
libstdc++-v3/ChangeLog:
* testsuite/util/testsuite_performance.h (time_counter): Add
comment about times.
(time_counter::system_time): Use corr
Ping the following patch series to add PowerPC Future support for Dense Math
Registers:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/62.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/63.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/64.html
https://g