v1 -> v2:
1. Change the code format.
2. Fix bugs in the code.
Both regression tests and spec2006 passed.
The problem mentioned in the link does not move the four immediate load
instructions out of the loop. It has been optimized. Now, as in the test case,
four immediate load instructions are gene
On 2022-10-31 09:18, Eric Botcazou wrote:
hello Eric!
This also changes libstdc++ to pass -D_WIN32_WINNT=0x0600 but only when the
switch --enable-libstdcxx-threads is passed, which means that C++11 threads
are still disabled by default *unless* MinGW-W64 itself is configured for
Windows Vista
Segher Boessenkool writes:
> On Mon, Oct 31, 2022 at 04:13:38PM -0600, Jeff Law wrote:
>> On 10/30/22 20:42, Jiufu Guo via Gcc-patches wrote:
>> >We know that for struct variable assignment, memory copy may be used.
>> >And for memcpy, we may load and store as many bytes as possible at one time.
>>
Jeff Law writes:
> On 10/30/22 20:42, Jiufu Guo via Gcc-patches wrote:
>> Hi,
>>
>> We know that for struct variable assignment, memory copy may be used.
>> And for memcpy, we may load and store as many bytes as possible at one time.
>> While it may not be the best here:
>> 1. Before/after struct variabl
Segher Boessenkool writes:
> Hi!
>
> On Mon, Oct 31, 2022 at 10:42:35AM +0800, Jiufu Guo wrote:
>> #define FN 4
>> typedef struct { double a[FN]; } A;
>>
>> A foo (const A *a) { return *a; }
>> A bar (const A a) { return a; }
>> ///
>>
>> If FN<=2; the size of "A" fits into TImode, then thi
On Tue, Nov 1, 2022 at 9:21 AM Kong, Lingling via Gcc-patches wrote:
>
> Hi
>
> The patch mentions Intel __bf16 support in AVX512BF16 intrinsics.
> OK for master?
>
> Thanks,
> Lingling
>
> ---
> htdocs/gcc-13/changes.html | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/htdocs/g
Tested x86_64-pc-linux-gnu, applying to trunk.
-- >8 --
genericize might introduce function calls (and does on the contracts
branch), so it's safer to set this flag later.
gcc/cp/ChangeLog:
* decl.cc (finish_function): Set TREE_NOTHROW later in the function.
---
gcc/cp/decl.cc | 16 +++
Hi
The patch mentions Intel __bf16 support in AVX512BF16 intrinsics.
OK for master?
Thanks,
Lingling
---
htdocs/gcc-13/changes.html | 2 ++
1 file changed, 2 insertions(+)
diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 7c6bfa6e..cd0282f1 100644
--- a/htdocs/
On Mon, Oct 31, 2022 at 04:13:38PM -0600, Jeff Law wrote:
> On 10/30/22 20:42, Jiufu Guo via Gcc-patches wrote:
> >We know that for struct variable assignment, memory copy may be used.
> >And for memcpy, we may load and store as many bytes as possible at one time.
> >While it may not be the best here:
>
Hi!
On Mon, Oct 31, 2022 at 10:42:35AM +0800, Jiufu Guo wrote:
> #define FN 4
> typedef struct { double a[FN]; } A;
>
> A foo (const A *a) { return *a; }
> A bar (const A a) { return a; }
> ///
>
> If FN<=2; the size of "A" fits into TImode, then this code can be optimized
> (by subreg/cse/
These cases actually don't care about -mabi, they just need 'v' in -march.
Can you tell me how to fix these testcases for "fails on targets without
ilp32d"?
These failures are bogus failures since if you specify -mabi=ilp32d when you
are using a GNU toolchain which is built with "--arch=ilp32
On 10/1/22 12:55, Bernhard Reutner-Fischer wrote:
On Fri, 30 Sep 2022 17:32:34 -0600
Jeff Law wrote:
+ /* This looks good from a CFG standpoint. Now look at the guts
+ of PRED. Basically we want to verify there are no PHI nodes
+ and no real statements. */
+ if (! gimple_seq_emp
On Mon, 31 Oct 2022, FX via Gcc-patches wrote:
> - rounded conversions: converting, from an integer or floating point
> type, into another floating point type, with specific rounding mode
> passed as argument
These don't have standard C names. The way to do these in C would be
using the FENV_
On Mon, 31 Oct 2022 15:00:49 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
On 10/30/22 19:40, juzhe.zh...@rivai.ai wrote:
From: Ju-Zhe Zhong
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/abi-2.c: Change ilp32d to ilp32.
* gcc.target/riscv/rvv/base/abi-3.c: Ditto.
These testcases do not depend on the ABI specification.
I picked the minimum ABI setting so that they won't fail.
The naming of the abi-* tests may be confusing; I can change the naming next
time.
juzhe.zh...@rivai.ai
From: Jeff Law
Date: 2022-11-01 06:00
To: juzhe.zhong; gcc-patches
CC: sc
On 10/29/22 03:01, Xiongchuan Tan wrote:
Reviewed-by: Palmer Dabbelt
Acked-by: Palmer Dabbelt
libitm/ChangeLog:
* configure.tgt: Add riscv support.
* config/riscv/asm.h: New file.
* config/riscv/sjlj.S: New file.
* config/riscv/target.h: New file.
Pushed
On 10/30/22 20:42, Jiufu Guo via Gcc-patches wrote:
Hi,
We know that for struct variable assignment, memory copy may be used.
And for memcpy, we may load and store as many bytes as possible at one time.
While it may not be the best here:
1. Before/after struct variable assignment, the variable may be o
On 10/30/22 19:40, juzhe.zh...@rivai.ai wrote:
From: Ju-Zhe Zhong
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/abi-2.c: Change ilp32d to ilp32.
* gcc.target/riscv/rvv/base/abi-3.c: Ditto.
* gcc.target/riscv/rvv/base/abi-4.c: Ditto.
* gcc.target/ris
On 10/31/22 05:57, Tamar Christina wrote:
Hi All,
The current vector extract pattern can only extract from a vector when the
position to extract is a multiple of the vector bitsize as a whole.
That means extracting something like a V2SI from a V4SI vector from position 32
isn't possible as 32 is
On Linux/x86_64,
259a11555c90783e53c046c310080407ee54a31e is the first bad commit
commit 259a11555c90783e53c046c310080407ee54a31e
Author: Jakub Jelinek
Date: Mon Oct 31 09:09:48 2022 +0100
builtins: Add various complex builtins for _Float{16,32,64,128,32x,64x,128x}
caused
FAIL: g++.dg/ot
On 10/31/22 05:57, Tamar Christina wrote:
Hi All,
This adds a new optab and IFNs for REDUC_PLUS_WIDEN where the resulting
scalar reduction has twice the precision of the input elements.
At some point in a later patch I will also teach the vectorizer to recognize
this builtin once I figure out
On 10/31/22 05:56, Tamar Christina wrote:
Hi All,
This patch series is to add recognition of pairwise operations (reductions)
in match.pd such that we can benefit from them even at -O1 when the vectorizer
isn't enabled.
The use of these allows for much simpler codegen in AArch64 and allows us
On 10/31/22 05:42, Tamar Christina via Gcc-patches wrote:
Hi,
This is a cleaned up version addressing all feedback.
Bootstrapped Regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* match.pd: Add new rule.
gcc/tests
On 10/31/22 05:53, Tamar Christina wrote:
Hi All,
This adds a new test-and-branch optab that can be used to do a conditional test
of a bit and branch. This is similar to the cbranch optab but instead can
test any arbitrary bit inside the register.
This patch recognizes boolean comparisons a
When converting integer computations into vector ones, we build a chain
from an integer definition instruction together with all dependent use
instructions. The integer computations on the chain are converted to
vector ones if the total vector costs are lower than the integer ones.
Since the same
Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
libstdc++-v3/ChangeLog:
* include/std/ranges (as_rvalue_view): Define.
(enable_borrowed_range): Define.
(views::__detail::__can_as_rvalue_view): Define.
(views::_AsRvalue, views::as_rvalue): Define.
Hi Mikael,
thanks a lot, your testcases broke my initial (and incorrect) patch
in multiple ways. I understand now that the right solution is much
simpler and smaller.
I've added your testcases, see attached, with a simple scan of the
dump for the generated order of hidden arguments in the funct
Hi,
Just adding, from the Fortran 2018 perspective, things we will need to
implement for which I think support from the middle-end might be necessary:
- rounded conversions: converting, from an integer or floating point type, into
another floating point type, with specific rounding mode passed
The analyzer's file-descriptor state machine tracks the access mode of
opened files, so that it can emit -Wanalyzer-fd-access-mode-mismatch.
To do this, its symbolic execution needs to "know" the values of the
constants "O_RDONLY", "O_WRONLY", and "O_ACCMODE". Currently
analyzer/sm-fd.cc simply u
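To illustrate why those three constants matter, here is a minimal C sketch of the question the state machine has to answer: given the flags passed to open(), which access mode was requested? The helper name access_mode_name is hypothetical and is not taken from analyzer/sm-fd.cc; the constants are the POSIX ones from <fcntl.h>, and their numeric values vary between targets.

```c
#include <fcntl.h>   /* O_RDONLY, O_WRONLY, O_RDWR, O_ACCMODE (POSIX) */

/* Hypothetical helper: report the access mode encoded in an open() flags
   word.  O_ACCMODE masks off everything except the access-mode bits, so
   extra flags such as O_CREAT do not affect the result.  */
const char *
access_mode_name (int flags)
{
  switch (flags & O_ACCMODE)
    {
    case O_RDONLY:
      return "read-only";
    case O_WRONLY:
      return "write-only";
    case O_RDWR:
      return "read-write";
    default:
      return "unknown";
    }
}
```

A write() on a descriptor whose mode is classified as "read-only" is the kind of mismatch the warning is meant to report.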
On 10/21/22 2:28 AM, Indu Bhagat via Gcc-patches wrote:
On 10/19/22 19:05, Guillermo E. Martinez wrote:
Hello,
The following is patch v4 to update BTF/CTF backend supporting
BTF_KIND_ENUM64 type. Changes from v3:
+ Remove `ctf_enum_binfo' structure.
+ Remove -m{little,big}-endian from dg
On Fri, 28 Oct 2022, Jeff Law via Gcc-patches wrote:
> Joseph, do you have bits in this space that are going to be landing soon, or
> is your C2X work focused elsewhere? Are there other C2X routines we need to
> be proving builtins for?
I don't have any builtins work planned for GCC 13 (maybe ad
On 10/31/22 05:34, Tamar Christina wrote:
The type of the expression should be available via the mode and the
signedness, no? So maybe to avoid having both RTX and TREE on the target
hook pass it a wide_int instead for the divisor?
Done.
Bootstrapped Regtested on aarch64-none-linux-gnu, x86
On Mon, 31 Oct 2022 at 17:03, Eric Botcazou wrote:
>
> > I suppose we could use memcmp on the as variable itself, to inspect
> > the actual stored padding rather than the returned copy of it.
>
> Yes, that's probably the only safe stance when optimization is enabled.
Strictly speaking, it's not
On Mon, 31 Oct 2022 at 16:57, Jakub Jelinek wrote:
>
> On Mon, Oct 31, 2022 at 10:26:11AM +, Jonathan Wakely wrote:
> > > --- libstdc++-v3/include/std/complex.jj 2022-10-21 08:55:43.037675332 +0200
> > > +++ libstdc++-v3/include/std/complex 2022-10-21 17:05:36.802243229 +0200
> I suppose we could use memcmp on the as variable itself, to inspect
> the actual stored padding rather than the returned copy of it.
Yes, that's probably the only safe stance when optimization is enabled.
--
Eric Botcazou
On Mon, Oct 31, 2022 at 10:26:11AM +, Jonathan Wakely wrote:
> > --- libstdc++-v3/include/std/complex.jj 2022-10-21 08:55:43.037675332 +0200
> > +++ libstdc++-v3/include/std/complex 2022-10-21 17:05:36.802243229 +0200
> > @@ -142,8 +142,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >
> >/
On 10/30/22 19:44, Cui, Lili wrote:
On 10/20/22 19:52, Cui, Lili via Gcc-patches wrote:
Hi Honza,
Gentle ping
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601934.html
gcc/ChangeLog
* ipa-inline-analysis.cc (do_estimate_edge_time): Add function attribute
judgement for INL
On 10/31/22 05:51, Tamar Christina via Gcc-patches wrote:
Hi All,
Here's a respin addressing review comments.
Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* match.pd: Add bitfield and shift folding
On 10/31/22 05:38, Tamar Christina via Gcc-patches wrote:
Hi All,
This is a respin with all feedback addressed.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* match.pd: Add fneg/fadd rule.
gcc/testsuite/ChangeLog:
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
Makefile.in (MPURE_CODE): New macro defines __PURE_CODE__.
(gcc_compile): Appended MPURE_CODE.
lib1funcs.S (FUNC_START_SECTION): Set flags for __PURE_CODE__.
clz2.S (__clzsi2): Added -mpure-code compatible instructions.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/eabi/fcast.S (__aeabi_h2f, __aeabi_f2h): Added functions.
* config/arm/fp16 (__gnu_f2h_ieee, __gnu_h2f_ieee,
__gnu_f2h_alternative,
__gnu_h2f_alternative): Disable build for v6m multilibs.
* config/arm/t-b
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bpabi-lib.h (muldi3): Removed duplicate.
(fixunssfsi): Removed obsolete RENAME_LIBRARY directive.
* config/arm/eabi/ffixed.S (__aeabi_f2iz, __aeabi_f2uiz,
__aeabi_f2lz, __aeabi_f2ulz): New file.
* co
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bpabi-lib.h (__floatdisf, __floatundisf):
Remove obsolete RENAME_LIBRARY directives.
* config/arm/eabi/ffloat.S (__aeabi_i2f, __aeabi_l2f, __aeabi_ui2f,
__aeabi_ul2f): New file.
* config/arm/lib1fun
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/eabi/fcast.S (__aeabi_d2f, __aeabi_f2d): New file.
* config/arm/lib1funcs.S: #include eabi/fcast.S (v6m only).
* config/arm/t-elf (LIB1ASMFUNCS): Added _arm_d2f and _arm_f2d.
---
libgcc/config/arm/eabi/fcast.S | 2
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/eabi/fdiv.S (__divsf3, __fp_divloopf): New file.
* config/arm/lib1funcs.S: #include eabi/fdiv.S (v6m only).
* config/arm/t-elf (LIB1ASMFUNCS): Added _divsf3 and _fp_divloopf.
---
libgcc/config/arm/eabi/fdiv.S | 26
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/eabi/fmul.S (__mulsf3): New file.
* config/arm/lib1funcs.S: #include eabi/fmul.S (v6m only).
* config/arm/t-elf (LIB1ASMFUNCS): Moved _mulsf3 to global scope
(this object was previously blocked on v6m build
With the complete CM0 library integrated, regression testing showed new
failures with the message "compilation failed to produce executable":
gcc.dg/fixed-point/convert-float-1.c
gcc.dg/fixed-point/convert-float-3.c
gcc.dg/fixed-point/convert-sat.c
Investigating, this appears to be ca
Since this is the first import of single-precision functions, some common
parsing and formatting routines are also included. These common routines
will be referenced by other functions in subsequent commits.
However, even if the size penalty is accounted entirely to __addsf3(),
the total compiled s
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bpabi-v6m.S (__aeabi_cfcmpeq, __aeabi_cfcmple,
__aeabi_cfrcmple, __aeabi_fcmpeq, __aeabi_fcmple, aeabi_fcmple,
__aeabi_fcmpgt, aeabi_fcmpge): Moved to ...
* config/arm/eabi/fcmp.S: New file.
* confi
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bpabi-v6m.S (__aeabi_ldivmod/ldivmod): Moved to ...
* config/arm/eabi/ldiv.S: New file.
* config/arm/lib1funcs.S: #include eabi/ldiv.S (v6m only).
---
libgcc/config/arm/bpabi-v6m.S | 81 -
These functions are significantly smaller and faster than the wrapper
functions and soft-float implementation they replace. Using the first
comparison operator (e.g. '<=') in any program costs about 70 bytes
initially, but every additional operator incrementally adds just 4 bytes.
NOTE: It seems
This will make it easier to isolate changes in subsequent patches.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bpabi-v6m.S (__aeabi_frsub): Moved to ...
* config/arm/eabi/fadd.S: New file.
* config/arm/lib1funcs.S: #include eabi/fadd.S (v6m only).
---
libg
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/eabi/lmul.S: New file for __muldi3(), __mulsidi3(), and
__umulsidi3().
* config/arm/lib1funcs.S: #include eabi/lmul.S (v6m only).
* config/arm/t-elf: Add the new objects to LIB1ASMFUNCS.
---
libgcc/config/arm/eab
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/eabi/idiv.S: New file for __udivsi3() and __divsi3().
* config/arm/lib1funcs.S: #include eabi/idiv.S (v6m only).
---
libgcc/config/arm/eabi/idiv.S | 299 ++
libgcc/config/arm/lib1funcs.S |
The functional overlap between the single- and double-word functions
makes this implementation about half the size of the C functions
if both functions are linked in the same application.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/parity.S: New file for __
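The overlap between the single- and double-word cases can be sketched in C as follows. This is only a model of the idea, not the assembly from the patch: the names parity32 and parity64 are hypothetical, and the point is that the double-word function folds its two halves together and then reuses the single-word code.

```c
#include <stdint.h>

/* Parity of a 32-bit word: 1 if an odd number of bits are set, else 0.
   Each XOR-fold halves the number of bits that still matter.  */
int
parity32 (uint32_t x)
{
  x ^= x >> 16;
  x ^= x >> 8;
  x ^= x >> 4;
  x ^= x >> 2;
  x ^= x >> 1;
  return (int) (x & 1u);
}

/* Parity of a 64-bit word: XOR the two halves, then share the 32-bit
   routine, so only one extra operation is needed.  */
int
parity64 (uint64_t x)
{
  return parity32 ((uint32_t) (x >> 32) ^ (uint32_t) x);
}
```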
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bpabi.c: Deleted unused file.
* config/arm/eabi/ldiv.S (__aeabi_ldivmod, __aeabi_uldivmod):
Replaced wrapper functions with a complete implementation.
* config/arm/t-bpabi (LIB2ADD_ST): Removed bpabi.c.
This will make it easier to isolate changes in subsequent patches.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bpabi-v6m.S (__aeabi_lcmp, __aeabi_ulcmp): Moved to ...
* config/arm/eabi/lcmp.S: New file.
* config/arm/lib1funcs.S: #include eabi/lcmp.S.
---
l
This implementation provides an efficient tail call to __clzsi2(), making the
functions rather smaller and faster than the C versions.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bits/clz2.S (__clrsbsi2, __clrsbdi2):
Added new functions.
* config/arm/t-elf
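As a rough C model of how a count of redundant sign bits can be expressed in terms of a count of leading zeros: the function name clrsb_via_clz is hypothetical, __builtin_clz is the GCC builtin (undefined for 0, hence the guard), and the assembly in the patch arranges the final step as a tail call, which plain C can only approximate.

```c
#include <limits.h>

/* Count the redundant sign bits of X, i.e. the bits below the sign bit
   that are equal to it.  For negative X the value is complemented first,
   so the problem reduces to counting leading zeros.  */
int
clrsb_via_clz (int x)
{
  unsigned int u = x < 0 ? ~(unsigned int) x : (unsigned int) x;
  int width = (int) (sizeof (unsigned int) * CHAR_BIT);

  /* 0 and -1 consist entirely of sign bits; __builtin_clz (0) is
     undefined, so handle that case explicitly.  */
  if (u == 0)
    return width - 1;

  /* One of the leading zeros is the sign bit itself.  */
  return __builtin_clz (u) - 1;
}
```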
This implementation provides an efficient tail call to __clzdi2(), making the
functions rather smaller and faster than the C versions.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bits/ctz2.S (__ffssi2, __ffsdi2): New functions.
* config/arm/t-elf (LIB1ASMFUNCS): Ad
This effectively merges support for all architecture variants into a
common function path with appropriate build conditions.
ARM performance is 1-2 instructions faster; Thumb-2 is about 50% faster.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/bpabi.S (__aeabi_lcmp, __aeabi_
The Thumb versions of these functions are each 1-2 instructions smaller
and faster, and branchless when the IT instruction is available.
The ARM versions were converted to the "xxl/xxh" big-endian register
naming convention, but are otherwise unchanged.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Eng
This version combines __ctzdi2() with __ctzsi2() into a single object with
an efficient tail call. The former implementation of __ctzdi2() was in C.
On architectures without __ARM_FEATURE_CLZ, this version merges the formerly
separate Thumb and ARM code sequences into a unified instruction sequen
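In C terms, combining the double-word and single-word routines might look like the sketch below. The names ctz32 and ctz64 are hypothetical stand-ins, __builtin_ctz is the GCC builtin (undefined for 0), and the sketch assumes a 32-bit unsigned int; the assembly version ends in a tail call into the single-word code rather than an ordinary call.

```c
#include <stdint.h>

/* Single-word count of trailing zeros.  X must be nonzero because
   __builtin_ctz (0) is undefined.  */
int
ctz32 (uint32_t x)
{
  return __builtin_ctz (x);
}

/* Double-word count of trailing zeros for nonzero X: pick the low word
   if it has a set bit, otherwise the high word plus an offset of 32,
   and let the single-word routine do the rest.  */
int
ctz64 (uint64_t x)
{
  uint32_t lo = (uint32_t) x;

  if (lo != 0)
    return ctz32 (lo);
  return 32 + ctz32 ((uint32_t) (x >> 32));
}
```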
These are 2-5 instructions smaller and just as fast. Branches are
minimized, which will allow easier adaptation to Thumb-2/ARM mode.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/eabi/lcmp.S (__aeabi_lcmp, __aeabi_ulcmp): Replaced;
add macro configuration to build _
On architectures without __ARM_FEATURE_CLZ, this version combines __clzdi2()
with __clzsi2() into a single object with an efficient tail call. Also, this
version merges the formerly separate Thumb and ARM code implementations
into a unified instruction sequence. This change significantly improves
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/lib1funcs.S (RETLDM, ARM_DIV_BODY, ARM_MOD_BODY,
_interwork_call_via_lr): Moved condition code after the flags
update specifier "s".
(ARM_FUNC_START, THUMB_LDIV0): Removed redundant ".syntax".
---
libgcc/c
This will make it easier to isolate changes in subsequent patches.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/lib1funcs.S (__ashldi3, __ashrdi3, __lshldi3): Moved to ...
* config/arm/eabi/lshift.S: New file.
---
libgcc/config/arm/eabi/lshift.S | 123 +
The functional overlap between the single- and double-word functions
makes this implementation about 30% smaller than the C functions
if both functions are linked together in the same application.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/popcnt.S (__popcountsi, __popcoun
These macros complement and extend the existing do_it() macro.
Together, they streamline the process of optimizing short branchless
conditional sequences to support ARM, Thumb-2, and Thumb-1.
The inherent architecture limitations of Thumb-1 mean that writing
assembly code is somewhat more tedious
Since THUMB_FUNC_START does not insert the ".text" directive, it aligns
more closely with the new FUNC_ENTRY macro and is renamed accordingly.
THUMB_FUNC_START usage has been universally synonymous with the
".force_thumb" directive, so this is now folded into the definition.
Usage of ".force_thumb"
This will make it easier to isolate changes in subsequent patches.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/lib1funcs.S (__clzsi2i, __clzdi2): Moved to ...
* config/arm/clz2.S: New file.
---
libgcc/config/arm/clz2.S | 145 ++
This will make it easier to isolate changes in subsequent patches.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/lib1funcs.S (__ctzsi2): Moved to ...
* config/arm/ctz2.S: New file.
---
libgcc/config/arm/ctz2.S | 86 +++
libgcc/co
This will make it easier to isolate changes in subsequent patches.
gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel
* config/arm/t-elf (LIB1ASMFUNCS): Split macros into logical groups.
---
libgcc/config/arm/t-elf | 66 +
1 file changed, 53 insertions
Most of these changes support subsequent patches in this series.
Particularly, the FUNC_START macro becomes part of a new macro chain:
* FUNC_ENTRY Common global symbol directives
* FUNC_START_SECTION FUNC_ENTRY to start a new
* FUNC_START FUNC_START_SECTION <
Hi Richard,
I am re-submitting my libgcc patch from 2021:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563585.html
https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587383.html
I believe I have finally made the stage1 window.
Regards,
Daniel
---
Changes since v6:
Hi,
I'm looking into vec_set with variable index on s390. Uros posted a
patch [1] that did not make it upstream in Nov 2020. It changed the
mode of the index operand to whatever the target supports in
can_vec_set_var_idx_p. I missed it back then but we indeed do not make
proper use of vec_set w
On Mon, 31 Oct 2022 at 15:34, Eric Botcazou wrote:
>
> > The test was only failing for me with -m32 (and not -m64), so I didn't
> > notice until now. That probably means we should make the test fail more
> > reliably if the padding isn't being cleared.
>
> The tests fail randomly for me on SPARC64
Hi,
This patch adds support for pacbti multilib linking by making
"-mbranch-protection=none" the default in the command line for all M-profile
targets and uses "-mbranch-protection=none" for multilib matching. If any
valid value is passed to "-mbranch-protection" in the command line, this
new
> The test was only failing for me with -m32 (and not -m64), so I didn't
> notice until now. That probably means we should make the test fail more
> reliably if the padding isn't being cleared.
The tests fail randomly for me on SPARC64/Linux:
FAIL: 29_atomics/atomic/compare_exchange_padding.cc ex
This was introduced with the fix and backports of PR103530 on
x86_64-linux-gnux32 with older glibc versions (checked with 2.31), where dladdr
is still in the libdl.so library, and not included in libc.so as in newer glibc
versions.
Linking of libgnat.so fails with
[...]
/usr/x86_64-linux-gnux3
On 2022-10-31 09:18, Eric Botcazou wrote:
Hi Eric!
thank you very much for the job!
I will try to build our (MinGW-Builds project) builds using this patch
and will report back.
@Jonathan
what are the next steps to be taken to accept this patch?
best!
I have attached a revised version of th
I recently saw that gfortran does not support derived type components
with 'target update', an OpenMP 5.0 feature.
When adding it, I also found out that strides were not handled. There
is probably some room for improvement about what to copy and what not,
but copying too much should be fine.
Bui
> "Mark" == Mark Wielaard writes:
Mark> DW_LANG_Rust_old was used by old rustc compilers <= 2016 before DWARF5
Mark> assigned an official number. It might be recognized by some
Mark> debuggers.
FWIW I wouldn't worry about it any more.
We could probably just remove the '_old' constant.
Tom
Ping x2.
On 2022/10/17 10:29 PM, Chung-Lin Tang wrote:
> Ping.
>
> On 2022/9/21 3:45 PM, Chung-Lin Tang via Gcc-patches wrote:
>> Hi Tom,
>> I had a patch submitted earlier, where I reported that the current way of
>> implementing
>> barriers in libgomp on nvptx created a quite significant perfo
Committed, thanks!
On Fri, Oct 28, 2022 at 6:47 AM Jeff Law via Gcc-patches wrote:
>
>
> On 10/27/22 08:41, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong
> >
> > According to
> > https://github.com/gcc-mirror/gcc/commit/f95d3d5de72a1c43e8d529bad3ef59afc3214705.
> > Since GCC 4.8.6 doesn't
My recent patch to add additional vector lengths didn't address the
vector reductions yet.
This patch adds the missing support. Shorter vectors use fewer reduction
steps, and the means to extract the final value has been adjusted.
Lacking from this is any useful costs, so for loops the vect p
This patch adds patterns for the fmin and fmax operators, for scalars,
vectors, and vector reductions.
The compiler uses smin and smax for most floating-point optimizations,
etc., but not where the user calls fmin/fmax explicitly. On amdgcn the
hardware min/max instructions are already IEEE c
A function parameter was left over from a previous draft of my
multiple-vector-length patch. This patch silences the harmless warning.
Andrew
amdgcn: Silence unused parameter warning
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen):
Set base_type a
Hi,
> -Original Message-
> From: Christophe Lyon
> Sent: Monday, October 17, 2022 2:30 PM
> To: Srinath Parvathaneni ; gcc-
> patc...@gcc.gnu.org
> Cc: Richard Earnshaw
> Subject: Re: [GCC][PATCH] arm: Add cde feature support for Cortex-M55
> CPU.
>
> Hi Srinath,
>
>
> On 10/10/22 10:
Tamar Christina writes:
> Hi All,
>
> Our zero and sign extend and extract patterns are currently very limited and
> only work for the original register size of the instructions. i.e. limited by
> GPI patterns. However these instructions extract bits and extend. This means
> that any register si
Hi All,
Currently we often times generate an r -> r add even if it means we need two
reloads to perform it, i.e. in the case that the values are on the SIMD side.
The pairwise operations expose these more now and so we get suboptimal codegen.
Normally I would have liked to use ^ or $ here, but w
Hi All,
The target has various zero and sign extension patterns. These however live in
various locations around the MD file and almost all of them are split
differently. Due to the various patterns we also ended up missing valid
extensions. For instance smov is almost never generated.
This cha
Hi All,
The backend has an existing V2HFmode that is used by pairwise operations.
This mode was however never made fully functional. Amongst other things it was
never declared as a vector type which made it unusable from the mid-end.
It's also lacking an implementation for load/stores so reload
Hi All,
Says what it does on the tin. In case some operations form in RTL due to
a split, combine or any RTL pass then still try to recognize them.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* config/aarch64/aarch64-sim
Hi All,
This implements the new widening reduction optab in the backend.
Instead of introducing a duplicate definition for the same thing I have
renamed the intrinsics definitions to use the same optab.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
Hi All,
This patch series is to add recognition of pairwise operations (reductions)
in match.pd such that we can benefit from them even at -O1 when the vectorizer
isn't enabled.
The use of these allows for much simpler codegen in AArch64 and allows us to
avoid quite a lot of codegen warts.
As an
Hi All,
The current vector extract pattern can only extract from a vector when the
position to extract is a multiple of the vector bitsize as a whole.
That means extracting something like a V2SI from a V4SI vector from position 32
isn't possible as 32 is not a multiple of 64. Ideally this optab sho
Hi All,
This adds a new optab and IFNs for REDUC_PLUS_WIDEN where the resulting
scalar reduction has twice the precision of the input elements.
At some point in a later patch I will also teach the vectorizer to recognize
this builtin once I figure out how the various bits of reductions work.
For
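A scalar model of what a widening plus-reduction computes, as I read the description above: every element is added into an accumulator that has twice the precision of the element type. The function name reduc_plus_widen_u16 is hypothetical and chosen only for this sketch; it is not the optab or IFN interface itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum N 16-bit elements into a 32-bit result.  Because the accumulator
   is twice as wide as the inputs, individual additions cannot wrap the
   way a 16-bit accumulator would.  */
uint32_t
reduc_plus_widen_u16 (const uint16_t *v, size_t n)
{
  uint32_t sum = 0;

  for (size_t i = 0; i < n; i++)
    sum += v[i];
  return sum;
}
```

With the inputs { 65535, 65535, 1, 2 } a 16-bit accumulator would wrap, while the widened result is 131073.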
Hi All,
Our zero and sign extend and extract patterns are currently very limited and
only work for the original register size of the instructions. i.e. limited by
GPI patterns. However these instructions extract bits and extend. This means
that any register size can be used as an input as long a
Hi All,
This implements the new tbranch optab for AArch64.
Instead of emitting the instruction directly I've chosen to expand the pattern
using a zero extract and generating the existing pattern for comparisons for two
reasons:
1. Allows for CSE of the actual comparison.
2. It looks like the
Hi All,
This adds a new test-and-branch optab that can be used to do a conditional test
of a bit and branch. This is similar to the cbranch optab but instead can
test any arbitrary bit inside the register.
This patch recognizes boolean comparisons and single bit mask tests.
Bootstrapped Regtes
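For reference, the two source-level shapes mentioned above (a single bit mask test and a boolean comparison) look like this in C. The function names are hypothetical, and whether a given target actually emits a test-bit-and-branch instruction for them depends on the backend implementing the optab, so treat this as an illustration of the input only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Single bit mask test: the branch depends on bit 5 of FLAGS only.  */
int
handle_flag (uint32_t flags)
{
  if (flags & (UINT32_C (1) << 5))
    return 1;
  return 0;
}

/* Boolean comparison: a bool holds 0 or 1, so the branch only needs to
   look at its lowest bit.  */
int
handle_bool (bool ready)
{
  if (ready)
    return 1;
  return 0;
}
```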