[PATCH] remove bogus asserts in expr.c

2020-01-21 Thread stefan
4]) (nil The combine pass creates a parallel insn with auto inc, which is resolved during reload to valid insns again. This patch removes these checks, since they aren't helpful. Stefan --- a/gcc/expr.c +++ b/gcc/expr.c @@ -1,5 +1,5 @@ /* Convert

AW: [PATCH] m68k architecture: support ccmode + lra

2019-11-25 Thread stefan
t https://github.com/mc68kghost/gcc got an update. Tests are not yet at 100% (master branch fails too many tests) but it's closer to master branch now. About 50% of the code is identical, a fair amount has swapped cmp/bcc, a few are a tad worse and some yield surprisingly better code. Stefan

[PATCH] Overflow-trapping integer arithmetic routines7code

2020-10-05 Thread Stefan Kanthak
.de/gcc.html> for some examples as well as the expected assembly. The attached diff/patch provides better implementations. Stefan libgcc2.diff Description: Binary data

Re: [PATCH] Overflow-trapping integer arithmetic routines7code

2020-11-10 Thread Stefan Kanthak
Jeff Law wrote: > On 10/5/20 10:49 AM, Stefan Kanthak wrote: >> The implementation of the functions __absv?i2(), __addv?i3() etc. for >> trapping integer overflow provided in libgcc2.c is rather bad. >> Same for __cmp?i2() and __ucmp?i2() >> >> At least for AMD6
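To make the term "overflow-trapping" concrete, here is a minimal sketch in C of what an `__addv?i3`-style routine is expected to do: return the sum when it fits, and abort otherwise. This is not the libgcc2.c code and not the patch from this thread; the names `add_would_overflow` and `trapping_add` are hypothetical, and `int` stands in for whichever word type the real routine uses.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (not from libgcc): report whether a + b would
   overflow a signed int, without ever performing the overflowing add. */
static bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* sum would exceed INT_MAX */
    return a < INT_MIN - b;       /* sum would fall below INT_MIN */
}

/* Illustrative stand-in for an overflow-trapping addition: return the
   sum, or call abort() when the result does not fit in an int. */
static int trapping_add(int a, int b)
{
    if (add_would_overflow(a, b))
        abort();
    return a + b;
}
```

The check is done before the addition because signed overflow itself is undefined behaviour in C; testing the result afterwards would be too late.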

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Stefan Kanthak
u]subDI3() functions ... which are but missing from libgcc.a Stefan Kanthak

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-11 Thread Stefan Kanthak
, so (u) & 0x00ff is signed too -- and producing a negative value (or overflow) from the left-shift of a signed int, i.e. shifting into (or beyond) the sign bit, is undefined behaviour! JFTR: both -fsanitize=signed-integer-overflow and -fsanitize=undefined fail to catch this BUGBUGBUG, which surfaces on i386 and AMD64 with -O1 or -O0! Stefan Kanthak PS: even worse, -fsanitize=signed-integer-overflow fails to catch 1 << 31 or 128 << 24!
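The undefined behaviour described above can be avoided by keeping every operand unsigned before shifting. The following is a sketch only, assuming a 32-bit `int`; `bswap32_sketch` is a hypothetical name and this is not the `__bswapsi2` code from libgcc or from the patch.

```c
#include <stdint.h>

/* Hypothetical byte swap. Every masked byte has type uint32_t, so the
   left shifts by 24 and 8 never shift into the sign bit of a signed int. */
static uint32_t bswap32_sketch(uint32_t u)
{
    return ((u & UINT32_C(0x000000ff)) << 24)
         | ((u & UINT32_C(0x0000ff00)) << 8)
         | ((u & UINT32_C(0x00ff0000)) >> 8)
         | ((u & UINT32_C(0xff000000)) >> 24);
}
```

By contrast, an expression such as `128 << 24` has type `int`, and its mathematical result does not fit in a 32-bit `int`, which is the case the postscript above refers to.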

[PATCH] handle virtual registers inside PLUS - taking an address plus offset

2020-01-21 Thread Stefan Franke
The function instantiate_virtual_regs_in_insn does not handle the case if an address with offset is taken. Then a virtual register may appear (e.g argptr) inside a PLUS. This patch adds the handling for PLUS. Stefan --- a/gcc/function.c +++ b/gcc/function.c @@ -1,5 +1,5

Re: AW: [PATCH] m68k architecture: support ccmode + lra

2019-12-11 Thread Stefan Franke
ome optimizations/mechanisms do only work if HAVE_CC0 is defined - way more ... And the current implementation is IMHO unusable for lra since it did not introduce a CC register to track clobbering. So it's a dead end. I can live with the fact that my patch was refuted since I simply use my *working* fork, where I fixed the issues mentioned above. /cheers Stefan

Re: AW: [PATCH] m68k architecture: support ccmode + lra

2019-12-12 Thread Stefan Franke
Am 2019-12-12 10:32, schrieb Richard Sandiford: Stefan Franke writes: Am 2019-12-08 01:54, schrieb Oleg Endo: On Tue, 2019-11-26 at 07:38 +0100, ste...@franke.ms wrote: > On 11/21/19 10:30 AM, ste...@franke.ms wrote: > > Hi there, > > > > here is mc68k's patch to sw

[PATCH, c] all plattforms: support using a CC_REG instead cc0_rtx

2019-12-13 Thread Stefan Franke
Hi there, I suggest this patch to allow architectures to substitute cc0_rtx with a generated cc register. Why? If you are using a cc register plus your architecture has many instructions which may clobber that cc register, some passes (e.g. gcse) will reorder the insns. This can lead to the s

Re: [PATCH, c] all plattforms: support using a CC_REG instead cc0_rtx

2019-12-13 Thread Stefan Franke
Am 2019-12-13 18:58, schrieb Segher Boessenkool: Hi! On Fri, Dec 13, 2019 at 05:25:41PM +0100, Stefan Franke wrote: I suggest this patch to allow architectures to substitute cc0_rtx with a generated cc register. Why? If you are using a cc register plus your architecture has many instructions

Re: [PATCH, c] all plattforms: support using a CC_REG instead cc0_rtx

2019-12-13 Thread Stefan Franke
Am 2019-12-13 21:59, schrieb Segher Boessenkool: On Fri, Dec 13, 2019 at 08:55:15PM +0100, Stefan Franke wrote: Am 2019-12-13 18:58, schrieb Segher Boessenkool: >On Fri, Dec 13, 2019 at 05:25:41PM +0100, Stefan Franke wrote: >>Why? If you are using a cc register plus your architectur

Re: [PATCH, c] all plattforms: support using a CC_REG instead cc0_rtx

2019-12-13 Thread Stefan Franke
Am 2019-12-14 04:03, schrieb Andrew Pinski: On Fri, Dec 13, 2019 at 6:56 PM Stefan Franke wrote: Am 2019-12-13 21:59, schrieb Segher Boessenkool: > On Fri, Dec 13, 2019 at 08:55:15PM +0100, Stefan Franke wrote: >> Am 2019-12-13 18:58, schrieb Segher Boessenkool: >> >On Fri,

Re: [PATCH v3] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-12 Thread Stefan O'Rear
On Tue, Mar 12, 2024, at 2:15 AM, pan2...@intel.com wrote: > From: Pan Li > > Update in v3: > * Add pre-defined __riscv_v_fixed_vlen when zvl. > > Update in v2: > * Cleanup some unused code. > * Fix some typo of commit log. > > Original log: > > This patch would like to introduce one new gcc attri

Re: [Patch 2/2] Localization problem in regex

2013-08-23 Thread Stefan Schweter
for (std::wsregex_token_iterator p{str2.begin(), str2.end(), re2, {1}}; p != end; ++p) { std::wcout << *p << std::endl; } return 0; } Maybe it's better to throw an exception here? Thanks in advance, Stefan

Re: [PATCH] Fix illegal cast to rtx (*insn_gen_fn) (rtx, ...)

2013-08-27 Thread Stefan Kristiansson
On Tue, Aug 27, 2013 at 11:03:32AM +0200, Richard Biener wrote: > On Wed, Jul 10, 2013 at 3:14 AM, Stefan Kristiansson > wrote: > > The (static arg) generator functions are casted to a var arg > > function pointer, making the assumption that the ABI for passing > > the ar

Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread Stefan Ring
On Thu, Feb 25, 2016 at 10:20 AM, Richard Earnshaw (lists) wrote: > The point is to permit the compiler to use interworking compatible > sequences of code when generating ARM code, not to force users to use > Thumb code. The necessary instruction (BX) is available in armv5 and > armv5e, even thou

Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread Stefan Ring
On Thu, Feb 25, 2016 at 3:15 PM, David Brown wrote: > The "t" is thumb, "e" means "DSP-like extensions", and I suspect the "l" > is a misprint for "j", meaning the Jazelle (Java) acceleration instructions. I doubt that as "armv5tejl" is also quite common.

Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread Stefan Ring
On Thu, Feb 25, 2016 at 3:15 PM, David Brown wrote: > Great link, thanks!

[PATCH] Simplified construction of constants for __popcountSI2/__popcountDI2 in libgcc2.c

2020-11-20 Thread Stefan Kanthak
The construction of the "magic" constants 0x55...55, 0x33...33, 0x0f...0f and 0x01...01 in __popcountSI2 and __popcountDI2 with macros is awkward; these constants can simply be written as ((UWtype) ~0 / 3), ((UWtype) ~0 / 5), ((UWtype) ~0 / 17) and ((UWtype) ~0 / 255) Stefan Kantha
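The division trick above can be checked with a small example. The sketch below uses `uint32_t` as a stand-in for `UWtype` (an assumption; the real type depends on the target) and a hypothetical function name, so it illustrates the constants rather than reproducing libgcc2.c.

```c
#include <stdint.h>

/* All bits set in a 32-bit word; stands in for (UWtype) ~0. */
#define ALL_ONES (~(uint32_t)0)

/* Hypothetical 32-bit population count built from the four constants
   written as quotients of ALL_ONES. */
static unsigned popcount32_sketch(uint32_t x)
{
    /* ALL_ONES / 3 == 0x55555555: count bits in each 2-bit field. */
    x = x - ((x >> 1) & (ALL_ONES / 3));
    /* ALL_ONES / 5 == 0x33333333: add neighbouring 2-bit counts. */
    x = (x & (ALL_ONES / 5)) + ((x >> 2) & (ALL_ONES / 5));
    /* ALL_ONES / 17 == 0x0f0f0f0f: add neighbouring 4-bit counts. */
    x = (x + (x >> 4)) & (ALL_ONES / 17);
    /* ALL_ONES / 255 == 0x01010101: sum the four bytes into the top byte. */
    return (x * (ALL_ONES / 255)) >> 24;
}
```

The quotients are exact because 3, 5, 17 and 255 all divide 2^32 - 1; the same holds for 2^64 - 1, which is why the identical expressions work for the double-width variant.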

Re: [PATCH] Simplified construction of constants for __popcountSI2/__popcountDI2 in libgcc2.c

2020-11-20 Thread Stefan Kanthak
Jakub Jelinek wrote: > On Fri, Nov 20, 2020 at 11:08:41AM +0100, Stefan Kanthak wrote: >> The construction of the "magic" constants 0x55...55, 0x33...33, 0x0f...0f >> and 0x01...01 in __popcountSI2 and __popcountDI2 with macros is awkward; >> these constants can si

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-24 Thread Stefan Kanthak
Andreas Schwab wrote 2020-11-11: > On Nov 10 2020, Stefan Kanthak wrote: > >> Eric Botcazou wrote: >> >>>> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() >>>> functions >>>> is rather bad, it yields bad machine code a
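For readers unfamiliar with these routines, the sketch below shows the general idea of a double-word left shift assembled from single-word shifts. It is an illustration under assumptions, not the libgcc implementation and not the proposed patch: `struct dword` and `ashl_sketch` are hypothetical names, and 32-bit words are assumed.

```c
#include <stdint.h>

/* Hypothetical double-word value made of two 32-bit halves. */
struct dword {
    uint32_t low;
    uint32_t high;
};

/* Left shift of the 64-bit value held in v. The count must be in 0..63. */
static struct dword ashl_sketch(struct dword v, unsigned count)
{
    struct dword r;

    if (count == 0) {
        /* Handled separately so the branch below never shifts by 32. */
        r = v;
    } else if (count < 32) {
        r.low  = v.low << count;
        /* Bits leaving the low word enter the bottom of the high word. */
        r.high = (v.high << count) | (v.low >> (32 - count));
    } else {
        /* The whole low word moves into the high word. */
        r.low  = 0;
        r.high = v.low << (count - 32);
    }
    return r;
}
```

The separate `count == 0` case matters because shifting a 32-bit value by 32 is itself undefined in C.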

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-24 Thread Stefan Kanthak
Andreas Schwab wrote: > On Nov 24 2020, Stefan Kanthak wrote: > >> 'nuff said > > What's your point? Pinpoint deficiencies and bugs in GCC and libgcc, plus a counter example to your "argument"! I recommend careful reading. Stefan

Re: [PATCH] Overflow-trapping integer arithmetic routines7code

2020-11-25 Thread Stefan Kanthak
Jeff Law wrote: > On 11/10/20 10:21 AM, Stefan Kanthak wrote: > >>> So with all that in mind, I installed everything except the bits which >>> have the LIBGCC2_BAD_CODE ifdefs after testing on the various crosses. >>> If you could remove the ifdefs on the a

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-25 Thread Stefan Kanthak
Jeff Law wrote: > On 11/24/20 8:40 AM, Stefan Kanthak wrote: >> Andreas Schwab wrote: >> >>> On Nov 24 2020, Stefan Kanthak wrote: >>> >>>> 'nuff said >>> What's your point? >> Pinpoint deficiencies and bugs in GCC and libgcc

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-25 Thread Stefan Kanthak
Jakub Jelinek wrote: > On Wed, Nov 25, 2020 at 09:22:53PM +0100, Stefan Kanthak wrote: >> > As Jakub has already indicated, your change will result in infinite >> > recursion on avr. I happened to have a cr16 handy and it looks like >> > it'd generate infin

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-25 Thread Stefan Kanthak
too, just like "double-word" addition and subtraction. A possible/reasonable explanation would be code size, i.e. if the synthesized instructions need significantly more memory than the function call (including the argument setup of course). Stefan

Re: [PATCH] Overflow-trapping integer arithmetic routines7code

2020-12-07 Thread Stefan Kanthak
Jeff Law wrote Wednesday, November 25, 2020 7:11 PM: > On 11/25/20 6:18 AM, Stefan Kanthak wrote: >> Jeff Law wrote: [...] >>> My inclination is to leave the overflow checking double-word multiplier >>> as-is. >> See but <https://gcc.gnu.org/piperm

Re: [PATCH] i386: simplify cpu_feature handling

2021-12-16 Thread Stefan Kneifel
? This should not happen - however, a lot of things shouldn't happen... and it might facilitate locating a potential bug at a later time. Regards, Stefan

[PATCH] ARM: fix -masm-syntax-unified (PR88648)

2019-01-01 Thread Stefan Agner
This allows using unified asm syntax when compiling for the ARM instruction set. This matches the documentation and seems to be what the initial patch intended when the flag was added. --- gcc/config/arm/arm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/config/arm/arm.c

[PATCH] ARM: add test case for -masm-syntax-unified (PR88648)

2019-01-02 Thread Stefan Agner
Add a test case to check whether -masm-syntax-unified is indeed emitting the inline assembler with .syntax unified. --- .../gcc.target/arm/pr88648-asm-syntax-unified.c| 14 ++ 1 file changed, 14 insertions(+) create mode 100644 gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unifi

Re: [PATCH] ARM: add test case for -masm-syntax-unified (PR88648)

2019-01-08 Thread Stefan Agner
On 08.01.2019 10:35, Kyrill Tkachov wrote: > Hi Stefan, > > On 02/01/19 21:47, Stefan Agner wrote: >> Add a test case to check whether -masm-syntax-unified is indeed >> emitting the inline assembler with .syntax unified. > > Can you please provide a Chan

[PATCH v2] ARM: add test case for -masm-syntax-unified (PR88648)

2019-01-08 Thread Stefan Agner
Add a test case to check whether -masm-syntax-unified is indeed emitting the inline assembler with .syntax unified. gcc/testsuite/ChangeLog * gcc.target/arm/pr88648-asm-syntax-unified.c: add test to check if -masm-syntax-unified gets applied properly --- .../gcc.target/a

Re: [PATCH] ARM: fix -masm-syntax-unified (PR88648)

2019-02-09 Thread Stefan Agner
Hi Kyrill, On 10.01.2019 12:38, Kyrill Tkachov wrote: > Hi Stefan, > > On 08/01/19 09:33, Kyrill Tkachov wrote: >> Hi Stefan, >> >> On 01/01/19 23:34, Stefan Agner wrote: >> > This allows to use unified asm syntax when compiling for the >> > ARM inst

[PATCH v2] Update documentation for ARM architecture

2016-06-06 Thread Stefan Brüns
hanged, 21 insertions(+), 7 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 3e68798..8c1e54b 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2016-06-06 Stefan Bruens + + * doc/invoke.texi (ARM Options): Use lexicographical ordering. + Correct usage o

[PING] [PATCH] longlong.h: Add prototype for udiv_w_sdiv

2014-09-12 Thread Stefan Liebler
", https://www.sourceware.org/ml/libc-alpha/2014-09/msg00264.html) Please review Andreas Krebbel's patch and give okay for commit. Bye Stefan

Re: [PATCH] Fix illegal cast to rtx (*insn_gen_fn) (rtx, ...)

2013-07-10 Thread Stefan Kristiansson
printf ("0,\n"); + printf ("{ 0 },\n"); printf ("&operand_data[%d],\n", d->operand_number); printf ("%d,\n", d->n_generator_args); Fair enough, that makes the printing routine a bit more clean and removes some code duplication in the declaration. Stefan

Re: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant

2023-10-01 Thread Stefan Schulze Frielinghaus
On Fri, Sep 29, 2023 at 01:01:57PM -0600, Jeff Law wrote: > > > On 8/10/23 07:04, Stefan Schulze Frielinghaus via Gcc-patches wrote: > > In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I > > completely missed the fact that the normal form of a generate

[PATCH] s390: Make use of new copysign RTL

2023-10-05 Thread Stefan Schulze Frielinghaus
gcc/ChangeLog: * config/s390/s390.md: Make use of new copysign RTL. --- gcc/config/s390/s390.md | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 9631b2a8c60..3f29ba21442 100644 --- a/gcc/config/s390/s390.md +

[PATCH] combine: Fix handling of unsigned constants

2023-10-06 Thread Stefan Schulze Frielinghaus
If a CONST_INT represents an integer of a mode with fewer bits than in HOST_WIDE_INT, then the integer is sign extended. For those two optimizations touched by this patch, the integers of interest have only the most significant bit set w.r.t their mode, therefore, they were sign extended. Thus in

[PATCH] s390: Fix expander popcountv8hi2_vx

2023-10-16 Thread Stefan Schulze Frielinghaus
The normal form of a CONST_INT which represents an integer of a mode with fewer bits than in HOST_WIDE_INT is sign extended. This even holds for unsigned integers. This fixes an ICE during cse1 where we bail out at rtl.h:2297 since INTVAL (x.first) == sext_hwi (INTVAL (x.first), precision) does n
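The normal form mentioned above can be illustrated outside of GCC. The sketch below assumes a 64-bit host integer as a stand-in for `HOST_WIDE_INT` and uses hypothetical helper names; it mirrors the idea behind the `sext_hwi` check quoted in the message but is not GCC's code.

```c
#include <stdint.h>

/* Hypothetical helper: sign extend the low `precision` bits of `value`
   to 64 bits. `precision` must be in 1..64. The final conversion to
   int64_t assumes the usual two's complement wrap-around. */
static int64_t sign_extend_sketch(int64_t value, unsigned precision)
{
    if (precision >= 64)
        return value;

    uint64_t mask = (UINT64_C(1) << precision) - 1;
    uint64_t sign = UINT64_C(1) << (precision - 1);
    uint64_t low  = (uint64_t)value & mask;

    /* (low ^ sign) - sign turns a set top bit into a negative value. */
    return (int64_t)((low ^ sign) - sign);
}

/* A constant is in the normal form when sign extension at the mode's
   precision leaves it unchanged. */
static int is_canonical_sketch(int64_t value, unsigned precision)
{
    return value == sign_extend_sketch(value, precision);
}
```

Under these assumptions, the 16-bit pattern 0x8000 is canonical only when stored as -32768, even if it is meant as an unsigned quantity; storing it as the positive value 32768 is the kind of mismatch that check rejects.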

[PATCH] testsuite: Fix _BitInt in gcc.misc-tests/godump-1.c

2023-10-24 Thread Stefan Schulze Frielinghaus
Currently _BitInt is only supported on x86_64 which means that for other targets all tests fail with e.g. gcc.misc-tests/godump-1.c:237:1: sorry, unimplemented: '_BitInt(32)' is not supported on this target 237 | _BitInt(32) b32_v; | ^~~ Instead of requiring _BitInt support for godum

[PATCH] s390: Fix constraint for insn *cmphi_ccu

2023-10-25 Thread Stefan Schulze Frielinghaus
Currently for an unsigned 16-bit comparison between memory and an immediate where the high bit is set, a clc is emitted. This is because the constant is created for mode HI and therefore sign extended. This means constraint D does not hold anymore. Since the mode already restricts the immediate

[PATCH] Fix comparison of trees via tree_cmp

2020-01-22 Thread Stefan Schulze Frielinghaus
's Delight book ;-)). Bootstrapped and tested on s390x. Any thoughts? Cheers, Stefan [1] https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=blob;f=gcc/analyzer/region-model.cc;h=9474c6737d54d68f5b36893903cfa6d19df0efed;hb=HEAD#l1849 diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-mod

Re: [PATCH] analyzer: fixes to tree_cmp and other comparators

2020-01-24 Thread Stefan Schulze Frielinghaus
successfully bootstrap + regtest. Cheers, Stefan

Return slot optimization for stack protector strong

2020-01-27 Thread Stefan Schulze Frielinghaus
functions which are finally implemented via calls to internal functions. The attached patch solves this for me. Any thoughts? Successfully bootstrapped and regtested on s390x. Cheers, Stefan diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index 9864e4344d2..452813efef1 100644 --- a/gcc/cfgexpand.c +++ b

Re: Return slot optimization for stack protector strong

2020-01-28 Thread Stefan Schulze Frielinghaus
On Mon, Jan 27, 2020 at 06:53:51PM +0100, Jakub Jelinek wrote: > On Mon, Jan 27, 2020 at 06:49:23PM +0100, Stefan Schulze Frielinghaus wrote: > > some function calls trigger the stack-protector-strong although such > > calls are later on implemented via calls to internal function

[PATCH] s390: Streamline vector builtins with LLVM

2024-03-01 Thread Stefan Schulze Frielinghaus
Similar as to s390_lcbb, s390_vll, s390_vstl, et al. make use of a signed vector type for vlbb. Furthermore, a const void pointer seems more common and an integer for the mask. For s390_vfi(s,d)b make use of integers for masks, too. Use unsigned integers for all s390_vlbr/vstbr variants. Make u

[PATCH] s390: Streamline NNPA builtins with POP mnemonics

2024-03-01 Thread Stefan Schulze Frielinghaus
At the moment there are no extended mnemonics for vclfn(h,l) and vcrnf defined in the Principles of Operation. Thus, remove the suffix "s" from the builtins and expanders and introduce a further operand for the data type. gcc/ChangeLog: * config/s390/s390-builtin-types.def: Update to ref

[PATCH] s390: Deprecate some vector builtins

2024-03-01 Thread Stefan Schulze Frielinghaus
According to IBM Open XL C/C++ for z/OS version 1.1 builtins - vec_permi - vec_ctd - vec_ctsl - vec_ctul - vec_ld2f - vec_st2f are deprecated. Also deprecate helper builtins vec_ctd_s64 and vec_ctd_u64. Furthermore, the overloads of vec_insert which make use of a bool vector are deprecated, too

Re: [PATCH] s390: Streamline NNPA builtins with POP mnemonics

2024-03-06 Thread Stefan Schulze Frielinghaus
Since there is no straightforward way to introduce an overload with different return types where we would expand differently depending on an immediate operand, let's drop this patch. On Fri, Mar 01, 2024 at 04:18:31PM +0100, Stefan Schulze Frielinghaus wrote: > At the moment there are no exten

Re: [PATCH] s390: Fix test vector/long-double-to-i64.c

2024-03-12 Thread Stefan Schulze Frielinghaus
On Mon, Mar 11, 2024 at 11:14:04AM +0100, Andreas Krebbel wrote: > On 2/29/24 13:15, Stefan Schulze Frielinghaus wrote: > > Starting with r14-8319-g86de9b66480b71 fwprop improved so that vpdi is > > no longer required. > > > > gcc/testsuite/ChangeLog: > > >

Re: RFC: New mechanism for hard reg operands to inline asm

2024-03-15 Thread Stefan Schulze Frielinghaus
%0,%1\n" : "={r4}" (y) : "{r1}" (y)); return y; } IMHO the input is just fine but the output constraint is misleading and it is not obvious in which register variable y resides after the asm statement. With my current implementation, where I don't bail out, it is register r4 contrary to the decl. Interestingly, the other way around where one register is "aliased" by multiple variables is accepted by vanilla GCC: int foo (int x, int y) { register int a asm ("r1") = x; register int b asm ("r1") = y; return a + b; } Though, probably not intentionally. Cheers, Stefan

[PATCH] analyzer: Bail out on function pointer for -Wanalyzer-allocation-size

2024-03-19 Thread Stefan Schulze Frielinghaus
On s390 pr94688.c is failing due to excess error pr94688.c:6:5: warning: allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] This is because on s390 functions are by default aligned to an 8-byte boundary and during function type construction size

Re: [PATCH] analyzer: Bail out on function pointer for -Wanalyzer-allocation-size

2024-03-21 Thread Stefan Schulze Frielinghaus
On Tue, Mar 19, 2024 at 12:38:34PM -0400, David Malcolm wrote: > On Tue, 2024-03-19 at 16:10 +0100, Stefan Schulze Frielinghaus wrote: > > On s390 pr94688.c is failing due to excess error > > > > pr94688.c:6:5: warning: allocated buffer size is not a multiple of > >

[PATCH] s390: testsuite: Fix abs-4.c

2024-03-21 Thread Stefan Schulze Frielinghaus
gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/abs-4.c: On s390 we also have a copysign optab for long double. Thus, scan 3 instead of 2 times for it. --- Ok for mainline? gcc/testsuite/gcc.dg/tree-ssa/abs-4.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git

[PATCH] s390: testsuite: Fix backprop-6.c

2024-03-22 Thread Stefan Schulze Frielinghaus
gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/backprop-6.c: On s390 we also have a copysign optab for long double. Thus, scan 3 instead of 2 times for it. --- OK for mainline? gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-)

[PATCH] testsuite: Fix copy-headers-8.c

2024-03-26 Thread Stefan Schulze Frielinghaus
This fixes the test on s390x. I'm also seeing test failures for riscv64-suse-linux-gnu, m68k-unknown-linux-gnu, pru-unknown-elf, and powerpc64le-unknown-linux-gnu. However, I didn't check them so this might or might not fix those, too. OK for mainline? gcc/testsuite/ChangeLog: * gcc.dg

Re: [PATCH] s390x: Optimize vector permute with constant indexes

2024-04-09 Thread Stefan Schulze Frielinghaus
; + concat_mat(mat4, mat3, mat5); > +} > +void concat_mat(MATRIX_T m1, MATRIX_T, MATRIX_T m3) { > + int k; > + for (;; concat_mat_i++) { > +concat_mat_j = 0; > +for (; 4; concat_mat_j++) { > + k = 0; > + for (; k < 4; k++) > +m3[concat_mat_i][concat_mat_j] += m1[concat_mat_i][k]; > +} Just nitpicking, if we could come up with a test case which does not involve integer overflows due to non-terminating loops, I would prefer that. Cheers, Stefan > + } > +} > + > +/* { dg-final { scan-assembler-not "vperm" } } */ > -- > 2.39.3 >

[PATCH] s390: testsuite: Fix loop-interchange-16.c

2024-04-11 Thread Stefan Schulze Frielinghaus
Revert parameter max-completely-peel-times to 16, otherwise, the innermost loop is removed and we are left with no loop interchange which this test is all about. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/loop-interchange-16.c: Revert parameter max-completely-peel-times for s390.

[PATCH] testsuite: Fix loop-interchange-16.c

2024-04-11 Thread Stefan Schulze Frielinghaus
Yes, that works, too. Will commit. Thanks, Stefan -- Prevent loop unrolling of the innermost loop because otherwise we are left with no loop interchange for targets like s390 which have a more aggressive loop unrolling strategy. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/loop

[PATCH] s390: Fix TARGET_SECONDARY_RELOAD for non-SYMBOL_REFs

2024-02-29 Thread Stefan Schulze Frielinghaus
RTX X must not necessarily be a SYMBOL_REF and may e.g. be an UNSPEC_GOTENT for which SYMBOL_FLAG_NOTALIGN2_P fails. gcc/ChangeLog: * config/s390/s390.cc (s390_secondary_reload): Guard SYMBOL_FLAG_NOTALIGN2_P. --- gcc/config/s390/s390.cc | 2 +- 1 file changed, 1 insertion(+), 1

[PATCH] s390: Fix tests rosbg_si_srl and rxsbg_si_srl

2024-02-29 Thread Stefan Schulze Frielinghaus
Starting with r14-2047-gd0e891406b16dc two SI mode tests are optimized into DI mode. Thus, the scan-assembler directives fail. For example RTL expression (ior:SI (subreg:SI (lshiftrt:DI (reg:DI 69) (const_int 2 [0x2])) 4) (subreg:SI (reg:DI 68) 4)) is optimized into (ior:DI (ls

[PATCH] s390: Fix test vector/long-double-to-i64.c

2024-02-29 Thread Stefan Schulze Frielinghaus
Starting with r14-8319-g86de9b66480b71 fwprop improved so that vpdi is no longer required. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/long-double-to-i64.c: Fix scan assembler directive. --- .../gcc.target/s390/vector/long-double-to-i64.c | 13 + 1 file chan

Re: [PATCH] s390: Fix TARGET_SECONDARY_RELOAD for non-SYMBOL_REFs

2024-02-29 Thread Stefan Schulze Frielinghaus
On Thu, Feb 29, 2024 at 01:26:54PM +0100, Andreas Schwab wrote: > On Feb 29 2024, Stefan Schulze Frielinghaus wrote: > > > RTX X must not necessarily be a SYMBOL_REF and may e.g. be an > > False friend: s/must not/need not/ Argh I always fall for this ;-) Thanks for pointing

Re: [RFA] [V3] new pass for sign/zero extension elimination

2024-01-04 Thread Stefan Schulze Frielinghaus
I have successfully bootstrapped and regtested the patch on s390. Out of curiosity I also ran some benchmarks which didn't show much changes except in one case which I will have to analyze further. If there is anything interesting I will reach back to you. Cheers, Stefan On Mon, Jan 01,

[PATCH] s390: Fix expansion of vec_step

2023-12-04 Thread Stefan Schulze Frielinghaus
Add missing "s390" while expanding vec_step to __builtin_s390_vec_step. gcc/ChangeLog: * config/s390/vecintrin.h (vec_step): Expand vec_step to __builtin_s390_vec_step. --- gcc/config/s390/vecintrin.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/con

[PATCH 1/3] s390: Recognize further vpdi and vmr{l,h} pattern

2023-11-09 Thread Stefan Schulze Frielinghaus
Deal with cases where vpdi and vmr{l,h} are still applicable if the operands of those instructions are swapped. For example, currently for V2DI foo (V2DI x) { return (V2DI) {x[1], x[0]}; } the assembler sequence vlgvg %r1,%v24,1 vzero %v0 vlvgg %v0,%r1,0 vmrhg %v24,%v0,%v24 is emitte

[PATCH 3/3] s390: Revise vector reverse elements

2023-11-09 Thread Stefan Schulze Frielinghaus
Replace UNSPEC_VEC_ELTSWAP with a vec_select implementation. Furthermore, for a vector reverse elements operation between registers of mode V8HI perform three rotates instead of a vperm operation since the latter involves loading the permutation vector from the literal pool. Prior to z15, instead of

[PATCH] s390: Reduce number of patterns where the condition is false anyway

2023-11-09 Thread Stefan Schulze Frielinghaus
For patterns which make use of two modes, do not build the cross product and then exclude illegal combinations via conditions but rather do not create those in the first place. Here we are following the idea of the attribute TOINTVEC/tointvec and introduce TOINT/toint. Bootstrapped and regtested

[PATCH 2/3] s390: Add expand_perm_reverse_elements

2023-11-09 Thread Stefan Schulze Frielinghaus
Replace expand_perm_with_rot, expand_perm_with_vster, and expand_perm_with_vstbrq with a general implementation expand_perm_reverse_elements. Bootstrapped and regtested on s390. Ok for mainline? gcc/ChangeLog: * config/s390/s390.cc (expand_perm_with_rot): Remove. (expand_perm_re

[PATCH] s390: Fix vec_scatter_element for vectors of floats

2023-11-14 Thread Stefan Schulze Frielinghaus
The offset for vec_scatter_element of floats should be a vector of type UV4SI instead of V4SF. Note, this is an incompatibility change. Bootstrapped on s390. Ok for mainline? gcc/ChangeLog: * config/s390/s390-builtin-types.def: Add/remove types. * config/s390/s390-builtins.def

[PATCH] s390: Fix builtins floating-point convert to/from fixed

2023-11-14 Thread Stefan Schulze Frielinghaus
Remove flags for non-existing operands 2 and 3. Bootstrapped on s390. Ok for mainline? gcc/ChangeLog: * config/s390/s390-builtins.def (s390_vcefb,s390_vcdgb,s390_vcelfb,s390_vcdlgb,s390_vcfeb,s390_vcgdb, s390_vclfeb,s390_vclgdb): Remove flags for non-existing operands

[PATCH] s390: Fix generation of s390-gen-builtins.h

2023-11-15 Thread Stefan Schulze Frielinghaus
By default the preprocessed output includes linemarkers. This leads to an error if -pedantic is used as e.g. during bootstrap: s390-gen-builtins.h:1:3: error: style of line directive is a GCC extension [-Werror] Fixed by omitting linemarkers while generating s390-gen-builtins.h. gcc/ChangeLog:

[PATCH] s390: Streamline NNPA builtins with their LLVM counterparts

2023-11-16 Thread Stefan Schulze Frielinghaus
For the opaque NNP-data type prefer unsigned over signed integer types. gcc/ChangeLog: * config/s390/s390-builtin-types.def: Add/remove types. * config/s390/s390-builtins.def (s390_vclfnhs,s390_vclfnls,s390_vcrnfs,s390_vcfn,s390_vcnf): Replace type V8HI with UV8HI.

Re: [PATCH] s390: Fix builtins floating-point convert to/from fixed

2023-11-27 Thread Stefan Schulze Frielinghaus
Ping. On Tue, Nov 14, 2023 at 04:19:59PM +0100, Stefan Schulze Frielinghaus wrote: > Remove flags for non-existing operands 2 and 3. > > Bootstrapped on s390. Ok for mainline? > > gcc/ChangeLog: > > * config/s390/s390-builtins.def > (s390_vcefb,s390_vcdgb

Re: [PATCH] s390: Fix constraint for insn *cmphi_ccu

2023-11-27 Thread Stefan Schulze Frielinghaus
Ping. On Wed, Oct 25, 2023 at 11:27:33AM +0200, Stefan Schulze Frielinghaus wrote: > Currently for an unsigned 16-bit comparison between memory and an > immediate where the high bit is set, a clc is emitted. This is because > the constant is created for mode HI and therefore sign extend

Re: [PATCH] s390: Streamline NNPA builtins with their LLVM counterparts

2023-11-27 Thread Stefan Schulze Frielinghaus
Ping. On Thu, Nov 16, 2023 at 01:07:30PM +0100, Stefan Schulze Frielinghaus wrote: > For the opaque NNP-data type prefer unsigned over signed integer types. > > gcc/ChangeLog: > > * config/s390/s390-builtin-types.def: Add/remove types. > * config/s390

[PATCH] s390: Fixup builtins vec_rli and verll

2023-11-27 Thread Stefan Schulze Frielinghaus
Commit 248df13b966f46649e16dc3c8c92b263790ef503 restricted the rotate count to immediates. Although the documentation of vec_rli (Vector Element Rotate Left Immediate) can be read as if it were restricted to immediates, this is not the case. Thus, revert this commit. In order to finally allow r

[PATCH] s390: Add missing builtin type

2023-11-27 Thread Stefan Schulze Frielinghaus
One builtin type slipped through the cracks of the last commits. Bootstrapped on s390. Ok for mainline? gcc/ChangeLog: * config/s390/s390-builtin-types.def (BT_FN_UV8HI_UV8HI_UINT): Add missing builtin type. --- gcc/config/s390/s390-builtin-types.def | 1 + 1 file changed, 1 in

Re: PING^5: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant

2023-09-19 Thread Stefan Schulze Frielinghaus
egers. Is there anyone who can shed some light on _why_ such a normal form was chosen? Independent of why such a normal form was chosen, this patch restores the normal form and solves the bootstrap problem for Loongarch. Cheers, Stefan

Re: [PATCH] Hard register asm constraint

2024-06-26 Thread Stefan Schulze Frielinghaus
On Wed, Jun 26, 2024 at 11:10:38AM -0400, Paul Koning wrote: > > > > On Jun 26, 2024, at 8:54 AM, Stefan Schulze Frielinghaus > > wrote: > > > > On Tue, Jun 25, 2024 at 01:02:39PM -0400, Paul Koning wrote: > >> > >> > >>>

Re: [PATCH] Hard register asm constraint

2024-06-27 Thread Stefan Schulze Frielinghaus
On Thu, Jun 27, 2024 at 09:45:32AM +0200, Georg-Johann Lay wrote: > > > Am 24.05.24 um 11:13 Am 25.06.24 um 16:03 schrieb Paul Koning: > > > > > > > On Jun 24, 2024, at 1:50 AM, Stefan Schulze Frielinghaus > > > wrote: > > > > > > Pi

Re: [PATCH] Hard register asm constraint

2024-06-28 Thread Stefan Schulze Frielinghaus
On Fri, Jun 28, 2024 at 11:46:08AM +0200, Georg-Johann Lay wrote: > Am 27.06.24 um 10:51 schrieb Stefan Schulze Frielinghaus: > > On Thu, Jun 27, 2024 at 09:45:32AM +0200, Georg-Johann Lay wrote: > > > Am 24.05.24 um 11:13 Am 25.06.24 um 16:03 schrieb Paul Koning: > > >

[PATCH 2/3] s390: Enable vcond_mask for 128-bit ops

2024-07-01 Thread Stefan Schulze Frielinghaus
In preparation of dropping vcond{,u,eq} optabs https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654690.html enable 128-bit operands for vcond_mask---including integer as well as floating point. This partially fixes PR115519 w.r.t. autovec-long-double-signaling-*.c tests. gcc/ChangeLog:

[PATCH 1/3] s390: Emulate vec_cmp{eq,gt,gtu} for 128-bit integers

2024-07-01 Thread Stefan Schulze Frielinghaus
Mode iterator V_HW enables V1TI for target VXE which means vec_cmpv1tiv1ti becomes available which leads to an ICE since there is no corresponding insn. Fixed by emulating comparisons and enabling mode V1TI unconditionally for V_HW. For the sake of symmetry, I also added TI mode to V_HW since TF

[PATCH 3/3] s390: Drop vcond{,u} expanders

2024-07-01 Thread Stefan Schulze Frielinghaus
Optabs vcond{,u} will be removed for GCC 15. Since regtest shows no fallout, drop the expanders now. gcc/ChangeLog: PR target/114189 * config/s390/vector.md (V_HW2): Remove. (vcond): Remove. (vcondu): Remove. --- Bootstrapped and regtested on s390. Ok for m

[PATCH 0/3] Prepare and drop vcond expanders

2024-07-01 Thread Stefan Schulze Frielinghaus
h is why I would like to make sure that this patch lands first and included it in this series. Stefan Schulze Frielinghaus (3): s390: Emulate vec_cmp{eq,gt,gtu} for 128-bit integers s390: Enable vcond_mask for 128-bit ops s390: Drop vcond{,u} expanders gcc/config/s390/vector.md

[PATCH] s390: Fix output template for movv1qi

2024-07-02 Thread Stefan Schulze Frielinghaus
Although for instructions MVI and MVIY it does not make a difference whether the immediate is interpreted as signed or unsigned, GAS expects unsigned immediates for instruction format SI_URD. gcc/ChangeLog: * config/s390/vector.md (mov): Fix output template for movv1qi. --- Boots
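
The point about signed versus unsigned immediates can be illustrated with a small sketch. It is not the patch itself (the snippet above is truncated before the diff); the helper name `imm8_as_unsigned` is hypothetical and only shows that a one-byte immediate has the same bit pattern under either interpretation, while the unsigned spelling is the one in the range 0..255.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: map a one-byte
   immediate given as a signed value (-128..127) or an unsigned value
   (0..255) to its unsigned spelling.  The stored byte is identical in
   both cases; only the textual form differs.  */
static unsigned
imm8_as_unsigned (int value)
{
  /* Conversion to unsigned is modular, so masking keeps the low byte.  */
  return (unsigned) value & 0xffu;
}
```

For instance, the signed immediate -1 and the unsigned immediate 255 both denote the byte 0xff, so an output template that prints the masked value would presumably satisfy an assembler that insists on unsigned operands.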

[PATCH] s390: Fully exploit vgm, vgbm, vrepi

2024-07-02 Thread Stefan Schulze Frielinghaus
Currently instructions vgm and vrepi are utilized only for constant vectors where the element mode equals the element mode of the corresponding instruction. This patch lifts this restriction by making use of those instructions for constant vectors even if element modes do not coincide. For exampl
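
As a rough illustration of why element modes need not coincide, the sketch below checks whether a 16-byte constant vector is some narrower element replicated across the whole register. This is an assumption-laden toy, not the logic of the patch: the function name `smallest_replicated_elem` is hypothetical, and the real instructions have additional constraints (for example on which immediates they can encode) that are ignored here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the smallest element size in bytes
   (1, 2, 4 or 8) such that the 16-byte constant consists of that
   element repeated, or 16 if no narrower replication exists.  */
static size_t
smallest_replicated_elem (const uint8_t bytes[16])
{
  for (size_t size = 1; size < 16; size *= 2)
    {
      int replicated = 1;
      /* Compare every later element against the first one.  */
      for (size_t off = size; off < 16; off += size)
        if (memcmp (bytes, bytes + off, size) != 0)
          {
            replicated = 0;
            break;
          }
      if (replicated)
        return size;
    }
  return 16;
}
```

Under this view, a vector of four 32-bit elements each equal to 0x01010101 is also a vector of sixteen bytes each equal to 0x01, so a byte-wise replicate could in principle materialize it even though the element modes differ.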

[PATCH] s390: Align *cjump_64 and *icjump_64

2024-07-11 Thread Stefan Schulze Frielinghaus
During machine reorg we optimize backward jumps and transform insns as e.g. (jump_insn 118 117 119 (set (pc) (if_then_else (ne (reg:CCRAW 33 %cc) (const_int 8 [0x8])) (label_ref 134) (pc))) "dec_math_1.f90":204:8 discrim 1 2161 {*cjump_64} (expr

Re: [PATCH] s390: Align *cjump_64 and *icjump_64

2024-07-11 Thread Stefan Schulze Frielinghaus
On Thu, Jul 11, 2024 at 04:29:19PM +0200, Stefan Schulze Frielinghaus wrote: > During machine reorg we optimize backward jumps and transform insns as > e.g. > > (jump_insn 118 117 119 (set (pc) > (if_then_else (ne (reg:CCRAW 33 %cc) >

Re: [PATCH] s390: Align *cjump_64 and *icjump_64

2024-07-11 Thread Stefan Schulze Frielinghaus
On Thu, Jul 11, 2024 at 05:14:58PM +0200, Jakub Jelinek wrote: > On Thu, Jul 11, 2024 at 05:09:41PM +0200, Stefan Schulze Frielinghaus wrote: > > I didn't have the schedule for 11.5 RC in mind which is tomorrow and the > > release a week afterwards. I hope this is still

Re: [PATCH] s390: Align *cjump_64 and *icjump_64

2024-07-11 Thread Stefan Schulze Frielinghaus
On Thu, Jul 11, 2024 at 07:32:17PM +0200, Stefan Schulze Frielinghaus wrote: > On Thu, Jul 11, 2024 at 05:14:58PM +0200, Jakub Jelinek wrote: > > On Thu, Jul 11, 2024 at 05:09:41PM +0200, Stefan Schulze Frielinghaus wrote: > > > I didn't have the schedule for 11.5 RC in min

[PATCH] s390: Fix unresolved iterators bhfgq and xdee

2024-07-16 Thread Stefan Schulze Frielinghaus
Code attribute bhfgq is missing a mapping for TF. This results in unresolved iterators in assembler templates for *bswaptf. With the TF mapping added the base mnemonics vlbr and vstbr are not "used" anymore but only the extended mnemonics (vlbr was interpreted as vlbr; likewise for vstbr). There

[PATCH] s390: testsuite: Fix vcond-shift.c

2024-07-19 Thread Stefan Schulze Frielinghaus
Previously we optimized expressions of the form a < 0 ? -1 : 0 to (signed)a >> 31 during vcond expanding. Since r15-1741-g2ccdd0f22312a1 this is done in match.pd. The implementation in the back end as well as in match.pd are basically the same but still distinct. For the tests in vcond-shift.c t
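
The equivalence mentioned in this message can be shown on a single 32-bit scalar. This is a minimal sketch rather than the vector code the tests exercise; the function names are hypothetical. It relies on right shifts of negative signed values being arithmetic, which is implementation-defined in ISO C and which GCC documents as sign-extending.

```c
#include <stdint.h>

/* Select form: -1 (all bits set) if a is negative, 0 otherwise.  */
static inline int32_t
sign_mask_select (int32_t a)
{
  return a < 0 ? -1 : 0;
}

/* Shift form: an arithmetic right shift by 31 copies the sign bit
   into every bit position, giving the same -1 / 0 result.
   Assumes the compiler implements >> on negative values as an
   arithmetic shift, as GCC does.  */
static inline int32_t
sign_mask_shift (int32_t a)
{
  return a >> 31;
}
```

Both functions return the same value for every 32-bit input, which is why the transformation can be applied either in the back end or in match.pd without changing semantics.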

Re: [PATCH] s390: testsuite: Fix vcond-shift.c

2024-07-19 Thread Stefan Schulze Frielinghaus
On Thu, Jul 18, 2024 at 11:58:10PM -0700, Andrew Pinski wrote: > On Thu, Jul 18, 2024 at 10:31 PM Stefan Schulze Frielinghaus > wrote: > > > > Previously we optimized expressions of the form a < 0 ? -1 : 0 to > > (signed)a >> 31 during vcond expanding. Since r15

Re: [PATCH] s390: Fix unresolved iterators bhfgq and xdee

2024-07-19 Thread Stefan Schulze Frielinghaus
I'm pinging this early since I would like to make sure that it gets into 14.2 RC which is about to be done on Tuesday 23rd July. On Tue, Jul 16, 2024 at 04:50:29PM +0200, Stefan Schulze Frielinghaus wrote: > Code attribute bhfgq is missing a mapping for TF. This results in > unresolve

[PATCH v2] s390: Implement TARGET_NOCE_CONVERSION_PROFITABLE_P [PR109549]

2024-05-17 Thread Stefan Schulze Frielinghaus
I've adapted the patch as follows and will push. Thanks, Stefan -- Consider a NOCE conversion as profitable if there is at least one conditional move. gcc/ChangeLog: * config/s390/s390.cc (TARGET_NOCE_CONVERSION_PROFITABLE_P): Define. (s390_noce_conversion_profita
