Re: [PATCH] s390: Make use of new copysign RTL
On 10/5/23 08:46, Stefan Schulze Frielinghaus wrote:
> gcc/ChangeLog:
>
> * config/s390/s390.md: Make use of new copysign RTL.

Ok. Thanks!

Andreas

> ---
> gcc/config/s390/s390.md | 6 ++
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index 9631b2a8c60..3f29ba21442 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -124,7 +124,6 @@
>
> ; Byte-wise Population Count
> UNSPEC_POPCNT
> - UNSPEC_COPYSIGN
>
> ; Load FP Integer
> UNSPEC_FPINT_FLOOR
> @@ -11918,9 +11917,8 @@
>
> (define_insn "copysign3"
> [(set (match_operand:FP 0 "register_operand" "=f")
> - (unspec:FP [(match_operand:FP 1 "register_operand" "")
> - (match_operand:FP 2 "register_operand" "f")]
> - UNSPEC_COPYSIGN))]
> + (copysign:FP (match_operand:FP 1 "register_operand" "")
> + (match_operand:FP 2 "register_operand" "f")))]
> "TARGET_Z196"
> "cpsdr\t%0,%2,%1"
> [(set_attr "op_type" "RRF")
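For readers less familiar with the operation behind the copysign RTL code: the result takes the magnitude of the first operand and the sign of the second. The following is only a rough C model of that semantics, assuming IEEE 754 binary64 doubles; the helper name copysign_bits and the SIGN_BIT macro are made up for illustration and appear nowhere in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is IEEE 754 binary64, so bit 63 is the sign bit.  */
#define SIGN_BIT (UINT64_C (1) << 63)

/* Hypothetical helper: return MAGNITUDE with the sign bit of SIGN.
   This models the semantics of copysign, not the s390 instruction.  */
static double
copysign_bits (double magnitude, double sign)
{
  uint64_t m, s;
  double result;

  assert (sizeof (double) == sizeof (uint64_t));

  /* memcpy is the portable way to look at the bit pattern.  */
  memcpy (&m, &magnitude, sizeof m);
  memcpy (&s, &sign, sizeof s);

  /* Keep exponent and mantissa of MAGNITUDE, take the sign of SIGN.  */
  m = (m & ~SIGN_BIT) | (s & SIGN_BIT);

  memcpy (&result, &m, sizeof result);
  return result;
}
```

Because only the sign bit is inspected, a negative zero as second operand also yields a negative result.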
Re: [PATCH] s390: Fix expander popcountv8hi2_vx
On 10/16/23 13:20, Stefan Schulze Frielinghaus wrote:
> The normal form of a CONST_INT which represents an integer of a mode
> with fewer bits than in HOST_WIDE_INT is sign extended. This even holds
> for unsigned integers.
>
> This fixes an ICE during cse1 where we bail out at rtl.h:2297 since
> INTVAL (x.first) == sext_hwi (INTVAL (x.first), precision) does not hold.
>
> gcc/ChangeLog:
>
> * config/s390/vector.md (popcountv8hi2_vx): Sign extend each
> unsigned vector element.

Ok. Thanks!

Bye,

Andreas
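The normal-form rule quoted above can be illustrated outside of GCC. The following is a stand-alone sketch, assuming a 64-bit HOST_WIDE_INT; sext64 and canonical_p are hypothetical names that only mimic what sext_hwi and the failing check do, they are not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of sext_hwi: sign-extend the low PREC bits of SRC
   to 64 bits.  Assumes HOST_WIDE_INT is 64 bits wide.  */
static int64_t
sext64 (int64_t src, unsigned int prec)
{
  assert (prec >= 1 && prec <= 64);
  if (prec == 64)
    return src;

  uint64_t mask = (UINT64_C (1) << prec) - 1;
  uint64_t sign = UINT64_C (1) << (prec - 1);
  uint64_t low = (uint64_t) src & mask;

  /* (low ^ sign) - sign copies bit PREC-1 into all higher bits.  */
  return (int64_t) ((low ^ sign) - sign);
}

/* Return nonzero if VAL is already in the sign-extended normal form
   for a PREC-bit mode.  */
static int
canonical_p (int64_t val, unsigned int prec)
{
  return val == sext64 (val, prec);
}
```

For a 16-bit element, the unsigned value 0xffff is therefore not in normal form; its normal form is -1, which is why the expander has to sign extend such constants.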
Re: [PATCH] C++: Fix PR86083
On 06/20/2018 01:41 PM, Andreas Krebbel wrote:
> When turning a user-defined numerical literal into an operator
> invocation the literal needs to be translated to the execution
> character set.
>
> Bootstrapped and regtested on s390x. x86_64 still running.
> Ok to apply if x86_64 is clean?
>
> Bye,
>
> -Andreas-
>
> gcc/cp/ChangeLog:
>
> 2018-06-20 Andreas Krebbel
>
> PR C++/86082
> * parser.c (make_char_string_pack):
> (cp_parser_userdef_numeric_literal):
>
> gcc/testsuite/ChangeLog:
>
> 2018-06-20 Andreas Krebbel
>
> PR C++/86082
> * g++.dg/pr86082.C: New test.

I've tested the patch also on GCC 7 and 8 branch. Ok to apply there as well?

The backport will include the testcase fix from Rainer:
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01601.html

-Andreas-
Re: [PATCH, S390] Increase function alignment to 16 bytes
On 07/12/2018 01:34 PM, Robin Dapp wrote:
> Hi,
>
>> Please skip '+ && !opts->x_optimize_size)'. I'm attaching patch
>> that will
>> set opts->x_flag_align_functions to 0 for -Os. It's part of another batch
>> alignment patches I'm preparing.
>
> done in the attached version and added some tests (which do not all fail
> without the patch as we can get lucky with the alignment).
>
> Regtested on s390x.
>
> Regards
> Robin
>
> --
>
> gcc/ChangeLog:
>
> 2018-07-12 Robin Dapp
>
> * config/s390/s390.c (s390_default_align): Set default function
> alignment to 16.
> (s390_override_options_after_change): Call s390_default_align.
> (s390_option_override_internal): Call s390_default_align.
> (TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): Define.
>
> gcc/testsuite/ChangeLog:
>
> 2018-07-12 Robin Dapp
>
> * gcc.target/s390/function-align1.c: New test.
> * gcc.target/s390/function-align2.c: New test.
> * gcc.target/s390/function-align3.c: New test.
>

Ok to apply. Thanks!

Andreas
[PATCH] S/390: libstdc++: 64 and 32 bit baseline update
Obviously I missed doing a refresh for some time already. Do the updates look reasonable? Andreas libstdc++-v3/ChangeLog: 2018-07-13 Andreas Krebbel * config/abi/post/s390-linux-gnu/baseline_symbols.txt: Update. * config/abi/post/s390x-linux-gnu/32/baseline_symbols.txt: Update. * config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update. --- .../config/abi/post/s390-linux-gnu/baseline_symbols.txt| 14 ++ .../abi/post/s390x-linux-gnu/32/baseline_symbols.txt | 14 ++ .../config/abi/post/s390x-linux-gnu/baseline_symbols.txt | 14 ++ 3 files changed, 42 insertions(+) diff --git a/libstdc++-v3/config/abi/post/s390-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/s390-linux-gnu/baseline_symbols.txt index 8deb2b2..3f5dee6 100644 --- a/libstdc++-v3/config/abi/post/s390-linux-gnu/baseline_symbols.txt +++ b/libstdc++-v3/config/abi/post/s390-linux-gnu/baseline_symbols.txt @@ -444,6 +444,7 @@ FUNC:_ZNKSt13basic_fstreamIwSt11char_traitsIwEE7is_openEv@GLIBCXX_3.4 FUNC:_ZNKSt13basic_istreamIwSt11char_traitsIwEE6gcountEv@@GLIBCXX_3.4 FUNC:_ZNKSt13basic_istreamIwSt11char_traitsIwEE6sentrycvbEv@@GLIBCXX_3.4 FUNC:_ZNKSt13basic_ostreamIwSt11char_traitsIwEE6sentrycvbEv@@GLIBCXX_3.4 +FUNC:_ZNKSt13random_device13_M_getentropyEv@@GLIBCXX_3.4.25 FUNC:_ZNKSt13runtime_error4whatEv@@GLIBCXX_3.4 FUNC:_ZNKSt14basic_ifstreamIcSt11char_traitsIcEE5rdbufEv@@GLIBCXX_3.4 FUNC:_ZNKSt14basic_ifstreamIcSt11char_traitsIcEE7is_openEv@@GLIBCXX_3.4.5 @@ -1859,10 +1860,12 @@ FUNC:_ZNSt11char_traitsIcE2eqERKcS2_@@GLIBCXX_3.4.5 FUNC:_ZNSt11char_traitsIcE2eqERKcS2_@GLIBCXX_3.4 FUNC:_ZNSt11char_traitsIwE2eqERKwS2_@@GLIBCXX_3.4.5 FUNC:_ZNSt11char_traitsIwE2eqERKwS2_@GLIBCXX_3.4 +FUNC:_ZNSt11logic_errorC1EOS_@@GLIBCXX_3.4.26 FUNC:_ZNSt11logic_errorC1EPKc@@GLIBCXX_3.4.21 FUNC:_ZNSt11logic_errorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21 FUNC:_ZNSt11logic_errorC1ERKS_@@GLIBCXX_3.4.21 FUNC:_ZNSt11logic_errorC1ERKSs@@GLIBCXX_3.4 +FUNC:_ZNSt11logic_errorC2EOS_@@GLIBCXX_3.4.26 
FUNC:_ZNSt11logic_errorC2EPKc@@GLIBCXX_3.4.21 FUNC:_ZNSt11logic_errorC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21 FUNC:_ZNSt11logic_errorC2ERKS_@@GLIBCXX_3.4.21 @@ -1870,6 +1873,7 @@ FUNC:_ZNSt11logic_errorC2ERKSs@@GLIBCXX_3.4 FUNC:_ZNSt11logic_errorD0Ev@@GLIBCXX_3.4 FUNC:_ZNSt11logic_errorD1Ev@@GLIBCXX_3.4 FUNC:_ZNSt11logic_errorD2Ev@@GLIBCXX_3.4 +FUNC:_ZNSt11logic_erroraSEOS_@@GLIBCXX_3.4.26 FUNC:_ZNSt11logic_erroraSERKS_@@GLIBCXX_3.4.21 FUNC:_ZNSt11range_errorC1EPKc@@GLIBCXX_3.4.21 FUNC:_ZNSt11range_errorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21 @@ -2230,10 +2234,12 @@ FUNC:_ZNSt13random_device7_M_finiEv@@GLIBCXX_3.4.18 FUNC:_ZNSt13random_device7_M_initERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21 FUNC:_ZNSt13random_device7_M_initERKSs@@GLIBCXX_3.4.18 FUNC:_ZNSt13random_device9_M_getvalEv@@GLIBCXX_3.4.18 +FUNC:_ZNSt13runtime_errorC1EOS_@@GLIBCXX_3.4.26 FUNC:_ZNSt13runtime_errorC1EPKc@@GLIBCXX_3.4.21 FUNC:_ZNSt13runtime_errorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21 FUNC:_ZNSt13runtime_errorC1ERKS_@@GLIBCXX_3.4.21 FUNC:_ZNSt13runtime_errorC1ERKSs@@GLIBCXX_3.4 +FUNC:_ZNSt13runtime_errorC2EOS_@@GLIBCXX_3.4.26 FUNC:_ZNSt13runtime_errorC2EPKc@@GLIBCXX_3.4.21 FUNC:_ZNSt13runtime_errorC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE@@GLIBCXX_3.4.21 FUNC:_ZNSt13runtime_errorC2ERKS_@@GLIBCXX_3.4.21 @@ -2241,6 +2247,7 @@ FUNC:_ZNSt13runtime_errorC2ERKSs@@GLIBCXX_3.4 FUNC:_ZNSt13runtime_errorD0Ev@@GLIBCXX_3.4 FUNC:_ZNSt13runtime_errorD1Ev@@GLIBCXX_3.4 FUNC:_ZNSt13runtime_errorD2Ev@@GLIBCXX_3.4 +FUNC:_ZNSt13runtime_erroraSEOS_@@GLIBCXX_3.4.26 FUNC:_ZNSt13runtime_erroraSERKS_@@GLIBCXX_3.4.21 FUNC:_ZNSt14basic_ifstreamIcSt11char_traitsIcEE4openEPKcSt13_Ios_Openmode@@GLIBCXX_3.4 FUNC:_ZNSt14basic_ifstreamIcSt11char_traitsIcEE4openERKNSt7__cxx1112basic_stringIcS1_SaIcEEESt13_Ios_Openmode@@GLIBCXX_3.4.21 @@ -3017,6 +3024,7 @@ 
FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6assignERKS4_@@GLIBCXX FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6assignERKS4_mm@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6assignESt16initializer_listIcE@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6assignEmc@@GLIBCXX_3.4.21 +FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6insertEN9__gnu_cxx17__normal_iteratorIPKcS4_EESt16initializer_listIcE@@GLIBCXX_3.4.26 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6insertEN9__gnu_cxx17__normal_iteratorIPKcS4_EEc@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6insertEN9__gnu_cxx17__normal_iteratorIPKcS4_EEmc@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6insertEN9__gnu_cxx17__normal_iteratorIPcS4_EESt16initializer_listIcE
Re: [PATCH] S/390: libstdc++: 64 and 32 bit baseline update
On 07/13/2018 04:58 PM, Andreas Schwab wrote:
> On Jul 13 2018, Andreas Krebbel wrote:
>
>> @@ -5645,3 +5657,5 @@ OBJECT:8:_ZTTSi@@GLIBCXX_3.4
>> OBJECT:8:_ZTTSo@@GLIBCXX_3.4
>> OBJECT:8:_ZTTSt13basic_istreamIwSt11char_traitsIwEE@@GLIBCXX_3.4
>> OBJECT:8:_ZTTSt13basic_ostreamIwSt11char_traitsIwEE@@GLIBCXX_3.4
>> +TLS:4:_ZSt11__once_call@@GLIBCXX_3.4.11
>> +TLS:4:_ZSt15__once_callable@@GLIBCXX_3.4.11
>
> You should not have any TLS entries.

Ok, thanks. I've committed the patch with these entries removed.

Andreas
Re: [PATCH, S390] Avoid LA with base and index on z13
On 07/16/2018 01:02 PM, Robin Dapp wrote:
>> But on zEC12 LA works pretty much the same as on z13/z14, it is
>> indeed not cracked, but still a 2-cycle instruction when using
>> an index register. So I guess the change really should apply
>> to zEC12 as well, and this could be as simple as changing the
>> above line to:
>>
>> if (addr.indx && s390_tune >= PROCESSOR_2817_Z196)
>>
>> (Note that "addr.base && addr.indx" is the same as just checking
>> for addr.indx, since s390_decompose_address will never fill in
>> *just* an index.)
>
> Good point, I adapted the patch and changed the comment.
>
> Regards
> Robin
>
> --
>
> gcc/ChangeLog:
>
> 2018-07-16 Robin Dapp
>
> * config/s390/s390.c (preferred_la_operand_p): Do not use
> LA with index register on z196 or later.
>

Ok to apply. Thanks!

Andreas
Re: [PATCH 1/3] S/390: Implement -mfentry
On 07/16/2018 09:48 AM, Ilya Leoshkevich wrote:
> This is the counterpart of the i386 feature introduced by
> 39a5a6a4: Add direct support for Linux kernel __fentry__ patching.
>
> On i386, the difference between mcount and fentry is that fentry
> comes before the prolog. On s390 mcount already comes before the
> prolog, but takes 4 instructions. This patch introduces the more
> efficient implementation (just 1 instruction) and puts it under
> -mfentry flag.
>
> The produced code is compatible only with newer glibc versions,
> which provide the __fentry__ symbol and do not clobber %r0 when
> resolving lazily bound functions. Because 31-bit PLT stubs assume
> %r12 contains GOT address, which is not the case when the code runs
> before the prolog, -mfentry is allowed only for 64-bit code.
>
> Also, code compiled with -mfentry cannot be used for the nested C
> functions, since they both use %r0. In this case instrumentation is
> not inserted, and a new warning is issued for each affected nested
> function.
>
> * gcc/common.opt: Add the new warning.
> * gcc/config/s390/s390.c (s390_function_profiler): Emit
> "brasl %r0,__fentry__" when -mfentry is specified.
> (s390_option_override_internal): Disallow -mfentry for
> 31-bit CPUs.
> * gcc/config/s390/s390.opt: Add the new option.
> * gcc/testsuite/gcc.target/s390/mfentry-m64.c:
> New testcase.

Thanks! I've committed your patch with a modified changelog entry.

There are several ChangeLog files in the GCC source tree. Paths have to be
relative to these. There is e.g. a separate ChangeLog file for the testsuite.

Bye, Andreas > --- > gcc/common.opt | 5 + > gcc/config/s390/s390.c | 18 -- > gcc/config/s390/s390.opt| 5 + > gcc/testsuite/gcc.target/s390/mfentry-m64.c | 8 > 4 files changed, 34 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/s390/mfentry-m64.c > > diff --git a/gcc/common.opt b/gcc/common.opt > index c29abdb5cb1..4d031e81b09 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -571,6 +571,11 @@ Wattribute-alias > Common Var(warn_attributes) Init(1) Warning > Warn about type safety and similar errors in attribute alias and related. > > +Wcannot-profile > +Common Var(warn_cannot_profile) Init(1) Warning > +Warn when profiling instrumentation was requested, but could not be applied > to > +a certain function. > + > Wcast-align > Common Var(warn_cast_align) Warning > Warn about pointer casts which increase alignment. > diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c > index 23c3f3db621..3a406b955a0 100644 > --- a/gcc/config/s390/s390.c > +++ b/gcc/config/s390/s390.c > @@ -13144,14 +13144,22 @@ s390_function_profiler (FILE *file, int labelno) >op[3] = gen_rtx_SYMBOL_REF (Pmode, label); >SYMBOL_REF_FLAGS (op[3]) = SYMBOL_FLAG_LOCAL; > > - op[4] = gen_rtx_SYMBOL_REF (Pmode, "_mcount"); > + op[4] = gen_rtx_SYMBOL_REF (Pmode, flag_fentry ? "__fentry__" : "_mcount"); >if (flag_pic) > { >op[4] = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op[4]), UNSPEC_PLT); >op[4] = gen_rtx_CONST (Pmode, op[4]); > } > > - if (TARGET_64BIT) > + if (flag_fentry) > +{ > + if (cfun->static_chain_decl) > +warning (OPT_Wcannot_profile, "nested functions cannot be profiled " > + "with -mfentry on s390"); > + else > +output_asm_insn ("brasl\t0,%4", op); > +} > + else if (TARGET_64BIT) > { >output_asm_insn ("stg\t%0,%1", op); >output_asm_insn ("larl\t%2,%3", op); > @@ -15562,6 +15570,12 @@ s390_option_override_internal (bool main_args_p, >/* Call target specific restore function to do post-init work. At the > moment, > this just sets opts->x_s390_cost_pointer. 
*/ >s390_function_specific_restore (opts, NULL); > + > + /* Check whether -mfentry is supported. It cannot be used in 31-bit mode, > + because 31-bit PLT stubs assume that %r12 contains GOT address, which is > + not the case when the code runs before the prolog. */ > + if (opts->x_flag_fentry && !TARGET_64BIT) > +error ("-mfentry is supported only for 64-bit CPUs"); > } > > static void > diff --git a/gcc/config/s390/s390.opt b/gcc/config/s390/s390.opt > index eb16f9c821f..59e97d031b4 100644 > --- a/gcc/config/s390/s390.opt > +++ b/gcc/config/s390/s390.opt > @@ -293,3 +293,8 @@ locations which have been patched as part of using one of > the > -mindirect-branch* or -mfunction-return* options. The sections > consist of an array of 32 bit elements. Each entry holds the offset > from the entry to the patched location. > + > +mfentry > +Target Report Var(flag_fentry) > +Emit profiling counter call at function entry before prologue. The compiled > +code will require a 64-bit CPU and glibc 2.29 or newer to run. > diff --git a/gcc/testsuite/gcc.target/s390
Re: [PATCH 2/3] S/390: Implement -mrecord-mcount
On 07/16/2018 09:48 AM, Ilya Leoshkevich wrote:
> This is the counterpart of the i386 feature introduced by
> 39a5a6a4: Add direct support for Linux kernel __fentry__ patching.
>
> * gcc/config/s390/s390.c (s390_function_profiler): Generate
> __mcount_loc section.
> * gcc/config/s390/s390.opt: Add the new option.
> * gcc/testsuite/gcc.target/s390/mrecord-mcount.c:
> New testcase.

Applied. Thanks!

Andreas
Re: [PATCH 3/3] S/390: Implement -mnop-mcount
On 07/16/2018 09:48 AM, Ilya Leoshkevich wrote:
> This is the counterpart of the i386 feature introduced by
> 39a5a6a4: Add direct support for Linux kernel __fentry__ patching.
>
> On i386 the profiler call sequence always consists of 1 call
> instruction, so -mnop-mcount generates a single nop with the same
> length as a call. For S/390 longer sequences may be used in some
> cases, so -mnop-mcount generates the corresponding amount of nops.
>
> * gcc/config/s390/s390.c (s390_function_profiler): Generate
> nops instead of profiler call sequences.
> * gcc/config/s390/s390.opt: Add the new option.
> * gcc/testsuite/gcc.target/s390/mnop-mcount-m31-fpic.c:
> New testcase.
> * gcc/testsuite/gcc.target/s390/mnop-mcount-m31-mzarch.c
> New testcase.
> * gcc/testsuite/gcc.target/s390/mnop-mcount-m31.c
> New testcase.
> * gcc/testsuite/gcc.target/s390/mnop-mcount-m64-mfentry.c
> New testcase.
> * gcc/testsuite/gcc.target/s390/mnop-mcount-m64.c
> New testcase.

Applied. Thanks!

Andreas
Re: [PATCH] S/390: Add CFI for mcount call sequences
On 07/17/2018 12:48 PM, Ilya Leoshkevich wrote:
> Fixes unwind for mcount.
>
> 2018-07-17 Ilya Leoshkevich
>
> * config/s390/s390.c (s390_function_profiler):
> Generate CFI.

Applied. Thanks!

Andreas
[Committed] S/390: Don't emit prefetch instructions for clrmem
From: Andreas Krebbel gcc/ChangeLog: 2018-07-31 Andreas Krebbel * config/s390/s390.c (s390_expand_setmem): Make the unrolling to depend on whether prefetch instructions will be emitted or not. Use TARGET_SETMEM_PFD for checking whether prefetch instructions will be emitted or not. * config/s390/s390.h (TARGET_SETMEM_PREFETCH_DISTANCE) (TARGET_SETMEM_PFD): New macros. gcc/testsuite/ChangeLog: 2018-07-31 Andreas Krebbel * gcc.target/s390/memset-1.c: Improve testcase. --- gcc/config/s390/s390.c | 22 + gcc/config/s390/s390.h | 10 gcc/testsuite/gcc.target/s390/memset-1.c | 81 3 files changed, 84 insertions(+), 29 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index a579e9d..ec588a2 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -5499,12 +5499,15 @@ s390_expand_setmem (rtx dst, rtx len, rtx val) /* Expand setmem/clrmem for a constant length operand without a loop if it will be shorter that way. - With a constant length and without pfd argument a - clrmem loop is 32 bytes -> 5.3 * xc - setmem loop is 36 bytes -> 3.6 * (mvi/stc + mvc) */ + clrmem loop (with PFD)is 30 bytes -> 5 * xc + clrmem loop (without PFD) is 24 bytes -> 4 * xc + setmem loop (with PFD)is 38 bytes -> ~4 * (mvi/stc + mvc) + setmem loop (without PFD) is 32 bytes -> ~4 * (mvi/stc + mvc) */ if (GET_CODE (len) == CONST_INT - && ((INTVAL (len) <= 256 * 5 && val == const0_rtx) - || INTVAL (len) <= 257 * 3) + && ((val == const0_rtx + && (INTVAL (len) <= 256 * 4 + || (INTVAL (len) <= 256 * 5 && TARGET_SETMEM_PFD(val,len + || (val != const0_rtx && INTVAL (len) <= 257 * 4)) && (!TARGET_MVCLE || INTVAL (len) <= 256)) { HOST_WIDE_INT o, l; @@ -5618,12 +5621,11 @@ s390_expand_setmem (rtx dst, rtx len, rtx val) emit_label (loop_start_label); - if (TARGET_Z10 - && (GET_CODE (len) != CONST_INT || INTVAL (len) > 1024)) + if (TARGET_SETMEM_PFD (val, len)) { - /* Issue a write prefetch for the +4 cache line. 
*/ - rtx prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, -GEN_INT (1024)), + /* Issue a write prefetch. */ + rtx distance = GEN_INT (TARGET_SETMEM_PREFETCH_DISTANCE); + rtx prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, distance), const1_rtx, const0_rtx); emit_insn (prefetch); PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true; diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h index 71a12b8..c6aedcd 100644 --- a/gcc/config/s390/s390.h +++ b/gcc/config/s390/s390.h @@ -181,6 +181,16 @@ enum processor_flags #define TARGET_AVOID_CMP_AND_BRANCH (s390_tune == PROCESSOR_2817_Z196) +/* Issue a write prefetch for the +4 cache line. */ +#define TARGET_SETMEM_PREFETCH_DISTANCE 1024 + +/* Expand to a C expressions evaluating to true if a setmem to VAL of + length LEN should be emitted using prefetch instructions. */ +#define TARGET_SETMEM_PFD(VAL,LEN) \ + (TARGET_Z10 \ + && (s390_tune < PROCESSOR_2964_Z13 || (VAL) != const0_rtx) \ + && (!CONST_INT_P (LEN) || INTVAL ((LEN)) > TARGET_SETMEM_PREFETCH_DISTANCE)) + /* Run-time target specification. */ /* Defaults for option flags defined only on some subtargets. */ diff --git a/gcc/testsuite/gcc.target/s390/memset-1.c b/gcc/testsuite/gcc.target/s390/memset-1.c index 7b43b97c..3e201df 100644 --- a/gcc/testsuite/gcc.target/s390/memset-1.c +++ b/gcc/testsuite/gcc.target/s390/memset-1.c @@ -2,16 +2,23 @@ without loop statements. 
*/ /* { dg-do compile } */ -/* { dg-options "-O3 -mzarch" } */ +/* { dg-options "-O3 -mzarch -march=z13" } */ -/* 1 mvc */ +/* 1 stc */ +void +*memset0(void *s, int c) +{ + return __builtin_memset (s, c, 1); +} + +/* 1 stc 1 mvc */ void *memset1(void *s, int c) { return __builtin_memset (s, c, 42); } -/* 3 mvc */ +/* 3 stc 3 mvc */ void *memset2(void *s, int c) { @@ -25,55 +32,62 @@ void return __builtin_memset (s, c, 0); } -/* mvc */ +/* 1 stc 1 mvc */ void *memset4(void *s, int c) { return __builtin_memset (s, c, 256); } -/* 2 mvc */ +/* 2 stc 2 mvc */ void *memset5(void *s, int c) { return __builtin_memset (s, c, 512); } -/* still 2 mvc through the additional first byte */ +/* 2 stc 2 mvc - still due to the stc bytes */ void *memset6(void *s, int c) { return __builtin_memset (s, c, 514); } -/* 3 mvc */ +/* 3 stc 2 mvc */ void *memset7(void *s, int
Re: [PATCH] s390: fix htm-builtins test cases
On 10/25/23 16:50, Juergen Christ wrote: > Transactional and non-transactional stores to the same cache line cause > transactions to abort on newer generations. Add sufficient padding to make > sure another cache line is used. > > Tested on s390. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/htm-builtins-1.c: Fix. > * gcc.target/s390/htm-builtins-2.c: Fix. Ok. Thanks! Andreas > > Signed-off-by: Juergen Christ > --- > gcc/testsuite/gcc.target/s390/htm-builtins-1.c | 4 +++- > gcc/testsuite/gcc.target/s390/htm-builtins-2.c | 4 +++- > 2 files changed, 6 insertions(+), 2 deletions(-) > > diff --git a/gcc/testsuite/gcc.target/s390/htm-builtins-1.c > b/gcc/testsuite/gcc.target/s390/htm-builtins-1.c > index ff43be9fe736..4f95bf3accaa 100644 > --- a/gcc/testsuite/gcc.target/s390/htm-builtins-1.c > +++ b/gcc/testsuite/gcc.target/s390/htm-builtins-1.c > @@ -53,9 +53,11 @@ __attribute__ ((aligned(256))) struct > __attribute__ ((aligned(256))) struct > { >volatile uint64_t c1; > + char pad1[256 - sizeof(uint64_t)]; >volatile uint64_t c2; > + char pad2[256 - sizeof(uint64_t)]; >volatile uint64_t c3; > -} counters = { 0, 0, 0 }; > +} counters = { 0 }; > > /* local helper functions - > */ > > diff --git a/gcc/testsuite/gcc.target/s390/htm-builtins-2.c > b/gcc/testsuite/gcc.target/s390/htm-builtins-2.c > index bb9d346ea560..2e838caacc8c 100644 > --- a/gcc/testsuite/gcc.target/s390/htm-builtins-2.c > +++ b/gcc/testsuite/gcc.target/s390/htm-builtins-2.c > @@ -94,9 +94,11 @@ float global_float_3 = 0.0; > __attribute__ ((aligned(256))) struct > { >volatile uint64_t c1; > + char pad1[256 - sizeof(uint64_t)]; >volatile uint64_t c2; > + char pad2[256 - sizeof(uint64_t)]; >volatile uint64_t c3; > -} counters = { 0, 0, 0 }; > +} counters = { 0 }; > > /* local helper functions - > */ >
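The effect of the added padding can be checked with offsetof. This is a small sketch only, assuming a 256-byte cache line (the value used by the aligned(256) attribute in the test); struct padded_counters and LINE_SIZE are illustrative names, not taken from the testcases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size, matching the aligned(256) in the testcases.  */
#define LINE_SIZE 256

/* Illustrative copy of the padded layout from the patch.  */
struct padded_counters
{
  volatile uint64_t c1;
  char pad1[LINE_SIZE - sizeof (uint64_t)];
  volatile uint64_t c2;
  char pad2[LINE_SIZE - sizeof (uint64_t)];
  volatile uint64_t c3;
} __attribute__ ((aligned (LINE_SIZE)));

/* Each counter has to start on a cache line of its own.  */
static_assert (offsetof (struct padded_counters, c2) % LINE_SIZE == 0,
	       "c2 does not start a new cache line");
static_assert (offsetof (struct padded_counters, c3) % LINE_SIZE == 0,
	       "c3 does not start a new cache line");

/* Return the index of the cache line an offset falls into, relative
   to the start of the (aligned) structure.  */
static size_t
line_of (size_t offset)
{
  return offset / LINE_SIZE;
}
```

With this layout a transactional store to one counter and a non-transactional store to another can no longer hit the same line, provided the real line size is not larger than the assumed one.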
Re: [PATCH] S/390: Use UNSPEC_GET_TP for thread pointer loads
On 23.10.19 13:02, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.
>
> gcc/ChangeLog:
>
> 2019-10-21 Ilya Leoshkevich
>
> * config/s390/s390.c (s390_get_thread_pointer): Use
> gen_get_thread_pointer.
> (s390_expand_split_stack_prologue): Likewise.
> * config/s390/s390.md (UNSPEC_GET_TP): New UNSPEC.
> (*get_tp_31): New 31-bit splitter for UNSPEC_GET_TP.
> (*get_tp_64): New 64-bit splitter for UNSPEC_GET_TP.
> (get_thread_pointer): Use UNSPEC_GET_TP, use
> parameterized name.
>
> gcc/testsuite/ChangeLog:
>
> 2019-10-21 Ilya Leoshkevich
>
> * gcc.target/s390/load-thread-pointer-once-2.c: New test.

Ok. Thanks!

Andreas
Re: [PR testsuite/91842] Skip gcc.dg/ipa/ipa-sra-19.c on power
On 02.10.19 17:06, Martin Jambor wrote: > Hi, > > I seem to remember I minimized gcc.dg/ipa/ipa-sra-19.c on power but > perhaps I am wrong because the testcase fails there with a > power-specific error: > > gcc.dg/ipa/ipa-sra-19.c:19:3: error: AltiVec argument passed to unprototyped > function > > I am going to simply skip it there with the following patch, which I > hope is obvious. Tested by running ipa.exp on both ppc64le-linux and > x86_64-linux. > > Thanks, > > Martin > > > diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c > b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c > index adebaa5f5e1..d219411d8ba 100644 > --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c > +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c > @@ -1,5 +1,6 @@ > /* { dg-do compile } */ > /* { dg-options "-O2" } */ > +/* { dg-skip-if "" { powerpc*-*-* } } */ > > typedef int __attribute__((__vector_size__(16))) vectype; > > I ran into the same problem on IBM Z. Is it important for the testcase to leave the argument list of k unspecified or would it be ok to add it? diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c index d219411d8ba..d9dcd33cb76 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c @@ -5,7 +5,7 @@ typedef int __attribute__((__vector_size__(16))) vectype; vectype dk(); -vectype k(); +vectype k(vectype); int b; vectype *j;
Re: [PR testsuite/91842] Skip gcc.dg/ipa/ipa-sra-19.c on power
On 24.10.19 15:26, Martin Jambor wrote: > Hi, > > On Thu, Oct 24 2019, Andreas Krebbel wrote: >> On 02.10.19 17:06, Martin Jambor wrote: >>> Hi, >>> >>> I seem to remember I minimized gcc.dg/ipa/ipa-sra-19.c on power but >>> perhaps I am wrong because the testcase fails there with a >>> power-specific error: >>> >>> gcc.dg/ipa/ipa-sra-19.c:19:3: error: AltiVec argument passed to >>> unprototyped function >>> >>> I am going to simply skip it there with the following patch, which I >>> hope is obvious. Tested by running ipa.exp on both ppc64le-linux and >>> x86_64-linux. >>> >>> Thanks, >>> >>> Martin >>> >>> >>> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c >>> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c >>> index adebaa5f5e1..d219411d8ba 100644 >>> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c >>> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c >>> @@ -1,5 +1,6 @@ >>> /* { dg-do compile } */ >>> /* { dg-options "-O2" } */ >>> +/* { dg-skip-if "" { powerpc*-*-* } } */ >>> >>> typedef int __attribute__((__vector_size__(16))) vectype; >>> >>> >> >> I ran into the same problem on IBM Z. Is it important for the testcase to >> leave the argument list of >> k unspecified or would it be ok to add it? > > I wanted to write to you that the un-prototypedness is on purpose and > essential to test what the bug was in the past but this time I actually > managed to find the associated fix in my ipa-sra branch and found out > that I mis-remembered, that it is not the case. Sorry for not doing > that before. I believe the patch is OK then and we can even remove the > dg-skip-if I added. And by that I mean that although I'm not a > reviewer, I would consider it obvious. Will you do it or should I take > care of it? I will do it. Thanks! 
Andreas > > Thanks, > > Martin > > >> >> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c >> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c >> index d219411d8ba..d9dcd33cb76 100644 >> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c >> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c >> @@ -5,7 +5,7 @@ >> typedef int __attribute__((__vector_size__(16))) vectype; >> >> vectype dk(); >> -vectype k(); >> +vectype k(vectype); >> >> int b; >> vectype *j;
[Committed] ipa-sra-19.c: Avoid unprototyped function
Power and IBM Z require a function prototype if a vector argument is
passed. Complete the prototype of k to prevent errors from being
triggered on these platforms.

Committed after the discussion here:
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01737.html

gcc/testsuite/ChangeLog:

2019-10-24 Andreas Krebbel

* gcc.dg/ipa/ipa-sra-19.c: Remove dg-skip-if. Add argument type
to prototype of k.
---
 gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
index d219411d8ba..6186d891a29 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
@@ -1,11 +1,10 @@
 /* { dg-do compile } */
 /* { dg-options "-O2" } */
-/* { dg-skip-if "" { powerpc*-*-* } } */
 
 typedef int __attribute__((__vector_size__(16))) vectype;
 
 vectype dk();
-vectype k();
+vectype k(vectype);
 
 int b;
 vectype *j;
-- 
2.23.0
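As a small illustration of the prototyped form, here is a sketch using the GCC vector extension. It is not the testcase itself: the body of k and the helper first_lane_doubled are invented for this example, and it assumes a compiler that supports __vector_size__ and vector subscripting (GCC or Clang).

```c
#include <assert.h>

typedef int __attribute__ ((__vector_size__ (16))) vectype;

/* Prototyped declaration: the vector parameter type is visible at every
   call site, so targets that pass vectors specially know what to do.  */
vectype k (vectype);

/* Invented body, only here to make the example complete.  */
vectype
k (vectype v)
{
  return v + v;
}

/* Hypothetical helper: pass a vector through k and read back lane 0.  */
static int
first_lane_doubled (int x)
{
  vectype v = { x, 0, 0, 0 };
  vectype r = k (v);
  return r[0];
}
```

With the old declaration "vectype k();" the same call would be a call to an unprototyped function, which is what Power and IBM Z reject for vector arguments.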
[Committed 0/4] IBM Z: Fix a few testsuite problems
Andreas Krebbel (4):
  IBM Z: Use tree_fits_uhwi_p in vector_alignment hook
  IBM Z: Fix testsuite useable_hw check
  IBM Z: gen-vect-11/32: Set min-vect-loop-bound param back to default
  IBM Z: gen-vect-26/28: Vectorizing without peeling is ok for Z

 gcc/config/s390/s390.c | 8 +++-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c | 6 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c | 5 +++--
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c | 5 +++--
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c | 4
 gcc/testsuite/gcc.target/s390/s390.exp | 22 -
 6 files changed, 35 insertions(+), 15 deletions(-)

-- 
2.23.0
[PATCH 2/4] IBM Z: Fix testsuite useable_hw check
This fixes various issues with the useable_hw check in s390.exp. The
check is supposed to verify whether a testcase can be run on the
current hardware.

- the test never returned true for -m31 because vzero is not available
  in ESA mode and -m31 defaults to -mesa
- the missing v0 clobber on the vzero instruction made the check fail
  if the stack pointer got saved in f0
- the lcbb instruction used for checking whether we are on a z13 also
  requires vx. Replace it with an instruction from the generic
  instruction set extensions.
- no support for z14 and z15 so far

gcc/testsuite/ChangeLog:

2019-11-05 Andreas Krebbel

* gcc.target/s390/s390.exp
(check_effective_target_s390_useable_hw): Add inline asm for z14
and z15. Replace instruction for z13 with lochiz. Add register
clobbers. Check also for __zarch__ when doing the __VX__ test.
---
 gcc/testsuite/gcc.target/s390/s390.exp | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/testsuite/gcc.target/s390/s390.exp b/gcc/testsuite/gcc.target/s390/s390.exp
index 925eb568832..b4057b00f14 100644
--- a/gcc/testsuite/gcc.target/s390/s390.exp
+++ b/gcc/testsuite/gcc.target/s390/s390.exp
@@ -87,18 +87,22 @@ proc check_effective_target_s390_useable_hw { } {
  int main (void)
  {
    asm (".machinemode zarch" : : );
-  #if __ARCH__ >= 11
-   asm ("lcbb %%r2,0(%%r15),0" : : );
+  #if __ARCH__ >= 13
+   asm ("ncrk %%r2,%%r2,%%r2" : : : "r2");
+  #elif __ARCH__ >= 12
+   asm ("agh %%r2,0(%%r15)" : : : "r2");
+  #elif __ARCH__ >= 11
+   asm ("lochiz %%r2,42" : : : "r2");
   #elif __ARCH__ >= 10
-   asm ("risbgn %%r2,%%r2,0,0,0" : : );
+   asm ("risbgn %%r2,%%r2,0,0,0" : : : "r2");
   #elif __ARCH__ >= 9
-   asm ("sgrk %%r2,%%r2,%%r2" : : );
+   asm ("sgrk %%r2,%%r2,%%r2" : : : "r2");
   #elif __ARCH__ >= 8
-   asm ("rosbg %%r2,%%r2,0,0,0" : : );
+   asm ("rosbg %%r2,%%r2,0,0,0" : : : "r2");
   #elif __ARCH__ >= 7
-   asm ("nilf %%r2,0" : : );
+   asm ("nilf %%r2,0" : : : "r2");
   #elif __ARCH__ >= 6
-   asm ("lay %%r2,0(%%r15)" : : );
+   asm ("lay %%r2,0(%%r15)" : : : "r2");
   #elif __ARCH__ >= 5
    asm ("tam" : : );
   #endif
@@ -108,8 +112,8 @@ proc check_effective_target_s390_useable_hw { } {
    asm ("etnd %0" : "=d" (nd));
  }
   #endif
-  #ifdef __VX__
-   asm ("vzero %%v0" : : );
+  #if defined (__VX__) && defined (__zarch__)
+   asm ("vzero %%v0" : : : "v0");
   #endif
    return 0;
  }
-- 
2.23.0
[PATCH 3/4] IBM Z: gen-vect-11/32: Set min-vect-loop-bound param back to default
In the Z backend we still set min-vect-loop-bound to 2 to work around
corner cases where awkward epilogue code gets generated in the
vectorizer. This has a particularly bad impact when vectorizing loops
with a low iteration count. As a result the loops in gen-vect-11/32 are
not vectorized, which is a pity.

The patch sets min-vect-loop-bound back to the default value of 0 in
order to enable vectorization.

2019-11-05 Andreas Krebbel

* gcc.dg/tree-ssa/gen-vect-11.c: Add --param min-vect-loop-bound=0
for IBM Z.
* gcc.dg/tree-ssa/gen-vect-32.c: Likewise.
---
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c | 6 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c | 4
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c
index 650e73a5ee8..dd1c0ac3eba 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c
@@ -1,6 +1,10 @@
 /* { dg-do run { target vect_cmdline_needed } } */
 /* { dg-options "-O2 -ftree-vectorize -fwrapv -fdump-tree-vect-details -fvect-cost-model=dynamic" } */
-/* { dg-options "-O2 -ftree-vectorize -fwrapv -fdump-tree-vect-details -fvect-cost-model=dynamic -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-additional-options "-mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* The IBM Z backend sets the min-vect-loop-bound param to 2 to avoid
+ awkward epilogue code generation in some cases. This line needs to
+ be removed after finding an alternate way to fix this. */
+/* { dg-additional-options "--param min-vect-loop-bound=0" { target { s390*-*-* } } } */
 
 #include
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c
index c4bee19b75a..378dd0b831c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c
@@ -1,6 +1,10 @@
 /* { dg-do run { target vect_cmdline_needed } } */
 /* { dg-options "-O2 -fno-tree-loop-distribute-patterns -ftree-vectorize -fdump-tree-vect-details -fno-vect-cost-model" } */
 /* { dg-additional-options "-mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* The IBM Z backend sets the min-vect-loop-bound param to 2 to avoid
+ awkward epilogue code generation in some cases. This line needs to
+ be removed after finding an alternate way to fix this. */
+/* { dg-additional-options "--param min-vect-loop-bound=0" { target { s390*-*-* } } } */
 
 #include
 
-- 
2.23.0
[PATCH 1/4] IBM Z: Use tree_fits_uhwi_p in vector_alignment hook
This fixes an ICE in gcc.dg/attr-vector_size.c testcase. gcc/ChangeLog: 2019-11-05 Andreas Krebbel * config/s390/s390.c (s390_vector_alignment): Check if the value fits into uhwi before using it. --- gcc/config/s390/s390.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 151b80da0b3..ff0b43c2c29 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -16075,13 +16075,19 @@ s390_support_vector_misalignment (machine_mode mode ATTRIBUTE_UNUSED, static HOST_WIDE_INT s390_vector_alignment (const_tree type) { + tree size = TYPE_SIZE (type); + if (!TARGET_VX_ABI) return default_vector_alignment (type); if (TYPE_USER_ALIGN (type)) return TYPE_ALIGN (type); - return MIN (64, tree_to_shwi (TYPE_SIZE (type))); + if (tree_fits_uhwi_p (size) + && tree_to_uhwi (size) < BIGGEST_ALIGNMENT) +return tree_to_uhwi (size); + + return BIGGEST_ALIGNMENT; } /* Implement TARGET_CONSTANT_ALIGNMENT. Alignment on even addresses for -- 2.23.0
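The clamping logic of the new s390_vector_alignment body can be modelled outside of GCC roughly as follows. This is only a sketch: struct model_size, model_vector_alignment and MODEL_BIGGEST_ALIGNMENT are hypothetical stand-ins (not GCC names), and the value 64 is an assumption taken from the literal used by the old MIN (64, ...) code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for BIGGEST_ALIGNMENT (in bits); the replaced code
   used the literal 64 here.  */
#define MODEL_BIGGEST_ALIGNMENT 64u

/* Hypothetical model of TYPE_SIZE: 'fits' plays the role of
   tree_fits_uhwi_p, i.e. it is false when the size is not a constant
   representable as an unsigned HOST_WIDE_INT.  */
struct model_size
{
  bool fits;
  uint64_t bits;
};

/* Return the size itself only when it is known and smaller than the
   maximum alignment; otherwise fall back to the maximum instead of
   reading a value that may not exist.  */
uint64_t
model_vector_alignment (struct model_size size)
{
  if (size.fits && size.bits < MODEL_BIGGEST_ALIGNMENT)
    return size.bits;

  return MODEL_BIGGEST_ALIGNMENT;
}
```

The point of the check is that the unknown-size case never reaches the conversion, which is presumably where the old tree_to_shwi call ran into the ICE.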
[PATCH 4/4] IBM Z: gen-vect-26/28: Vectorizing without peeling is ok for Z
These tests check if loop peeling has been applied to avoid having to vectorize unaligned loops. On Z we do not have any alignment requirements for vectorization so we neither need nor want the loop peeling here. 2019-11-05 Andreas Krebbel * gcc.dg/tree-ssa/gen-vect-26.c: Disable loop peeling check for IBM Z. * gcc.dg/tree-ssa/gen-vect-28.c: Likewise. --- gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c | 5 +++-- gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c | 5 +++-- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c index 242316893c0..6f3c2b7d88a 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c @@ -30,5 +30,6 @@ int main () /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! avr-*-* } } } } */ -/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { ! avr-*-* } } } } */ -/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! avr-*-* } } } } */ +/* IBM Z does not require special alignment for vectorization. */ +/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { ! { avr-*-* s390*-*-* } } } } } */ +/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! { avr-*-* s390*-*-* } } } } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c index 24853e0e0db..7b26bbdc70c 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c @@ -38,5 +38,6 @@ int main (void) /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! avr-*-* } } } } */ -/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { ! 
avr-*-* } } } } */ -/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! avr-*-* } } } } */ +/* IBM Z does not require special alignment for vectorization. */ +/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { ! { avr-*-* s390*-*-* } } } } } */ +/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! { avr-*-* s390*-*-* } } } } } */ -- 2.23.0
[Committed] IBM Z: Add pattern for load truth value of comparison into reg
The RTXs used to express an overflow condition check in add/sub/mul are too complex for if conversion. However, there is code in noce_emit_store_flag which generates a simple CC compare as the base for using a conditional load. All we have to do is to provide a pattern to store the truth value of a CC compare into a GPR. Done with the attached patch. Bootstrapped and regression tested on s390x. Committed to mainline. 2019-11-07 Andreas Krebbel * config/s390/s390.md ("*cstorecc_z13"): New insn_and_split pattern. gcc/testsuite/ChangeLog: 2019-11-07 Andreas Krebbel * gcc.target/s390/addsub-signed-overflow-1.c: Expect lochi instructions to be used. * gcc.target/s390/addsub-signed-overflow-2.c: Likewise. * gcc.target/s390/mul-signed-overflow-1.c: Likewise. * gcc.target/s390/mul-signed-overflow-2.c: Likewise. * gcc.target/s390/vector/vec-scalar-cmp-1.c: Check for 32 and 64 bit variant of lochi. Swap the values for the lochi's. * gcc.target/s390/zvector/vec-cmp-1.c: Likewise. --- gcc/config/s390/s390.md | 15 .../s390/addsub-signed-overflow-1.c | 2 + .../s390/addsub-signed-overflow-2.c | 2 + .../gcc.target/s390/mul-signed-overflow-1.c | 2 + .../gcc.target/s390/mul-signed-overflow-2.c | 2 + .../gcc.target/s390/vector/vec-scalar-cmp-1.c | 18 +++-- .../gcc.target/s390/zvector/vec-cmp-1.c | 72 --- 7 files changed, 83 insertions(+), 30 deletions(-) diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index e3881d07f2b..c1d73d5ca42 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -6810,6 +6810,21 @@ [(set (match_dup 0) (ashiftrt:SI (match_dup 0) (const_int 28))) (clobber (reg:CC CC_REGNUM))])]) +; Such patterns get directly emitted by noce_emit_store_flag. 
+(define_insn_and_split "*cstorecc_z13" + [(set (match_operand:GPR 0 "register_operand""=&d") + (match_operator:GPR 1 "s390_comparison" + [(match_operand 2 "cc_reg_operand""c") +(match_operand 3 "const_int_operand" "")]))] + "TARGET_Z13" + "#" + "reload_completed" + [(set (match_dup 0) (const_int 0)) + (set (match_dup 0) + (if_then_else:GPR +(match_op_dup 1 [(match_dup 2) (match_dup 3)]) +(const_int 1) +(match_dup 0)))]) ;; ;; - Conditional move instructions (introduced with z196) diff --git a/gcc/testsuite/gcc.target/s390/addsub-signed-overflow-1.c b/gcc/testsuite/gcc.target/s390/addsub-signed-overflow-1.c index 367dbcb3774..143220d5541 100644 --- a/gcc/testsuite/gcc.target/s390/addsub-signed-overflow-1.c +++ b/gcc/testsuite/gcc.target/s390/addsub-signed-overflow-1.c @@ -79,3 +79,5 @@ main () /* { dg-final { scan-assembler-not "\trisbg" { target { lp64 } } } } */ /* Just one for the ret != 6 comparison. */ /* { dg-final { scan-assembler-times "ci" 1 } } */ +/* { dg-final { scan-assembler-times "\tlochio\t" 6 { target { ! lp64 } } } } */ +/* { dg-final { scan-assembler-times "\tlocghio\t" 6 { target lp64 } } } */ diff --git a/gcc/testsuite/gcc.target/s390/addsub-signed-overflow-2.c b/gcc/testsuite/gcc.target/s390/addsub-signed-overflow-2.c index 230ad4af1e7..798e489cece 100644 --- a/gcc/testsuite/gcc.target/s390/addsub-signed-overflow-2.c +++ b/gcc/testsuite/gcc.target/s390/addsub-signed-overflow-2.c @@ -78,3 +78,5 @@ main () /* { dg-final { scan-assembler-not "\trisbg" { target { lp64 } } } } */ /* Just one for the ret != 3 comparison. */ /* { dg-final { scan-assembler-times "ci" 1 } } */ +/* { dg-final { scan-assembler-times "\tlochio\t" 6 { target { ! 
lp64 } } } } */ +/* { dg-final { scan-assembler-times "\tlocghio\t" 6 { target lp64 } } } */ diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c index b3db60ffef5..fdf56d6e695 100644 --- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c +++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c @@ -54,3 +54,5 @@ main () /* { dg-final { scan-assembler-not "\trisbg" { target { lp64 } } } } */ /* Just one for the ret != 3 comparison. */ /* { dg-final { scan-assembler-times "ci" 1 } } */ +/* { dg-final { scan-assembler-times "\tlochio\t" 3 { target { ! lp64 } } } } */ +/* { dg-final { scan-assembler-times "\tlocghio\t" 3 { target lp64 } } } */ diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c index 76b3fa60361..d0088188aa2 100644 --- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c +++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c @@ -54,3 +54,5 @@
Re: [Committed] IBM Z: Add pattern for load truth value of comparison into reg
On 11.11.19 15:39, Richard Henderson wrote: > On 11/7/19 12:52 PM, Andreas Krebbel wrote: >> +; Such patterns get directly emitted by noce_emit_store_flag. >> +(define_insn_and_split "*cstorecc_z13" >> + [(set (match_operand:GPR 0 "register_operand""=&d") >> +(match_operator:GPR 1 "s390_comparison" >> +[(match_operand 2 "cc_reg_operand""c") >> + (match_operand 3 "const_int_operand" "")]))] > > The clobbered-output seems superfluous, since it can't overlap "c". I thought it would be "more" correct this way, but it might lead to an extra reload being emitted - right? > I believe the only valid const_int is 0, fwiw, so perhaps matching any > const_int is overkill. We also have CCRAW mode where that value is != 0. > Does it help Z12 to allow the 3-insn sequence using LOC(G)R? Prior to z13 we prefer the variant using a conditional branch. Andreas > >> + "TARGET_Z13" >> + "#" > > + "reload_completed" >> + [(set (match_dup 0) (const_int 0)) >> + (set (match_dup 0) >> +(if_then_else:GPR >> + (match_op_dup 1 [(match_dup 2) (match_dup 3)]) >> + (const_int 1) >> + (match_dup 0)))]) > > > r~ >
Re: [PATCH] s390x: Fix popcounthi2_z196 expander [PR93533]
On 2/1/20 9:41 PM, Jakub Jelinek wrote: > Hi! > > The following testcase started to ICE when .POPCOUNT matching has been added > to match.pd; we had __builtin_popcount*, but nothing would use the > popcounthi2 expander before. > > The problem is that the popcounthi2_z196 expander doesn't emit valid RTL: > error: unrecognizable insn: > (insn 138 137 139 27 (set (reg:SI 190) > (ashift:SI (reg:HI 95 [ _105 ]) > (const_int 8 [0x8]))) -1 > (nil)) > during RTL pass: vregs > The following patch is an attempt to fix that, furthermore I've tried to > slightly simplify it as well, it makes no sense to me to perform > (x + (x << 8)) >> 8 when we need to either zero extend or mask the result > at the end in order to avoid bits from above HImode to affect it, when we > can do > (x + (x >> 8)) & 0xff (or zero extension). > > Bootstrapped/regtested on s390x-linux, ok for trunk? Ok. Thanks for the fix! Andreas > > 2020-02-01 Jakub Jelinek > > PR target/93533 > * config/s390/s390.md (popcounthi2_z196): Fix up expander to emit > valid RTL to sum up the lowest and second lowest bytes of the popcnt > result. 
> > --- gcc/config/s390/s390.md.jj2020-01-12 11:54:36.412413424 +0100 > +++ gcc/config/s390/s390.md 2020-02-01 13:34:21.671431689 +0100 > @@ -11670,21 +11670,28 @@ (define_expand "popcountsi2" > }) > > (define_expand "popcounthi2_z196" > - [; popcnt op0, op1 > - (parallel [(set (match_operand:HI 0 "register_operand" "") > + [; popcnt op2, op1 > + (parallel [(set (match_dup 2) > (unspec:HI [(match_operand:HI 1 "register_operand")] > UNSPEC_POPCNT)) > (clobber (reg:CC CC_REGNUM))]) > - ; sllk op2, op0, 8 > - (set (match_dup 2) > - (ashift:SI (match_dup 0) (const_int 8))) > - ; ar op0, op2 > - (parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 2))) > + ; lr op3, op2 > + (set (match_dup 3) (subreg:SI (match_dup 2) 0)) > + ; srl op4, op3, 8 > + (set (match_dup 4) (lshiftrt:SI (match_dup 3) (const_int 8))) > + ; ar op3, op4 > + (parallel [(set (match_dup 3) (plus:SI (match_dup 3) (match_dup 4))) > (clobber (reg:CC CC_REGNUM))]) > - ; srl op0, op0, 8 > - (set (match_dup 0) (lshiftrt:HI (match_dup 0) (const_int 8)))] > + ; llgc op0, op3 > + (parallel [(set (match_operand:HI 0 "register_operand" "") > +(and:HI (subreg:HI (match_dup 3) 2) (const_int 255))) > + (clobber (reg:CC CC_REGNUM))])] >"TARGET_Z196" > - "operands[2] = gen_reg_rtx (SImode);") > +{ > + operands[2] = gen_reg_rtx (HImode); > + operands[3] = gen_reg_rtx (SImode); > + operands[4] = gen_reg_rtx (SImode); > +}) > > (define_expand "popcounthi2" >[(set (match_dup 2) > --- gcc/testsuite/gcc.c-torture/compile/pr93533.c.jj 2020-02-01 > 13:44:16.296681108 +0100 > +++ gcc/testsuite/gcc.c-torture/compile/pr93533.c 2020-02-01 > 13:43:52.034038073 +0100 > @@ -0,0 +1,9 @@ > +/* PR target/93533 */ > + > +unsigned > +foo (unsigned short a) > +{ > + a = a - (a >> 1 & 21845); > + a = (a & 13107) + (a >> 2 & 13107); > + return (unsigned short) ((a + (a >> 4) & 3855) * 257) >> 8; > +} > --- gcc/testsuite/gcc.target/s390/pr93533.c.jj2020-02-01 > 13:45:41.433428499 +0100 > +++ 
gcc/testsuite/gcc.target/s390/pr93533.c 2020-02-01 13:45:32.984552824 > +0100 > @@ -0,0 +1,5 @@ > +/* PR target/93533 */ > +/* { dg-do compile } */ > +/* { dg-options "-march=z196 -O2" } */ > + > +#include "../../gcc.c-torture/compile/pr93533.c" > > Jakub >
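The two ways of summing the byte counts discussed in this thread can be compared with a small stand-alone model. This is a sketch only: it assumes that the popcnt result holds one bit count per byte (as the ChangeLog wording "sum up the lowest and second lowest bytes of the popcnt result" suggests), and all function names below are made up for illustration.

```c
#include <stdint.h>

/* Model of a byte-wise population count on a 16 bit value: each byte
   of the result holds the number of set bits in the matching input
   byte.  This is an assumption about popcnt, not taken from s390.md.  */
uint16_t
bytewise_popcount16 (uint16_t x)
{
  uint16_t result = 0;

  for (int byte = 0; byte < 2; byte++)
    {
      unsigned int bits = (x >> (8 * byte)) & 0xff;
      unsigned int count = 0;

      while (bits != 0)
	{
	  count += bits & 1;
	  bits >>= 1;
	}
      result |= (uint16_t) (count << (8 * byte));
    }
  return result;
}

/* Old scheme: (x + (x << 8)) >> 8, which still needs a mask because
   the high byte count survives above bit 7.  */
unsigned int
sum_bytes_old (uint16_t counts)
{
  uint32_t t = counts;

  t = t + (t << 8);
  return (t >> 8) & 0xff;
}

/* New scheme from the patch description: (x + (x >> 8)) & 0xff.  */
unsigned int
sum_bytes_new (uint16_t counts)
{
  uint32_t t = counts;

  return (t + (t >> 8)) & 0xff;
}

/* Full 16 bit popcount built from the pieces above.  */
unsigned int
popcount16 (uint16_t x)
{
  return sum_bytes_new (bytewise_popcount16 (x));
}
```

Both schemes give the same sum; the new one simply avoids the left shift, which had no valid SImode form when applied to an HImode register.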
[Committed] Fix PR92950: Wrong code emitted for movv1qi
The backend emits 16 bit memory loads for single element character vector. As a result the character will not be right justified in the GPR. gcc/ChangeLog: 2019-12-16 Andreas Krebbel PR target/92950 * config/s390/vector.md ("mov" for V_8): Replace lh, lhy, and lhrl with llc. gcc/testsuite/ChangeLog: 2019-12-16 Andreas Krebbel PR target/92950 * gcc.target/s390/vector/pr92950.c: New test. --- gcc/config/s390/vector.md | 12 -- .../gcc.target/s390/vector/pr92950.c | 24 +++ 2 files changed, 29 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/vector/pr92950.c diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md index d40e310f9e7..1e591ba31b6 100644 --- a/gcc/config/s390/vector.md +++ b/gcc/config/s390/vector.md @@ -291,9 +291,9 @@ ; However, this would probably be slower. (define_insn "mov" - [(set (match_operand:V_8 0 "nonimmediate_operand" "=v,v,d,v,R, v, v, v, v,d, Q, S, Q, S, d, d,d,d,d,R,T") -(match_operand:V_8 1 "general_operand" " v,d,v,R,v,j00,jm1,jyy,jxx,d,j00,j00,jm1,jm1,j00,jm1,R,T,b,d,d"))] - "" + [(set (match_operand:V_8 0 "nonimmediate_operand" "=v,v,d,v,R, v, v, v, v,d, Q, S, Q, S, d, d,d,R,T") +(match_operand:V_8 1 "general_operand" " v,d,v,R,v,j00,jm1,jyy,jxx,d,j00,j00,jm1,jm1,j00,jm1,T,d,d"))] + "TARGET_VX" "@ vlr\t%v0,%v1 vlvgb\t%v0,%1,0 @@ -311,12 +311,10 @@ mviy\t%0,-1 lhi\t%0,0 lhi\t%0,-1 - lh\t%0,%1 - lhy\t%0,%1 - lhrl\t%0,%1 + llc\t%0,%1 stc\t%1,%0 stcy\t%1,%0" - [(set_attr "op_type" "VRR,VRS,VRS,VRX,VRX,VRI,VRI,VRI,VRI,RR,SI,SIY,SI,SIY,RI,RI,RX,RXY,RIL,RX,RXY")]) + [(set_attr "op_type" "VRR,VRS,VRS,VRX,VRX,VRI,VRI,VRI,VRI,RR,SI,SIY,SI,SIY,RI,RI,RXY,RX,RXY")]) (define_insn "mov" [(set (match_operand:V_16 0 "nonimmediate_operand" "=v,v,d,v,R, v, v, v, v,d, Q, Q, d, d,d,d,d,R,T,b") diff --git a/gcc/testsuite/gcc.target/s390/vector/pr92950.c b/gcc/testsuite/gcc.target/s390/vector/pr92950.c new file mode 100644 index 000..9c7ed127e61 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/vector/pr92950.c @@ -0,0 
+1,24 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */ + +struct a { + int b; + char c; +}; +struct a d = {1, 16}; +struct a *e = &d; + +int f = 0; + +int main() { + struct a g = {0, 0 }; + f = 0; + + for (; f <= 1; f++) { +g = d; +*e = g; + } + + if (d.c != 16) +__builtin_abort(); +} -- 2.23.0
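Why a 16 bit load leaves the character in the wrong place can be shown with a byte-level model of a big-endian machine. This is a hedged sketch: model_lh and model_llc are hypothetical helpers that only mimic the described behaviour of lh (load halfword, sign extended) and llc (load one byte, zero extended); they are not taken from the backend.

```c
#include <stdint.h>

/* Mimics "lh": read two consecutive bytes in big-endian order and
   sign extend the halfword.  The wanted character ends up in the
   upper byte of the halfword, with the following byte below it.  */
int32_t
model_lh (const uint8_t *p)
{
  uint16_t half = (uint16_t) ((p[0] << 8) | p[1]);

  return (int16_t) half;
}

/* Mimics "llc": read exactly one byte and zero extend it, so the
   character is right justified in the register.  */
uint32_t
model_llc (const uint8_t *p)
{
  return p[0];
}
```

With memory holding the character 16 followed by an unrelated byte 0xAA, model_lh yields 0x10AA while model_llc yields 16, which matches the symptom checked by the new test (d.c != 16).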
[PATCH 1/1] Work around array out of bounds warning in mkdeps
This suppresses an array out of bounds warning in mkdeps.c as proposed by Martin Sebor in the bugzilla. array subscript 2 is outside array bounds of ‘const char [2]’ Since this warning does occur during bootstrap it currently breaks werror builds on IBM Z. The problem can be reproduced also on x86_64 by changing the inlining threshold using: --param max-inline-insns-auto=80 Bootstrapped and regression tested on x86_64 and IBM Z. Ok for mainline? libcpp/ChangeLog: 2019-12-17 Andreas Krebbel PR tree-optimization/92176 * mkdeps.c (deps_add_default_target): --- libcpp/mkdeps.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libcpp/mkdeps.c b/libcpp/mkdeps.c index 147aa909be7..d1001e30e19 100644 --- a/libcpp/mkdeps.c +++ b/libcpp/mkdeps.c @@ -268,7 +268,7 @@ deps_add_default_target (class mkdeps *d, const char *tgt) return; if (tgt[0] == '\0') -deps_add_target (d, "-", 1); +d->targets.push (xstrdup ("-")); else { #ifndef TARGET_OBJECT_SUFFIX -- 2.23.0
Re: [PATCH] s390: Fix TARGET_SECONDARY_RELOAD for non-SYMBOL_REFs
On 2/29/24 13:13, Stefan Schulze Frielinghaus wrote: > RTX X must not necessarily be a SYMBOL_REF and may e.g. be an > UNSPEC_GOTENT for which SYMBOL_FLAG_NOTALIGN2_P fails. > > gcc/ChangeLog: > > * config/s390/s390.cc (s390_secondary_reload): Guard > SYMBOL_FLAG_NOTALIGN2_P. Ok. Thanks! Andreas > --- > gcc/config/s390/s390.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc > index 943fc9bfd72..12430d77786 100644 > --- a/gcc/config/s390/s390.cc > +++ b/gcc/config/s390/s390.cc > @@ -4778,7 +4778,7 @@ s390_secondary_reload (bool in_p, rtx x, reg_class_t > rclass_i, >if (in_p > && s390_loadrelative_operand_p (x, &symref, &offset) > && mode == Pmode > - && !SYMBOL_FLAG_NOTALIGN2_P (symref) > + && (!SYMBOL_REF_P (symref) || !SYMBOL_FLAG_NOTALIGN2_P (symref)) > && (offset & 1) == 1) > sri->icode = ((mode == DImode) ? CODE_FOR_reloaddi_larl_odd_addend_z10 > : CODE_FOR_reloadsi_larl_odd_addend_z10);
Re: [PATCH] s390: Fix tests rosbg_si_srl and rxsbg_si_srl
On 2/29/24 13:14, Stefan Schulze Frielinghaus wrote: > Starting with r14-2047-gd0e891406b16dc two SI mode tests are optimized > into DI mode. Thus, the scan-assembler directives fail. For example > RTL expression > > (ior:SI (subreg:SI (lshiftrt:DI (reg:DI 69) > (const_int 2 [0x2])) 4) > (subreg:SI (reg:DI 68) 4)) > > is optimized into > > (ior:DI (lshiftrt:DI (reg:DI 69) > (const_int 2 [0x2])) > (reg:DI 68)) > > Fixed by moving operands into memory in order to enforce SI mode > computation. > > Furthermore, in r9-6056-g290dfd9bc7bea2 the starting bit position of the > scan-assembler directive for rosbg was incorrectly set to 32 which > actually should be 32+SHIFT_AMOUNT, i.e., in this particular case 34. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/md/rXsbg_mode_sXl.c: Fix tests rosbg_si_srl > and rxsbg_si_srl. Ok, thanks! Andreas
Re: [PATCH] s390: Fix test vector/long-double-to-i64.c
On 2/29/24 13:15, Stefan Schulze Frielinghaus wrote: > Starting with r14-8319-g86de9b66480b71 fwprop improved so that vpdi is > no longer required. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/vector/long-double-to-i64.c: Fix scan > assembler directive. Should we perhaps rather turn the scan-assembler directives into something which checks for the absence of vpdi then? In order to get notified once this really useful optimization breaks? Andreas > --- > .../gcc.target/s390/vector/long-double-to-i64.c | 13 + > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c > b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c > index 2dbbb5d1c03..ed89878e6ee 100644 > --- a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c > +++ b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c > @@ -1,19 +1,24 @@ > /* { dg-do compile } */ > /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */ > /* { dg-do run { target { s390_z14_hw } } } */ > +/* { dg-final { check-function-bodies "**" "" "" { target { lp64 } } } } */ > + > #include > #include > > +/* > +** long_double_to_i64: > +** ld %f0,0\(%r2\) > +** ld %f2,8\(%r2\) > +** cgxbr %r2,5,%f0 > +** br %r14 > +*/ > __attribute__ ((noipa)) static int64_t > long_double_to_i64 (long double x) > { >return x; > } > > -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,1\n} 1 } } > */ > -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,5\n} 1 } } > */ > -/* { dg-final { scan-assembler-times {\n\tcgxbr\t} 1 } } */ > - > int > main (void) > {
Re: [PATCH] s390: Streamline vector builtins with LLVM
On 3/1/24 10:29, Stefan Schulze Frielinghaus wrote: > Similar as to s390_lcbb, s390_vll, s390_vstl, et al. make use of a > signed vector type for vlbb. Furthermore, a const void pointer seems > more common and an integer for the mask. > > For s390_vfi(s,d)b make use of integers for masks, too. > > Use unsigned integers for all s390_vlbr/vstbr variants. > > Make use of type UV16QI for the length operand of s390_vstrs(,z)(h,f). > > Following the Principles of Operation, change from signed to unsigned > type for s390_va(c,cc,ccc)q and s390_vs(,c,bc)biq and s390_vmslg. > > Make use of scalar type UINT128 instead of UV16QI for s390_vgfm(,a)g, > and s390_vsumq(f,g). > > Ok for mainline? > > gcc/ChangeLog: > > * config/s390/s390-builtin-types.def: Update to reflect latest > changes. > * config/s390/s390-builtins.def: Streamline vector builtins with > LLVM. Ok. Thanks! Andreas
Re: [PATCH] s390: Deprecate some vector builtins
On 3/1/24 16:57, Stefan Schulze Frielinghaus wrote: > According to IBM Open XL C/C++ for z/OS version 1.1 builtins > > - vec_permi > - vec_ctd > - vec_ctsl > - vec_ctul > - vec_ld2f > - vec_st2f > > are deprecated. Also deprecate helper builtins vec_ctd_s64 and > vec_ctd_u64. > > Furthermore, the overloads of vec_insert which make use of a bool vector > are deprecated, too. > > gcc/ChangeLog: > > * config/s390/s390-builtins.def (vec_permi): Deprecate. > (vec_ctd): Deprecate. > (vec_ctd_s64): Deprecate. > (vec_ctd_u64): Deprecate. > (vec_ctsl): Deprecate. > (vec_ctul): Deprecate. > (vec_ld2f): Deprecate. > (vec_st2f): Deprecate. > (vec_insert): Deprecate overloads with bool vectors. Ok. Thanks! Andreas
[Committed] IBM Z: Fix -munaligned-symbols
With this fix we make sure that only symbols with a natural alignment smaller than 2 are considered misaligned with -munaligned-symbols. Background is that -munaligned-symbols is only supposed to affect symbols whose natural alignment wouldn't be enough to fulfill our ABI requirement of having all symbols at even addresses. Because only these are the cases where we differ from other architectures. This fixes the unaligned-1 testcase, no regressions. Committed to mainline. gcc/ChangeLog: * config/s390/s390.cc (s390_encode_section_info): Adjust the check for misaligned symbols. * config/s390/s390.opt: Improve documentation. gcc/testsuite/ChangeLog: * gcc.target/s390/aligned-1.c: Add weak and void variables incorporating the cases from unaligned-2.c. * gcc.target/s390/unaligned-1.c: Likewise. * gcc.target/s390/unaligned-2.c: Removed. --- gcc/config/s390/s390.cc | 15 ++- gcc/config/s390/s390.opt| 7 +- gcc/testsuite/gcc.target/s390/aligned-1.c | 101 +-- gcc/testsuite/gcc.target/s390/unaligned-1.c | 103 ++-- gcc/testsuite/gcc.target/s390/unaligned-2.c | 16 --- 5 files changed, 201 insertions(+), 41 deletions(-) delete mode 100644 gcc/testsuite/gcc.target/s390/unaligned-2.c diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc index e63965578f1..372a2324403 100644 --- a/gcc/config/s390/s390.cc +++ b/gcc/config/s390/s390.cc @@ -13802,10 +13802,19 @@ s390_encode_section_info (tree decl, rtx rtl, int first) that can go wrong (i.e. no FUNC_DECLs). All symbols without an explicit alignment are assumed to be 2 byte aligned as mandated by our ABI. This behavior can be -overridden for external symbols with the -munaligned-symbols -switch. */ +overridden for external and weak symbols with the +-munaligned-symbols switch. +For all external symbols without explicit alignment +DECL_ALIGN is already trimmed down to 8, however for weak +symbols this does not happen. These cases are catched by the +type size check. 
*/ + const_tree size = TYPE_SIZE (TREE_TYPE (decl)); + unsigned HOST_WIDE_INT size_num = (tree_fits_uhwi_p (size) +? tree_to_uhwi (size) : 0); if ((DECL_USER_ALIGN (decl) && DECL_ALIGN (decl) % 16) - || (s390_unaligned_symbols_p && !decl_binds_to_current_def_p (decl))) + || (s390_unaligned_symbols_p + && !decl_binds_to_current_def_p (decl) + && (DECL_USER_ALIGN (decl) ? DECL_ALIGN (decl) % 16 : size_num < 16))) SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0)); else if (DECL_ALIGN (decl) % 32) SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0)); diff --git a/gcc/config/s390/s390.opt b/gcc/config/s390/s390.opt index 901ae4beb01..a5b5aa95a12 100644 --- a/gcc/config/s390/s390.opt +++ b/gcc/config/s390/s390.opt @@ -332,7 +332,8 @@ Store all argument registers on the stack. munaligned-symbols Target Var(s390_unaligned_symbols_p) Init(0) -Assume external symbols to be potentially unaligned. By default all -symbols without explicit alignment are assumed to reside on a 2 byte -boundary as mandated by the IBM Z ABI. +Assume external symbols, whose natural alignment would be 1, to be +potentially unaligned. By default all symbols without explicit +alignment are assumed to reside on a 2 byte boundary as mandated by +the IBM Z ABI. diff --git a/gcc/testsuite/gcc.target/s390/aligned-1.c b/gcc/testsuite/gcc.target/s390/aligned-1.c index 2dc99cf66bd..3f5a2611ef1 100644 --- a/gcc/testsuite/gcc.target/s390/aligned-1.c +++ b/gcc/testsuite/gcc.target/s390/aligned-1.c @@ -1,20 +1,103 @@ -/* Even symbols without explicite alignment are assumed to reside on a +/* Even symbols without explicit alignment are assumed to reside on a 2 byte boundary, as mandated by the IBM Z ELF ABI, and therefore can be accessed using the larl instruction. 
*/ /* { dg-do compile } */ /* { dg-options "-O3 -march=z900 -fno-section-anchors" } */ -extern unsigned char extern_implicitly_aligned; -extern unsigned char extern_explicitly_aligned __attribute__((aligned(2))); -unsigned char aligned; +extern unsigned char extern_char; +extern unsigned char extern_explicitly_aligned_char __attribute__((aligned(2))); +extern unsigned char extern_explicitly_unaligned_char __attribute__((aligned(1))); +extern unsigned char __attribute__((weak)) extern_weak_char; +extern unsigned char extern_explicitly_aligned_weak_char __attribute__((weak,aligned(2))); +extern unsigned char extern_explicitly_unaligned_weak_char __attribute__((weak,aligned(1))); -unsigned char +unsigned char normal_char; +unsigned char explicitly_unaligned_char __attribute__((aligned(1))); +unsigned char __attribute__((weak)) weak_char = 0; +unsigned char explicitly_aligned_weak_char __attribute__((weak,aligned(2))); +unsigned char
Re: [PATCH] s390: testsuite: Fix abs-4.c
On 3/21/24 15:41, Stefan Schulze Frielinghaus wrote: > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/abs-4.c: On s390 we also have a copysign optab > for long double. Thus, scan 3 instead of 2 times for it. > --- > Ok for mainline? Ok. Thanks! Andreas
Re: [PATCH] s390: testsuite: Fix backprop-6.c
On 3/22/24 10:49, Stefan Schulze Frielinghaus wrote: > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/backprop-6.c: On s390 we also have a copysign > optab for long double. Thus, scan 3 instead of 2 times for it. > --- > OK for mainline? Ok. Thanks! Andreas
Re: [PATCH] libsanitizer: Do not mention MSan and DFSan in an error message
On 4/4/24 13:38, Ilya Leoshkevich wrote: > Bootstrapped and regtested on s390x-redhat-linux. Ok for master? > > > libsanitizer/ChangeLog: > > * sanitizer_common/sanitizer_linux_s390.cpp (AvoidCVE_2016_2143): > Do not mention MSan and DFSan, which are not supported by GCC. Ok, Thanks! Andreas > --- > libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp > b/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp > index 74db831b0aa..65ba825fa97 100644 > --- a/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp > +++ b/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp > @@ -212,7 +212,7 @@ void AvoidCVE_2016_2143() { > return; >Report( > "ERROR: Your kernel seems to be vulnerable to CVE-2016-2143. Using > ASan,\n" > -"MSan, TSan, DFSan or LSan with such kernel can and will crash your\n" > +"TSan or LSan with such kernel can and will crash your\n" > "machine, or worse.\n" > "\n" > "If you are certain your kernel is not vulnerable (you have compiled > it\n"
Re: [PATCH] libsanitizer: Do not mention MSan and DFSan in an error message
On 4/4/24 14:22, Jakub Jelinek wrote: > On Thu, Apr 04, 2024 at 02:19:08PM +0200, Andreas Krebbel wrote: >> On 4/4/24 13:38, Ilya Leoshkevich wrote: >>> Bootstrapped and regtested on s390x-redhat-linux. Ok for master? >>> >>> >>> libsanitizer/ChangeLog: >>> >>> * sanitizer_common/sanitizer_linux_s390.cpp (AvoidCVE_2016_2143): >>> Do not mention MSan and DFSan, which are not supported by GCC. >> >> Ok, Thanks! > > This then needs to be added to libsanitizer/LOCAL_PATCHES , otherwise > it will disappear on the next merge from upstream. > > Though, I must say I'm not entirely convinced the change is worth the > hassle on every libsanitizer merge. You are right. We will leave the message as is. Thanks! Andreas > >>> diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp >>> b/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp >>> index 74db831b0aa..65ba825fa97 100644 >>> --- a/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp >>> +++ b/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp >>> @@ -212,7 +212,7 @@ void AvoidCVE_2016_2143() { >>> return; >>>Report( >>> "ERROR: Your kernel seems to be vulnerable to CVE-2016-2143. Using >>> ASan,\n" >>> -"MSan, TSan, DFSan or LSan with such kernel can and will crash your\n" >>> +"TSan or LSan with such kernel can and will crash your\n" >>> "machine, or worse.\n" >>> "\n" >>> "If you are certain your kernel is not vulnerable (you have compiled >>> it\n" > > Jakub >
Re: [PATCH] s390: Fix s390_const_int_pool_entry_p and movdi peephole2 [PR114605]
On 4/8/24 13:43, Ilya Leoshkevich wrote: > On Sat, 2024-04-06 at 18:58 +0200, Jakub Jelinek wrote: >> Hi! >> >> The following testcase is miscompiled, because we have initially >> a movti which loads the 0x3f803f80ULL TImode constant >> from constant pool. Later on we split it into a pair of DImode >> loads. Now, for the first load (why just that?, though not stage4 >> material) we trigger the peephole2 which uses >> s390_const_int_pool_entry_p. >> That function doesn't check at all the constant pool mode though, >> sees >> the constant pool at that address has a CONST_INT value and just >> assumes >> that is the value to return, which is especially wrong for big- >> endian, >> if it is a DImode load from offset 0, it should be loading 0 rather >> than >> 0x3f803f80ULL. >> The following patch adds checks if we are extracting a MODE_INT mode, >> if the constant pool has MODE_INT mode as well, punts if constant >> pool >> has smaller mode size than the extraction one (then it would be UB), >> if it has the same mode as before keeps using what it did before, >> if constant pool has a larger mode than the one being extracted, uses >> simplify_subreg. I'd have used avoid_constant_pool_reference >> instead which can handle also offsets into the constant pool >> constants, >> but it can't handle UNSPEC_LTREF. >> >> Another thing is that once that is fixed, we ICE when we extract >> constant >> like 0, ior insn predicate require non-0 constant. So, the patch >> also >> fixes the peephole2 so that if either 32-bit half is zero, it uses a >> mere >> load of the constant into register rather than a pair of such load >> and ior. >> >> Bootstrapped/regtested on s390x-linux, ok for trunk? > > Hi Jakub, thanks for the patch, it looks good to me. > Since I'm not a maintainer, we need to wait for Andreas' opinion. Ok. Thank you very much Jakub for fixing this! 
Andreas > >> >> 2024-04-06 Jakub Jelinek >> >> PR target/114605 >> * config/s390/s390.cc (s390_const_int_pool_entry_p): Punt >> if mem doesn't have MODE_INT mode, or pool constant doesn't >> have MODE_INT mode, or if pool constant mode is smaller than >> mem mode. If mem mode is different from pool constant mode, >> try to simplify subreg. If that doesn't work, punt, if it >> does, use the simplified constant instead of the constant >> pool >> constant. >> * config/s390/s390.md (movdi from const pool peephole): If >> either low or high 32-bit part is zero, just emit move insn >> instead of move + ior. >> >> * gcc.dg/pr114605.c: New test.
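Side note on the endianness point in this thread: a minimal sketch in plain C, not GCC internals. It models a 128-bit pool constant as two 64-bit halves and shows why a 64-bit load from byte offset 0 on a big-endian target has to yield the most significant half, and when a 64-bit value can be materialized with a single load because one of its 32-bit halves is zero. The names pool_const128, extract_di_be and single_load_suffices are made up for illustration and do not exist in s390.cc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit constant pool entry.  */
struct pool_const128
{
  uint64_t high;  /* most significant 64 bits */
  uint64_t low;   /* least significant 64 bits */
};

/* Extract the 64-bit value that a big-endian load at BYTE_OFFSET would
   see.  Offset 0 addresses the most significant half.  Any other
   offset than 0 or 8 is not handled and the function punts.  */
static bool
extract_di_be (const struct pool_const128 *c, unsigned byte_offset,
               uint64_t *out)
{
  if (byte_offset == 0)
    *out = c->high;
  else if (byte_offset == 8)
    *out = c->low;
  else
    return false;
  return true;
}

/* If either 32-bit half of V is zero, loading the other half alone is
   enough; otherwise two pieces have to be combined.  */
static bool
single_load_suffices (uint64_t v)
{
  return (v >> 32) == 0 || (uint32_t) v == 0;
}
```

With a constant whose upper half is zero, extract_di_be at offset 0 yields 0, not the full value, which is the case the old code got wrong according to the description.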
Re: [PATCH v2] s390x: Optimize vector permute with constant indexes
On 4/9/24 16:31, Juergen Christ wrote: > Loop vectorizer can generate vector permutes with constant indexes > where all indexes are equal. Optimize this case to use vector > replicate instead of vector permute. > > gcc/ChangeLog: > > * config/s390/s390.cc (expand_perm_as_replicate): Implement. > (vectorize_vec_perm_const_1): Call new function. > * config/s390/vx-builtins.md (vec_splat): Change to... > (@vec_splat): ...this. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/vector/vec-expand-replicate.c: New test. > > Bootstrapped and regtested on s390x. Ok for trunk? Does this also work when using the vec_perm intrinsic or would we need to define a matching RTX for that? Ok. Thanks! Andreas
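A minimal sketch of the check described in this message, in plain C rather than GCC internals: a constant permute whose indexes are all equal selects one lane only, so it can be handled as a replicate of that lane. The function name and signature are hypothetical and are not those of expand_perm_as_replicate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if all NELT indexes in PERM are equal, and store the
   common index in *LANE.  Such a permute is a replicate of one lane.  */
static bool
perm_is_replicate (const unsigned char *perm, size_t nelt, unsigned *lane)
{
  if (nelt == 0)
    return false;

  for (size_t i = 1; i < nelt; i++)
    if (perm[i] != perm[0])
      return false;

  *lane = perm[0];
  return true;
}
```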
Re: [PATCH] IBM Z: Preserve exceptions in autovec-*-signaling-eq.c tests
On 2/19/24 13:39, Ilya Leoshkevich wrote: > DSE, DCE, and other passes are removing redundant signaling comparisons > from these tests, but the whole point is to check that GCC knows how to > emit them. Use -fno-delete-dead-exceptions to prevent that. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/zvector/autovec-double-signaling-eq.c: > Preserve exceptions. > * gcc.target/s390/zvector/autovec-float-signaling-eq.c: > Likewise. Ok. Thanks! Andreas > --- > .../gcc.target/s390/zvector/autovec-double-signaling-eq.c | 2 +- > .../gcc.target/s390/zvector/autovec-float-signaling-eq.c| 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git > a/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq.c > b/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq.c > index 3645d3cc393..b23568e06b4 100644 > --- a/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq.c > +++ b/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fexceptions > -fnon-call-exceptions" } */ > +/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fexceptions > -fnon-call-exceptions -fno-delete-dead-exceptions" } */ > > #include "autovec.h" > > diff --git > a/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-eq.c > b/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-eq.c > index d98aa0c494e..cd25d10c577 100644 > --- a/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-eq.c > +++ b/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-eq.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fexceptions > -fnon-call-exceptions" } */ > +/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fexceptions > -fnon-call-exceptions -fno-delete-dead-exceptions" } */ > > #include "autovec.h" >
Re: [PATCH] [s390] target/112280 - properly guard permute query
On 1/11/24 14:58, Richard Biener wrote: > The following adds guards avoiding code generation to > expand_perm_as_a_vlbr_vstbr_candidate when d.testing_p. > > Built and tested on the testcase in the PR. > > OK to push as obvious? Otherwise please pick up, test and push. Ok to commit now. Thanks for the fix! I've just started a regression test and will take care of any fallout. Bye, Andreas > > Thanks, > Richard. > > PR target/112280 > * config/s390/s390.cc (expand_perm_as_a_vlbr_vstbr_candidate): > Do not generate code when d.testing_p. > --- > gcc/config/s390/s390.cc | 36 > 1 file changed, 24 insertions(+), 12 deletions(-) > > diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc > index 748ad9cd932..f182c26e78b 100644 > --- a/gcc/config/s390/s390.cc > +++ b/gcc/config/s390/s390.cc > @@ -17867,33 +17867,45 @@ expand_perm_as_a_vlbr_vstbr_candidate (const struct > expand_vec_perm_d &d) > >if (memcmp (d.perm, perm[0], MAX_VECT_LEN) == 0) > { > - rtx target = gen_rtx_SUBREG (V8HImode, d.target, 0); > - rtx op0 = gen_rtx_SUBREG (V8HImode, d.op0, 0); > - emit_insn (gen_bswapv8hi (target, op0)); > + if (!d.testing_p) > + { > + rtx target = gen_rtx_SUBREG (V8HImode, d.target, 0); > + rtx op0 = gen_rtx_SUBREG (V8HImode, d.op0, 0); > + emit_insn (gen_bswapv8hi (target, op0)); > + } >return true; > } > >if (memcmp (d.perm, perm[1], MAX_VECT_LEN) == 0) > { > - rtx target = gen_rtx_SUBREG (V4SImode, d.target, 0); > - rtx op0 = gen_rtx_SUBREG (V4SImode, d.op0, 0); > - emit_insn (gen_bswapv4si (target, op0)); > + if (!d.testing_p) > + { > + rtx target = gen_rtx_SUBREG (V4SImode, d.target, 0); > + rtx op0 = gen_rtx_SUBREG (V4SImode, d.op0, 0); > + emit_insn (gen_bswapv4si (target, op0)); > + } >return true; > } > >if (memcmp (d.perm, perm[2], MAX_VECT_LEN) == 0) > { > - rtx target = gen_rtx_SUBREG (V2DImode, d.target, 0); > - rtx op0 = gen_rtx_SUBREG (V2DImode, d.op0, 0); > - emit_insn (gen_bswapv2di (target, op0)); > + if (!d.testing_p) > + { > + rtx target = 
gen_rtx_SUBREG (V2DImode, d.target, 0); > + rtx op0 = gen_rtx_SUBREG (V2DImode, d.op0, 0); > + emit_insn (gen_bswapv2di (target, op0)); > + } >return true; > } > >if (memcmp (d.perm, perm[3], MAX_VECT_LEN) == 0) > { > - rtx target = gen_rtx_SUBREG (V1TImode, d.target, 0); > - rtx op0 = gen_rtx_SUBREG (V1TImode, d.op0, 0); > - emit_insn (gen_bswapv1ti (target, op0)); > + if (!d.testing_p) > + { > + rtx target = gen_rtx_SUBREG (V1TImode, d.target, 0); > + rtx op0 = gen_rtx_SUBREG (V1TImode, d.op0, 0); > + emit_insn (gen_bswapv1ti (target, op0)); > + } >return true; > } >
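The pattern applied by the patch, reduced to a plain C sketch: when the caller only asks whether a permutation is supported (d.testing_p in the patch), the function has to answer without generating any code. The struct and the emitted counter below are stand-ins I made up; the real code calls emit_insn.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the permute descriptor.  */
struct perm_request
{
  bool testing_p;  /* query only, nothing may be emitted */
  int emitted;     /* counts instructions "emitted" so far */
};

/* Return true if the permutation is supported.  Code is generated
   only when this is not a mere query.  */
static bool
try_expand (struct perm_request *d, bool pattern_matches)
{
  if (!pattern_matches)
    return false;

  if (!d->testing_p)
    d->emitted++;  /* stand-in for emit_insn (...) */

  return true;
}
```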
[Committed] IBM Z: Cover weak symbols with -munaligned-symbols
With the recently introduced -munaligned-symbols option byte-sized variables which are resolved externally are considered to be potentially misaligned. However, this should rather also be applied to symbols which resolve locally if they are weak. Done with this patch. Committed to mainline. gcc/ChangeLog: * config/s390/s390.cc (s390_encode_section_info): Replace SYMBOL_REF_LOCAL_P with decl_binds_to_current_def_p. gcc/testsuite/ChangeLog: * gcc.target/s390/unaligned-2.c: New test. --- gcc/config/s390/s390.cc | 6 ++ gcc/testsuite/gcc.target/s390/unaligned-2.c | 16 2 files changed, 18 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/unaligned-2.c diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc index 044de874590..a5c36b43972 100644 --- a/gcc/config/s390/s390.cc +++ b/gcc/config/s390/s390.cc @@ -13802,10 +13802,8 @@ s390_encode_section_info (tree decl, rtx rtl, int first) byte aligned as mandated by our ABI. This behavior can be overridden for external symbols with the -munaligned-symbols switch. */ - if (DECL_ALIGN (decl) % 16 - && (DECL_USER_ALIGN (decl) - || (!SYMBOL_REF_LOCAL_P (XEXP (rtl, 0)) - && s390_unaligned_symbols_p))) + if ((DECL_USER_ALIGN (decl) && DECL_ALIGN (decl) % 16) + || (s390_unaligned_symbols_p && !decl_binds_to_current_def_p (decl))) SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0)); else if (DECL_ALIGN (decl) % 32) SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0)); diff --git a/gcc/testsuite/gcc.target/s390/unaligned-2.c b/gcc/testsuite/gcc.target/s390/unaligned-2.c new file mode 100644 index 000..c1ece6d5935 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/unaligned-2.c @@ -0,0 +1,16 @@ +/* weak symbols might get overridden in another module by symbols + which are not aligned on a 2-byte boundary. Although this violates + the zABI we try to handle this gracefully by not using larl on + these symbols if -munaligned-symbols has been specified. 
*/ + +/* { dg-do compile } */ +/* { dg-options "-O3 -march=z900 -fno-section-anchors -munaligned-symbols" } */ +unsigned char __attribute__((weak)) weaksym = 0; + +unsigned char +foo () +{ + return weaksym; +} + +/* { dg-final { scan-assembler-times "larl\t%r\[0-9\]*,weaksym\n" 0 } } */ -- 2.43.0
Re: [PATCH] s390: Fix builtin-classify-type-1.c on s390 too [PR112725]
On 11/30/23 17:34, Jakub Jelinek wrote: > On Wed, Nov 29, 2023 at 07:27:20PM +0100, Jakub Jelinek wrote: >> Given that the s390 backend defines pretty much the same target hook >> as rs6000, I believe it suffers (at least when using -mvx?) the same >> problem as rs6000, though admittedly this is so far completely >> untested. >> >> Ok for trunk if it passes bootstrap/regtest there? > > Now successfully bootstrapped/regtested on s390x-linux and indeed it > fixes > -FAIL: c-c++-common/builtin-classify-type-1.c -Wc++-compat (test for excess > errors) > -UNRESOLVED: c-c++-common/builtin-classify-type-1.c -Wc++-compat > compilation failed to produce executable > there as well. > >> 2023-11-29 Jakub Jelinek >> >> PR target/112725 >> * config/s390/s390.cc (s390_invalid_arg_for_unprototyped_fn): Return >> NULL for __builtin_classify_type calls with vector arguments. Ok. Thank you, Jakub! Andreas
Re: [PATCH] s390x: Fix PR112753
On 11/30/23 16:45, Juergen Christ wrote: > Commit 466b100e5fee808d77598e0f294654deec281150 introduced a bug in > s390_md_asm_adjust if vector extensions are not available. Fix the control > flow of this function to not adjust long double values. > > gcc/ChangeLog: > > * config/s390/s390.cc (s390_md_asm_adjust): Fix. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/pr112753.c: New test. > > Bootstrapped and tested on s390x. Committed to mainline with a slightly more verbose changelog which also refers to the BZ. Thanks! Andreas > > Signed-off-by: Juergen Christ > --- > gcc/config/s390/s390.cc | 4 > gcc/testsuite/gcc.target/s390/pr112753.c | 8 > 2 files changed, 12 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/s390/pr112753.c > > diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc > index 29b5dc979207..3a4d2d346f0c 100644 > --- a/gcc/config/s390/s390.cc > +++ b/gcc/config/s390/s390.cc > @@ -17604,6 +17604,10 @@ s390_md_asm_adjust (vec &outputs, vec > &inputs, >outputs[i] = fprx2; > } > > + if (!TARGET_VXE) > +/* Long doubles are stored in FPR pairs - nothing left to do. */ > +return after_md_seq; > + >for (unsigned i = 0; i < ninputs; i++) > { >if (GET_MODE (inputs[i]) != TFmode) > diff --git a/gcc/testsuite/gcc.target/s390/pr112753.c > b/gcc/testsuite/gcc.target/s390/pr112753.c > new file mode 100644 > index ..7183b3f12bed > --- /dev/null > +++ b/gcc/testsuite/gcc.target/s390/pr112753.c > @@ -0,0 +1,8 @@ > +/* This caused an ICE on s390x due to a bug in s390_md_asm_adjust when no > + vector extension is available. */ > + > +/* { dg-do compile } */ > +/* { dg-options "-O2 -march=zEC12" } */ > + > +long double strtold_l_internal___x; > +void strtold_l_internal() { __asm__("" : : > "fm"(strtold_l_internal___x)); }
Re: [PATCH] testsuite: Fix up gcc.target/s390/pr96127.c test for modern C [PR96127]
On 12/3/23 19:36, Jakub Jelinek wrote: > Hi! > > I've noticed this test regressed on s390x-linux with the addition of the > switch to modern C patchset. Haven't tried to reproduce the ICE, but as it > was a backend ICE and FE after warning used to add such casts before (now > errors), I think this ought to keep the testcase testing what was intended > before. > > Ok for trunk? Ok, thanks! Andreas
Re: [PATCH] s390: Fix expansion of vec_step
On 12/4/23 11:14, Stefan Schulze Frielinghaus wrote: > Add missing "s390" while expanding vec_step to __builtin_s390_vec_step. > > gcc/ChangeLog: > > * config/s390/vecintrin.h (vec_step): Expand vec_step to > __builtin_s390_vec_step. Ok, Thanks! Andreas
Re: [PATCH 1/3] s390: Recognize further vpdi and vmr{l,h} pattern
On 11/9/23 09:22, Stefan Schulze Frielinghaus wrote: > Deal with cases where vpdi and vmr{l,h} are still applicable if the > operands of those instructions are swapped. For example, currently for > > V2DI foo (V2DI x) > { > return (V2DI) {x[1], x[0]}; > } > > the assembler sequence > > vlgvg %r1,%v24,1 > vzero %v0 > vlvgg %v0,%r1,0 > vmrhg %v24,%v0,%v24 > > is emitted. With this patch a single vpdi is emitted. > > Extensive tests are included in a subsequent patch of this series where > more cases are covered. > > Bootstrapped and regtested on s390. Ok for mainline? > > gcc/ChangeLog: > > * config/s390/s390.cc (expand_perm_with_merge): Deal with cases > where vmr{l,h} are still applicable if the operands are swapped. > (expand_perm_with_vpdi): Likewise for vpdi. Ok, Thanks! Andreas
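For reference, a compilable form of the example from this message, using the GNU C vector extension. The typedef and the name swap_v2di are my additions (the message calls the function foo). Which instructions the compiler picks depends on the target and the compiler version; the sketch only shows the source-level operation.

```c
/* Two 64-bit lanes in a 16-byte vector (GNU C vector extension).  */
typedef long long V2DI __attribute__ ((vector_size (16)));

/* Swap the two lanes of X, as in the example from the patch
   description.  */
static V2DI
swap_v2di (V2DI x)
{
  return (V2DI) { x[1], x[0] };
}
```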
Re: [PATCH 3/3] s390: Revise vector reverse elements
On 11/9/23 09:22, Stefan Schulze Frielinghaus wrote: > Replace UNSPEC_VEC_ELTSWAP with a vec_select implementation. > > Furthermore, for a vector reverse elements operation between registers > of mode V8HI perform three rotates instead of a vperm operation since > the latter involves loading the permutation vector from the literal > pool. > > Prior to z15, instead of > larl + vl + vl + vperm > prefer > vl + vpdi (+ verllg (+ verllf)) > for a load operation. > > Likewise, prior to z15, instead of > larl + vl + vperm + vst > prefer > vpdi (+ verllg (+ verllf)) + vst > for a store operation. > > Bootstrapped and regtested on s390. Ok for mainline? > > gcc/ChangeLog: > > * config/s390/s390.md: Remove UNSPEC_VEC_ELTSWAP. > * config/s390/vector.md (eltswapv16qi): New expander. > (*eltswapv16qi): New insn and splitter. > (eltswapv8hi): New insn and splitter. > (eltswap): New insn and splitter for modes V_HW_4 as well > as V_HW_2. > * config/s390/vx-builtins.md (eltswap): Remove. > (*eltswapv16qi): Remove. > (*eltswap): Remove. > (*eltswap_emu): Remove. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/zvector/vec-reve-load-halfword-z14.c: Remove > vperm and substitute with vpdi et al. > * gcc.target/s390/zvector/vec-reve-load-halfword.c: Likewise. > * gcc.target/s390/vector/reverse-elements-1.c: New test. > * gcc.target/s390/vector/reverse-elements-2.c: New test. > * gcc.target/s390/vector/reverse-elements-3.c: New test. > * gcc.target/s390/vector/reverse-elements-4.c: New test. > * gcc.target/s390/vector/reverse-elements-5.c: New test. > * gcc.target/s390/vector/reverse-elements-6.c: New test. > * gcc.target/s390/vector/reverse-elements-7.c: New test. Ok, thanks! Andreas
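To see why three rotates are enough to reverse eight halfwords, here is a small model in portable C. It is only an illustration of the arithmetic: the vec128 type and the helper names are made up, and mapping the three steps to vpdi, verllg and verllf is my reading of the sequence quoted in the message.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit vector: 'hi' holds halfwords 0..3,
   'lo' holds halfwords 4..7, element 0 being the most significant.  */
typedef struct
{
  uint64_t hi;
  uint64_t lo;
} vec128;

static uint32_t
rotl32_by16 (uint32_t x)
{
  return (x << 16) | (x >> 16);
}

static uint64_t
rotl64_by32 (uint64_t x)
{
  return (x << 32) | (x >> 32);
}

/* Rotate both 32-bit words of X by 16 bits.  */
static uint64_t
rot_each_word_by16 (uint64_t x)
{
  uint32_t upper = (uint32_t) (x >> 32);
  uint32_t lower = (uint32_t) x;
  return ((uint64_t) rotl32_by16 (upper) << 32) | rotl32_by16 (lower);
}

/* Reverse the eight halfwords of V in three steps.  */
static vec128
reverse_v8hi (vec128 v)
{
  /* Step 1: swap the doublewords (rotate 128 bits by 64).  */
  vec128 r = { v.lo, v.hi };

  /* Step 2: rotate each doubleword by 32 bits.  */
  r.hi = rotl64_by32 (r.hi);
  r.lo = rotl64_by32 (r.lo);

  /* Step 3: rotate each word by 16 bits.  */
  r.hi = rot_each_word_by16 (r.hi);
  r.lo = rot_each_word_by16 (r.lo);
  return r;
}
```

Each step halves the granularity at which elements are still in the wrong order, so after the third step the halfwords are fully reversed without needing a permutation vector.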
Re: [PATCH 2/3] s390: Add expand_perm_reverse_elements
On 11/9/23 09:22, Stefan Schulze Frielinghaus wrote: > Replace expand_perm_with_rot, expand_perm_with_vster, and > expand_perm_with_vstbrq with a general implementation > expand_perm_reverse_elements. > > Bootstrapped and regtested on s390. Ok for mainline? > > gcc/ChangeLog: > > * config/s390/s390.cc (expand_perm_with_rot): Remove. > (expand_perm_reverse_elements): New. > (expand_perm_with_vster): Remove. > (expand_perm_with_vstbrq): Remove. > (vectorize_vec_perm_const_1): Replace removed functions with new > one. Ok, thanks! Andreas
Re: [PATCH] s390: Reduce number of patterns where the condition is false anyway
On 11/9/23 09:24, Stefan Schulze Frielinghaus wrote: > For patterns which make use of two modes, do not build the cross product > and then exclude illegal combinations via conditions but rather do not > create those in the first place. Here we are following the idea of the > attribute TOINTVEC/tointvec and introduce TOINT/toint. > > Bootstrapped and regtested on s390. Ok for mainline? > > gcc/ChangeLog: > > * config/s390/s390.md (VX_CONV_INT): Remove iterator. > (gf): Add float mappings. > (TOINT, toint): New attribute. > (*fixuns_trunc2_z13): > Remove. > (*fixuns_trunc2_z13): Add. > (*fix_trunc2_bfp_z13): > Remove. > (*fix_trunc2_bfp_z13): Add. > (*floatuns2_z13): Remove. > (*floatuns2_z13): Add. > * config/s390/vector.md (VX_VEC_CONV_INT): Remove iterator. > (float2): Remove. > (float2): Add. > (floatuns2): Remove. > (floatuns2): Add. > (fix_trunc2): > Remove. > (fix_trunc2): Add. > (fixuns_trunc2): > Remove. > (fixuns_trunc2): Add. Ok, thanks! Andreas
[Committed] IBM Z: Fix ICE with overloading and checking enabled
s390_resolve_overloaded_builtin, when called on NON_DEPENDENT_EXPR, ICEs when using the type from it which ends up as error_mark_node. This particular instance of the problem does not occur anymore since NON_DEPENDENT_EXPR has been removed. Nevertheless that case needs to be handled here. Bootstrapped and regression tested on IBM Z. Committed to mainline. gcc/ChangeLog: * config/s390/s390-c.cc (s390_fn_types_compatible): Add a check for error_mark_node. gcc/testsuite/ChangeLog: * g++.target/s390/zvec-templ-1.C: New test. --- gcc/config/s390/s390-c.cc| 3 +++ gcc/testsuite/g++.target/s390/zvec-templ-1.C | 24 2 files changed, 27 insertions(+) create mode 100644 gcc/testsuite/g++.target/s390/zvec-templ-1.C diff --git a/gcc/config/s390/s390-c.cc b/gcc/config/s390/s390-c.cc index 269f4f8e978..fce569342f3 100644 --- a/gcc/config/s390/s390-c.cc +++ b/gcc/config/s390/s390-c.cc @@ -781,6 +781,9 @@ s390_fn_types_compatible (enum s390_builtin_ov_type_index typeindex, tree in_arg = (*arglist)[i]; tree in_type = TREE_TYPE (in_arg); + if (in_type == error_mark_node) + goto mismatch; + if (VECTOR_TYPE_P (b_arg_type)) { /* Vector types have to match precisely. */ diff --git a/gcc/testsuite/g++.target/s390/zvec-templ-1.C b/gcc/testsuite/g++.target/s390/zvec-templ-1.C new file mode 100644 index 000..07bb65f199b --- /dev/null +++ b/gcc/testsuite/g++.target/s390/zvec-templ-1.C @@ -0,0 +1,24 @@ +// { dg-do compile } +// { dg-options "-O0 -mzvector -march=arch14 -mzarch" } +// { dg-bogus "internal compiler error" "ICE" { target s390*-*-* } 23 } +// { dg-excess-errors "" } + +/* This used to ICE with checking enabled because + s390_resolve_overloaded_builtin gets called on NON_DEPENDENT_EXPR + arguments. We then try to determine the type of it, get an error + node and ICEd consequently when using this. 
+ + This particular instance of the problem disappeared when + NON_DEPENDENT_EXPRs got removed with: + + commit dad311874ac3b3cf4eca1c04f67cae80c953f7b8 + Author: Patrick Palka + Date: Fri Oct 20 10:45:00 2023 -0400 + +c++: remove NON_DEPENDENT_EXPR, part 1 + + Nevertheless we should check for error mark nodes in that code. */ + +template void foo() { + __builtin_s390_vec_perm( , , ); +} -- 2.41.0
[Committed] IBM Z: Add GTY marker to builtin data structures
This adds GTY markers to s390_builtin_types, s390_builtin_fn_types, and s390_builtin_decls. These were missing causing problems in particular when using builtins after including a precompiled header. Unfortunately the declaration of these data structures use enum values from s390-builtins.h. This file however is not included everywhere and is rather large. In order to include it only for the purpose of gtype-desc.cc we place a preprocessed copy of it in the build directory and include only this. This is going to be backported to GCC 12 and 13. Bootstrapped and regression tested on IBM Z. Committed to mainline. gcc/ChangeLog: * config.gcc: Add s390-gen-builtins.h to target_gtfiles. * config/s390/s390-builtins.h (s390_builtin_types) (s390_builtin_fn_types, s390_builtin_decls): Add GTY marker. * config/s390/t-s390 (EXTRA_GTYPE_DEPS): Add s390-gen-builtins.h. Add build rule for s390-gen-builtins.h. --- gcc/config.gcc | 1 + gcc/config/s390/s390-builtins.h | 10 +- gcc/config/s390/t-s390 | 4 3 files changed, 10 insertions(+), 5 deletions(-) diff --git a/gcc/config.gcc b/gcc/config.gcc index ba6d63e33ac..c1460ca354e 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -571,6 +571,7 @@ s390*-*-*) d_target_objs="s390-d.o" extra_options="${extra_options} fused-madd.opt" extra_headers="s390intrin.h htmintrin.h htmxlintrin.h vecintrin.h" + target_gtfiles="./s390-gen-builtins.h" ;; # Note the 'l'; we need to be able to match e.g. "shle" or "shl". sh[123456789lbe]*-*-* | sh-*-*) diff --git a/gcc/config/s390/s390-builtins.h b/gcc/config/s390/s390-builtins.h index 45bba876828..84676fe5b3f 100644 --- a/gcc/config/s390/s390-builtins.h +++ b/gcc/config/s390/s390-builtins.h @@ -88,8 +88,8 @@ enum s390_builtin_ov_type_index #define MAX_OV_OPERANDS 6 -extern tree s390_builtin_types[BT_MAX]; -extern tree s390_builtin_fn_types[BT_FN_MAX]; +extern GTY(()) tree s390_builtin_types[BT_MAX]; +extern GTY(()) tree s390_builtin_fn_types[BT_FN_MAX]; /* Builtins. 
*/ @@ -172,6 +172,6 @@ opflags_for_builtin (int fcode) return opflags_builtin[fcode]; } -extern tree s390_builtin_decls[S390_BUILTIN_MAX + - S390_OVERLOADED_BUILTIN_MAX + - S390_OVERLOADED_BUILTIN_VAR_MAX]; +extern GTY(()) tree s390_builtin_decls[S390_BUILTIN_MAX + + S390_OVERLOADED_BUILTIN_MAX + + S390_OVERLOADED_BUILTIN_VAR_MAX]; diff --git a/gcc/config/s390/t-s390 b/gcc/config/s390/t-s390 index 828818bed2d..4ab9718f6e2 100644 --- a/gcc/config/s390/t-s390 +++ b/gcc/config/s390/t-s390 @@ -19,6 +19,7 @@ TM_H += $(srcdir)/config/s390/s390-builtins.def TM_H += $(srcdir)/config/s390/s390-builtin-types.def PASSES_EXTRA += $(srcdir)/config/s390/s390-passes.def +EXTRA_GTYPE_DEPS += ./s390-gen-builtins.h s390-c.o: $(srcdir)/config/s390/s390-c.cc \ $(srcdir)/config/s390/s390-protos.h $(CONFIG_H) $(SYSTEM_H) coretypes.h \ @@ -30,3 +31,6 @@ s390-c.o: $(srcdir)/config/s390/s390-c.cc \ s390-d.o: $(srcdir)/config/s390/s390-d.cc $(COMPILE) $< $(POSTCOMPILE) + +s390-gen-builtins.h: $(srcdir)/config/s390/s390-builtins.h + $(COMPILER) -E $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $< > $@ -- 2.41.0
Re: [PATCH] s390: Fix vec_scatter_element for vectors of floats
On 11/14/23 12:44, Stefan Schulze Frielinghaus wrote: > The offset for vec_scatter_element of floats should be a vector of type > UV4SI instead of V4SF. Note, this is an incompatible change. > > Bootstrapped on s390. Ok for mainline? > > gcc/ChangeLog: > > * config/s390/s390-builtin-types.def: Add/remove types. > * config/s390/s390-builtins.def (s390_vec_scatter_element_flt): > The type for the offset should be UV4SI instead of V4SF. Ok, Thanks! Andreas > --- > gcc/config/s390/s390-builtin-types.def | 2 +- > gcc/config/s390/s390-builtins.def | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/s390/s390-builtin-types.def > b/gcc/config/s390/s390-builtin-types.def > index 3d8b30cdcc8..22ee348dbbb 100644 > --- a/gcc/config/s390/s390-builtin-types.def > +++ b/gcc/config/s390/s390-builtin-types.def > @@ -856,7 +856,7 @@ DEF_OV_TYPE (BT_OV_VOID_V2DI_LONG_LONGLONGPTR, BT_VOID, > BT_V2DI, BT_LONG, BT_LON > DEF_OV_TYPE (BT_OV_VOID_V2DI_UV2DI_LONGLONGPTR_ULONGLONG, BT_VOID, BT_V2DI, > BT_UV2DI, BT_LONGLONGPTR, BT_ULONGLONG) > DEF_OV_TYPE (BT_OV_VOID_V4SF_FLTPTR_UINT, BT_VOID, BT_V4SF, BT_FLTPTR, > BT_UINT) > DEF_OV_TYPE (BT_OV_VOID_V4SF_LONG_FLTPTR, BT_VOID, BT_V4SF, BT_LONG, > BT_FLTPTR) > -DEF_OV_TYPE (BT_OV_VOID_V4SF_V4SF_FLTPTR_ULONGLONG, BT_VOID, BT_V4SF, > BT_V4SF, BT_FLTPTR, BT_ULONGLONG) > +DEF_OV_TYPE (BT_OV_VOID_V4SF_UV4SI_FLTPTR_ULONGLONG, BT_VOID, BT_V4SF, > BT_UV4SI, BT_FLTPTR, BT_ULONGLONG) > DEF_OV_TYPE (BT_OV_VOID_V4SI_INTPTR_UINT, BT_VOID, BT_V4SI, BT_INTPTR, > BT_UINT) > DEF_OV_TYPE (BT_OV_VOID_V4SI_LONG_INTPTR, BT_VOID, BT_V4SI, BT_LONG, > BT_INTPTR) > DEF_OV_TYPE (BT_OV_VOID_V4SI_UV4SI_INTPTR_ULONGLONG, BT_VOID, BT_V4SI, > BT_UV4SI, BT_INTPTR, BT_ULONGLONG) > diff --git a/gcc/config/s390/s390-builtins.def > b/gcc/config/s390/s390-builtins.def > index 964d86c74a0..b59fa09fe07 100644 > --- a/gcc/config/s390/s390-builtins.def > +++ b/gcc/config/s390/s390-builtins.def > @@ -708,7 +708,7 @@ OB_DEF_VAR 
(s390_vec_scatter_element_u32,s390_vscef, > 0, > OB_DEF_VAR (s390_vec_scatter_element_s64,s390_vsceg,0, >O4_U1, BT_OV_VOID_V2DI_UV2DI_LONGLONGPTR_ULONGLONG) > OB_DEF_VAR (s390_vec_scatter_element_b64,s390_vsceg,0, >O4_U1, BT_OV_VOID_BV2DI_UV2DI_ULONGLONGPTR_ULONGLONG) > OB_DEF_VAR (s390_vec_scatter_element_u64,s390_vsceg,0, >O4_U1, BT_OV_VOID_UV2DI_UV2DI_ULONGLONGPTR_ULONGLONG) > -OB_DEF_VAR (s390_vec_scatter_element_flt,s390_vscef,B_VXE, >O4_U2, BT_OV_VOID_V4SF_V4SF_FLTPTR_ULONGLONG) > +OB_DEF_VAR (s390_vec_scatter_element_flt,s390_vscef,B_VXE, >O4_U2, BT_OV_VOID_V4SF_UV4SI_FLTPTR_ULONGLONG) > OB_DEF_VAR (s390_vec_scatter_element_dbl,s390_vsceg,0, >O4_U1, BT_OV_VOID_V2DF_UV2DI_DBLPTR_ULONGLONG) > > B_DEF (s390_vscef, vec_scatter_elementv4si,0, >B_VX, O4_U2, > BT_FN_VOID_UV4SI_UV4SI_UINTPTR_ULONGLONG)
Re: [PATCH] s390: Fix generation of s390-gen-builtins.h
On 11/15/23 14:29, Stefan Schulze Frielinghaus wrote: > By default the preprocessed output includes linemarkers. This leads to > an error if -pedantic is used as e.g. during bootstrap: > > s390-gen-builtins.h:1:3: error: style of line directive is a GCC extension > [-Werror] > > Fixed by omitting linemarkers while generating s390-gen-builtins.h. > > gcc/ChangeLog: > > * config/s390/t-s390: Generate s390-gen-builtins.h without > linemarkers. Ok, Thanks! Andreas > --- > gcc/config/s390/t-s390 | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/config/s390/t-s390 b/gcc/config/s390/t-s390 > index 4ab9718f6e2..2e884c367de 100644 > --- a/gcc/config/s390/t-s390 > +++ b/gcc/config/s390/t-s390 > @@ -33,4 +33,4 @@ s390-d.o: $(srcdir)/config/s390/s390-d.cc > $(POSTCOMPILE) > > s390-gen-builtins.h: $(srcdir)/config/s390/s390-builtins.h > - $(COMPILER) -E $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $< > $@ > + $(COMPILER) -E -P $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $< > > $@
Re: [PATCH] s390: Fix ICE in testcase pr89233
On 11/15/23 14:12, Juergen Christ wrote: > When using GNU vector extensions, an access outside of the vector size > caused an ICE on s390. Fix this by aligning with the vec_extract > builtin, i.e., computing constant index modulo number of lanes. > > Fixes testcase gcc.target/s390/pr89233.c. > > Bootstrapped and tested on s390. OK for mainline? > > gcc/ChangeLog: > > * config/s390/vector.md: (*vec_extract) Fix. Committed to mainline. Thanks! Andreas
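The fix computes a constant index modulo the number of lanes, as the message says. A minimal sketch of that index handling in plain C follows; the helper names are hypothetical, and a power-of-two lane count is assumed so that the modulo can be written as a mask.

```c
/* Wrap IDX into the range 0..NLANES-1.  NLANES is assumed to be a
   power of two, so masking is the same as taking the remainder.  */
static unsigned
wrap_lane_index (unsigned idx, unsigned nlanes)
{
  return idx & (nlanes - 1);
}

/* Read a lane from a four-element vector, wrapping the index first so
   that an out-of-range constant never reads outside the vector.  */
static int
extract_v4si (const int v[4], unsigned idx)
{
  return v[wrap_lane_index (idx, 4)];
}
```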
Re: [PATCH] s390: split int128 load
On 11/15/23 14:15, Juergen Christ wrote: > Issue two loads when using GPRs instead of one load-multiple. > > Bootstrapped and tested on s390. OK for mainline? > > gcc/ChangeLog: > > * config/s390/s390.md: Split TImode loads. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/int128load.c: New test. > > Signed-off-by: Juergen Christ Since the testcase is using __int128 it needs to be gated like this to prevent it from being tested with -m31: /* { dg-do compile { target int128 } } */ Committed to mainline with that change. Thanks! Andreas
Re: [PATCH] s390: implement flags output
On 11/15/23 14:15, Juergen Christ wrote: > Implement flags output for inline assemblies. Only use one output constraint > that captures the whole condition code. No breakout into different condition > codes is allowed. Also, only one condition code variable is allowed. > > Add further logic to canonicalize various cases where we combine different > cases of possible condition codes. > > Bootstrapped and tested on s390. OK for mainline? > > gcc/ChangeLog: > > * config/s390/s390-c.cc (s390_cpu_cpp_builtins): Define > __GCC_ASM_FLAG_OUTPUTS__. > * config/s390/s390.cc (s390_canonicalize_comparison): More > UNSPEC_CC_TO_INT cases. > (s390_md_asm_adjust): Implement flags output. > * config/s390/s390.md (ccstore4): Allow mask operands. > * doc/extend.texi: Document flags output. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/ccor.c: New test. > > Signed-off-by: Juergen Christ Committed to mainline with a few minor formatting fixes. Thanks! Andreas
Re: [PATCH] s390: Fix builtins floating-point convert to/from fixed
Ok, thanks! Andreas On 11/27/23 10:11, Stefan Schulze Frielinghaus wrote: > Ping. > > On Tue, Nov 14, 2023 at 04:19:59PM +0100, Stefan Schulze Frielinghaus wrote: >> Remove flags for non-existing operands 2 and 3. >> >> Bootstrapped on s390. Ok for mainline? >> >> gcc/ChangeLog: >> >> * config/s390/s390-builtins.def >> (s390_vcefb,s390_vcdgb,s390_vcelfb,s390_vcdlgb,s390_vcfeb,s390_vcgdb, >> s390_vclfeb,s390_vclgdb): Remove flags for non-existing operands >> 2 and 3. >> --- >> gcc/config/s390/s390-builtins.def | 16 >> 1 file changed, 8 insertions(+), 8 deletions(-) >> >> diff --git a/gcc/config/s390/s390-builtins.def >> b/gcc/config/s390/s390-builtins.def >> index 964d86c74a0..5bcf0d16ba3 100644 >> --- a/gcc/config/s390/s390-builtins.def >> +++ b/gcc/config/s390/s390-builtins.def >> @@ -2840,10 +2840,10 @@ OB_DEF (s390_vec_double, >> s390_vec_double_s64,s390_vec_double_u64, >> OB_DEF_VAR (s390_vec_double_s64,s390_vcdgb, 0, >> 0, BT_OV_V2DF_V2DI) >> OB_DEF_VAR (s390_vec_double_u64,s390_vcdlgb,0, >> 0, BT_OV_V2DF_UV2DI) >> >> -B_DEF (s390_vcefb, floatv4siv4sf2, 0, >> B_VXE2, O2_U4 | O3_U3, BT_FN_V4SF_V4SI) >> -B_DEF (s390_vcdgb, floatv2div2df2, 0, >> B_VX, O2_U4 | O3_U3, BT_FN_V2DF_V2DI) >> -B_DEF (s390_vcelfb,floatunsv4siv4sf2, 0, >> B_VXE2, O2_U4 | O3_U3, BT_FN_V4SF_UV4SI) >> -B_DEF (s390_vcdlgb,floatunsv2div2df2, 0, >> B_VX, O2_U4 | O3_U3, BT_FN_V2DF_UV2DI) >> +B_DEF (s390_vcefb, floatv4siv4sf2, 0, >> B_VXE2, 0, BT_FN_V4SF_V4SI) >> +B_DEF (s390_vcdgb, floatv2div2df2, 0, >> B_VX, 0, BT_FN_V2DF_V2DI) >> +B_DEF (s390_vcelfb,floatunsv4siv4sf2, 0, >> B_VXE2, 0, BT_FN_V4SF_UV4SI) >> +B_DEF (s390_vcdlgb,floatunsv2div2df2, 0, >> B_VX, 0, BT_FN_V2DF_UV2DI) >> >> OB_DEF (s390_vec_signed, >> s390_vec_signed_flt,s390_vec_signed_dbl,B_VX, >> BT_FN_OV4SI_OV4SI) >> OB_DEF_VAR (s390_vec_signed_flt,s390_vcfeb, B_VXE2, >> 0, BT_OV_V4SI_V4SF) >> @@ -2853,10 +2853,10 @@ OB_DEF (s390_vec_unsigned, >> s390_vec_unsigned_flt,s390_vec_unsigned_ >> OB_DEF_VAR (s390_vec_unsigned_flt, 
s390_vclfeb,B_VXE2, >> 0, BT_OV_UV4SI_V4SF) >> OB_DEF_VAR (s390_vec_unsigned_dbl, s390_vclgdb,0, >> 0, BT_OV_UV2DI_V2DF) >> >> -B_DEF (s390_vcfeb, fix_truncv4sfv4si2, 0, >> B_VXE2, O2_U4 | O3_U3, BT_FN_V4SI_V4SF) >> -B_DEF (s390_vcgdb, fix_truncv2dfv2di2, 0, >> B_VX, O2_U4 | O3_U3, BT_FN_V2DI_V2DF) >> -B_DEF (s390_vclfeb,fixuns_truncv4sfv4si2, 0, >> B_VXE2, O2_U4 | O3_U3, BT_FN_UV4SI_V4SF) >> -B_DEF (s390_vclgdb,fixuns_truncv2dfv2di2, 0, >> B_VX, O2_U4 | O3_U3, BT_FN_UV2DI_V2DF) >> +B_DEF (s390_vcfeb, fix_truncv4sfv4si2, 0, >> B_VXE2, 0, BT_FN_V4SI_V4SF) >> +B_DEF (s390_vcgdb, fix_truncv2dfv2di2, 0, >> B_VX, 0, BT_FN_V2DI_V2DF) >> +B_DEF (s390_vclfeb,fixuns_truncv4sfv4si2, 0, >> B_VXE2, 0, BT_FN_UV4SI_V4SF) >> +B_DEF (s390_vclgdb,fixuns_truncv2dfv2di2, 0, >> B_VX, 0, BT_FN_UV2DI_V2DF) >> >> B_DEF (s390_vfisb, vec_fpintv4sf, 0, >> B_VXE, O2_U4 | O3_U3, BT_FN_V4SF_V4SF_UCHAR_UCHAR) >> B_DEF (s390_vfidb, vec_fpintv2df, 0, >> B_VX, O2_U4 | O3_U3, BT_FN_V2DF_V2DF_UCHAR_UCHAR) >> -- >> 2.41.0 >>
Re: [PATCH] s390: Fix constraint for insn *cmphi_ccu
Ok, thanks! Andreas On 11/27/23 10:12, Stefan Schulze Frielinghaus wrote: > Ping. > > On Wed, Oct 25, 2023 at 11:27:33AM +0200, Stefan Schulze Frielinghaus wrote: >> Currently for an unsigned 16-bit comparison between memory and an >> immediate where the high bit is set, a clc is emitted. This is because >> the constant is created for mode HI and therefore sign extended. This >> means constraint D does not hold anymore. Since the mode already >> restricts the immediate to 16 bits, it is enough to make use of >> constraint n and chop off the high bits in the output template. >> >> Bootstrapped and regtested on s390. Ok for mainline? >> >> gcc/ChangeLog: >> >> * config/s390/s390.md (*cmphi_ccu): For immediate operand 1 make >> use of constraint n instead of D and chop off high bits in the >> output template. >> --- >> gcc/config/s390/s390.md | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md >> index 3f29ba21442..777a20f8e77 100644 >> --- a/gcc/config/s390/s390.md >> +++ b/gcc/config/s390/s390.md >> @@ -1355,13 +1355,13 @@ >> (define_insn "*cmphi_ccu" >>[(set (reg CC_REGNUM) >> (compare (match_operand:HI 0 "nonimmediate_operand" "d,d,Q,Q,BQ") >> - (match_operand:HI 1 "general_operand" "Q,S,D,BQ,Q")))] >> + (match_operand:HI 1 "general_operand" "Q,S,n,BQ,Q")))] >>"s390_match_ccmode (insn, CCUmode) >> && !register_operand (operands[1], HImode)" >>"@ >> clm\t%0,3,%S1 >> clmy\t%0,3,%S1 >> - clhhsi\t%0,%1 >> + clhhsi\t%0,%x1 >> # >> #" >>[(set_attr "op_type" "RS,RSY,SIL,SS,SS") >> -- >> 2.41.0 >>
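The arithmetic behind this fix can be shown in a few lines of plain C. This is a sketch, not GCC code: int64_t stands in for HOST_WIDE_INT, the helper names are made up, and in_unsigned16_range is only how I read the role of constraint D here. A 16-bit constant with the high bit set becomes negative once it is sign extended, so an unsigned range check fails, while masking the low 16 bits recovers the intended immediate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the sign-extended form of a 16-bit constant stored in a
   wider host integer.  */
static int64_t
canonical_hi_const (uint16_t v)
{
  return v >= 0x8000 ? (int64_t) v - 0x10000 : (int64_t) v;
}

/* Hypothetical range check for an unsigned 16-bit immediate.  */
static bool
in_unsigned16_range (int64_t v)
{
  return v >= 0 && v <= 0xffff;
}

/* Chop off everything above the low 16 bits.  */
static uint16_t
low16 (int64_t v)
{
  return (uint16_t) (v & 0xffff);
}
```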
Re: [PATCH] s390: Streamline NNPA builtins with their LLVM counterparts
Ok, thanks! Andreas On 11/27/23 10:12, Stefan Schulze Frielinghaus wrote: > Ping. > > On Thu, Nov 16, 2023 at 01:07:30PM +0100, Stefan Schulze Frielinghaus wrote: >> For the opaque NNP-data type prefer unsigned over signed integer types. >> >> gcc/ChangeLog: >> >> * config/s390/s390-builtin-types.def: Add/remove types. >> * config/s390/s390-builtins.def >> (s390_vclfnhs,s390_vclfnls,s390_vcrnfs,s390_vcfn,s390_vcnf): >> Replace type V8HI with UV8HI. >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/s390/zvector/vec-nnpa-fp16-convert.c: Replace V8HI >> types with UV8HI. >> * gcc.target/s390/zvector/vec-nnpa-fp32-convert-1.c: Dito. >> * gcc.target/s390/zvector/vec_convert_from_fp16.c: Dito. >> * gcc.target/s390/zvector/vec_convert_to_fp16.c: Dito. >> * gcc.target/s390/zvector/vec_extend_to_fp32_hi.c: Dito. >> * gcc.target/s390/zvector/vec_extend_to_fp32_lo.c: Dito. >> * gcc.target/s390/zvector/vec_round_from_fp32.c: Dito. >> --- >> gcc/config/s390/s390-builtin-types.def | 5 ++--- >> gcc/config/s390/s390-builtins.def | 10 +- >> .../gcc.target/s390/zvector/vec-nnpa-fp16-convert.c| 6 +++--- >> .../gcc.target/s390/zvector/vec-nnpa-fp32-convert-1.c | 2 +- >> .../gcc.target/s390/zvector/vec_convert_from_fp16.c| 4 ++-- >> .../gcc.target/s390/zvector/vec_convert_to_fp16.c | 4 ++-- >> .../gcc.target/s390/zvector/vec_extend_to_fp32_hi.c| 2 +- >> .../gcc.target/s390/zvector/vec_extend_to_fp32_lo.c| 2 +- >> .../gcc.target/s390/zvector/vec_round_from_fp32.c | 2 +- >> 9 files changed, 18 insertions(+), 19 deletions(-) >> >> diff --git a/gcc/config/s390/s390-builtin-types.def >> b/gcc/config/s390/s390-builtin-types.def >> index 3d8b30cdcc8..0bf759bd77a 100644 >> --- a/gcc/config/s390/s390-builtin-types.def >> +++ b/gcc/config/s390/s390-builtin-types.def >> @@ -265,9 +265,9 @@ DEF_FN_TYPE_2 (BT_FN_V2DI_V2DF_V2DF, BT_V2DI, BT_V2DF, >> BT_V2DF) >> DEF_FN_TYPE_2 (BT_FN_V2DI_V2DI_V2DI, BT_V2DI, BT_V2DI, BT_V2DI) >> DEF_FN_TYPE_2 (BT_FN_V2DI_V4SI_V4SI, BT_V2DI, BT_V4SI, BT_V4SI) >> 
DEF_FN_TYPE_2 (BT_FN_V4SF_FLT_INT, BT_V4SF, BT_FLT, BT_INT) >> +DEF_FN_TYPE_2 (BT_FN_V4SF_UV8HI_UINT, BT_V4SF, BT_UV8HI, BT_UINT) >> DEF_FN_TYPE_2 (BT_FN_V4SF_V4SF_UCHAR, BT_V4SF, BT_V4SF, BT_UCHAR) >> DEF_FN_TYPE_2 (BT_FN_V4SF_V4SF_V4SF, BT_V4SF, BT_V4SF, BT_V4SF) >> -DEF_FN_TYPE_2 (BT_FN_V4SF_V8HI_UINT, BT_V4SF, BT_V8HI, BT_UINT) >> DEF_FN_TYPE_2 (BT_FN_V4SI_BV4SI_V4SI, BT_V4SI, BT_BV4SI, BT_V4SI) >> DEF_FN_TYPE_2 (BT_FN_V4SI_INT_VOIDCONSTPTR, BT_V4SI, BT_INT, >> BT_VOIDCONSTPTR) >> DEF_FN_TYPE_2 (BT_FN_V4SI_UV4SI_UV4SI, BT_V4SI, BT_UV4SI, BT_UV4SI) >> @@ -279,7 +279,6 @@ DEF_FN_TYPE_2 (BT_FN_V8HI_BV8HI_V8HI, BT_V8HI, BT_BV8HI, >> BT_V8HI) >> DEF_FN_TYPE_2 (BT_FN_V8HI_UV8HI_UV8HI, BT_V8HI, BT_UV8HI, BT_UV8HI) >> DEF_FN_TYPE_2 (BT_FN_V8HI_V16QI_V16QI, BT_V8HI, BT_V16QI, BT_V16QI) >> DEF_FN_TYPE_2 (BT_FN_V8HI_V4SI_V4SI, BT_V8HI, BT_V4SI, BT_V4SI) >> -DEF_FN_TYPE_2 (BT_FN_V8HI_V8HI_UINT, BT_V8HI, BT_V8HI, BT_UINT) >> DEF_FN_TYPE_2 (BT_FN_V8HI_V8HI_V8HI, BT_V8HI, BT_V8HI, BT_V8HI) >> DEF_FN_TYPE_2 (BT_FN_VOID_UINT64PTR_UINT64, BT_VOID, BT_UINT64PTR, >> BT_UINT64) >> DEF_FN_TYPE_2 (BT_FN_VOID_V2DF_FLTPTR, BT_VOID, BT_V2DF, BT_FLTPTR) >> @@ -317,6 +316,7 @@ DEF_FN_TYPE_3 (BT_FN_UV8HI_UV8HI_USHORT_INT, BT_UV8HI, >> BT_UV8HI, BT_USHORT, BT_I >> DEF_FN_TYPE_3 (BT_FN_UV8HI_UV8HI_UV8HI_INT, BT_UV8HI, BT_UV8HI, BT_UV8HI, >> BT_INT) >> DEF_FN_TYPE_3 (BT_FN_UV8HI_UV8HI_UV8HI_INTPTR, BT_UV8HI, BT_UV8HI, >> BT_UV8HI, BT_INTPTR) >> DEF_FN_TYPE_3 (BT_FN_UV8HI_UV8HI_UV8HI_UV8HI, BT_UV8HI, BT_UV8HI, BT_UV8HI, >> BT_UV8HI) >> +DEF_FN_TYPE_3 (BT_FN_UV8HI_V4SF_V4SF_UINT, BT_UV8HI, BT_V4SF, BT_V4SF, >> BT_UINT) >> DEF_FN_TYPE_3 (BT_FN_V16QI_UV16QI_UV16QI_INTPTR, BT_V16QI, BT_UV16QI, >> BT_UV16QI, BT_INTPTR) >> DEF_FN_TYPE_3 (BT_FN_V16QI_V16QI_V16QI_INTPTR, BT_V16QI, BT_V16QI, >> BT_V16QI, BT_INTPTR) >> DEF_FN_TYPE_3 (BT_FN_V16QI_V16QI_V16QI_V16QI, BT_V16QI, BT_V16QI, BT_V16QI, >> BT_V16QI) >> @@ -347,7 +347,6 @@ DEF_FN_TYPE_3 (BT_FN_V4SI_V4SI_V4SI_V4SI, BT_V4SI, >> BT_V4SI, BT_V4SI, 
BT_V4SI) >> DEF_FN_TYPE_3 (BT_FN_V4SI_V8HI_V8HI_V4SI, BT_V4SI, BT_V8HI, BT_V8HI, >> BT_V4SI) >> DEF_FN_TYPE_3 (BT_FN_V8HI_UV8HI_UV8HI_INTPTR, BT_V8HI, BT_UV8HI, BT_UV8HI, >> BT_INTPTR) >> DEF_FN_TYPE_3 (BT_FN_V8HI_V16QI_V16QI_V8HI, BT_V8HI, BT_V16QI, BT_V16QI, >> BT_V8HI) >> -DEF_FN_TYPE_3 (BT_FN_V8HI_V4SF_V4SF_UINT, BT_V8HI, BT_V4SF, BT_V4SF, >> BT_UINT) >> DEF_FN_TYPE_3 (BT_FN_V8HI_V4SI_V4SI_INTPTR, BT_V8HI, BT_V4SI, BT_V4SI, >> BT_INTPTR) >> DEF_FN_TYPE_3 (BT_FN_V8HI_V8HI_V8HI_INTPTR, BT_V8HI, BT_V8HI, BT_V8HI, >> BT_INTPTR) >> DEF_FN_TYPE_3 (BT_FN_V8HI_V8HI_V8HI_V8HI, BT_V8HI, BT_V8HI, BT_V8HI, >> BT_V8HI) >> diff --git a/gcc/config/s390/s390-builtins.def >> b/gcc/config/s390/s390-builtins.def >> index 964d86c74a0..f331eba100a 100644 >> --- a/gcc/config/s390/s390-builtins.def >> +++ b/gcc/confi
Re: [PATCH] s390: Fixup builtins vec_rli and verll
On 11/27/23 10:53, Stefan Schulze Frielinghaus wrote: > Commit 248df13b966f46649e16dc3c8c92b263790ef503 restricted the rotate > count to immediates. Although the documentation of vec_rli (Vector > Element Rotate Left Immediate) can be read as if it were restricted to > immediates, this is not the case. Thus, revert this commit. > > In order to finally allow register operands, the rotate count must be of > type unsigned char since the expander expects it to be of mode QI. The > previously used type unsigned integer worked out for immediates since > those are of VOID mode anyway. > > Bootstrapped and regtested on s390. Ok for mainline? > > gcc/ChangeLog: > > * config/s390/s390-builtin-types.def: Remove types. > * config/s390/s390-builtins.def (O_U64): Remove 64-bit literal support. > Don't restrict s390_vec_rli and s390_verll[bhfg] to immediates. > * config/s390/s390.cc (s390_const_operand_ok): Remove 64-bit > literal support. Ok, Thanks! Andreas
Re: [PATCH] s390: Add missing builtin type
On 11/27/23 13:38, Stefan Schulze Frielinghaus wrote: > One builtin type slipped through the cracks of the last commits. > > Bootstrapped on s390. Ok for mainline? > > gcc/ChangeLog: > > * config/s390/s390-builtin-types.def (BT_FN_UV8HI_UV8HI_UINT): > Add missing builtin type. Ok Andreas
Re: [PATCH] s390: Check for ADDR_REGS in s390_decompose_addrstyle_without_index
On 6/26/24 14:15, Stefan Schulze Frielinghaus wrote: An explicit check for address registers was not required so far since during register allocation the processing of address constraints was sufficient. However, address constraints themselves do not check for REGNO_OK_FOR_{BASE,INDEX}_P. Thus, with the newly introduced late-combine pass in r15-1579-g792f97b44ffc5e we generate new insns with invalid address registers which aren't fixed up afterwards. Fixed by explicitly checking for address registers in s390_decompose_addrstyle_without_index such that those new insns are rejected. gcc/ChangeLog: target/PR115634 * config/s390/s390.cc (s390_decompose_addrstyle_without_index): Check for ADDR_REGS in s390_decompose_addrstyle_without_index. Ok. Thanks! Andreas
Re: [PATCH] s390: Align *cjump_64 and *icjump_64
On 7/11/24 16:29, Stefan Schulze Frielinghaus wrote: During machine reorg we optimize backward jumps and transform insns as e.g. (jump_insn 118 117 119 (set (pc) (if_then_else (ne (reg:CCRAW 33 %cc) (const_int 8 [0x8])) (label_ref 134) (pc))) "dec_math_1.f90":204:8 discrim 1 2161 {*cjump_64} (expr_list:REG_DEAD (reg:CCRAW 33 %cc) (int_list:REG_BR_PROB 719407028 (nil))) -> 134) into (jump_insn 118 117 432 (set (pc) (if_then_else (ne (reg:CCRAW 33 %cc) (const_int 8 [0x8])) (pc) (label_ref 433))) "dec_math_1.f90":204:8 discrim 1 -1 (expr_list:REG_DEAD (reg:CCRAW 33 %cc) (int_list:REG_BR_PROB 719407028 (nil))) -> 433) The latter is not recognized anymore since *icjump_64 only matches CC_REGNUM against zero. Fixed by aligning *cjump_64 and *icjump_64. gcc/ChangeLog: * config/s390/s390.md (*icjump_64): Allow raw CC comparisons, i.e., any constant integer between 0 and 15 for CC comparisons. Ok. Thanks! Andreas
Re: [PATCH] s390: Fix output template for movv1qi
On 7/2/24 15:43, Stefan Schulze Frielinghaus wrote: Although for instructions MVI and MVIY it does not make a difference whether the immediate is interpreted as signed or unsigned, GAS expects unsigned immediates for instruction format SI_URD. gcc/ChangeLog: * config/s390/vector.md (mov): Fix output template for movv1qi. Ok. Thanks Andreas --- Bootstrapped and regtested on s390. Ok for {mainline,11,12,13,14}? gcc/config/s390/vector.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md index 40de0c75a7c..26fd505f2cd 100644 --- a/gcc/config/s390/vector.md +++ b/gcc/config/s390/vector.md @@ -368,8 +368,8 @@ lr\t%0,%1 mvi\t%0,0 mviy\t%0,0 - mvi\t%0,-1 - mviy\t%0,-1 + mvi\t%0,255 + mviy\t%0,255 lhi\t%0,0 lhi\t%0,-1 llc\t%0,%1
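The point about signed versus unsigned immediates can be checked with a small C sketch. It is only an illustration: the helper names are hypothetical and the range check merely models what an assembler expecting an unsigned 8-bit immediate would accept, it is not taken from GAS.

```c
#include <assert.h>
#include <stdint.h>

/* The byte that ends up in memory is the same for -1 and 255 ...  */
static uint8_t
byte_from_immediate (int imm)
{
  return (uint8_t) imm;
}

/* ... but only 255 lies in the unsigned 8-bit range.  This is a
   hypothetical model of an unsigned-immediate operand check.  */
static int
fits_unsigned_8bit (int imm)
{
  return imm >= 0 && imm <= 255;
}
```

Both spellings store the byte 0xff, which is why the template can switch from -1 to 255 without changing behaviour.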
Re: [PATCH 0/3] Prepare and drop vcond expanders
On 7/1/24 10:32, Stefan Schulze Frielinghaus wrote: This drops vcond expanders. The first patch "s390: Emulate vec_cmp{eq,gt,gtu} for 128-bit integers" is somewhat independent of the other two, since we already run into ICEs. However, after removing the vcond expanders the testsuite shows one additional fallout without this patch, which is why I would like to make sure that this patch lands first and have included it in this series. Stefan Schulze Frielinghaus (3): s390: Emulate vec_cmp{eq,gt,gtu} for 128-bit integers s390: Enable vcond_mask for 128-bit ops s390: Drop vcond{,u} expanders Ok. Thanks! Andreas gcc/config/s390/vector.md | 156 -- .../gcc.target/s390/vector/vec-cmp-emu-1.c| 35 .../gcc.target/s390/vector/vec-cmp-emu-2.c| 18 ++ .../gcc.target/s390/vector/vec-cmp-emu-3.c| 17 ++ 4 files changed, 175 insertions(+), 51 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-1.c create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-2.c create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-3.c
Re: [PATCH] s390: Fully exploit vgm, vgbm, vrepi
On 7/2/24 15:48, Stefan Schulze Frielinghaus wrote: Currently instructions vgm and vrepi are utilized only for constant vectors where the element mode equals the element mode of the corresponding instruction. This patch lifts this restriction by making use of those instructions for constant vectors even if element modes do not coincide. For example, the constant vector (v2di){0x7ffe7ffe, 0x7ffe7ffe} can be loaded via vgmf %v0,1,30. Similarly, the constant vector (v4si){0x, 0x, 0x, 0x} can be loaded via vrepiq %v0,-86. Analogously, if the element mode of a constant vector is smaller than the element mode of a corresponding instruction, we still may make use of those instructions. For example, the constant vector (v4si){0x7fff, 0xfffe, 0x7fff, 0xfffe} can be loaded via vgmg %v0,17,46. Similarly, the constant vector (v4si){-1, -16643, -1, -16643} can be loaded via vrepig %v0,-16643. Additionally this patch enables vgm, vgbm, vrepi for partial vectors, i.e., vectors of size less than 16 bytes. Basically this is done by treating a vector as a full vector resulting in replicating constants into the ignored bits whereas vgbm sets those to zero. Furthermore, there is no restriction to integer vectors anymore, i.e., supporting scalars of mode up to and including TI and TF and also floating-point vectors. Here are some numbers on how often instructions are emitted for SPEC 2017: w/o patch w/ patch vgbm 140 365 vgm 1750824452 vrepi1360 2775 I expect most (maybe even all) to save us a load from the literal pool. gcc/ChangeLog: * config/s390/2964.md: Remove extended mnemonics for vgm. * config/s390/3906.md: Remove extended mnemonics for vgm. * config/s390/3931.md: Remove extended mnemonics for vgm. * config/s390/8561.md: Remove extended mnemonics for vgm. * config/s390/constraints.md (jKK): Remove constraint. (jzz): Add constraint. * config/s390/s390-protos.h (s390_contiguous_bitmask_vector_p): Add prototype. (s390_constant_via_vgm_p): Add prototype. 
(s390_constant_via_vrepi_p): Add prototype. * config/s390/s390.cc (s390_contiguous_bitmask_vector_p): New function. (s390_constant_via_vgm_vrepi_helper): New function. (s390_constant_via_vgm_p): New function. (s390_constant_via_vgbm_p): For the sake of symmetry rename s390_bytemask_vector_p into s390_constant_via_vgbm_p. (s390_bytemask_vector_p): Deal with non-integer and partial vectors. (s390_constant_via_vrepi_p): New function. (s390_legitimate_constant_p): Allow partial vectors. (legitimate_reload_constant_p): Fix indentation. (legitimate_reload_vector_constant_p): Restrict to constraints j00, jm1, jxx, jyy, jzz only, i.e., allow partial vectors. (s390_expand_vec_init): Also make use of vrepi if possible. (print_operand): Add q,p,r for vgm,vrepi,vgbm, respectively. Remove e,s,t for constant vectors. * config/s390/s390.md (movti): Add variants utilizing vgbm,vgm,vrepi. * config/s390/vector.md (mov): Adapt variants for vgbm,vgm,vrepi for the new scheme. (mov): Adapt variants for vgbm,vgm for the new scheme and add vrepi variant for modes V_8,V_16,V_32,V_64. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-copysign.c: Change to non-extended mnemonic. * gcc.target/s390/vector/vec-genmask-1.c: Change to non-extended mnemonic. * gcc.target/s390/vector/vec-init-1.c: Change to non-extended mnemonic. * gcc.target/s390/vector/vec-vrepi-1.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/autovec-double-quiet-uneq.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/autovec-float-quiet-uneq.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/vec-genmask-1.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/vec-splat-1.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/vec-splat-2.c: Change to non-extended mnemonic. * gcc.target/s390/vector/vgbm-double-1.c: New test. * gcc.target/s390/vector/vgbm-float-1.c: New test. * gcc.target/s390/vector/vgbm-int128-1.c: New test. 
* gcc.target/s390/vector/vgbm-integer-1.c: New test. * gcc.target/s390/vector/vgbm-longdouble-1.c: New test. * gcc.target/s390/vector/vgm-df-1.c: New test. * gcc.target/s390/vector/vgm-di-1.c: New test. * gcc.target/s390/vector/vgm-hi-1.c: New test. * gcc.target/s390/vector/vgm-int128-1.c: New test. * gcc.target/s390/vector/vgm-longdouble-1.c: New test. * gcc.target/s390/vector/vgm-qi-1.c: New test. * gcc.target/s390/vector/vgm-sf-1.
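The idea behind using vgm across element sizes can be sketched in portable C. This is a simplified model, not the code from the patch: the helper names are hypothetical, bit 0 is taken to be the most significant bit of an element (the z/Architecture numbering), and only the non-wrapping case with start <= end is handled.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with bits START..END (inclusive) set inside an element of WIDTH
   bits, where bit 0 is the most significant bit.  Assumes
   START <= END < WIDTH <= 64, i.e. no wraparound.  */
static uint64_t
contiguous_mask (unsigned width, unsigned start, unsigned end)
{
  unsigned len = end - start + 1;
  uint64_t ones = len == 64 ? ~UINT64_C (0) : (UINT64_C (1) << len) - 1;
  return ones << (width - 1 - end);
}

/* View two adjacent, identical 32-bit elements as one 64-bit element.  */
static uint64_t
replicate32 (uint32_t elem)
{
  return ((uint64_t) elem << 32) | elem;
}

/* Split one 64-bit element into its high and low 32-bit halves.  */
static uint32_t high32 (uint64_t x) { return (uint32_t) (x >> 32); }
static uint32_t low32 (uint64_t x) { return (uint32_t) x; }
```

In this model a word-sized mask for bits 1 to 30 is 0x7ffffffe, and replicating it yields the doubleword 0x7ffffffe7ffffffe. In the other direction, a doubleword mask for bits 17 to 46 splits into the word pair 0x00007fff and 0xfffe0000, which is how a vector with narrower elements can still be produced by a wider mask instruction.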
Re: [PATCH] s390: Fix unresolved iterators bhfgq and xdee
On 7/16/24 10:29, Stefan Schulze Frielinghaus wrote: Code attribute bhfgq is missing a mapping for TF. This results in unresolved iterators in assembler templates for *bswaptf. With the TF mapping added the base mnemonics vlbr and vstbr are not "used" anymore but only the extended mnemonics (vlbr was interpreted as vlbr; likewise for vstbr). Therefore, remove the base mnemonics from the scheduling description, otherwise, genattrtab would error about unknown mnemonics. Similarly, we end up with unresolved iterators in assembler templates for mulfprx23 since code attribute xdee is missing a mapping for FPRX2. gcc/ChangeLog: * config/s390/3931.md (vlbr, vstbr): Remove. * config/s390/s390.md (xdee): Add FPRX2 mapping. * config/s390/vector.md (bhfgq): Add TF mapping. Ok. Thanks! Andreas --- Bootstrapped and regtested on s390. Ok for {mainline,12,13,14}? gcc/config/s390/3931.md | 5 - gcc/config/s390/s390.md | 2 +- gcc/config/s390/vector.md | 2 +- 3 files changed, 2 insertions(+), 7 deletions(-) diff --git a/gcc/config/s390/3931.md b/gcc/config/s390/3931.md index 632c2456b6a..9f7a4c58755 100644 --- a/gcc/config/s390/3931.md +++ b/gcc/config/s390/3931.md @@ -404,7 +404,6 @@ vlvgg, vlvgh, vlvgp, vst, -vstbr, vstbrf, vstbrg, vstbrh, @@ -627,7 +626,6 @@ tm, tmy, vl, vlbb, -vlbr, vlbrf, vlbrg, vlbrh, @@ -661,7 +659,6 @@ vlreph, vlrl, vlrlr, vst, -vstbr, vstbrf, vstbrg, vstbrh, @@ -2148,7 +2145,6 @@ vistrfs, vistrhs, vl, vlbb, -vlbr, vlbrf, vlbrg, vlbrh, @@ -2240,7 +2236,6 @@ tbegin, tbeginc, tend, vst, -vstbr, vstbrf, vstbrg, vstbrh, diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 303026f6af7..3d5759d6252 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -745,7 +745,7 @@ ;; In FP templates, a in "mr" will expand to "mxr" in ;; TF/TDmode, "mdr" in DF/DDmode, "meer" in SFmode and "mer in ;; SDmode. 
-(define_mode_attr xdee [(TF "x") (DF "d") (SF "ee") (TD "x") (DD "d") (SD "e")]) +(define_mode_attr xdee [(TF "x") (FPRX2 "x") (DF "d") (SF "ee") (TD "x") (DD "d") (SD "e")]) ;; The decimal floating point variants of add, sub, div and mul support 3 ;; fp register operands. The following attributes allow to merge the bfp and diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md index 63678859657..cca9e3556c9 100644 --- a/gcc/config/s390/vector.md +++ b/gcc/config/s390/vector.md @@ -136,7 +136,7 @@ (V1TI "q") (TI "q") (V1SF "f") (V2SF "f") (V4SF "f") (V1DF "g") (V2DF "g") - (V1TF "q")]) + (V1TF "q") (TF "q")]) ; This is for vmalhw. It gets an 'w' attached to avoid confusion with ; multiply and add logical high vmalh.
Re: [PATCH] s390: Fix unresolved iterators bhfgq and xdee
On 7/20/24 08:39, Stefan Schulze Frielinghaus wrote: I'm pinging this early since I would like to make sure that it gets into 14.2 RC which is about to be done on Tuesday 23rd July. On Tue, Jul 16, 2024 at 04:50:29PM +0200, Stefan Schulze Frielinghaus wrote: Code attribute bhfgq is missing a mapping for TF. This results in unresolved iterators in assembler templates for *bswaptf. With the TF mapping added the base mnemonics vlbr and vstbr are not "used" anymore but only the extended mnemonics (vlbr was interpreted as vlbr; likewise for vstbr). Therefore, remove the base mnemonics from the scheduling description, otherwise, genattrtab would error about unknown mnemonics. Likewise, for movtf_vr only the extended mnemonics for vrepi are used, now, which means the base mnemonic is "unused" and has to be removed from the scheduling description. Similarly, we end up with unresolved iterators in assembler templates for mulfprx23 since code attribute xdee is missing a mapping for FPRX2. Note, this is basically a cherry pick of commit r15-2060-ga4abda934aa426 with the addition that vrepi is removed from the scheduling description, too. Bootstrapped on s390. Ok for release branches 12, 13, and 14? Ok, Thanks! Andreas gcc/ChangeLog: * config/s390/3931.md (vlbr, vstbr, vrepi): Remove. * config/s390/s390.md (xdee): Add FPRX2 mapping. * config/s390/vector.md (bhfgq): Add TF mapping. 
--- gcc/config/s390/3931.md | 7 --- gcc/config/s390/s390.md | 2 +- gcc/config/s390/vector.md | 2 +- 3 files changed, 2 insertions(+), 9 deletions(-) diff --git a/gcc/config/s390/3931.md b/gcc/config/s390/3931.md index bed1f6c21f1..9cb11b72bba 100644 --- a/gcc/config/s390/3931.md +++ b/gcc/config/s390/3931.md @@ -404,7 +404,6 @@ vlvgg, vlvgh, vlvgp, vst, -vstbr, vstbrf, vstbrg, vstbrh, @@ -627,7 +626,6 @@ tm, tmy, vl, vlbb, -vlbr, vlbrf, vlbrg, vlbrh, @@ -661,7 +659,6 @@ vlreph, vlrl, vlrlr, vst, -vstbr, vstbrf, vstbrg, vstbrh, @@ -1077,7 +1074,6 @@ vrepb, vrepf, vrepg, vreph, -vrepi, vrepib, vrepif, vrepig, @@ -1930,7 +1926,6 @@ vrepb, vrepf, vrepg, vreph, -vrepi, vrepib, vrepif, vrepig, @@ -2156,7 +2151,6 @@ vistrfs, vistrhs, vl, vlbb, -vlbr, vlbrf, vlbrg, vlbrh, @@ -2248,7 +2242,6 @@ tbegin, tbeginc, tend, vst, -vstbr, vstbrf, vstbrg, vstbrh, diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 50a828f2bbb..8edc1261c38 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -744,7 +744,7 @@ ;; In FP templates, a in "mr" will expand to "mxr" in ;; TF/TDmode, "mdr" in DF/DDmode, "meer" in SFmode and "mer in ;; SDmode. -(define_mode_attr xdee [(TF "x") (DF "d") (SF "ee") (TD "x") (DD "d") (SD "e")]) +(define_mode_attr xdee [(TF "x") (FPRX2 "x") (DF "d") (SF "ee") (TD "x") (DD "d") (SD "e")]) ;; The decimal floating point variants of add, sub, div and mul support 3 ;; fp register operands. The following attributes allow to merge the bfp and diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md index 1bae1056951..f88e8b655fa 100644 --- a/gcc/config/s390/vector.md +++ b/gcc/config/s390/vector.md @@ -134,7 +134,7 @@ (V1TI "q") (TI "q") (V1SF "f") (V2SF "f") (V4SF "f") (V1DF "g") (V2DF "g") - (V1TF "q")]) + (V1TF "q") (TF "q")]) ; This is for vmalhw. It gets an 'w' attached to avoid confusion with ; multiply and add logical high vmalh. -- 2.45.0
Re: [PATCH] s390: Implement TARGET_NOCE_CONVERSION_PROFITABLE_P [PR109549]
On 5/8/24 10:06, Stefan Schulze Frielinghaus wrote: > Consider a NOCE conversion as profitable if there is at least one > conditional move. > > gcc/ChangeLog: > > * config/s390/s390.cc (TARGET_NOCE_CONVERSION_PROFITABLE_P): > Define. > (s390_noce_conversion_profitable_p): Implement. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/ccor.c: Order of loads are reversed, now, as a > consequence the condition has to be reversed. > --- > Bootstrapped and regtested on s390. Ok for mainline? > > gcc/config/s390/s390.cc | 32 > gcc/testsuite/gcc.target/s390/ccor.c | 4 ++-- > 2 files changed, 34 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc > index bf46eab2d63..23b18b5c506 100644 > --- a/gcc/config/s390/s390.cc > +++ b/gcc/config/s390/s390.cc > @@ -78,6 +78,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree-pass.h" > #include "context.h" > #include "builtins.h" > +#include "ifcvt.h" > #include "rtl-iter.h" > #include "intl.h" > #include "tm-constrs.h" > @@ -18037,6 +18038,37 @@ s390_vectorize_vec_perm_const (machine_mode vmode, > machine_mode op_mode, >return vectorize_vec_perm_const_1 (d); > } > > +/* Consider a NOCE conversion as profitable if there is at least one > + conditional move. 
*/ > + > +#undef TARGET_NOCE_CONVERSION_PROFITABLE_P > +#define TARGET_NOCE_CONVERSION_PROFITABLE_P s390_noce_conversion_profitable_p We collect these definitions at the very end of s390.cc > + > +static bool > +s390_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info > *if_info) > +{ > + if (if_info->speed_p) > +{ > + for (rtx_insn *insn = seq; insn; insn = NEXT_INSN (insn)) > + { > + rtx set = single_set (insn); > + if (set == NULL) > + continue; > + if (GET_CODE (SET_SRC (set)) != IF_THEN_ELSE) > + continue; > + rtx src = SET_SRC (set); > + machine_mode mode = GET_MODE (src); > + if (GET_MODE_CLASS (mode) != MODE_INT > + && GET_MODE_CLASS (mode) != MODE_FLOAT) > + continue; > + if (GET_MODE_SIZE (mode) > GET_MODE_SIZE (Pmode)) I guess GET_MODE_SIZE(Pmode) should be UNITS_PER_WORD here to enable the conversion also for 64 bit modes with -m31 -mzarch. Ok with these changes. Thanks! Andreas > + continue; > + return true; > + } > +} > + return default_noce_conversion_profitable_p (seq, if_info); > +} > + > /* Initialize GCC target structure. */ > > #undef TARGET_ASM_ALIGNED_HI_OP > diff --git a/gcc/testsuite/gcc.target/s390/ccor.c > b/gcc/testsuite/gcc.target/s390/ccor.c > index 31f30f60314..36a3c3a999a 100644 > --- a/gcc/testsuite/gcc.target/s390/ccor.c > +++ b/gcc/testsuite/gcc.target/s390/ccor.c > @@ -42,7 +42,7 @@ GENFUN1(2) > > GENFUN1(3) > > -/* { dg-final { scan-assembler {locrno} } } */ > +/* { dg-final { scan-assembler {locro} } } */ > > GENFUN2(0,1) > > @@ -58,7 +58,7 @@ GENFUN2(0,3) > > GENFUN2(1,2) > > -/* { dg-final { scan-assembler {locrnlh} } } */ > +/* { dg-final { scan-assembler {locrlh} } } */ > > GENFUN2(1,3) >
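The review remark about UNITS_PER_WORD versus GET_MODE_SIZE (Pmode) can be illustrated with a small model. This is a sketch under assumptions: the struct and helper names are hypothetical, and the two size functions only mirror the usual s390 definitions (pointer size follows the 31-bit or 64-bit ABI, word size follows z/Architecture mode).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an s390 target configuration.  */
struct target_config
{
  bool target_64bit;   /* -m64 versus -m31 */
  bool target_zarch;   /* -mzarch versus -mesa */
};

/* Size of a pointer mode in bytes: 8 only for the 64-bit ABI.  */
static unsigned
pmode_size (struct target_config t)
{
  return t.target_64bit ? 8 : 4;
}

/* Size of a machine word in bytes: 8 whenever z/Architecture mode
   is enabled, independent of the ABI.  */
static unsigned
units_per_word (struct target_config t)
{
  return t.target_zarch ? 8 : 4;
}

/* Model of the size filter applied to a conditional move.  */
static bool
cmove_mode_ok (unsigned mode_size, unsigned limit)
{
  return mode_size <= limit;
}
```

For -m31 -mzarch an 8-byte conditional move is rejected when the limit is the pointer size but accepted when the limit is the word size, which is the case the suggested change is meant to cover.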
Re: [PATCH] s390: Fix high-level builtins vec_gfmsum{,_accum}_128
On 8/8/24 20:28, Stefan Schulze Frielinghaus wrote: Starting with r14-9449-g9f2b16ce1efef0 builtins were streamlined with those in LLVM. In particular s390_vgfm{,a}g have been changed from UV16QI to UINT128 in order to match those in LLVM. However, these low-level builtins are directly used by the high-level builtins vec_gfmsum{,_accum}_128 which expect UV16QI instead. Therefore, introduce new low-level builtins s390_vgfm{,a}g_128 and make use of them, respectively. Bootstrapped on s390. Ok for mainline and releases/gcc-14? gcc/ChangeLog: * config/s390/s390-builtin-types.def (BT_FN_UV16QI_UV2DI_UV2DI): New. (BT_FN_UV16QI_UV2DI_UV2DI_UV16QI): New. * config/s390/s390-builtins.def (s390_vgfmg_128): New. (s390_vgfmag_128): New. * config/s390/vecintrin.h (vec_gfmsum_128): Use s390_vgfmg_128. (vec_gfmsum_accum_128): Use s390_vgfmag_128. Ok. Thanks! Andreas
Re: [PATCH] s390: Remove vector intrinsics
On 8/8/24 20:29, Stefan Schulze Frielinghaus wrote: The following intrinsics are not implemented. Thus, remove them. Ok for mainline? gcc/ChangeLog: * config/s390/vecintrin.h (vec_vstbrh): Remove. (vec_vstbrf): Remove. (vec_vstbrg): Remove. (vec_vstbrq): Remove. (vec_vstbrf_flt): Remove. (vec_vstbrg_dbl): Remove. (vec_vsterb): Remove. (vec_vsterh): Remove. (vec_vsterf): Remove. (vec_vsterg): Remove. (vec_vsterf_flt): Remove. (vec_vsterg_dbl): Remove. Ok. Thanks! Andreas
[Committed] IBM Z: Fix ICE in expand_perm_as_replicate
The current implementation assumes to always be invoked with register operands. For memory operands we even have an instruction though (vlrep). With the patch we try this first and only if it fails force the input into a register and continue. vec_splats generation fails for single element 128bit types which are allowed for vec_splat. This is something to sort out with another patch I guess. Bootstrapped and regtested on IBM Z. Committed to mainline. Needs to be committed to GCC 14 branch as well. gcc/ChangeLog: * config/s390/s390.cc (expand_perm_as_replicate): Handle memory operands. * config/s390/vx-builtins.md (vec_splats): Turn into parameterized expander. (@vec_splats): New expander. gcc/testsuite/ChangeLog: * g++.dg/torture/vshuf-mem.C: New test. --- gcc/config/s390/s390.cc | 17 +-- gcc/config/s390/vx-builtins.md | 2 +- gcc/testsuite/g++.dg/torture/vshuf-mem.C | 27 3 files changed, 43 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/g++.dg/torture/vshuf-mem.C diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc index fa517bd3e77..ec836ec3cd4 100644 --- a/gcc/config/s390/s390.cc +++ b/gcc/config/s390/s390.cc @@ -17940,7 +17940,8 @@ expand_perm_as_replicate (const struct expand_vec_perm_d &d) unsigned char i; unsigned char elem; rtx base = d.op0; - rtx insn; + rtx insn = NULL_RTX; + /* Needed to silence maybe-uninitialized warning. */ gcc_assert (d.nelt > 0); elem = d.perm[0]; @@ -17954,7 +17955,19 @@ expand_perm_as_replicate (const struct expand_vec_perm_d &d) base = d.op1; elem -= d.nelt; } - insn = maybe_gen_vec_splat (d.vmode, d.target, base, GEN_INT (elem)); + if (memory_operand (base, d.vmode)) + { + /* Try to use vector load and replicate. 
*/ + rtx new_base = adjust_address (base, GET_MODE_INNER (d.vmode), +elem * GET_MODE_UNIT_SIZE (d.vmode)); + insn = maybe_gen_vec_splats (d.vmode, d.target, new_base); + } + if (insn == NULL_RTX) + { + base = force_reg (d.vmode, base); + insn = maybe_gen_vec_splat (d.vmode, d.target, base, GEN_INT (elem)); + } + if (insn == NULL_RTX) return false; emit_insn (insn); diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md index 93c0d408a43..bb271c09a7d 100644 --- a/gcc/config/s390/vx-builtins.md +++ b/gcc/config/s390/vx-builtins.md @@ -145,7 +145,7 @@ DONE; }) -(define_expand "vec_splats" +(define_expand "@vec_splats" [(set (match_operand:VEC_HW 0 "register_operand" "") (vec_duplicate:VEC_HW (match_operand: 1 "general_operand" "")))] "TARGET_VX") diff --git a/gcc/testsuite/g++.dg/torture/vshuf-mem.C b/gcc/testsuite/g++.dg/torture/vshuf-mem.C new file mode 100644 index 000..5f1ebf65665 --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/vshuf-mem.C @@ -0,0 +1,27 @@ +// { dg-options "-std=c++11" } +// { dg-do run } +// { dg-additional-options "-march=z14" { target s390*-*-* } } + +/* This used to trigger (2024-05-28) the vectorize_vec_perm_const + backend hook to be invoked with a MEM source operand. Extracted + from onnxruntime's mlas library. */ + +typedef float V4SF __attribute__((vector_size (16))); +typedef int V4SI __attribute__((vector_size (16))); + +template < unsigned I0, unsigned I1, unsigned I2, unsigned I3 > V4SF +MlasShuffleFloat32x4 (V4SF Vector) +{ + return __builtin_shuffle (Vector, Vector, V4SI{I0, I1, I2, I3}); +} + +int +main () +{ + V4SF f = { 1.0f, 2.0f, 3.0f, 4.0f }; + if (MlasShuffleFloat32x4 < 1, 1, 1, 1 > (f)[3] != 2.0f) +__builtin_abort (); + if (MlasShuffleFloat32x4 < 3, 3, 3, 3 > (f)[1] != 4.0f) +__builtin_abort (); + return 0; +} -- 2.45.1
Re: [PATCH] s390: Extend two/four element integer vectors
On 6/11/24 10:24, Stefan Schulze Frielinghaus wrote: For the moment I deliberately left out one-element QHS vectors since it is unclear whether these are pathological cases or whether they are really used. If we ever get an extend for V1DI -> V1TI we should reconsider this. As a side-effect this fixes PR115261. gcc/ChangeLog: target/PR115261 * config/s390/s390.md (any_extend,extend_insn,zero_extend): New code attributes and code iterator. * config/s390/vector.md (V_EXTEND): New mode iterator. (2): New insn. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-extend-1.c: New test. * gcc.target/s390/vector/vec-extend-2.c: New test. --- Bootstrapped and regtested on s390. Ok for mainline? Ok. Thanks! Andreas
Re: [PATCH] s390: Extend two element float vector
On 6/11/24 10:26, Stefan Schulze Frielinghaus wrote: This implements a V2SF -> V2DF extend. gcc/ChangeLog: * config/s390/vector.md (*vmrhf): New. (extendv2sfv2df2): New. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-extend-3.c: New test. Since we already have a *vmrhf pattern, should we perhaps add something to the name to make it easier to distinguish in the rtl dumps? You have added the mode already, but perhaps something like *vmrhf_half or something like this? Ok with or without that change. Thanks! Andreas
Re: [PATCH v2] s390: Implement TARGET_NOCE_CONVERSION_PROFITABLE_P [PR109549]
On 6/2/24 14:07, Stefan Schulze Frielinghaus wrote: Since the patch works fine so far for mainline, ok to backport to GCC 14? Yes please do. Thanks! Andreas On Fri, May 17, 2024 at 08:59:05AM +0200, Stefan Schulze Frielinghaus wrote: I've adapted the patch as follows and will push. Thanks, Stefan -- Consider a NOCE conversion as profitable if there is at least one conditional move. gcc/ChangeLog: * config/s390/s390.cc (TARGET_NOCE_CONVERSION_PROFITABLE_P): Define. (s390_noce_conversion_profitable_p): Implement. gcc/testsuite/ChangeLog: * gcc.target/s390/ccor.c: Order of loads are reversed, now, as a consequence the condition has to be reversed. --- gcc/config/s390/s390.cc | 32 gcc/testsuite/gcc.target/s390/ccor.c | 4 ++-- 2 files changed, 34 insertions(+), 2 deletions(-) diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc index bf46eab2d63..7f8f1681c2a 100644 --- a/gcc/config/s390/s390.cc +++ b/gcc/config/s390/s390.cc @@ -78,6 +78,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-pass.h" #include "context.h" #include "builtins.h" +#include "ifcvt.h" #include "rtl-iter.h" #include "intl.h" #include "tm-constrs.h" @@ -18037,6 +18038,34 @@ s390_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode, return vectorize_vec_perm_const_1 (d); } +/* Consider a NOCE conversion as profitable if there is at least one + conditional move. 
*/ + +static bool +s390_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info) +{ + if (if_info->speed_p) +{ + for (rtx_insn *insn = seq; insn; insn = NEXT_INSN (insn)) + { + rtx set = single_set (insn); + if (set == NULL) + continue; + if (GET_CODE (SET_SRC (set)) != IF_THEN_ELSE) + continue; + rtx src = SET_SRC (set); + machine_mode mode = GET_MODE (src); + if (GET_MODE_CLASS (mode) != MODE_INT + && GET_MODE_CLASS (mode) != MODE_FLOAT) + continue; + if (GET_MODE_SIZE (mode) > UNITS_PER_WORD) + continue; + return true; + } +} + return default_noce_conversion_profitable_p (seq, if_info); +} + /* Initialize GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP @@ -18350,6 +18379,9 @@ s390_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode, #undef TARGET_VECTORIZE_VEC_PERM_CONST #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const +#undef TARGET_NOCE_CONVERSION_PROFITABLE_P +#define TARGET_NOCE_CONVERSION_PROFITABLE_P s390_noce_conversion_profitable_p + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-s390.h" diff --git a/gcc/testsuite/gcc.target/s390/ccor.c b/gcc/testsuite/gcc.target/s390/ccor.c index 31f30f60314..36a3c3a999a 100644 --- a/gcc/testsuite/gcc.target/s390/ccor.c +++ b/gcc/testsuite/gcc.target/s390/ccor.c @@ -42,7 +42,7 @@ GENFUN1(2) GENFUN1(3) -/* { dg-final { scan-assembler {locrno} } } */ +/* { dg-final { scan-assembler {locro} } } */ GENFUN2(0,1) @@ -58,7 +58,7 @@ GENFUN2(0,3) GENFUN2(1,2) -/* { dg-final { scan-assembler {locrnlh} } } */ +/* { dg-final { scan-assembler {locrlh} } } */ GENFUN2(1,3) -- 2.45.0
Re: [PATCH] s390: testsuite: Fix ifcvt-one-insn-bool.c
On Wed, Jun 05, 2024 at 08:00:15AM +0200, Stefan Schulze Frielinghaus wrote:
> With the change of r15-787-g57e04879389f9c I forgot to also update this
> test.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.target/s390/ifcvt-one-insn-bool.c: Fix loc.

Ok. Thanks!

Andreas

> ---
> Ok for mainline? Ok for GCC 14 if the corresponding backport is also
> approved?
>
>  gcc/testsuite/gcc.target/s390/ifcvt-one-insn-bool.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-bool.c b/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-bool.c
> index 0c8c2f879a6..4ae29dbd6b6 100644
> --- a/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-bool.c
> +++ b/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-bool.c
> @@ -3,7 +3,7 @@
>  /* { dg-do compile { target { s390*-*-* } } } */
>  /* { dg-options "-O2 -march=z13 -mzarch" } */
>
> -/* { dg-final { scan-assembler "lochinh\t%r.?,1" } } */
> +/* { dg-final { scan-assembler "lochile\t%r.?,1" } } */
>
>  #include
>  int foo (int *a, unsigned int n)
> --
> 2.45.1
Re: PING^1 [PATCH 44/52] s390: New hook implementation s390_c_mode_for_floating_type
On 6/13/24 09:43, Kewen.Lin wrote: Hi, Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653382.html BR, Kewen on 2024/6/3 11:01, Kewen Lin wrote: This is to remove macros {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE defines in s390 port, and add new port specific hook implementation s390_c_mode_for_floating_type. gcc/ChangeLog: * config/s390/s390.cc (s390_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/s390/s390.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. Ok. Thanks! Andreas --- gcc/config/s390/s390.cc | 15 +++ gcc/config/s390/s390.h | 3 --- 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc index fa517bd3e77..117da36b3c0 100644 --- a/gcc/config/s390/s390.cc +++ b/gcc/config/s390/s390.cc @@ -18066,6 +18066,18 @@ s390_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info) return default_noce_conversion_profitable_p (seq, if_info); } +/* Implement TARGET_C_MODE_FOR_FLOATING_TYPE. Return TFmode or DFmode + for TI_LONG_DOUBLE_TYPE which is for long double type, go with the + default one for the others. */ + +static machine_mode +s390_c_mode_for_floating_type (enum tree_index ti) +{ + if (ti == TI_LONG_DOUBLE_TYPE) +return TARGET_LONG_DOUBLE_128 ? TFmode : DFmode; + return default_mode_for_floating_type (ti); +} + /* Initialize GCC target structure. 
*/ #undef TARGET_ASM_ALIGNED_HI_OP @@ -18382,6 +18394,9 @@ s390_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info) #undef TARGET_NOCE_CONVERSION_PROFITABLE_P #define TARGET_NOCE_CONVERSION_PROFITABLE_P s390_noce_conversion_profitable_p +#undef TARGET_C_MODE_FOR_FLOATING_TYPE +#define TARGET_C_MODE_FOR_FLOATING_TYPE s390_c_mode_for_floating_type + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-s390.h" diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h index 0ea8802..4a4dde1a9ba 100644 --- a/gcc/config/s390/s390.h +++ b/gcc/config/s390/s390.h @@ -396,9 +396,6 @@ extern const char *s390_host_detect_local_cpu (int argc, const char **argv); #define INT_TYPE_SIZE 32 #define LONG_TYPE_SIZE (TARGET_64BIT ? 64 : 32) #define LONG_LONG_TYPE_SIZE 64 -#define FLOAT_TYPE_SIZE 32 -#define DOUBLE_TYPE_SIZE 64 -#define LONG_DOUBLE_TYPE_SIZE (TARGET_LONG_DOUBLE_128 ? 128 : 64) /* Work around target_flags dependency in ada/targtyps.cc. */ #define WIDEST_HARDWARE_FP_SIZE 64
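Seen from user code, the hook should not change anything: float and double keep their sizes, and only long double follows the long-double switch (the patch selects TFmode or DFmode via TARGET_LONG_DOUBLE_128). A small sketch to observe this from C is below. The function names are made up for illustration, and the use of `LDBL_MANT_DIG` to tell the formats apart assumes IEEE binary128 (113 mantissa bits) versus binary64 (53 bits); on hosts other than s390 the long double results will of course differ.

```c
#include <float.h>
#include <limits.h>

/* Storage size in bits of the three C floating types.  */
unsigned float_bits (void) { return sizeof (float) * CHAR_BIT; }
unsigned double_bits (void) { return sizeof (double) * CHAR_BIT; }
unsigned long_double_bits (void) { return sizeof (long double) * CHAR_BIT; }

/* Non-zero if long double has a 113-bit mantissa, which is what one
   would expect for a 128-bit IEEE long double.  With a 64-bit long
   double the mantissa has 53 bits, the same as double.  */
int
long_double_is_binary128 (void)
{
  return LDBL_MANT_DIG == 113;
}
```

Building this once with -mlong-double-128 and once with -mlong-double-64 on s390 is an easy sanity check that the removed *_TYPE_SIZE macros and the new hook agree.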
Re: [PATCH] s390: testsuite: Fix nobp-table-jump-*.c
On Mon, Jun 03, 2024 at 03:43:39PM +0200, Stefan Schulze Frielinghaus wrote: Starting with r14-5628-g53ba8d669550d3 interprocedural VRP became strong enough in order to render these tests useless. Fixed by disabling IPA. gcc/testsuite/ChangeLog: * gcc.target/s390/nobp-table-jump-inline-z10.c: Do not perform IPA. * gcc.target/s390/nobp-table-jump-inline-z900.c: Dito. * gcc.target/s390/nobp-table-jump-z10.c: Dito. * gcc.target/s390/nobp-table-jump-z900.c: Dito. --- Ok for mainline? Ok. Thanks! Andreas .../s390/nobp-table-jump-inline-z10.c | 42 +-- .../s390/nobp-table-jump-inline-z900.c| 42 +-- .../gcc.target/s390/nobp-table-jump-z10.c | 42 +-- .../gcc.target/s390/nobp-table-jump-z900.c| 42 +-- 4 files changed, 84 insertions(+), 84 deletions(-) diff --git a/gcc/testsuite/gcc.target/s390/nobp-table-jump-inline-z10.c b/gcc/testsuite/gcc.target/s390/nobp-table-jump-inline-z10.c index 8dfd7e4c786..121751166d0 100644 --- a/gcc/testsuite/gcc.target/s390/nobp-table-jump-inline-z10.c +++ b/gcc/testsuite/gcc.target/s390/nobp-table-jump-inline-z10.c @@ -4,29 +4,29 @@ /* case-values-threshold will be set to 20 by the back-end when jump thunk are requested. 
*/ -int __attribute__((noinline,noclone)) foo1 (void) { return 1; } -int __attribute__((noinline,noclone)) foo2 (void) { return 2; } -int __attribute__((noinline,noclone)) foo3 (void) { return 3; } -int __attribute__((noinline,noclone)) foo4 (void) { return 4; } -int __attribute__((noinline,noclone)) foo5 (void) { return 5; } -int __attribute__((noinline,noclone)) foo6 (void) { return 6; } -int __attribute__((noinline,noclone)) foo7 (void) { return 7; } -int __attribute__((noinline,noclone)) foo8 (void) { return 8; } -int __attribute__((noinline,noclone)) foo9 (void) { return 9; } -int __attribute__((noinline,noclone)) foo10 (void) { return 10; } -int __attribute__((noinline,noclone)) foo11 (void) { return 11; } -int __attribute__((noinline,noclone)) foo12 (void) { return 12; } -int __attribute__((noinline,noclone)) foo13 (void) { return 13; } -int __attribute__((noinline,noclone)) foo14 (void) { return 14; } -int __attribute__((noinline,noclone)) foo15 (void) { return 15; } -int __attribute__((noinline,noclone)) foo16 (void) { return 16; } -int __attribute__((noinline,noclone)) foo17 (void) { return 17; } -int __attribute__((noinline,noclone)) foo18 (void) { return 18; } -int __attribute__((noinline,noclone)) foo19 (void) { return 19; } -int __attribute__((noinline,noclone)) foo20 (void) { return 20; } +int __attribute__((noipa)) foo1 (void) { return 1; } +int __attribute__((noipa)) foo2 (void) { return 2; } +int __attribute__((noipa)) foo3 (void) { return 3; } +int __attribute__((noipa)) foo4 (void) { return 4; } +int __attribute__((noipa)) foo5 (void) { return 5; } +int __attribute__((noipa)) foo6 (void) { return 6; } +int __attribute__((noipa)) foo7 (void) { return 7; } +int __attribute__((noipa)) foo8 (void) { return 8; } +int __attribute__((noipa)) foo9 (void) { return 9; } +int __attribute__((noipa)) foo10 (void) { return 10; } +int __attribute__((noipa)) foo11 (void) { return 11; } +int __attribute__((noipa)) foo12 (void) { return 12; } +int 
__attribute__((noipa)) foo13 (void) { return 13; } +int __attribute__((noipa)) foo14 (void) { return 14; } +int __attribute__((noipa)) foo15 (void) { return 15; } +int __attribute__((noipa)) foo16 (void) { return 16; } +int __attribute__((noipa)) foo17 (void) { return 17; } +int __attribute__((noipa)) foo18 (void) { return 18; } +int __attribute__((noipa)) foo19 (void) { return 19; } +int __attribute__((noipa)) foo20 (void) { return 20; } -int __attribute__((noinline,noclone)) +int __attribute__((noipa)) bar (int a) { int ret = 0; diff --git a/gcc/testsuite/gcc.target/s390/nobp-table-jump-inline-z900.c b/gcc/testsuite/gcc.target/s390/nobp-table-jump-inline-z900.c index 46d2c54bcff..5ad0c72afc3 100644 --- a/gcc/testsuite/gcc.target/s390/nobp-table-jump-inline-z900.c +++ b/gcc/testsuite/gcc.target/s390/nobp-table-jump-inline-z900.c @@ -4,29 +4,29 @@ /* case-values-threshold will be set to 20 by the back-end when jump thunk are requested. */ -int __attribute__((noinline,noclone)) foo1 (void) { return 1; } -int __attribute__((noinline,noclone)) foo2 (void) { return 2; } -int __attribute__((noinline,noclone)) foo3 (void) { return 3; } -int __attribute__((noinline,noclone)) foo4 (void) { return 4; } -int __attribute__((noinline,noclone)) foo5 (void) { return 5; } -int __attribute__((noinline,noclone)) foo6 (void) { return 6; } -int __attribute__((noinline,noclone)) foo7 (void) { return 7; } -int __attribute__((noinline,noclone)) foo8 (void) { return 8; } -int __attribute__((noinline,noclone)) foo9 (void) { return 9; } -int __attribute__((noinline,noclone)) foo10 (void) { return 10; } -int __attribute__((noinline,noclone)) foo11 (void) {
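As background for the attribute change: as far as I understand it, noinline and noclone only prevent inlining and cloning, so interprocedural passes may still learn facts about a callee, such as the range of its return value, whereas noipa blocks interprocedural analysis of the function altogether. A reduced sketch of the pattern used in these tests is below. The names `leaf1`..`leaf3`, `dispatch` and the `NOIPA` macro are made up for illustration; the fallback to plain noinline is only there so that the snippet also builds with compilers that do not know noipa.

```c
/* Use noipa where the compiler supports it; otherwise fall back to
   noinline so that the sketch still builds.  */
#if defined (__has_attribute)
# if __has_attribute (noipa)
#  define NOIPA __attribute__((noipa))
# endif
#endif
#ifndef NOIPA
# define NOIPA __attribute__((noinline))
#endif

/* The callees are opaque to the caller: nothing about their return
   values should be propagated into dispatch.  */
int NOIPA leaf1 (void) { return 1; }
int NOIPA leaf2 (void) { return 2; }
int NOIPA leaf3 (void) { return 3; }

/* A switch over calls, similar in shape to bar () in the tests, only
   with far fewer cases than would be needed for a jump table.  */
int NOIPA
dispatch (int a)
{
  int ret = 0;
  switch (a)
    {
    case 1: ret = leaf1 (); break;
    case 2: ret = leaf2 (); break;
    case 3: ret = leaf3 (); break;
    default: break;
    }
  return ret;
}
```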
Re: [committed] testsuite: Add -Wno-psabi to vshuf-mem.C test
On 6/14/24 20:03, Jakub Jelinek wrote:
> Also wonder about the
> // { dg-additional-options "-march=z14" { target s390*-*-* } }
> line, doesn't that mean the test will FAIL on all pre-z14 HW?
> Shouldn't it use some z14_runtime or similar effective target, or
> check in main (in that case copied over to g++.target/s390) whether
> z14 instructions can be actually used at runtime?

Oh right. I'll remove that line and replicate the testcase in the arch
specific test dir.

Andreas
Re: [PATCH] s390: testsuite: Xfail range-sincos.c and vrp-float-abs-1.c
On 4/12/24 10:16, Stefan Schulze Frielinghaus wrote: > As mentioned in PR114678 those failures will be fixed by > https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648303.html > For GCC 14 just xfail them which should be reverted once the patch is > applied. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/range-sincos.c: Xfail for s390. > * gcc.dg/tree-ssa/vrp-float-abs-1.c: Dito.> --- > Ok for mainline? Ok, thanks! Andreas > > gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c| 2 +- > gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c > b/gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c > index 337f9cda02f..35b38c3c914 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c > @@ -40,4 +40,4 @@ stool (double x) > link_error (); > } > > -// { dg-final { scan-tree-dump-not "link_error" "evrp" { target { { > *-*-linux* } && { glibc } } } } } > +// { dg-final { scan-tree-dump-not "link_error" "evrp" { target { { > *-*-linux* } && { glibc } } xfail s390*-*-* } } } xfail: PR114678 > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c > b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c > index 4b7b75833e0..a814a973963 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c > @@ -14,4 +14,4 @@ foo (double x, double y) > } > } > > -// { dg-final { scan-tree-dump-not "link_error" "evrp" } } > +// { dg-final { scan-tree-dump-not "link_error" "evrp" { xfail s390*-*-* } } > } xfail: PR114678
Re: [PATCH] s390: avoid peeking eof after __vector
On 4/17/24 03:52, Jiufu Guo wrote: > > Hi, > > I would like to ping this patch. > > > Jeff (Jiufu Guo) > > Jiufu Guo writes: > >> Hi, >> >> Same like PR101168, this patch is need for s390 to >> avoid peeking eof after vector keyword. >> And similar test case is also ok for s390. >> >> Is this ok for trunk? >> >> Jeff (Jiufu Guo) >> >> PR target/95782 >> >> gcc/ChangeLog: >> >> * config/s390/s390-c.cc (s390_macro_to_expand): Avoid empty identifier. >> >> gcc/testsuite/ChangeLog: >> >> * g++.target/s390/pr95782.C: New test. Sorry for the delay. This is ok. Thanks! Andreas >> >> --- >> gcc/config/s390/s390-c.cc | 4 +++- >> gcc/testsuite/g++.target/s390/pr95782.C | 5 + >> 2 files changed, 8 insertions(+), 1 deletion(-) >> create mode 100644 gcc/testsuite/g++.target/s390/pr95782.C >> >> diff --git a/gcc/config/s390/s390-c.cc b/gcc/config/s390/s390-c.cc >> index 8d3d1a467a8..45f164d978b 100644 >> --- a/gcc/config/s390/s390-c.cc >> +++ b/gcc/config/s390/s390-c.cc >> @@ -275,7 +275,9 @@ s390_macro_to_expand (cpp_reader *pfile, const cpp_token >> *tok) >>/* __vector long __bool a; */ >>if (ident == C_CPP_HASHNODE (__bool_keyword)) >> expand_bool_p = true; >> - else >> + >> + /* If there are more tokens to check. */ >> + else if (ident) >> { >>/* Triggered with: __vector long long __bool a; */ >>do >> diff --git a/gcc/testsuite/g++.target/s390/pr95782.C >> b/gcc/testsuite/g++.target/s390/pr95782.C >> new file mode 100644 >> index 000..daf887fc6fe >> --- /dev/null >> +++ b/gcc/testsuite/g++.target/s390/pr95782.C >> @@ -0,0 +1,5 @@ >> +// { dg-do compile } >> +// { dg-options "-march=z14 -mzvector" } >> + >> +using vdbl = __vector double; >> +#define BREAK 1
Re: [PATCH] s390: testsuite: Remove xfail for vpopct{b,h}
On 4/22/24 08:01, Stefan Schulze Frielinghaus wrote:
> Starting with r14-9316-g7890836de20912 patterns for vpopct{b,h} are also
> detected. Thus, remove xfails.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.target/s390/vxe/popcount-1.c: Remove xfail.

Ok. Thanks!

Andreas

> ---
> Ok for mainline?
>
>  gcc/testsuite/gcc.target/s390/vxe/popcount-1.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
> b/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
> index 9ea835a1cf0..25ef354f963 100644
> --- a/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
> +++ b/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
> @@ -21,7 +21,7 @@ vpopctb (uv16qi a)
>
>    return r;
>  }
> -/* { dg-final { scan-assembler "vpopctb\t%v24,%v24" { xfail *-*-* } } } */
> +/* { dg-final { scan-assembler "vpopctb\t%v24,%v24" } } */
>
>  uv8hi __attribute__((noinline))
>  vpopcth (uv8hi a)
> @@ -34,7 +34,7 @@ vpopcth (uv8hi a)
>
>    return r;
>  }
> -/* { dg-final { scan-assembler "vpopcth\t%v24,%v24" { xfail *-*-* } } } */
> +/* { dg-final { scan-assembler "vpopcth\t%v24,%v24" } } */
>
>  uv4si __attribute__((noinline))
>  vpopctf (uv4si a)
Re: [PATCH] s390: testsuite: Fix forwprop-4{0,1}.c
Hi Stefan,

due to that missed optimization we currently generate silly code for
these two tests and should really fix this (once GCC has entered
stage1). So just skipping them on s390x would definitely be the wrong
choice, I think.

I think our vectorize_vec_perm_const correctly rejects this permute
pattern, since it would require a load from the literal pool. The
question is why we have to rely on this being turned into a permute
first in order to get rid of the obviously redundant assignments.
Shouldn't fwprop be able to handle this without it?

I'm ok with your patch, but please also open a BZ for it and perhaps
mention it in the comment close to the xfail.

Thanks!

Andreas

On 4/22/24 08:23, Stefan Schulze Frielinghaus wrote:
> The tests fail on s390 since can_vec_perm_const_p fails and therefore
> the bit insert/ref survive which r14-3381-g27de9aa152141e aims for.
> Strictly speaking, the tests only fail in case the target supports
> vectors, i.e., for targets prior z13 or in case of -mesa the emulated
> vector operations are optimized out.
>
> Easiest would be to skip the entire test for s390. Another solution
> would be to xfail in case of vector support hoping that eventually we
> end up with an xpass for a future machine generation or if gcc advances.
> That is implemented by this patch. In order to do so I implemented a
> new target test s390_mvx which tests whether vector support is available
> or not. Maybe this is already over-engineered for a simple test? Any
> thoughts?
> --- > gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c | 4 ++-- > gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c | 4 ++-- > gcc/testsuite/lib/target-supports.exp | 14 ++ > 3 files changed, 18 insertions(+), 4 deletions(-) > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c > b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c > index 7513497f552..b67e3e93a7f 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c > @@ -10,5 +10,5 @@ vector int g(vector int a) >return a; > } > > -/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 0 "optimized" } } */ > -/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 0 "optimized" { xfail > s390_mvx } } } */ > +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" { xfail > s390_mvx } } } */ > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c > b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c > index b1e75797a90..0f119675207 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c > @@ -11,6 +11,6 @@ vector int g(vector int a, int c) >return a; > } > > -/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 1 "optimized" } } */ > -/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 1 "optimized" { xfail > s390_mvx } } } */ > +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" { xfail > s390_mvx } } } */ > /* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "optimized" } } */ > diff --git a/gcc/testsuite/lib/target-supports.exp > b/gcc/testsuite/lib/target-supports.exp > index edce672c0e2..5a692baa8ef 100644 > --- a/gcc/testsuite/lib/target-supports.exp > +++ b/gcc/testsuite/lib/target-supports.exp > @@ -12380,6 +12380,20 @@ proc check_effective_target_profile_update_atomic {} > { > } "-fprofile-update=atomic -fprofile-generate"] 
> } > > +# Return 1 if the target has a vector facility. > +proc check_effective_target_s390_mvx { } { > +if ![istarget s390*-*-*] then { > + return 0; > +} > + > +return [check_no_compiler_messages_nocache s390_mvx assembly { > + #if !defined __VX__ > + #error no vector facility. > + #endif > + int dummy; > +} [current_compiler_flags]] > +} > + > # Return 1 if vector (va - vector add) instructions are understood by > # the assembler and can be executed. This also covers checking for > # the VX kernel feature. A kernel without that feature does not
[Committed] s390x: Do not default to -mvx for -mesa
We currently enable the vector extensions also for -march=z13 -m31
-mesa which is very wrong.

Not a regression but an obvious fix, so I've committed it to mainline
now. Will have to cherry-pick it for stable branches as well.

gcc/ChangeLog:

    * config/s390/s390.cc (s390_option_override_internal): Check zarch
    flag before enabling -mvx.
---
 gcc/config/s390/s390.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index bf46eab2d63..5968808fcb6 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -16104,7 +16104,7 @@ s390_option_override_internal (struct gcc_options *opts,
     }
   else
     {
-      if (TARGET_CPU_VX_P (opts))
+      if (TARGET_CPU_VX_P (opts) && TARGET_ZARCH_P (opts->x_target_flags))
         /* Enable vector support if available and not explicitly disabled
            by user.  E.g. with -m31 -march=z13 -mzarch */
         opts->x_target_flags |= MASK_OPT_VX;
--
2.44.0
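A quick way to see which option combinations end up with vector support enabled is to look at the predefined macro `__VX__`, which is also what the s390_mvx effective-target check proposed in the forwprop thread tests for. The snippet below is only a sketch; the function name `have_vx` is made up, and on non-s390 hosts it simply returns 0.

```c
/* Return 1 if the translation unit was compiled with vector support
   enabled (the s390 back end predefines __VX__ in that case), and 0
   otherwise.  */
int
have_vx (void)
{
#if defined __VX__
  return 1;
#else
  return 0;
#endif
}
```

With the fix applied, I would expect -march=z13 -m31 -mesa to give 0 and -march=z13 -m31 -mzarch to give 1, but that is an expectation derived from the commit message rather than something verified here.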
[Committed] s390x: Fix vec_xl/vec_xst type aliasing [PR114676]
The requirements of the vec_xl/vec_xst intrinsincs wrt aliasing of the pointer argument are not really documented. As it turns out, users are likely to get it wrong. With this patch we let the pointer argument alias everything in order to make it more robust for users. Committed to mainline. Will be cherry-picked for stable branches as well. gcc/ChangeLog: PR target/114676 * config/s390/s390-c.cc (s390_expand_overloaded_builtin): Use a MEM_REF with an addend of type ptr_type_node. gcc/testsuite/ChangeLog: PR target/114676 * gcc.target/s390/zvector/pr114676.c: New test. Suggested-by: Jakub Jelinek --- gcc/config/s390/s390-c.cc | 16 +--- .../gcc.target/s390/zvector/pr114676.c| 19 +++ 2 files changed, 28 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/zvector/pr114676.c diff --git a/gcc/config/s390/s390-c.cc b/gcc/config/s390/s390-c.cc index 8d3d1a467a8..1bb6e810766 100644 --- a/gcc/config/s390/s390-c.cc +++ b/gcc/config/s390/s390-c.cc @@ -498,11 +498,11 @@ s390_expand_overloaded_builtin (location_t loc, /* Build a vector type with the alignment of the source location in order to enable correct alignment hints to be generated for vl. */ - tree mem_type = build_aligned_type (return_type, - TYPE_ALIGN (TREE_TYPE (TREE_TYPE ((*arglist)[1]; + unsigned align = TYPE_ALIGN (TREE_TYPE (TREE_TYPE ((*arglist)[1]))); + tree mem_type = build_aligned_type (return_type, align); return build2 (MEM_REF, mem_type, fold_build_pointer_plus ((*arglist)[1], (*arglist)[0]), - build_int_cst (TREE_TYPE ((*arglist)[1]), 0)); + build_int_cst (ptr_type_node, 0)); } case S390_OVERLOADED_BUILTIN_s390_vec_xst: case S390_OVERLOADED_BUILTIN_s390_vec_xstd2: @@ -511,11 +511,13 @@ s390_expand_overloaded_builtin (location_t loc, /* Build a vector type with the alignment of the target location in order to enable correct alignment hints to be generated for vst. 
*/ - tree mem_type = build_aligned_type (TREE_TYPE((*arglist)[0]), - TYPE_ALIGN (TREE_TYPE (TREE_TYPE ((*arglist)[2]; + unsigned align = TYPE_ALIGN (TREE_TYPE (TREE_TYPE ((*arglist)[2]))); + tree mem_type = build_aligned_type (TREE_TYPE ((*arglist)[0]), align); return build2 (MODIFY_EXPR, mem_type, - build1 (INDIRECT_REF, mem_type, - fold_build_pointer_plus ((*arglist)[2], (*arglist)[1])), + build2 (MEM_REF, mem_type, + fold_build_pointer_plus ((*arglist)[2], + (*arglist)[1]), + build_int_cst (ptr_type_node, 0)), (*arglist)[0]); } case S390_OVERLOADED_BUILTIN_s390_vec_load_pair: diff --git a/gcc/testsuite/gcc.target/s390/zvector/pr114676.c b/gcc/testsuite/gcc.target/s390/zvector/pr114676.c new file mode 100644 index 000..bdc66b2920a --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/zvector/pr114676.c @@ -0,0 +1,19 @@ +/* { dg-do run { target { s390*-*-* } } } */ +/* { dg-options "-O3 -mzarch -march=z14 -mzvector" } */ + +#include + +void __attribute__((noinline)) foo (int *mem) +{ + vec_xst ((vector float){ 1.0f, 2.0f, 3.0f, 4.0f }, 0, (float*)mem); +} + +int +main () +{ + int m[4] = { 0 }; + foo (m); + if (m[3] == 0) +__builtin_abort (); + return 0; +} -- 2.44.0
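The test case above stores float values into memory that is declared as int, which under the usual type-based aliasing rules is exactly the kind of access a compiler may reorder or drop. For comparison, here is how the same store can be expressed in plain C without relying on the intrinsic. This is a sketch, not the zvector implementation: `float_any`, `store4_may_alias` and `store4_memcpy` are made-up names, `may_alias` is a GCC extension (also accepted by Clang), and the code assumes that int and float are both 4 bytes wide.

```c
#include <string.h>

/* A float type whose accesses are allowed to alias objects of any
   other type, similar to how accesses through char behave.  */
typedef float __attribute__((may_alias)) float_any;

/* Store four floats through a may_alias lvalue.  */
void __attribute__((noinline))
store4_may_alias (void *mem)
{
  float_any *p = (float_any *) mem;
  p[0] = 1.0f;
  p[1] = 2.0f;
  p[2] = 3.0f;
  p[3] = 4.0f;
}

/* The fully portable alternative: copy the object representation.  */
void __attribute__((noinline))
store4_memcpy (void *mem)
{
  const float v[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
  memcpy (mem, v, sizeof v);
}
```

Letting the pointer argument of vec_xl/vec_xst alias everything gives users of the intrinsics the first behaviour without having to spell out such a type themselves.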
Re: [PATCH] s390: testsuite: Fix zero_bits_compound-1.c
On 4/30/24 10:32, Stefan Schulze Frielinghaus wrote:
> Starting with r12-2731-g96146e61cd7aee we do not generate code like
>
>   _5 = (unsigned int) c_2(D);
>   i_6 = _5 << 8;
>   _7 = _5 << 20;
>   i_8 = i_6 | _7;
>
> anymore but instead
>
>   _5 = (unsigned int) c_2(D);
>   _3 = _5 * 1048832;
>
> which leads finally to slightly different assembly code where we
> previously ended up for z10 or newer with
>
>   lr      %r1,%r2
>   sll     %r1,8
>   rosbg   %r1,%r2,32,43,20
>   llgfr   %r2,%r1
>   br      %r14
>
> and now
>
>   lr      %r1,%r2
>   sll     %r1,12
>   ar      %r2,%r1
>   risbg   %r2,%r2,35,128+55,8
>   br      %r14
>
> The zero-extend materializes via risbg for which the pattern contains an
> "and" which is why the test fails. Thus, instead of scanning for RTL
> expressions rather scan for assembler instructions for s390.
> ---
> Ok for mainline?

Ok. Thanks!

Andreas

>
>  gcc/testsuite/gcc.dg/zero_bits_compound-1.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
> b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
> index e71594911b2..f1e267e0fb0 100644
> --- a/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
> +++ b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
> @@ -39,4 +39,5 @@ unsigned long bar (unsigned char c)
>  }
>
>  /* Check that no pattern containing an AND expression was used. */
> -/* { dg-final { scan-assembler-not "\\(and:" } } */
> +/* { dg-final { scan-assembler-not "\\(and:" { target { ! { s390*-*-* } } }
> } } */
> +/* { dg-final { scan-assembler-not "\\tng?rk?\\t" { target { s390*-*-* } } }
> } */
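The two GIMPLE forms quoted above compute the same value. Assuming c is an 8-bit unsigned value (as the `unsigned char c` parameter in the hunk header suggests), the copy shifted by 8 occupies bits 8..15 and the copy shifted by 20 occupies bits 20..27, so the two never overlap and the OR is the same as an addition. Since 2^8 + 2^20 = 256 + 1048576 = 1048832, that sum is a multiplication by 1048832. A small C sketch that checks this exhaustively follows; the function names are made up for illustration.

```c
/* The old form: two shifts combined with OR.  */
unsigned int
shift_or_form (unsigned char c)
{
  unsigned int x = c;
  return (x << 8) | (x << 20);
}

/* The new form: one multiplication by 2^8 + 2^20.  */
unsigned int
multiply_form (unsigned char c)
{
  unsigned int x = c;
  return x * 1048832u;
}

/* Return the number of 8-bit inputs for which both forms disagree.  */
int
count_mismatches (void)
{
  int bad = 0;
  for (int c = 0; c <= 255; c++)
    if (shift_or_form ((unsigned char) c)
        != multiply_form ((unsigned char) c))
      bad++;
  return bad;
}
```

The same identity shows up in the new assembly: c + (c << 12) rotated left by 8 is again (c << 8) + (c << 20).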
Re: [PATCH] s390: testsuite: Fix risbg-ll-2.c
On 4/30/24 10:34, Stefan Schulze Frielinghaus wrote:
> Starting with r14-2047-gd0e891406b16dc we see through subregs which
> means for f10 in risbg-ll-2.c we do not end up with rosbg_si_noshift but
> rather rosbg_di_noshift which materializes in slightly different start
> index. This saves us an extend.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.target/s390/risbg-ll-2.c: Fix start offset for rosbg of
>     f10.

Ok. Thanks!

Andreas

> ---
> Ok for mainline?
>
>  gcc/testsuite/gcc.target/s390/risbg-ll-2.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.target/s390/risbg-ll-2.c
> b/gcc/testsuite/gcc.target/s390/risbg-ll-2.c
> index 8bf1a0ff88b..ca80602a83f 100644
> --- a/gcc/testsuite/gcc.target/s390/risbg-ll-2.c
> +++ b/gcc/testsuite/gcc.target/s390/risbg-ll-2.c
> @@ -113,7 +113,7 @@ i32 f9 (i64 v_x, i32 v_y)
>  // ands with incompatible masks.
>  i32 f10 (i64 v_x, i32 v_y)
>  {
> -  /* { dg-final { scan-assembler
> "f10:\n\tsrlg\t%r2,%r2,48\n\trosbg\t%r2,%r3,32,39,0" { target { lp64 } } } }
> */
> +  /* { dg-final { scan-assembler
> "f10:\n\tsrlg\t%r2,%r2,48\n\trosbg\t%r2,%r3,0,39,0" { target { lp64 } } } } */
>    /* { dg-final { scan-assembler
> "f10:\n\tnilf\t%r4,4278190080\n\trosbg\t%r4,%r2,48,63,48" { target { ! lp64 }
> } } } */
>    i64 v_shr6 = ((ui64)v_x) >> 48;
>    i32 v_conv = (ui32)v_shr6;
Re: [PATCH] s390: Fix TF to FPRX2 conversion [PR115860]
Ok, Thanks! Andreas On 8/16/24 09:41, Stefan Schulze Frielinghaus wrote: Currently subregs originating from *tf_to_fprx2_0 and *tf_to_fprx2_1 survive register allocation. This in turn leads to wrong register renaming. Keeping the current approach would mean we need two insns for *tf_to_fprx2_0 and *tf_to_fprx2_1, respectively. Something along the lines (define_insn "*tf_to_fprx2_0" [(set (subreg:DF (match_operand:FPRX2 0 "nonimmediate_operand" "=f") 0) (unspec:DF [(match_operand:TF 1 "general_operand" "v")] UNSPEC_TF_TO_FPRX2_0))] "TARGET_VXE" "#") (define_insn "*tf_to_fprx2_0" [(set (match_operand:DF 0 "nonimmediate_operand" "=f") (unspec:DF [(match_operand:TF 1 "general_operand" "v")] UNSPEC_TF_TO_FPRX2_0))] "TARGET_VXE" "vpdi\t%v0,%v1,%v0,1 [(set_attr "op_type" "VRR")]) and similar for *tf_to_fprx2_1. Note, pre register allocation operand 0 has mode FPRX2 and afterwards DF once subregs have been eliminated. Since we always copy a whole vector register into a floating-point register pair, another way to fix this is to merge *tf_to_fprx2_0 and *tf_to_fprx2_1 into a single insn which means we don't have to use subregs at all. The downside of this is that the assembler template contains two instructions, now. The upside is that we don't have to come up with some artificial insn before RA which might be more readable/maintainable. That is implemented by this patch. In commit r11-4872-ge627cda5686592, the output operand specifier %V was introduced which is used in tf_to_fprx2 only, now. I didn't come up with its counterpart like %F for floating-point registers. Instead I printed the register pair in the output function directly. This spares us a new and "rare" format specifier for a single insn. I don't have a strong opinion which option to choose, however, we should either add %F in order to mimic the same behaviour as %V or getting rid of %V and inline the logic in the output function. I lean towards the latter. Any preferences? 
--- gcc/config/s390/s390.md| 2 + gcc/config/s390/vector.md | 66 +++--- gcc/testsuite/gcc.target/s390/pr115860-1.c | 26 + 3 files changed, 60 insertions(+), 34 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/pr115860-1.c diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 3d5759d6252..31240899934 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -241,6 +241,8 @@ UNSPEC_VEC_VFMIN UNSPEC_VEC_VFMAX + UNSPEC_TF_TO_FPRX2 + UNSPEC_NNPA_VCLFNHS_V8HI UNSPEC_NNPA_VCLFNLS_V8HI UNSPEC_NNPA_VCRNFS_V8HI diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md index a75b7cb5825..561182e0c2c 100644 --- a/gcc/config/s390/vector.md +++ b/gcc/config/s390/vector.md @@ -907,36 +907,36 @@ "vmrlg\t%0,%1,%2"; [(set_attr "op_type" "VRR")]) - -(define_insn "*tf_to_fprx2_0" - [(set (subreg:DF (match_operand:FPRX2 0 "nonimmediate_operand" "+f") 0) - (subreg:DF (match_operand:TF1 "general_operand" "v") 0))] - "TARGET_VXE" - ; M4 == 1 corresponds to %v0[0] = %v1[0]; %v0[1] = %v0[1]; - "vpdi\t%v0,%v1,%v0,1" - [(set_attr "op_type" "VRR")]) - -(define_insn "*tf_to_fprx2_1" - [(set (subreg:DF (match_operand:FPRX2 0 "nonimmediate_operand" "+f") 8) - (subreg:DF (match_operand:TF1 "general_operand" "v") 8))] +(define_insn "tf_to_fprx2" + [(set (match_operand:FPRX2 0 "register_operand" "=f,f ,f") + (unspec:FPRX2 [(match_operand:TF 1 "general_operand" "v,AR,AT")] + UNSPEC_TF_TO_FPRX2))] "TARGET_VXE" - ; M4 == 5 corresponds to %V0[0] = %v1[1]; %V0[1] = %V0[1]; - "vpdi\t%V0,%v1,%V0,5" - [(set_attr "op_type" "VRR")]) - -(define_insn_and_split "tf_to_fprx2" - [(set (match_operand:FPRX20 "nonimmediate_operand" "=f,f") - (subreg:FPRX2 (match_operand:TF 1 "general_operand" "v,AR") 0))] - "TARGET_VXE" - "#" - "!(MEM_P (operands[1]) && MEM_VOLATILE_P (operands[1]))" - [(set (match_dup 2) (match_dup 3)) - (set (match_dup 4) (match_dup 5))] { - operands[2] = simplify_gen_subreg (DFmode, operands[0], FPRX2mode, 0); - operands[3] = simplify_gen_subreg 
(DFmode, operands[1], TFmode, 0); - operands[4] = simplify_gen_subreg (DFmode, operands[0], FPRX2mode, 8); - operands[5] = simplify_gen_subreg (DFmode, operands[1], TFmode, 8); + char buf[64]; + switch (which_alternative) +{ +case 0: + if (REGNO (operands[0]) == REGNO (operands[1])) + return "vpdi\t%V0,%v1,%V0,5"; + else + return "ldr\t%f0,%f1;vpdi\t%V0,%v1,%V0,5"; +case 1: + { + const char *reg_pair = reg_names[REGNO (operands[0]) + 1]; + snprintf (buf, sizeof (buf), "ld\t%%f0,%%1;ld\t%%%s,8+%%1", reg_pair); + output_asm_insn (buf, operands); + return ""; + } +case 2: + { + const char *reg_pair = reg_names[REG
Re: [PATCH] s390: Fix TF to FPRX2 conversion [PR115860]
On 9/12/24 08:14, Stefan Schulze Frielinghaus wrote:
..

Right, so offsettable_memref_p only ensures that any resulting address
is a valid general address. So we have to manually check for short
displacement. Maybe something along the lines:

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 7aea776da2f..e61cda8352a 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -3714,6 +3714,18 @@ s390_mem_constraint (const char *str, rtx op)
       if ((reload_completed || reload_in_progress)
           ? !offsettable_memref_p (op) : !offsettable_nonstrict_memref_p (op))
         return 0;
+      /* offsettable_memref_p ensures only that any positive offset added to
+         the address forms a valid general address.  For Q and R constraints we
+         also have to verify that the resulting displacement after adding any
+         positive offset less than the size of the object being referenced is
+         still valid.  */
+      if (str[1] == 'Q' || str[1] == 'R')
+        {
+          int o = GET_MODE_SIZE (GET_MODE (op)) - 1;
+          rtx tmp = adjust_address (op, QImode, o);
+          if (!s390_check_qrst_address (str[1], XEXP (tmp, 0), true))
+            return 0;
+        }
       return s390_check_qrst_address (str[1], XEXP (op, 0), true);
     case 'B':
       /* Check for non-literal-pool variants of memory constraints.  */

My reading of the constraints A[RQST] is that those are only used for
operands with non-block mode. Thus, I didn't check for block mode.
Maybe an assert would be worthwhile.

This looks reasonable to me. I guess this deserves to be a separate
patch?

Yea I think so, too, since this fixes the constraints AR and AQ which
is independent of this patch. I will prepare one shortly.

Agreed. Feel free to commit the change above right away.

Thanks!

Andreas
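The idea behind the proposed check can be shown with plain numbers: for a short-displacement operand it is not enough that the first byte of the object is addressable, the last byte has to be addressable with a short displacement as well. The sketch below is a simplified model only. It assumes a base-plus-displacement address and a short displacement range of 0..4095 (an unsigned 12-bit field); the names `TOY_SHORT_DISP_MAX`, `toy_short_disp_p` and `toy_object_fits_short_disp_p` are made up and have no counterpart in s390.cc.

```c
#include <stdbool.h>

/* Assumption: short displacements are unsigned 12-bit values.  */
#define TOY_SHORT_DISP_MAX 4095

/* True if DISP can be encoded as a short displacement.  */
static bool
toy_short_disp_p (long disp)
{
  return disp >= 0 && disp <= TOY_SHORT_DISP_MAX;
}

/* True if every byte of an object of SIZE bytes starting at
   displacement DISP can be reached with a short displacement.  This
   models checking the address of the last byte (offset SIZE - 1) in
   addition to the address of the object itself.  */
bool
toy_object_fits_short_disp_p (long disp, unsigned size)
{
  if (size == 0)
    return false;
  return toy_short_disp_p (disp)
         && toy_short_disp_p (disp + (long) size - 1);
}
```

For example, a 16-byte operand at displacement 4088 passes a check of its first byte only, but accessing its second half at 8 + 4088 = 4096 no longer fits, which is the situation the additional test is meant to reject.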
Re: [PATCH] s390: Fix strict_low_part generation
On 8/16/24 09:14, Stefan Schulze Frielinghaus wrote:
> In s390_expand_insv(), if generating code for ICM et al. src is a MEM
> and gen_lowpart might force src into a register such that we end up
> with patterns which do not match anymore. Use adjust_address() instead
> in order to preserve a MEM.
>
> Furthermore, it is not straight forward to enforce a subreg. For
> example, in case of a paradoxical subreg, gen_lowpart() may return a
> register. In order to compensate this, s390_gen_lowpart_subreg() emits
> a reference to a pseudo which does not coincide with its definition
> which is wrong. Additionally, if dest is a paradoxical subreg, then do
> not try to emit a strict_low_part since it could mean that dest was not
> initialized even though this might be fixed up later by init-regs.
>
> Splitter for insn *get_tp_64, *zero_extendhisi2_31,
> *zero_extendqisi2_31, *zero_extendqihi2_31 are applied after reload.
> Thus, operands[0] is a hard register and gen_lowpart (m, operands[0])
> just returns the hard register for mode m which is fine to use as an
> argument for strict_low_part, i.e., we do not need to enforce subregs
> here since after reload subregs are supposed to be eliminated anyway.
>
> This fixes gcc.dg/torture/pr111821.c.
>
> gcc/ChangeLog:
>
>     * config/s390/s390-protos.h (s390_gen_lowpart_subreg): Remove.
>     * config/s390/s390.cc (s390_gen_lowpart_subreg): Remove.
>     (s390_expand_insv): Use adjust_address() and emit a
>     strict_low_part only in case of a natural subreg.
>     * config/s390/s390.md: Use gen_lowpart() instead of
>     s390_gen_lowpart_subreg().

Ok. Thanks!

Andreas
[PING] 3 patches waiting for approval/review
[RFC] Allow functions calling mcount before prologue to be leaf functions
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00993.html

[PATCH] PR57377: Fix mnemonic attribute
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01364.html

[PATCH] Doc: Add documentation for the mnemonic attribute
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01436.html

Bye,

-Andreas-
Re: [PATCH] Enable non-complex math builtins from C99 for Bionic
On Wed, Aug 21, 2013 at 11:21:32PM +0400, Alexander Ivchenko wrote:
> I'm sorry for that. The following patch cured my build of those
> targets; it is also preserving the initial presence of c99. There were
> plenty of targets that were changed by my patch, I hope this time I
> didn't miss anything.

S/390 bootstrap still fails. The reason is that when we set tm_file/tm_p_file in config.gcc we do not append to the existing content, so your adjustments done before are ignored and gcc complains about an unknown linux_android_libc_has_function.

However, since with s390 Linux and TPF we only target GNU/Linux systems with Glibc, we can default to gnu_libc_has_function anyway:

Index: gcc/config/s390/linux.h
===
*** gcc/config/s390/linux.h.orig 2013-01-14 07:48:06.0 +
--- gcc/config/s390/linux.h 2013-08-22 07:57:46.006014197 +
*** along with GCC; see the file COPYING3.
*** 87,90
--- 87,93
  /* Define if long doubles should be mangled as 'g'. */
  #define TARGET_ALTERNATE_LONG_DOUBLE_MANGLING
  
+ #undef TARGET_LIBC_HAS_FUNCTION
+ #define TARGET_LIBC_HAS_FUNCTION gnu_libc_has_function
+ 
  #endif
Index: gcc/config/s390/tpf.h
===
*** gcc/config/s390/tpf.h.orig 2013-08-22 07:01:48.0 +
--- gcc/config/s390/tpf.h 2013-08-22 07:57:27.706013534 +
*** along with GCC; see the file COPYING3.
*** 111,118
  /* IBM copies these libraries over with these names. */
  #define MATH_LIBRARY "CLBM"
  #define LIBSTDCXX "CPP2"
- #endif /* ! _TPF_H */
- 
  /* We redefine this hook so the version from elfos.h header won't be used. */
  #undef TARGET_LIBC_HAS_FUNCTION
! #define TARGET_LIBC_HAS_FUNCTION default_libc_has_function
--- 111,118
  /* IBM copies these libraries over with these names. */
  #define MATH_LIBRARY "CLBM"
  #define LIBSTDCXX "CPP2"
  #undef TARGET_LIBC_HAS_FUNCTION
! #define TARGET_LIBC_HAS_FUNCTION gnu_libc_has_function
! 
! #endif /* ! _TPF_H */
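As a rough illustration of what selecting a different TARGET_LIBC_HAS_FUNCTION value means, the following self-contained C sketch models the hook as a predicate over function classes. It is a simplified model written for this note: the enum members and the behaviour of the two predicates are assumptions about the hooks' intent, and every name prefixed with model_ is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the function classes the hook is asked about.  */
enum function_class {
    function_c94,
    function_c99_misc,
    function_c99_math_complex,
    function_sincos
};

typedef bool (*libc_has_function_hook)(enum function_class);

/* Assumed conservative default: only some classes are taken as present.  */
bool model_default_libc_has_function(enum function_class fn_class)
{
    return fn_class == function_c94
           || fn_class == function_c99_misc
           || fn_class == function_c99_math_complex;
}

/* Assumed glibc variant: every class is available.  */
bool model_gnu_libc_has_function(enum function_class fn_class)
{
    (void) fn_class;
    return true;
}

/* The target header picks one predicate; callers only consult the hook.  */
bool model_target_has(libc_has_function_hook hook, enum function_class c)
{
    assert(hook != NULL);
    return hook(c);
}
```

Under these assumptions, switching the hook to the glibc variant is what makes builtins such as sincos eligible again on a target that always links against Glibc.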
[PATCH] S/390: Use fast BCR serialization facility.
Hi,

since z196 we have the fast-BCR-serialization facility. With this facility a faster synchronization primitive is provided which is sufficient for all our cases. In fact the compare and swap atomic operations use the very same mechanism. With this patch we always use the new variant on z196 and later.

The bcr variants used for synchronization purposes always go into their own dispatch group. This wasn't correctly implemented for z196 and zEC12 so far.

I'll commit the patch after regression tests passed.

Bye,

-Andreas-

2013-09-03 Andreas Krebbel

  * config/s390/s390.md: Add "bcr_flush" value to mnemonic attribute.
  ("mem_thread_fence_1"): Use bcr 14,0 for z196 and later. Set the mnemonic attribute to "bcr_flush". Set the "z196prop" attribute to "z196_alone".
  * config/s390/2827.md: Add "bcr_flush" to "ooo_groupalone" and "zEC12_simple".

---
 gcc/config/s390/2827.md |4
 gcc/config/s390/s390.md | 16 +++!
 2 files changed, 3 insertions(+), 17 modifications(!)

Index: gcc/config/s390/s390.md
===
*** gcc/config/s390/s390.md.orig
--- gcc/config/s390/s390.md
***
*** 291,297
  z196_cracked"
  (const_string "none"))
! (define_attr "mnemonic" "unknown" (const_string "unknown"))
  ;; Length in bytes.
--- 291,297
  z196_cracked"
  (const_string "none"))
! (define_attr "mnemonic" "bcr_flush,unknown" (const_string "unknown"))
  ;; Length in bytes.
***
*** 9007,9018
  ; Although bcr is superscalar on Z10, this variant will never
  ; become part of an execution group.
  (define_insn "mem_thread_fence_1"
    [(set (match_operand:BLK 0 "" "")
          (unspec:BLK [(match_dup 0)] UNSPEC_MB))]
    ""
!   "bcr\t15,0"
!   [(set_attr "op_type" "RR")])
  ;
  ; atomic load/store operations
--- 9007,9028
  ; Although bcr is superscalar on Z10, this variant will never
  ; become part of an execution group.
+ ; With z196 we can make use of the fast-BCR-serialization facility.
+ ; This allows for a slightly faster sync which is sufficient for our
+ ; purposes.
(define_insn "mem_thread_fence_1" [(set (match_operand:BLK 0 "" "") (unspec:BLK [(match_dup 0)] UNSPEC_MB))] "" ! { ! if (TARGET_Z196) ! return "bcr\t14,0"; ! else ! return "bcr\t15,0"; ! } ! [(set_attr "op_type" "RR") !(set_attr "mnemonic" "bcr_flush") !(set_attr "z196prop" "z196_alone")]) ; ; atomic load/store operations Index: gcc/config/s390/2827.md === *** gcc/config/s390/2827.md.orig --- gcc/config/s390/2827.md *** *** 32,43 (const_int 0))) (define_attr "ooo_groupalone" "" ! (cond [(eq_attr "mnemonic" "lnxbr,madb,ltxtr,clc,axtr,msebr,slbgr,xc,alcr,lpxbr,slbr,maebr,mlg,mfy,lxdtr,maeb,lxeb,nc,mxtr,sxtr,dxbr,alc,msdbr,ltxbr,lxdb,madbr,lxdbr,lxebr,mvc,m,mseb,mlr,mlgr,slb,tcxb,msdb,sqxbr,alcgr,oc,flogr,alcg,mxbr,dxtr,axbr,mr,sxbr,slbg,ml,lcxbr") (const_int 1)] (const_int 0))) (define_insn_reservation "zEC12_simple" 1 (and (eq_attr "cpu" "zEC12") !(eq_attr "mnemonic" "ltg,ogrk,lr,lnebr,lghrl,sdbr,x,asi,lhr,sebr,madb,ar,lhrl,clfxtr,llgfr,clghrl,cgr,cli,agrk,ic,adbr,aebr,lrv,clg,cy,cghi,sy,celfbr,seb,clgfr,al,tm,lang,clfebr,lghr,cdb,lpebr,laa,ark,lh,or,icy,xi,msebr,n,llihl,afi,cs,nrk,sth,lgr,l,lcr,stey,xg,crt,slgfr,ny,ld,j,llihh,slgr,clfhsi,slg,lb,lgrl,lrl,llihf,lndbr,llcr,laxg,mvghi,rllg,sdb,xrk,laag,alhsik,algfi,algr,aly,agfi,lrvr,d,crl,llgc,tmhl,algsi,lgh,icmh,clhrl,xgrk,icm,iilf,ork,lbr,cg,ldgr,lgf,iihf,llghr,sg,clfdbr,llgtr,stam,cebr,tmhh,tceb,slgf,basr,lgbr,maebr,lgb,cgfi,aeb,ltebr,lax,clfit,lrvgr,nihl,ni,clfdtr,srdl,mdb,srk,xihf,stgrl,sthrl,algf,ltr,cdlgbr,cgit,ng,lat,llghrl,ltgr,nihh,clgfrl,srlk,maeb,agr,cxlftr,ler,bcr,stcy,cds,clfi,nihf,ly,clt,lgat,alg,lhy,lgfrl,clghsi,clrt,tmll,srlg,tcdb,ay,sty,clr,lgfi,lan,lpdbr,clgt,adb,ahik,sra,algrk,cdfbr,lcebr,clfxbr,msdbr,ceb,clgr,tmy,tmlh,alghsik,lcgr,mvi,cdbr,ltgf,xr,larl,ldr,llgcr,clgrt,clrl,cghsi,cliy,madbr,oy,ogr,llgt,meebr,slr,clgxbr,chi,s,icmy,llc,ngr,clhhsi,ltgfr,llill,lhi,o,meeb,clgdtr,sll,clgrl,clgf,ledbr,cegbr,mviy,algfr,rll,cdlftr,sldl,cdlgtr,lg,niy,st,sgr,ag,le,xgr,cr,stg,llilh,sr,lzer,cdsg,sllk,mdbr,stoc,
csg,clgit,chhsi,strl,llilf,lndfr,ngrk,clgebr,clgfi,llgh,mseb,ltdbr,oill,la,llhrl,stc,lghi,oihl
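As a side note, the instruction selection done by the new output template of "mem_thread_fence_1" can be restated as a small stand-alone C function. This is only a sketch mirroring the patch: the boolean parameter stands in for the TARGET_Z196 test, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the output template in the patch: with the
   fast-BCR-serialization facility (z196 and later) emit bcr 14,0,
   otherwise fall back to bcr 15,0.  */
const char *mem_thread_fence_template(bool target_z196)
{
    return target_z196 ? "bcr\t14,0" : "bcr\t15,0";
}

/* Hypothetical helper: true if TEMPLATE is the fast variant.  */
bool uses_fast_bcr(const char *template_str)
{
    assert(template_str != NULL);
    return strcmp(template_str, "bcr\t14,0") == 0;
}
```

Either variant carries the "bcr_flush" mnemonic in the patch, which is what lets the scheduling descriptions treat the fence separately from an ordinary bcr.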
[PATCH] S/390: Add support for the "load fp integer" instructions
Hi,

the attached patch implements pattern definitions for the nearest integer functions for binary and decimal floating point. Since z196 we have "load fp integer" instructions which allow suppression of the inexact exception. These provide a 1:1 mapping to several of the standard math.h functions.

The DFP variants are not yet expanded for the standard math function since the necessary GCC builtins are missing so far.

I'll commit the patch after waiting for comments and regression test.

Bye,

-Andreas-

2013-09-03 Andreas Krebbel

  * config/s390/s390.md (UNSPEC_FPINT_FLOOR, UNSPEC_FPINT_BTRUNC)
  (UNSPEC_FPINT_ROUND, UNSPEC_FPINT_CEIL, UNSPEC_FPINT_NEARBYINT)
  (UNSPEC_FPINT_RINT): New constant definitions.
  (FPINT, fpint_name, fpint_roundingmode): New integer iterator definition with 2 attributes.
  ("2", "rint2")
  ("2", "rint2"): New pattern definitions.

2013-09-03 Andreas Krebbel

  * gcc.target/s390/nearestint-1.c: New testcase.

---
 gcc/config/s390/2827.md | 21 ++!
 gcc/config/s390/s390.md | 78 ++!
 gcc/testsuite/gcc.target/s390/nearestint-1.c | 48
 3 files changed, 143 insertions(+), 4 modifications(!)
Index: gcc/config/s390/2827.md === *** gcc/config/s390/2827.md.orig --- gcc/config/s390/2827.md *** *** 37,43 (define_insn_reservation "zEC12_simple" 1 (and (eq_attr "cpu" "zEC12") !(eq_attr "mnemonic" "ltg,ogrk,lr,lnebr,lghrl,sdbr,x,asi,lhr,sebr,madb,ar,lhrl,clfxtr,llgfr,clghrl,cgr,cli,agrk,ic,adbr,aebr,lrv,clg,cy,cghi,sy,celfbr,seb,clgfr,al,tm,lang,clfebr,lghr,cdb,lpebr,laa,ark,lh,or,icy,xi,msebr,n,llihl,afi,cs,nrk,sth,lgr,l,lcr,stey,xg,crt,slgfr,ny,ld,j,llihh,slgr,clfhsi,slg,lb,lgrl,lrl,llihf,lndbr,llcr,laxg,mvghi,rllg,sdb,xrk,laag,alhsik,algfi,algr,aly,agfi,lrvr,d,crl,llgc,tmhl,algsi,lgh,icmh,clhrl,xgrk,icm,iilf,ork,lbr,cg,ldgr,lgf,iihf,llghr,sg,clfdbr,llgtr,stam,cebr,tmhh,tceb,slgf,basr,lgbr,maebr,lgb,cgfi,aeb,ltebr,lax,clfit,lrvgr,nihl,ni,clfdtr,srdl,mdb,srk,xihf,stgrl,sthrl,algf,ltr,cdlgbr,cgit,ng,lat,llghrl,ltgr,nihh,clgfrl,srlk,maeb,agr,cxlftr,ler,bcr_flush,stcy,cds,clfi,nihf,ly,clt,lgat,alg,lhy,lgfrl,clghsi,clrt,tmll,srlg,tcdb,ay,sty,clr,lgfi,lan,lpdbr,clgt,adb,ahik,sra,algrk,cdfbr,lcebr,clfxbr,msdbr,ceb,clgr,tmy,tmlh,alghsik,lcgr,mvi,cdbr,ltgf,xr,larl,ldr,llgcr,clgrt,clrl,cghsi,cliy,madbr,oy,ogr,llgt,meebr,slr,clgxbr,chi,s,icmy,llc,ngr,clhhsi,ltgfr,llill,lhi,o,meeb,clgdtr,sll,clgrl,clgf,ledbr,cegbr,mviy,algfr,rll,cdlftr,sldl,cdlgtr,lg,niy,st,sgr,ag,le,xgr,cr,stg,llilh,sr,lzer,cdsg,sllk,mdbr,stoc,csg,clgit,chhsi,strl,llilf,lndfr,ngrk,clgebr,clgfi,llgh,mseb,ltdbr,oill,la,llhrl,stc,lghi,oihl,xiy,sllg,llgf,cgrt,ldeb,cl,sl,cdlfbr,oi,oilh,nr,srak,oihh,ear,slgrk,og,c,slgfi,sthy,oilf,oiy,msdb,oihf,a,cfi,lzxr,lzdr,srag,cdgbr,brasl,alr,cgrl,llgfrl,cit,clgxtr,ley,exrl,lcdfr,lay,xilf,lcdbr,alsi,mvhhi,srl,chsi,lgfr,lrvg,cly,sgrk,ahi,celgbr,nill,clgdbr,jg,slrk,lxr,sar,slfi,cpsdr,lcgfr,aghik,nilh,mvhi,lpdfr,xy,alrk,lao,agsi,ldy,nilf,llhr,alfi,laog,sly,aghi,ldebr,bras,srda,cefbr,lt")) "nothing") (define_insn_reservation "zEC12_cgdbr" 2 (and (eq_attr "cpu" "zEC12") --- 37,43 (define_insn_reservation "zEC12_simple" 1 (and (eq_attr "cpu" "zEC12") !(eq_attr "mnemonic" 
"ltg,ogrk,lr,lnebr,lghrl,sdbr,x,asi,lhr,sebr,madb,ar,lhrl,clfxtr,llgfr,clghrl,cgr,cli,agrk,ic,adbr,aebr,lrv,clg,cy,cghi,sy,celfbr,seb,clgfr,al,tm,lang,clfebr,lghr,cdb,lpebr,laa,ark,lh,or,icy,xi,msebr,n,llihl,afi,cs,nrk,sth,lgr,l,lcr,stey,xg,crt,slgfr,ny,ld,j,llihh,slgr,clfhsi,slg,lb,lgrl,lrl,llihf,lndbr,llcr,laxg,mvghi,rllg,sdb,xrk,laag,alhsik,algfi,algr,aly,agfi,lrvr,d,crl,llgc,tmhl,algsi,lgh,icmh,clhrl,xgrk,icm,iilf,ork,lbr,cg,ldgr,lgf,iihf,llghr,sg,clfdbr,llgtr,stam,cebr,tmhh,tceb,slgf,basr,lgbr,maebr,lgb,cgfi,aeb,ltebr,lax,clfit,lrvgr,nihl,ni,clfdtr,srdl,mdb,srk,xihf,stgrl,sthrl,algf,ltr,cdlgbr,cgit,ng,lat,llghrl,ltgr,nihh,clgfrl,srlk,maeb,agr,cxlftr,ler,bcr_flush,stcy,cds,clfi,nihf,ly,clt,lgat,alg,lhy,lgfrl,clghsi,clrt,tmll,srlg,tcdb,ay,sty,clr,lgfi,lan,lpdbr,clgt,adb,ahik,sra,algrk,cdfbr,lcebr,clfxbr,msdbr,ceb,clgr,tmy,tmlh,alghsik,lcgr,mvi,cdbr,ltgf,xr,larl,ldr,llgcr,clgrt,clrl,cghsi,cliy,madbr,oy,ogr,llgt,meebr,slr,clgxbr,chi,s,icmy,llc,ngr,clhhsi,ltgfr,llill,lhi,o,meeb,clgdtr,sll,clgrl,clgf,ledbr,cegbr,mviy,algfr,rll,cdlftr,sldl,cdlgtr,lg,niy,st,sgr,ag,le,xgr,cr,stg,llilh,sr,lzer,cdsg,sllk,mdbr,stoc,csg,clgit,chhsi,strl,llilf,lndfr,ngrk,clgebr,clgfi,llgh,mseb,ltdbr,oill,la,llhrl,stc,lghi,oihl,xiy,sllg,llgf,cgrt,ldeb,cl,sl,cdlfbr,oi,oilh,nr,srak,oihh,ear,slgrk,og,c,slgfi,sthy,oilf,oiy,msdb,oihf,a,cfi,lzxr,lzdr,srag,cdgbr,brasl,alr,cgrl,llgfrl,cit,clgxtr,l
[Committed] S/390: Fix PR 58574
Hi,

this fixes a bug in the literal pool splitting code in the s390 back end. Jakub debugged the problem and provided a fix.

I've tested the patch on s390 and s390x with the default options as well as -march=z10/-mtune=zEC12. No regressions.

Committed to mainline. Jakub tested the 4.8 version and will commit it soon.

Bye,

-Andreas-

2013-10-01 Jakub Jelinek
           Andreas Krebbel

  PR target/58574
  * config/s390/s390.c (s390_split_branches): Modify check for table jump insns.
  (s390_chunkify_start): Rearrange table jump insn check in order to deal with compare and branch insns correctly.

2013-10-01 Jakub Jelinek

  PR target/58574
  * gcc.c-torture/execute/pr58574.c: New testcase.

---
 gcc/config/s390/s390.c| 51 +-!!!
 gcc/testsuite/gcc.c-torture/execute/pr58574.c | 219 ++
 2 files changed, 230 insertions(+), 14 deletions(-), 26 modifications(!)

Index: gcc/config/s390/s390.c
===
*** gcc/config/s390/s390.c.orig
--- gcc/config/s390/s390.c
*** s390_split_branches (void)
*** 6025,6035
    for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
      {
!       if (! JUMP_P (insn))
          continue;
        pat = PATTERN (insn);
!       if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) > 2)
          pat = XVECEXP (pat, 0, 0);
        if (GET_CODE (pat) != SET || SET_DEST (pat) != pc_rtx)
          continue;
--- 6025,6035
    for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
      {
!       if (! JUMP_P (insn) || tablejump_p (insn, NULL, NULL))
          continue;
        pat = PATTERN (insn);
!       if (GET_CODE (pat) == PARALLEL)
          pat = XVECEXP (pat, 0, 0);
        if (GET_CODE (pat) != SET || SET_DEST (pat) != pc_rtx)
          continue;
*** s390_chunkify_start (void)
*** 7049,7054
--- 7049,7056
    for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
      {
+       rtx table;
+ 
        /* Labels marked with LABEL_PRESERVE_P can be target of non-local
           jumps, so we have to mark them.
           The same holds for named labels.
*** s390_chunkify_start (void)
*** 7063,7104
            if (! vec_insn || ! JUMP_TABLE_DATA_P (vec_insn))
              bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (insn));
          }
!
/* If we have a direct jump (conditional or unconditional)
!          or a casesi jump, check all potential targets. */
        else if (JUMP_P (insn))
          {
!           rtx pat = PATTERN (insn);
!           rtx table;
!           if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) > 2)
              pat = XVECEXP (pat, 0, 0);
!           if (GET_CODE (pat) == SET)
!             {
                rtx label = JUMP_LABEL (insn);
                if (label)
                  {
!                   if (s390_find_pool (pool_list, label)
                        != s390_find_pool (pool_list, insn))
                      bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (label));
                  }
-             }
-           else if (tablejump_p (insn, NULL, &table))
-             {
-               rtx vec_pat = PATTERN (table);
-               int i, diff_p = GET_CODE (vec_pat) == ADDR_DIFF_VEC;
- 
-               for (i = 0; i < XVECLEN (vec_pat, diff_p); i++)
-                 {
-                   rtx label = XEXP (XVECEXP (vec_pat, diff_p, i), 0);
- 
-                   if (s390_find_pool (pool_list, label)
-                       != s390_find_pool (pool_list, insn))
-                     bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (label));
-                 }
              }
!         }
      }
    /* Insert base register reload insns before every pool. */
--- 7065,7105
            if (! vec_insn || ! JUMP_TABLE_DATA_P (vec_insn))
              bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (insn));
          }
+       /* Check potential targets in a table jump (casesi_jump). */
+       else if (tablejump_p (insn, NULL, &table))
+         {
+           rtx vec_pat = PATTERN (table);
+           int i, diff_p = GET_CODE (vec_pat) == ADDR_DIFF_VEC;
+ 
+           for (i = 0; i < XVECLEN (vec_pat, diff_p); i++)
+             {
+               rtx label = XEXP (XVECEXP (vec_pat, diff_p, i), 0);
!               if (s390_find_pool (pool_list, label)
!                   != s390_find_pool (pool_list, insn))
!                 bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (label));
!             }
!         }
!       /* If we have a direct jump (conditional or unconditional),
!          check all potential targets. */
        else if (JUMP_P (insn))
          {
!           rtx pat = PATTERN (insn);
!           if (GET_CODE (pat) == PARALLEL)
              pat = XVECEXP (pat, 0, 0);
!           if (GET_CODE (pat) == SET)
!             {
                rtx label = JUMP_LABEL (insn);
                if (label)
                  {
!
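To make the far-label rule in the hunks above easier to follow, here is a stand-alone C sketch of the table jump case: every jump table target that lives in a different literal pool chunk than the jump itself gets marked. Pools are modelled as plain integers and labels as array indices; this data model and the function name are hypothetical simplifications, not the GCC data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: JUMP_POOL is the pool containing the table jump,
   TARGET_POOLS[i] is the pool containing the i-th table entry's label.
   Marks FAR_LABEL[i] for every target in another pool and returns how
   many labels were newly marked.  */
size_t mark_far_table_targets(int jump_pool, const int *target_pools,
                              size_t n_targets, bool *far_label)
{
    assert(target_pools != NULL && far_label != NULL);

    size_t marked = 0;
    for (size_t i = 0; i < n_targets; i++) {
        if (target_pools[i] != jump_pool && !far_label[i]) {
            far_label[i] = true;
            marked++;
        }
    }
    return marked;
}
```

The patch performs this check before the generic direct jump case, so a table jump is handled by its table entries rather than by the shape of its pattern.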
Re: [PING] 3 patches waiting for approval/review
On 02/10/13 09:10, Paulo Matos wrote:
>
>> -Original Message-
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On
>> Behalf Of Andreas Krebbel
>> Sent: 01 October 2013 10:18
>> To: gcc-patches@gcc.gnu.org
>> Subject: [PING] 3 patches waiting for approval/review
>>
>> [RFC] Allow functions calling mcount before prologue to be leaf functions
>> http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00993.html
>>
>> [PATCH] PR57377: Fix mnemonic attribute
>> http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01364.html
>>
>> [PATCH] Doc: Add documentation for the mnemonic attribute
>> http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01436.html
>>
>> Bye,
>>
>> -Andreas-
>>
>
> Documentation patch has a typo:
>
> + specific checks in e.g. the pipleline description.

Fixed. Thanks.

-Andreas-
[PATCH][4.8] S/390: Transactional memory fixes
Hi,

with the attached patch we support more operand types in the tabort and tbegin_retry builtins. The patch also removes the constraint letters in the expanders and fixes a builtin prototype in the documentation. The testcase is adjusted accordingly.

Bootstrapped and regtested on s390 and s390x with --with-arch=zEC12.

I'll apply the patch to mainline and 4.8 branch after waiting for comments.

Bye,

-Andreas-

2013-10-02 Andreas Krebbel

  * config/s390/s390.md ("tbegin", "tbegin_nofloat", "tbegin_retry")
  ("tbegin_retry_nofloat", "tend", "tabort", "tx_assist"): Remove constraint letters from expanders.
  ("tbegin_retry", "tbegin_retry_nofloat"): Change predicate of the retry count to general_operand.
  ("tabort"): Give operand 0 a mode.
  ("tabort_1"): Add mode and constraint letter for operand 0.
  * doc/extend.texi: Fix prototype of __builtin_non_tx_store.

2013-10-02 Andreas Krebbel

  * gcc.target/s390/htm-1.c: Add more tests to cover different operand types.

---
 gcc/config/s390/s390.md | 28 !!!
 gcc/doc/extend.texi |2 !
 gcc/testsuite/gcc.target/s390/htm-1.c | 48 +!
 3 files changed, 25 insertions(+), 53 modifications(!)

Index: gcc/config/s390/s390.md
===
*** gcc/config/s390/s390.md.orig
--- gcc/config/s390/s390.md
***
*** 9962,9969
  ; Non-constrained transaction begin
  (define_expand "tbegin"
!   [(match_operand:SI 0 "register_operand" "=d")
!    (match_operand:BLK 1 "memory_operand" "=Q")]
    "TARGET_HTM"
  {
    s390_expand_tbegin (operands[0], operands[1], NULL_RTX, true);
--- 9962,9969
  ; Non-constrained transaction begin
  (define_expand "tbegin"
!   [(match_operand:SI 0 "register_operand" "")
!    (match_operand:BLK 1 "memory_operand" "")]
    "TARGET_HTM"
  {
    s390_expand_tbegin (operands[0], operands[1], NULL_RTX, true);
***
*** 9971,9978
  })
  (define_expand "tbegin_nofloat"
!   [(match_operand:SI 0 "register_operand" "=d")
!    (match_operand:BLK 1 "memory_operand" "=Q")]
    "TARGET_HTM"
  {
    s390_expand_tbegin (operands[0], operands[1], NULL_RTX, false);
--- 9971,9978
  })
  (define_expand "tbegin_nofloat"
!
[(match_operand:SI 0 "register_operand" "") !(match_operand:BLK 1 "memory_operand" "")] "TARGET_HTM" { s390_expand_tbegin (operands[0], operands[1], NULL_RTX, false); *** *** 9980,9988 }) (define_expand "tbegin_retry" ! [(match_operand:SI 0 "register_operand" "=d") !(match_operand:BLK 1 "memory_operand" "=Q") !(match_operand 2 "const_int_operand")] "TARGET_HTM" { s390_expand_tbegin (operands[0], operands[1], operands[2], true); --- 9980,9988 }) (define_expand "tbegin_retry" ! [(match_operand:SI 0 "register_operand" "") !(match_operand:BLK 1 "memory_operand" "") !(match_operand:SI 2 "general_operand" "")] "TARGET_HTM" { s390_expand_tbegin (operands[0], operands[1], operands[2], true); *** *** 9990,9998 }) (define_expand "tbegin_retry_nofloat" ! [(match_operand:SI 0 "register_operand" "=d") !(match_operand:BLK 1 "memory_operand" "=Q") !(match_operand 2 "const_int_operand")] "TARGET_HTM" { s390_expand_tbegin (operands[0], operands[1], operands[2], false); --- 9990,9998 }) (define_expand "tbegin_retry_nofloat" ! [(match_operand:SI 0 "register_operand" "") !(match_operand:BLK 1 "memory_operand" "") !(match_operand:SI 2 "general_operand" "")] "TARGET_HTM" { s390_expand_tbegin (operands[0], operands[1], operands[2], false); *** *** 10059,10065 (define_expand "tend" [(set (reg:CCRAW CC_REGNUM) (unspec_volatile:CCRAW [(const_int 0)] UNSPECV_TEND)) !(set (match_operand:SI 0 "register_operand" "=d") (unspec:SI [(reg:CCRAW CC_REGNUM)] UNSPEC_CC_TO_INT))] "TARGET_HTM" "") --- 10059,10065 (define_expand "tend" [(set (reg:CCRAW CC_REGNUM) (unspec_volatile:CCRAW [(const_int 0)] UNSPECV_TEND)) !(set (match_operand:SI 0 "regis