[Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the instruction that moves pc to low register
Hi there, When compile below simple code: terguo01@terry-pc01:mtpcs-frame$ cat test.c int main(void) { return 0; } I got ICE with option -mtpcs-leaf-frame (no error if remove this option). terguo01@terry-pc01:mtpcs-frame$ /work/terguo01/tools/gcc-arm-none-eabi-5_0-2014q4/bin/arm-none-eabi-gcc -mtpcs-leaf-frame test.c -c -mcpu=cortex-m0plus -mthumb -da test.c: In function 'main': test.c:4:1: error: unrecognizable insn: } ^ (insn 20 19 21 (set (reg:SI 2 r2) (reg:SI 15 pc)) test.c:2 -1 (nil)) test.c:4:1: internal compiler error: in extract_insn, at recog.c:2327 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html\ for instructions. This RTL is generated in function thumb1_expand_prologue. The expected insn pattern is thumb1_movsi_insn in thumb1.md. And instruction like "mov r2, pc" is a legal instruction. Because gcc returns NO_REG for PC register, so no valid pattern to match instruction that move pc to low register. This patch intends to add a new insn pattern to legalize such thing. Tested with GCC regression test. No regression. Is it OK to trunk? BR, Terry 2014-12-08 Terry Guo terry@arm.com * config/arm/predicates.md (pc_register): New to match PC register. * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern. gcc/testsuite/ChangeLog: 2014-12-08 Terry Guo terry@arm.com * gcc.target/arm/thumb1-mov-pc.c: New test.diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md index 032808c..c5ef5ed 100644 --- a/gcc/config/arm/predicates.md +++ b/gcc/config/arm/predicates.md @@ -361,6 +361,10 @@ (and (match_code "smin,smax,umin,umax") (match_test "mode == GET_MODE (op)"))) +(define_special_predicate "pc_register" + (and (match_code "reg") + (match_test "REGNO (op) == PC_REGNUM"))) + (define_special_predicate "cc_register" (and (match_code "reg") (and (match_test "REGNO (op) == CC_REGNUM") diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md index ddedc39..8e6057c 100644 --- a/gcc/config/arm/thumb1.md +++ b/gcc/config/arm/thumb1.md @@ -1780,6 +1780,16 @@ " ) +(define_insn "*thumb1_movpc_insn" + [(set (match_operand:SI 0 "low_register_operand") +(match_operand:SI 1 "pc_register"))] + "TARGET_THUMB1" + "mov\\t%0, pc" + [(set_attr "length" "2") + (set_attr "conds" "nocond") + (set_attr "type" "mov_reg")] +) + ;; NB never uses BX. (define_insn "*thumb1_tablejump" [(set (pc) (match_operand:SI 0 "register_operand" "l*r")) diff --git a/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c new file mode 100644 index 000..9f94131 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c @@ -0,0 +1,7 @@ +/* { dg-options "-mtpcs-leaf-frame -O2" } */ +/* { dg-skip-if "" { ! { arm_thumb1 } } } */ +int +main () +{ + return 0; +}
Re: [PATCH, i386] Fix PR64003
2014-12-07 12:51 GMT+03:00 Richard Sandiford : > Ilya Enkovich writes: >> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md >> index 88435d6..9019ed8 100644 >> --- a/gcc/config/i386/i386.md >> +++ b/gcc/config/i386/i386.md >> @@ -10958,6 +10958,24 @@ >> ;; Basic conditional jump instructions. >> ;; We ignore the overflow flag for signed branch instructions. >> >> +(define_insn "*jcc_1_bnd" >> + [(set (pc) >> + (if_then_else (match_operator 1 "ix86_comparison_operator" >> + [(reg FLAGS_REG) (const_int 0)]) >> + (label_ref (match_operand 0)) >> + (pc)))] >> + "TARGET_MPX && ix86_bnd_prefixed_insn_p (insn)" >> + "bnd %+j%C1\t%l0" >> + [(set_attr "type" "ibr") >> + (set_attr "modrm" "0") >> + (set (attr "length") >> +(if_then_else (and (ge (minus (match_dup 0) (pc)) >> + (const_int -126)) >> + (lt (minus (match_dup 0) (pc)) >> + (const_int 128))) >> + (const_int 3) >> + (const_int 7)))]) >> + > > Sorry, looking at this again, shouldn't the offset be -125 rather > than -126? The pc in: > >(ge (minus (match_dup 0) (pc)) >(const_int -126)) > > is the start of the instruction, so we need to add the length > of the jump itself to the real minimum range of -128. You are right. Similarly upper bound should be changed from 128 to 129. Thanks, Ilya > > Thanks, > Richard
Re: [PATCH] Finish up string builtin folding move
On Thu, 4 Dec 2014, Richard Biener wrote: > > This is the last patch, moving stpcpy folding. There are still > string function foldings left but those are exclusively GENERIC > now with no chance of advertedly recursing from GIMPLE folding > via GENERIC folding back to GIMPLE folding. I'll deal with those > during next stage1. > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. The following is what I have applied. Bootstrapped / tested on x86_64-unknown-linux-gnu. Richard. 2014-12-08 Richard Biener * builtins.c (fold_builtin_0): Remove unused ignore parameter. (fold_builtin_1): Likewise. (fold_builtin_3): Likewise. (fold_builtin_varargs): Likewise. (fold_builtin_2): Likewise. Do not fold stpcpy here. (fold_builtin_n): Adjust. (fold_builtin_stpcpy): Move to gimple-fold.c. (gimple_fold_builtin_stpcpy): Moved and gimplified from builtins.c. (gimple_fold_builtin): Fold stpcpy here. Index: trunk/gcc/builtins.c === *** trunk.orig/gcc/builtins.c 2014-12-04 14:24:58.265803955 +0100 --- trunk/gcc/builtins.c2014-12-04 14:49:45.367752467 +0100 *** static tree fold_builtin_fabs (location_ *** 191,201 static tree fold_builtin_abs (location_t, tree, tree); static tree fold_builtin_unordered_cmp (location_t, tree, tree, tree, enum tree_code, enum tree_code); ! static tree fold_builtin_0 (location_t, tree, bool); ! static tree fold_builtin_1 (location_t, tree, tree, bool); ! static tree fold_builtin_2 (location_t, tree, tree, tree, bool); ! static tree fold_builtin_3 (location_t, tree, tree, tree, tree, bool); ! static tree fold_builtin_varargs (location_t, tree, tree*, int, bool); static tree fold_builtin_strpbrk (location_t, tree, tree, tree); static tree fold_builtin_strstr (location_t, tree, tree, tree); --- 191,201 static tree fold_builtin_abs (location_t, tree, tree); static tree fold_builtin_unordered_cmp (location_t, tree, tree, tree, enum tree_code, enum tree_code); ! static tree fold_builtin_0 (location_t, tree); ! static tree fold_builtin_1 (location_t, tree, tree); ! static tree fold_builtin_2 (location_t, tree, tree, tree); ! static tree fold_builtin_3 (location_t, tree, tree, tree, tree); ! static tree fold_builtin_varargs (location_t, tree, tree*, int); static tree fold_builtin_strpbrk (location_t, tree, tree, tree); static tree fold_builtin_strstr (location_t, tree, tree, tree); *** fold_builtin_exponent (location_t loc, t *** 8657,8703 return NULL_TREE; } - /* Fold function call to builtin stpcpy with arguments DEST and SRC. -Return NULL_TREE if no simplification can be made. */ - - static tree - fold_builtin_stpcpy (location_t loc, tree fndecl, tree dest, tree src) - { - tree fn, len, lenp1, call, type; - - if (!validate_arg (dest, POINTER_TYPE) - || !validate_arg (src, POINTER_TYPE)) - return NULL_TREE; - - len = c_strlen (src, 1); - if (!len - || TREE_CODE (len) != INTEGER_CST) - return NULL_TREE; - - if (optimize_function_for_size_p (cfun) - /* If length is zero it's small enough. */ - && !integer_zerop (len)) - return NULL_TREE; - - fn = builtin_decl_implicit (BUILT_IN_MEMCPY); - if (!fn) - return NULL_TREE; - - lenp1 = size_binop_loc (loc, PLUS_EXPR, - fold_convert_loc (loc, size_type_node, len), - build_int_cst (size_type_node, 1)); - /* We use dest twice in building our expression. Save it from - multiple expansions. */ - dest = builtin_save_expr (dest); - call = build_call_expr_loc (loc, fn, 3, dest, src, lenp1); - - type = TREE_TYPE (TREE_TYPE (fndecl)); - dest = fold_build_pointer_plus_loc (loc, dest, len); - dest = fold_convert_loc (loc, type, dest); - dest = omit_one_operand_loc (loc, type, dest, call); - return dest; - } - /* Fold function call to builtin memchr. ARG1, ARG2 and LEN are the arguments to the call, and TYPE is its return type. Return NULL_TREE if no simplification can be made. */ --- 8657,8662 *** fold_builtin_arith_overflow (location_t *** 9857,9867 } /* Fold a call to built-in function FNDECL with 0 arguments. !IGNORE is true if the result of the function call is ignored. This !function returns NULL_TREE if no simplification was possible. */ static tree ! fold_builtin_0 (location_t loc, tree fndecl, bool ignore ATTRIBUTE_UNUSED) { tree type = TREE_TYPE (TREE_TYPE (fndecl)); enum built_in_function fcode = DECL_FUNCTION_CODE (fndecl); --- 9816,9825 } /* Fold a call to built-in function FNDECL with 0 arguments. !This function returns NULL_TREE if no simplification was possible. */ static tree ! fold_builtin_0 (location_t loc, tre
Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables
On Thu, Dec 4, 2014 at 8:35 PM, Ilya Verbin wrote: > Hi, > > On 30 Sep 18:53, Ilya Verbin wrote: >> This patch creates 2 vectors with decls: offload_funcs and offload_vars. >> libgomp will use addresses from these arrays to look up offloaded code. >> >> During the compilation they are outputted to: >> * binary __gnu_offload_funcs/vars sections, or using >> targetm.record_offload_symbol hook for PTX. > > In some cases LTO may optimize out a global variable, declared as target, but > it still will be referenced from the offload table, that will cause a linking > error. Here is the example: > > #pragma omp declare target > int G; > #pragma omp end declare target So where is that "magic" target use then? Why doesn't the symtab reachability code see the use? (why don't we prune it from the offload table?) Richard. > int main () > { > int res = 0; > > #pragma omp target map(alloc: G) map(from: res) > { > G = 1; > res = G; > } > > return res; > } > > $ gcc -fopenmp -flto -O1 test.c > xxx.ltrans0.ltrans.o:.offload_var_table.3973: error: undefined reference to > 'G' > > This issue can be resolved by forcing output of such variables. > Is this fix ok? Should I add a testcase? > > > diff --git a/gcc/varpool.c b/gcc/varpool.c > index 0526b7f..db28c2a 100644 > --- a/gcc/varpool.c > +++ b/gcc/varpool.c > @@ -175,6 +175,7 @@ varpool_node::get_create (tree decl) >g->have_offload = true; >if (!in_lto_p) > vec_safe_push (offload_vars, decl); > + node->force_output = 1; > #endif > } > > > Thanks, > -- Ilya
Re: Fix sreal::shift add to_double conversion
On Thu, Dec 4, 2014 at 9:06 PM, Jan Hubicka wrote: >> On December 4, 2014 7:00:12 PM CET, Jan Hubicka wrote: >> >Hi, >> >this patch fixes implementation of shift in sreal.h that ICEs on 0 >> >value. >> >Also the comment is copied from shift_right and does not make sense for >> >sreal.h. >> > >> >The to_double conversion is useful for debug output: Most of inliner >> >data is >> >now output as integer that does not make much sense. >> >> Can't you use dump for this? > > Well, dump will get you somewhat less readable form like 1234*2^3 and also it > makes it hard to dump > multiple values with single printf. > if (dump) > { > sreal num,den; > relative_time_benefit (callee_info, edge, edge_time, &num, &den); > fprintf (dump_file, >" %f: guessed profile. frequency %f," >" benefit %f%%, time w/o inlining %f, time w inlining %f" >" overall growth %i (current) %i (original)\n", >badness.to_double (), (double)edge->frequency / > CGRAPH_FREQ_BASE, >(num/den).to_double () * 100, >compute_uninlined_call_time (callee_info, edge).to_double > (), >compute_inlined_call_time (edge, edge_time).to_double (), >estimate_growth (callee), >callee_info->growth); > } > > would look even more ugly if done with dump method. Hmm, ok. Please add a comment before to_double that it is only supposed to be used for debug dumping. Richard. > Honza >> >> Richard. >> >> >Bootstrapped/regtested x86_64-linux, OK? >> > >> > * sreal.h (to_double): Declare. >> > (shift): Do not ICE on 0, update outdated comment. >> > * sreal.c: Include math.h >> > (sreal::to_double): New function. >> >Index: sreal.h >> >=== >> >--- sreal.h (revision 218286) >> >+++ sreal.h (working copy) >> >@@ -53,6 +53,7 @@ >> > >> > void dump (FILE *) const; >> > int64_t to_int () const; >> >+ double to_double () const; >> > sreal operator+ (const sreal &other) const; >> > sreal operator- (const sreal &other) const; >> > sreal operator* (const sreal &other) const; >> >@@ -93,12 +94,14 @@ >> > >> > sreal shift (int s) const >> > { >> >+/* Zero needs no shifting. */ >> >+if (!m_sig) >> >+ return *this; >> > gcc_checking_assert (s <= SREAL_BITS); >> > gcc_checking_assert (s >= -SREAL_BITS); >> > >> >-/* Exponent should never be so large because shift_right is used >> >only by >> >- sreal_add and sreal_sub ant thus the number cannot be shifted out >> >from >> >- exponent range. */ >> >+/* Overflows/drop to 0 could be handled gracefully, but hopefully >> >we do not >> >+ need to do so. */ >> > gcc_checking_assert (m_exp + s <= SREAL_MAX_EXP); >> > gcc_checking_assert (m_exp + s >= -SREAL_MAX_EXP); >> > >> >Index: sreal.c >> >=== >> >--- sreal.c (revision 218286) >> >+++ sreal.c (working copy) >> >@@ -47,6 +47,7 @@ >> > sig == 0 && exp == -SREAL_MAX_EXP >> > */ >> > >> >+#include >> > #include "config.h" >> > #include "system.h" >> > #include "coretypes.h" >> >@@ -167,6 +168,19 @@ >> > return sign * m_sig; >> > } >> > >> >+/* Return value of *this as double. */ >> >+ >> >+double >> >+sreal::to_double () const >> >+{ >> >+ double val = m_sig; >> >+ if (m_negative) >> >+val = -val; >> >+ if (m_exp) >> >+val *= exp2 (m_exp); >> >+ return val; >> >+} >> >+ >> > /* Return *this + other. */ >> > >> > sreal >>
Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables
2014-12-08 12:18 GMT+03:00 Richard Biener : > On Thu, Dec 4, 2014 at 8:35 PM, Ilya Verbin wrote: >> Hi, >> >> On 30 Sep 18:53, Ilya Verbin wrote: >>> This patch creates 2 vectors with decls: offload_funcs and offload_vars. >>> libgomp will use addresses from these arrays to look up offloaded code. >>> >>> During the compilation they are outputted to: >>> * binary __gnu_offload_funcs/vars sections, or using >>> targetm.record_offload_symbol hook for PTX. >> >> In some cases LTO may optimize out a global variable, declared as target, but >> it still will be referenced from the offload table, that will cause a linking >> error. Here is the example: >> >> #pragma omp declare target >> int G; >> #pragma omp end declare target > > So where is that "magic" target use then? Why doesn't the symtab > reachability code see the use? (why don't we prune it from the offload > table?) > > Richard. Nowhere on host-side, but it can remain non-optimized on target-side, therefore we can not remove it only from the host offload table. -- Ilya
Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables
On Mon, Dec 08, 2014 at 01:28:10PM +0400, Ilya Verbin wrote: > 2014-12-08 12:18 GMT+03:00 Richard Biener : > > On Thu, Dec 4, 2014 at 8:35 PM, Ilya Verbin wrote: > >> Hi, > >> > >> On 30 Sep 18:53, Ilya Verbin wrote: > >>> This patch creates 2 vectors with decls: offload_funcs and offload_vars. > >>> libgomp will use addresses from these arrays to look up offloaded code. > >>> > >>> During the compilation they are outputted to: > >>> * binary __gnu_offload_funcs/vars sections, or using > >>> targetm.record_offload_symbol hook for PTX. > >> > >> In some cases LTO may optimize out a global variable, declared as target, > >> but > >> it still will be referenced from the offload table, that will cause a > >> linking > >> error. Here is the example: > >> > >> #pragma omp declare target > >> int G; > >> #pragma omp end declare target > > > > So where is that "magic" target use then? Why doesn't the symtab > > reachability code see the use? (why don't we prune it from the offload > > table?) > > > > Richard. > > Nowhere on host-side, but it can remain non-optimized on target-side, > therefore we can not remove it only from the host offload table. Note, the two tables have to match, and after streaming the offloading LTO IL, it can't be easily adjusted anymore. Unless we'd allow some magic value (say NULL) for vars that were optimized away on the host side, then libgomp and/or mkoffload would need to be adjusted to ignore or remove pairs from the table where the host var has been optimized away, but the target one not necessarily so. Jakub
Re: [PATCH, CHKP] Don't try to optimize bounds returned by strchr
2014-12-06 1:55 GMT+03:00 Jeff Law : > On 12/02/14 06:40, Ilya Enkovich wrote: >> >> Hi, >> >> For strchr calls bounds of the first argument are considered as returned >> which is wrong because NULL may be returned. This patch fixes that. >> Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? >> >> Thanks, >> Ilya >> -- >> 2014-12-02 Ilya Enkovich >> >> * tree-chkp.c (chkp_build_returned_bound): Don't predict >> return bounds for strchr calls. > > OK. > > Do you have a testcase you could add to the suite? I cannot add a nice test for bounds returned by strchr for NULL until it is wrapped (which will happen eventually). Thanks, Ilya > > Jeff >
Re: [PATCH] [AArch64, RTL] Bics instruction generation for aarch64
On 11/11/14 10:38, Alex Velenko wrote: > From 98bb6d7323ce79e28be8ef892b919391ed857e1f Mon Sep 17 00:00:00 2001 > From: Alex Velenko > Date: Fri, 31 Oct 2014 18:43:32 + > Subject: [PATCH] [AArch64, RTL] Bics instruction generation for aarch64 > > Hi, > > This patch adds rtl patterns for aarch64 to generate bics instructions in > cases when caputed value gets discarded and only only the status regester > change of the instruction gets reused. > > Previously, bics would only be generated, if the value computed by bics > would later be reused, which is not necessarily the case when computing > this value for "if" statements. > > Is this patch ok? > > Thanks, > Alex > > gcc/ > > 2014-11-10 Alex Velenko > > * gcc/config/aarch64/aarch64.md > (and_one_cmpl3_compare0_no_reuse): >New define_insn. > * (and_one_cmpl_3_compare0_no_reuse): >Likewise. > > gcc/testsuite/ > > 2014-11-10 Alex Velenko > > * gcc.target/aarch64/bics1.c : New testcase. OK. R. > --- > gcc/config/aarch64/aarch64.md | 26 > gcc/testsuite/gcc.target/aarch64/bics_3.c | 69 > +++ > 2 files changed, 95 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/aarch64/bics_3.c > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 341c26f..6158d82 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -2845,6 +2845,18 @@ > [(set_attr "type" "logics_reg")] > ) > > +(define_insn "*and_one_cmpl3_compare0_no_reuse" > + [(set (reg:CC_NZ CC_REGNUM) > +(compare:CC_NZ > + (and:GPI (not:GPI > + (match_operand:GPI 0 "register_operand" "r")) > + (match_operand:GPI 1 "register_operand" "r")) > + (const_int 0)))] > + "" > + "bics\\tzr, %1, %0" > + [(set_attr "type" "logics_reg")] > +) > + > (define_insn "*_one_cmpl_3" > [(set (match_operand:GPI 0 "register_operand" "=r") > (LOGICAL:GPI (not:GPI > @@ -2894,6 +2906,20 @@ > [(set_attr "type" "logics_shift_imm")] > ) > > +(define_insn "*and_one_cmpl_3_compare0_no_reuse" > + [(set (reg:CC_NZ CC_REGNUM) > +(compare:CC_NZ > + (and:GPI (not:GPI > + (SHIFT:GPI > +(match_operand:GPI 0 "register_operand" "r") > +(match_operand:QI 1 "aarch64_shift_imm_" "n"))) > + (match_operand:GPI 2 "register_operand" "r")) > + (const_int 0)))] > + "" > + "bics\\tzr, %2, %0, %1" > + [(set_attr "type" "logics_shift_imm")] > +) > + > (define_insn "clz2" > [(set (match_operand:GPI 0 "register_operand" "=r") > (clz:GPI (match_operand:GPI 1 "register_operand" "r")))] > diff --git a/gcc/testsuite/gcc.target/aarch64/bics_3.c > b/gcc/testsuite/gcc.target/aarch64/bics_3.c > new file mode 100644 > index 000..ecb53e9 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/bics_3.c > @@ -0,0 +1,69 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 --save-temps" } */ > + > +extern void abort (void); > + > +int __attribute__ ((noinline)) > +bics_si_test (int a, int b) > +{ > + if (a & ~b) > +return 1; > + else > +return 0; > +} > + > +int __attribute__ ((noinline)) > +bics_si_test2 (int a, int b) > +{ > + if (a & ~ (b << 2)) > +return 1; > + else > +return 0; > +} > + > +typedef long long s64; > + > +int __attribute__ ((noinline)) > +bics_di_test (s64 a, s64 b) > +{ > + if (a & ~b) > +return 1; > + else > +return 0; > +} > + > +int __attribute__ ((noinline)) > +bics_di_test2 (s64 a, s64 b) > +{ > + if (a & ~(b << 2)) > +return 1; > + else > +return 0; > +} > + > +int > +main (void) > +{ > + int a = 5; > + int b = 5; > + int c = 20; > + s64 d = 5; > + s64 e = 5; > + s64 f = 20; > + if (bics_si_test (a, b)) > +abort (); > + if (bics_si_test2 (c, b)) > +abort (); > + if (bics_di_test (d, e)) > +abort (); > + if (bics_di_test2 (f, e)) > +abort (); > + return 0; > +} > + > +/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+" > 2 } } */ > +/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+, > lsl 2" 1 } } */ > +/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+" > 2 } } */ > +/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+, > lsl 2" 1 } } */ > + > +/* { dg-final { cleanup-saved-temps } } */ >
[match-and-simplify] Merge from trunk
2014-12-08 Richard Biener Merge from trunk r218262 through r218477.
[PATCH] Fix PR ipa/64049
Hello, this surprisingly simple patch fixes PR ipa/64049. The root cause seems to be that in this test case we try to devirtualize a method call on a return value. Boot-Strapped and regression-tested on X86_64-linux-gnu. OK for trunk? Thanks Bernd. patch-pr64049.diff Description: Binary data patch-pr64049.diff Description: Binary data
FW: [PATCH] Fix PR ipa/64049
> > Hello, > > > this surprisingly simple patch fixes PR ipa/64049. The root cause seems to be > that in this test case we try to devirtualize a method call on a return value. > > > Boot-Strapped and regression-tested on X86_64-linux-gnu. > OK for trunk? > > > Thanks > Bernd. > Again with changelog. Sorry. patch-pr64049.diff Description: Binary data 2014-12-08 Bernd Edlinger PR ipa/64049 * ipa-polymorphic-call.c (pa_polymorphic_call_context::ipa_polymorphic_call): Allow RESULT_DECL. testsuite/ChangeLog: 2014-12-08 Bernd Edlinger PR ipa/64049 * g++.dg/ipa/pr64049.h: New. * g++.dg/ipa/pr64049-1.C: New. * g++.dg/ipa/pr64049-2.C: New.
Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name
Ping 2014-11-26 15:35 GMT+03:00 Ilya Enkovich : > On 25 Nov 15:03, Ilya Enkovich wrote: >> 2014-11-25 14:11 GMT+03:00 Richard Biener : >> > On Tue, Nov 25, 2014 at 11:19 AM, Ilya Enkovich >> > wrote: >> > >> > Ok, then it's get_for_asmname (). That said - the above loops look >> > bogus to me. Honza - any better ideas? >> >> get_for_asmname () returns the first element in a chain of nodes with >> the same asm name. May I rely on the order of nodes in this chain? >> Probably use ASSEMBLER_NAME as a key in chkp_static_var_bounds hash? >> >> Thanks, >> Ilya >> >> > >> > Richard. >> > > > A variant with var's ASSEMBLER_NAME as a key works fine. Instrumented > bootstrap passes. OK for trunk? > > Thanks, > Ilya > -- > gcc/ > > 2014-11-26 Ilya Enkovich > > PR bootstrap/63995 > * tree-chkp.c (chkp_make_static_bounds): Share bounds var > between nodes sharing assembler name. > > gcc/testsuite > > 2014-11-26 Ilya Enkovich > > PR bootstrap/63995 > * g++.dg/dg.exp: Add mpx-dg.exp. > * g++.dg/pr63995-1.C: New. > > > diff --git a/gcc/testsuite/g++.dg/dg.exp b/gcc/testsuite/g++.dg/dg.exp > index 14beae1..44eab0c 100644 > --- a/gcc/testsuite/g++.dg/dg.exp > +++ b/gcc/testsuite/g++.dg/dg.exp > @@ -18,6 +18,7 @@ > > # Load support procs. > load_lib g++-dg.exp > +load_lib mpx-dg.exp > > # If a testcase doesn't have special options, use these. > global DEFAULT_CXXFLAGS > diff --git a/gcc/testsuite/g++.dg/pr63995-1.C > b/gcc/testsuite/g++.dg/pr63995-1.C > new file mode 100644 > index 000..82e7606 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/pr63995-1.C > @@ -0,0 +1,16 @@ > +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ > +/* { dg-require-effective-target mpx } */ > +/* { dg-options "-O2 -g -fcheck-pointer-bounds -mmpx" } */ > + > +int test1 (int i) > +{ > + extern const int arr[10]; > + return arr[i]; > +} > + > +extern const int arr[10]; > + > +int test2 (int i) > +{ > + return arr[i]; > +} > diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c > index 3e38691..924cb71 100644 > --- a/gcc/tree-chkp.c > +++ b/gcc/tree-chkp.c > @@ -2727,9 +2727,23 @@ chkp_make_static_bounds (tree obj) >/* First check if we already have required var. */ >if (chkp_static_var_bounds) > { > - slot = chkp_static_var_bounds->get (obj); > - if (slot) > - return *slot; > + /* For vars we use assembler name as a key in > +chkp_static_var_bounds map. It allows to > +avoid duplicating bound vars for decls > +sharing assembler name. */ > + if (TREE_CODE (obj) == VAR_DECL) > + { > + tree name = DECL_ASSEMBLER_NAME (obj); > + slot = chkp_static_var_bounds->get (name); > + if (slot) > + return *slot; > + } > + else > + { > + slot = chkp_static_var_bounds->get (obj); > + if (slot) > + return *slot; > + } > } > >/* Build decl for bounds var. */ > @@ -2793,7 +2807,13 @@ chkp_make_static_bounds (tree obj) >if (!chkp_static_var_bounds) > chkp_static_var_bounds = new hash_map; > > - chkp_static_var_bounds->put (obj, bnd_var); > + if (TREE_CODE (obj) == VAR_DECL) > +{ > + tree name = DECL_ASSEMBLER_NAME (obj); > + chkp_static_var_bounds->put (name, bnd_var); > +} > + else > +chkp_static_var_bounds->put (obj, bnd_var); > >return bnd_var; > }
Re: [PATCH, i386] Fix PR64003
On 08 Dec 11:30, Ilya Enkovich wrote: > 2014-12-07 12:51 GMT+03:00 Richard Sandiford : > > Ilya Enkovich writes: > >> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > >> index 88435d6..9019ed8 100644 > >> --- a/gcc/config/i386/i386.md > >> +++ b/gcc/config/i386/i386.md > >> @@ -10958,6 +10958,24 @@ > >> ;; Basic conditional jump instructions. > >> ;; We ignore the overflow flag for signed branch instructions. > >> > >> +(define_insn "*jcc_1_bnd" > >> + [(set (pc) > >> + (if_then_else (match_operator 1 "ix86_comparison_operator" > >> + [(reg FLAGS_REG) (const_int 0)]) > >> + (label_ref (match_operand 0)) > >> + (pc)))] > >> + "TARGET_MPX && ix86_bnd_prefixed_insn_p (insn)" > >> + "bnd %+j%C1\t%l0" > >> + [(set_attr "type" "ibr") > >> + (set_attr "modrm" "0") > >> + (set (attr "length") > >> +(if_then_else (and (ge (minus (match_dup 0) (pc)) > >> + (const_int -126)) > >> + (lt (minus (match_dup 0) (pc)) > >> + (const_int 128))) > >> + (const_int 3) > >> + (const_int 7)))]) > >> + > > > > Sorry, looking at this again, shouldn't the offset be -125 rather > > than -126? The pc in: > > > >(ge (minus (match_dup 0) (pc)) > >(const_int -126)) > > > > is the start of the instruction, so we need to add the length > > of the jump itself to the real minimum range of -128. > > You are right. Similarly upper bound should be changed from 128 to 129. > > Thanks, > Ilya > > > > > Thanks, > > Richard Here is a patch to fix it. Thanks, Ilya -- 2014-12-08 Ilya Enkovich PR target/64003 * config/i386/i386.md (*jcc_1_bnd): Fix short jump range. (*jcc_2_bnd): Likewise. (jump_bnd): Likewise. diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 9019ed8..eafba04 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -10970,9 +10970,9 @@ (set_attr "modrm" "0") (set (attr "length") (if_then_else (and (ge (minus (match_dup 0) (pc)) - (const_int -126)) + (const_int -125)) (lt (minus (match_dup 0) (pc)) - (const_int 128))) + (const_int 129))) (const_int 3) (const_int 7)))]) @@ -11006,9 +11006,9 @@ (set_attr "modrm" "0") (set (attr "length") (if_then_else (and (ge (minus (match_dup 0) (pc)) - (const_int -126)) + (const_int -125)) (lt (minus (match_dup 0) (pc)) - (const_int 128))) + (const_int 129))) (const_int 3) (const_int 7)))]) @@ -11464,9 +11464,9 @@ [(set_attr "type" "ibr") (set (attr "length") (if_then_else (and (ge (minus (match_dup 0) (pc)) - (const_int -126)) + (const_int -125)) (lt (minus (match_dup 0) (pc)) - (const_int 128))) + (const_int 129))) (const_int 3) (const_int 6))) (set_attr "modrm" "0")])
[PATCH][wwwdocs] Fix typo in changes.html and minor rewording
Hi all, As the subject says, this just fixes a typo in the fprofile-generate option name and rewords the text in the next sentence a bit. Ok to commit? Thanks, KyrillIndex: htdocs/gcc-5/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.48 diff -U 3 -r1.48 changes.html --- htdocs/gcc-5/changes.html 8 Dec 2014 03:17:31 - 1.48 +++ htdocs/gcc-5/changes.html 8 Dec 2014 10:47:05 - @@ -41,8 +41,8 @@ inline-insns-single limits for hot calls. IPA reference pass was significantly sped up making it feasible to enable -fipa-reference with - -fprofile-generage. This also solve bottleneck - seen when optimizing Chromium with link time optimization. + -fprofile-generate. This also solves a bottleneck + seen when building Chromium with link time optimization. Symbol table and call-graph API was reworked to C++ and simplified.
RE: [PATCH][wwwdocs] Fix typo in changes.html and minor rewording
Hi Kyrill, > Hi all, > > As the subject says, this just fixes a typo in the fprofile-generate > option name and rewords the text in the next sentence a bit. > Ok to commit? > > Thanks, > Kyrill I think this kind of change does not really need approval. But, maybe you should also look at fixing the spelling of "sped": s/sped/speed/ Bernd. --- htdocs/gcc-5/changes.html 8 Dec 2014 03:17:31 - 1.48 +++ htdocs/gcc-5/changes.html 8 Dec 2014 10:47:05 - @@ -41,8 +41,8 @@ inline-insns-single limits for hot calls. IPA reference pass was significantly sped up making it feasible to enable -fipa-reference with
Re: [PATCH][wwwdocs] Fix typo in changes.html and minor rewording
On Mon, Dec 8, 2014 at 11:24 AM, Bernd Edlinger wrote: > Hi Kyrill, > > >> Hi all, >> >> As the subject says, this just fixes a typo in the fprofile-generate >> option name and rewords the text in the next sentence a bit. >> Ok to commit? >> >> Thanks, >> Kyrill > > > > I think this kind of change does not really need approval. > But, maybe you should also look at fixing the spelling of "sped": > s/sped/speed/ sped is past tense and past participle for speed. The usage is correct here. Ramana > > Bernd. > > --- htdocs/gcc-5/changes.html 8 Dec 2014 03:17:31 - 1.48 > +++ htdocs/gcc-5/changes.html 8 Dec 2014 10:47:05 - > @@ -41,8 +41,8 @@ > inline-insns-single limits for hot calls. > IPA reference pass was significantly sped up making it feasible > to enable -fipa-reference with >
[PATCH][4.8] Backport ISL 0.14 support
I have installed the following patch on the 4.8 branch after verifying it still builds ok and enables graphite with system ISL 0.14 and system ISL 0.12.2 (and the corresponding cloog). Richard. 2014-12-08 Richard Biener Backport from 4.9 branch 2014-12-04 Tobias Burnus * configure.ac: Permit also ISL 0.14 with CLooG. * Makefile.def: Make more dependent on mpfr, mpc, isl, and cloog. * Makefile.in: Regenerate. * configure: Regenerate. gcc/ * configure.ac (ac_has_isl_schedule_constraints_compute_schedule): New check. * graphite-clast-to-gimple.c: For ISL 0.14, include deprecate headers. * graphite-interchange.c: Ditto. * graphite-poly.c: Ditto. * graphite-sese-to-poly.c: Ditto. * graphite-optimize-isl.c (getScheduleForBandList): Ditto. Conditionally use ISL 0.13+ functions. * config.in: Regenerate. * configure: Regenerate. Index: configure.ac === --- configure.ac(revision 218391) +++ configure.ac(revision 218392) @@ -1658,6 +1658,9 @@ if test "x$with_isl" != "xno" && ISL_CHECK_VERSION(0,11) if test "${gcc_cv_isl}" = no ; then ISL_CHECK_VERSION(0,12) + if test "${gcc_cv_isl}" = no ; then +ISL_CHECK_VERSION(0,14) + fi fi fi dnl Only execute fail-action, if ISL has been requested. Index: gcc/config.in === --- gcc/config.in (revision 218391) +++ gcc/config.in (revision 218392) @@ -1211,6 +1211,12 @@ #endif +/* Define if isl_schedule_constraints_compute_schedule exists. */ +#ifndef USED_FOR_TARGET +#undef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE +#endif + + /* Define to 1 if you have the `kill' function. */ #ifndef USED_FOR_TARGET #undef HAVE_KILL Index: gcc/configure.ac === --- gcc/configure.ac(revision 218391) +++ gcc/configure.ac(revision 218392) @@ -5495,8 +5495,31 @@ AC_ARG_VAR(CLOOGLIBS,[How to link CLOOG] AC_ARG_VAR(CLOOGINC,[How to find CLOOG include files]) if test "x${CLOOGLIBS}" != "x" ; then AC_DEFINE(HAVE_cloog, 1, [Define if cloog is in use.]) + + # Check whether isl_schedule_constraints_compute_schedule is available; + # it's new in ISL-0.13. + saved_CFLAGS="$CFLAGS" + CFLAGS="$CFLAGS $ISLINC" + saved_LIBS="$LIBS" + LIBS="$LIBS $CLOOGLIBS $ISLLIBS $GMPLIBS" + + AC_MSG_CHECKING([Checking for isl_schedule_constraints_compute_schedule]) + AC_TRY_LINK([#include ], + [isl_schedule_constraints_compute_schedule (NULL);], + [ac_has_isl_schedule_constraints_compute_schedule=yes], + [ac_has_isl_schedule_constraints_compute_schedule=no]) + AC_MSG_RESULT($ac_has_isl_schedule_constraints_compute_schedule) + + LIBS="$saved_LIBS" + CFLAGS="$saved_CFLAGS" + + if test x"$ac_has_isl_schedule_constraints_compute_schedule" = x"yes"; then + AC_DEFINE(HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE, 1, + [Define if isl_schedule_constraints_compute_schedule exists.]) + fi fi + # Check for plugin support AC_ARG_ENABLE(plugin, [AS_HELP_STRING([--enable-plugin], [enable plugin support])], Index: gcc/graphite-clast-to-gimple.c === --- gcc/graphite-clast-to-gimple.c (revision 218391) +++ gcc/graphite-clast-to-gimple.c (revision 218392) @@ -30,6 +30,11 @@ along with GCC; see the file COPYING3. #include #include #include +#ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE +#include +#include +#include +#endif #endif #include "system.h" Index: gcc/graphite-optimize-isl.c === --- gcc/graphite-optimize-isl.c (revision 218391) +++ gcc/graphite-optimize-isl.c (revision 218392) @@ -28,6 +28,10 @@ along with GCC; see the file COPYING3. #include #include #include +#ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE +#include +#include +#endif #endif #include "system.h" @@ -373,7 +377,11 @@ getScheduleForBandList (isl_band_list *B { for (i = ScheduleDimensions - 1 ; i >= 0 ; i--) { +#ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE + if (isl_band_member_is_coincident (Band, i)) +#else if (isl_band_member_is_zero_distance (Band, i)) +#endif { isl_map *TileMap; isl_union_map *TileUMap; Index: gcc/graphite-poly.c === --- gcc/graphite-poly.c (revision 218391) +++ gcc/graphite-poly.c (revision 218392) @@ -30,6 +30,10 @@ along with GCC; see the file COPYING3. #include #include #include +#ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE +#include +#include +#endif #endif #include "system.h" I
Re: [Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the instruction that moves pc to low register
On 08/12/14 08:24, Terry Guo wrote: > Hi there, > > When compile below simple code: > > terguo01@terry-pc01:mtpcs-frame$ cat test.c > int main(void) > { > return 0; > } > > I got ICE with option -mtpcs-leaf-frame (no error if remove this option). > > terguo01@terry-pc01:mtpcs-frame$ > /work/terguo01/tools/gcc-arm-none-eabi-5_0-2014q4/bin/arm-none-eabi-gcc > -mtpcs-leaf-frame test.c -c -mcpu=cortex-m0plus -mthumb -da > test.c: In function 'main': > test.c:4:1: error: unrecognizable insn: > } > ^ > (insn 20 19 21 (set (reg:SI 2 r2) > (reg:SI 15 pc)) test.c:2 -1 > (nil)) > test.c:4:1: internal compiler error: in extract_insn, at recog.c:2327 > Please submit a full bug report, > with preprocessed source if appropriate. > See http://gcc.gnu.org/bugs.html\ for instructions. > > This RTL is generated in function thumb1_expand_prologue. The expected insn > pattern is thumb1_movsi_insn in thumb1.md. And instruction like "mov r2, pc" > is a legal instruction. Because gcc returns NO_REG for PC register, so no > valid pattern to match instruction that move pc to low register. This patch > intends to add a new insn pattern to legalize such thing. > > Tested with GCC regression test. No regression. Is it OK to trunk? > > BR, > Terry > > 2014-12-08 Terry Guo terry@arm.com > > * config/arm/predicates.md (pc_register): New to match PC register. > * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern. > > gcc/testsuite/ChangeLog: > 2014-12-08 Terry Guo terry@arm.com > > * gcc.target/arm/thumb1-mov-pc.c: New test. > > > thumb1-move-pc-v1.txt > > > diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md > index 032808c..c5ef5ed 100644 > --- a/gcc/config/arm/predicates.md > +++ b/gcc/config/arm/predicates.md > @@ -361,6 +361,10 @@ >(and (match_code "smin,smax,umin,umax") > (match_test "mode == GET_MODE (op)"))) > > +(define_special_predicate "pc_register" > + (and (match_code "reg") > + (match_test "REGNO (op) == PC_REGNUM"))) > + > (define_special_predicate "cc_register" >(and (match_code "reg") > (and (match_test "REGNO (op) == CC_REGNUM") > diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md > index ddedc39..8e6057c 100644 > --- a/gcc/config/arm/thumb1.md > +++ b/gcc/config/arm/thumb1.md > @@ -1780,6 +1780,16 @@ >" > ) > > +(define_insn "*thumb1_movpc_insn" > + [(set (match_operand:SI 0 "low_register_operand") This needs constraints. > +(match_operand:SI 1 "pc_register"))] > + "TARGET_THUMB1" > + "mov\\t%0, pc" > + [(set_attr "length" "2") > + (set_attr "conds" "nocond") > + (set_attr "type" "mov_reg")] > +) > + > ;; NB never uses BX. > (define_insn "*thumb1_tablejump" >[(set (pc) (match_operand:SI 0 "register_operand" "l*r")) > diff --git a/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c > b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c > new file mode 100644 > index 000..9f94131 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c > @@ -0,0 +1,7 @@ > +/* { dg-options "-mtpcs-leaf-frame -O2" } */ > +/* { dg-skip-if "" { ! { arm_thumb1 } } } */ > +int > +main () > +{ > + return 0; > +} >
Re: [PATCH][wwwdocs] Fix typo in changes.html and minor rewording
On 08/12/14 11:26, Ramana Radhakrishnan wrote: On Mon, Dec 8, 2014 at 11:24 AM, Bernd Edlinger wrote: Hi Kyrill, Hi all, As the subject says, this just fixes a typo in the fprofile-generate option name and rewords the text in the next sentence a bit. Ok to commit? Thanks, Kyrill I think this kind of change does not really need approval. But, maybe you should also look at fixing the spelling of "sped": s/sped/speed/ sped is past tense and past participle for speed. The usage is correct here. Thank you both for taking a look. I did consider the typo fix to be obvious, but was on the fence about the rewording part. I'll commit this later today unless someone objects. Kyrill Ramana Bernd. --- htdocs/gcc-5/changes.html 8 Dec 2014 03:17:31 - 1.48 +++ htdocs/gcc-5/changes.html 8 Dec 2014 10:47:05 - @@ -41,8 +41,8 @@ inline-insns-single limits for hot calls. IPA reference pass was significantly sped up making it feasible to enable -fipa-reference with
Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype
On Thu, 4 Dec 2014, Joseph Myers wrote: > On Thu, 4 Dec 2014, Richard Biener wrote: > > > OTOH this also means the user cannot provide a conforming > > implementation on his own and get that used by GCC without editing > > system headers or including a header with -isystem or similar > > tricks. > > Well - you could have a pragma / attribute for that purpose (declaring > "this program is providing a version of function X that has the semantics > GCC expects for function X", so GCC can both generate and optimize calls). > Such a pragma / attribute could also override targetm.libc_has_function > (for the case of the user providing their own definition of something > missing from their system's standard libraries). A related case would be > declaring somehow "I will be linking in libm, even though this translation > unit doesn't appear to be using libm functions, so calls to libm functions > can be implicitly generated", if GCC were made to avoid introducing uses > of libm. (Again, this would be a matter of providing a cleaner interface > rather than something that currently can't be expressed at all - the > proposed definition of what it means to use libm explicitly implies that > "if (0) (void) sqrt (0);" says that libm is being used.) I tried to come up with a patch that adds extra checks but the existing way of having 'implicit' vs. 'explicit' only available as a flag rather than making the explicitely declared builtin decl available makes it harder than necessary. I also run into the issue that we remove all cgraph edges during GIMPLE optimizations thus embedding the "is there already a use of the builtin" in the new abstraction doesn't fly. This means the frontend has to provide a "used" flag to the middle-end as well as a "declared" flag. For providing a declared flag there is already the place where we special-case STPCPY, for providing a "used" flag I don't see any obvious place. I'm not pushing this further for stage3, but for stage1 I'd like to eventually address this by splitting up builtin_info_type's 'implicit_p' into a flags array providing implicit_p, declared_p, used_p and maybe declared_in_system_header_p. Would you be willing to fill in the gap computing "used_p" in the C frontend? My non-working abstraction for middle-end folders looks like /* Return the decl for the builtin function FNCODE or NULL_TREE if it cannot be emitted by GCC. */ tree builtin_decl (enum built_in_function fncode) { tree decl = builtin_decl_implicit (fncode); if (decl) return decl; decl = builtin_decl_explicit (fncode); if (!decl) return NULL_TREE; /* We cannot use a builtin that has not been declared explicitely. */ if (!(builtin_info.flags[fncode] & bif_declared_p)) return NULL_TREE; /* We can use a builtin that has been declared explicitely only if the program contains a reference to it already. */ cgraph_node *node = cgraph_node::get_for_asmname (DECL_ASSEMBLER_NAME (decl)); if (!node || !node->callers) return NULL_TREE; return decl; } which currently fails because of us having removing cgraph edges at the points the function is called. Thus the cgraph caller check would be replaced by a check for 'used_p'. We'd still get __builtin_exp10 emitted instead of the user-declared "exp10" variant (we don't have access to that decl). Without the 'used_p' flag work the rest would be equivalent to setting implicit_p for explicit declarations we see (the originally proposed patch which follows the stpcpy example). Thanks, Richard.
Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype
On Mon, Dec 08, 2014 at 01:24:12PM +0100, Richard Biener wrote: > I'm not pushing this further for stage3, but for stage1 I'd like > to eventually address this by splitting up builtin_info_type's > 'implicit_p' into a flags array providing implicit_p, declared_p, > used_p and maybe declared_in_system_header_p. Would you be > willing to fill in the gap computing "used_p" in the C frontend? The used_p thing might be problematic, I'd expect that several packages use libm functions somewhere in dead code or when it is folded into a constant and don't link with -lm, if those dead or optimized away uses would be counted as uses nevertheless, then if optimizers create new libm references because of those, I'd be afraid such programs wouldn't link anymore. Jakub
Re: FW: [PATCH] Fix PR ipa/64049
On Mon, Dec 8, 2014 at 11:57 AM, Bernd Edlinger wrote: > > >> >> Hello, >> >> >> this surprisingly simple patch fixes PR ipa/64049. The root cause seems to >> be >> that in this test case we try to devirtualize a method call on a return >> value. >> >> >> Boot-Strapped and regression-tested on X86_64-linux-gnu. >> OK for trunk? I think positive tests, in this case TREE_CODE (...) == VAR_DECL are better. Ok with that change. Thanks, Richard. >> >> Thanks >> Bernd. >> > > Again with changelog. Sorry. >
Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype
On Mon, 8 Dec 2014, Jakub Jelinek wrote: > On Mon, Dec 08, 2014 at 01:24:12PM +0100, Richard Biener wrote: > > I'm not pushing this further for stage3, but for stage1 I'd like > > to eventually address this by splitting up builtin_info_type's > > 'implicit_p' into a flags array providing implicit_p, declared_p, > > used_p and maybe declared_in_system_header_p. Would you be > > willing to fill in the gap computing "used_p" in the C frontend? > > The used_p thing might be problematic, I'd expect that several packages > use libm functions somewhere in dead code or when it is folded into > a constant and don't link with -lm, if those dead or optimized away > uses would be counted as uses nevertheless, then if optimizers create new > libm references because of those, I'd be afraid such programs wouldn't link > anymore. Same applies to your STPCPY special-casing, even without introducing a use. The alternative is to decide "used" in the middle-end at one point, for example at the end of all_lowering_passes where hopefully we have constant folded and removed dead code enough. We can also compute an overall "uses libm" flag to fix the testcase I reported (of course we'd like to re-compute that at LTO time). Do you think that's better? It's of course less well-defined what is a "use" of exp10 then (as opposed to what Joseph specified with a reference outside of sizeof() and similar contexts). Richard.
Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype
On Mon, Dec 08, 2014 at 01:54:21PM +0100, Richard Biener wrote: > On Mon, 8 Dec 2014, Jakub Jelinek wrote: > > > On Mon, Dec 08, 2014 at 01:24:12PM +0100, Richard Biener wrote: > > > I'm not pushing this further for stage3, but for stage1 I'd like > > > to eventually address this by splitting up builtin_info_type's > > > 'implicit_p' into a flags array providing implicit_p, declared_p, > > > used_p and maybe declared_in_system_header_p. Would you be > > > willing to fill in the gap computing "used_p" in the C frontend? > > > > The used_p thing might be problematic, I'd expect that several packages > > use libm functions somewhere in dead code or when it is folded into > > a constant and don't link with -lm, if those dead or optimized away > > uses would be counted as uses nevertheless, then if optimizers create new > > libm references because of those, I'd be afraid such programs wouldn't link > > anymore. > > Same applies to your STPCPY special-casing, even without introducing > a use. Well, the important difference there is that stpcpy is in libc, not libm, and you get the former by default usually (-nostdlib is very rare). Yes, supposedly overall uses libc, uses libm flags would be reasonable. > The alternative is to decide "used" in the middle-end at one point, > for example at the end of all_lowering_passes where hopefully > we have constant folded and removed dead code enough. We can also > compute an overall "uses libm" flag to fix the testcase I reported > (of course we'd like to re-compute that at LTO time). > > Do you think that's better? It's of course less well-defined what > is a "use" of exp10 then (as opposed to what Joseph specified > with a reference outside of sizeof() and similar contexts). Jakub
Re: [PATCH] PR other/63613: Add fixincludes for dejagnu.h
Jeff Law writes: > On 12/04/14 15:42, Rainer Orth wrote: >> David Malcolm writes: >> >>> assumed -fgnu89-inline until a recent upstream fix; >>> see http://lists.gnu.org/archive/html/dejagnu/2014-10/msg00011.html >>> >>> Remove the workaround from jit.exp that used -fgnu89-inline >>> in favor of a fixincludes to dejagnu.h that applies the upstream fix >>> to a local copy. >>> >>> This should make it easier to support C++ testcases from jit.exp. >> >> I wonder how this would work if dejagnu.h doesn't live in a system >> include dir (e.g. a self-compiled version)? fixincludes won't touch >> those AFAIU. The previous version with -fgnu89-inline would still work >> in that case provided dejagnu.h is found at all. > Presumably in that case the answer is upgrade dejagnu? :-) I've two problems with this: * There's not yet a DejaGnu release available with the fix and I've no idea if there are any planned any time soon. Not everyone is comfortable with random git (or whatever) snapshots. * I don't consider this a critical issue that cannot work without current releases. We're already working around several upstream DejaGnu issues in our codebase, and I don't consider this particular one important enough to require everyone to upgrade to a not-a-release version. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Fix libgomp crash without TLS (PR42616)
Hi guys, Could you please take a look at this issue? This fix is still urgent for Android. 2014-12-01 18:25 GMT+03:00 Varvara Rainchik : > Hi Jakub, > > Do you think this patch is ok for upstream: > > 2014-12-01 Varvara Rainchik > > * libgomp/libgomp.h: Eliminate case when HAVE_TLS is not defined: > always use tls emulation. > * libgomp/team.c: Likewise. > > -- > diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h > index a1482cc..a659ebc 100644 > --- a/libgomp/libgomp.h > +++ b/libgomp/libgomp.h > @@ -471,19 +471,11 @@ enum gomp_cancel_kind > > /* ... and here is that TLS data. */ > > -#ifdef HAVE_TLS > extern __thread struct gomp_thread gomp_tls_data; > static inline struct gomp_thread *gomp_thread (void) > { >return &gomp_tls_data; > } > -#else > -extern pthread_key_t gomp_tls_key; > -static inline struct gomp_thread *gomp_thread (void) > -{ > - return pthread_getspecific (gomp_tls_key); > -} > -#endif > > extern struct gomp_task_icv *gomp_new_icv (void); > > diff --git a/libgomp/team.c b/libgomp/team.c > index e6a6d8f..2e8dc47 100644 > --- a/libgomp/team.c > +++ b/libgomp/team.c > @@ -37,12 +37,7 @@ pthread_key_t gomp_thread_destructor; > > > /* This is the libgomp per-thread data structure. */ > -#ifdef HAVE_TLS > __thread struct gomp_thread gomp_tls_data; > -#else > -pthread_key_t gomp_tls_key; > -#endif > - > > /* This structure is used to communicate across pthread_create. */ > > @@ -70,13 +65,7 @@ gomp_thread_start (void *xdata) >void (*local_fn) (void *); >void *local_data; > > -#ifdef HAVE_TLS >thr = &gomp_tls_data; > -#else > - struct gomp_thread local_thr; > - thr = &local_thr; > - pthread_setspecific (gomp_tls_key, thr); > -#endif >gomp_sem_init (&thr->release, 0); > >/* Extract what we need from data. */ > @@ -916,13 +905,6 @@ gomp_team_end (void) > static void __attribute__((constructor)) > initialize_team (void) > { > -#ifndef HAVE_TLS > - static struct gomp_thread initial_thread_tls_data; > - > - pthread_key_create (&gomp_tls_key, NULL); > - pthread_setspecific (gomp_tls_key, &initial_thread_tls_data); > -#endif > - >if (pthread_key_create (&gomp_thread_destructor, gomp_free_thread) != 0) > gomp_fatal ("could not create thread pool destructor."); > } > > Or should I add some extra checks in config/tls.m4? As far as I > understand you have mentioned case when both native tls and tls > emulation are not supported. So, is it sufficient to check that > HAVE_TLS and USE_EMUTLS are not defined to detect this case? > > 2014-11-10 16:15 GMT+03:00 Varvara Rainchik : >> *Ping* >> >> 2014-10-13 14:48 GMT+04:00 Varvara Rainchik : Now, I wonder on which OS and why does config/tls.m4 CHECK_GCC_TLS actually fail? Can you figure that out? >>> >>> On Android check passes with --disable-tls (standard while building >>> gcc for Android as TLS is not supported in bionic) and fails with >>> --enable-tls (i686-linux-android/libgomp/conftest.c:32: undefined >>> reference to `___tls_get_addr'). So, HAVE_TLS is not defined in both >>> cases. >>> If we get rid of HAVE_TLS code altogether, we might lose support of some very old OSes, e.g. some Linux distros with a recent gcc and binutils (so that emutls isn't used), but very old glibc (that doesn't support TLS or supports it incorrectly, think of pre-2002 glibc). So, if we get rid of !HAVE_TLS code in libgomp, it would be nice if config/tls.m4 detected it properly and we'd just fail at configure time. >>> >>> How can we check this in config/tls.m4? Can we just combine tests on >>> TLS and emutls? E.g. check whether HAVE_TLS and USE_EMUTLS are both >>> defined. >>> And if we don't, just make sure that on Android, Darwin and/or M$Win (or whatever other OS you had in mind which does support pthreads, but doesn't support native TLS) find out why HAVE_AS_TLS is not defined (guess config.log should explain that). >>> >>> HAVE_AS_TLS is also not defined for Android as it depends on --enable-tls.
Re: Ping with testcase: [PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR
On 3 December 2014 at 10:30, Alan Lawrence wrote: > On Wed, Nov 26, 2014 at 04:35:50PM +, James Greenhalgh wrote: >> Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these >> intrinsics? There should be no difference between the mid-end definition >> and the intrinsic definition of their behaviour. > > Good point. Done. > >> I also note that the integer forms of these now end up as an "abs" RTL >> expression - can we guarantee that preserves the intrinsics behaviour and >> no RTL-folder will come along and mis-optimize? Documentation is vague >> on this point. > > I don't think we can guarantee that indefinitely, but it doesn't seem anyone > has implemented such a rule *at present*. And we'd lose e.g. constant > folding, if we switched to using an AArch64-specific UNSPEC. If anyone does > add such a rule, I'd have some hope that the testcase will catch it... > >> I'm also not convinced by the SIMD_ARG_SCRATCH foo you add. Looking at >> the aarch64.md:absdi2 pattern I can't see why we need that scratch at >> all. It seems we could get away with marking operand 0 early-clobber and >> using it as our scratch register. Then we could drop all the extra >> infrastructure from this patch. > > Hah, good point. Ok, I attach a revised patch, updating the pattern as you > suggest and omitting the infrastructure changes. (I'm leaving the now-unused > qualifier_internal as it stands, as there are arguments both for removing it > and "fixing" it as per previous patch, but I don't think we should hold this > up in the case that we want to have that discussion!) > > Cross-tested check-gcc on aarch64-none-elf, including previously-posted test > of vabs_s8. > > Ok for trunk (with previously-posted testcase) ? > --Alan > > gcc/ChangeLog: > > * config/aarch64/aarch64.md (absdi2): Remove scratch operand by > earlyclobbering result operand. > > * config/aarch64/aarch64-builtins.c (aarch64_types_unop_qualifiers): > Remove final qualifier_internal. > (aarch64_fold_builtin): Stop folding abs builtins, except on floats. OK, and the test case too /Marcus
Re: Fix libgomp crash without TLS (PR42616)
On Mon, Dec 08, 2014 at 04:30:46PM +0300, Varvara Rainchik wrote: > Hi guys, > > Could you please take a look at this issue? This fix is still urgent > for Android. I'm afraid it will break those targets where emutls is not on, but the C library doesn't support TLS. I think it is acceptable not to care about #pragma omp from different pthread_create created threads in that case, but stopping support completely might be too early. So, can you instead arrange for HAVE_TLS to be defined if emutls is supported (check for that during configure)? Jakub
[PATCH 01/10] rs6000: Clobber XER[CA] in all user asm statements
A lot of old user code clobbers the carry bit without telling the compiler about it. This used to just work, because the compiler never used the bit outside of a single RTL instruction. But that will change. Let's clobber the carry bit in every asm; this isn't very expensive. 2014-12-08 Segher Boessenkool gcc/ PR target/64180 * config/rs6000/rs6000.c (TARGET_MD_ASM_CLOBBERS): Define. (rs6000_md_asm_clobbers): New function. --- gcc/config/rs6000/rs6000.c | 16 1 file changed, 16 insertions(+) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index cdb9de9..e58fc81 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -1556,6 +1556,9 @@ static const struct attribute_spec rs6000_attribute_table[] = #undef TARGET_ASM_LOOP_ALIGN_MAX_SKIP #define TARGET_ASM_LOOP_ALIGN_MAX_SKIP rs6000_loop_align_max_skip +#undef TARGET_MD_ASM_CLOBBERS +#define TARGET_MD_ASM_CLOBBERS rs6000_md_asm_clobbers + #undef TARGET_OPTION_OVERRIDE #define TARGET_OPTION_OVERRIDE rs6000_option_override @@ -3146,6 +3149,19 @@ rs6000_builtin_mask_calculate (void) | ((TARGET_LONG_DOUBLE_128) ? RS6000_BTM_LDBL128 : 0)); } +/* Implement TARGET_MD_ASM_CLOBBERS. All asm statements are considered + to clobber the XER[CA] bit because clobbering that bit without telling + the compiler worked just fine with versions of GCC before GCC 5, and + breaking a lot of older code in ways that are hard to track down is + not such a great idea. */ + +static tree +rs6000_md_asm_clobbers (tree, tree, tree clobbers) +{ + tree s = build_string (strlen (reg_names[CA_REGNO]), reg_names[CA_REGNO]); + return tree_cons (NULL_TREE, s, clobbers); +} + /* Override command line options. Mostly we process the processor type and sometimes adjust other TARGET_ options. */ -- 1.8.1.4
[PATCH 03/10] rs6000: Do not use addic in the ctr* splitters
These splitters are only used when for some reason we generated a bdnz loop but we cannot actually use one (CTR is already used for something else, for example). This shouldn't happen often, so making the split sequence one insn longer isn't a big deal. 2014-12-08 Segher Boessenkool gcc/ PR target/64180 * config/rs6000/rs6000.md (*ctr_internal1, *ctr_internal2, *ctr_internal5, *ctr_internal6): Change "r" alternatives to "b". Increase length. (splitters for these): Split to cmp+addi instead of addic. --- gcc/config/rs6000/rs6000.md | 42 -- 1 file changed, 20 insertions(+), 22 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index abf20c3..6f4bafb 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -13496,7 +13496,7 @@ (define_expand "ctr" (define_insn "*ctr_internal1" [(set (pc) - (if_then_else (ne (match_operand:P 1 "register_operand" "c,*r,*r,*r") + (if_then_else (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b") (const_int 1)) (label_ref (match_operand 0 "" "")) (pc))) @@ -13516,11 +13516,11 @@ (define_insn "*ctr_internal1" return \"bdz $+8\;b %l0\"; }" [(set_attr "type" "branch") - (set_attr "length" "*,12,16,16")]) + (set_attr "length" "*,16,20,20")]) (define_insn "*ctr_internal2" [(set (pc) - (if_then_else (ne (match_operand:P 1 "register_operand" "c,*r,*r,*r") + (if_then_else (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b") (const_int 1)) (pc) (label_ref (match_operand 0 "" "" @@ -13540,13 +13540,13 @@ (define_insn "*ctr_internal2" return \"bdnz $+8\;b %l0\"; }" [(set_attr "type" "branch") - (set_attr "length" "*,12,16,16")]) + (set_attr "length" "*,16,20,20")]) ;; Similar but use EQ (define_insn "*ctr_internal5" [(set (pc) - (if_then_else (eq (match_operand:P 1 "register_operand" "c,*r,*r,*r") + (if_then_else (eq (match_operand:P 1 "register_operand" "c,*b,*b,*b") (const_int 1)) (label_ref (match_operand 0 "" "")) (pc))) @@ -13566,11 +13566,11 @@ (define_insn "*ctr_internal5" return \"bdnz $+8\;b %l0\"; }" [(set_attr "type" "branch") - (set_attr "length" "*,12,16,16")]) + (set_attr "length" "*,16,20,20")]) (define_insn "*ctr_internal6" [(set (pc) - (if_then_else (eq (match_operand:P 1 "register_operand" "c,*r,*r,*r") + (if_then_else (eq (match_operand:P 1 "register_operand" "c,*b,*b,*b") (const_int 1)) (pc) (label_ref (match_operand 0 "" "" @@ -13590,7 +13590,7 @@ (define_insn "*ctr_internal6" return \"bdz $+8\;b %l0\"; }" [(set_attr "type" "branch") - (set_attr "length" "*,12,16,16")]) + (set_attr "length" "*,16,20,20")]) ;; Now the splitters if we could not allocate the CTR register @@ -13606,13 +13606,12 @@ (define_split (clobber (match_scratch:CC 3 "")) (clobber (match_scratch:P 4 ""))] "reload_completed" - [(parallel [(set (match_dup 3) - (compare:CC (plus:P (match_dup 1) - (const_int -1)) - (const_int 0))) - (set (match_dup 0) - (plus:P (match_dup 1) - (const_int -1)))]) + [(set (match_dup 3) + (compare:CC (match_dup 1) + (const_int 1))) + (set (match_dup 0) + (plus:P (match_dup 1) + (const_int -1))) (set (pc) (if_then_else (match_dup 7) (match_dup 5) (match_dup 6)))] @@ -13632,13 +13631,12 @@ (define_split (clobber (match_scratch:CC 3 "")) (clobber (match_scratch:P 4 ""))] "reload_completed && ! gpc_reg_operand (operands[0], SImode)" - [(parallel [(set (match_dup 3) - (compare:CC (plus:P (match_dup 1) - (const_int -1)) - (const_int 0))) - (set (match_dup 4) - (plus:P (match_dup 1) - (const_int -1)))]) + [(set (match_dup 3) + (compare:CC (match_dup 1) + (const_int 1))) + (set (match_dup 4) + (plus:P (match_dup 1) + (const_int -1))) (set (match_dup 0) (match_dup 4)) (set (pc) (if_then_else (match_dup 7) -- 1.8.1.4
[PATCH 00/10] rs6000: Carry bit improvements
This series makes the PowerPC port deal with the {add,subf}{c,ic,e,ze,me} instructions directly, not as bigger primitives (which used to be emitted as multiple machine instructions). It does this by expanding multi-precision add/sub/neg as well as the unsigned wordlength sCOND to separate machine insns, which are then simplified where possible by the combine pass and friends. For eq/ne this series splits after combine. This is PR target/64180. All patches bootstrapped and tested on powerpc64-linux, runtestflags -m32,-m64,-m32/-mpowerpc64,-m64/-mlra; also a lot of tiny tests that I'll submit separately. Comments? Is this okay for mainline? Segher Segher Boessenkool (10): rs6000: Clobber XER[CA] in all user asm statements rs6000: Remove addic alternative from various lo_sum patterns rs6000: Do not use addic in the ctr* splitters rs6000: Remove addic from the normal add pattern rs6000: Merge and split add and addic rs6000: New add/subf carry insns rs6000: The big carry insns addition rs6000: compare->two for rld*;rld*. rs6000: Remove the (now unused) insn type "compare" rs6000: Get rid of the weird decimal thing in add gcc/config/rs6000/40x.md |2 +- gcc/config/rs6000/440.md |2 +- gcc/config/rs6000/476.md |2 +- gcc/config/rs6000/601.md |2 +- gcc/config/rs6000/603.md |2 +- gcc/config/rs6000/6xx.md |2 +- gcc/config/rs6000/7450.md |2 +- gcc/config/rs6000/7xx.md |2 +- gcc/config/rs6000/8540.md |2 +- gcc/config/rs6000/cell.md | 10 +- gcc/config/rs6000/darwin.md | 16 +- gcc/config/rs6000/e300c2c3.md |2 +- gcc/config/rs6000/e500mc.md |2 +- gcc/config/rs6000/e500mc64.md |2 +- gcc/config/rs6000/e5500.md|2 +- gcc/config/rs6000/e6500.md|2 +- gcc/config/rs6000/mpc.md |2 +- gcc/config/rs6000/power4.md |5 +- gcc/config/rs6000/power5.md |5 +- gcc/config/rs6000/power6.md |5 +- gcc/config/rs6000/power7.md |5 +- gcc/config/rs6000/power8.md |8 +- gcc/config/rs6000/predicates.md | 14 + gcc/config/rs6000/rs6000-protos.h |1 + gcc/config/rs6000/rs6000.c| 49 +- gcc/config/rs6000/rs6000.md | 1909 - gcc/config/rs6000/rs64.md |2 +- gcc/config/rs6000/spe.md | 18 +- 28 files changed, 733 insertions(+), 1344 deletions(-) -- 1.8.1.4
[PATCH 02/10] rs6000: Remove addic alternative from various lo_sum patterns
This means we do not allow GPR0 as base address of those anymore. The alternative is to not allow the carry bit to be live over any lo_sum, which is more expensive. 2014-12-08 Segher Boessenkool gcc/ PR target/64180 * config/rs6000/darwin.md (macho_low_si): Remove "r" alternative. (macho_low_di): Ditto. * config/rs6000/rs6000.md (*largetoc_low): Ditto. (tocref): Ditto. (elf_low): Ditto. * config/rs6000/spe.md (mov_si_e500_subreg0_elf_low_be): Ditto. (mov_si_e500_subreg0_elf_low_le): Ditto. (mov_si_e500_subreg4_elf_low_be): Ditto. Reformat condition. (mov_si_e500_subreg4_elf_low_le): Ditto. --- gcc/config/rs6000/darwin.md | 16 ++-- gcc/config/rs6000/rs6000.md | 18 +++--- gcc/config/rs6000/spe.md| 18 +- 3 files changed, 22 insertions(+), 30 deletions(-) diff --git a/gcc/config/rs6000/darwin.md b/gcc/config/rs6000/darwin.md index 8b816b7..764f847 100644 --- a/gcc/config/rs6000/darwin.md +++ b/gcc/config/rs6000/darwin.md @@ -213,22 +213,18 @@ (define_expand "macho_low" }) (define_insn "macho_low_si" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r") - (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r") + [(set (match_operand:SI 0 "gpc_reg_operand" "=r") + (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b") (match_operand 2 "" "")))] "TARGET_MACHO && ! TARGET_64BIT" - "@ -la %0,lo16(%2)(%1) -addic %0,%1,lo16(%2)") + "la %0,lo16(%2)(%1)") (define_insn "macho_low_di" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r") - (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b,!*r") + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b") (match_operand 2 "" "")))] "TARGET_MACHO && TARGET_64BIT" - "@ -la %0,lo16(%2)(%1) -addic %0,%1,lo16(%2)") + "la %0,lo16(%2)(%1)") (define_split [(set (mem:V4SI (plus:DI (match_operand:DI 0 "gpc_reg_operand" "") diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index f3b5aae..abf20c3 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -10712,13 +10712,11 @@ (define_insn "*largetoc_high_plus_aix" "addis %0,%1+%3@u(%2)") (define_insn "*largetoc_low" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r") -(lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b,!*r") + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") +(lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b") (match_operand:DI 2 "" "")))] "TARGET_ELF && TARGET_CMODEL != CMODEL_SMALL" - "@ -addi %0,%1,%2@l -addic %0,%1,%2@l") + "addi %0,%1,%2@l") (define_insn "*largetoc_low_aix" [(set (match_operand:P 0 "gpc_reg_operand" "=r") @@ -10728,7 +10726,7 @@ (define_insn "*largetoc_low_aix" "la %0,%2@l(%1)") (define_insn_and_split "*tocref" - [(set (match_operand:P 0 "gpc_reg_operand" "=b*r") + [(set (match_operand:P 0 "gpc_reg_operand" "=b") (match_operand:P 1 "small_toc_ref" "R"))] "TARGET_TOC" "la %0,%a1" @@ -10747,13 +10745,11 @@ (define_insn "elf_high" "lis %0,%1@ha") (define_insn "elf_low" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r") - (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r") + [(set (match_operand:SI 0 "gpc_reg_operand" "=r") + (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b") (match_operand 2 "" "")))] "TARGET_ELF && ! TARGET_64BIT" - "@ -la %0,%2@l(%1) -addic %0,%1,%K2") + "la %0,%2@l(%1)") ;; Call and call_value insns (define_expand "call" diff --git a/gcc/config/rs6000/spe.md b/gcc/config/rs6000/spe.md index 8eec7b7..07c293c 100644 --- a/gcc/config/rs6000/spe.md +++ b/gcc/config/rs6000/spe.md @@ -2519,7 +2519,7 @@ (define_insn "*mov_si_e500_subreg0_le" (define_insn_and_split "*mov_si_e500_subreg0_elf_low_be" [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 0) - (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r") + (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b") (match_operand 2 "" "")))] "WORDS_BIG_ENDIAN && (((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode)) @@ -2538,13 +2538,13 @@ (define_insn_and_split "*mov_si_e500_subreg0_elf_low_be" (define_insn "*mov_si_e500_subreg0_elf_low_le" [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 0) - (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r") + (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b") (match_operand 2 "" "")))] "!WORDS_BIG_ENDIAN && (((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode)) || (TARGET_SPE && mode != DFmode && mode != TFmode)) && TARGET_ELF && !TARGET_64BIT)" - "addic %0,%1,%K2") + "addi %0,%1,%K2") ;; ??? Could use evstwwe for memory stores in some cases, dep
Re: [PATCH 01/10] rs6000: Clobber XER[CA] in all user asm statements
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool wrote: > A lot of old user code clobbers the carry bit without telling the compiler > about it. This used to just work, because the compiler never used the bit > outside of a single RTL instruction. But that will change. Let's clobber > the carry bit in every asm; this isn't very expensive. > > > 2014-12-08 Segher Boessenkool > > gcc/ > PR target/64180 > * config/rs6000/rs6000.c (TARGET_MD_ASM_CLOBBERS): Define. > (rs6000_md_asm_clobbers): New function. Okay. I don't know of another option that preserves backward compatibility. If RTH looks at this series of patches, he may have a better suggestion. Thanks, David
Re: [PATCH 02/10] rs6000: Remove addic alternative from various lo_sum patterns
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool wrote: > This means we do not allow GPR0 as base address of those anymore. The > alternative is to not allow the carry bit to be live over any lo_sum, > which is more expensive. > > > 2014-12-08 Segher Boessenkool > > gcc/ > PR target/64180 > * config/rs6000/darwin.md (macho_low_si): Remove "r" alternative. > (macho_low_di): Ditto. > * config/rs6000/rs6000.md (*largetoc_low): Ditto. > (tocref): Ditto. > (elf_low): Ditto. > * config/rs6000/spe.md (mov_si_e500_subreg0_elf_low_be): Ditto. > (mov_si_e500_subreg0_elf_low_le): Ditto. > (mov_si_e500_subreg4_elf_low_be): Ditto. Reformat condition. > (mov_si_e500_subreg4_elf_low_le): Ditto. Okay. Thanks, David
Re: [PATCH 03/10] rs6000: Do not use addic in the ctr* splitters
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool wrote: > These splitters are only used when for some reason we generated a bdnz > loop but we cannot actually use one (CTR is already used for something > else, for example). This shouldn't happen often, so making the split > sequence one insn longer isn't a big deal. > > > 2014-12-08 Segher Boessenkool > > gcc/ > PR target/64180 > * config/rs6000/rs6000.md (*ctr_internal1, *ctr_internal2, > *ctr_internal5, *ctr_internal6): Change "r" alternatives > to "b". Increase length. > (splitters for these): Split to cmp+addi instead of addic. Okay. Thanks, David
[PATCH] DWARFv5 Emit DW_TAG_atomic_type for C11 _Atomic.
This implements the DW_TAG_atomic_type for C11 _Atomic proposal as adopted in the latest DWARF5 draft. http://dwarfstd.org/ShowIssue.php?issue=131112.1 This is much simpler than my previous patch. Thanks to Andreas cleanups for PR 63300 adding new qualifier support to dwarf2out.c is almost trivial. Now that we have experimental -gdwarf-5 support we can emit this new TAG just when the user explictly asks for it. I do have a corresponding GDB patch to take advantage of the new information. gcc/ChangeLog PR debug/60782 * dwarf2out.c (modified_type_die): Handle TYPE_QUAL_ATOMIC. gcc/testsuite/ChangeLog PR debug/60782 * gcc.dg/debug/dwarf2/atomic.c: New test. * gcc.dg/debug/dwarf2/stacked-qualified-types-3.c: Likewise. include/ChangeLog PR debug/60782 * dwarf2.def: Add DWARFv5 DW_TAG_atomic_type. --- gcc/ChangeLog | 5 gcc/dwarf2out.c| 7 - gcc/testsuite/ChangeLog| 6 gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c | 15 ++ .../debug/dwarf2/stacked-qualified-types-3.c | 34 ++ include/ChangeLog | 5 include/dwarf2.def | 2 ++ 7 files changed, 73 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-3.c diff --git a/gcc/ChangeLog b/gcc/ChangeLog index b2eb950..b2fe45c 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2014-12-08 Mark Wielaard + + PR debug/60782 + * dwarf2out.c (modified_type_die): Handle TYPE_QUAL_ATOMIC. + 2014-12-02 Dmitry Vyukov * asan.c: (asan_finish_file): Use default priority for constructors diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index ca1e3ef..34b327e 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -10551,7 +10551,7 @@ modified_type_die (tree type, int cv_quals, dw_die_ref context_die) dw_die_ref mod_scope; /* Only these cv-qualifiers are currently handled. */ const int cv_qual_mask = (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE - | TYPE_QUAL_RESTRICT); + | TYPE_QUAL_RESTRICT | TYPE_QUAL_ATOMIC); if (code == ERROR_MARK) return NULL; @@ -10564,6 +10564,10 @@ modified_type_die (tree type, int cv_quals, dw_die_ref context_die) if (dwarf_version < 3) cv_quals &= ~TYPE_QUAL_RESTRICT; + /* Likewise for DW_TAG_atomic_type for DWARFv5. */ + if (dwarf_version < 5) +cv_quals &= ~TYPE_QUAL_ATOMIC; + /* See if we already have the appropriately qualified variant of this type. */ qualified_type = get_qualified_type (type, cv_quals); @@ -10625,6 +10629,7 @@ modified_type_die (tree type, int cv_quals, dw_die_ref context_die) struct qual_info { int q; enum dwarf_tag t; }; static const struct qual_info qual_info[] = { + { TYPE_QUAL_ATOMIC, DW_TAG_atomic_type }, { TYPE_QUAL_RESTRICT, DW_TAG_restrict_type }, { TYPE_QUAL_VOLATILE, DW_TAG_volatile_type }, { TYPE_QUAL_CONST, DW_TAG_const_type }, diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index f2e54ef..e983a07 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,9 @@ +2014-12-08 Mark Wielaard + + PR debug/60782 + * gcc.dg/debug/dwarf2/atomic.c: New test. + * gcc.dg/debug/dwarf2/stacked-qualified-types-3.c: Likewise. + 2014-12-02 Uros Bizjak * gcc.target/i386/avx512ifma-vpmaddhuq-2.c: Define AVX512IFMA. diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c b/gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c new file mode 100644 index 000..187e695 --- /dev/null +++ b/gcc/testsuite/gcc.dg/debug/dwarf2/atomic.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-std=c11 -gdwarf-5 -dA" } */ + +struct Atomics +{ + _Atomic(int) counter; + struct Pointer + { +_Atomic volatile char *p; + } p; +}; + +struct Atomics a; + +/* { dg-final { scan-assembler-times "DIE \\(\[^\n\]*\\) DW_TAG_atomic_type" 2 } } */ diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-3.c b/gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-3.c new file mode 100644 index 000..efa3fa0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-3.c @@ -0,0 +1,34 @@ +/* make sure we don't duplicate type qualifiers unneeded. */ +/* { dg-do compile } */ +/* { dg-options "-std=c11 -gdwarf-5 -dA" } */ + +/* This should give us: + - One const type pointing to a char + - One volatile type pointing to a char + - One atomic type pointing to a char + - Either one const type pointing to the volatile type pointing to a char + or one volatile type pointing to the const type pointing to a char. + But not both. + - Either one volatile type
Re: [PATCH] DWARFv5 Emit DW_TAG_atomic_type for C11 _Atomic.
OK. Jason
Re: [PATCH 04/10] rs6000: Remove addic from the normal add pattern
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool wrote: > This means we can no longer add GPR0+imm. Register allocation will have > to use a different register. > > > 2014-12-08 Segher Boessenkool > > gcc/ > PR target/64180 > * config/rs6000/rs6000.md (*add3_internal1): Remove addic > alternative. > > --- > gcc/config/rs6000/rs6000.md | 9 +++-- > 1 file changed, 3 insertions(+), 6 deletions(-) > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 6f4bafb..1647f8b 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -1491,17 +1491,14 @@ (define_expand "add3" > } > }) > > -;; Discourage ai/addic because of carry but provide it in an alternative > -;; allowing register zero as source. > (define_insn "*add3_internal1" > - [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,?r,r") > - (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,r,b") > - (match_operand:GPR 2 "add_operand" "r,I,I,L")))] > + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r") > + (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b") > + (match_operand:GPR 2 "add_operand" "r,I,L")))] >"!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && !DECIMAL_FLOAT_MODE_P > (GET_MODE (operands[1]))" >"@ > add %0,%1,%2 > addi %0,%1,%2 > - addic %0,%1,%2 > addis %0,%1,%v2" >[(set_attr "type" "add")]) > > -- > 1.8.1.4 > Why are you removing the alternative instead of clobbering XER[CA]? Thanks, David
Re: match.pd tweaks for vectors and issues with HONOR_NANS
On Tue, 18 Nov 2014, Richard Biener wrote: I'll try to replace some more TYPE_MODE during stage3... Btw, a convenience would be to be able to write HONOR_NANS (type) thus effectively make HONOR_* inline functions with a machine_mode and a type overload (and the type overload properly looking at element types). Making those functions inline is not easy, because real.h and tree.h don't include each other. Here is a version with the functions not inline. I was tempted to also overload on gcond const* (for the cases that call gimple_cond_lhs) but the arguments were always gimple and not gcond*, so I didn't. Passes bootstrap+testsuite on x86_64-linux-gnu. 2014-12-08 Marc Glisse * real.h (HONOR_NANS): Replace macro with 3 overloaded declarations. * real.c: Include rtl.h and options.h. (HONOR_NANS): Define three overloads. * builtins.c (fold_builtin_classify, fold_builtin_unordered_cmp): Simplify argument of HONOR_NANS. * fold-const.c (combine_comparisons, fold_truth_not_expr, fold_cond_expr_with_comparison, merge_truthop_with_opposite_arm, fold_comparison, fold_binary_loc): Likewise. * ifcvt.c (noce_try_move, noce_try_minmax): Likewise. * ipa-inline-analysis.c (add_clause, set_cond_stmt_execution_predicate): Likewise. * match.pd: Likewise. * rtlanal.c (may_trap_p_1): Likewise. * simplify-rtx.c (simplify_const_relational_operation): Likewise. * tree-if-conv.c (parse_predicate): Likewise. * tree-ssa-ccp.c (valid_lattice_transition): Likewise. * tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise. * tree-ssa-phiopt.c (minmax_replacement, neg_replacement): Likewise. * tree-ssa-reassoc.c (eliminate_using_constants): Likewise. * tree-ssa-tail-merge.c (gimple_equal_p): Likewise. -- Marc GlisseIndex: builtins.c === --- builtins.c (revision 218467) +++ builtins.c (working copy) @@ -9641,34 +9641,34 @@ fold_builtin_classify (location_t loc, t integer_minus_one_node, integer_one_node); tmp = fold_build3_loc (loc, COND_EXPR, integer_type_node, isinf_call, tmp, integer_zero_node); } return tmp; } case BUILT_IN_ISFINITE: - if (!HONOR_NANS (TYPE_MODE (TREE_TYPE (arg))) + if (!HONOR_NANS (arg) && !HONOR_INFINITIES (TYPE_MODE (TREE_TYPE (arg return omit_one_operand_loc (loc, type, integer_one_node, arg); if (TREE_CODE (arg) == REAL_CST) { r = TREE_REAL_CST (arg); return real_isfinite (&r) ? integer_one_node : integer_zero_node; } return NULL_TREE; case BUILT_IN_ISNAN: - if (!HONOR_NANS (TYPE_MODE (TREE_TYPE (arg + if (!HONOR_NANS (arg)) return omit_one_operand_loc (loc, type, integer_zero_node, arg); if (TREE_CODE (arg) == REAL_CST) { r = TREE_REAL_CST (arg); return real_isnan (&r) ? integer_one_node : integer_zero_node; } arg = builtin_save_expr (arg); return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg); @@ -9782,27 +9782,26 @@ fold_builtin_unordered_cmp (location_t l else if (code0 == REAL_TYPE && code1 == INTEGER_TYPE) cmp_type = type0; else if (code0 == INTEGER_TYPE && code1 == REAL_TYPE) cmp_type = type1; arg0 = fold_convert_loc (loc, cmp_type, arg0); arg1 = fold_convert_loc (loc, cmp_type, arg1); if (unordered_code == UNORDERED_EXPR) { - if (!HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0 + if (!HONOR_NANS (arg0)) return omit_two_operands_loc (loc, type, integer_zero_node, arg0, arg1); return fold_build2_loc (loc, UNORDERED_EXPR, type, arg0, arg1); } - code = HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0))) ? unordered_code - : ordered_code; + code = HONOR_NANS (arg0) ? unordered_code : ordered_code; return fold_build1_loc (loc, TRUTH_NOT_EXPR, type, fold_build2_loc (loc, code, type, arg0, arg1)); } /* Fold __builtin_{,s,u}{add,sub,mul}{,l,ll}_overflow, either into normal arithmetics if it can never overflow, or into internal functions that return both result of arithmetics and overflowed boolean flag in a complex integer result, or some other check for overflow. */ static tree Index: fold-const.c === --- fold-const.c(revision 218467) +++ fold-const.c(working copy) @@ -2585,21 +2585,21 @@ compcode_to_comparison (enum comparison_ and RCODE on the identical operands LL_ARG and LR_ARG. Take into account the possibility of trapping if the mode has NaNs, and return NULL_TREE if this makes the transformation invalid. */ tree
[PATCH 10/10] rs6000: Get rid of the weird decimal thing in add
Peter tells me it was an artifact of old versions of the DFP code. The condition can never be false; delete it. 2014-12-08 Segher Boessenkool gcc/ * config/rs6000/rs6000.md (*add3): Remove condition. --- gcc/config/rs6000/rs6000.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index a544e57..6724819 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -1506,7 +1506,7 @@ (define_insn "*add3" [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r") (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b") (match_operand:GPR 2 "add_operand" "r,I,L")))] - "!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && !DECIMAL_FLOAT_MODE_P (GET_MODE (operands[1]))" + "" "@ add %0,%1,%2 addi %0,%1,%2 -- 1.8.1.4
[PATCH 05/10] rs6000: Merge and split add and addic
This splits the addic patterns from the add/addi patterns, and merges the insns and splitters. Also change type "compare" to "add" for "addic." (the dot form). 2014-12-08 Segher Boessenkool gcc/ PR target/64180 * config/rs6000/rs6000.md (*add3_internal1): Rename to "*add3". (*add3_internal2, *add3_internal3, and (their splitters): Delete. (*add3_dot, *add3_dot2): New. (*add3_imm_dot, *add3_imm_dot2): New. --- gcc/config/rs6000/rs6000.md | 119 +++- 1 file changed, 74 insertions(+), 45 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 1647f8b..dcdb7c1 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -1491,7 +1491,7 @@ (define_expand "add3" } }) -(define_insn "*add3_internal1" +(define_insn "*add3" [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r") (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b") (match_operand:GPR 2 "add_operand" "r,I,L")))] @@ -1510,70 +1510,99 @@ (define_insn "addsi3_high" "addis %0,%1,ha16(%2)" [(set_attr "type" "add")]) -(define_insn "*add3_internal2" - [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,?y,?y") - (compare:CC (plus:P (match_operand:P 1 "gpc_reg_operand" "%r,r,r,r") - (match_operand:P 2 "reg_or_short_operand" "r,I,r,I")) +(define_insn_and_split "*add3_dot" + [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y") + (compare:CC (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") + (match_operand:GPR 2 "gpc_reg_operand" "r,r")) (const_int 0))) - (clobber (match_scratch:P 3 "=r,r,r,r"))] - "" + (clobber (match_scratch:GPR 0 "=r,r"))] + "mode == Pmode" "@ - add. %3,%1,%2 - addic. %3,%1,%2 - # + add. %0,%1,%2 #" - [(set_attr "type" "add,compare,compare,compare") + "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)" + [(set (match_dup 0) + (plus:GPR (match_dup 1) +(match_dup 2))) + (set (match_dup 3) + (compare:CC (match_dup 0) + (const_int 0)))] + "" + [(set_attr "type" "add") (set_attr "dot" "yes") - (set_attr "length" "4,4,8,8")]) + (set_attr "length" "4,8")]) -(define_split - [(set (match_operand:CC 0 "cc_reg_not_cr0_operand" "") - (compare:CC (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "") - (match_operand:GPR 2 "reg_or_short_operand" "")) +(define_insn_and_split "*add3_dot2" + [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y") + (compare:CC (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") + (match_operand:GPR 2 "gpc_reg_operand" "r,r")) (const_int 0))) - (clobber (match_scratch:GPR 3 ""))] - "reload_completed" - [(set (match_dup 3) + (set (match_operand:GPR 0 "gpc_reg_operand" "=r,r") (plus:GPR (match_dup 1) -(match_dup 2))) - (set (match_dup 0) - (compare:CC (match_dup 3) + (match_dup 2)))] + "mode == Pmode" + "@ + add. %0,%1,%2 + #" + "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)" + [(set (match_dup 0) + (plus:GPR (match_dup 1) + (match_dup 2))) + (set (match_dup 3) + (compare:CC (match_dup 0) (const_int 0)))] - "") + "" + [(set_attr "type" "add") + (set_attr "dot" "yes") + (set_attr "length" "4,8")]) -(define_insn "*add3_internal3" - [(set (match_operand:CC 3 "cc_reg_operand" "=x,x,?y,?y") - (compare:CC (plus:P (match_operand:P 1 "gpc_reg_operand" "%r,r,r,r") - (match_operand:P 2 "reg_or_short_operand" "r,I,r,I")) +(define_insn_and_split "*add3_imm_dot" + [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y") + (compare:CC (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,r") + (match_operand:GPR 2 "short_cint_operand" "I,I")) (const_int 0))) - (set (match_operand:P 0 "gpc_reg_operand" "=r,r,r,r") - (plus:P (match_dup 1) - (match_dup 2)))] - "" + (clobber (match_scratch:GPR 0 "=r,r")) + (clobber (reg:GPR CA_REGNO))] + "mode == Pmode" "@ - add. %0,%1,%2 addic. %0,%1,%2 - # #" - [(set_attr "type" "add,compare,compare,compare") + "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)" + [(set (match_dup 0) + (plus:GPR (match_dup 1) + (match_dup 2))) + (set (match_dup 3) + (compare:CC (match_dup 0) + (const_int 0)))] + "" + [(set_attr "type" "add") (set_attr "dot" "yes") - (set_attr "length" "4,4,8,8")]) + (set_attr "length" "4,8")]) -(define_split - [(set (match_operand:CC 3 "cc_reg_not_cr0_operand" "") - (compare:CC (plus:P (match_operand:P 1 "gpc_reg_operand" "") - (match_oper
[PATCH 08/10] rs6000: compare->two for rld*;rld*.
These now are the only remaining patterns that have type "compare". And they shouldn't: "two" is a better type for them. 2014-12-08 Segher Boessenkool gcc/ * config/rs6000/rs6000.md (*anddi3_2rld_dot, *anddi3_rld_dot2): Change type from "compare" to "two". --- gcc/config/rs6000/rs6000.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index eb15fff..d0b8bc8 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -7341,7 +7341,7 @@ (define_insn_and_split "*anddi3_2rld_dot" { build_mask64_2_operands (operands[2], &operands[4]); } - [(set_attr "type" "compare") + [(set_attr "type" "two") (set_attr "dot" "yes") (set_attr "length" "8,12")]) @@ -7374,7 +7374,7 @@ (define_insn_and_split "*anddi3_2rld_dot2" { build_mask64_2_operands (operands[2], &operands[4]); } - [(set_attr "type" "compare") + [(set_attr "type" "two") (set_attr "dot" "yes") (set_attr "length" "8,12")]) -- 1.8.1.4
[PATCH 04/10] rs6000: Remove addic from the normal add pattern
This means we can no longer add GPR0+imm. Register allocation will have to use a different register. 2014-12-08 Segher Boessenkool gcc/ PR target/64180 * config/rs6000/rs6000.md (*add3_internal1): Remove addic alternative. --- gcc/config/rs6000/rs6000.md | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 6f4bafb..1647f8b 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -1491,17 +1491,14 @@ (define_expand "add3" } }) -;; Discourage ai/addic because of carry but provide it in an alternative -;; allowing register zero as source. (define_insn "*add3_internal1" - [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,?r,r") - (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,r,b") - (match_operand:GPR 2 "add_operand" "r,I,I,L")))] + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r") + (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b") + (match_operand:GPR 2 "add_operand" "r,I,L")))] "!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && !DECIMAL_FLOAT_MODE_P (GET_MODE (operands[1]))" "@ add %0,%1,%2 addi %0,%1,%2 - addic %0,%1,%2 addis %0,%1,%v2" [(set_attr "type" "add")]) -- 1.8.1.4
[PATCH 06/10] rs6000: New add/subf carry insns
This implements addc, addic, adde, addze, addme, subfc, subfic, subfe, subfze, subfme. Some of those in multiple forms: canonical RTL for the "immediate" insns has four forms; subfX has a special form for subtracting a register from itself. All these new insns get type "add" for now. 2014-12-08 Segher Boessenkool gcc/ PR target/64180 * config/rs6000/predicates.md (adde_operand): New. * config/rs6000/rs6000.md (add3_carry): New. (add3_imm_carry_pos): New. (add3_imm_carry_0): New. (add3_imm_carry_m1): New. (add3_imm_carry_neg): New. (add3_carry_in): New. (add3_carry_in_0): New. (add3_carry_in_m1): New. (subf3_carry): New. (subf3_imm_carry_0): New. (subf3_imm_carry_m1): New. (subf3_carry_in): New. (subf3_carry_in_0): New. (subf3_carry_in_m1): New. (subf3_carry_in_xx): New. --- gcc/config/rs6000/predicates.md | 6 ++ gcc/config/rs6000/rs6000.md | 162 2 files changed, 168 insertions(+) diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index ea230a5..a19cb2f 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -788,6 +788,12 @@ (define_predicate "add_operand" || satisfies_constraint_L (op)") (match_operand 0 "gpc_reg_operand"))) +;; Return 1 if the operand is either a non-special register, or 0, or -1. +(define_predicate "adde_operand" + (if_then_else (match_code "const_int") +(match_test "INTVAL (op) == 0 || INTVAL (op) == -1") +(match_operand 0 "gpc_reg_operand"))) + ;; Return 1 if OP is a constant but not a valid add_operand. (define_predicate "non_add_cint_operand" (and (match_code "const_int") diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index dcdb7c1..e23fadb 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -1634,6 +1634,96 @@ (define_split FAIL; }) + +(define_insn "add3_carry" + [(set (match_operand:P 0 "gpc_reg_operand" "=r") + (plus:P (match_operand:P 1 "gpc_reg_operand" "r") + (match_operand:P 2 "reg_or_short_operand" "rI"))) + (set (reg:P CA_REGNO) + (ltu:P (plus:P (match_dup 1) + (match_dup 2)) + (match_dup 1)))] + "" + "add%I2c %0,%1,%2" + [(set_attr "type" "add")]) + +(define_insn "*add3_imm_carry_pos" + [(set (match_operand:P 0 "gpc_reg_operand" "=r") + (plus:P (match_operand:P 1 "gpc_reg_operand" "r") + (match_operand:P 2 "short_cint_operand" "n"))) + (set (reg:P CA_REGNO) + (geu:P (match_dup 1) + (match_operand:P 3 "const_int_operand" "n")))] + "INTVAL (operands[2]) > 0 + && INTVAL (operands[2]) + INTVAL (operands[3]) == 0" + "addic %0,%1,%2" + [(set_attr "type" "add")]) + +(define_insn "*add3_imm_carry_0" + [(set (match_operand:P 0 "gpc_reg_operand" "=r") + (match_operand:P 1 "gpc_reg_operand" "r")) + (set (reg:P CA_REGNO) + (const_int 0))] + "" + "addic %0,%1,0" + [(set_attr "type" "add")]) + +(define_insn "*add3_imm_carry_m1" + [(set (match_operand:P 0 "gpc_reg_operand" "=r") + (plus:P (match_operand:P 1 "gpc_reg_operand" "r") + (const_int -1))) + (set (reg:P CA_REGNO) + (ne:P (match_dup 1) + (const_int 0)))] + "" + "addic %0,%1,-1" + [(set_attr "type" "add")]) + +(define_insn "*add3_imm_carry_neg" + [(set (match_operand:P 0 "gpc_reg_operand" "=r") + (plus:P (match_operand:P 1 "gpc_reg_operand" "r") + (match_operand:P 2 "short_cint_operand" "n"))) + (set (reg:P CA_REGNO) + (gtu:P (match_dup 1) + (match_operand:P 3 "const_int_operand" "n")))] + "INTVAL (operands[2]) < 0 + && INTVAL (operands[2]) + INTVAL (operands[3]) == -1" + "addic %0,%1,%2" + [(set_attr "type" "add")]) + + +(define_insn "add3_carry_in" + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r") + (plus:GPR (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") + (match_operand:GPR 2 "adde_operand" "r,n")) + (reg:GPR CA_REGNO))) + (clobber (reg:GPR CA_REGNO))] + "" + "@ + adde %0,%1,%2 + add%G2e %0,%1" + [(set_attr "type" "add")]) + +(define_insn "add3_carry_in_0" + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r") + (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") + (reg:GPR CA_REGNO))) + (clobber (reg:GPR CA_REGNO))] + "" + "addze %0,%1" + [(set_attr "type" "add")]) + +(define_insn "add3_carry_in_m1" + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r") + (plus:GPR (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") + (reg:GPR CA_REGNO)) + (const_int -1))) + (clobber (reg:GPR CA_REGNO))] + "" + "addme %0,%1" + [(set_attr "type" "add")]) + + (define_expand "one_cmpl2" [(set (match_operand:SDI 0 "gpc_reg
[PATCH 09/10] rs6000: Remove the (now unused) insn type "compare"
2014-12-08 Segher Boessenkool gcc/ * config/rs6000/40x.md (ppc403-compare): Remove "compare". config/rs6000/440.md (ppc440-compare): Remove "compare". config/rs6000/476.md (ppc476-compare): Remove "compare". config/rs6000/601.md (ppc601-compare): Remove "compare". config/rs6000/603.md (ppc603-compare): Remove "compare". config/rs6000/6xx.md (ppc604-compare): Remove "compare". config/rs6000/7450.md (ppc7450-compare): Remove "compare". config/rs6000/7xx.md (ppc750-compare): Remove "compare". config/rs6000/8540.md (ppc8540_su): Remove "compare". config/rs6000/cell.md (cell-fast-cmp, cell-cmp-microcoded): Remove "compare". config/rs6000/e300c2c3.md (ppce300c3_cmp): Remove "compare". config/rs6000/e500mc.md (e500mc_su): Remove "compare". config/rs6000/e500mc64.md (e500mc64_su2): Remove "compare". config/rs6000/e5500.md (e5500_sfx2): Remove "compare". config/rs6000/e6500.md (e6500_sfx2): Remove "compare". config/rs6000/mpc.md (mpccore-compare): Remove "compare". config/rs6000/power4.md (power4-compare): Remove "compare". config/rs6000/power5.md (power5-compare): Remove "compare". config/rs6000/power6.md (power6-compare): Remove "compare". config/rs6000/power7.md (power7-compare): Remove "compare". config/rs6000/power8.md (power8-compare): Remove "compare". Update comment. config/rs6000/rs6000.c (rs6000_adjust_cost) : Remove (three times). (is_cracked_insn): Remove TYPE_COMPARE case. (insn_must_be_first_in_group) : Remove (twice). config/rs6000/rs6000.md (type): Remove "compare". (cell_micro): Remove "compare". config/rs6000/rs64.md (rs64a-compare): Remove "compare". --- gcc/config/rs6000/40x.md | 2 +- gcc/config/rs6000/440.md | 2 +- gcc/config/rs6000/476.md | 2 +- gcc/config/rs6000/601.md | 2 +- gcc/config/rs6000/603.md | 2 +- gcc/config/rs6000/6xx.md | 2 +- gcc/config/rs6000/7450.md | 2 +- gcc/config/rs6000/7xx.md | 2 +- gcc/config/rs6000/8540.md | 2 +- gcc/config/rs6000/cell.md | 10 -- gcc/config/rs6000/e300c2c3.md | 2 +- gcc/config/rs6000/e500mc.md | 2 +- gcc/config/rs6000/e500mc64.md | 2 +- gcc/config/rs6000/e5500.md| 2 +- gcc/config/rs6000/e6500.md| 2 +- gcc/config/rs6000/mpc.md | 2 +- gcc/config/rs6000/power4.md | 5 ++--- gcc/config/rs6000/power5.md | 5 ++--- gcc/config/rs6000/power6.md | 5 ++--- gcc/config/rs6000/power7.md | 5 ++--- gcc/config/rs6000/power8.md | 8 +++- gcc/config/rs6000/rs6000.c| 6 -- gcc/config/rs6000/rs6000.md | 4 +--- gcc/config/rs6000/rs64.md | 2 +- 24 files changed, 32 insertions(+), 48 deletions(-) diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md index 0903536..650fe52 100644 --- a/gcc/config/rs6000/40x.md +++ b/gcc/config/rs6000/40x.md @@ -53,7 +53,7 @@ (define_insn_reservation "ppc403-three" 1 "iu_40x,iu_40x,iu_40x") (define_insn_reservation "ppc403-compare" 3 - (and (ior (eq_attr "type" "cmp,compare") + (and (ior (eq_attr "type" "cmp") (and (eq_attr "type" "add,logical,shift,exts") (eq_attr "dot" "yes"))) (eq_attr "cpu" "ppc403,ppc405")) diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md index ff91fdb..b8a1d33 100644 --- a/gcc/config/rs6000/440.md +++ b/gcc/config/rs6000/440.md @@ -95,7 +95,7 @@ (define_insn_reservation "ppc440-branch" 1 "ppc440_issue,ppc440_i_pipe") (define_insn_reservation "ppc440-compare" 2 - (and (ior (eq_attr "type" "cmp,compare,cr_logical,delayed_cr,mfcr") + (and (ior (eq_attr "type" "cmp,cr_logical,delayed_cr,mfcr") (and (eq_attr "type" "add,logical,shift,exts") (eq_attr "dot" "yes"))) (eq_attr "cpu" "ppc440")) diff --git a/gcc/config/rs6000/476.md b/gcc/config/rs6000/476.md index 9bfd6b6..cf47754 100644 --- a/gcc/config/rs6000/476.md +++ b/gcc/config/rs6000/476.md @@ -77,7 +77,7 @@ (define_insn_reservation "ppc476-complex-integer" 1 ppc476_i_pipe") (define_insn_reservation "ppc476-compare" 4 - (and (ior (eq_attr "type" "compare,mfcr,mfcrf,mtcr,mfjmpr,mtjmpr") + (and (ior (eq_attr "type" "mfcr,mfcrf,mtcr,mfjmpr,mtjmpr") (and (eq_attr "type" "add,logical,shift,exts") (eq_attr "dot" "yes"))) (eq_attr "cpu" "ppc476")) diff --git a/gcc/config/rs6000/601.md b/gcc/config/rs6000/601.md index de51cbf..af14274 100644 --- a/gcc/config/rs6000/601.md +++ b/gcc/config/rs6000/601.md @@ -74,7 +74,7 @@ (define_insn_reservation "ppc601-idiv" 36 ; compare executes on integer unit, but feeds insns which ; execute on the branch unit. (define_insn_reservation "ppc601-compare" 3 - (and (ior (eq_attr "type" "cmp,compare") + (and (ior (eq_attr "type" "cmp") (and (eq_attr "type" "shift,exts") (eq_a
[PATCH 07/10] rs6000: The big carry insns addition
Now that everything is in place for letting GCC use the carry-using and carry-producing machine insns as separate RTL insns, switch over all remaining patterns that clobber CA without telling the compiler. This are the multiple-precision add/sub/neg patterns, and the various eq/ne/ ltu/gtu/leu/geu patterns. The eq/ne patterns are special. The optimal machine code for those isn't regular (like e.g. the ltu patterns are), and we want to implement a plain eq as "cntlz[wd];sr[wd]i"; but that means that if we split those patterns at expand time, combine will happily put them back together again. So expand them as eq/ne, and split later. 2014-12-08 Segher Boessenkool gcc/ PR target/64180 * config/rs6000/predicates.md (unsigned_comparison_operator): New. (signed_comparison_operator): New. * config/rs6000/rs6000-protos.h (rs6000_emit_eqne): Declare. * config/rs6000/rs6000.c (rs6000_emit_eqne): New function. (rs6000_emit_sCOND): Remove ISEL test (move it to the expander). * config/rs6000/rs6000.md (add3 for SDI): Expand DImode add to addc,adde directly, if !TARGET_POWERPC64. (sub3 for SDI): Expand DImode sub to subfc,subfe directly, if !TARGET_POWERPC64. (neg2): Delete expander. (*neg2): Rename to "neg2". (addti3, subti3): Delete. (addti3, subti3): New expanders. (*adddi3_noppc64, *subdi3_noppc64, *negdi2_noppc64): Delete. (cstore4_unsigned): New expander. (cstore4): Allow GPR as output (not just SI). Rewrite. (cstore4 for FP): Remove superfluous quotes. (*eq, *eq_compare, *plus_eqsi and splitter, *compare_plus_eqsi and splitter, *plus_eqsi_compare and splitter, *neg_eq0, *neg_eq, *ne0_, plus_ne0_, compare_plus_ne0_ and splitter, *compare_plus_ne0__1 and splitter, *plus_ne0__compare and splitter, *leu, *leu_compare and splitter, *plus_leu, *neg_leu, *and_neg_leu, *ltu, *ltu_compare, *plus_ltu, *plus_ltu_1, *plus_ltucompare, *neg_ltu, *geu, *geu_compare and splitter, *plus_geu, *neg_geu, *and_neg_geu, *plus_gt0, *gtu, *gtu_compare, *plus_gtu, *plus_gtu_1, *plus_gtu_compare, *neg_gtu, 12 anonymous insns, and 12 anonymous splitters): Delete. (eq3, ne3): New. (*neg_eq_, *neg_ne_): New. (*plus_eq_, *plus_ne_): New. (*minus_eq_, *minus_ne_): New. --- gcc/config/rs6000/predicates.md |8 + gcc/config/rs6000/rs6000-protos.h |1 + gcc/config/rs6000/rs6000.c| 27 +- gcc/config/rs6000/rs6000.md | 1559 + 4 files changed, 400 insertions(+), 1195 deletions(-) diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index a19cb2f..8b68f9c 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -1240,6 +1240,14 @@ (define_predicate "rs6000_cbranch_operator" (match_code ("unlt,unle,ungt,unge" (match_operand 0 "comparison_operator"))) +;; Return 1 if OP is an unsigned comparison operator. +(define_predicate "unsigned_comparison_operator" + (match_code "ltu,gtu,leu,geu")) + +;; Return 1 if OP is a signed comparison operator. +(define_predicate "signed_comparison_operator" + (match_code "lt,gt,le,ge")) + ;; Return 1 if OP is a comparison operation that is valid for an SCC insn -- ;; it must be a positive comparison. (define_predicate "scc_comparison_operator" diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index 8d27a69..eb64598 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -109,6 +109,7 @@ extern void print_operand (FILE *, rtx, int); extern void print_operand_address (FILE *, rtx); extern enum rtx_code rs6000_reverse_condition (machine_mode, enum rtx_code); +extern rtx rs6000_emit_eqne (machine_mode, rtx, rtx, rtx); extern void rs6000_emit_sISEL (machine_mode, rtx[]); extern void rs6000_emit_sCOND (machine_mode, rtx[]); extern void rs6000_emit_cbranch (machine_mode, rtx[]); diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index e58fc81..f60f4d7 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -19421,6 +19421,27 @@ rs6000_emit_sISEL (machine_mode mode ATTRIBUTE_UNUSED, rtx operands[]) rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx); } +/* Emit RTL that sets a register to zero if OP1 and OP2 are equal. SCRATCH + can be used as that dest register. Return the dest register. */ + +rtx +rs6000_emit_eqne (machine_mode mode, rtx op1, rtx op2, rtx scratch) +{ + if (op2 == const0_rtx) +return op1; + + if (GET_CODE (scratch) == SCRATCH) +scratch = gen_reg_rtx (mode); + + if (logical_operand (op2, mode)) +emit_insn (gen_rtx_SET (VOIDmode, scratch, gen_rtx_X
Re: [PATCH 04/10] rs6000: Remove addic from the normal add pattern
On Mon, Dec 08, 2014 at 09:44:44AM -0500, David Edelsohn wrote: > > -;; Discourage ai/addic because of carry but provide it in an alternative > > -;; allowing register zero as source. > > (define_insn "*add3_internal1" > > - [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,?r,r") > > - (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,r,b") > > - (match_operand:GPR 2 "add_operand" "r,I,I,L")))] > > + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r") > > + (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b") > > + (match_operand:GPR 2 "add_operand" "r,I,L")))] > >"!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && !DECIMAL_FLOAT_MODE_P > > (GET_MODE (operands[1]))" > >"@ > > add %0,%1,%2 > > addi %0,%1,%2 > > - addic %0,%1,%2 > > addis %0,%1,%v2" > >[(set_attr "type" "add")]) > > Why are you removing the alternative instead of clobbering XER[CA]? I should have mentioned, sorry. We don't want to clobber CA on *every* add, and reload can generate more adds out of thin air. And CA is a fixed register. There may be a better way to do this, but I don't know it. Segher
Re: [PATCH 04/10] rs6000: Remove addic from the normal add pattern
On Mon, Dec 8, 2014 at 10:01 AM, Segher Boessenkool wrote: > On Mon, Dec 08, 2014 at 09:44:44AM -0500, David Edelsohn wrote: >> > -;; Discourage ai/addic because of carry but provide it in an alternative >> > -;; allowing register zero as source. >> > (define_insn "*add3_internal1" >> > - [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,?r,r") >> > - (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,r,b") >> > - (match_operand:GPR 2 "add_operand" "r,I,I,L")))] >> > + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r") >> > + (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b") >> > + (match_operand:GPR 2 "add_operand" "r,I,L")))] >> >"!DECIMAL_FLOAT_MODE_P (GET_MODE (operands[0])) && >> > !DECIMAL_FLOAT_MODE_P (GET_MODE (operands[1]))" >> >"@ >> > add %0,%1,%2 >> > addi %0,%1,%2 >> > - addic %0,%1,%2 >> > addis %0,%1,%v2" >> >[(set_attr "type" "add")]) >> >> Why are you removing the alternative instead of clobbering XER[CA]? > > I should have mentioned, sorry. > > We don't want to clobber CA on *every* add, and reload can generate more > adds out of thin air. And CA is a fixed register. > > There may be a better way to do this, but I don't know it. Okay. It's unfortunate that GCC does not have separate patterns for a safe ADD used by reload. Thanks, David
Re: [PATCH 05/10] rs6000: Merge and split add and addic
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool wrote: > This splits the addic patterns from the add/addi patterns, and merges the > insns and splitters. > Also change type "compare" to "add" for "addic." (the dot form). > > > 2014-12-08 Segher Boessenkool > > gcc/ > PR target/64180 > * config/rs6000/rs6000.md (*add3_internal1): Rename to > "*add3". > (*add3_internal2, *add3_internal3, and (their splitters): > Delete. > (*add3_dot, *add3_dot2): New. > (*add3_imm_dot, *add3_imm_dot2): New. Okay. Thanks, David
Re: RFA: patch to fix PR64157
Vladimir Makarov writes: >The following patch fixes > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64157 > > >After calling target_reinit from save_target_globals for switchable > targets (as ppc), a lot of ira data (register sets, classes etc) become > undefined. After that ira-costs.c crashes when the undefined data are used. > >The patch was successfully bootstrapped and tested on x86-64. > >Ok to commit to the trunk? > > 2014-12-05 Vladimir Makarov > > PR rtl-optimization/64157 > * toplev.c (target_reinit): Call ira_init. > > Index: toplev.c > === > --- toplev.c (revision 218378) > +++ toplev.c (working copy) > @@ -1888,6 +1888,8 @@ target_reinit (void) >/* This invokes target hooks to set fixed_reg[] etc, which is > mode-dependent. */ >init_regs (); > + /* Set IRA data depended on target parameters. */ > + ira_init (); Could you give more details about how this happens? It's reverting part of: 2014-06-25 Jan Hubicka * toplev.c (backend_init_target): Move init_emit_regs and init_regs to... (backend_init) ... here; skip ira_init_once and backend_init_target. (target_reinit) ... and here; clear this_target_rtl->lang_dependent_initialized. (lang_dependent_init_target): Clear this_target_rtl->lang_dependent_initialized; break out rtl initialization to ... (initialize_rtl): ... here; call also backend_init_target and ira_init_once. * toplev.h (initialize_rtl): New function. * function.c: Include toplev.h (init_function_start): Call initialize_rtl. * rtl.h (target_rtl): Add target_specific_initialized, lang_dependent_initialized. which was supposed to delay the ira_init so that it only gets called once we start to compile a function. It sounds from your patch like that either isn't early enough or isn't happening at all for some reason. Thanks, Richard
Re: [PATCH 06/10] rs6000: New add/subf carry insns
On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool wrote: > This implements addc, addic, adde, addze, addme, subfc, subfic, subfe, > subfze, subfme. Some of those in multiple forms: canonical RTL for the > "immediate" insns has four forms; subfX has a special form for subtracting > a register from itself. > > All these new insns get type "add" for now. > > > 2014-12-08 Segher Boessenkool > > gcc/ > PR target/64180 > * config/rs6000/predicates.md (adde_operand): New. > * config/rs6000/rs6000.md (add3_carry): New. > (add3_imm_carry_pos): New. > (add3_imm_carry_0): New. > (add3_imm_carry_m1): New. > (add3_imm_carry_neg): New. > (add3_carry_in): New. > (add3_carry_in_0): New. > (add3_carry_in_m1): New. > (subf3_carry): New. > (subf3_imm_carry_0): New. > (subf3_imm_carry_m1): New. > (subf3_carry_in): New. > (subf3_carry_in_0): New. > (subf3_carry_in_m1): New. > (subf3_carry_in_xx): New. Okay. Thanks, David
Re: [PATCH 06/10] rs6000: New add/subf carry insns
On Mon, Dec 8, 2014 at 10:41 AM, David Edelsohn wrote: > On Mon, Dec 8, 2014 at 9:18 AM, Segher Boessenkool > wrote: >> This implements addc, addic, adde, addze, addme, subfc, subfic, subfe, >> subfze, subfme. Some of those in multiple forms: canonical RTL for the >> "immediate" insns has four forms; subfX has a special form for subtracting >> a register from itself. >> >> All these new insns get type "add" for now. >> >> >> 2014-12-08 Segher Boessenkool >> >> gcc/ >> PR target/64180 >> * config/rs6000/predicates.md (adde_operand): New. >> * config/rs6000/rs6000.md (add3_carry): New. >> (add3_imm_carry_pos): New. >> (add3_imm_carry_0): New. >> (add3_imm_carry_m1): New. >> (add3_imm_carry_neg): New. >> (add3_carry_in): New. >> (add3_carry_in_0): New. >> (add3_carry_in_m1): New. >> (subf3_carry): New. >> (subf3_imm_carry_0): New. >> (subf3_imm_carry_m1): New. >> (subf3_carry_in): New. >> (subf3_carry_in_0): New. >> (subf3_carry_in_m1): New. >> (subf3_carry_in_xx): New. > > Okay. > > Thanks, David As we both noticed, there are a few problems with this patch, so I'll wait for a revised version. Thanks, David
Re: Fix libgomp crash without TLS (PR42616)
Is it ok to add GCC_CHECK_EMUTLS test in libgomp/configure.ac or should I add in tls.m3 a similair test that would be used only in libgomp? 2014-12-08 17:03 GMT+03:00 Jakub Jelinek : > On Mon, Dec 08, 2014 at 04:30:46PM +0300, Varvara Rainchik wrote: >> Hi guys, >> >> Could you please take a look at this issue? This fix is still urgent >> for Android. > > I'm afraid it will break those targets where emutls is not on, but the C > library doesn't support TLS. I think it is acceptable not to care about > #pragma omp from different pthread_create created threads in that case, but > stopping support completely might be too early. > So, can you instead arrange for HAVE_TLS to be defined if emutls is > supported (check for that during configure)? > > Jakub
Re: [patch] Fix tilepro includes
On Fri, 2014-11-21 08:45:11 -0500, Andrew MacLeod wrote: > During the flattening of optabs.h, I updated all the config/* files > which were affected. I've been getting spurious failures with > config-list.mk where my changes would "disappear" and tracked down > why. > > I was blissfully unaware that the tilepro ports mul-tables.c file is > actually generated from gen-mul-tables.cc. > > This patch fixes the include issue by adding "#include insn-codes.h" > to the generated files. I also added a comment indicating these are > generated files, and to make changes in the generator. > > This allows all the tile* ports to compile properly again. > > OK for trunk? Seems this wasn't ever ACKed or applied up to now? I'm still seeing compilation errors for the tile targets, see eg. http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=382169 MfG, JBG -- Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 Signature of: 23:53 <@jbglaw> So, ich kletter' jetzt mal ins Bett. the second : 23:57 <@jever2> .oO( kletter ..., hat er noch Gitter vorm Bett, wie früher meine Kinder?) 00:00 <@jbglaw> jever2: *patsch* 00:01 <@jever2> *aua*, wofür, Gedanken sind frei! 00:02 <@jbglaw> Nee, freie Gedanken, die sind seit 1984 doch aus! 00:03 <@jever2> 1984? ich bin erst seit 1985 verheiratet! signature.asc Description: Digital signature
Re: Fix libgomp crash without TLS (PR42616)
On Mon, Dec 08, 2014 at 07:01:09PM +0300, Varvara Rainchik wrote: > Is it ok to add GCC_CHECK_EMUTLS test in libgomp/configure.ac or > should I add in tls.m3 a similair test that would be used only in > libgomp? I think config/tls.m4 would be a better place, but doing it in some way where the users of config/tls.m4 could actually by using different macros from there choose what they want (either check solely for real TLS, or only for emutls, or for both). And libgomp/configure.ac would then choose it is happy with both. Jakub
Re: [PATCH] libgccjit cleanups
On Sat, 2014-12-06 at 15:29 -0500, Ulrich Drepper wrote: > This patch broken out of one I sent earlier with some extensions. It > contains only little cleanups to the libgccjit code. > > When creating the linker command line the code now uses an auto_vec > instead of the fixed size array. > > The second change adds the missing context::set_str_option member > function to the C++ interface. > > The third change it to the string option handling. Instead of just > using the pointer passed to the function the code now makes a copy > of the string. > > > OK? Thanks. Overall this is good, a few nitpicks inline below: > gcc/ChangeLog: > > 2014-12-06 Ulrich Drepper > > * jit/jit-playback.c (convert_to_dso): Use auto_vec instead > of automatic array to build up command line. > * jit/jit-recording.c (recording::context::set_str_option): > Make copy of the string. > (recording::context::~context): Free string options. > * jit/jit-recording.h (recording::context): Adjust type > of m_str_options member. > * jit/libgccjit.h: Adjust comment about > gcc_jit_context_set_str_option parameter begin used after > the call. > * jit/libgccjit++.h (gccjit::context): Add set_str_option > member function. > > > diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c > index ecdae80..6d1eb8a 100644 > --- a/gcc/jit/jit-playback.c > +++ b/gcc/jit/jit-playback.c > @@ -1726,18 +1726,19 @@ convert_to_dso (const char *ctxt_progname) > TV_ASSEMBLE. */ >auto_timevar assemble_timevar (TV_ASSEMBLE); >const char *errmsg; > - const char *argv[7]; > + auto_vec argvec; > +#define ADD_ARG(arg) argvec.safe_push (arg) >int exit_status = 0; >int err = 0; >const char *gcc_driver_name = GCC_DRIVER_NAME; > > - argv[0] = gcc_driver_name; > - argv[1] = "-shared"; > + ADD_ARG (gcc_driver_name); > + ADD_ARG ("-shared"); >/* The input: assembler. */ > - argv[2] = m_path_s_file; > + ADD_ARG (m_path_s_file); >/* The output: shared library. */ > - argv[3] = "-o"; > - argv[4] = m_path_so_file; > + ADD_ARG ("-o"); > + ADD_ARG (m_path_so_file); > >/* Don't use the linker plugin. > If running with just a "make" and not a "make install", then we'd > @@ -1746,17 +1747,17 @@ convert_to_dso (const char *ctxt_progname) > libto_plugin is a .la at build time, with it becoming installed with > ".so" suffix: i.e. it doesn't exist with a .so suffix until install > time. */ > - argv[5] = "-fno-use-linker-plugin"; > + ADD_ARG ("-fno-use-linker-plugin"); > >/* pex argv arrays are NULL-terminated. */ > - argv[6] = NULL; > + ADD_ARG (NULL); > >/* pex_one's error-handling requires pname to be non-NULL. */ >gcc_assert (ctxt_progname); > >errmsg = pex_one (PEX_SEARCH, /* int flags, */ > gcc_driver_name, > - const_cast (argv), > + const_cast (argvec.address ()), > ctxt_progname, /* const char *pname */ > NULL, /* const char *outname */ > NULL, /* const char *errname */ > @@ -1783,6 +1784,7 @@ convert_to_dso (const char *ctxt_progname) >getenv ("PATH")); >return; > } > +#undef ADD_ARG > } > > /* Top-level hook for playing back a recording context. > diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c > --- a/gcc/jit/jit-recording.c > +++ b/gcc/jit/jit-recording.c > @@ -215,6 +215,9 @@ recording::context::~context () >delete m; > } > > + for (i = 0; i < GCC_JIT_NUM_STR_OPTIONS; ++i) > +free (m_str_options[i]); > + >if (m_builtins_manager) > delete m_builtins_manager; > > @@ -827,7 +830,7 @@ recording::context::set_str_option (enum > gcc_jit_str_option opt, >"unrecognized (enum gcc_jit_str_option) value: %i", opt); >return; > } > - m_str_options[opt] = value; > + m_str_options[opt] = xstrdup (value); > } If the user repeatedly calls set_str_option on the same opt, this will be a leak. So something like: /* Free any existing value. */ free (m_str_options[opt]); m_str_options[opt] = xstrdup (value); > /* Set the given integer option for this context, or add an error if > diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h > --- a/gcc/jit/jit-recording.h > +++ b/gcc/jit/jit-recording.h > @@ -246,7 +246,7 @@ private: >char *m_first_error_str; >bool m_owns_first_error_str; > > - const char *m_str_options[GCC_JIT_NUM_STR_OPTIONS]; > + char *m_str_options[GCC_JIT_NUM_STR_OPTIONS]; >int m_int_options[GCC_JIT_NUM_INT_OPTIONS]; >bool m_bool_options[GCC_JIT_NUM_BOOL_OPTIONS]; > > diff --git a/gcc/jit/libgccjit++.h b/gcc/jit/libgccjit++.h > --- a/gcc/jit/libgccjit++.h > +++ b/gcc/jit/libgccjit++.h > @@ -99,6 +99,9 @@ namespace gccjit > void dump_to_file (const std::string &path, > bool update_loc
Re: [PATCH 8/9] Negative numbers added for sreal class.
On 11/28/2014 10:32 AM, Richard Biener wrote: On Thu, Nov 27, 2014 at 6:08 PM, Martin Liška wrote: On 11/21/2014 04:21 PM, Martin Liška wrote: On 11/21/2014 04:02 PM, Richard Biener wrote: On Fri, Nov 21, 2014 at 3:39 PM, Martin Liška wrote: Hello. Ok, this is simplified, one can use sreal a = 12345 and it works ;) that's a new API, right? There is no max () and I think that using LONG_MIN here is asking for trouble (host dependence). The comment in the file says the max should be sreal (SREAL_MAX_SIG, SREAL_MAX_EXP) and the min sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP)? Sure, sreal can store much bigger(smaller) numbers :) Where do you need sreal::to_double? The host shouldn't perform double calculations so it can be only for dumping? In which case the user should have used sreal::dump (), maybe with extra arguments. That new function was request from Honza, only for debugging purpose. I agree that dump should this kind of job. If no other problem, I will run tests once more and commit it. Thanks, Martin -#define SREAL_MAX_EXP (INT_MAX / 4) +#define SREAL_MAX_EXP (INT_MAX / 8) this change doesn't look necessary anymore? Btw, it's also odd that... #define SREAL_PART_BITS 32 ... #define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 1)) #define SREAL_MAX_SIG (((uint64_t) 1 << SREAL_PART_BITS) - 1) thus all m_sig values fit in 32bits but we still use a uint64_t m_sig ... (the implementation uses 64bit for internal computations, but still the storage is wasteful?) Of course the way normalize() works requires that storage to be 64bits to store unnormalized values. I'd say ok with the SREAL_MAX_EXP change reverted. Hi. You are right, this change was done because I used one bit for m_negative (bitfield), not needed any more. Final version attached. Thank you, Martin Thanks, Richard. Otherwise looks good to me and sorry for not noticing the above earlier. Thanks, Richard. Thanks, Martin }; extern void debug (sreal &ref); @@ -76,12 +133,12 @@ inline sreal &operator+= (sreal &a, const sreal &b) inline sreal &operator-= (sreal &a, const sreal &b) { -return a = a - b; + return a = a - b; } inline sreal &operator/= (sreal &a, const sreal &b) { -return a = a / b; + return a = a / b; } inline sreal &operator*= (sreal &a, const sreal &b) -- 2.1.2 Hello. After IRC discussions, I decided to give sreal another refactoring where I use int64_t for m_sig. This approach looks much easier and straightforward. I would like to ask folk for comments? I think you want to exclude LONG_MIN (how do you know that LONG_MIN is min(int64_t)?) as a valid value for m_sig - after all SREAL_MIN_SIG makes it symmetric. Or simply use abs_hwi at Hello. I decided to use abs_hwi. + int64_t s = m_sig < 0 ? -1 : 1; + uint64_t sig = m_sig == LONG_MIN ? LONG_MAX : std::abs (m_sig); -#define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 1)) -#define SREAL_MAX_SIG (((uint64_t) 1 << SREAL_PART_BITS) - 1) +#define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 2)) +#define SREAL_MAX_SIG (((uint64_t) 1 << (SREAL_PART_BITS - 1)) - 1) shouldn't this also be -2 in the last line? It's correct, in the first line, I s/'SREAL_PART_BITS - 1'/'SREAL_PART_BITS - 2' and second one is also decremented: s/'SREAL_PART_BITS'/'REAL_PART_BITS - 1'. That is, you effectively use the MSB as a sign bit? Yes. It uses signed integer with MSB as a sign bit. Btw, a further change would be to make m_sig 'signed int', matching the "real" storage requirements according to SREAL_PART_BITS. This would of course still require temporaries used for computation to be 64bits and it would require normalization to work on the temporaries. But then we'd get down to 8 bytes storage ... Agree, we can shrink the size for future. I've been running tests, I hope this implementation is much nicer than having bool m_negative. What do you think about trunk acceptation? Thanks, Martin Richard. I am able to run profiled bootstrap on x86_64-linux-pc and ppc64-linux-pc and new regression is introduced. Thanks, Martin >From c275c48eb5f0f41f546bb7866d70c1fe066d6d62 Mon Sep 17 00:00:00 2001 From: mliska Date: Wed, 26 Nov 2014 15:46:42 +0100 Subject: [PATCH] New sreal implementation which uses int64_t as m_sig. gcc/ChangeLog: 2014-12-08 Martin Liska * sreal.c (sreal::shift_right): New implementation for int64_t as m_sig. (sreal::normalize): Likewise. (sreal::to_int): Likewise. (sreal::operator+): Likewise. (sreal::operator-): Likewise. (sreal::operator*): Likewise. (sreal::operator/): Likewise. (sreal::signedless_minus): Removed. (sreal::signedless_plus): Removed. (sreal::debug): const keyword is added. * sreal.h (sreal::operator<): New implementation for int64_t as m_sig. * ipa-inline.c (recursive_inlining): LONG_MIN is replaced with sreal::min (). --- gcc/ipa-inline.c | 2 +- gcc/sreal.c | 129 ++
Re: [PATCH 3/4] First usage of cgraph_summary in ipa-prop pass.
On 11/14/2014 04:23 PM, Martin Liška wrote: On 11/14/2014 03:05 PM, Martin Liška wrote: On 11/11/2014 01:26 PM, mliska wrote: gcc/ChangeLog: 2014-11-12 Martin Liska * auto-profile.c: Include cgraph_summary.h. * cgraph.c: Likewise. * cgraphbuild.c: Likewise. * cgraphclones.c: Likewise. * cgraphunit.c: Likewise. * ipa-cp.c: Likewise. * ipa-devirt.c: Likewise. * ipa-icf.c: Likewise. * ipa-inline-analysis.c (evaluate_properties_for_edge): Usage of ipa_node_params_vector is replaced with ipa_node_params_summary. (inline_node_duplication_hook): Likewise. (estimate_function_body_sizes): Likewise. (remap_edge_change_prob): Likewise. (inline_merge_summary): Likewise. * ipa-inline-transform.c: Include of cgraph_summary.h. * ipa-inline.c (early_inliner): Usage of ipa_node_params_vector is replaced with ipa_node_params_summary. * ipa-polymorphic-call.c: Include of cgraph_summary.h. * ipa-profile.c: Include of cgraph_summary.h. * ipa-prop.c (struct func_body_info): Struct keyword is removed. (struct ipa_cst_ref_desc): Likewise. (ipa_func_spec_opts_forbid_analysis_p): Likewise. (ipa_alloc_node_params): Likewise. (ipa_initialize_node_params): Likewise. (ipa_print_node_jump_functions_for_edge): Likewise. (ipa_print_node_jump_functions): Likewise. (ipa_print_all_jump_functions): Likewise. (ipa_set_jf_constant): Likewise. (check_stmt_for_type_change): Likewise. (detect_type_change_from_memory_writes): Likewise. (find_dominating_aa_status): Likewise. (parm_bb_aa_status_for_bb): Likewise. (parm_preserved_before_stmt_p): Likewise. (parm_ref_data_preserved_p): Likewise. (parm_ref_data_pass_through_p): Likewise. (struct ipa_known_agg_contents_list): Likewise. (get_place_in_agg_contents_list): Likewise. (build_agg_jump_func_from_list): Likewise. (determine_locally_known_aggregate_parts): Likewise. (ipa_compute_jump_functions_for_edge): Likewise. (ipa_compute_jump_functions_for_bb): Likewise. (ipa_note_param_call): Likewise. (ipa_analyze_indirect_call_uses) Likewise.: (ipa_analyze_virtual_call_uses): Likewise. (ipa_analyze_call_uses): Likewise. (visit_ref_for_mod_analysis): Likewise. (ipa_analyze_controlled_uses): Likewise. (ipa_analyze_node): Likewise. (update_jump_functions_after_inlining): Likewise. (ipa_make_edge_direct_to_target): Likewise. (ipa_find_agg_cst_for_param): Likewise. (remove_described_reference): Likewise. (jfunc_rdesc_usable): Likewise. (try_decrement_rdesc_refcount): Likewise. (try_make_edge_direct_simple_call): Likewise. (try_make_edge_direct_virtual_call): Likewise. (update_indirect_edges_after_inlining): Likewise. (propagate_info_to_inlined_callees): Likewise. (propagate_controlled_uses): Likewise. (ipa_propagate_indirect_call_infos): Likewise. (ipa_free_all_edge_args): Likewise. (ipa_node_params::~ipa_node_params): Likewise. (ipa_free_all_node_params): Likewise. (ipa_edge_removal_hook): Likewise. (ipa_node_removal_hook): Likewise. (ipa_edge_duplication_hook): Likewise. (ipa_add_new_function): Removed (ipa_node_params_cgraph_summary::duplication_hook): New function. (ipa_node_duplication_hook): Struct keyword removed. (ipa_register_cgraph_hooks): Removal of old hooks. (ipa_unregister_cgraph_hooks): Likewise. (ipa_print_node_params): Struct keyword is removed. (ipa_print_all_params): Likewise. (ipa_modify_formal_parameters): Likewise. (ipa_modify_call_arguments): Likewise. (ipa_modify_expr): Likewise. (ipa_get_adjustment_candidate): Likewise. (index_in_adjustments_multiple_times_p): Likewise. (ipa_combine_adjustments): Likewise. (ipa_dump_param_adjustments): Likewise. (ipa_write_jump_function): Likewise. (ipa_read_jump_function): Likewise. (ipa_write_indirect_edge_info): Likewise. (ipa_read_indirect_edge_info): Likewise. (ipa_write_node_info): Likewise. (ipa_read_node_info): Likewise. (ipa_prop_write_jump_functions): Likewise. (ipa_prop_read_section): Likewise. (ipa_prop_read_jump_functions): Likewise. (write_agg_replacement_chain): Likewise. (read_agg_replacement_chain): Likewise. (ipa_prop_write_all_agg_replacement): Likewise. (read_replacements_section): Likewise. (ipa_prop_read_all_agg_replacement): Likewise. (adjust_agg_replacement_values): Likewise. (ipcp_modif_dom_walker::before_dom_children): Likewise. (ipcp_transform_function): Likewise. * ipa-prop.h (struct ipa_node_params): Introduction of new class ipa_node_params_cgraph_summary. * ipa-split.c: Include cgraph_summary.h. * ipa-utils.c: Likewise. * ipa.c: Likewise. * omp-low.c: Likewise. * tree-inline.c: Likewise. * tree-sra.c: Likewise. * tree-ssa-pre.c: Likewise. gcc/lto/ChangeLog: 2014-11-12 Martin Liska * lto-p
Re: [PATCH 4/4] Data structure is used for inline_summary struct.
On 11/14/2014 04:24 PM, Martin Liška wrote: On 11/14/2014 03:09 PM, Martin Liška wrote: On 11/13/2014 05:04 PM, Jan Hubicka wrote: + if (!inline_summary_summary) +inline_summary_summary = (inline_summary_cgraph_summary *) inline_summary_cgraph_summary::create_ggc (symtab); Hehe, this is funny naming scheme. Peraps inline_summary_d and inline_summary_t for the data and type? Hello. I adopted suggested naming scheme. - -static void -inline_node_duplication_hook (struct cgraph_node *src, - struct cgraph_node *dst, - ATTRIBUTE_UNUSED void *data) +void +inline_summary_cgraph_summary::duplication_hook (cgraph_node *src, + cgraph_node *dst, + inline_summary *, + inline_summary *info) Becuase those are no longer "hooks" but virtual function, I guess we could call them simply duplicate/insert/remove. Agree with the change. In a way I would like to see these to be methods of the underlying type rather than virtual methods of the summary, becuase these are operations on the data themselves. I was thinking to model these by specual constructor and copy constructor (taking the extra node pointer parameters) and standard destructor. I am not sure this would be more understandable this way? Motivation for this implementation is: a) it's useful to have an access to cgraph_node that is associated with a sumary b) with GTY, we cannot call destructors -/* Need a typedef for inline_summary because of inline function - 'inline_summary' below. */ -typedef struct inline_summary inline_summary_t; -extern GTY(()) vec *inline_summary_vec; +class GTY((user)) inline_summary_cgraph_summary: public cgraph_summary +{ +public: + inline_summary_cgraph_summary (symbol_table *symtab, bool ggc): +cgraph_summary (symtab, ggc) {} + + static inline_summary_cgraph_summary *create_ggc (symbol_table *symtab) + { +inline_summary_cgraph_summary *summary = new (ggc_cleared_alloc ()) inline_summary_cgraph_summary(symtab, true); +summary->disable_insertion_hook (); +return summary; + } + + + virtual void insertion_hook (cgraph_node *, inline_summary *); + virtual void removal_hook (cgraph_node *node, inline_summary *); + virtual void duplication_hook (cgraph_node *src, cgraph_node *dst, inline_summary *src_data, inline_summary *dst_data); +}; + +extern GTY(()) cgraph_summary *inline_summary_summary; All in all it looks better than original code. If we moved insert/ /* Information kept about parameter of call site. */ struct inline_param_summary @@ -249,10 +265,10 @@ void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *, extern int ncalls_inlined; extern int nfunctions_inlined; -static inline struct inline_summary * -inline_summary (struct cgraph_node *node) +static inline inline_summary * +get_inline_summary (const struct cgraph_node *node) { - return &(*inline_summary_vec)[node->uid]; + return (*inline_summary_summary)[node->summary_uid]; Hmm, i guess there is no way to avoid the (*...)? Otherwise it would be cleaner to use inline_summary[...] instead of get_inline_summary IMO. I added function_summary::get method, where the usage looks cleaner: inline_summary_d->get (node). Thanks, Martin Thanks for working on this! Honza Patch v3. Martin Version v4. Martin >From 4f87fdf746532a9ad89fcf85246cc449a49ba09d Mon Sep 17 00:00:00 2001 From: mliska Date: Mon, 8 Dec 2014 11:33:08 +0100 Subject: [PATCH 3/3] symbol_summary is used for inline_summary. gcc/lto/ChangeLog: 2014-12-08 Martin Liska * lto-partition.c (add_symbol_to_partition_1): New inline_summary_d is used. (undo_partition): Likewise. (lto_balanced_map): Likewise. gcc/ChangeLog: 2014-12-08 Martin Liska * cgraphunit.c (symbol_table::process_new_functions): New inline_summary_d is used. * ipa-cp.c (ipcp_cloning_candidate_p): Likewise. (devirtualization_time_bonus): Likewise. (estimate_local_effects): Likewise. (ipcp_propagate_stage): Likewise. * ipa-inline-analysis.c (evaluate_conditions_for_known_args): Likewise. (evaluate_properties_for_edge): Likewise. (inline_summary_alloc): Likewise. (reset_inline_summary): New inline_summary argument is introduced. (inline_summary_t::remove): New function. (inline_summary_t::duplicate): Likewise. (dump_inline_edge_summary): New inline_summary_d is used. (dump_inline_summary): Likewise. (estimate_function_body_sizes): Likewise. (compute_inline_parameters): Likewise. (estimate_edge_devirt_benefit): Likewise. (estimate_node_size_and_time): Likewise. (inline_update_callee_summaries): Likewise. (inline_merge_summary): Likewise. (inline_update_overall_summary): Likewise. (simple_edge_hints): Likewise. (do_estimate_edge_time): Likewise. (estimate_time_after_inlining): Likewise. (estimate_size_after_inlining): Likewise. (do_estimate_growth): Likewise. (growth_likely_positive): Likewise. (inline_generate_summary): Likewise. (inlin
Re: [PATCH 2/4] New data structure for cgraph_summary introduced.
On 11/14/2014 04:23 PM, Martin Liška wrote: On 11/14/2014 03:04 PM, Martin Liška wrote: On 11/13/2014 04:50 PM, Jan Hubicka wrote: gcc/ChangeLog: 2014-11-12 Martin Liska * Makefile.in: New object file is added. * cgraph.h (symbol_table::allocate_cgraph_symbol): Summary UID is filled up. * cgraph_summary.c: New file. * cgraph_summary.h: New file. Since I am trying to get rid of the cgraph prefixes for symbols (keep it for the graph only) and the summaries can be annotated to variables too. Even if it not necessarily supported by your current implementation, lets keep API prepared for it. So I would call it symtab-summary.* for source files and symtab_summary for base type (probably function_summary for annotating functions/cgraph_edge_summary for annotating edges?) Hello. I followed your remarks, new class is called function_summary and is located in symbol-summary.h. diff --git a/gcc/cgraph.h b/gcc/cgraph.h index e2becb9..588b6d5 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -1225,6 +1225,8 @@ public: int count_materialization_scale; /* Unique id of the node. */ int uid; + /* Summary unique id of the node. */ + int summary_uid; What makes summary_uid better than uid? Because cgraph_node::uid is not a unique ID, it's recycled. As I can see, there are two remaining usages of the fact that cgraph::uid are quite consecutive: a) node_growth_cache vector is resized according to cgraph_max_uid b) lto-partition.c: lto_balanced_map If we change ipa-related stuff to annotations and lto_balanced_map with be rewritten, we can finally unify uid and summary_uid. As Martin correctly pointed out, we should unify cgraph_node dumps, we combine uid and order. diff --git a/gcc/cgraph_summary.c b/gcc/cgraph_summary.c new file mode 100644 index 000..9af1d7e --- /dev/null +++ b/gcc/cgraph_summary.c And why do we need this file? It will need license header if really needed. Sure, the file can be removed. Martin The implementation seems sane - I will check the actual uses :) Please send the updated patch though. Honza Hello. There's v3 of the patch. Martin Hello. Version 4 of the patch. I have a conversation with Trevor and I hope it can be acceptable to trunk? Patch set can bootstrap on x86_64-linux-pc and no regression has been seen. Thanks, Martin >From df978f4a338869b56dc558c70f6c8f3c884b9fe1 Mon Sep 17 00:00:00 2001 From: mliska Date: Fri, 5 Dec 2014 15:59:04 +0100 Subject: [PATCH 1/3] New symbol_summary class introduced. gcc/ChangeLog: 2014-12-08 Martin Liska * cgraph.h (symbol_table::allocate_cgraph_symbol): Summary UID is filled up. * symbol-summary.h: New file. * gengtype.c (open_base_files): Add symbol-summary.h. * toplev.c (general_init): Call constructor of symbol_table. --- gcc/cgraph.h | 8 ++ gcc/gengtype.c | 4 +- gcc/symbol-summary.h | 281 +++ gcc/toplev.c | 3 +- 4 files changed, 293 insertions(+), 3 deletions(-) create mode 100644 gcc/symbol-summary.h diff --git a/gcc/cgraph.h b/gcc/cgraph.h index a5c5f56..1664bd7 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -1237,6 +1237,8 @@ public: int count_materialization_scale; /* Unique id of the node. */ int uid; + /* Summary unique id of the node. */ + int summary_uid; /* ID assigned by the profiling. */ unsigned int profile_id; /* Time profiler: first run of function. */ @@ -1810,6 +1812,10 @@ public: friend class cgraph_node; friend class cgraph_edge; + symbol_table (): cgraph_max_summary_uid (1) + { + } + /* Initialize callgraph dump file. */ void initialize (void); @@ -2006,6 +2012,7 @@ public: int cgraph_count; int cgraph_max_uid; + int cgraph_max_summary_uid; int edges_count; int edges_max_uid; @@ -2334,6 +2341,7 @@ symbol_table::allocate_cgraph_symbol (void) node->uid = cgraph_max_uid++; } + node->summary_uid = cgraph_max_summary_uid++; return node; } diff --git a/gcc/gengtype.c b/gcc/gengtype.c index 2dc857e..276f543 100644 --- a/gcc/gengtype.c +++ b/gcc/gengtype.c @@ -1842,8 +1842,8 @@ open_base_files (void) "tree-ssa-loop-niter.h", "tree-into-ssa.h", "tree-dfa.h", "tree-ssa.h", "reload.h", "cpp-id-data.h", "tree-chrec.h", "except.h", "output.h", "cfgloop.h", "target.h", "lto-streamer.h", - "target-globals.h", "ipa-ref.h", "cgraph.h", "ipa-prop.h", - "ipa-inline.h", "dwarf2out.h", "omp-low.h", NULL + "target-globals.h", "ipa-ref.h", "cgraph.h", "symbol-summary.h", + "ipa-prop.h", "ipa-inline.h", "dwarf2out.h", "omp-low.h", NULL }; const char *const *ifp; outf_p gtype_desc_c; diff --git a/gcc/symbol-summary.h b/gcc/symbol-summary.h new file mode 100644 index 000..95270e5 --- /dev/null +++ b/gcc/symbol-summary.h @@ -0,0 +1,281 @@ +/* Callgraph summary data structure. + Copyright (C) 2014 Free Software Foundation, Inc. + Contribu
Re: [RFC, PATCH, fortran] PR fortran/60255 Deferred character length
Hi all, please find attached a more elaborate patch for pr60255. I totally agree that my first attempt was just scratching the surface of the work needed. This patch also is *not* complete, but because I am really new to gfortran patching, I don't want to present a final patch only to learn then, that I have violated design rules, common practice or the like. Therefore please comment and direct me to any sources/ideas to improve the patch. Topic: The pr 60255 is about assigning a char array to an unlimited polymorphic entity. In the comments the concern about the lost length information is raised. The patch adds a _len component to the unlimited polymorphic entity (after _data and _vtab) and adds an assignment of the string length to _len when a string is pointer assigned to the unlimited poly entity. Furthermore is the intrinsic len(unlimited poly pointing to a string) resolved to give the _len component. Yet missing: - assign _len component back to deferred char array length component - transport length along chains of unlimited poly entities, i.e., a => b; c => a where all objects are unlimited poly and b is a string. - allocate() in this context Patch dependencies: none Comments, concerns, candy welcome! Regards, Andre On Sun, 17 Aug 2014 14:32:21 +0200 domi...@lps.ens.fr (Dominique Dhumieres) wrote: > > the testcase should check that the code generated is actually working, > > not just that the ICE disappeared. > > I agree. Note that there is a test in the comment 3 of PR60255 that > can be used to check the run time behavior (and possibly check the > vtab issue). > > Dominique -- Andre Vehreschild * Email: vehre ad gmx dot de diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c index 0286c9e..29e31e1 100644 --- a/gcc/fortran/class.c +++ b/gcc/fortran/class.c @@ -2403,6 +2403,38 @@ yes: return true; } +/* Add the component _len to the class-type variable in c->expr1. */ + +void +gfc_add_len_component (gfc_code *c) +{ + /* Just make sure input is correct. This is already at the calling site, + but may be this routine is called from somewhere else in the furure. */ + gcc_assert (UNLIMITED_POLY(c->expr1) + && c->expr2 + && c->expr2->ts.type== BT_CHARACTER); + + gfc_component *len; + gfc_expr *e; + /* Check that _len is not present already. */ + if ((len= gfc_find_component (c->expr1->ts.u.derived, "_len", true, true))) +return; + /* Create the new component. */ + if (!gfc_add_component (c->expr1->ts.u.derived, "_len", &len)) +// Possible errors are already reported in add_component +return; + len->ts.type = BT_INTEGER; + len->ts.kind = 4; + len->attr.access = ACCESS_PRIVATE; + + /* Build minimal expression to initialize component with zero. */ + e = gfc_get_expr(); + e->ts = c->expr1->ts; + e->expr_type = EXPR_VARIABLE; + len->initializer = gfc_get_int_expr (gfc_default_integer_kind, + NULL, 0); + gfc_free_expr (e); +} /* Find (or generate) the symbol for an intrinsic type's vtab. This is needed to support unlimited polymorphism. */ @@ -2415,18 +2447,9 @@ find_intrinsic_vtab (gfc_typespec *ts) gfc_symbol *copy = NULL, *src = NULL, *dst = NULL; int charlen = 0; - if (ts->type == BT_CHARACTER) -{ - if (ts->deferred) - { - gfc_error ("TODO: Deferred character length variable at %C cannot " - "yet be associated with unlimited polymorphic entities"); - return NULL; - } - else if (ts->u.cl && ts->u.cl->length - && ts->u.cl->length->expr_type == EXPR_CONSTANT) - charlen = mpz_get_si (ts->u.cl->length->value.integer); -} + if (ts->type == BT_CHARACTER && !ts->deferred && ts->u.cl && ts->u.cl->length + && ts->u.cl->length->expr_type == EXPR_CONSTANT) +charlen = mpz_get_si (ts->u.cl->length->value.integer); /* Find the top-level namespace. */ for (ns = gfc_current_ns; ns; ns = ns->parent) @@ -2437,10 +2460,16 @@ find_intrinsic_vtab (gfc_typespec *ts) { char name[GFC_MAX_SYMBOL_LEN+1], tname[GFC_MAX_SYMBOL_LEN+1]; - if (ts->type == BT_CHARACTER) - sprintf (tname, "%s_%d_%d", gfc_basic_typename (ts->type), - charlen, ts->kind); - else + if (ts->type == BT_CHARACTER) { +if (!ts->deferred) + sprintf (tname, "%s_%d_%d", gfc_basic_typename (ts->type), + charlen, ts->kind); +else + /* The type is deferred here. Ensure that this is easily seen in the + vtable. */ + sprintf (tname, "%s_DEFERRED_%d", gfc_basic_typename (ts->type), + ts->kind); + } else sprintf (tname, "%s_%d_", gfc_basic_typename (ts->type), ts->kind); sprintf (name, "__vtab_%s", tname); diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 1058502..f99c3f8 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -3192,6 +3192,8 @@ gfc_expr *gfc_class_initializer (gfc_typespec *, gfc_expr *);
Re: RFA: patch to fix PR64157
On 2014-12-08 10:41 AM, Richard Sandiford wrote: Vladimir Makarov writes: The following patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64157 After calling target_reinit from save_target_globals for switchable targets (as ppc), a lot of ira data (register sets, classes etc) become undefined. After that ira-costs.c crashes when the undefined data are used. The patch was successfully bootstrapped and tested on x86-64. Ok to commit to the trunk? 2014-12-05 Vladimir Makarov PR rtl-optimization/64157 * toplev.c (target_reinit): Call ira_init. Index: toplev.c === --- toplev.c(revision 218378) +++ toplev.c(working copy) @@ -1888,6 +1888,8 @@ target_reinit (void) /* This invokes target hooks to set fixed_reg[] etc, which is mode-dependent. */ init_regs (); + /* Set IRA data depended on target parameters. */ + ira_init (); Could you give more details about how this happens? It's reverting part of: I don't know this code well, Richard. I've just see that IRA data is undefined after save_target_globals, e.g. max_struct_costs_size has value zero which never should happen. As save_target_globals calls target_reinit and target_reinit has a comment /* Reinitialize everything when target parameters, such as register usage, have changed. */ I think it is natural reinitialize data for IRA from here. I'd appreciate if somebody look at this problem as it is not RA one per se. It is very easy to reproduce even on x86 build for ppc target.
Re: [PATCH] influence JIT linker command line
On Sat, 2014-12-06 at 15:56 -0500, Ulrich Drepper wrote: > This patch supercedes the patch I sent earlier this week to add > dependencies to the linker command line. The implementation is > different. > > First, based on Dave's comment that he wants to keep the interface > simple, to enable the linker optimizations no new interface is added. > Instead optimizations are enabled in the linker whenever the compiler > optimizes, too. I don't think this will create problems at all since > the time it takes nowadays is really low; it's only really measurable > for extremely large files. > > The way to add dependencies is changed. Instead of allowing an > unstructured string parameter to be added to the command line the new > proposed interface allows to introduce a dependency with possibly > information about the path the dependency is found. This should be > useful and implementable if at some point the explicit linker invocation > is replaced by a library implementation. The path argument of the new > interface is used differently depending on the name. If the name is of > the form -l* then the -L option is used and a runpath is added. > Otherwise the path is used to locate the file. > > > Comments? Thanks. The special meaning of -l as a prefix to make the entrypoint have two behaviors seems wrong to me. Similarly, on reading the docs, the repeated uses of the word "path" as both a param name and a concept seem confusing, to me at least. I think I'd prefer splitting the proposed gcc_jit_context_add_dependency into two API entrypoints, one for the -l case, and another for the filesystem name case. I think the resulting API would be easier for the end-user to understand. Or is there a reason to have both of the -l vs non-dash-l uses go through the same entrypoint? "man ld" uses "namespec" as the term for the arg to -l, so maybe the API could be two entrypoints looking something like this: (A) extern void gcc_jit_context_requires_namespec (gcc_jit_context *ctxt, const char *namespec, const char *searchdir, /* can be NULL */ int flags); which becomes a "-lnamespec", and perhaps a "-Lsearchdir"/"-Rsearchdir" (putting flags to the end so that we can default them to 0 in the C++ bindings). and: (B) extern void gcc_jit_context_requires_library (gcc_jit_context *ctxt, const char *path, int flags); Do we need separate name and path params here? If the user has name and path, can't they simply supply us with the path to the object? (i.e. "path" can be absolute, relative, or just a filename)? Or do we need to distinguish the possiblity that the linker might be searching in its standard locations, hence something like: extern void gcc_jit_context_requires_library (gcc_jit_context *ctxt, const char *filename, const char *searchdir, /* can be NULL */ int flags); Again, I'm sorry if I'm not fully grokking your use-cases here. Some possible concrete examples: gcc_jit_context_requires_namespec (ctxt, "foo", NULL, 0); adds a -lfoo to the link line, looking in the default places. gcc_jit_context_requires_namespec (ctxt, "bar", "/opt/bar-2.0/lib", 0); adds a -lbar to the link line, looking within the given dir first. The same, but via the filesystem-based entrypoint: gcc_jit_context_requires_library (ctxt, "libfoo.so", 0); gcc_jit_context_requires_library (ctxt, "/opt/bar-2.0/libbar.so", 0); and if I understand you, we can also do other kinds of object: gcc_jit_context_requires_library (ctxt, "/opt/bar-2.0/libbar.a", 0); gcc_jit_context_requires_library (ctxt, "/opt/bar-2.0/libbar.o", 0); In the C++ binding API these would become just: ctxt.requires_namespec ("foo"); ctxt.requires_namespec ("bar", "/opt/bar-2.0/lib"); ctxt.requires_library ("libfoo.so"); ctxt.requires_library ("/opt/bar-2.0/libbar.so"); ctxt.requires_library ("/opt/bar-2.0/libbar.a"); ctxt.requires_library ("/opt/bar-2.0/libbar.o"); Thoughts? I'm not in love with the names I suggested (though sadly I've already claimed the word "object" to mean something in the API, when it has its own meaning in the linker world). Maybe: import_by_{namespec|path}? import_{namespec|path}? link_by_{namespec|path}? link_to_{namespec|path}? add_library_by_{namespec|path|filename}? etc Various other stuff inline below. > gcc/ChangeLog: > > 2014-12-06 Ulrich Drepper > > * jit/jit-recording.c (recording::context::add_dependency): > New function.
Go patch committed: fix crash on invalid program
This patch from Chris Manghane fixes a crash-on-invalid in the Go frontend. This is GCC PR 64198. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 2a85649c19e1 go/parse.cc --- a/go/parse.cc Sun Nov 30 17:28:00 2014 -0800 +++ b/go/parse.cc Mon Dec 08 10:01:38 2014 -0800 @@ -3190,9 +3190,12 @@ if (token->is_op(OPERATOR_COMMA)) token = this->advance_token(); if (!token->is_op(OPERATOR_RPAREN)) -error_at(this->location(), "missing %<)%>"); - else -this->advance_token(); +{ + error_at(this->location(), "missing %<)%>"); + if (!this->skip_past_error(OPERATOR_RPAREN)) + return Expression::make_error(this->location()); +} + this->advance_token(); if (func->is_error_expression()) return func; return Expression::make_call(func, args, is_varargs, func->location());
Re: [PATCH] [AArch64, RTL] Bics instruction generation for aarch64
On 08/12/14 10:33, Richard Earnshaw wrote: On 11/11/14 10:38, Alex Velenko wrote: From 98bb6d7323ce79e28be8ef892b919391ed857e1f Mon Sep 17 00:00:00 2001 From: Alex Velenko Date: Fri, 31 Oct 2014 18:43:32 + Subject: [PATCH] [AArch64, RTL] Bics instruction generation for aarch64 Hi, This patch adds rtl patterns for aarch64 to generate bics instructions in cases when caputed value gets discarded and only only the status regester change of the instruction gets reused. Previously, bics would only be generated, if the value computed by bics would later be reused, which is not necessarily the case when computing this value for "if" statements. Is this patch ok? Thanks, Alex gcc/ 2014-11-10 Alex Velenko * gcc/config/aarch64/aarch64.md (and_one_cmpl3_compare0_no_reuse): New define_insn. * (and_one_cmpl_3_compare0_no_reuse): Likewise. gcc/testsuite/ 2014-11-10 Alex Velenko * gcc.target/aarch64/bics1.c : New testcase. OK. R. Committed, Alex --- gcc/config/aarch64/aarch64.md | 26 gcc/testsuite/gcc.target/aarch64/bics_3.c | 69 +++ 2 files changed, 95 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/bics_3.c diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 341c26f..6158d82 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -2845,6 +2845,18 @@ [(set_attr "type" "logics_reg")] ) +(define_insn "*and_one_cmpl3_compare0_no_reuse" + [(set (reg:CC_NZ CC_REGNUM) +(compare:CC_NZ + (and:GPI (not:GPI + (match_operand:GPI 0 "register_operand" "r")) + (match_operand:GPI 1 "register_operand" "r")) + (const_int 0)))] + "" + "bics\\tzr, %1, %0" + [(set_attr "type" "logics_reg")] +) + (define_insn "*_one_cmpl_3" [(set (match_operand:GPI 0 "register_operand" "=r") (LOGICAL:GPI (not:GPI @@ -2894,6 +2906,20 @@ [(set_attr "type" "logics_shift_imm")] ) +(define_insn "*and_one_cmpl_3_compare0_no_reuse" + [(set (reg:CC_NZ CC_REGNUM) +(compare:CC_NZ + (and:GPI (not:GPI + (SHIFT:GPI +(match_operand:GPI 0 "register_operand" "r") +(match_operand:QI 1 "aarch64_shift_imm_" "n"))) + (match_operand:GPI 2 "register_operand" "r")) + (const_int 0)))] + "" + "bics\\tzr, %2, %0, %1" + [(set_attr "type" "logics_shift_imm")] +) + (define_insn "clz2" [(set (match_operand:GPI 0 "register_operand" "=r") (clz:GPI (match_operand:GPI 1 "register_operand" "r")))] diff --git a/gcc/testsuite/gcc.target/aarch64/bics_3.c b/gcc/testsuite/gcc.target/aarch64/bics_3.c new file mode 100644 index 000..ecb53e9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/bics_3.c @@ -0,0 +1,69 @@ +/* { dg-do run } */ +/* { dg-options "-O2 --save-temps" } */ + +extern void abort (void); + +int __attribute__ ((noinline)) +bics_si_test (int a, int b) +{ + if (a & ~b) +return 1; + else +return 0; +} + +int __attribute__ ((noinline)) +bics_si_test2 (int a, int b) +{ + if (a & ~ (b << 2)) +return 1; + else +return 0; +} + +typedef long long s64; + +int __attribute__ ((noinline)) +bics_di_test (s64 a, s64 b) +{ + if (a & ~b) +return 1; + else +return 0; +} + +int __attribute__ ((noinline)) +bics_di_test2 (s64 a, s64 b) +{ + if (a & ~(b << 2)) +return 1; + else +return 0; +} + +int +main (void) +{ + int a = 5; + int b = 5; + int c = 20; + s64 d = 5; + s64 e = 5; + s64 f = 20; + if (bics_si_test (a, b)) +abort (); + if (bics_si_test2 (c, b)) +abort (); + if (bics_di_test (d, e)) +abort (); + if (bics_di_test2 (f, e)) +abort (); + return 0; +} + +/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+" 2 } } */ +/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+, lsl 2" 1 } } */ +/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+" 2 } } */ +/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+, lsl 2" 1 } } */ + +/* { dg-final { cleanup-saved-temps } } */
[PATCH, committed] Make jit/notes.txt better reflect current status quo
The high-level overview diagram in jit/notes.txt showed the release of the JIT mutex in where I'd like it to be, rather than where it currently is. Fix the diagram to reflect reality, and clarify various other things. Committed to trunk as r218488. gcc/jit/ChangeLog: * notes.txt: Show the beginning and ending of recording::context::compile vs playback::context::compile. Show the creation and unlinking of the tempdir. Show toplev::finalize. Move "RELEASE MUTEX" to the correct location. Show gcc_jit_result_release, and indicate where the dlopen/dlsym/dlclose occur. --- gcc/jit/notes.txt | 38 ++ 1 file changed, 34 insertions(+), 4 deletions(-) diff --git a/gcc/jit/notes.txt b/gcc/jit/notes.txt index d337cb4..12bb6d6 100644 --- a/gcc/jit/notes.txt +++ b/gcc/jit/notes.txt @@ -18,9 +18,13 @@ Client Code . Generated .libgccjit.so │ . . . . V . . gcc_jit_context_compile . ──> . . + . .│ start of recording::context::compile () . .│ . . . .│ ACQUIRE MUTEX . . .│ . . + . .│ start of playback::context::compile () + . .│ (create tempdir) . + . .│ . . . .V───> toplev::main (for now) . . . . │ . . . . (various code) @@ -65,13 +69,33 @@ Client Code . Generated .libgccjit.so . . . . (the middle─end and backend) . . . . ↓ . .<─ end of toplev::main - . .│ RELEASE MUTEX . . .│ . . - . .│ Convert assembler to DSO + . .V───> toplev::finalize + . . . . │ (purge internal state) + . .< end of toplev::finalize + . .│ . . + . .│ Convert assembler to DSO ("fake.so") + . .│ . . + . .│ Load DSO (dlopen "fake.so") . .│ . . - . .│ Load DSO. + . .│ end of playback::context::compile () + . .│ . . + . .│ playback::context dtor + . . ──> . . + . . │ Cleanup tempdir . + . . │ ("fake.so" is unlinked from the + . . │filesystem at this point) + . .<── . . + . .│ . . + . .│ RELEASE MUTEX . + . .│ . . + . .│ end of recording::context::compile () <─── . . │ . . . . + V . . gcc_jit_result_get_code . +──> . . + . .│ dlsym () within loaded DSO + <─── . . Get (void*). . . . │ . . . . │ Call it . . . . @@ -80,5 +104,11 @@ Client Code . Generated .libgccjit.so .│ . . . <─── . . . │ . . . . +etc│ . . . . + │ . . . . + V . . gcc_jit_result_release . +──> . . + . .│ dlclose () the loaded DSO + . .│(code becomes uncallable) + <─── . . │ . . . . -etc -- 1.8.5.3
Re: [Patch] Improving jump-thread pass for PR 54742
On 12/06/14 06:47, Sebastian Pop wrote: Jeff Law wrote: OK to commit. Thanks for your patience. Can you follow-up with a change which throttles this optimization when -Os is in effect. You can check optimize_function_for_size_p (cfun) and simply avoid the backward traversal or you could allow it in that case if the amount of copying is suitably small. Your call. I think it does not make sense to duplicate paths at -Os: I disabled the FSM jump-threading when optimizing for size like this. diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index 29b20c8..ce70311 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -1335,8 +1335,9 @@ thread_through_normal_block (edge e, return 1; } if (!flag_expensive_optimizations + || optimize_function_for_size_p (cfun) || TREE_CODE (cond) != SSA_NAME || e->dest->loop_father != e->src->loop_father || loop_depth (e->dest->loop_father) == 0) return 0; I will regstrap and commit the attached patch. Looks good to me. Thanks for taking care of it. Jeff
[PATCH, committed] libgccjit++.h: use indentation to show inheritance
Committed to trunk as r218489. gcc/jit/ChangeLog: * libgccjit++.h: Indent the forward declarations of the classes to show the inheritance hierarchy. --- gcc/jit/libgccjit++.h | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/gcc/jit/libgccjit++.h b/gcc/jit/libgccjit++.h index 67ed5d5..7f6718b 100644 --- a/gcc/jit/libgccjit++.h +++ b/gcc/jit/libgccjit++.h @@ -32,16 +32,19 @@ along with GCC; see the file COPYING3. If not see namespace gccjit { + /* Indentation indicates inheritance. */ class context; - class location; - class field; - class type; - class struct_; - class param; - class function; - class block; - class rvalue; - class lvalue; + class error; + class object; +class location; +class field; +class type; + class struct_; +class function; +class block; +class rvalue; + class lvalue; + class param; /* Errors within the API become C++ exceptions of this class. */ class error -- 1.8.5.3
Re: [PATCH] Fix PR 61225
On 12/05/14 17:16, Segher Boessenkool wrote: On Fri, Dec 05, 2014 at 03:31:54PM -0700, Jeff Law wrote: Combine does not consider combining 9 into 7 because there is no LOG_LINK between them (the link for r88 is between 8 and 7 already). OK, yea, that's a long standing design decision. We don't feed a single def into multiple use sites. There is no real reason not to do that. It doesn't increase computational complexity, although it is of course more expensive than what combine does today (it is more work, after all). And combining with a later use does not have too big a chance to succeed (since it has to keep the result of the earlier insn around always). No fundamental reason, it's just always been that way. One could argue that a bridge pattern often makes this unnecessary and bridges have been a well known way to work around combine's failings for a long time. GCC 6 or later ;-) Yea, I think so :) jeff
Re: Compare-elim pass
On 12/05/14 17:58, Segher Boessenkool wrote: On Fri, Dec 05, 2014 at 01:40:56PM -0700, Jeff Law wrote: My first thought would be to allow both and have combine swap the order in the vector if recog doesn't recognize the pattern. One could argue we could go through a full permutation of ordering in the vector, but that's probably above and beyond the call of duty. Combine also expects a certain ordering for its *inputs*. It might not be the only pass that does this either. All targets (where this works) have the compare first in all their patterns. This goes back, what, 20+ years? It might not be "officially" canonical, but it's the only ordering that works everywhere. Yup, aware that inputs can be an issue as well. But again, there's no reason we can't handle those too. I guess in the end the first question is whether or not we want a canonical form. I can see arguments either way. That answer drives the next set of questions/issues to work through.
Re: [patch] Fix tilepro includes
On 12/8/2014 11:23 AM, Jan-Benedict Glaw wrote: > On Fri, 2014-11-21 08:45:11 -0500, Andrew MacLeod wrote: >> During the flattening of optabs.h, I updated all the config/* files >> which were affected. I've been getting spurious failures with >> config-list.mk where my changes would "disappear" and tracked down >> why. >> >> I was blissfully unaware that the tilepro ports mul-tables.c file is >> actually generated from gen-mul-tables.cc. >> >> This patch fixes the include issue by adding "#include insn-codes.h" >> to the generated files. I also added a comment indicating these are >> generated files, and to make changes in the generator. >> >> This allows all the tile* ports to compile properly again. >> >> OK for trunk? > Seems this wasn't ever ACKed or applied up to now? I'm still seeing > compilation errors for the tile targets, see eg. > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=382169 Sorry about that. Looks good to me. Walter
Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name
On 12/08/14 04:00, Ilya Enkovich wrote: 2014-11-26 Ilya Enkovich PR bootstrap/63995 * tree-chkp.c (chkp_make_static_bounds): Share bounds var between nodes sharing assembler name. gcc/testsuite 2014-11-26 Ilya Enkovich PR bootstrap/63995 * g++.dg/dg.exp: Add mpx-dg.exp. * g++.dg/pr63995-1.C: New. Code seems fine. Not sure how the testcase is actually testing that you're sharing bounds though. The test AFAICT just verifies that it can be compiled and doesn't checkout the output in any way. Or is it the case that the test will ICE or segfault without your fix? Jeff
Re: [PATCH] Fix size & type for cold partition names (hot-cold function partitioning)
On 12/05/14 15:41, Caroline Tice wrote: When hot/cold function splitting occurs, a symbol is generated for the cold partition, and gets output in the assembly & debug info, but the symbol currently gets a size of 0 and a type of NOTYPE, as in this example (on x86_64-linux) from the cold_partition_label test in the testsuite: $ readelf -sW cold_partition_label.x02 | grep foo 36: 00400450 0 NOTYPE LOCAL DEFAULT 12 foo.cold.0 58: 0040049043 FUNCGLOBAL DEFAULT 12 foo $ This patch fixes this by calculating the right size for the partition, and outputing the size and type fo the cold partition symbol. After applying this patch and looking at the same test, I get: $ readelf -sW cold_partition_label.x02 | grep foo 36: 0040045029 FUNCLOCAL DEFAULT 12 foo.cold.0 58: 0040049043 FUNCGLOBAL DEFAULT 12 foo $ This patch has been tested by bootstrapping the compiler, running the dejagnu testsuite with no regressions, and checked as shown above that it fixes the original problem. Is this patch OK to commit to ToT? -- Caroline Tice cmt...@google.com 2014-12-05 Caroline Tice * final.c (final_scan_insn): Change 'cold_function_name' to 'cold_partition_name' and make it a global variable; also output assembly to give it a 'FUNC' type, if appropriate. * varasm.c (cold_partition_name): Declare and initialize global variable. (assemble_start_function): Re-set value for cold_partition_name. (assemble_end_function): Output assembly to calculate size of cold partition, and associate size with name, if appropriate. * varash.h (cold_partition_name): Add extern declaration for global variable. OK for the trunk. I'm ever-so-slightly worried about the PA and PTX ports are they're probably the most sensitive to types. But we'll deal with them if they complain. Adding a testcase would be helpful. I believe some of the LTO tests use readelf, so there should be some infrastructure you can factor out and reuse. jeff
Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name
2014-12-09 0:00 GMT+03:00 Jeff Law : > On 12/08/14 04:00, Ilya Enkovich wrote: >>> >>> 2014-11-26 Ilya Enkovich >>> >>> PR bootstrap/63995 >>> * tree-chkp.c (chkp_make_static_bounds): Share bounds var >>> between nodes sharing assembler name. >>> >>> gcc/testsuite >>> >>> 2014-11-26 Ilya Enkovich >>> >>> PR bootstrap/63995 >>> * g++.dg/dg.exp: Add mpx-dg.exp. >>> * g++.dg/pr63995-1.C: New. > > Code seems fine. Not sure how the testcase is actually testing that you're > sharing bounds though. The test AFAICT just verifies that it can be > compiled and doesn't checkout the output in any way. Or is it the case that > the test will ICE or segfault without your fix? Without this fix compiler emits two symbols with the same name and assembler complains. Something like that: /tmp/cceHGHa1.s: Assembler messages: /tmp/cceHGHa1.s:12080: Error: symbol `__chkp_bounds_of_mode_inner' is already defined /tmp/cceHGHa1.s:12086: Error: symbol `__chkp_bounds_of_mode_nunits' is already defined Thanks, Ilya > > > Jeff > >
Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call
On 12/01/14 11:44, John David Anglin wrote: To do unconditionally set frame_read? Or if we don't want to get that drastic, if (reload_completed || SIBLING_CALL_P (insn)) insn_info->frame_read = true; Will test but I, if I read the code correctly, setting insn_info->frame_read = true results in frame related stores being killed in a constant call. This doesn't seem like the right solution. But isn't killing in this context referring to GEN/KILL types of operations for global dataflow? ISTM that GEN is a store operation while KILL is a load operation (over-simplification, but stick with me). Thus a GEN-GEN will get optimized into a single GEN (dead store eliminated) while a GEN-KILL-GEN can not be optimized by DSE because of the KILL. It should always be safe to have an extraneous KILL since that merely inhibits optimization. While an extraneous GEN can be a disaster. So with setting frame_read, we're going to have more KILLs in the dataflow sets, which results in fewer stores being eliminated. They come from: /* If the frame is read, the frame related stores are killed. */ else if (insn_info->frame_read) { store_info_t store_info = i_ptr->store_rec; /* Skip the clobbers. */ while (!store_info->is_set) store_info = store_info->next; if (store_info->group_id >= 0 && rtx_group_vec[store_info->group_id]->frame_related) remove_store = true; } if (remove_store) { if (dump_file && (dump_flags & TDF_DETAILS)) dump_insn_info ("removing from active", i_ptr); active_local_stores_len--; if (last) last->next_local_store = i_ptr->next_local_store; else active_local_stores = i_ptr->next_local_store; } else last = i_ptr; i_ptr = i_ptr->next_local_store; AFAICT in this loop, setting FRAME_READ causes us to set REMOVE_STORE more often. REMOVE_STORE in this context seems to indicate that we're going to remove a store from the list we are tracking for potential removal. Which seems exactly right. Here as well: /* If this insn reads the frame, kill all the frame related stores. */ if (insn_info->frame_read) { FOR_EACH_VEC_ELT (rtx_group_vec, i, group) if (group->process_globally && group->frame_related) { if (kill) bitmap_ior_into (kill, group->group_kill); bitmap_and_compl_into (gen, group->group_kill); } } Which also seems like exactly what we want since when we set FRAME_READ we end up with a bigger KILL set and a smaller GEN set. I think the terminology and variable names certainly makes this tougher to follow than it should. Here we have frame related calls being killed before reload because the argument stores for the sibcall are off frame: /* Set the gen set of the exit block, and also any block with no successors that does not have a wild read. */ static void dse_step3_exit_block_scan (bb_info_t bb_info) { /* The gen set is all 0's for the exit block except for the frame_pointer_group. */ if (stores_off_frame_dead_at_return) { unsigned int i; group_info_t group; FOR_EACH_VEC_ELT (rtx_group_vec, i, group) { if (group->process_globally && group->frame_related) bitmap_ior_into (bb_info->gen, group->group_kill); } } } I see your point. Expanding the GEN set here seems wrong.If we go with setting STORES_OFF_FRAME_DEAD_AT_RETURN, then we definitely need to update its documentation. I think an argument could be made that we want both changes if I've interpreted this code correctly. Jeff
Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call
On 12/04/14 08:50, John David Anglin wrote: On 12/1/2014 11:57 AM, Jeff Law wrote: Prior to reload (ie, in DSE1) there's a bit of magic in that we do not set frame_read on call insns. That may in fact be wrong and possibly the source of the problem. /* This field is only used for the processing of const functions. These functions cannot read memory, but they can read the stack because that is where they may get their parms. We need to be this conservative because, like the store motion pass, we don't consider CALL_INSN_FUNCTION_USAGE when processing call insns. Moreover, we need to distinguish two cases: 1. Before reload (register elimination), the stores related to outgoing arguments are stack pointer based and thus deemed of non-constant base in this pass. This requires special handling but also means that the frame pointer based stores need not be killed upon encountering a const function call. The above is wrong for sibcalls. Sibcall arguments are relative to the incoming argument pointer. Is this always the frame pointer? I don't think it's always the frame pointer. Don't we use an argument pointer for the PA64 runtime? If I recall, it was the only port that had a non-eliminable argument pointer at the time. If that is the case, then it may be possible to eliminate stack based stores when we have a sibcall before reload. if (reload_completed || SIBLING_CALL_P (insn)) insn_info->frame_read = true; This works. Dave
Re: [PATCH] Fix PR 61225
On 12/04/14 01:43, Zhenqiang Chen wrote: > > > > Part of PR rtl-optimization/61225 > > * combine.c (refer_same_reg_p): New function. > > (combine_instructions): Handle I1 -> I2 -> I3; I2 -> insn. > > (try_combine): Add one more parameter TO_COMBINED_INSN, which >is > > used to create a new insn parallel (TO_COMBINED_INSN, I3). > > > >testsuite/ChangeLog: > >2014-08-04 Zhenqiang Chen > > > > * gcc.target/i386/pr61225.c: New test. THanks for the updates and clarifications. Just a few minor things and while it's a bit of a hack, I'll approve: + +/* A is a compare (reg1, 0) and B is SINGLE_SET which SET_SRC is reg2. + It returns TRUE, if reg1 == reg2, and no other refer of reg1 + except A and B. */ + +static bool +refer_same_reg_p (rtx_insn *a, rtx_insn *b) +{ + rtx seta = single_set (a); + rtx setb = single_set (b); + + if (BLOCK_FOR_INSN (a) != BLOCK_FOR_INSN (b) + || !seta + || !setb) +return false; + + if (GET_CODE (SET_SRC (seta)) != COMPARE + || GET_MODE_CLASS (GET_MODE (SET_DEST (seta))) != MODE_CC + || !REG_P (XEXP (SET_SRC (seta), 0)) + || XEXP (SET_SRC (seta), 1) != const0_rtx + || !REG_P (SET_SRC (setb)) + || REGNO (SET_SRC (setb)) != REGNO (XEXP (SET_SRC (seta), 0))) +return false; Do you need to verify SETA and SETB satisfy single_set? Or has that already been done elsewhere? The name refer_same_reg_p seems wrong -- your function is verifying the underlying RTL store as well as the existence of a a dependency between the insns. Can you try to come up with a better name? Please use CONST0_RTX (mode) IIRC that'll allow this to work regardless of the size of the modes relative to the host word size. + + if (DF_REG_USE_COUNT (REGNO (SET_SRC (setb))) > 2) +{ + df_ref use; + rtx insn; + unsigned int i = REGNO (SET_SRC (setb)); + + for (use = DF_REG_USE_CHAIN (i); use; use = DF_REF_NEXT_REG (use)) +{ + insn = DF_REF_INSN (use); + if (insn != a && insn != b && !(NOTE_P (insn) || DEBUG_INSN_P (insn))) + return false; + } +} + + return true; +} Is this fragment really needed? Does it ever trigger? I'd think that for > 2 uses punting would be fine. Do we really commonly have cases with > 2 uses, but where they're all in SETA and SETB? } } + /* Try to combine a compare insn that sets CC +with a preceding insn that can set CC, and maybe with its +logical predecessor as well. +We need this special code because data flow connections +do not get entered in LOG_LINKS. */ + if ((prev = prev_nonnote_nondebug_insn (insn)) != NULL_RTX + && refer_same_reg_p (insn, prev) + && max_combine >= 4) + { + struct insn_link *next1; + FOR_EACH_LOG_LINK (next1, prev) + { + rtx_insn *link1 = next1->insn; + if (NOTE_P (link1)) + continue; + /* I1 -> I2 -> I3; I2 -> insn; + output parallel (insn, I3). */ + FOR_EACH_LOG_LINK (nextlinks, link1) + if ((next = try_combine (prev, link1, + nextlinks->insn, NULL, + &new_direct_jump_p, + last_combined_insn, insn)) != 0) + + { + delete_insn (insn); + insn = next; + statistics_counter_event (cfun, "four-insn combine", 1); + goto retry; + } + /* I2 -> I3; I2 -> insn + output next = parallel (insn, I3). */ + if ((next = try_combine (prev, link1, +NULL, NULL, +&new_direct_jump_p, +last_combined_insn, insn)) != 0) + + { + delete_insn (insn); + insn = next; + statistics_counter_event (cfun, "three-insn combine", 1); + goto retry; + } + } + } So you've got two new combine cases here, but I think the testcase only tests one of them. Can you include a testcase for both of hte major paths above (I1->I2->I3; I2->insn and I2->I3; I2->INSN) Please make those changes and repost for final approval. jeff
Re: [PATCH, CHKP] Fix instrumentation clones privatization
On 12/02/14 07:18, Ilya Enkovich wrote: Hi, Currently symbol names privatization doesn't work for instrumentation clones because clones assembler name is transparent alias and therefore alias target should be privatized instead. This patch does it. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- gcc/ 2014-12-02 Ilya Enkovich * lto/lto-partition.c (privatize_symbol_name): Correctly privatize instrumentation clones. gcc/testsuite/ 2014-12-02 Ilya Enkovich * gcc.dg/lto/lto.exp: Load mpx-dg.exp. * gcc.dg/lto/chkp-privatize_0.c: New. * gcc.dg/lto/chkp-privatize_1.c: New. OK. jeff
Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name
On 12/08/14 14:12, Ilya Enkovich wrote: 2014-12-09 0:00 GMT+03:00 Jeff Law : On 12/08/14 04:00, Ilya Enkovich wrote: 2014-11-26 Ilya Enkovich PR bootstrap/63995 * tree-chkp.c (chkp_make_static_bounds): Share bounds var between nodes sharing assembler name. gcc/testsuite 2014-11-26 Ilya Enkovich PR bootstrap/63995 * g++.dg/dg.exp: Add mpx-dg.exp. * g++.dg/pr63995-1.C: New. Code seems fine. Not sure how the testcase is actually testing that you're sharing bounds though. The test AFAICT just verifies that it can be compiled and doesn't checkout the output in any way. Or is it the case that the test will ICE or segfault without your fix? Without this fix compiler emits two symbols with the same name and assembler complains. Something like that: /tmp/cceHGHa1.s: Assembler messages: /tmp/cceHGHa1.s:12080: Error: symbol `__chkp_bounds_of_mode_inner' is already defined /tmp/cceHGHa1.s:12086: Error: symbol `__chkp_bounds_of_mode_nunits' is already defined Ah OK. Ok for the trunk. Thanks for your patience, Jeff
[PATCH] Fix broadcast from scalar patterns (PR target/63594)
Hi! This patch attempts to fix (set (reg:V*) (vec_duplicate:V* (reg/mem:))) patterns. One issue is that there were separate patterns for broadcast from gpr and separate patterns for broadcast from memory (and vector reg), that isn't a good idea for reload, which can't then freely choose. Another issue is that some pre-AVX2 broadcast patterns were present above the avx512vl broadcast patterns, so again, reload didn't have the possibility to use %xmm16-31/%ymm16-31 registers. Also, the splitter written for AVX2 broadcasts from gpr went into the way of AVX512VL broadcasts. And finally, the avx512*intrin.h headers were using #ifdef TARGET_64BIT, macro not used anywhere (probably meant to write __x86_64__ instead, but with the patch we actually just have one set of builtins. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-12-08 Jakub Jelinek PR target/63594 * config/i386/sse.md (vec_dupv4sf): Move after _vec_dup_gpr pattern. (*vec_dupv4si, *vec_dupv2di): Likewise. (_vec_dup_mem): Merge into ... (_vec_dup_gpr): ... this pattern. (*vec_dup AVX2_VEC_DUP_MODE splitter): Disable for TARGET_AVX512VL (for QI/HI scalar modes only if TARGET_AVX512BW is set too). * config/i386/i386.c (enum ix86_builtins): Remove IX86_BUILTIN_PBROADCASTQ256_MEM_MASK, IX86_BUILTIN_PBROADCASTQ128_MEM_MASK and IX86_BUILTIN_PBROADCASTQ512_MEM. (bdesc_args): Use __builtin_ia32_pbroadcastq512_gpr_mask, __builtin_ia32_pbroadcastq256_gpr_mask and __builtin_ia32_pbroadcastq128_gpr_mask instead of *_mem_mask regardless of OPTION_MASK_ISA_64BIT. * config/i386/avx512fintrin.h (_mm512_set1_epi64, _mm512_mask_set1_epi64, _mm512_maskz_set1_epi64): Use *_gpr_mask builtins regardless of whether TARGET_64BIT is defined or not. * config/i386/avx512vlintrin.h (_mm256_mask_set1_epi64, _mm256_maskz_set1_epi64, _mm_mask_set1_epi64, _mm_maskz_set1_epi64): Likewise. --- gcc/config/i386/sse.md.jj 2014-12-03 11:52:41.0 +0100 +++ gcc/config/i386/sse.md 2014-12-08 13:26:06.505543457 +0100 @@ -6319,22 +6319,6 @@ (define_insn "avx512f_vec_dup_1" (set_attr "prefix" "evex") (set_attr "mode" "")]) -(define_insn "vec_dupv4sf" - [(set (match_operand:V4SF 0 "register_operand" "=x,x,x") - (vec_duplicate:V4SF - (match_operand:SF 1 "nonimmediate_operand" "x,m,0")))] - "TARGET_SSE" - "@ - vshufps\t{$0, %1, %1, %0|%0, %1, %1, 0} - vbroadcastss\t{%1, %0|%0, %1} - shufps\t{$0, %0, %0|%0, %0, 0}" - [(set_attr "isa" "avx,avx,noavx") - (set_attr "type" "sseshuf1,ssemov,sseshuf1") - (set_attr "length_immediate" "1,0,1") - (set_attr "prefix_extra" "0,1,*") - (set_attr "prefix" "vex,vex,orig") - (set_attr "mode" "V4SF")]) - ;; Although insertps takes register source, we prefer ;; unpcklps with register source since it is shorter. (define_insn "*vec_concatv2sf_sse4_1" @@ -12821,37 +12805,6 @@ (define_split operands[1] = adjust_address (operands[1], mode, offs); }) -(define_insn "*vec_dupv4si" - [(set (match_operand:V4SI 0 "register_operand" "=x,x,x") - (vec_duplicate:V4SI - (match_operand:SI 1 "nonimmediate_operand" " x,m,0")))] - "TARGET_SSE" - "@ - %vpshufd\t{$0, %1, %0|%0, %1, 0} - vbroadcastss\t{%1, %0|%0, %1} - shufps\t{$0, %0, %0|%0, %0, 0}" - [(set_attr "isa" "sse2,avx,noavx") - (set_attr "type" "sselog1,ssemov,sselog1") - (set_attr "length_immediate" "1,0,1") - (set_attr "prefix_extra" "0,1,*") - (set_attr "prefix" "maybe_vex,vex,orig") - (set_attr "mode" "TI,V4SF,V4SF")]) - -(define_insn "*vec_dupv2di" - [(set (match_operand:V2DI 0 "register_operand" "=x,x,x,x") - (vec_duplicate:V2DI - (match_operand:DI 1 "nonimmediate_operand" " 0,x,m,0")))] - "TARGET_SSE" - "@ - punpcklqdq\t%0, %0 - vpunpcklqdq\t{%d1, %0|%0, %d1} - %vmovddup\t{%1, %0|%0, %1} - movlhps\t%0, %0" - [(set_attr "isa" "sse2_noavx,avx,sse3,noavx") - (set_attr "type" "sselog1,sselog1,sselog1,ssemov") - (set_attr "prefix" "orig,vex,maybe_vex,orig") - (set_attr "mode" "TI,TI,DF,V4SF")]) - (define_insn "*vec_concatv2si_sse4_1" [(set (match_operand:V2SI 0 "register_operand" "=Yr,*x,x, Yr,*x,x, x, *y,*y") (vec_concat:V2SI @@ -16665,46 +16618,78 @@ (define_insn "avx512f_broa (set_attr "mode" "")]) (define_insn "_vec_dup_gpr" - [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v") + [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v,v") (vec_duplicate:VI12_AVX512VL - (match_operand: 1 "register_operand" "r")))] + (match_operand: 1 "nonimmediate_operand" "vm,r")))] "TARGET_AVX512BW" - "vpbroadcast\t{%k1, %0|%0, %k1}" + "@ + vpbroadcast\t{%1, %0|%0, %1} + vpbroadcast\t{%k1, %0|%0, %k1}" [(set_attr "type" "ssemov") (set_attr "prefix" "evex") (set_attr "mode" "")]) (define
[PATCH] Two more spots for
Hi! I've noticed two new spots which use GET_MODE_SIZE (mode) instead of the more efficient . Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-12-08 Jakub Jelinek * config/i386/sse.md (*mov_internal, *avx512f_gatherdi_2): Use instead of GET_MODE_SIZE (mode). --- gcc/config/i386/sse.md.jj 2014-12-03 11:52:41.0 +0100 +++ gcc/config/i386/sse.md 2014-12-08 13:00:39.167141532 +0100 @@ -811,7 +811,7 @@ (define_insn "*mov_internal" /* There is no evex-encoded vmov* for sizes smaller than 64-bytes in avx512f, so we need to use workarounds, to access sse registers 16-31, which are evex-only. In avx512vl we don't need workarounds. */ - if (TARGET_AVX512F && GET_MODE_SIZE (mode) < 64 && !TARGET_AVX512VL + if (TARGET_AVX512F && < 64 && !TARGET_AVX512VL && ((REG_P (operands[0]) && EXT_REX_SSE_REGNO_P (REGNO (operands[0]))) || (REG_P (operands[1]) && EXT_REX_SSE_REGNO_P (REGNO (operands[1]) { @@ -18208,7 +18196,7 @@ (define_insn "*avx512f_gatherdi_2" { if (mode != mode) { - if (GET_MODE_SIZE (mode) != 64) + if ( != 64) return "vgatherq\t{%5, %x0%{%1%}|%t0%{%1%}, %g5}"; else return "vgatherq\t{%5, %t0%{%1%}|%t0%{%1%}, %g5}"; Jakub
Re: [PATCH, CHKP] Keep instrumentation references consistency when merging cgraph nodes
On 12/02/14 06:54, Ilya Enkovich wrote: Hi, Currently cgraph nodes merge may break instrumented_version references. It depends on order nodes are read and merged and therefore problem is not nicely reproducible. I couldn't make a small reproducer. This patch fixes problem and 253.perlbmk benchmark build with '-O3 -flto -fcheck-pointer-bounds -mmpx'. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- 2014-12-02 Ilya Enkovich * lto-cgraph.c (input_cgraph_1): Don't break existing instrumentation clone references. * lto/lto-symtab.c (lto_cgraph_replace_node): Redirect instrumented_version references appropriately. OK. Jeff
Re: [PATCHv5][PING^4] Vimrc config with GNU formatting
On 11/28/14 10:26, Yury Gribov wrote: commit 3f560e9dd16a5e914b6f2ba82edffe13dfde944c Author: Yury Gribov Date: Thu Oct 2 15:50:52 2014 +0400 2014-10-02 Laurynas Biveinis Yury Gribov Vim config with GNU formatting. contrib/ * vimrc: New file. / * .gitignore: Added .local.vimrc and .lvimrc. * Makefile.tpl (vimrc, .lvimrc, .local.vimrc): New targets. * Makefile.in: Regenerate. OK, primarily because it's an improvement and not the default :-) Folks that want to use can explicitly opt-in. Jeff
PING: Re: [PATCH 05/05] Add command-line option-parsing to jit testcases
On Tue, 2014-11-25 at 20:34 -0500, David Malcolm wrote: > Add command-line option-parsing to the testcases, so that we can > manipulate them without needing a recompile (e.g. varying > optimization levels etc). > > This uses getopt_long, which is a GNU extension to libc. Is that > acceptable? Ping. Specifically, is it acceptable to use getopt_long within the jit testcases, or should I find another way to do this (e.g. just getopt)? Sorry that it wasn't obvious that I was asking for review on this one. > Implement a --num-iterations option, to override the default of 5. > > When running tests under valgrind, pass in "--num-iterations=1", > speeding up the tests by a factor of 5. > > gcc/testsuite/ChangeLog: > * jit.dg/harness.h (num_iterations): New variable. > (parse_options): New function. > (main): Use "num_iterations", rather than hardcoding 5. > * jit.dg/jit.exp (fixed_host_execute): When running tests under > valgrind, pass in "--num-iterations=1". > --- > gcc/testsuite/jit.dg/harness.h | 46 > +++--- > gcc/testsuite/jit.dg/jit.exp | 5 + > 2 files changed, 48 insertions(+), 3 deletions(-) > > diff --git a/gcc/testsuite/jit.dg/harness.h b/gcc/testsuite/jit.dg/harness.h > index bff64de..a30b66e 100644 > --- a/gcc/testsuite/jit.dg/harness.h > +++ b/gcc/testsuite/jit.dg/harness.h > @@ -247,17 +247,57 @@ extract_progname (const char *argv0) > } > > #ifndef TEST_PROVIDES_MAIN > + > +/* Option parsing, for varying how we run the built testcases > + (e.g. when poking at things in the debugger). */ > + > +#include > +int num_iterations = 5; > + > +static void > +parse_options (int argc, char **argv) > +{ > + while (1) > +{ > + static struct option long_options[] = > + { > + {"num-iterations", required_argument, 0, 'i'}, > + {0, 0, 0, 0} > + }; > + > + int option_index = 0; > + /* getopt_long is a GNU extension to libc. */ > + int c = getopt_long (argc, argv, "i:", > +long_options, &option_index); > + if (c == -1) > + break; > + > + switch (c) > + { > + case 'i': > + num_iterations = atoi (optarg); > + break; > + default: > + fprintf (stderr, "Usage: %s [--num-iterations NUM]\n", > +argv[0]); > + exit(EXIT_FAILURE); > + } > +} > +} > + > int > main (int argc, char **argv) > { >int i; > > - for (i = 1; i <= 5; i++) > + parse_options (argc, argv); > + > + for (i = 1; i <= num_iterations; i++) > { >snprintf (test, sizeof (test), > "%s iteration %d of %d", > -extract_progname (argv[0]), > -i, 5); > + extract_progname (argv[0]), > + i, num_iterations); > >//printf ("ITERATION %d\n", i); >test_jit (argv[0], NULL); > diff --git a/gcc/testsuite/jit.dg/jit.exp b/gcc/testsuite/jit.dg/jit.exp > index a37ccc7..438aabd 100644 > --- a/gcc/testsuite/jit.dg/jit.exp > +++ b/gcc/testsuite/jit.dg/jit.exp > @@ -157,6 +157,11 @@ proc fixed_host_execute {args} { > set valgrind_params {"valgrind"} > lappend valgrind_params "--leak-check=full" > lappend valgrind_params "--log-file=${valgrind_logfile}" > + # When running under valgrind, speed things up by only running one > + # in-process iteration of the testcase, rather than the default of 5. > + # Only testcases that use the "main" from harness.h > + # (#ifndef TEST_PROVIDES_MAIN) will respond to this option. > + lappend params "--num-iterations=1" > } else { > set valgrind_params {} > }
Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call
On 12/8/2014 3:49 PM, Jeff Law wrote: I think the terminology and variable names certainly makes this tougher to follow than it should. I certainly agree. When I first looked at the code, I thought it was completely backwards. Dave -- John David Anglindave.ang...@bell.net
Re: [Patch] Improving jump-thread pass for PR 54742
On Sat, 2014-12-06 at 19:21 +, Sebastian Pop wrote: > > I think it does not make sense to duplicate paths at -Os: I disabled the FSM > > jump-threading when optimizing for size like this. > > > > diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c > > index 29b20c8..ce70311 100644 > > --- a/gcc/tree-ssa-threadedge.c > > +++ b/gcc/tree-ssa-threadedge.c > > @@ -1335,8 +1335,9 @@ thread_through_normal_block (edge e, > > return 1; > > } > > > >if (!flag_expensive_optimizations > > + || optimize_function_for_size_p (cfun) > > || TREE_CODE (cond) != SSA_NAME > > || e->dest->loop_father != e->src->loop_father > > || loop_depth (e->dest->loop_father) == 0) > > return 0; > > > > I will regstrap and commit the attached patch. > > Bootstrapped and regression tested on x86_64-linux. > Committed r218451. > > Sebastian Thanks for getting all this checked in Sebastian, I have tested it on coremark and I am getting the speed up that I expect. But I am a little confused about turning off jump threading. I am getting the optimization on coremark with -O3 and that is great and if I use '-O3 -fno-expensive-optimizations' then I don't see this part of the jump threading, also great. But I was surprised that if I just used '-O3 -fno-thread-jumps' then I still see this optimization. Is that expected? Should this test also check flag_thread_jumps? Or should that be getting checked somewhere else? Steve Ellcey
Re: PING: Re: [PATCH 05/05] Add command-line option-parsing to jit testcases
On Dec 8, 2014, at 12:44 PM, David Malcolm wrote: > On Tue, 2014-11-25 at 20:34 -0500, David Malcolm wrote: >> Add command-line option-parsing to the testcases, so that we can >> manipulate them without needing a recompile (e.g. varying >> optimization levels etc). >> >> This uses getopt_long, which is a GNU extension to libc. Is that >> acceptable? > > Ping. Specifically, is it acceptable to use getopt_long within the jit > testcases, or should I find another way to do this (e.g. just getopt)? So, the standard by which we measure, does it kill building or testing of ada on mingwin? If it does, then no, not acceptable. I’d like to think there is nothing you can do in jit.exp that could do that. So, from this perspective, yeah, feel free to do what you want. Git it done first. The second person that wants to port your code to a new machine (a different machine) will trip over all the bad things you did, and someone will then have to fix them. If you only use what gcc already sues, you will be portable to everything gcc is portable to. If you use GNU extensions to libc, well, that isn’t portable enough in general. Heck, even what’s in libc isn’t portable enough, that’s half the reason why we have autoconf in the first place. If jit is on by default everywhere, then you need to be portable everywhere. If only on for linux, then well, it already has GNU extensions in libc. I don’t know if it is on by default and you didn’t say, so, hard to comment on it.
Re: [Patch] PR 61692 - Fix for inline asm ICE
On 11/15/14 20:59, David Wohlferd wrote: On 9/15/2014 2:51 PM, Jeff Law wrote: Let's go with your original inputs + outputs + labels change and punt the clobbers stuff for now. jeff I have also added the test code you requested. I have a release on file with the FSF, but don't have SVN write access. Problem: extract_insn() in recog.c will ICE if (noperands > MAX_RECOG_OPERANDS). Normally this isn't a problem since expand_asm_operands() in cfgexpand.c catches and reports a proper error for this condition. However, expand_asm_operands() only checks (ninputs + noutputs) instead of (ninputs + noutputs + nlabels), so you can get the ICE when using "asm goto." ChangeLog: 2014-11-15 David Wohlferd PR target/61692 * cfgexpand.c (expand_asm_operands): Count all inline asm params. * testsuite/gcc.dg/pr61692.c: New test. Thanks. I put the testsuite ChangeLog entry in the right place and added a dg-do directive to the testcase to ensure we just try to compile it. Thanks for your patience, things got rather busy and obviously I'm still working through my personal backlog of patches from folks. Jeff
Re: [PATCH] Two more spots for
On Mon, Dec 8, 2014 at 10:43 PM, Jakub Jelinek wrote: > Hi! > > I've noticed two new spots which use GET_MODE_SIZE (mode) > instead of the more efficient . > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2014-12-08 Jakub Jelinek > > * config/i386/sse.md (*mov_internal, *avx512f_gatherdi_2): > Use instead of GET_MODE_SIZE (mode). OK. Thanks, Uros.
Re: [PATCH] Fix ICE with non-lvalue vector subscripts and make sure non-lvalue vector subscripts aren't used as lvalues (PR target/63764)
On 11/20/14 11:09, Jakub Jelinek wrote: Hi! This patch fixes ICEs if a non-lvalue vector (say cast of one vector to another vector type) was subscripted and used as lhs. The following patch, if *vecp is not lvalue, will copy it to a temporary variable which can be made addressable for the subscription, and afterwards wrap it into a NON_LVALUE_EXPR so that it is properly rejected if later used on the lhs. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-20 Jakub Jelinek PR target/63764 c-family/ * c-common.h (convert_vector_to_pointer_for_subscript): Change return type to bool. * c-common.c: Include gimple-expr.c. This was in c-family and thus I didn't see it as committed... So when I went to review, I saw the typo gimple-expr.c when you meant to write gimple-expr.h in the ChangeLog entry. Fixed in the obvious way. jeff
Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call
On 12/8/2014 3:01 PM, Jeff Law wrote: The above is wrong for sibcalls. Sibcall arguments are relative to the incoming argument pointer. Is this always the frame pointer? I don't think it's always the frame pointer. Don't we use an argument pointer for the PA64 runtime? If I recall, it was the only port that had a non-eliminable argument pointer at the time. I don't think PA64 is an argument against this as sibcalls don't work in the PA64 runtime (they are disabled in pa.c) because the argument pointer isn't a fixed register. I guess in theory it could be fixed if it was saved and restored across calls. DSE as it stands doesn't look at argument pointer based stores and I suspect they would be deleted with current code. Dave -- John David Anglindave.ang...@bell.net
Re: [PATCH] check_GNU_style.sh "80 characters exceeded" error fix
On 11/24/14 06:51, Mantas Mikaitis wrote: check_GNU_style.sh error "Lines should not exceed 80 characters" does not return the file name and line number where error is present, only the line of code. Whereas other kind of errors return full information. This small patch will fix this and make check_GNU_style.sh return full information when patch contains lines longer than 80 errors. Tested on patches containing >80 chars lines and the script produces full information as necessary. Would this be a useful enhancement for trunk? Mantas Mikaitis gcc/ChangeLog: * contrib/check_GNU_style.sh (col): Got rid of cut operation from the pipe chain and instead added cut inside awk command. Yes. Please install on the trunk. Thanks, Jeff
Re: PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter
Hello! > On Tue, Nov 25, 2014 at 5:05 PM, H.J. Lu wrote: >> On Tue, Nov 25, 2014 at 7:01 AM, Richard Biener >> wrote: >>> On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu wrote: Hi, The enclosed testcase fails on x86 when compiled with -Os since we pass a byte parameter with a byte load in caller and read it as an int in callee. The reason it only shows up with -Os is x86 backend encodes a byte load with an int load if -O isn't used. When a byte load is used, the upper 24 bits of the register have random value for none WORD_REGISTER_OPERATIONS targets. It happens because setup_incoming_promotions in combine.c has /* The mode and signedness of the argument before any promotions happen (equal to the mode of the pseudo holding it at that stage). */ mode1 = TYPE_MODE (TREE_TYPE (arg)); uns1 = TYPE_UNSIGNED (TREE_TYPE (arg)); /* The mode and signedness of the argument after any source language and TARGET_PROMOTE_PROTOTYPES-driven promotions. */ mode2 = TYPE_MODE (DECL_ARG_TYPE (arg)); uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg)); /* The mode and signedness of the argument as it is actually passed, after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions. */ mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3, TREE_TYPE (cfun->decl), 0); while they are actually passed in register by assign_parm_setup_reg in function.c: /* Store the parm in a pseudoregister during the function, but we may need to do it in a wider mode. Using 2 here makes the result consistent with promote_decl_mode and thus expand_expr_real_1. */ promoted_nominal_mode = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp, TREE_TYPE (current_function_decl), 2); where nominal_type and nominal_mode are set up with TREE_TYPE (parm) and TYPE_MODE (nominal_type). TREE_TYPE here is >>> >>> I think the bug is here, not in combine.c. Can you try going back in >>> history >>> for both snippets and see if they matched at some point? >>> >> >> The bug was introduced by >> >> https://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00613.html >> >> commit 5d93234932c3d8617ce92b77b7013ef6bede9508 >> Author: shinwell >> Date: Thu Sep 20 11:01:18 2007 + >> >> gcc/ >> * combine.c: Include cgraph.h. >> (setup_incoming_promotions): Rework to allow more aggressive >> elimination of sign extensions when all call sites of the >> current function are known to lie within the current unit. >> >> >> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@128618 >> 138bc75d-0d04-0410-961f-82ee72b054a4 >> >> Before this commit, combine.c has >> >> enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg)); >> int uns = TYPE_UNSIGNED (TREE_TYPE (arg)); >> >> mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1); >> if (mode == GET_MODE (reg) && mode != DECL_MODE (arg)) >> { >> rtx x; >> x = gen_rtx_CLOBBER (DECL_MODE (arg), const0_rtx); >> x = gen_rtx_fmt_e ((uns ? ZERO_EXTEND : SIGN_EXTEND), mode, x); >> record_value_for_reg (reg, first, x); >> } >> >> It matches function.c: >> >> /* This is not really promoting for a call. However we need to be >> consistent with assign_parm_find_data_types and expand_expr_real_1. */ >> promoted_nominal_mode >> = promote_mode (data->nominal_type, data->nominal_mode, &unsignedp, 1); >> >> r128618 changed >> >> mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1); >> >> to >> >> mode3 = promote_mode (DECL_ARG_TYPE (arg), mode2, &uns3, 1); >> >> It breaks none WORD_REGISTER_OPERATIONS targets. > > Hmm, I think that DECL_ARG_TYPE makes a difference only > for non-WORD_REGISTER_OPERATIONS targets. > > But yeah, isolated the above change looks wrong. > > Your patch is ok for trunk if nobody objects within 24h and for branches > after a week. > > Thanks, > Richard. This patch caused PR64213. Uros.
[Patch, Fortran] Convert gfc_notify_std to common diagnostics
This patch requires that the gfc_error patch has been applied, https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00607.html The patch does some missing '%s' to %qs and '...' to %<...%> for gfc_error, does likewise for gfc_notify_std and converts the latter into calls to gfc_error and gfc_warning. Build and regtested on OK for the trunk, once the gfc_error patch is in? Tobias 2014-12-08 Tobias Burnus Manuel López-Ibáñez fortran/ * error.c (gfc_error): Add variant which takes a va_list. (gfc_notify_std): Convert to common diagnostic. * array.c: Use %qs, %<...%> in more gfc_error calls and for gfc_notify_std. * check.c: Ditto. * data.c: Ditto. * decl.c: Ditto. * expr.c: Ditto. * interface.c: Ditto. * intrinsic.c: Ditto. * io.c: Ditto. * match.c: Ditto. * matchexp.c: Ditto. * module.c: Ditto. * openmp.c: Ditto. * parse.c: Ditto. * primary.c: Ditto. * resolve.c: Ditto. * simplify.c: Ditto. * symbol.c: Ditto. * trans-common.c: Ditto. * trans-intrinsic.c: Ditto. gcc/testsuite/ * gfortran.dg/realloc_on_assign_21.f90: Update dg-error. * gfortran.dg/warnings_are_errors_1.f: Ditto. * gfortran.dg/warnings_are_errors_1.f90: Ditto. diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c index e27ca01..300bfeb 100644 --- a/gcc/fortran/array.c +++ b/gcc/fortran/array.c @@ -684,7 +684,7 @@ coarray: if (current_type == AS_EXPLICIT) { - gfc_error ("Upper bound of last coarray dimension must be '*' at %C"); + gfc_error ("Upper bound of last coarray dimension must be %<*%> at %C"); goto cleanup; } diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c index ef40e66..527123d 100644 --- a/gcc/fortran/check.c +++ b/gcc/fortran/check.c @@ -384,7 +384,7 @@ less_than_bitsize2 (const char *arg1, gfc_expr *expr1, const char *arg2, i3 = gfc_validate_kind (BT_INTEGER, expr1->ts.kind, false); if (i2 > gfc_integer_kinds[i3].bit_size) { - gfc_error ("'%s + %s' at %L must be less than or equal " + gfc_error ("%<%s + %s%> at %L must be less than or equal " "to BIT_SIZE(%qs)", arg2, arg3, &expr2->where, arg1); return false; @@ -581,7 +581,7 @@ dim_corank_check (gfc_expr *dim, gfc_expr *array) if (mpz_cmp_ui (dim->value.integer, 1) < 0 || mpz_cmp_ui (dim->value.integer, corank) > 0) { - gfc_error ("'dim' argument of %qs intrinsic at %L is not a valid " + gfc_error ("% argument of %qs intrinsic at %L is not a valid " "codimension index", gfc_current_intrinsic, &dim->where); return false; @@ -631,7 +631,7 @@ dim_rank_check (gfc_expr *dim, gfc_expr *array, int allow_assumed) if (mpz_cmp_ui (dim->value.integer, 1) < 0 || mpz_cmp_ui (dim->value.integer, rank) > 0) { - gfc_error ("'dim' argument of %qs intrinsic at %L is not a valid " + gfc_error ("% argument of %qs intrinsic at %L is not a valid " "dimension index", gfc_current_intrinsic, &dim->where); return false; @@ -1378,7 +1378,7 @@ gfc_check_cmplx (gfc_expr *x, gfc_expr *y, gfc_expr *kind) if (x->ts.type == BT_COMPLEX) { gfc_error ("%qs argument of %qs intrinsic at %L must not be " - "present if 'x' is COMPLEX", + "present if % is COMPLEX", gfc_current_intrinsic_arg[1]->name, gfc_current_intrinsic, &y->where); return false; @@ -1428,7 +1428,7 @@ check_co_collective (gfc_expr *a, gfc_expr *image_idx, gfc_expr *stat, /* Fortran 2008, 12.5.2.4, paragraph 18. */ if (gfc_has_vector_subscript (a)) { - gfc_error ("Argument 'A' with INTENT(INOUT) at %L of the intrinsic " + gfc_error ("Argument % with INTENT(INOUT) at %L of the intrinsic " "subroutine %s shall not have a vector subscript", &a->where, gfc_current_intrinsic); return false; @@ -1728,7 +1728,7 @@ gfc_check_count (gfc_expr *mask, gfc_expr *dim, gfc_expr *kind) return false; if (!kind_check (kind, 2, BT_INTEGER)) return false; - if (kind && !gfc_notify_std (GFC_STD_F2003, "'%s' intrinsic " + if (kind && !gfc_notify_std (GFC_STD_F2003, "%qs intrinsic " "with KIND argument at %L", gfc_current_intrinsic, &kind->where)) return false; @@ -1835,7 +1835,7 @@ gfc_check_dcmplx (gfc_expr *x, gfc_expr *y) if (x->ts.type == BT_COMPLEX) { gfc_error ("%qs argument of %qs intrinsic at %L must not be " - "present if 'x' is COMPLEX", + "present if % is COMPLEX", gfc_current_intrinsic_arg[1]->name, gfc_current_intrinsic, &y->where); return false; @@ -1908,7 +1908,8 @@ gfc_check_dot_product (gfc_expr *vector_a, gfc_expr *vector_b) if (! identical_dimen_shape (vector_a, 0, vector_b, 0)) { gfc_error ("Different shape for arguments %qs and %qs at %L for " - "intrinsic 'dot_product'", gfc_current_intrinsic_arg[0]->name, + "intrinsic %", + gfc_current_intrinsic_arg[0]->name, gfc_current_intrinsic_arg[1]->name, &vector_a->where); return false; } @@ -2146,9 +2147,9 @@ gfc
Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype
On Mon, 8 Dec 2014, Richard Biener wrote: > I'm not pushing this further for stage3, but for stage1 I'd like > to eventually address this by splitting up builtin_info_type's > 'implicit_p' into a flags array providing implicit_p, declared_p, > used_p and maybe declared_in_system_header_p. Would you be > willing to fill in the gap computing "used_p" in the C frontend? C_DECL_USED is the right notion of "used" with the front end, in standard terms (i.e. the notion that requires a definition of the function to be present somewhere), though as noted some people may be relying on calls being optimized out. (Of course such people could get implicitly generated libm calls at present anyway; the point of checking "used" is to restrict call generation to cases where we know that some library with the function will be linked in, and that the function will have the right semantics.) -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Set stores_off_frame_dead_at_return to false if sibling call
On 12/08/14 15:15, John David Anglin wrote: On 12/8/2014 3:01 PM, Jeff Law wrote: The above is wrong for sibcalls. Sibcall arguments are relative to the incoming argument pointer. Is this always the frame pointer? I don't think it's always the frame pointer. Don't we use an argument pointer for the PA64 runtime? If I recall, it was the only port that had a non-eliminable argument pointer at the time. I don't think PA64 is an argument against this as sibcalls don't work in the PA64 runtime (they are disabled in pa.c) because the argument pointer isn't a fixed register. I guess in theory it could be fixed if it was saved and restored across calls. But there's nothing that says another port in the future won't have similar characteristics as the PA, so while the PA isn't particularly important, it shows there's cases where arguments won't be accessed by the FP. DSE as it stands doesn't look at argument pointer based stores and I suspect they would be deleted with current code. Agreed. jeff
Re: [PATCH] Mark explicit decls as implicit when we've seen a prototype
On Mon, 8 Dec 2014, Richard Biener wrote: > The alternative is to decide "used" in the middle-end at one point, > for example at the end of all_lowering_passes where hopefully > we have constant folded and removed dead code enough. We can also > compute an overall "uses libm" flag to fix the testcase I reported > (of course we'd like to re-compute that at LTO time). If you do that, of course any optimizations before deciding "used" need to be conservative - assume that any function except the one you're optimizing a call of is not "used". -- Joseph S. Myers jos...@codesourcery.com
Re: FW: [PATCH] Fix PR ipa/64049
> On Mon, Dec 8, 2014 at 11:57 AM, Bernd Edlinger > wrote: > > > > > >> > >> Hello, > >> > >> > >> this surprisingly simple patch fixes PR ipa/64049. The root cause seems > >> to be > >> that in this test case we try to devirtualize a method call on a return > >> value. > >> > >> > >> Boot-Strapped and regression-tested on X86_64-linux-gnu. > >> OK for trunk? > > I think positive tests, in this case TREE_CODE (...) == VAR_DECL > are better. > > Ok with that change. The test there assumes that only values with well defined default value are parm decls, so in cases we see a call on default value of other kind of decl, we could turn it to undefined call. In what cases has RESULT_DECL sane initial value? We should at least drop a comment here. Honza > > Thanks, > Richard. > > >> > >> Thanks > >> Bernd. > >> > > > > Again with changelog. Sorry. > >
locales fixes
Hi After having installed all necessary locales on my system I end up with 4 failures. Here is a patch to fix them all. For numpunct test I consider that checking value of numpunct::grouping() result was not correct. libstdc++ doesn't control this value. I prefer to just check that values returned by numpunct are consistent with how numeric values are formatted. For the others, regarding time_get, I found nothing really smart to do. For 4.cc I simply had to remove some characters not present anymore. And for 38081-1 and 38081-2, despite having glibc 2.19 it is still showing previous values on my system. I even wonder if it has anything to do with glibc version, it rather depends on the installed locale info, no ? 2014-12-08 François Dumont * testsuite/22_locale/numpunct/members/char/3.cc: Check numpunct returned values are consistent with how numbers are formatted. * testsuite/22_locale/time_get/get_date/wchar_t/4.cc: Update expected values. * testsuite/22_locale/time_get/get_weekday/char/38081-1.cc: Don't use new values before glibc 2.20. * testsuite/22_locale/time_get/get_weekday/char/38081-2.cc: Likewise. Tested under Linux x86_64. François Index: testsuite/22_locale/numpunct/members/char/3.cc === --- testsuite/22_locale/numpunct/members/char/3.cc (revision 218492) +++ testsuite/22_locale/numpunct/members/char/3.cc (working copy) @@ -21,6 +21,7 @@ // 22.2.3.2 Template class numpunct_byname +#include #include #include @@ -30,13 +31,52 @@ bool test __attribute__((unused)) = true; - locale loc_it = locale("it_IT"); + // Use numpunct to generate representation for an arbitrary number. + const int number = 10; - const numpunct& nump_it = use_facet >(loc_it); + ostringstream ostr; + ostr.imbue(locale::classic()); + + ostr << number; + string c_number = ostr.str(); + locale loc_it = locale("it_IT"); + const numpunct& nump_it = use_facet >(loc_it); string g = nump_it.grouping(); - VERIFY( g == "" ); + size_t group_index = 0; + string it_number; + size_t it_pos = 0; + size_t offset = 0; + for (size_t pos = 0; pos != c_number.size(); ++pos) +{ + int group_size = + group_index < g.size() ? (int)g[group_index] + : (g.empty() + // No grouping, just make group size big enough so that + // there will be no insertion of thousands separator. + ? c_number.size() + : g[g.size() - 1]); + if (pos + offset < it_pos + group_size) + it_number.insert(it_number.begin(), c_number[c_number.size() - pos - 1]); + else + { + // Time to add the group separator. + it_number.insert(it_number.begin(), nump_it.thousands_sep()); + it_number.insert(it_number.begin(), c_number[c_number.size() - pos - 1]); + ++offset; + ++group_index; + it_pos = it_number.size() - 1; + } +} + + // Check that the result is identical to the one obtained with an + // ostringstream imbued with the it_IT locale. + ostr.str(""); + ostr.imbue(loc_it); + ostr << number; + + VERIFY( ostr.str() == it_number ); } int main() Index: testsuite/22_locale/time_get/get_date/wchar_t/4.cc === --- testsuite/22_locale/time_get/get_date/wchar_t/4.cc (revision 218492) +++ testsuite/22_locale/time_get/get_date/wchar_t/4.cc (working copy) @@ -43,7 +43,7 @@ const ios_base::iostate good = ios_base::goodbit; ios_base::iostate errorstate = good; - const wchar_t wstr[] = { 0x897f, 0x5143, L'2', L'0', L'0', L'3', + const wchar_t wstr[] = { /* 0x897f, 0x5143,*/ L'2', L'0', L'0', L'3', 0x5e74, L'1', L'2', 0x6708, L'1', L'7', 0x65e5 , 0x0 }; Index: testsuite/22_locale/time_get/get_weekday/char/38081-1.cc === --- testsuite/22_locale/time_get/get_weekday/char/38081-1.cc (revision 218492) +++ testsuite/22_locale/time_get/get_weekday/char/38081-1.cc (working copy) @@ -50,7 +50,7 @@ // ios_base::iostate&, tm*) const #if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 14) -# if __GLIBC__ > 2 || __GLIBC_MINOR__ >= 17 +# if __GLIBC__ > 2 || __GLIBC_MINOR__ >= 20 iss.str("\xbf\xdd"); # else iss.str("\xbf\xdd\x2e"); @@ -76,7 +76,7 @@ VERIFY( errorstate == ios_base::eofbit ); #if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 14) -# if __GLIBC__ > 2 || __GLIBC_MINOR__ >= 17 +# if __GLIBC__ > 2 || __GLIBC_MINOR__ >= 20 iss.str("\xbf\xdd\xd5\xd4\xd5\xdb\xec\xdd\xd8\xda"); # else iss.str("\xbf\xdd\x2e\xd5\xd4\xd5\xdb\xec\xdd\xd8\xda"); Index: testsuite/22_locale/time_get/get_weekday/char/38081-2.cc === --- testsuite/22_locale/time_get/get_weekday/char/38081-2.cc (revision 218492) +++ testsuite/22_locale/time_get/get_weekday/char/38081-2.cc (working copy) @@ -51,7 +51,7 @@ // ios_base::i