Re: Minor Cygwin patches
On 2013-07-14 20:15, JonY wrote: 2013-03-08 Yaakov Selkowitz * (gcc/testsuite/gcc.target/i386/pr25993.c): Skip unsupported test. This patch was Dave Korn's. Yaakov
Re: Minor Cygwin patches
On 2013-07-14 20:15, JonY wrote: 2013-03-08 Dave Korn * (gcc/config.gcc): Include Cygwin specific file. * (gcc/config/i386/cygwin.h): Link shared libgcc by default. * (gcc/config/i386/cygwin.h): Add --large-address-aware, and use --tsaware for exes * (gcc/config/i386/cygwin.h): Add -pthreads, -rdynamic stubs. * (gcc/config/i386/cygwin.opt): New file. Only the "link shared libgcc by default" part was Dave's; the rest is mine. Yaakov
Re: [PATCH, i386, PR57623] Introduce synonyms for BMI intrinsics
On 7/17/2013 7:42 PM, Jakub Jelinek wrote: > On Wed, Jul 17, 2013 at 07:39:00PM +0400, Kirill Yukhin wrote: >> Is it ok to install? > Yes. > Thanks! Change into 4_8 branch: http://gcc.gnu.org/ml/gcc-cvs/2013-07/msg00477.html K
Re: [PATCH] MIPS: IEEE 754-2008 features support
"Maciej W. Rozycki" writes: >> > Please also note that the writability of the individual new (HAS2008) >> > FCSR bits is optional e.g. a conforming processor may have NAN2008 >> > hardwired to 1 and ABS2008 hardwired to 0 (or likewise with NAN2008 >> > writable). >> >> OK, I'd missed that this was allowed, sorry. It just seems really >> unfortunate... > > Well, I'm not really sure what the Big Plan here is. It looks to me like > the non-arithmetic ABS/NEG feature is a really good thing, Me too. :-) > while the 2008 > NaN encoding has its shortcomings, e.g. unlike with the legacy encoding > there's no single sNaN bit pattern to preset FPRs or variables with to > catch uninitialised use that would work across all the floating-point > formats (S, D and PS). So it seems to me like there's no single superior > setting we could make the default for a group option. I see what you mean, and I suppose it wouldn't be too bad having separate -mabs=2008 and -mmac=2008 options for the legacy IEEE case. But when going to the trouble of switching NAN encoding, which needs a separate runtime, it seemed a shame that we couldn't also rely on the new improved ABS/NEG behaviour in that runtime too. Instead we have to potentially pass all three of -mnan=2008 -mabs=2008 -mmac=2008 in order to get what other targets get. Oh well... >> > It is needed in case the compiler was built without support for this >> > option i.e. configured against old binutils. I verified that it indeed >> > triggered in this case: >> > >> > Executing on host: mips-linux-gnu-gcc -fno-diagnostics-show-caret >> > -fdiagnostics-color=never -mnan=2008 -c -o mips_nan21474.o >> > mips_nan21474.c(timeout = 300) >> > mips-linux-gnu-gcc: error: unrecognized command line option '-mnan=2008' >> > compiler exited with status 1 >> > output is: >> > mips-linux-gnu-gcc: error: unrecognized command line option '-mnan=2008' >> >> That seems like bad practice though. 
In other cases we leave the
>> assembler to report options that it doesn't understand. I think that's
>> better because it's then clearer that the assembler needs to be upgraded.
>>
>> Within config/mips, the configure test should just control whether it's
>> safe to use .nan when no -mnan option has been given. (This is what we
>> did for -mmicromips vs ".set nomicromips" FWIW.)
>
> I can see your point and acknowledge the preexisting practice, but I
> don't feel particularly convinced, especially in this case where we have a
> feature that's never going to raise the user's attention when
> misconfigured, but it also applies to the microMIPS case. The two issues I
> see with it are:
>
> 1. Conceptually I see the toolchain as a whole and I don't see a value in
>    GCC producing known-unsupported assembly and relying on the assembler
>    (or the linker if applicable) to complain. I agree pointing at the
>    other tool being incapable or obsolete is a useful practice, but I also
>    think a clear message from GCC itself would be more appropriate (e.g.
>    "`-mfoo' unsupported, please reconfigure against current binutils").

But doing that consistently would mean e.g. that we would need to test
assembler support for each individual -march= option, since new -march=
options are added fairly often. It seems a lot of hassle to do that
just so that we can force the user to rebuild GCC from the same sources
as before.

I think it's more user-friendly to include support for -mfoo and assume
that the assembler supports whatever's needed -- leaving it to issue an
error if not -- both because the problem tool is the one that reports
the error, and because we don't force users to rebuild GCC when the
first build could quite easily have had the feature they wanted.

> 2. Technically I think we have an actual problem here, e.g. in the example
>    you referred we have a situation where GCC supports microMIPS
>    compilation in all cases, however non-microMIPS code is different
>    depending on whether the compiler has been configured against modern or
>    obsolete binutils. Now the latter case may prompt someone to upgrade
>    binutils, but there is nothing to prompt that person to reconfigure GCC
>    afterwards. As a result we have two cases of a toolchain comprised of
>    the same versions of both GCC and binutils, but depending on the
>    "history" of the GCC binaries code produced will be different. I think
>    this is subtler and riskier than just rejecting the relevant compiler
>    option outright.

I think your argument is that it's bad for things like
__attribute__((nomicromips)) to produce different code depending on the
configured assembler, and that we should therefore just reject the
attribute if the assembler doesn't support ".set nomicromips". Is that
right?

That has negative consequences too though. People often use GCC version
checks to decide whether an attribute is supported. Conditionally
disabling even nomicromips would force
Add myself and Shiva Chen in MAINTAINERS file as nds32 port maintainers
According to the announcement:
http://gcc.gnu.org/ml/gcc/2013-07/msg00232.html

I added myself and Shiva Chen as nds32 port maintainers.

Index: ChangeLog
===
--- ChangeLog (revision 201046)
+++ ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2013-07-19  Chung-Ju Wu
+            Shiva Chen
+
+        * MAINTAINERS (nds32 port): Add Chung-Ju Wu and Shiva Chen as
+        nds32 port maintainers.
+
 2013-07-17  Tim Shen
 
         * MAINTAINERS (Write After Approval): Add myself.
Index: MAINTAINERS
===
--- MAINTAINERS (revision 201046)
+++ MAINTAINERS (working copy)
@@ -87,6 +87,8 @@
 mn10300 port            Jeff Law                l...@redhat.com
 mn10300 port            Alexandre Oliva         aol...@redhat.com
 moxie port              Anthony Green           gr...@moxielogic.com
+nds32 port              Chung-Ju Wu             jasonw...@gmail.com
+nds32 port              Shiva Chen              shiva0...@gmail.com
 pdp11 port              Paul Koning             n...@arrl.net
 picochip port           Daniel Towner           d...@picochip.com
 rl78 port               DJ Delorie              d...@redhat.com

Thanks GCC SC for accepting our contribution.
After getting approval from a global reviewer, we will commit initial patches.
We will also regularly publish our testing results on gcc-testresults mailing list.

Best regards,
jasonwucj
Re: [C++14/lambda/implicit-templates] Experimental polymorphic lambda and implicit function template patches
Hi Jason,

There follows a re-spin of the previous patches. I've addressed your
review comments and cleaned up much of the implementation (including
code location). I've also redesigned the `auto' param handling
(implicit function template generation) to work for plain functions
and member functions as well as lambdas.

This is intended as a step towards implementing the basics of `Terse
Templates' as proposed in section 6.2.1 of N3580. Indeed, if you
consider `auto' to be a concept whose constexpr implementation simply
returns 'true', this is what the current implementation does. I've
CC'd Andrew Sutton for visibility of the work in this area (PATCH 3/3
is the one of interest). I might have a play with this on the
c++-concepts branch if I get time.

As well as the syntax supported by the previous patch-set, this
re-spin also supports the syntax demonstrated by the following
examples for plain functions and member functions:

  4. auto plain_fn_or_mem_fn (auto x, auto& y, auto const& z)
     { return x + y + z; }

  5. template <typename Z>
     auto plain_fn_or_mem_fn (auto x, auto& y, Z const& z)
     { return x + y + z; }

Currently out-of-line implicit member templates are supported but the
template parameter list must be explicitly specified at the definition
and the return type must be specified. For example:

   struct S
   {
      auto f(auto& a, auto b) { a += b; }
      void g(auto& a, auto b);
   };

   template <typename A, typename B>
   void S::g(A& a, B b)
   {
      a += b;
   }

On Wed, 10 Jul 2013 19:35:24 -0700, Jason Merrill wrote:
> On 07/01/2013 04:26 PM, Adam Butcher wrote:
> > Any comments appreciated. Guidance on implementing the conversion
> > operator for stateless generic lambdas would be useful. I've made a
> > few attempts which were lacking!
>
> It seemed to me that N3649's description of the semantics should
> translate pretty easily into an implementation.
>
Indeed. It is quite simple from a spec point of view. And a plain C++
implementation would be easy. Unfortunately in the compiler I have
not found it as straight-forward.
> What problems are you running into?
>
My previous attempts tried to reuse the existing non-template
conversion-op code with numerous conditional blocks that got out of
hand -- and SEGVd or ICEd to boot. I think it was due to not properly
transforming the type declarators. Since reworking the implicit
function template code I am going to experiment with a cleaner way.
I'll get back to you if I get it done or if I hit a brick wall!

As always, any feedback appreciated.

Cheers,
Adam

Patch summary (3):
  Support template-parameter-list in lambda-declarator.
  Avoid crash on symbol table writing when generic lambda declared with
    iostream (or potentially other code) included.
  Support using `auto' in a function parameter list to introduce an
    implicit template parameter.

 gcc/cp/cp-tree.h |  11
 gcc/cp/decl.c    |   4 +-
 gcc/cp/decl2.c   |   5 +-
 gcc/cp/lambda.c  |   9 ++-
 gcc/cp/parser.c  |  93 +---
 gcc/cp/pt.c      | 186 ++-
 gcc/symtab.c     |  18 ++
 7 files changed, 297 insertions(+), 29 deletions(-)

--
1.8.3
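For readers unfamiliar with the feature under discussion, here is a minimal sketch of what an `auto' parameter means for a lambda, written against standard C++14 rather than against this patch set. The names add3, Add3Closure and through_pointer are made up for illustration. The struct shows roughly the closure type a compiler is expected to generate: each `auto' parameter becomes an implicit template parameter of operator(). The last function exercises the function-pointer conversion of a stateless generic lambda, which the cover letter says is not yet implemented in these patches.

```cpp
#include <string>

// C++14 generic lambda: each `auto` parameter introduces an implicit
// template parameter on the closure type's operator().
const auto add3 = [](auto x, auto& y, auto const& z) { return x + y + z; };

// Roughly equivalent hand-written closure type (hypothetical name), with
// the implicit template parameters spelled out explicitly.
struct Add3Closure {
  template <typename X, typename Y, typename Z>
  auto operator()(X x, Y& y, Z const& z) const { return x + y + z; }
};

// A stateless generic lambda converts to a plain function pointer in
// standard C++14; the target pointer type drives the deduction of `a`.
inline int through_pointer(int v) {
  int (*fp)(int) = [](auto a) { return a * 2; };
  return fp(v);
}
```

Calling add3 with ints and then with std::string arguments instantiates two separate operator() specializations from the same lambda, which is the behaviour the implicit template parameter list is meant to provide.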
[PATCH 1/3] [lambda] Support template-parameter-list in lambda-declarator.
--- gcc/cp/decl2.c | 5 +++-- gcc/cp/lambda.c | 9 - gcc/cp/parser.c | 36 ++-- gcc/cp/pt.c | 4 +++- 4 files changed, 48 insertions(+), 6 deletions(-) diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c index 1573ced..c166f6e 100644 --- a/gcc/cp/decl2.c +++ b/gcc/cp/decl2.c @@ -507,8 +507,9 @@ check_member_template (tree tmpl) || (TREE_CODE (decl) == TYPE_DECL && MAYBE_CLASS_TYPE_P (TREE_TYPE (decl { - /* The parser rejects template declarations in local classes. */ - gcc_assert (!current_function_decl); + /* The parser rejects template declarations in local classes +(with the exception of generic lambdas). */ + gcc_assert (!current_function_decl || LAMBDA_FUNCTION_P (decl)); /* The parser rejects any use of virtual in a function template. */ gcc_assert (!(TREE_CODE (decl) == FUNCTION_DECL && DECL_VIRTUAL_P (decl))); diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c index a53e692..98a7925 100644 --- a/gcc/cp/lambda.c +++ b/gcc/cp/lambda.c @@ -196,7 +196,7 @@ lambda_function (tree lambda) /*protect=*/0, /*want_type=*/false, tf_warning_or_error); if (lambda) -lambda = BASELINK_FUNCTIONS (lambda); +lambda = STRIP_TEMPLATE (get_first_fn (lambda)); return lambda; } @@ -759,6 +759,13 @@ maybe_add_lambda_conv_op (tree type) if (processing_template_decl) return; + if (DECL_TEMPLATE_INFO (callop) && DECL_TEMPLATE_RESULT +(DECL_TI_TEMPLATE (callop)) == callop) +{ + warning (0, "Conversion of a generic lambda to a function pointer is not currently implemented."); + return; +} + if (DECL_INITIAL (callop) == NULL_TREE) { /* If the op() wasn't instantiated due to errors, give up. */ diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 4b683bf..48c95e6 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -8790,6 +8790,7 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr) /* Parse the (optional) middle of a lambda expression. 
lambda-declarator: + < template-parameter-list [opt] > ( parameter-declaration-clause [opt] ) attribute-specifier [opt] mutable [opt] @@ -8809,10 +8810,32 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr) tree param_list = void_list_node; tree attributes = NULL_TREE; tree exception_spec = NULL_TREE; + tree template_param_list = NULL_TREE; tree t; - /* The lambda-declarator is optional, but must begin with an opening - parenthesis if present. */ + /* The template-parameter-list is optional, but must begin with + an opening angle if present. */ + if (cp_lexer_next_token_is (parser->lexer, CPP_LESS)) +{ + cp_lexer_consume_token (parser->lexer); + + if (cxx_dialect < cxx1y) + cp_parser_error (parser, +"Generic lambdas are only supported in C++1y mode."); + + push_deferring_access_checks (dk_deferred); + + template_param_list = cp_parser_template_parameter_list (parser); + + cp_parser_skip_to_end_of_template_parameter_list (parser); + + /* We just processed one more parameter list. */ + ++parser->num_template_parameter_lists; +} + + /* The parameter-declaration-clause is optional (unless + template-parameter-list was given), but must begin with an + opening parenthesis if present. */ if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)) { cp_lexer_consume_token (parser->lexer); @@ -8858,6 +8881,8 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr) leave_scope (); } + else if (template_param_list != NULL_TREE) // generate diagnostic +cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN); /* Create the function call operator. @@ -8901,6 +8926,13 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr) DECL_ARTIFICIAL (fco) = 1; /* Give the object parameter a different name. 
*/ DECL_NAME (DECL_ARGUMENTS (fco)) = get_identifier ("__closure"); + if (template_param_list) + { + pop_deferring_access_checks (); + fco = finish_member_template_decl (fco); + finish_template_decl (template_param_list); + --parser->num_template_parameter_lists; + } } finish_member_declaration (fco); diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index de054ac..3694ccc 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -9028,7 +9028,9 @@ instantiate_class_template_1 (tree type) tree decl = lambda_function (type); if (decl) { - instantiate_decl (decl, false, false); + if (!DECL_TEMPLATE_INFO (decl) || DECL_TEMPLATE_RESULT + (DECL_TI_TEMPLATE (decl)) != decl) + instantiate_decl (decl, false, false); /* We need to instantiate the capture list from the template
[PATCH 2/3] [lambda] Avoid crash on symbol table writing when generic lambda declared with iostream (or potentially other code) included.
--- gcc/symtab.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/gcc/symtab.c b/gcc/symtab.c index 85d47a8..1ada0f7 100644 --- a/gcc/symtab.c +++ b/gcc/symtab.c @@ -116,6 +116,15 @@ insert_to_assembler_name_hash (symtab_node node, bool with_clones) tree name = DECL_ASSEMBLER_NAME (node->symbol.decl); + + // FIXME: how does this nullptr get here when declaring a C++ + // FIXME: generic lambda and including iostream (or presumably + // FIXME: any other header with whatever property is triggering + // FIXME: this)!? + // + if (name == 0) + return; + aslot = htab_find_slot_with_hash (assembler_name_hash, name, decl_assembler_name_hash (name), INSERT); @@ -156,6 +165,15 @@ unlink_from_assembler_name_hash (symtab_node node, bool with_clones) else { tree name = DECL_ASSEMBLER_NAME (node->symbol.decl); + + // FIXME: how does this nullptr get here when declaring a C++ + // FIXME: generic lambda and including iostream (or presumably + // FIXME: any other header with whatever property is triggering + // FIXME: this)!? + // + if (name == 0) + return; + void **slot; slot = htab_find_slot_with_hash (assembler_name_hash, name, decl_assembler_name_hash (name), -- 1.8.3
[PATCH 3/3] [lambda] [basic-terse-templates] Support using `auto' in a function parameter list to introduce an implicit template parameter.
--- gcc/cp/cp-tree.h | 11 gcc/cp/decl.c| 4 +- gcc/cp/parser.c | 57 ++--- gcc/cp/pt.c | 182 ++- 4 files changed, 231 insertions(+), 23 deletions(-) diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index a837d22..08d9d5e 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -1034,6 +1034,7 @@ struct GTY(()) saved_scope { int x_processing_template_decl; int x_processing_specialization; BOOL_BITFIELD x_processing_explicit_instantiation : 1; + BOOL_BITFIELD x_fully_implicit_template : 1; BOOL_BITFIELD need_pop_function_context : 1; int unevaluated_operand; @@ -1088,6 +1089,12 @@ struct GTY(()) saved_scope { #define processing_specialization scope_chain->x_processing_specialization #define processing_explicit_instantiation scope_chain->x_processing_explicit_instantiation +/* Nonzero if the function being declared was made a template due to it's + parameter list containing generic type specifiers (`auto' or concept + identifiers) rather than an explicit template parameter list. */ + +#define fully_implicit_template scope_chain->x_fully_implicit_template + /* The cached class binding level, from the most recently exited class, or NULL if none. 
*/ @@ -5453,12 +5460,16 @@ extern tree make_auto (void); extern tree make_decltype_auto (void); extern tree do_auto_deduction (tree, tree, tree); extern tree type_uses_auto (tree); +extern tree type_uses_auto_or_concept (tree); extern void append_type_to_template_for_access_check (tree, tree, tree, location_t); extern tree splice_late_return_type(tree, tree); extern bool is_auto(const_tree); +extern bool is_auto_or_concept (const_tree); extern tree process_template_parm (tree, location_t, tree, bool, bool); +extern tree add_implicit_template_parms(size_t, tree); +extern tree finish_fully_implicit_template (tree); extern tree end_template_parm_list (tree); extern void end_template_decl (void); extern tree maybe_update_decl_type (tree, tree); diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index c97134c..56b49dd 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -10329,9 +10329,9 @@ grokdeclarator (const cp_declarator *declarator, if (ctype || in_namespace) error ("cannot use %<::%> in parameter declaration"); - if (type_uses_auto (type)) + if (type_uses_auto (type) && cxx_dialect < cxx1y) { - error ("parameter declared %"); + error ("parameter declared % (unsupported prior to C++1y)"); type = error_mark_node; } diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 48c95e6..d6c4129 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -8933,6 +8933,11 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr) finish_template_decl (template_param_list); --parser->num_template_parameter_lists; } + else if (fully_implicit_template) + { + fco = finish_fully_implicit_template (fco); + --parser->num_template_parameter_lists; + } } finish_member_declaration (fco); @@ -16807,8 +16812,10 @@ cp_parser_direct_declarator (cp_parser* parser, /* Parse the parameter-declaration-clause. */ params = cp_parser_parameter_declaration_clause (parser); + /* Restore saved template parameter lists accounting for implicit +template parameters. 
*/ parser->num_template_parameter_lists - = saved_num_template_parameter_lists; + += saved_num_template_parameter_lists; /* Consume the `)'. */ cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN); @@ -17908,6 +17915,7 @@ cp_parser_parameter_declaration_list (cp_parser* parser, bool *is_error) tree *tail = ¶meters; bool saved_in_unbraced_linkage_specification_p; int index = 0; + int implicit_template_parms = 0; /* Assume all will go well. */ *is_error = false; @@ -17935,11 +17943,17 @@ cp_parser_parameter_declaration_list (cp_parser* parser, bool *is_error) deprecated_state = DEPRECATED_SUPPRESS; if (parameter) - decl = grokdeclarator (parameter->declarator, - ¶meter->decl_specifiers, - PARM, - parameter->default_argument != NULL_TREE, - ¶meter->decl_specifiers.attributes); + { + if (parameter->decl_specifiers.type && is_auto_or_concept + (parameter->decl_specifiers.type)) + ++implicit_template_parms; + + dec
Re: [Patch, AArch64, ILP32] 2/5 More backend changes and support for small absolute and small PIC addressing models
On 26 June 2013 23:35, Yufeng Zhang wrote: > This patch updates the AArch64 backend to support the small absolute and > small PIC addressing models for ILP32; it also updates a number of other > backend macros and hooks in order to support ILP32. > > OK for the trunk? OK /Marcus
Re: [Patch, AArch64, ILP32] 1/5 Initial support - configury changes
On 2 July 2013 19:53, Yufeng Zhang wrote: > Hi Andrew, > > Please find the updated patch in the attachment that addresses your > comments. > > It now builds both ilp32 and lp64 multilibs by default, with the > --with-multilib-list support remaining to provide options to turn off one of > them. > > -mabi=ilp32 and -mabi=lp64 are now the command line options to use. The > SPECs have been updated as well. > > > Thanks, > Yufeng OK /Marcus
Re: [Patch, AArch64, ILP32] 3/5 Minor change in function.c:assign_parm_find_data_types()
On 26 June 2013 23:39, Yufeng Zhang wrote: > This patch updates assign_parm_find_data_types to assign passed_mode and > nominal_mode with the mode of the built pointer type instead of the > hard-coded Pmode in the case of pass-by-reference. This is in line with the > assignment to passed_mode and nominal_mode in other cases inside the > function. > > assign_parm_find_data_types generally uses TYPE_MODE to calculate > passed_mode and nominal_mode: > > /* Find mode of arg as it is passed, and mode of arg as it should be > during execution of this function. */ > passed_mode = TYPE_MODE (passed_type); > nominal_mode = TYPE_MODE (nominal_type); > > this includes the case when the passed argument is a pointer by itself. > > However there is a discrepancy when it deals with argument passed by > invisible reference; it builds the argument's corresponding pointer type, > but sets passed_mode and nominal_mode with Pmode directly. > > This is OK for targets where Pmode == ptr_mode, but on AArch64 with ILP32 > they are different with Pmode as DImode and ptr_mode as SImode. When such a > reference is passed on stack, the reference is prepared by the caller in the > lower 4 bytes of an 8-byte slot but is fetched by the callee as an 8-byte > datum, of which the higher 4 bytes may contain junk. It is probably the > combination of Pmode != ptr_mode and the particular ABI specification that > make the AArch64 ILP32 the first target on which the issue manifests itself. > > Bootstrapped on x86_64-none-linux-gnu. > > OK for the trunk? Yufeng, this change makes sense to me, however I can't approve such a change. /Marcus
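The stack-slot mismatch described in the quoted text can be modelled outside the compiler. The sketch below is only an illustration under stated assumptions: Slot, junk_slot, store_le32, load_le32 and load_le64 are hypothetical names, the slot is treated as little-endian, and 0xAA stands in for whatever junk happens to be on the stack. The caller writes a 4-byte (ptr_mode-sized) reference into the low half of an 8-byte slot; a callee that reads all 8 bytes (Pmode-sized) picks up the junk in the upper half, while a 4-byte read recovers the reference.

```cpp
#include <cstdint>

// Hypothetical model of one 8-byte argument slot on the stack.
struct Slot {
  std::uint8_t bytes[8];
};

// A slot whose previous contents are junk (0xAA in every byte).
inline Slot junk_slot() {
  Slot s;
  for (int i = 0; i < 8; ++i) s.bytes[i] = 0xAA;
  return s;
}

// Caller side: store a 32-bit reference in the low 4 bytes only.
inline void store_le32(Slot& s, std::uint32_t v) {
  for (int i = 0; i < 4; ++i)
    s.bytes[i] = static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF);
}

// Callee side, matching the caller: read back only the low 4 bytes.
inline std::uint32_t load_le32(const Slot& s) {
  std::uint32_t v = 0;
  for (int i = 0; i < 4; ++i)
    v |= static_cast<std::uint32_t>(s.bytes[i]) << (8 * i);
  return v;
}

// Callee side, mismatched: read the whole 8-byte slot.
inline std::uint64_t load_le64(const Slot& s) {
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v |= static_cast<std::uint64_t>(s.bytes[i]) << (8 * i);
  return v;
}
```

With a reference value of 0x12345678 stored via store_le32 into a junk slot, load_le64 yields 0xAAAAAAAA12345678 while load_le32 yields 0x12345678, which mirrors why the callee must use the same mode as the caller.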
Re: [Patch, AArch64, ILP32] 5/5 Define _ILP32 and __ILP32__
On 26 June 2013 23:42, Yufeng Zhang wrote: > gcc/ > * config/aarch64/aarch64.h (TARGET_CPU_CPP_BUILTINS): Define _ILP32 > and __ILP32__ when the ILP32 model is in use. > OK /Marcus
[testsuite] ad PR52641: Skip more tests on int16 or size16 targets.
Here are some more skips of tests that won't work on int16 targets.

In most cases constants too big, bitfields too long or arrays too large are
used.

For gcc.dg/torture/pr56488.c there are some notes at

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56488#c4

that explain why this test fails if int = short.

Ok to apply?

Johann

        PR testsuite/52641
        * gcc.c-torture/execute/pr57344-2.x: New. Skip int16.
        * gcc.dg/pr53265.c: Add dg-require-effective-target size32plus.
        * gcc.dg/torture/pr53366-1.c: Same.
        * gcc.dg/torture/pr57381.c: Add dg-require-effective-target int32plus.
        * gcc.dg/torture/pr56488.c: Same.
        * gcc.dg/torture/pr57584.c: Same.
        * gcc.dg/tree-ssa/pr57385.c: Same.
        * gcc.dg/pr57154.c: Add dg-require-effective-target scheduling.

Index: gcc.c-torture/execute/pr57344-2.x
===
--- gcc.c-torture/execute/pr57344-2.x (revision 0)
+++ gcc.c-torture/execute/pr57344-2.x (revision 0)
@@ -0,0 +1,7 @@
+load_lib target-supports.exp
+
+if { [check_effective_target_int16] } {
+    return 1
+}
+
+return 0;
Index: gcc.dg/pr53265.c
===
--- gcc.dg/pr53265.c (revision 200903)
+++ gcc.dg/pr53265.c (working copy)
@@ -1,6 +1,7 @@
 /* PR tree-optimization/53265 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -Wall" } */
+/* { dg-require-effective-target size32plus } */
 
 void bar (void *);
 int baz (int);
Index: gcc.dg/torture/pr57381.c
===
--- gcc.dg/torture/pr57381.c (revision 200903)
+++ gcc.dg/torture/pr57381.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target int32plus } */
 
 struct S0 { int f0, f1, f2; };
 
Index: gcc.dg/torture/pr56488.c
===
--- gcc.dg/torture/pr56488.c (revision 200903)
+++ gcc.dg/torture/pr56488.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-require-effective-target int32plus } */
 
 int a, c, d = 1;
 struct S { int s; } b, f;
Index: gcc.dg/torture/pr53366-1.c
===
--- gcc.dg/torture/pr53366-1.c (revision 200903)
+++ gcc.dg/torture/pr53366-1.c (working copy)
@@ -1,5 +1,6 @@
 /* PR tree-optimization/53366 */
 /* { dg-do run } */
+/* { dg-require-effective-target size32plus } */
 
 extern void abort (void);
 
Index: gcc.dg/torture/pr57584.c
===
--- gcc.dg/torture/pr57584.c (revision 200903)
+++ gcc.dg/torture/pr57584.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target int32plus } */
 
 typedef int int32_t;
 typedef unsigned char uint8_t;
Index: gcc.dg/tree-ssa/pr57385.c
===
--- gcc.dg/tree-ssa/pr57385.c (revision 200903)
+++ gcc.dg/tree-ssa/pr57385.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O1" } */
+/* { dg-require-effective-target int32plus } */
 
 int c;
 
Index: gcc.dg/pr57154.c
===
--- gcc.dg/pr57154.c (revision 200903)
+++ gcc.dg/pr57154.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fschedule-insns" } */
+/* { dg-require-effective-target scheduling } */
 
 #define PF_FROZEN 0x0001
 #define likely(x) __builtin_expect(!!(x), 1)
Re: [Patch, AArch64, ILP32] 4/5 Change tests to be ILP32-friendly
On 26 June 2013 23:41, Yufeng Zhang wrote: > gcc/testsuite/ > > * gcc.dg/20020219-1.c: Skip the test on aarch64*-*-* in ilp32. > * gcc.target/aarch64/aapcs64/test_18.c (struct y): Change the field > type from long to long long. > * gcc.target/aarch64/atomic-op-long.c: Update dg-final directives > to have effective-target keywords of lp64 and ilp32. > * gcc.target/aarch64/fcvt_double_int.c: Likewise. > * gcc.target/aarch64/fcvt_double_long.c: Likewise. > * gcc.target/aarch64/fcvt_double_uint.c: Likewise. > * gcc.target/aarch64/fcvt_double_ulong.c: Likewise. > * gcc.target/aarch64/fcvt_float_int.c: Likewise. > * gcc.target/aarch64/fcvt_float_long.c: Likewise. > * gcc.target/aarch64/fcvt_float_uint.c: Likewise. > * gcc.target/aarch64/fcvt_float_ulong.c: Likewise. > * gcc.target/aarch64/vect_smlal_1.c: Replace 'long' with 'long > long'. > OK /Marcus
Re: MIPS eliminate trap-if-zero instruction if possible for divisions
Hi Jeff, Richard.

None of the RTL optimizers look at TRAP_IF with respect to optimizations.
I've decided to add some code to cprop.c and simplification primitives to
simplify_rtx (). This does work once I fixed cprop to run on my testcase,
which only had functions with 3 blocks; currently cprop thinks there's
nothing to do for those. I'll submit a patch separately for that.

I'm currently tracking down a latent bug in the cfg cleanup code which
triggers a control-flow-inside-basic-block abort due to a
(trap_if (condition) (const_int N)) being correctly optimized to
(trap_if (const_int 1) (const_int N)).

I ensure that when the condition is optimized to (const_int 1) the block
gets split after the unconditional trap, because that insn satisfies the
control_flow_insn_p () predicate. The split works, so the unconditional
trap_if is the last instruction in its block, but when cfg cleanup is run
later it undoes this split and merges the blocks again, thus triggering
the verify_flow abort.

Does anyone know the rationale (if any) for treating unconditional traps
as control flow but not conditional traps? I would have expected them to
be treated the same with respect to flow control unless the condition is
(const_int 0).

Graham
RE: [PATCH, AArch64] Add vabs_s64 intrinsic
> On 12 Jul 2013, at 19:49, Ian Bolton wrote:
>
> >
> > 2013-07-12 Ian Bolton
> >
> > gcc/
> > * config/aarch64/arm_neon.h (vabs_s64): New function.
> >
> > testsuite/
> > * gcc.target/aarch64/scalar_intrinsics.c (test_vabs_s64): Added new
> > test.
>
> OK
> /Marcus

I needed to update the patch to match argument naming conventions and
function ordering conventions. Here is the new one.

OK for commit?

Cheers,
Ian

Index: gcc/config/aarch64/arm_neon.h
===
--- gcc/config/aarch64/arm_neon.h (revision 200594)
+++ gcc/config/aarch64/arm_neon.h (working copy)
@@ -17874,6 +17874,12 @@ vabs_f32 (float32x2_t __a)
   return __builtin_aarch64_absv2sf (__a);
 }
 
+__extension__ static __inline int64x1_t __attribute__ ((__always_inline__))
+vabs_s64 (int64x1_t __a)
+{
+  return __builtin_llabs (__a);
+}
+
 __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
 vabsq_f32 (float32x4_t __a)
 {
Index: gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
===
--- gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c (revision 200594)
+++ gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c (working copy)
@@ -32,6 +32,18 @@ test_vaddd_s64_2 (int64x1_t a, int64x1_t
   vqaddd_s64 (a, d));
 }
 
+/* { dg-final { scan-assembler-times "\\tabs\\td\[0-9\]+, d\[0-9\]+" 1 } } */
+
+int64x1_t
+test_vabs_s64 (int64x1_t a)
+{
+  uint64x1_t res;
+  force_simd (a);
+  res = vabs_s64 (a);
+  force_simd (res);
+  return res;
+}
+
 /* { dg-final { scan-assembler-times "\\tcmeq\\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
 
 uint64x1_t
[PATCH, AArch64] Change to pass -mabi=* directly to the assembler
Hi,

Following the work in AArch64 GAS to unify the ABI command line
interface, this patch updates the compiler driver to pass -mabi=*
directly to the assembler.

The related GAS patch is here:
http://www.sourceware.org/ml/binutils/2013-07/msg00180.html

OK for the trunk (after the initial ILP32 patch set is committed)?

Thanks,
Yufeng

gcc/
        * config/aarch64/aarch64-elf.h (ASM_SPEC): Pass on -mabi=*.

diff --git a/gcc/config/aarch64/aarch64-elf.h b/gcc/config/aarch64/aarch64-elf.h
index 315a510..4757d22 100644
--- a/gcc/config/aarch64/aarch64-elf.h
+++ b/gcc/config/aarch64/aarch64-elf.h
@@ -140,8 +140,7 @@
 %{mlittle-endian:-EL} \
 %{mcpu=*:-mcpu=%*} \
 %{march=*:-march=%*} \
-%{mabi=ilp32*:-milp32} \
-%{mabi=lp64*:-mlp64}"
+%{mabi=*:-mabi=%*}"
 #endif
 
 #undef TYPE_OPERAND_FMT
Re: [patch,avr] Fix PR57516 fixed-point rounding in the overflow case
2013/7/18 Georg-Johann Lay :
> Currently, the fixed-point rounding does not work correctly in the overflow
> case. This is because of misreading section 2.1.7.2 of TR 18037.
>
> Rounding builtins expand to saturated addition and AND so that the instruction
> sequence is
>
>     add value1
>     if not overflow goto 0
>     load max value
> 0:
>     and value2
>
> where the correct sequence reads
>
>     add value1
>     if not overflow goto 0
>     load max value
>     goto 1
> 0:
>     and value2
> 1:
>
> This change is performed by the patch. The round expander is transformed to an
> insn that uses avr_out_plus and avr_out_bitop to print most of the
> instructions.
>
> Okay to apply?
>
> Johann
>
> gcc/
>         PR target/57516
>         * config/avr/avr-fixed.md (round3_const): Turn expander to insn.
>         * config/avr/avr.md (adjust_len): Add `round'.
>         * config/avr/avr-protos.h (avr_out_round): New prototype.
>         (avr_out_plus): Add `out_label' argument.
>         * config/avr/avr.c (avr_out_plus_1): Add `out_label' argument.
>         (avr_out_plus): Pass down `out_label' to avr_out_plus_1.
>         Handle the case where `insn' is just a pattern.
>         (avr_out_bitop): Handle the case where `insn' is just a pattern.
>         (avr_out_round): New function.
>         (avr_adjust_insn_length): Handle ADJUST_LEN_ROUND.
>
> libgcc/
>         PR target/57516
>         * config/avr/lib1funcs-fixed.S (__roundqq3, __rounduqq3)
>         (__round_s2_const, __round_u2_const)
>         (__round_s4_const, __round_u4_const, __round_x8):
>         Saturate result if addition result cannot be represented.
>
> gcc/testsuite/
>         PR target/57516
>         * gcc.target/avr/torture/builtins-4-roundfx.c (test2hr, test2k):
>         Adjust to corrected rounding.
>

Approved.

Denis.
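The difference between the two instruction sequences quoted above can be sketched in portable C++. This is only a model, not the AVR implementation: round_q7_old and round_q7_new are hypothetical names, the value is assumed to be a signed 8-bit fixed-point number with 7 fractional bits, and rp (the number of fractional bits to keep) is assumed to satisfy 0 < rp < 7. "value1" corresponds to half of the dropped range and "value2" to the mask that clears the dropped bits.

```cpp
#include <cstdint>

// Old sequence: after saturating to the maximum, control still falls
// through to the AND, so the saturated maximum gets masked as well.
inline std::int8_t round_q7_old(std::int8_t x, int rp) {
  const int drop = 7 - rp;              // fractional bits to discard
  const int half = 1 << (drop - 1);     // "value1": rounding increment
  const int mask = ~((1 << drop) - 1);  // "value2": clears discarded bits
  int sum = x + half;
  if (sum > INT8_MAX) sum = INT8_MAX;   // load max value
  return static_cast<std::int8_t>(sum & mask);
}

// Corrected sequence: on overflow the maximum is returned as is
// ("goto 1"), and only the non-overflow path applies the AND.
inline std::int8_t round_q7_new(std::int8_t x, int rp) {
  const int drop = 7 - rp;
  const int half = 1 << (drop - 1);
  const int mask = ~((1 << drop) - 1);
  const int sum = x + half;
  if (sum > INT8_MAX) return INT8_MAX;  // saturate and skip the AND
  return static_cast<std::int8_t>(sum & mask);
}
```

For x = 127 and rp = 3 the old variant yields 112 (the maximum with its low four bits cleared) while the new one yields 127; for inputs that do not overflow, such as 24, both agree on 32.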
Re: [PATCH, AArch64] Add vabs_s64 intrinsic
On 19/07/13 11:26, Ian Bolton wrote: On 12 Jul 2013, at 19:49, Ian Bolton wrote: 2013-07-12 Ian Bolton gcc/ * config/aarch64/arm_neon.h (vabs_s64): New function. testsuite/ * gcc.target/aarch64/scalar_intrinsics.c (test_vabs_s64): Added new test. OK /Marcus I needed to update the patch to match argument naming conventions and function ordering conventions. Here is the new one. OK for commit? Cheers, Ian OK /Marcus
Re: Fix GCC bug causing bootstrap failure with vectorizer turned on
On 07/18/2013 10:49 AM, Xinliang David Li wrote: The difference is that the relative order of DECL_UIDs does not change whether debug info is on or not, but there is no such guarantee when hashing is involved. Yea. I feel like an idiot for missing the hash vs comparison. Add a ChangeLog and this is good to go. Jeff
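The point about ordering can be illustrated with a small stand-alone C sketch. It is not GCC code: struct decl, cmp_by_uid and sort_decls_by_uid are hypothetical stand-ins, and the only assumption taken from the thread is that the relative order of the UIDs is stable. A comparison on UIDs then gives the same sequence even if the absolute UID values shift, which a hash-based order would not guarantee.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a declaration carrying a unique id.  */
struct decl {
    unsigned uid;
    const char *name;
};

/* Order two decl pointers by uid only; absolute values and gaps
   between uids do not matter, only their relative order.  */
static int cmp_by_uid(const void *a, const void *b)
{
    const struct decl *da = *(const struct decl *const *)a;
    const struct decl *db = *(const struct decl *const *)b;
    return (da->uid > db->uid) - (da->uid < db->uid);
}

/* Sort an array of decl pointers into uid order.  */
static void sort_decls_by_uid(struct decl **v, size_t n)
{
    assert(v != NULL || n == 0);
    qsort(v, n, sizeof *v, cmp_by_uid);
}
```

Sorting {30, 10, 20} and {40, 11, 25} with this comparator produces the same permutation, which is the property the comparison-based fix relies on.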
Re: RFC: Add of type-demotion pass
On 07/12/2013 01:13 PM, Marc Glisse wrote: Initial patch (from last year) actually implemented that in forwprop. Ah, reading the conversation from last year helped me understand a bit better. It's also worth noting that fold-const.c also does some type hoisting/sinking. Ideally that code should just be going away. So by implementing type-demotion there too, would lead to raise-condition. So there would be required additionally that within forwprop a straight line-depth conversion is done for statement-lists. All this doesn't fit pretty well into the current concept of forward-propagation ... The cast demotion is of course something of interest for folding and might be fitting into forward-propagation-pass too. The main cause why it is implemented within demotion pass is, that this pass introduces such cast-demotion-folding opportunities due to its "unsigned"-type expansion. So we want to fold that within the pass and not wait until a later pass optimizes such redundant sequences away. I hope we can at least find a way to share code between the passes. Well, I'd like to pull all of the type hoisting/sinking code out to its own pass. That may be a bit idealistic, but that's my hope/goal. If I understand, the main reason is because you want to go through the statements in reverse order, since this is the way the casts are being propagated (would forwprop also work, just more slowly, or would it miss opportunities across basic blocks?). It would miss some opportunities, Could you explain in what case? I guess my trouble understanding this is the same as in the next question, and I am missing a fundamental point... Anytime you hoist/sink a cast, you're moving it across an operation -- which can expose it as a redundant cast. Let's say you start with A = (T) x1; B = (T) x2; R = A & B; And sink the cast after the operation like this: C = x1 & x2; R = (T) C; We may find that that cast of C to type T is redundant. 
Similar cases can be found when we hoist the cast across the operation. I saw this kind of situation occur regularly when I was looking at Kai's hoisting/sinking patches. Now I believe one of Kai's goals is to allow our various pattern based folders to work better by not having to account for casting operations as often in sequences of statements we want to fold. I suspect that to see benefit from that we'd have to hoist, fold, sink, fold. That argues that hoisting/sinking should be independent of the folding step (which shouldn't be a surprise to any of us). That's a real good question; I find myself looking a lot at the bits in forwprop and I'm getting worried it's on its way to being an unmaintainable mess. Sadly, I'm making things worse rather than better with my recent changes. I'm still hoping more structure will become evident as I continue to work through various improvements. It looks to me like a gimple version of fold, except that since it is gimple, basic operations are an order of magnitude more complicated. But I don't really see why that would make it an unmaintainable mess, giant switches are not that bad. It's certainly moving towards a gimple version of fold and special casing everything as we convert from fold and to gimple_fold (or whatever we call it) is going to result in a horrid mess. I find myself thinking that at a high level we need to split out the forward and backward propagation bits into distinct passes. The backward propagation bits are just a tree combiner and the idioms used to follow the backward chains to create more complex trees and fold them need to be reusable components. It's totally silly (and ultimately unmaintainable) that each transformation is open-coding the walking of the use-def chain and simplification step. Jeff
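The A/B/R example in this message can be written out as a small C sketch. It is only an illustration under assumptions: T is taken to be uint8_t and the operands to be uint32_t, and the function names are made up here. It shows that sinking the two casts below the bitwise AND leaves a single cast and the same result.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t T;   /* hypothetical narrow type standing in for T */

/* Casts applied before the operation: A = (T) x1; B = (T) x2; R = A & B.  */
static T and_casts_first(uint32_t x1, uint32_t x2)
{
    T a = (T) x1;
    T b = (T) x2;
    return (T) (a & b);
}

/* Cast sunk below the operation: C = x1 & x2; R = (T) C.  */
static T and_cast_sunk(uint32_t x1, uint32_t x2)
{
    uint32_t c = x1 & x2;
    return (T) c;
}

/* For bitwise AND with a truncating cast, both forms agree.  */
static void check_equivalent(uint32_t x1, uint32_t x2)
{
    assert(and_casts_first(x1, x2) == and_cast_sunk(x1, x2));
}
```

Once the cast sits alone after the operation, a later fold has only one statement to inspect when deciding whether the cast is redundant, which is the situation described above.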
Re: [Patch x86/darwin] fix PR51784 'PIC register not correctly preserved...'
Hi Uros, thanks for your reviews, On 18 Jul 2013, at 12:39, Uros Bizjak wrote: > On Thu, Jul 18, 2013 at 12:12 PM, Iain Sandoe wrote: >> >> So, I think we have to use the define_insn_and_split, or am I still missing >> something? > > Just a wild guess, do you also need "&& reload_completed" in the split > condition? good catch, thanks - this got cut erroneously from the last variant of the patch. Fixed & re-tested on x86_64-darwin12 / x86_64-linux (both at m32 and m64) showing the expected progressions on darwin (and correct behaviour on linux for the shlib example). OK for trunk? Ok for open branches? (this is a wrong-code bug) [N.B. No changes to the darwin-specific portions already approved by Mike] thanks Iain gcc/ PR target/51784 * config/i386/i386.c (output_set_got) [TARGET_MACHO]: Adjust to emit a second label for nonlocal goto receivers. Don't output pic base labels unless we're producing PIC; mark that action unreachable(). (ix86_save_reg): If the function contains a nonlocal label, save the PIC base reg. * config/darwin-protos.h (machopic_should_output_picbase_label): New. * gcc/config/darwin.c (emitted_pic_label_num): New GTY. (update_pic_label_number_if_needed): New. (machopic_output_function_base_name): Adjust for nonlocal receiver case. (machopic_should_output_picbase_label): New. * config/i386/i386.md (enum unspecv): UNSPECV_NLGR: New. (nonlocal_goto_receiver): New insn and split. 
gcc/config/darwin-protos.h | 1 + gcc/config/darwin.c| 30 +- gcc/config/i386/i386.c | 23 ++- gcc/config/i386/i386.md| 34 +- 4 files changed, 73 insertions(+), 15 deletions(-) diff --git a/gcc/config/darwin-protos.h b/gcc/config/darwin-protos.h index 0755e94..70b7fb0 100644 --- a/gcc/config/darwin-protos.h +++ b/gcc/config/darwin-protos.h @@ -25,6 +25,7 @@ extern void machopic_validate_stub_or_non_lazy_ptr (const char *); extern void machopic_output_function_base_name (FILE *); extern const char *machopic_indirection_name (rtx, bool); extern const char *machopic_mcount_stub_name (void); +extern bool machopic_should_output_picbase_label (void); #ifdef RTX_CODE diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c index a049a5d..e07fa4c 100644 --- a/gcc/config/darwin.c +++ b/gcc/config/darwin.c @@ -369,14 +369,13 @@ machopic_gen_offset (rtx orig) static GTY(()) const char * function_base_func_name; static GTY(()) int current_pic_label_num; +static GTY(()) int emitted_pic_label_num; -void -machopic_output_function_base_name (FILE *file) +static void +update_pic_label_number_if_needed (void) { const char *current_name; - /* If dynamic-no-pic is on, we should not get here. */ - gcc_assert (!MACHO_DYNAMIC_NO_PIC_P); /* When we are generating _get_pc thunks within stubs, there is no current function. */ if (current_function_decl) @@ -394,7 +393,28 @@ machopic_output_function_base_name (FILE *file) ++current_pic_label_num; function_base_func_name = "L_machopic_stub_dummy"; } - fprintf (file, "L%011d$pb", current_pic_label_num); +} + +void +machopic_output_function_base_name (FILE *file) +{ + /* If dynamic-no-pic is on, we should not get here. 
*/ + gcc_assert (!MACHO_DYNAMIC_NO_PIC_P); + + update_pic_label_number_if_needed (); + fprintf (file, "L%d$pb", current_pic_label_num); +} + +bool +machopic_should_output_picbase_label (void) +{ + update_pic_label_number_if_needed (); + + if (current_pic_label_num == emitted_pic_label_num) +return false; + + emitted_pic_label_num = current_pic_label_num; + return true; } /* The suffix attached to non-lazy pointer symbols. */ diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 5df6ab7..f523c2a 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -8827,10 +8827,8 @@ output_set_got (rtx dest, rtx label ATTRIBUTE_UNUSED) output_asm_insn ("mov%z0\t{%2, %0|%0, %2}", xops); #if TARGET_MACHO - /* Output the Mach-O "canonical" label name ("Lxx$pb") here too. This - is what will be referenced by the Mach-O PIC subsystem. */ - if (!label) - ASM_OUTPUT_LABEL (asm_out_file, MACHOPIC_FUNCTION_BASE_NAME); + /* We don't need a pic base, we're not producing pic. */ + gcc_unreachable (); #endif targetm.asm_out.internal_label (asm_out_file, "L", @@ -8845,12 +8843,18 @@ output_set_got (rtx dest, rtx label ATTRIBUTE_UNUSED) xops[2] = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name)); xops[2] = gen_rtx_MEM (QImode, xops[2]); output_asm_insn ("call\t%X2", xops); - /* Output the Mach-O "canonical" label name ("Lxx$pb") here too. This - is what will be referenced by the Mach-O PIC subsystem. */ + #if TARGET_MACHO - if (!label) + /* Output the Mach-O "canonical" pic base label name ("Lxx$pb")
Re: [Patch x86/darwin] fix PR51784 'PIC register not correctly preserved...'
On Fri, Jul 19, 2013 at 02:47:51PM +0100, Iain Sandoe wrote: > + /* Get a new pic base. */ > + emit_insn (gen_set_got_labelled (pic_offset_table_rtx, label_rtx)); > + /* Correct this with the offset from the new to the old. */ > + xops[0] = xops[1] = pic_offset_table_rtx; > + label_rtx = gen_rtx_LABEL_REF (SImode, label_rtx); > + tmp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, label_rtx), > UNSPEC_MACHOPIC_OFFSET); Too long line, please wrap it. > + xops[2] = gen_rtx_CONST (Pmode, tmp); > + ix86_expand_binary_operator (MINUS, SImode, xops); > +} > + DONE; > +}) > + > ;; Avoid redundant prefixes by splitting HImode arithmetic to SImode. > > (define_split Jakub
[PATCH,committed] Disable intrinsic_nearest on AIX
Fortran intrinsic_nearest testcase fails on AIX because the magic single precision floating point value loaded as hex (z'7f7f') is not preserved by PowerPC lfs instruction and nextafterf() does not produce the expected result. Index: intrinsic_nearest.x === --- intrinsic_nearest.x (revision 201046) +++ intrinsic_nearest.x (working copy) @@ -2,5 +2,9 @@ # No Inf/NaN support on SPU. return 1 } +if [istarget "powerpc-ibm-aix*"] { +# z'7f7f' value not preserved by lfs instruction. +return 1 +} add-ieee-options return 0
[ARM][Insn classification refactoring 6/N] Delete "insn" attribute and update MOV classification
Hi, This patch is part of the ongoing work of ARM instruction classification cleanup. This patch deletes the "insn" attribute and moves the MOV/MVN instruction classification to the "type" attribute, where it is split into several types for a finer-grained classification. This has been tested with a full arm-none-eabi toolchain build and regression run, as well as using random code generation tests to compare the output versus a baseline compiler. OK for trunk? Thanks Sofiane - ChangeLog: * config/arm/arm.md (attribute "insn"): Delete. (attribute "type"): Add "mov_imm", "mov_reg", "mov_shift", "mov_shift_reg", "mvn_imm", "mvn_reg", "mvn_shift" and "mvn_shift_reg". (not_shiftsi): Update for attribute change. (not_shiftsi_compare0): Likewise. (not_shiftsi_compare0_scratch): Likewise. (arm_one_cmplsi2): Likewise. (thumb1_one_cmplsi2): Likewise. (notsi_compare0): Likewise. (notsi_compare0_scratch): Likewise. (thumb1_movdi_insn): Likewise. (arm_movsi_insn): Likewise. (movhi_insn_arch4): Likewise. (movhi_bytes): Likewise. (arm_movqi_insn): Likewise. (thumb1_movqi_insn): Likewise. (arm32_movhf): Likewise. (thumb1_movhf): Likewise. (arm_movsf_soft_insn): Likewise. (thumb1_movsf_insn): Likewise. (thumb_movdf_insn): Likewise. (movsicc_insn): Likewise. (movsfcc_soft_insn): Likewise. (and_scc): Likewise. (cond_move): Likewise. (if_move_not): Likewise. (if_not_move): Likewise. (if_shift_move): Likewise. (if_move_shift): Likewise. (if_shift_shift): Likewise. (if_not_arith): Likewise. (if_arith_not): Likewise. (cond_move_not): Likewise. * config/arm/neon.md (neon_mov): Update for attribute change. (neon_mov): Likewise. * config/arm/vfp.md (arm_movsi_vfp): Update for attribute change. (thumb2_movsi_vfp): Likewise. (movsf_vfp): Likewise. (thumb2_movsf_vfp): Likewise. * config/arm/arm.c (xscale_sched_adjust_cost): Update for attribute change. (cortexa7_older_only): Likewise. (cortexa7_younger): Likewise. * config/arm/arm1020e.md (1020alu_op): Update for attribute change. 
(1020alu_shift_op): Likewise. (1020alu_shift_reg_op): Likewise. * config/arm/arm1026ejs.md (alu_op): Update for attribute change. (alu_shift_op): Likewise. (alu_shift_reg_op): Likewise. * config/arm/arm1136jfs.md (11_alu_op): Update for attribute change. (11_alu_shift_op): Likewise. (11_alu_shift_reg_op): Likewise. * config/arm/arm926ejs.md (9_alu_op): Update for attribute change. (9_alu_shift_reg_op): Likewise. * config/arm/cortex-a15.md (cortex_a15_alu): Update for attribute change. (cortex_a15_alu_shift): Likewise. (cortex_a15_alu_shift_reg): Likewise. * config/arm/cortex-a5.md (cortex_a5_alu): Update for attribute change. (cortex_a5_alu_shift): Likewise. * config/arm/cortex-a53.md (cortex_a53_alu): Update for attribute change. (cortex_a53_alu_shift): Likewise. * config/arm/cortex-a7.md (cortex_a7_alu_imm): Update for attribute change. (cortex_a7_alu_reg): Likewise. (cortex_a7_alu_shift): Likewise. * config/arm/cortex-a8.md (cortex_a8_alu): Update for attribute change. (cortex_a8_alu_shift): Likewise. (cortex_a8_alu_shift_reg): Likewise. (cortex_a8_mov): Likewise. * config/arm/cortex-a9.md (cortex_a9_dp): Update for attribute change. (cortex_a9_dp_shift): Likewise. * config/arm/cortex-m4.md (cortex_m4_alu): Update for attribute change. * config/arm/cortex-r4.md (cortex_r4_alu): Update for attribute change. (cortex_r4_mov): Likewise. (cortex_r4_alu_shift): Likewise. (cortex_r4_alu_shift_reg): Likewise. * config/arm/fa526.md (526_alu_op): Update for attribute change. (526_alu_shift_op): Likewise. * config/arm/fa606te.md (606te_alu_op): Update for attribute change. * config/arm/fa626te.md (626te_alu_op): Update for attribute change. (626te_alu_shift_op): Likewise. * config/arm/fa726te.md (726te_shift_op): Update for attribute change. (726te_alu_op): Likewise. (726te_alu_shift_op): Likewise. (726te_alu_shift_reg_op): Likewise. * config/arm/fmp626.md (mp626_alu_op): Update for attribute change. (mp626_alu_shift_op): Likewise. 
* config/arm/marvell-pj4.md (pj4_alu_e1): Update for attribute change. (pj4_alu_e1_conds): Likewise. (pj4_alu): Likewise. (pj4_alu_conds): Likewise. (pj4_shift): Likewise. (pj4_shift_conds): Likewise. (pj4_alu_shift): Likewise. (pj4_alu_shift_conds): Likewise.
Re: [ubsan] Add libcall arguments
On Thu, Jul 18, 2013 at 03:47:28PM -0400, Jason Merrill wrote: > Please describe the hash table more up here. What are you tracking? > > >+ hashval_t h = iterative_hash_object (data->type, 0); > >+ h = iterative_hash_object (data->decl, h); > > If you hash the decl as well as the type, the find_slot in > ubsan_type_descriptor will almost never find an existing entry. Yeah. > >I have yet to handle > >freeing the hash table, but I think I'll need the GTY machinery for > >this (ubsan is not a pass, so I can't just call it at the end of the > >pass). Or maybe just create a destructor and use append_to_statement_list. > > That won't work; append_to_statement_list is for things that happen > at runtime, but freeing the hash table is something that needs to > happen in the compiler. Also, I wonder if we need to free the hash table ever, the only meaningful place would be when we are done with compilation, if all uses of this (even in the future) are from the FEs, perhaps freeing the hash table somewhere after parsing of the whole CU finished would be fine, but if we need it even from the middle-end, that might be too early. > >+/* This routine returns a magic number for TYPE. > >+ ??? This is probably too ugly. Tweak it. */ > >+ > >+static unsigned short > >+get_tinfo_for_type (tree type) > > Why map from size to some magic number rather than use the size > directly? Also, "tinfo" sounds to me like something to do with C++ > type_info. Yeah, get_ubsan_type_info_for_type would be better, plus the docs say that it should be log2 of the bit width, so I'd expect that you return (exact_log2 (TYPE_PRECISION (type)) << 1) | !TYPE_UNSIGNED (type) (perhaps with error if exact_log2 returns -1) and thus handle all possible type bitsizes. Or if you prefer to use GET_MODE_BITSIZE, that. Have you checked what clang does if you use typedefs? Does it use the names appearing in the typedefs rather than their underlying types, or is it TYPE_MAIN_VARIANT (type) actually? 
Also: ASM_FORMAT_PRIVATE_NAME (tmp_name, ".Lubsan_type", type_var_id_num++); I don't think the . at the beginning is going to work everywhere (on targets that don't support dots in labels), I think you want ASM_GENERATE_INTERNAL_LABEL instead and just use "Lubsan_type" as the second argument (note, this macro needs a buffer, doesn't alloca it). Jakub
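The encoding suggested in this message, (exact_log2 (TYPE_PRECISION (type)) << 1) | !TYPE_UNSIGNED (type), can be tried out in a stand-alone C sketch. This is not the ubsan patch: exact_log2_sketch only imitates the behaviour described for exact_log2 (returning -1 when the value is not a power of two), and ubsan_type_info_sketch takes a plain bit width and signedness flag instead of a tree.

```c
#include <assert.h>

/* Stand-in for exact_log2: log2 of X if X is a power of two, else -1.  */
static int exact_log2_sketch(unsigned x)
{
    int n = 0;
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    while ((x >>= 1) != 0)
        n++;
    return n;
}

/* Hypothetical helper: log2 of the bit width in the upper bits, the
   signedness in bit 0.  Returns -1 if the width is not a power of two.  */
static int ubsan_type_info_sketch(unsigned precision, int is_unsigned)
{
    int log2_width = exact_log2_sketch(precision);
    if (log2_width < 0)
        return -1;
    return (log2_width << 1) | !is_unsigned;
}

/* Usage sketch.  */
static void ubsan_type_info_demo(void)
{
    assert(ubsan_type_info_sketch(32, 0) == 11); /* signed 32-bit: (5 << 1) | 1 */
    assert(ubsan_type_info_sketch(8, 1) == 6);   /* unsigned 8-bit: (3 << 1) | 0 */
    assert(ubsan_type_info_sketch(24, 0) == -1); /* 24 is not a power of two */
}
```

Returning -1 here corresponds to the "perhaps with error" case mentioned above; how the real code should report it is left open in the thread.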
Re: RFC: Add of type-demotion pass
Jeff Law wrote: >On 07/12/2013 01:13 PM, Marc Glisse wrote: >>> Initial patch (from last year) actual implemented that in forwprop. >> >> Ah, reading the conversation from last year helped me understand a >bit >> better. >It's also worth noting that fold-const.c also does some type >hoisting/sinking. Ideally that code should just be going away. > > >>> So by implementing type-demotion there too, would lead to >>> raise-condition. So there would be required additionally that >within >>> forwprop a straight line-depth conversion is done for >>> statement-lists. All this doesn't fit pretty well into current >>> concept of forward-propagation ... The cast demotion is of course >>> something of interest for folding and might be fitting into >>> forward-propagation-pass too. The main cause why it is implemented >>> within demotion pass is, that this pass introduces such >>> cast-demotion-folding opportunities due its "unsigned"-type >expansion. >>> So we want to fold that within pass and not waiting until a later >pass >>> optimizes such redundant sequences away. >> >> I hope we can at least find a way to share code between the passes. >Well, I'd like to pull all of the type hoisting/sinking code out to its > >own pass. That may be a bit idealistic, but that's my hope/goal. > > >> If I understand, the main reason is because you want to go through >the statements in reverse order, since this is the way the casts are >being propagated (would forwprop also work, just more slowly, or would it >miss opportunities across basic blocks?). >>> It would miss some opportunities, >> >> Could you explain in what case? I guess my trouble understanding this >is >> the same as in the next question, and I am missing a fundamental >point... >Anytime you hoist/sink a cast, you're moving it across an operation -- >which can expose it as a redundant cast. 
> >Let's say you start with > >A = (T) x1; >B = (T) x2; >R = A & B; > > >And sink the cast after the operation like this: > >C = x1 & x2; >R = (T) C; > >We may find that that cast of C to type T is redundant. Similar cases >can be found when we hoist the cast across the operation. I saw this >kind of situation occur regularly when I was looking at Kai's >hoisting/sinking patches. > >Now I believe one of Kai's goals is to allow our various pattern based >folders to work better by not having to account for casting operations >as often in sequences of statements we want to fold. I suspect that to > >see benefit from that we'd have to hoist, fold, sink, fold. That >argues >that hoisting/sinking should be independent of the folding step (which >shouldn't be a surprise to any of us). > > >>> That's a real good question; I find myself looking a lot at the bits >>> in forwprop and I'm getting worried it's on its way to being an >>> unmaintainable mess. Sadly, I'm making things worse rather than >>> better with my recent changes. I'm still hoping more structure will >>> become evident as I continue to work through various improvements. >> >> It looks to me like a gimple version of fold, except that since it is >> gimple, basic operations are an order of magnitude more complicated. >But >> I don't really see why that would make it an unmaintainable mess, >giant >> switches are not that bad. >It's certainly moving towards a gimple version of fold and special >casing everything as we convert from fold and to gimple_fold (or >whatever we call it) is going to result in a horrid mess. > >I find myself thinking that at a high level we need to split out the >forward and backward propagation bits into distinct passes. The >backward propagation bits are just a tree combiner and the idioms used >to follow the backward chains to create more complex trees and fold >them >need to be reusable components. 
It's totally silly (and ultimately >unmaintainable) that each transformation is open-coding the walking of >the use-def chain and simplification step. Splitting forward and backward propagation into separate passes creates a pass ordering issue. Rather than that, all forward propagations should be formulated as backward or the other way around. A bit awkward in some cases maybe, but you can at least drive forward propagations from their sinks. As for a type demotion pass - I'd rather have this kind of thing as a lowering exposing mode promotion sometime late in the simple optimizing stage. Btw, other demotions still happen from convert.c ... Richard. >Jeff
Re: RFC: Add of type-demotion pass
On Fri, 19 Jul 2013, Jeff Law wrote: It's also worth noting that fold-const.c also does some type hoisting/sinking. fold-const.c does everything ;-) Ideally that code should just be going away. Like most of the code from fold-const.c that isn't about folding constants, I guess, once it has a replacement at the gimple level. If I understand, the main reason is because you want to go through the statements in reverse order, since this is the way the casts are being propagated (would forwprop also work, just more slowly, or would it miss opportunities across basic blocks?). It would miss some opportunities, Could you explain in what case? I guess my trouble understanding this is the same as in the next question, and I am missing a fundamental point... Anytime you hoist/sink a cast, you're moving it across an operation -- which can expose it as a redundant cast. Let's say you start with A = (T) x1; B = (T) x2; R = A & B; And sink the cast after the operation like this: C = x1 & x2; R = (T) C; We may find that that cast of C to type T is redundant. Similar cases can be found when we hoist the cast across the operation. I saw this kind of situation occur regularly when I was looking at Kai's hoisting/sinking patches. IIRC the question here was how forwprop could miss opportunities that a backward (reverse dominance) walk finds, and I don't see that in this example. If the cast is redundant, forwprop is just as likely to detect it. Now I believe one of Kai's goals is to allow our various pattern based folders to work better by not having to account for casting operations as often in sequences of statements we want to fold. Ah. I thought it was mostly so operations would be performed in a smaller, hopefully cheaper mode. Simplifying combiner patterns would be nice if that worked. That's a real good question; I find myself looking a lot at the bits in forwprop and I'm getting worried it's on its way to being an unmaintainable mess. 
Sadly, I'm making things worse rather than better with my recent changes. I'm still hoping more structure will become evident as I continue to work through various improvements. It looks to me like a gimple version of fold, except that since it is gimple, basic operations are an order of magnitude more complicated. But I don't really see why that would make it an unmaintainable mess, giant switches are not that bad. It's certainly moving towards a gimple version of fold and special casing everything as we convert from fold and to gimple_fold (or whatever we call it) is going to result in a horrid mess. I find myself thinking that at a high level we need to split out the forward and backward propagation bits into distinct passes. They are already mostly split (2 separate loops in ssa_forward_propagate_and_combine), so that wouldn't be hard. The backward propagation bits are just a tree combiner and the idioms used to follow the backward chains to create more complex trees and fold them need to be reusable components. It's totally silly (and ultimately unmaintainable) that each transformation is open-coding the walking of the use-def chain and simplification step. It isn't completely open-coded. get_prop_source_stmt, can_propagate_from, defcodefor_name, etc make walking the use-def chain not so bad. Usually at this point Richard refers to Andrew's work on a gimple combiner... -- Marc Glisse
Re: [PATCH 1/3] [lambda] Support template-parameter-list in lambda-declarator.
On 07/19/2013 05:00 AM, Adam Butcher wrote: + warning (0, "Conversion of a generic lambda to a function pointer is not currently implemented."); ... +"Generic lambdas are only supported in C++1y mode."); We generally don't capitalize diagnostics or end them with a period. And the second message should mention "-std=c++1y or -std=gnu++1y". + push_deferring_access_checks (dk_deferred); Why do you need this? +/* Nonzero if the function being declared was made a template due to it's "its" + error ("parameter declared % (unsupported prior to C++1y)"); This should also mention the compiler flags. We should probably introduce a maybe_warn_cpp1y function to go along with ..._cpp0x. + else // extend current template parameter list Do we still need to do this, now that we're handling all the parameters at the end of the parameter list? Jason
Re: [PING] Two patches pending
On 07/09/2013 01:53 AM, Bin Cheng wrote: Hi, FW: [PATCH GCC]Relax the probability condition in CE pass when optimizing for code size http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00969.html This is OK. [PATCH ARM]Extend thumb1_reorg to save more comparison instructions http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01057.html Ping Richard Earnshaw, Nick and Ramana privately -- one of them should be able to handle it. I don't think Paul Brook is active anymore. jeff
Re: [PATCH] Fix raw-string handling (PR preprocessor/57620)
On Thu, Jul 18, 2013 at 12:29:25PM -0400, Jason Merrill wrote: > Hmm, that logic is difficult to follow. It needs comments at least > explaining last_seen_* and why the loop in the suffix handling keeps > going after we change the phase to RAW_STR. > > Maybe instead of tracking last_seen_* BUFF_APPEND could copy into a > short local char array as well as the string buffer? So like this? BUFF_APPEND copies also the already parsed part of the string, so I didn't want to do the copying in there. One simplification (but perhaps making the code harder to read) would be to set temp_buffer_len = 17 while phase = RAW_STR, then all we could check would be if (temp_buffer_len < 17) and not look at phase at all. 2013-07-19 Jakub Jelinek PR preprocessor/57620 * lex.c (lex_raw_string): Undo phase1 and phase2 transformations between R" and final " rather than only in between R"del( and )del". * c-c++-common/raw-string-2.c (s12, u12, U12, L12): Remove. (main): Don't test {s,u,U,L}12. * c-c++-common/raw-string-13.c: New test. * c-c++-common/raw-string-14.c: New test. * c-c++-common/raw-string-15.c: New test. * c-c++-common/raw-string-16.c: New test. --- libcpp/lex.c.jj 2013-07-10 18:50:45.229759934 +0200 +++ libcpp/lex.c2013-07-19 19:19:05.675374588 +0200 @@ -1373,11 +1373,17 @@ static void lex_raw_string (cpp_reader *pfile, cpp_token *token, const uchar *base, const uchar *cur) { - const uchar *raw_prefix; - unsigned int raw_prefix_len = 0; + uchar raw_prefix[17]; + uchar temp_buffer[18]; + const uchar *orig_base; + unsigned int raw_prefix_len = 0, raw_suffix_len = 0; + enum raw_str_phase { RAW_STR_PREFIX, RAW_STR, RAW_STR_SUFFIX }; + raw_str_phase phase = RAW_STR_PREFIX; enum cpp_ttype type; size_t total_len = 0; + size_t temp_buffer_len = 0; _cpp_buff *first_buff = NULL, *last_buff = NULL; + size_t raw_prefix_start; _cpp_line_note *note = &pfile->buffer->notes[pfile->buffer->cur_note]; type = (*base == 'L' ? 
CPP_WSTRING : @@ -1385,57 +1391,6 @@ lex_raw_string (cpp_reader *pfile, cpp_t *base == 'u' ? (base[1] == '8' ? CPP_UTF8STRING : CPP_STRING16) : CPP_STRING); - raw_prefix = cur + 1; - while (raw_prefix_len < 16) -{ - switch (raw_prefix[raw_prefix_len]) - { - case ' ': case '(': case ')': case '\\': case '\t': - case '\v': case '\f': case '\n': default: - break; - /* Basic source charset except the above chars. */ - case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': - case 'g': case 'h': case 'i': case 'j': case 'k': case 'l': - case 'm': case 'n': case 'o': case 'p': case 'q': case 'r': - case 's': case 't': case 'u': case 'v': case 'w': case 'x': - case 'y': case 'z': - case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': - case 'G': case 'H': case 'I': case 'J': case 'K': case 'L': - case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R': - case 'S': case 'T': case 'U': case 'V': case 'W': case 'X': - case 'Y': case 'Z': - case '0': case '1': case '2': case '3': case '4': case '5': - case '6': case '7': case '8': case '9': - case '_': case '{': case '}': case '#': case '[': case ']': - case '<': case '>': case '%': case ':': case ';': case '.': - case '?': case '*': case '+': case '-': case '/': case '^': - case '&': case '|': case '~': case '!': case '=': case ',': - case '"': case '\'': - raw_prefix_len++; - continue; - } - break; -} - - if (raw_prefix[raw_prefix_len] != '(') -{ - int col = CPP_BUF_COLUMN (pfile->buffer, raw_prefix + raw_prefix_len) - + 1; - if (raw_prefix_len == 16) - cpp_error_with_line (pfile, CPP_DL_ERROR, token->src_loc, col, -"raw string delimiter longer than 16 characters"); - else - cpp_error_with_line (pfile, CPP_DL_ERROR, token->src_loc, col, -"invalid character '%c' in raw string delimiter", -(int) raw_prefix[raw_prefix_len]); - pfile->buffer->cur = raw_prefix - 1; - create_literal (pfile, token, base, raw_prefix - 1 - base, CPP_OTHER); - return; -} - - cur = raw_prefix + raw_prefix_len + 1; - for (;;) -{ #define 
BUF_APPEND(STR,LEN)\ do { \ bufring_append (pfile, (const uchar *)(STR), (LEN), \ @@ -1443,10 +1398,16 @@ lex_raw_string (cpp_reader *pfile, cpp_t total_len += (LEN); \ } while (0); + orig_base = base; + ++cur; + raw_prefix_start = cur - base; + for (;;) +{ cppchar_t c; /* If we previously performed any trigraph or line splicing -transformations, undo them within the body of the raw
Re: [testsuite] ad PR52641: Skip more tests on int16 or size16 targets.
On Jul 19, 2013, at 2:23 AM, Georg-Johann Lay wrote: > Here are some more skips of tests that won't work on int16 targets. > > In most cases constants too big, bitfields too long or arrays too large are > used. For gcc.dg/torture/pr56488.c there are some notes at > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56488#c4 > > that explain why this test fails if int = short. > > Ok to apply? Ok.
Re: [PATCH] Fix raw-string handling (PR preprocessor/57620)
On Fri, Jul 19, 2013 at 07:33:19PM +0200, Jakub Jelinek wrote: > On Thu, Jul 18, 2013 at 12:29:25PM -0400, Jason Merrill wrote: > > Hmm, that logic is difficult to follow. It needs comments at least > > explaining last_seen_* and why the loop in the suffix handling keeps > > going after we change the phase to RAW_STR. > > > > Maybe instead of tracking last_seen_* BUFF_APPEND could copy into a > > short local char array as well as the string buffer? > > So like this? BUFF_APPEND copies also the already parsed part of the > string, so I didn't want to do the copying in there. > One simplification (but perhaps making the code harder to read) > would be to set temp_buffer_len = 17 while phase = RAW_STR, then > all we could check would be if (temp_buffer_len < 17) and not look at phase > at all. Here is yet another version that does it in BUF_APPEND (plus the single place that didn't use BUF_APPEND and needs it), but won't do it for the BUF_APPEND (base, ...) case where we don't want to write anything. And it also has temp_buffer_len = 17 for RAW_STR change. 2013-07-19 Jakub Jelinek PR preprocessor/57620 * lex.c (lex_raw_string): Undo phase1 and phase2 transformations between R" and final " rather than only in between R"del( and )del". * c-c++-common/raw-string-2.c (s12, u12, U12, L12): Remove. (main): Don't test {s,u,U,L}12. * c-c++-common/raw-string-13.c: New test. * c-c++-common/raw-string-14.c: New test. * c-c++-common/raw-string-15.c: New test. * c-c++-common/raw-string-16.c: New test. 
--- libcpp/lex.c.jj 2013-07-10 18:50:45.229759934 +0200 +++ libcpp/lex.c2013-07-19 19:44:50.175035080 +0200 @@ -1373,11 +1373,20 @@ static void lex_raw_string (cpp_reader *pfile, cpp_token *token, const uchar *base, const uchar *cur) { - const uchar *raw_prefix; - unsigned int raw_prefix_len = 0; + uchar raw_prefix[17]; + uchar temp_buffer[18]; + const uchar *orig_base; + unsigned int raw_prefix_len = 0, raw_suffix_len = 0; + enum raw_str_phase { RAW_STR_PREFIX, RAW_STR, RAW_STR_SUFFIX }; + raw_str_phase phase = RAW_STR_PREFIX; enum cpp_ttype type; size_t total_len = 0; + /* Index into temp_buffer during phases other than RAW_STR, + during RAW_STR phase 17 to tell BUF_APPEND that nothing should + be appended to temp_buffer. */ + size_t temp_buffer_len = 0; _cpp_buff *first_buff = NULL, *last_buff = NULL; + size_t raw_prefix_start; _cpp_line_note *note = &pfile->buffer->notes[pfile->buffer->cur_note]; type = (*base == 'L' ? CPP_WSTRING : @@ -1385,68 +1394,31 @@ lex_raw_string (cpp_reader *pfile, cpp_t *base == 'u' ? (base[1] == '8' ? CPP_UTF8STRING : CPP_STRING16) : CPP_STRING); - raw_prefix = cur + 1; - while (raw_prefix_len < 16) -{ - switch (raw_prefix[raw_prefix_len]) - { - case ' ': case '(': case ')': case '\\': case '\t': - case '\v': case '\f': case '\n': default: - break; - /* Basic source charset except the above chars. 
*/ - case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': - case 'g': case 'h': case 'i': case 'j': case 'k': case 'l': - case 'm': case 'n': case 'o': case 'p': case 'q': case 'r': - case 's': case 't': case 'u': case 'v': case 'w': case 'x': - case 'y': case 'z': - case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': - case 'G': case 'H': case 'I': case 'J': case 'K': case 'L': - case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R': - case 'S': case 'T': case 'U': case 'V': case 'W': case 'X': - case 'Y': case 'Z': - case '0': case '1': case '2': case '3': case '4': case '5': - case '6': case '7': case '8': case '9': - case '_': case '{': case '}': case '#': case '[': case ']': - case '<': case '>': case '%': case ':': case ';': case '.': - case '?': case '*': case '+': case '-': case '/': case '^': - case '&': case '|': case '~': case '!': case '=': case ',': - case '"': case '\'': - raw_prefix_len++; - continue; - } - break; -} - - if (raw_prefix[raw_prefix_len] != '(') -{ - int col = CPP_BUF_COLUMN (pfile->buffer, raw_prefix + raw_prefix_len) - + 1; - if (raw_prefix_len == 16) - cpp_error_with_line (pfile, CPP_DL_ERROR, token->src_loc, col, -"raw string delimiter longer than 16 characters"); - else - cpp_error_with_line (pfile, CPP_DL_ERROR, token->src_loc, col, -"invalid character '%c' in raw string delimiter", -(int) raw_prefix[raw_prefix_len]); - pfile->buffer->cur = raw_prefix - 1; - create_literal (pfile, token, base, raw_prefix - 1 - base, CPP_OTHER); - return; -} - - cur = raw_prefix + raw_prefix_len + 1; - for (;;) -{ #define BUF_APPEND(STR,LEN)
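For readers following the thread, the pre-scan removed by the hunk above enforced two rules on a raw string delimiter: it may be at most 16 characters long, and it may contain only basic source characters other than space, parentheses, backslash and the listed control characters. The following is a minimal standalone sketch of those two rules only, not the libcpp code and not the phase-based replacement in the patch. The helper name `raw_delim_length` and its return codes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RAW_DELIM_MAX 16

/* Hypothetical helper, not part of libcpp.  S points just past the
   opening R".  Returns the delimiter length if a valid delimiter
   followed by '(' is found, -1 if an invalid character is seen first,
   and -2 if the delimiter is longer than 16 characters.  */
static int
raw_delim_length (const char *s)
{
  /* The characters accepted by the removed switch statement.  */
  static const char valid[] =
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "_{}#[]<>%:;.?*+-/^&|~!=,\"'";
  size_t len = 0;

  assert (s != NULL);

  /* The explicit '\0' test is needed because strchr also matches the
     terminator of VALID.  */
  while (len < RAW_DELIM_MAX
         && s[len] != '\0'
         && strchr (valid, s[len]) != NULL)
    len++;

  if (s[len] == '(')
    return (int) len;
  return len == RAW_DELIM_MAX ? -2 : -1;
}
```

Under these assumptions, `raw_delim_length ("del(x)del\"")` yields 3, and a 17-character delimiter is rejected with the "too long" code, matching the two diagnostics in the removed code.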
Re: [AArch64] Rewrite vabs_s<8,16,32,64> AdvSIMD intrinsics to fold to tree.
On Fri, Jul 19, 2013 at 08:09:05PM +0100, James Greenhalgh wrote: > + /* Forecfully avoid optimization. */ \ A typo. > + asm volatile ("" : : : "memory"); \ > + test_vabs_##size (res1, pool1);\ > + for (i = 0; i < lanes_64; i++) \ > +if (res1[i] != expected1[i]) \ > + abort (); \ > + \ > + /* Forecfully avoid optimization. */ \ Same. Marek
Re: [ubsan] Add libcall arguments
On Fri, Jul 19, 2013 at 05:20:39PM +0200, Jakub Jelinek wrote: > > >I have yet to handle > > >freeing the hash table, but I think I'll need the GTY machinery for > > >this (ubsan is not a pass, so I can't just call it at the end of the > > >pas). Or maybe just create a destructor and use append_to_statement_list. > > > > That won't work; append_to_statement_list is for things that happen > > at runtime, but freeing the hash table is something that needs to > > happen in the compiler. > > Also, I wonder if we need to free the hash table ever, the only meaningful > place would be when we are done with compilation, if all uses of this (even > in the future) are from the FEs, perhaps freeing the hash table somewhere > after parsing of the whole CU finished would be fine, but if we need it even > from the middle-end, that might be too early. I'm afraid we'll need it even from the middle-end, at least when doing the signed overflow sanitization. > > >+/* This routine returns a magic number for TYPE. > > >+ ??? This is probably too ugly. Tweak it. */ > > >+ > > >+static unsigned short > > >+get_tinfo_for_type (tree type) > > > > Why map from size to some magic number rather than use the size > > directly? Also, "tinfo" sounds to me like something to do with C++ > > type_info. > > Yeah, get_ubsan_type_info_for_type would be better, plus > The docs say that it should be log2 of the bit width, so I'd expect that > you return > (exact_log2 (TYPE_PRECISION (type)) << 1) | !TYPE_UNSIGNED (type) > (perhaps with error if exact_log2 returns -1) and thus handle all possible > type bitsizes. Or if you prefer to use GET_MODE_BITSIZE, that. Ah, that's much better. I did it like that; also included the checking of return value of exact_log2. And yeah, the name wasn't very nice, so I've renamed it. > Have you checked what clang does if you use typedefs? Does it use > the names appearing in the typedefs rather than their underlying types, > or is it TYPE_MAIN_VARIANT (type) actually? 
Good point. E.g. for typedef const unsigned int foo_t; clang says 'unsigned int'. But we didn't do it; we'd output the foo_t type. I've adjusted that, so now we print the underlying type as clang does. > Also: > > ASM_FORMAT_PRIVATE_NAME (tmp_name, ".Lubsan_type", type_var_id_num++); > > I don't think the . at the beginning is going to work everywhere > (on targets that don't support dots in labels), > I think you want ASM_GENERATE_INTERNAL_LABEL instead and just use > "Lubsan_type" as the second argument (note, this macro needs a buffer, > doesn't alloca it). Okay, fixed. Thanks, Marek
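As an illustration of the encoding Jakub suggests, `(exact_log2 (TYPE_PRECISION (type)) << 1) | !TYPE_UNSIGNED (type)`, here is a small standalone sketch. It is not the code from the patch: `exact_log2_sketch` is a stand-in for GCC's `exact_log2`, and `type_info_sketch` takes a plain bit width and signedness flag instead of a `tree`.

```c
#include <assert.h>

/* Stand-in for GCC's exact_log2: returns log2 (X) if X is a power of
   two, otherwise -1.  */
static int
exact_log2_sketch (unsigned int x)
{
  int n = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Hypothetical model of the type info value: log2 of the bit width in
   the upper bits, signedness in bit 0.  Returns -1 when the width is
   not a power of two, which is the error case mentioned in the
   review.  */
static int
type_info_sketch (unsigned int precision, int is_unsigned)
{
  int log;

  assert (is_unsigned == 0 || is_unsigned == 1);
  log = exact_log2_sketch (precision);
  if (log < 0)
    return -1;
  return (log << 1) | !is_unsigned;
}
```

With this model a signed 32-bit type encodes as `(5 << 1) | 1`, i.e. 11, and an unsigned 8-bit type as `(3 << 1) | 0`, i.e. 6.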
Re: [ubsan] Add libcall arguments
On Fri, Jul 19, 2013 at 08:50:42PM +0200, Jakub Jelinek wrote: > On Fri, Jul 19, 2013 at 08:45:30PM +0200, Marek Polacek wrote: > > > >+uptr_type (void) > > > >+{ > > > >+ return build_nonstandard_integer_type (POINTER_SIZE, 1); > > > > > > Why not use uintptr_type_node? > > > > I suppose I could. I just followed suit what asan.c does. I didn't > > address this in this patch, but I can, if you want to. > > uintptr_type_node is a C/C++/ObjC/ObjC++ FE tree. So, if you use it just > in c-family/c-ubsan.c, that is just fine, but you can't use it in ubsan.c. In that case I prefer to keep uptr_type around. Even though I like uintptr_type_node more. > > @@ -67,8 +68,8 @@ inline bool > > ubsan_typedesc_hasher::equal (const ubsan_typedesc *d1, > > const ubsan_typedesc *d2) > > { > > - /* ??? Here, the types should have identical __typekind, > > - _typeinfo and __typename. Is this enough? */ > > + /* Here, the types should have identical __typekind, > > + _typeinfo and __typename. */ > >return d1->type == d2->type; > > } > > Only one underscore for _typeinfo ? I wonder where the _ disappeared. Will fix. Marek
[AArch64] Rewrite vabs_s<8,16,32,64> AdvSIMD intrinsics to fold to tree.
Hi, This patch uses aarch64_fold_builtin to fold all remaining variants of the vabs intrinsics to tree. Testcase added, full testsuite run for aarch64-none-elf with no issues. OK? Thanks, James --- gcc/ 2013-07-19 James Greenhalgh * config/aarch64/aarch64-builtins.c (aarch64_fold_builtin): Fold abs in all modes. * config/aarch64/aarch64-simd-builtins.def (abs): Enable for all modes. * config/aarch64/arm_neon.h (vabs_s<8,16,32,64): Rewrite using builtins. (vabs_f64): Add missing intrinsic. gcc/testsuite/ 2013-07-19 James Greenhalgh * gcc.target/aarch64/vabs_intrinsic_1.c: New file. diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index f49f06b..6816b9c 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -1325,7 +1325,7 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args, switch (fcode) { - BUILTIN_VDQF (UNOP, abs, 2) + BUILTIN_VALLDI (UNOP, abs, 2) return fold_build1 (ABS_EXPR, type, args[0]); break; BUILTIN_VALLDI (BINOP, cmge, 0) diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index af2dd6e..55dead6 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -347,7 +347,7 @@ BUILTIN_VDQF (UNOP, frecpe, 0) BUILTIN_VDQF (BINOP, frecps, 0) - BUILTIN_VDQF (UNOP, abs, 2) + BUILTIN_VALLDI (UNOP, abs, 2) VAR1 (UNOP, vec_unpacks_hi_, 10, v4sf) VAR1 (BINOP, float_truncate_hi_, 0, v4sf) diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index 122fd7d..99cf123 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -4468,83 +4468,6 @@ vabds_f32 (float32_t a, float32_t b) return result; } -__extension__ static __inline int8x8_t __attribute__ ((__always_inline__)) -vabs_s8 (int8x8_t a) -{ - int8x8_t result; - __asm__ ("abs %0.8b,%1.8b" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - 
-__extension__ static __inline int16x4_t __attribute__ ((__always_inline__)) -vabs_s16 (int16x4_t a) -{ - int16x4_t result; - __asm__ ("abs %0.4h,%1.4h" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int32x2_t __attribute__ ((__always_inline__)) -vabs_s32 (int32x2_t a) -{ - int32x2_t result; - __asm__ ("abs %0.2s,%1.2s" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int8x16_t __attribute__ ((__always_inline__)) -vabsq_s8 (int8x16_t a) -{ - int8x16_t result; - __asm__ ("abs %0.16b,%1.16b" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) -vabsq_s16 (int16x8_t a) -{ - int16x8_t result; - __asm__ ("abs %0.8h,%1.8h" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) -vabsq_s32 (int32x4_t a) -{ - int32x4_t result; - __asm__ ("abs %0.4s,%1.4s" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) -vabsq_s64 (int64x2_t a) -{ - int64x2_t result; - __asm__ ("abs %0.2d,%1.2d" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - __extension__ static __inline int16_t __attribute__ ((__always_inline__)) vaddlv_s8 (int8x8_t a) { @@ -17395,6 +17318,30 @@ vabs_f32 (float32x2_t __a) return __builtin_aarch64_absv2sf (__a); } +__extension__ static __inline float64x1_t __attribute__ ((__always_inline__)) +vabs_f64 (float64x1_t __a) +{ + return __builtin_fabs (__a); +} + +__extension__ static __inline int8x8_t __attribute__ ((__always_inline__)) +vabs_s8 (int8x8_t __a) +{ + return __builtin_aarch64_absv8qi (__a); +} + +__extension__ static __inline int16x4_t __attribute__ ((__always_inline__)) +vabs_s16 (int16x4_t __a) +{ + return __builtin_aarch64_absv4hi 
(__a); +} + +__extension__ static __inline int32x2_t __attribute__ ((__always_inline__)) +vabs_s32 (int32x2_t __a) +{ + return __builtin_aarch64_absv2si (__a); +} + __extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) vabs_s64 (int64x1_t __a) { @@ -17413,6 +17360,30 @@ vabsq_f64 (float64x2_t __a) return __builtin_aarch64_absv2df (__a); } +__extension__ static __inline int8x16_t __attribute__ ((__always_inline__)) +vabsq_s8 (int8x16_t __a) +{ + return __builtin_aarch64_absv16qi (__a); +} + +__extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) +vabsq_s16 (int16x8
Re: [Patch x86/darwin] fix PR51784 'PIC register not correctly preserved...'
On Fri, Jul 19, 2013 at 3:47 PM, Iain Sandoe wrote: > Hi Uros, > > thanks for your reviews, > > On 18 Jul 2013, at 12:39, Uros Bizjak wrote: > >> On Thu, Jul 18, 2013 at 12:12 PM, Iain Sandoe wrote: >>> >>> So, I think we have to use the define_insn_and_split, or am I still missing >>> something? >> >> Just a wild guess, do you also need "&& reload_completed" in the split >> condition? > > good catch, thanks - this got cut erroneously from the last variant of the > patch. > > Fixed & re-tested on x86_64-darwin12 / x86_64-linux (both at m32 and m64) > showing the expected progressions on darwin (and correct behaviour on linux > for the shlib example). > gcc/ > > PR target/51784 > * config/i386/i386.c (output_set_got) [TARGET_MACHO]: Adjust to emit > a second label for > nonlocal goto receivers. Don't output pic base labels unless we're > producing PIC; mark > that action unreachable(). > (ix86_save_reg): If the function contains a nonlocal label, save the > PIC base reg. > * config/darwin-protos.h (machopic_should_output_picbase_label): New. > * gcc/config/darwin.c (emitted_pic_label_num): New GTY. > (update_pic_label_number_if_needed): New. > (machopic_output_function_base_name): Adjust for nonlocal receiver > case. > (machopic_should_output_picbase_label): New. > * config/i386/i386.md (enum unspecv): UNSPECV_NLGR: New. > (nonlocal_goto_receiver): New insn and split. > OK for trunk? Assuming that Jakub is OK with the patch, it is OK for trunk. > Ok for open branches? (this is a wrong-code bug) OK from the maintainer POV, but also needs release manager approval. I'd suggest a small improvement: +++ b/gcc/config/i386/i386.c @@ -8827,10 +8827,8 @@ output_set_got (rtx dest, rtx label ATTRIBUTE_UNUSED) output_asm_insn ("mov%z0\t{%2, %0|%0, %2}", xops); #if TARGET_MACHO - /* Output the Mach-O "canonical" label name ("Lxx$pb") here too. This - is what will be referenced by the Mach-O PIC subsystem. 
*/ - if (!label) - ASM_OUTPUT_LABEL (asm_out_file, MACHOPIC_FUNCTION_BASE_NAME); + /* We don't need a pic base, we're not producing pic. */ + gcc_unreachable (); #endif Put this part just after the opening curly brace. There is no need to calculate xops and output asm insn for TARGET_MACHO. Thanks, Uros.
Re: [ubsan] Add libcall arguments
On Thu, Jul 18, 2013 at 03:47:28PM -0400, Jason Merrill wrote: > On 07/05/2013 10:04 AM, Marek Polacek wrote: > >+/* This type represents an entry in the hash table. */ > > Please describe the hash table more up here. What are you tracking? Ok, I've added two more comments. > >+ hashval_t h = iterative_hash_object (data->type, 0); > >+ h = iterative_hash_object (data->decl, h); > > If you hash the decl as well as the type, the find_slot in > ubsan_type_descriptor will almost never find an existing entry. Oops. Fixed. > >+uptr_type (void) > >+{ > >+ return build_nonstandard_integer_type (POINTER_SIZE, 1); > > Why not use uintptr_type_node? I suppose I could. I just followed suit what asan.c does. I didn't address this in this patch, but I can, if you want to. > >I have yet to handle > >freeing the hash table, but I think I'll need the GTY machinery for > >this (ubsan is not a pass, so I can't just call it at the end of the > >pas). Or maybe just create a destructor and use append_to_statement_list. > > That won't work; append_to_statement_list is for things that happen > at runtime, but freeing the hash table is something that needs to > happen in the compiler. Yeah, indeed. > >+/* This routine returns a magic number for TYPE. > >+ ??? This is probably too ugly. Tweak it. */ > >+ > >+static unsigned short > >+get_tinfo_for_type (tree type) > > Why map from size to some magic number rather than use the size > directly? Also, "tinfo" sounds to me like something to do with C++ > type_info. It's what the ubsan library wants; however, I rewrote & renamed that functions as per Jakub's suggestion. Thanks for the review. Does it look ok? Ran ubsan testsuite, both -m64/-m32. 2013-07-19 Marek Polacek * ubsan.c (struct ubsan_typedesc): Add comments. (ubsan_typedesc_hasher::hash): Don't hash the VAR_DECL element. (ubsan_typedesc_hasher::equal): Adjust comment. (ubsan_typedesc_get_alloc_pool): Remove comment. (empty_ubsan_typedesc_hash_table): Remove function. 
(ubsan_source_location_type): Remove bogus comment. (get_tinfo_for_type): Remove function. (get_ubsan_type_info_for_type): New function. (ubsan_type_descriptor): Use ASM_GENERATE_INTERNAL_LABEL instead of ASM_FORMAT_PRIVATE_NAME. Use TYPE_MAIN_VARIANT of the type. (ubsan_create_data): Likewise. --- gcc/ubsan.c.mp 2013-07-19 18:42:42.896249415 +0200 +++ gcc/ubsan.c 2013-07-19 20:34:11.459792189 +0200 @@ -34,7 +34,10 @@ along with GCC; see the file COPYING3. /* This type represents an entry in the hash table. */ struct ubsan_typedesc { + /* This represents the type of a variable. */ tree type; + + /* This is the VAR_DECL of the type. */ tree decl; }; @@ -56,9 +59,7 @@ struct ubsan_typedesc_hasher inline hashval_t ubsan_typedesc_hasher::hash (const ubsan_typedesc *data) { - hashval_t h = iterative_hash_object (data->type, 0); - h = iterative_hash_object (data->decl, h); - return h; + return iterative_hash_object (data->type, 0); } /* Compare two data types. */ @@ -67,8 +68,8 @@ inline bool ubsan_typedesc_hasher::equal (const ubsan_typedesc *d1, const ubsan_typedesc *d2) { - /* ??? Here, the types should have identical __typekind, - _typeinfo and __typename. Is this enough? */ + /* Here, the types should have identical __typekind, + _typeinfo and __typename. */ return d1->type == d2->type; } @@ -93,7 +94,6 @@ ubsan_typedesc_get_alloc_pool () ubsan_typedesc_alloc_pool = create_alloc_pool ("ubsan_typedesc", sizeof (ubsan_typedesc), 10); - // XXX But where do we free this? We'll need GTY machinery. return ubsan_typedesc_alloc_pool; } @@ -123,18 +123,6 @@ ubsan_typedesc_new (tree type, tree decl return desc; } -/* Clear all entries from the type descriptor hash table. */ - -#if 0 -static void -empty_ubsan_typedesc_hash_table () -{ - // XXX But when do we call this? - if (ubsan_typedesc_ht.is_created ()) -ubsan_typedesc_ht.empty (); -} -#endif - /* Build the ubsan uptr type. 
*/ static tree @@ -245,7 +233,6 @@ ubsan_source_location_type (void) { fields[i] = build_decl (UNKNOWN_LOCATION, FIELD_DECL, get_identifier (field_names[i]), - //(i == 0) ? const_string_type_node (i == 0) ? build_pointer_type (const_char_type) : unsigned_type_node); DECL_CONTEXT (fields[i]) = ret; @@ -291,34 +278,16 @@ ubsan_source_location (location_t loc) return ctor; } -/* This routine returns a magic number for TYPE. - ??? This is probably too ugly. Tweak it. */ +/* This routine returns a magic number for TYPE. */ static unsigned short -get_tinfo_for_type (tree type) +get_ubsan_type_info_for_type (tree type) {
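Jason's point about hashing the decl can be shown with a toy model: a hash function must agree with the equality function, otherwise two entries that compare equal land in different buckets and a lookup misses. The sketch below uses integer ids in place of `tree` nodes and a simple multiplicative mix in place of `iterative_hash_object`; all names are hypothetical and none of this is the ubsan.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for ubsan_typedesc: integer ids instead of trees.  */
struct typedesc_sketch
{
  int type_id;
  int decl_id;
};

/* Simple mixing step, standing in for iterative_hash_object.  */
static unsigned int
mix_sketch (unsigned int h, unsigned int v)
{
  return h * 31u + v;
}

/* Old behaviour: the decl takes part in the hash.  */
static unsigned int
hash_type_and_decl (const struct typedesc_sketch *d)
{
  assert (d != NULL);
  return mix_sketch (mix_sketch (0, (unsigned int) d->type_id),
                     (unsigned int) d->decl_id);
}

/* New behaviour: only the type is hashed.  */
static unsigned int
hash_type_only (const struct typedesc_sketch *d)
{
  assert (d != NULL);
  return mix_sketch (0, (unsigned int) d->type_id);
}

/* Equality looks at the type only, as in the patch.  */
static int
equal_sketch (const struct typedesc_sketch *a,
              const struct typedesc_sketch *b)
{
  return a->type_id == b->type_id;
}
```

Two descriptors for the same type but with different decls compare equal, yet hash to different values with `hash_type_and_decl`; with `hash_type_only` they hash identically, so a lookup can find the existing entry.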
Re: [ubsan] Add libcall arguments
On Fri, Jul 19, 2013 at 08:45:30PM +0200, Marek Polacek wrote: > > >+uptr_type (void) > > >+{ > > >+ return build_nonstandard_integer_type (POINTER_SIZE, 1); > > > > Why not use uintptr_type_node? > > I suppose I could. I just followed suit what asan.c does. I didn't > address this in this patch, but I can, if you want to. uintptr_type_node is a C/C++/ObjC/ObjC++ FE tree. So, if you use it just in c-family/c-ubsan.c, that is just fine, but you can't use it in ubsan.c. > @@ -67,8 +68,8 @@ inline bool > ubsan_typedesc_hasher::equal (const ubsan_typedesc *d1, > const ubsan_typedesc *d2) > { > - /* ??? Here, the types should have identical __typekind, > - _typeinfo and __typename. Is this enough? */ > + /* Here, the types should have identical __typekind, > + _typeinfo and __typename. */ >return d1->type == d2->type; > } Only one underscore for _typeinfo ? Jakub
MAINTAINERS (Write After Approval): Add myself
Hi all, Add myself as Write After Approval. Yvan 2013-07-19 Yvan Roux * MAINTAINERS (Write After Approval): Add myself. Attachment: r201069.diff
Re: [Patch x86/darwin] fix PR51784 'PIC register not correctly preserved...'
On Fri, Jul 19, 2013 at 04:56:47PM +0200, Uros Bizjak wrote: > > OK for trunk? > > Assuming that Jakub is OK with the patch, it is OK for trunk. With the line wrapping fix and Uros' suggested improvement this is ok for both trunk and branches. Jakub
Re: msp430 port
> GPL + exception seems like the way to go, except in those cases where > the code is coming from a 3rd party. Already pointed out and fixed, but I'll double check that they're all GPL+E. > I'm assuming you documented all the MSP430 options. I didn't check them > closely. I'm also assuming the libgcc functions are reasonably correct. I did. > For popm, why not define a new output modifier instead of using %I, per > the comments. That seems cleaner to me. Done. > movqihi seems wrong. You really should determine why the standard > methods for handling automatic elimination of extensions when loading > from memory isn't working. I believe most RISC port in GCC uses those > mechanisms successfully. I went through every reference to LOAD_EXTEND_OP in the gcc sources, added printfs, and for each that got hit, either it was looking for a SUBREG (don't have one at that point) or was looking at the load and zero_extend as separate insns (had nothing to do) or was related to reload (nothing got reloaded). I checked sparc, and they basically did what I did - they included a pattern for load+extend (zero_extendqisi2_insn), despite setting LOAD_EXTEND_OP to ZERO_EXTEND and having WORD_REGISTER_OPERATIONS set. Same for mn10300. So... I'm thinking, since every QImode operation has an implicit zero_extend, for each of those I need a variant that includes a zero_extend operation as well?
[google gcc-4_8] Merge r201068 (fix for PR57878) from gcc-4_8-branch
Greetings, I committed merge of r201068 (backport of the fix for PR57878 from trunk) as r201071 to google/gcc-4_8 branch. Thanks, -- Paul Pluzhnikov
[Patch, PR 57809] Wasted work in omega_eliminate_red()
Hi,

The problem appears in revision 201034 in version 4.9. I attached a small patch that fixes it. I also reported this problem at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57809

I bootstrapped and ran the regression tests for this patch on x86_64-linux and all tests pass.

In function "omega_eliminate_red()" in gcc/omega.c, the loop on line 2592 should break immediately after "red_found" is set to "1". The iterations after "red_found" is set to "1" do not perform any useful work; at best they just set "red_found" to "1" again.

There are three more similar problems in the same file gcc/omega.c:
1. omega_problem_has_red_equations(): the loop on line 4854 should break immediately after "result" is set to "true".
2. omega_problem_has_red_equations(): the loop on line 4907 should break immediately after "result" is set to "true".
3. omega_query_variable(): the loop on line 5252 should break immediately after "coupled" is set to "true".

Index: gcc/omega.c === --- gcc/omega.c (revision 201034) +++ gcc/omega.c (working copy) @@ -2591,7 +2591,10 @@ for (red_found = 0, e = pb->num_geqs - 1; e >= 0; e--) if (pb->geqs[e].color == omega_red) - red_found = 1; + { + red_found = 1; + break; + } if (!red_found) { Index: gcc/omega.c === --- gcc/omega.c (revision 201034) +++ gcc/omega.c (working copy) @@ -4853,7 +4853,10 @@ for (e = pb->num_geqs - 1; e >= 0; e--) if (pb->geqs[e].color == omega_red) - result = true; + { + result = true; + break; + } if (!result) return false; Index: gcc/omega.c === --- gcc/omega.c (revision 201034) +++ gcc/omega.c (working copy) @@ -4906,7 +4906,10 @@ for (e = pb->num_geqs - 1; e >= 0; e--) if (pb->geqs[e].color == omega_red) - result = true; + { + result = true; + break; + } if (dump_file && (dump_flags & TDF_DETAILS)) { Index: gcc/omega.c === --- gcc/omega.c (revision 201034) +++ gcc/omega.c (working copy) @@ -5251,7 +5251,10 @@ for (e = pb->num_subs - 1; e >= 0; e--) if (pb->subs[e].coef[i] != 0) - coupled = true; + { + coupled = true; + break; + } for (e = pb->num_eqs - 1; e >= 0; e--) if (pb->eqs[e].coef[i] != 0)

-Chang
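The effect of the proposed early exit can be seen in a small standalone sketch. It is not the omega.c code: the enum, the function `has_red` and the `steps` counter are hypothetical and exist only to make the number of iterations observable.

```c
#include <assert.h>
#include <stddef.h>

enum color_sketch { BLACK_SKETCH, RED_SKETCH };

/* Scans COLORS from the last element down, as the loops in the patch
   do, and stops at the first red entry.  The number of loop
   iterations is stored in *STEPS.  */
static int
has_red (const enum color_sketch *colors, int n, int *steps)
{
  int found = 0;

  assert (steps != NULL);
  *steps = 0;
  for (int e = n - 1; e >= 0; e--)
    {
      (*steps)++;
      if (colors[e] == RED_SKETCH)
        {
          found = 1;
          break;  /* Later iterations could only set FOUND again.  */
        }
    }
  return found;
}
```

For an array of four entries whose third entry is red, the scan stops after two iterations instead of four; the result is the same with or without the `break`.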
[Patch, PR57803] Wasted work in gfc_build_dummy_array_decl()
Hi,

The problem appears in revision 201034 in version 4.9. I attached a small patch that fixes it. I also reported this problem at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57803

I bootstrapped and ran the regression tests for this patch on x86_64-linux and all tests pass.

In function "gfc_build_dummy_array_decl()" in gcc/fortran/trans-decl.c, the loop on line 978 should break immediately after "packed" is set to "PACKED_PARTIAL". The iterations after "packed" is set to "PACKED_PARTIAL" do not perform any useful work; at best they just set "packed" to "PACKED_PARTIAL" again.

Index: gcc/fortran/trans-decl.c === --- gcc/fortran/trans-decl.c (revision 201034) +++ gcc/fortran/trans-decl.c (working copy) @@ -975,7 +975,10 @@ && as->lower[n] && as->upper[n]->expr_type == EXPR_CONSTANT && as->lower[n]->expr_type == EXPR_CONSTANT)) - packed = PACKED_PARTIAL; + { + packed = PACKED_PARTIAL; + break; + } } } else

-Chang
[Patch, PR 57806] Wasted work in propagate_nothrow()
Hi,

The problem appears in revision 201034 in version 4.9. I attached a small patch that fixes it. I also reported this problem at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57806

I bootstrapped and ran the regression tests for this patch on x86_64-linux and all tests pass.

In function "propagate_nothrow()" in gcc/ipa-pure-const.c, the loop on line 1432 should break immediately after "can_throw" is set to "true". The iterations after "can_throw" is set to "true" do not perform any useful work; at best they just set "can_throw" to "true" again.

Index: gcc/ipa-pure-const.c === --- gcc/ipa-pure-const.c (revision 201034) +++ gcc/ipa-pure-const.c (working copy) @@ -1431,7 +1431,10 @@ } for (ie = node->indirect_calls; ie; ie = ie->next_callee) if (ie->can_throw_external) - can_throw = true; + { + can_throw = true; + break; + } w_info = (struct ipa_dfs_info *) w->symbol.aux; w = w_info->next_cycle; }

-Chang
Re: msp430 port
> Every pattern that is using (subreg:SI (thing:PSI)) needs to be > explained on this list and given an explicit clearance. It really looks > like you're just papering over problems elsewhere. Most of them are just optimizations, but the problem with reload is that an SImode register *can't* hold a PSImode value. Registers are 20 bits; a PSImode value is 20 bits. SImode values require two registers (16 bits each). Converting between SImode and PSImode is both nontrivial and very common (pointer math). What I really need is an int20_t type in the core of gcc, so I can set Pmode to *that*, to avoid the SImode stuff completely. But that's a core change, not a target change. > As the comment in addhi_cy mentions, that is truly dangerous and > wrong. You can't let that stay in the port without proving GCC > won't do a code motion that causes problems. Just because sub, cmp, > shift, etc haven't shown up in your tests doesn't mean they can't > happen, they just didn't show up in your tests. I'll try again, but last time I put all the carry logic in all the insns, and code size grew quite a bit, and I couldn't figure out why. Do I need two patterns for each insn? One that clobbers the carry, and one that sets it? > I don't see any atomics? Not supported on this target? None. Limited opcode set, no multi-core, just a simple MCU...
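To make the size mismatch concrete, the following sketch models only the bit layout described above: a 20-bit PSImode value fits one register, while an SImode value occupies a pair of 16-bit halves. It says nothing about the instructions the MSP430 port actually emits; the struct, the function names and the mask macro are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PSI_MASK_SKETCH 0xFFFFFu  /* 20 bits.  */

/* Model of an SImode value held in two 16-bit registers.  */
struct si_pair_sketch
{
  uint16_t lo;
  uint16_t hi;
};

/* Widen a 20-bit value to a register pair: the low 16 bits go into
   LO, the remaining 4 bits into HI.  */
static struct si_pair_sketch
psi_to_si_sketch (uint32_t psi)
{
  struct si_pair_sketch si;

  assert ((psi & ~PSI_MASK_SKETCH) == 0);
  si.lo = (uint16_t) (psi & 0xFFFFu);
  si.hi = (uint16_t) (psi >> 16);
  return si;
}

/* Narrow a register pair back to 20 bits, discarding the upper
   12 bits of HI.  */
static uint32_t
si_to_psi_sketch (struct si_pair_sketch si)
{
  return (((uint32_t) si.hi << 16) | si.lo) & PSI_MASK_SKETCH;
}
```

Even in this simplified model each conversion involves a split or a merge plus a mask, which is the kind of work that pointer arithmetic would trigger repeatedly if Pmode values had to pass through SImode.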
Re: [AArch64] Rewrite vabs_s<8,16,32,64> AdvSIMD intrinsics to fold to tree.
On Fri, Jul 19, 2013 at 08:12:54PM +0100, Marek Polacek wrote: > On Fri, Jul 19, 2013 at 08:09:05PM +0100, James Greenhalgh wrote: > > + /* Forecfully avoid optimization. */\ > > A typo. Good spot, thanks for taking a look at this. Hopefully the attached has "forcefully" correct in both places and is otherwise typo free. OK? Thanks, James diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index f49f06b..6816b9c 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -1325,7 +1325,7 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args, switch (fcode) { - BUILTIN_VDQF (UNOP, abs, 2) + BUILTIN_VALLDI (UNOP, abs, 2) return fold_build1 (ABS_EXPR, type, args[0]); break; BUILTIN_VALLDI (BINOP, cmge, 0) diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index af2dd6e..55dead6 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -347,7 +347,7 @@ BUILTIN_VDQF (UNOP, frecpe, 0) BUILTIN_VDQF (BINOP, frecps, 0) - BUILTIN_VDQF (UNOP, abs, 2) + BUILTIN_VALLDI (UNOP, abs, 2) VAR1 (UNOP, vec_unpacks_hi_, 10, v4sf) VAR1 (BINOP, float_truncate_hi_, 0, v4sf) diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index 122fd7d..99cf123 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -4468,83 +4468,6 @@ vabds_f32 (float32_t a, float32_t b) return result; } -__extension__ static __inline int8x8_t __attribute__ ((__always_inline__)) -vabs_s8 (int8x8_t a) -{ - int8x8_t result; - __asm__ ("abs %0.8b,%1.8b" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int16x4_t __attribute__ ((__always_inline__)) -vabs_s16 (int16x4_t a) -{ - int16x4_t result; - __asm__ ("abs %0.4h,%1.4h" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ 
static __inline int32x2_t __attribute__ ((__always_inline__)) -vabs_s32 (int32x2_t a) -{ - int32x2_t result; - __asm__ ("abs %0.2s,%1.2s" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int8x16_t __attribute__ ((__always_inline__)) -vabsq_s8 (int8x16_t a) -{ - int8x16_t result; - __asm__ ("abs %0.16b,%1.16b" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) -vabsq_s16 (int16x8_t a) -{ - int16x8_t result; - __asm__ ("abs %0.8h,%1.8h" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) -vabsq_s32 (int32x4_t a) -{ - int32x4_t result; - __asm__ ("abs %0.4s,%1.4s" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) -vabsq_s64 (int64x2_t a) -{ - int64x2_t result; - __asm__ ("abs %0.2d,%1.2d" - : "=w"(result) - : "w"(a) - : /* No clobbers */); - return result; -} - __extension__ static __inline int16_t __attribute__ ((__always_inline__)) vaddlv_s8 (int8x8_t a) { @@ -17395,6 +17318,30 @@ vabs_f32 (float32x2_t __a) return __builtin_aarch64_absv2sf (__a); } +__extension__ static __inline float64x1_t __attribute__ ((__always_inline__)) +vabs_f64 (float64x1_t __a) +{ + return __builtin_fabs (__a); +} + +__extension__ static __inline int8x8_t __attribute__ ((__always_inline__)) +vabs_s8 (int8x8_t __a) +{ + return __builtin_aarch64_absv8qi (__a); +} + +__extension__ static __inline int16x4_t __attribute__ ((__always_inline__)) +vabs_s16 (int16x4_t __a) +{ + return __builtin_aarch64_absv4hi (__a); +} + +__extension__ static __inline int32x2_t __attribute__ ((__always_inline__)) +vabs_s32 (int32x2_t __a) +{ + return __builtin_aarch64_absv2si (__a); +} + __extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) 
vabs_s64 (int64x1_t __a) { @@ -17413,6 +17360,30 @@ vabsq_f64 (float64x2_t __a) return __builtin_aarch64_absv2df (__a); } +__extension__ static __inline int8x16_t __attribute__ ((__always_inline__)) +vabsq_s8 (int8x16_t __a) +{ + return __builtin_aarch64_absv16qi (__a); +} + +__extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) +vabsq_s16 (int16x8_t __a) +{ + return __builtin_aarch64_absv8hi (__a); +} + +__extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) +vabsq_s32 (int32x4_t __a) +{ + return __builtin_aarch64_absv4si (__a); +} + +__extension__ static __inline int64x2_t __attribute__ ((__always_inline__))
Re: [PATCH] PR57878, Incorrect code: live register clobbered in split2
Thank you! Backported as r201068 in gcc-4_8-branch.

Thanks,
Wei.

On Thu, Jul 18, 2013 at 10:05 AM, Vladimir Makarov wrote:
> On 07/15/2013 02:26 PM, Wei Mi wrote:
>> Hi,
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57878
>>
>> The bug occurs because tfreq is given higher priority than bigger mode
>> in reload_pseudo_compare_func. When there are multiple reload pseudos
>> in the same insn, and the pseudo with bigger mode has lower thread
>> frequency than other reload pseudos, it is possible the bigger mode
>> pseudo cannot find available hardregs.
>>
>> The proposed fix is to switch the priority of bigger mode and tfreq in
>> reload_pseudo_compare_func. Besides I promoted lra_assert to
>> gcc_assert at the end of to make adding testcase easier since
>> lra_assert will not fire on 4.8.1.
>>
>> bootstrap and regression test are ok on x86_64-linux-gnu. Is it ok for
>> trunk and 4.8 branch?
>>
>> Thanks,
>> Wei.
>>
>> 2013-07-15  Wei Mi
>>
>>	PR rtl-optimization/57878
>>	* lra-assigns.c (reload_pseudo_compare_func): Switch the priority of
>>	bigger mode and tfreq.
>>
>> 2013-07-15  Wei Mi
>>
>>	PR rtl-optimization/57878
>>	* g++.dg/pr57518.C: New test.
> Overall, the PR analysis and the patch are ok. The only problem is that
> the patch can affect generated code performance, especially for 32-bit
> targets. I see regressions on 4 SPECInt2000 tests.
>
> Here is the patch I've committed into the trunk. It decreases the patch
> effect on performance.
>
> The patch was successfully bootstrapped and tested on x86/x86-64.
>
> Committed to the trunk as rev. 201036.
>
> Wei Mi, could you commit it to gcc4.8 branch.
>
> Thanks.
>
> 2013-07-18  Vladimir Makarov
>	    Wei Mi
>
>	PR rtl-optimization/57878
>	* lra-assigns.c (assign_by_spills): Move non_reload_pseudos to the
>	top.
>	(reload_pseudo_compare_func): Check nregs first for reload
>	pseudos.
>
> 2013-07-18  Wei Mi
>
>	PR rtl-optimization/57878
>	* g++.dg/pr57518.C: New test.
>
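As a rough illustration of the ordering discussed in this thread (this is not the actual lra-assigns.c code), a comparator that checks the number of hard registers a pseudo needs before it looks at the frequency could be sketched as follows. The struct, its fields and the function names are all made up for the example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reload pseudo; not a GCC data structure. */
struct pseudo_sketch
{
  int regno;   /* used only as a deterministic tie-break */
  int nregs;   /* number of hard registers the pseudo's mode needs */
  int freq;    /* usage frequency */
};

/* qsort comparator: pseudos needing more hard registers sort first, so
   they get a chance at a free register group before the narrower ones;
   frequency only decides between pseudos of equal width.  */
static int
pseudo_sketch_cmp (const void *pa, const void *pb)
{
  const struct pseudo_sketch *a = pa;
  const struct pseudo_sketch *b = pb;

  if (a->nregs != b->nregs)
    return (b->nregs > a->nregs) - (b->nregs < a->nregs);
  if (a->freq != b->freq)
    return (b->freq > a->freq) - (b->freq < a->freq);
  return (a->regno > b->regno) - (a->regno < b->regno);
}

/* Sort N pseudos in place using the ordering above.  */
static void
sort_pseudos_sketch (struct pseudo_sketch *v, size_t n)
{
  qsort (v, n, sizeof *v, pseudo_sketch_cmp);
}
```

With frequency checked first instead, a wide but rarely used pseudo would sort last and could find that the narrower pseudos have already fragmented the free hard registers, which is the failure mode the PR describes.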
RFC: Gimple combine/folding interface
I was creating a new gimple/folding interface and wanted some opinions
on the interface.

typedef double_int (*nonzerobits_t)(tree var);
typedef tree (*valueizer_t)(tree var);

class gimple_combine
{
public:
  gimple_combine(nonzerobits_t a, valueizer_t b)
    : nonzerobitsf(a), valueizerv(b), allow_full_reassiocation(false) {}
  gimple_combine()
    : nonzerobitsf(NULL), valueizerv(NULL), allow_full_reassiocation(false) {}
  gimple_combine(bool reas)
    : nonzerobitsf(NULL), valueizerv(NULL), allow_full_reassiocation(reas) {}
  tree build2 (location_t, enum tree_code, tree, tree, tree);
  tree build1 (location_t, enum tree_code, tree, tree);
  tree build3 (location_t, enum tree_code, tree, tree, tree, tree);
  tree combine (gimple);
private:
  nonzerobits_t nonzerobitsf;
  valueizer_t valueizerv;
  bool allow_full_reassiocation;
  tree binary (location_t, enum tree_code, tree, tree, tree);
  tree unary (location_t, enum tree_code, tree, tree);
  tree ternary (location_t, enum tree_code, tree, tree, tree, tree);
};

bool replace_rhs_after_ssa_combine (gimple_stmt_iterator *, tree);

This is what I have so far and wonder if there is anything else I should
include. This will be used to replace the folding code in fold-const.c
and gimple-fold.c.

Thanks,
Andrew Pinski
Re: Fix GCC bug causing bootstrap failure with vectorizer turned on
We all missed it when just staring at the code :)

David

On Fri, Jul 19, 2013 at 5:45 AM, Jeff Law wrote:
> On 07/18/2013 10:49 AM, Xinliang David Li wrote:
>>
>> The difference is that the relative order of DECL_UIDs does not change
>> whether debug info is on or not, but there is no such guarantee when
>> hashing is involved.
>
> Yea. I feel like an idiot for missing the hash vs comparison.
>
> Add a ChangeLog and this is good to go.
>
> Jeff
>
[gomp4] Fix up C #pragma omp declare simd parsing
Hi!

This is the C FE counterpart of the
http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00520.html
changes. The OpenMP 4.0 standard in the end requires that the
#pragma omp declare simd clauses are parsed at the scope of the function
parameters, but lexically they come before the function return type, so
we need to remember them and parse them only later. The C++ FE had
support for saving tokens and parsing them later, but the C FE didn't.

So, while most of the patch is just OpenMP specific stuff, the
struct c_parser, c_parser_consume_{token,pragma} and c_parse_file
changes affect all C users. I've bootstrapped/regtested just those
changes alone and didn't find measurable differences, either in
bootstrap time or on some larger C sources. I'll need this support also
for #pragma omp declare reduction parsing later on.

Additionally, while for function definitions the arguments are pushed
again into scope, for function prototypes this doesn't happen, so I had
to write utility functions in c-decl.c to do that temporarily and undo
that afterwards.

Joseph, any comments?

2013-07-20  Jakub Jelinek

	* c-typeck.c (c_finish_omp_declare_simd): Moved to...
	* c-parser.c (c_finish_omp_declare_simd): ... here. Add first
	c_parser * argument and change last argument to vec. Parse
	clauses here from the saved c_tokens.
	(struct c_parser): Change tokens to c_token *. Add tokens_buf
	field. Change tokens_avail type to unsigned int.
	(c_parser_consume_token): If parser->tokens isn't
	&parser->tokens_buf[0], increment parser->tokens.
	(c_parser_consume_pragma): Likewise.
	(c_parser_declaration_or_fndef): Change last argument to vec.
	Pass parser as first argument to c_finish_omp_declare_simd.
	For prototypes, fix code indentation and call
	temp_store_parm_decls and temp_pop_parm_decls around
	c_finish_omp_declare_simd.
	(c_parser_omp_variable_list): Remove declare_simd argument, call
	lookup_name unconditionally.
(c_parser_omp_var_list_parens, c_parser_omp_clause_reduction, c_parser_omp_clause_depend, c_parser_omp_clause_map, c_parser_omp_clause_uniform): Adjust c_parser_omp_variable_list callers. (c_parser_omp_clause_aligned, c_parser_omp_clause_linear): Likewise. Remove declare_simd argument. (c_parser_omp_all_clauses): Remove declare_simd argument. Adjust c_parser_omp_clause_aligned and c_parser_omp_clause_linear callers. (c_parser_omp_declare_simd): Don't parse declare simd clauses here, instead push the clause c_tokens starting with simd token up to CPP_PRAGMA_EOL for each #pragma omp declare simd into a vector. (c_parser_omp_declare): Don't consume simd token here. (c_parse_file): Initialize tparser.tokens and the_parser->tokens here. * c-decl.c (temp_store_parm_decls, temp_pop_parm_decls): New functions. * c-tree.h (temp_store_parm_decls, temp_pop_parm_decls): New prototypes. (c_finish_omp_declare_simd): Remove prototype. * gcc.dg/gomp/declare-simd-1.c (f16, f17, f18): New tests. --- gcc/c/c-typeck.c.jj 2013-07-14 19:43:18.0 +0200 +++ gcc/c/c-typeck.c2013-07-19 22:30:39.552293476 +0200 @@ -11587,84 +11587,6 @@ c_finish_omp_clauses (tree clauses) return clauses; } -/* Finalize #pragma omp declare simd clauses after FNDECL has been parsed, - and put that into "omp declare simd" attribute. 
*/ - -void -c_finish_omp_declare_simd (tree fndecl, tree parms, vec clauses) -{ - tree cl; - int i; - - if (clauses[0] == error_mark_node) -return; - if (fndecl == NULL_TREE || TREE_CODE (fndecl) != FUNCTION_DECL) -{ - error ("%<#pragma omp declare simd%> not immediately followed by " -"a function declaration or definition"); - clauses[0] = error_mark_node; - return; -} - if (clauses[0] == integer_zero_node) -{ - error_at (DECL_SOURCE_LOCATION (fndecl), - "%<#pragma omp declare simd%> not immediately followed by " - "a single function declaration or definition"); - clauses[0] = error_mark_node; - return; -} - - if (parms == NULL_TREE) -parms = DECL_ARGUMENTS (fndecl); - - FOR_EACH_VEC_ELT (clauses, i, cl) -{ - tree c, *pc, decl, name; - for (pc = &cl, c = cl; c; c = *pc) - { - bool remove = false; - switch (OMP_CLAUSE_CODE (c)) - { - case OMP_CLAUSE_UNIFORM: - case OMP_CLAUSE_LINEAR: - case OMP_CLAUSE_ALIGNED: - case OMP_CLAUSE_REDUCTION: - name = OMP_CLAUSE_DECL (c); - if (name == error_mark_node) - remove = true; - else - { - for (decl = parms; decl; decl = TREE_CHAIN (decl)) - if (DECL_NAME (decl) == name) - break; - if (decl == NULL_TREE) -
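
For readers who want a picture of the "save tokens now, parse them later" idea described at the top of this message, here is a minimal sketch. It is not the c-parser.c implementation: the token type, the two-entry lookahead buffer and every function name below are assumptions made for illustration; only the rule "if the read pointer is not at the start of the lookahead buffer, just advance it" is taken from the ChangeLog.

```c
#include <stddef.h>

/* Hypothetical token and parser types; the real c_token and c_parser
   carry much more state.  */
enum tok_kind_sketch { TOK_IDENT, TOK_PRAGMA_EOL };

struct token_sketch
{
  enum tok_kind_sketch kind;
  const char *text;
};

struct parser_sketch
{
  struct token_sketch *tokens;        /* current read position */
  struct token_sketch tokens_buf[2];  /* ordinary lookahead buffer */
  unsigned int tokens_avail;          /* tokens readable from TOKENS */
};

/* Consume one token.  When reading from a saved token array the pointer
   is simply advanced; when reading from the lookahead buffer the second
   entry is shifted down instead.  */
static void
consume_token_sketch (struct parser_sketch *p)
{
  if (p->tokens_avail == 0)
    return;
  if (p->tokens != &p->tokens_buf[0])
    p->tokens++;
  else if (p->tokens_avail == 2)
    p->tokens_buf[0] = p->tokens_buf[1];
  p->tokens_avail--;
}

/* Temporarily point the parser at N saved tokens, "parse" them (here:
   count identifiers up to the end-of-pragma marker), then restore the
   parser's previous state.  */
static unsigned int
replay_saved_tokens_sketch (struct parser_sketch *p,
                            struct token_sketch *saved, unsigned int n)
{
  struct token_sketch *old_tokens = p->tokens;
  unsigned int old_avail = p->tokens_avail;
  unsigned int idents = 0;

  p->tokens = saved;
  p->tokens_avail = n;
  while (p->tokens_avail > 0 && p->tokens->kind != TOK_PRAGMA_EOL)
    {
      if (p->tokens->kind == TOK_IDENT)
        idents++;
      consume_token_sketch (p);
    }

  p->tokens = old_tokens;
  p->tokens_avail = old_avail;
  return idents;
}
```

The point of the pointer test is that ordinary parsing is unaffected: as long as the parser reads from its own buffer it behaves as before, and only a replay of saved tokens takes the pointer-increment path.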
testsuite patches (1/14): Request dwarf-2 output where needed
Although gcc.dg/debug/dwarf2/dwarf2.exp generally requests dwarf-2
information, so that the tests will work with targets that have a
different default output format, that doesn't happen where a test
specifies its own target options. In that case, we have to specify
-gdwarf-2 in the individual test case.

Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-05-13  Joern Rennecke

	* gcc.dg/debug/dwarf2/global-used-types.c: Request dwarf-2 output.
	* gcc.dg/debug/dwarf2/inline2.c: Likewise.
	* gcc.dg/debug/dwarf2/inline3.c: Likewise.
	* gcc.dg/debug/dwarf2/pr37726.c: Likewise.
	* gcc.dg/debug/dwarf2/pr41445-1.c: Likewise.
	* gcc.dg/debug/dwarf2/pr41445-2.c: Likewise.
	* gcc.dg/debug/dwarf2/pr41445-3.c: Likewise.
	* gcc.dg/debug/dwarf2/pr41445-4.c: Likewise.
	* gcc.dg/debug/dwarf2/pr41445-5.c: Likewise.
	* gcc.dg/debug/dwarf2/pr41445-6.c: Likewise.
	* gcc.dg/debug/dwarf2/pr41543.c: Likewise.
	* gcc.dg/debug/dwarf2/pr41695.c: Likewise.
	* gcc.dg/debug/dwarf2/pr43237.c: Likewise.
	* gcc.dg/debug/dwarf2/pr47939-1.c: Likewise.
	* gcc.dg/debug/dwarf2/pr47939-2.c: Likewise.
	* gcc.dg/debug/dwarf2/pr47939-3.c: Likewise.
	* gcc.dg/debug/dwarf2/pr47939-4.c: Likewise.
	* gcc.dg/debug/dwarf2/pr53948.c: Likewise.
	* gcc.dg/debug/dwarf2/struct-loc1.c: Likewise.

Index: gcc.dg/debug/dwarf2/global-used-types.c === --- gcc.dg/debug/dwarf2/global-used-types.c (revision 201032) +++ gcc.dg/debug/dwarf2/global-used-types.c (working copy) @@ -1,6 +1,6 @@ /* Contributed by Dodji Seketeli - { dg-options "-g -dA -fno-merge-debug-strings" } + { dg-options "-gdwarf-2 -dA -fno-merge-debug-strings" } { dg-do compile } { dg-final { scan-assembler-times "DIE \\(0x\[^\n\]*\\) DW_TAG_enumeration_type" 1 } } { dg-final { scan-assembler-times "DIE \\(0x\[^\n\]*\\) DW_TAG_enumerator" 2 } } Index: gcc.dg/debug/dwarf2/inline2.c === --- gcc.dg/debug/dwarf2/inline2.c (revision 201032) +++ gcc.dg/debug/dwarf2/inline2.c (working copy) @@ -14,7 +14,7 @@ properly nested DW_TAG_inlined_subroutine DIEs for third, second and first. */ -/* { dg-options "-O -g3 -dA" } */ +/* { dg-options "-O -g3 -gdwarf-2 -dA" } */ /* { dg-do compile } */ /* There are 6 inlined subroutines: Index: gcc.dg/debug/dwarf2/inline3.c === --- gcc.dg/debug/dwarf2/inline3.c (revision 201032) +++ gcc.dg/debug/dwarf2/inline3.c (working copy) @@ -1,7 +1,7 @@ /* Verify that only one DW_AT_const_value is emitted for baz, not for baz abstract DIE and again inside of DW_TAG_inlined_subroutine. 
*/ -/* { dg-options "-O2 -g -dA -fmerge-all-constants" } */ +/* { dg-options "-O2 -gdwarf-2 -dA -fmerge-all-constants" } */ /* { dg-do compile } */ /* { dg-final { scan-assembler-times " DW_AT_const_value" 1 } } */ Index: gcc.dg/debug/dwarf2/pr37726.c === --- gcc.dg/debug/dwarf2/pr37726.c (revision 201032) +++ gcc.dg/debug/dwarf2/pr37726.c (working copy) @@ -1,6 +1,6 @@ /* PR debug/37726 */ /* { dg-do compile } */ -/* { dg-options "-g -O0 -dA -fno-merge-debug-strings" } */ +/* { dg-options "-gdwarf-2 -O0 -dA -fno-merge-debug-strings" } */ int foo (int parm) { Index: gcc.dg/debug/dwarf2/pr41445-1.c === --- gcc.dg/debug/dwarf2/pr41445-1.c (revision 201032) +++ gcc.dg/debug/dwarf2/pr41445-1.c (working copy) @@ -2,7 +2,7 @@ /* Test that token after multi-line function-like macro use gets correct locus even when preprocessing separately. */ /* { dg-do compile } */ -/* { dg-options "-save-temps -g -O0 -dA -fno-merge-debug-strings" } */ +/* { dg-options "-save-temps -gdwarf-2 -O0 -dA -fno-merge-debug-strings" } */ #define A(a,b) int varh;A(1, Index: gcc.dg/debug/dwarf2/pr41445-2.c === --- gcc.dg/debug/dwarf2/pr41445-2.c (revision 201032) +++ gcc.dg/debug/dwarf2/pr41445-2.c (working copy) @@ -1,6 +1,6 @@ /* PR preprocessor/41445 */ /* { dg-do compile } */ -/* { dg-options "-g -O0 -dA -fno-merge-debug-strings" } */ +/* { dg-options "-gdwarf-2 -O0 -dA -fno-merge-debug-strings" } */ #include "pr41445-1.c" Index: gcc.dg/debug/dwarf2/pr41445-3.c === --- gcc.dg/debug/dwarf2/pr41445-3.c (revision 201032) +++ gcc.dg/debug/dwarf2/pr41445-3.c (working copy) @@ -2,7 +2,7 @@ /* Test that token after multi-line function-like macro use gets correct locus even when preprocessing separately. */ /* { dg-do compile } */ -/* { dg-options "-save-temps -g -O0 -dA -fno-merge-debug
testsuite patches (2/14): Don't run execute/pr56799.c for int16 targets
With 16-bit integers, (int) 0x0001 is just 0, so execution of that test
fails.

Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-05-13  Joern Rennecke

	* gcc.c-torture/execute/pr56799.x: New file.

Index: gcc.c-torture/execute/pr56799.x
===================================================================
--- gcc.c-torture/execute/pr56799.x	(revision 0)
+++ gcc.c-torture/execute/pr56799.x	(working copy)
@@ -0,0 +1,7 @@
+load_lib target-supports.exp
+
+if { [check_effective_target_int32plus] } {
+	return 0
+}
+
+return 1;
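
To make the reasoning behind the int32plus check concrete, here is a small sketch of what happens to a constant whose only set bits lie above bit 15 once it is squeezed into 16 bits. The constant 0x10000 used in the comments and the helper names are assumptions for illustration; they are not taken from pr56799.c.

```c
#include <limits.h>
#include <stdint.h>

/* Keep only the low 16 bits of V, as a 16-bit-wide unsigned object
   would.  Unsigned conversion is used so the result is well defined.  */
static uint16_t
low16_sketch (uint32_t v)
{
  return (uint16_t) v;
}

/* Roughly what an "int is at least 32 bits" effective-target check
   asks about the compiler under test.  */
static int
int_is_32plus_sketch (void)
{
  return sizeof (int) * CHAR_BIT >= 32;
}
```

A value such as 0x10000 therefore collapses to 0 when only 16 bits are kept, so a test that depends on that bit surviving can only be run where int is at least 32 bits wide.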
testsuite patches (3/14): gcc.dg/c99-stdint-1.c [avr-*-*]: Update line number for dg-bogus.
Because the error message is now issued at the location of the macro
definition, the expected line number has to be updated.

Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-05-26  Joern Rennecke

	* gcc.dg/c99-stdint-1.c [avr-*-*]: Update line number for dg-bogus.

Index: gcc.dg/c99-stdint-1.c
===================================================================
--- gcc.dg/c99-stdint-1.c	(revision 201032)
+++ gcc.dg/c99-stdint-1.c	(working copy)
@@ -214,7 +214,7 @@ test_max (void)
 void
 test_misc_limits (void)
 {
-/* { dg-bogus "size" "ptrdiff is 16bits" { xfail avr-*-* } 218 } */
+/* { dg-bogus "size" "ptrdiff is 16bits" { xfail avr-*-* } 56 } */
   CHECK_SIGNED_LIMITS_2(__PTRDIFF_TYPE__, PTRDIFF_MIN, PTRDIFF_MAX, -65535L, 65535L);
 #ifndef SIGNAL_SUPPRESS
   CHECK_LIMITS_2(sig_atomic_t, SIG_ATOMIC_MIN, SIG_ATOMIC_MAX, -127, 127, 255);
Re: [ubsan] Add libcall arguments
On 07/19/2013 03:01 PM, Marek Polacek wrote:
> On Fri, Jul 19, 2013 at 08:50:42PM +0200, Jakub Jelinek wrote:
>> uintptr_type_node is a C/C++/ObjC/ObjC++ FE tree. So, if you use it
>> just in c-family/c-ubsan.c, that is just fine, but you can't use it
>> in ubsan.c.

Any reason not to move it into the middle-end?

Jason
testsuite patches (4/14): gcc.dg/pr57154.c requires scheduling
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

Committed as obvious.

2013-06-14  Joern Rennecke

	* gcc.dg/pr57154.c: Add dg-require-effective-target scheduling.

Index: gcc.dg/pr57154.c
===================================================================
--- gcc.dg/pr57154.c	(revision 201032)
+++ gcc.dg/pr57154.c	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fschedule-insns" } */
+/* { dg-require-effective-target scheduling } */
 #define PF_FROZEN 0x0001
 #define likely(x) __builtin_expect(!!(x), 1)
testsuite patches (5/14): Skip stackalign/builtin-apply-2.c for avr
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-06-24  Joern Rennecke

	* gcc.dg/torture/stackalign/builtin-apply-2.c: Also skip for avr.

Index: gcc.dg/torture/stackalign/builtin-apply-2.c
===================================================================
--- gcc.dg/torture/stackalign/builtin-apply-2.c	(revision 201032)
+++ gcc.dg/torture/stackalign/builtin-apply-2.c	(working copy)
@@ -6,7 +6,10 @@
 /* { dg-do run } */
-/* { dg-skip-if "Variadic funcs use Base AAPCS. Normal funcs use VFP variant." { arm_hf_eabi } } */
+/* arm_hf_eabi: Variadic funcs use Base AAPCS. Normal funcs use VFP variant.
+   avr: Variadic funcs don't pass arguments in registers, while normal funcs
+   do. */
+/* { dg-skip-if "Variadic funcs use different argument passing from normal funcs" { arm_hf_eabi || { avr-*-* } } "*" "" } */
 #define INTEGER_ARG 5
testsuite patches (6/14): Use sizeof (double) to define size of vector of two doubles
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-07-05  Joern Rennecke

	* gcc.dg/pr44214-1.c (v2df): Define size using sizeof (double).
	* gcc.dg/pr44214-3.c (v2df): Likewise.

Index: gcc.dg/pr44214-1.c
===================================================================
--- gcc.dg/pr44214-1.c	(revision 201032)
+++ gcc.dg/pr44214-1.c	(working copy)
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -freciprocal-math -fdump-tree-ccp1" } */
-typedef double v2df __attribute__ ((vector_size (16)));
+typedef double v2df __attribute__ ((vector_size (2 * sizeof (double))));
 void do_div (v2df *a, v2df *b)
 {
Index: gcc.dg/pr44214-3.c
===================================================================
--- gcc.dg/pr44214-3.c	(revision 201032)
+++ gcc.dg/pr44214-3.c	(working copy)
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-ccp1" } */
-typedef double v2df __attribute__ ((vector_size (16)));
+typedef double v2df __attribute__ ((vector_size (2 * sizeof (double))));
 void do_div (v2df *a, v2df *b)
 {
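
As a small illustration of why the typedef is written in terms of sizeof (double), the following sketch uses the GCC vector extension to define a two-lane vector and checks the lane count at run time. The helper functions are made up for the example and are not part of the tests.

```c
#include <stddef.h>

/* A vector of exactly two doubles, whatever sizeof (double) happens to
   be on the target.  With a hard-coded vector_size (16), a target whose
   double is 4 bytes wide would get four lanes instead of two.  */
typedef double v2df __attribute__ ((vector_size (2 * sizeof (double))));

/* Number of lanes in v2df; always 2 by construction.  */
static size_t
v2df_lanes_sketch (void)
{
  return sizeof (v2df) / sizeof (double);
}

/* Hypothetical helper: divide both lanes by 2.0 using the element-wise
   vector division the extension provides.  */
static v2df
v2df_halve_sketch (v2df v)
{
  v2df two = { 2.0, 2.0 };
  return v / two;
}
```

Calling v2df_halve_sketch on { 4.0, 6.0 } yields { 2.0, 3.0 }, and the lane count stays at two regardless of how wide double is.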
testsuite patches (7/14): gcc.dg/pr46647.c: xfail for avr*-*-*
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-07-05  Joern Rennecke

	* gcc.dg/pr46647.c: xfail for avr*-*-*.

Index: gcc.dg/pr46647.c
===================================================================
--- gcc.dg/pr46647.c	(revision 201032)
+++ gcc.dg/pr46647.c	(working copy)
@@ -25,6 +25,6 @@ func3 (void)
   return 0;
 }
-/* The xfail for cris-* and crisv32-* is due to PR53535. */
-/* { dg-final { scan-tree-dump-not "memset" "optimized" { xfail cris-*-* crisv32-*-* } } } */
+/* The xfail for avr, cris-* and crisv32-* is due to PR53535. */
+/* { dg-final { scan-tree-dump-not "memset" "optimized" { xfail avr*-*-* cris-*-* crisv32-*-* } } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */
testsuite patches (8/14): gcc.dg/pr53265.c: Disable test for avr*-*-*
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-07-05  Joern Rennecke

	* gcc.dg/pr53265.c: Disable test for avr*-*-*.

Index: gcc.dg/pr53265.c
===================================================================
--- gcc.dg/pr53265.c	(revision 201032)
+++ gcc.dg/pr53265.c	(working copy)
@@ -1,5 +1,5 @@
 /* PR tree-optimization/53265 */
-/* { dg-do compile } */
+/* { dg-do compile { target { ! avr*-*-* } } } */ /* Don't run if arrays are too large. */
 /* { dg-options "-O2 -Wall" } */
 void bar (void *);
testsuite patches (9/14): Expect fewer memcpy in gcc.dg/strlenopt-1[013].c
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-07-05  Joern Rennecke

	* gcc.dg/strlenopt-10.c [avr*-*-*]: Reduce number of expected
	memcpy by one.
	* gcc.dg/strlenopt-11.c [avr*-*-*]: Likewise. Expect l to be
	optimized away.
	* gcc.dg/strlenopt-13.c [avr*-*-*]: Likewise.

Index: gcc.dg/strlenopt-10.c
===================================================================
--- gcc.dg/strlenopt-10.c	(revision 201032)
+++ gcc.dg/strlenopt-10.c	(working copy)
@@ -70,7 +70,10 @@ main ()
 }
 /* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "memcpy \\(" 8 "strlen" } } */
+/* avr has BIGGEST_ALIGNMENT 8, allowing fold_builtin_memory_op
+   to expand the memcpy call at the end of fn2. */
+/* { dg-final { scan-tree-dump-times "memcpy \\(" 8 "strlen" { target { ! avr*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "memcpy \\(" 7 "strlen" { target { avr*-*-* } } } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
Index: gcc.dg/strlenopt-11.c
===================================================================
--- gcc.dg/strlenopt-11.c	(revision 201032)
+++ gcc.dg/strlenopt-11.c	(working copy)
@@ -59,12 +59,18 @@ main ()
 }
 /* { dg-final { scan-tree-dump-times "strlen \\(" 3 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "memcpy \\(" 7 "strlen" } } */
+/* avr has BIGGEST_ALIGNMENT 8, allowing fold_builtin_memory_op
+   to expand the memcpy call at the end of fn1. */
+/* { dg-final { scan-tree-dump-times "memcpy \\(" 7 "strlen" { target { ! avr*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "memcpy \\(" 6 "strlen" { target { avr*-*-* } } } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "stpcpy \\(" 0 "strlen" } } */
-/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.0. = " 1 "strlen" } } */
-/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.6. = " 1 "strlen" } } */
-/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.9. = " 1 "strlen" } } */
+/* Where the memcpy is expanded, the assignemts to elements of l are
+   propagated. */
+/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.0. = " 1 "strlen" { target { ! avr*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.6. = " 1 "strlen" { target { ! avr*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.9. = " 1 "strlen" { target { ! avr*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;" 3 "strlen" { target { avr*-*-* } } } } */
 /* { dg-final { cleanup-tree-dump "strlen" } } */
Index: gcc.dg/strlenopt-13.c
===================================================================
--- gcc.dg/strlenopt-13.c	(revision 201032)
+++ gcc.dg/strlenopt-13.c	(working copy)
@@ -56,13 +56,19 @@ main ()
 }
 /* { dg-final { scan-tree-dump-times "strlen \\(" 4 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "memcpy \\(" 7 "strlen" } } */
+/* avr has BIGGEST_ALIGNMENT 8, allowing fold_builtin_memory_op
+   to expand the memcpy call at the end of fn1. */
+/* { dg-final { scan-tree-dump-times "memcpy \\(" 7 "strlen" { target { ! avr*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "memcpy \\(" 6 "strlen" { target { avr*-*-* } } } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "stpcpy \\(" 0 "strlen" } } */
-/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.0. = " 1 "strlen" } } */
-/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.1. = " 1 "strlen" } } */
-/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.5. = " 1 "strlen" } } */
-/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.6. = " 1 "strlen" } } */
+/* Where the memcpy is expanded, the assignemts to elements of l are
+   propagated. */
+/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.0. = " 1 "strlen" { target { ! avr*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times " _\[0-9\]* = strlen \\(\[^\n\r\]*;\[\n\r\]* l.1. = " 1 "strlen" { target { ! avr*-*-* } } } } */
+/* { dg-final { scan-tree-dum
testsuite patches (10/14): Add missing test conditions in c-c++-common/scal-to-vec1.c
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-07-05  Joern Rennecke

	* c-c++-common/scal-to-vec1.c: Add !int16 and large_double
	conditions to tests that assume int/double are larger than
	short/float.

Index: c-c++-common/scal-to-vec1.c
===================================================================
--- c-c++-common/scal-to-vec1.c	(revision 201032)
+++ c-c++-common/scal-to-vec1.c	(working copy)
@@ -26,13 +26,13 @@ int main (int argc, char *argv[]) {
 int i = 12;
 double d = 3.;
-v1 = i + v0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" } */
+v1 = i + v0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" "scalar to vector" { target { ! int16 } } } */
 v1 = 9 + v0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" } */
-f1 = d + f0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" } */
-f1 = 1.3 + f0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" } */
+f1 = d + f0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" "scalar to vector" { target { large_double } } } */
+f1 = 1.3 + f0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" "scalar to vector" { target { large_double } } } */
 f1 = sll + f0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" } */
-f1 = ((int)998769576) + f0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" } */
+f1 = ((int)998769576) + f0; /* { dg-error "conversion of scalar \[^\\n\]* to vector" "scalar to vector" { target { ! int16 } } } */
 /* convert.c should take care of this. */
 i1 = sfl + i0; /* { dg-error "can't convert value to a vector|invalid operands" } */
Re: [AArch64] Rewrite vabs_s<8,16,32,64> AdvSIMD intrinsics to fold to tree.
On 20 July 2013 00:42, James Greenhalgh wrote:
> Hopefully the attached has "forcefully" correct in both places and is
> otherwise typo free.
>
> OK?

OK

/Marcus
committed: testsuite patches (11/14): Don't expect null pointer check elimination for target { keeps_null_pointer_checks }
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

Committed as obvious.

2013-07-17  Joern Rennecke

	* gcc.dg/tree-ssa/pr21090.c: Do vrp1 scan check only for
	target { ! keeps_null_pointer_checks }.
	* gcc.dg/tree-ssa/unreachable.c: Do optimized scan check only for
	target { ! keeps_null_pointer_checks }.

Index: gcc.dg/tree-ssa/pr21090.c
===================================================================
--- gcc.dg/tree-ssa/pr21090.c	(revision 201032)
+++ gcc.dg/tree-ssa/pr21090.c	(working copy)
@@ -19,5 +19,5 @@ foo (int a)
   return 0;
 }
-/* { dg-final { scan-tree-dump-times "Folding predicate.*to 1" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Folding predicate.*to 1" 1 "vrp1" { target { ! keeps_null_pointer_checks } } } } */
 /* { dg-final { cleanup-tree-dump "vrp1" } } */
Index: gcc.dg/tree-ssa/unreachable.c
===================================================================
--- gcc.dg/tree-ssa/unreachable.c	(revision 201032)
+++ gcc.dg/tree-ssa/unreachable.c	(working copy)
@@ -11,5 +11,5 @@ main()
     return 1;
   return 0;
 }
-/* { dg-final { scan-tree-dump-not "bad_boy" "optimized"} } */
+/* { dg-final { scan-tree-dump-not "bad_boy" "optimized" { target { ! keeps_null_pointer_checks } } } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */
testsuite patches (12/14): Add predicates to tests that depend on integer size
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-07-17  Joern Rennecke

	* c-c++-common/simulate-thread/bitfields-2.c: Run test only for
	target { ! int16 }.
	* gcc.dg/tree-ssa/pr54245.c: Do slsr scan only for target { ! int16 }.
	* gcc.dg/tree-ssa/slsr-1.c: Adjust the multipliers scanned for on
	target { int16 }. Restrict existing tests to target { int32 }
	where appropriate.
	* gcc.dg/tree-ssa/slsr-2.c, gcc.dg/tree-ssa/slsr-27.c: Likewise.
	* gcc.dg/tree-ssa/slsr-28.c, gcc.dg/tree-ssa/slsr-29.c: Likewise.
	* gcc.dg/tree-ssa/slsr-3.c, gcc.dg/tree-ssa/ssa-ccp-23.c: Likewise.
	* lib/target-supports.exp (check_effective_target_int32): New proc.

Index: c-c++-common/simulate-thread/bitfields-2.c
===================================================================
--- c-c++-common/simulate-thread/bitfields-2.c	(revision 201032)
+++ c-c++-common/simulate-thread/bitfields-2.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do link } */
+/* { dg-do link { target { ! int16 } } } */
 /* { dg-options "--param allow-store-data-races=0" } */
 /* { dg-final { simulate-thread } } */
Index: gcc.dg/tree-ssa/pr54245.c
===================================================================
--- gcc.dg/tree-ssa/pr54245.c	(revision 201032)
+++ gcc.dg/tree-ssa/pr54245.c	(working copy)
@@ -45,5 +45,5 @@ int main(void)
 /* For now, disable inserting an initializer when the multiplication will
    take place in a smaller type than originally. This test may be deleted
    in future when this case is handled more precisely. */
-/* { dg-final { scan-tree-dump-times "Inserting initializer" 0 "slsr" } } */
+/* { dg-final { scan-tree-dump-times "Inserting initializer" 0 "slsr" { target { ! int16 } } } } */
 /* { dg-final { cleanup-tree-dump "slsr" } } */
Index: gcc.dg/tree-ssa/slsr-1.c
===================================================================
--- gcc.dg/tree-ssa/slsr-1.c	(revision 201032)
+++ gcc.dg/tree-ssa/slsr-1.c	(working copy)
@@ -14,7 +14,9 @@ f (int *p, unsigned int n)
   foo (*(p + 48 + n * 4));
 }
-/* { dg-final { scan-tree-dump-times "\\+ 128|\\, 128>" 1 "optimized" { target { int32plus } } } } */
 /* { dg-final { scan-tree-dump-times "\\+ 64|\\, 64>" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "\\+ 192|\\, 192>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\+ 32|\\, 32>" 1 "optimized" { target { int16 } } } } */
+/* { dg-final { scan-tree-dump-times "\\+ 192|\\, 192>" 1 "optimized" { target { int32 } } } } */
+/* { dg-final { scan-tree-dump-times "\\+ 96|\\, 96>" 1 "optimized" { target { int16 } } } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc.dg/tree-ssa/slsr-2.c
===================================================================
--- gcc.dg/tree-ssa/slsr-2.c	(revision 201032)
+++ gcc.dg/tree-ssa/slsr-2.c	(working copy)
@@ -11,6 +11,8 @@ f (int *p, int n)
   foo (*(p + 16 + n * 4));
 }
-/* { dg-final { scan-tree-dump-times "\\+ 144|\\, 144>" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "\\+ 96|\\, 96>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\+ 144|\\, 144>" 1 "optimized" { target { int32 } } } } */
+/* { dg-final { scan-tree-dump-times "\\+ 72|\\, 72>" 1 "optimized" { target { int16 } } } } */
+/* { dg-final { scan-tree-dump-times "\\+ 96|\\, 96>" 1 "optimized" { target { int32 } } } } */
+/* { dg-final { scan-tree-dump-times "\\+ 48|\\, 48>" 1 "optimized" { target { int16 } } } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc.dg/tree-ssa/slsr-27.c
===================================================================
--- gcc.dg/tree-ssa/slsr-27.c	(revision 201032)
+++ gcc.dg/tree-ssa/slsr-27.c	(working copy)
@@ -16,7 +16,8 @@ f (struct x *p, unsigned int n)
   foo (p->a[n], p->c[n], p->b[n]);
 }
-/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" { target { int32 } } } } */
+/* { dg-final { scan-tree-dump-times "\\* 2;" 1 "dom2" { target { int16 } } } } */
 /* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ \[^\r\n\]*_\\d\+;" 1 "dom2" } } */
 /* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)\[^\r\n\]*_\\d\+" 3 "dom2" } } */
 /* { dg-final { cleanup-tree-dump "dom2" } } */
Index: gcc.dg/tree-ssa/slsr-28.c
===================================================================
--- gcc.dg/tree-ssa/slsr-28.c	(revision 201032)
+++ gcc.dg/tree-ssa/slsr-28.c	(working copy)
@@ -20,7 +20,8 @@ f (struct x *p, unsigned int n)
   foo (p->b[n], p->a[n], p->c[n]);
 }
-/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" { target { int32 } } } } */
+/* { dg-final { scan-tree-dump-times "\\* 2;" 1 "dom2" { target { int16 } } } } */
 /* { dg-final
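
The int16 variants of the slsr-1.c constants follow from simple pointer arithmetic: a byte offset is the element offset times sizeof (int). The sketch below works this out. Only the element offset 48 is visible in the diff context above; the offsets 32 and 16 are inferred from the expected constants 128 and 64, and the helper name is made up.

```c
/* Byte offset of element ELEMS in an int array when int is INT_SIZE
   bytes wide (4 on an int32 target, 2 on an int16 target).  */
static unsigned long
byte_offset_sketch (unsigned long elems, unsigned long int_size)
{
  return elems * int_size;
}

/* Element offsets 32, 16 and 48 give
     int32: 128,  64, 192
     int16:  64,  32,  96
   which matches the constants the dg-final directives scan for.  */
```

This is also why the "+ 64" scan needs no target selector: it is produced by element offset 16 on int32 targets and by element offset 32 on int16 targets.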
Committed: testsuite patches (13/14): Restrict tests that require large arrays / 16-bit wrap-around indices to target { size32plus }
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-07-17  Joern Rennecke

	* gcc.dg/torture/pr53366-1.c: Only run for target { size32plus }.
	* gcc.dg/torture/pr56488.c: Likewise.

Index: gcc.dg/torture/pr53366-1.c
===================================================================
--- gcc.dg/torture/pr53366-1.c	(revision 201032)
+++ gcc.dg/torture/pr53366-1.c	(working copy)
@@ -1,5 +1,5 @@
 /* PR tree-optimization/53366 */
-/* { dg-do run } */
+/* { dg-do run { target { size32plus } } } */
 extern void abort (void);
Index: gcc.dg/torture/pr56488.c
===================================================================
--- gcc.dg/torture/pr56488.c	(revision 201032)
+++ gcc.dg/torture/pr56488.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { target { size32plus } } } */
 int a, c, d = 1;
 struct S { int s; } b, f;
testsuite patches (14/14): gcc.dg/tree-ssa/pr42585.c: [avr*-*-*]: Don't do scan check
Tested for avr with --target_board=atmega128-sim and native on
i686-pc-linux-gnu.

2013-07-17  Joern Rennecke

	* gcc.dg/tree-ssa/pr42585.c: Add avr*-*-* to list of targets to
	exclude from scan test.

Index: gcc.dg/tree-ssa/pr42585.c
===================================================================
--- gcc.dg/tree-ssa/pr42585.c	(revision 201032)
+++ gcc.dg/tree-ssa/pr42585.c	(working copy)
@@ -35,6 +35,6 @@ Cyc_string_ungetc (int ignore, struct _f
 /* Whether the structs are totally scalarized or not depends on the
    MOVE_RATIO macro defintion in the back end. The scalarization will
    not take place when using small values for MOVE_RATIO. */
-/* { dg-final { scan-tree-dump-times "struct _fat_ptr _ans" 0 "optimized" { target { ! "arm*-*-* powerpc*-*-* s390*-*-* sh*-*-*" } } } } */
-/* { dg-final { scan-tree-dump-times "struct _fat_ptr _T2" 0 "optimized" { target { ! "arm*-*-* powerpc*-*-* s390*-*-* sh*-*-*" } } } } */
+/* { dg-final { scan-tree-dump-times "struct _fat_ptr _ans" 0 "optimized" { target { ! "arm*-*-* avr*-*-* powerpc*-*-* s390*-*-* sh*-*-*" } } } } */
+/* { dg-final { scan-tree-dump-times "struct _fat_ptr _T2" 0 "optimized" { target { ! "arm*-*-* avr*-*-* powerpc*-*-* s390*-*-* sh*-*-*" } } } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */