Re: Simplify types of arrays

2018-11-08 Thread Richard Biener
On Wed, 7 Nov 2018, Jan Hubicka wrote:

> Hi,
> this patch simplfies types of arrays so we don't propagate duplicates
> when record/union contains array of pointers.
> With this we still miss simplification of pointers to arrays of
> structures (where we need to rebuild array same way as we rebuild
> pointers) and enumerate types. That should make simplification of types
> complete. Neither of those two seems very critical for GCC build,
> with the patch we are down to 24 duplicated types in bootstrap, I will
> collect data on firefox, but things looks quite good.
> (from tens of thousdant week ago).
> 
> The patch works, but I am somewhat nervous because modyfing type inplace
> affects its type_hash_canon_hash and friends.

Indeed - I don't think we want to do this here and this way.  What
you'd need to do is record uses of a type and replace all uses with
a simplified copy ...

> There are pre-existing
> modifications to function parameters and things seems to just work,
> but I wonder if we have any strategy on keeping hashes in tree.c
> consitent across free-lang data? Or are all those hashes unused/freed at
> this time?

They are not freed and I don't see why they should.

> lto-bootstrapped/regtested x86_64-linux.

Please lets not do this and instead keep the array of pointers
thing in mind.

Does it really matter so much?  That is, where are those arrays
with pointer to complete type referenced from?  Global variables?
As fields in records they are already fixed up properly, right?

Richard.

> Honza
> 
>   * tree.c (free_lang_data_in_type): Simplify types of arrays.
> Index: tree.c
> ===
> --- tree.c(revision 265877)
> +++ tree.c(working copy)
> @@ -5320,6 +5320,8 @@ free_lang_data_in_type (tree type, struc
> TREE_PURPOSE (p) = NULL;
>   }
>  }
> +  else if (TREE_CODE (type) == ARRAY_TYPE)
> +TREE_TYPE (type) = fld_simplified_type (TREE_TYPE (type), fld);
>else if (RECORD_OR_UNION_TYPE_P (type))
>  {
>/* Remove members that are not FIELD_DECLs from the field list
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[PATCH] Remove unreachable nodes before IPA profile pass (PR ipa/87706).

2018-11-08 Thread Martin Liška
Hi.

In order to fix the warnings mentioned in the PR, we need
to run remove_unreachable_nodes after early tree passes. That's
however possible only within a IPA pass. Thus I'm calling that
before the profile PASS.

Patch survives regression tests on ppc64le-linux-gnu and majority
of warnings are gone in profiledbootstrap.

Ready for trunk?
Thanks,
Martin

gcc/ChangeLog:

2018-11-08  Martin Liska  

* tree-profile.c: Run TODO_remove_functions before "profile"
pass in order to remove dead functions that will trigger
-Wmissing-profile.
---
 gcc/tree-profile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
index d8f2a3b1ba4..c14ebc556a6 100644
--- a/gcc/tree-profile.c
+++ b/gcc/tree-profile.c
@@ -776,7 +776,7 @@ const pass_data pass_data_ipa_tree_profile =
   0, /* properties_required */
   0, /* properties_provided */
   0, /* properties_destroyed */
-  0, /* todo_flags_start */
+  TODO_remove_functions, /* todo_flags_start */
   TODO_dump_symtab, /* todo_flags_finish */
 };
 



Re: [PATCH] Fix PR87906

2018-11-08 Thread Rainer Orth
Hi Richard,

>>the new testcase FAILs on Solaris:
>>
>>+FAIL: g++.dg/lto/pr87906 cp_lto_pr87906_0.o assemble,  -O -fPIC -flto 
>>+UNRESOLVED: g++.dg/lto/pr87906 cp_lto_pr87906_0.o-cp_lto_pr87906_1.o
>>execute  -O -fPIC -flto 
>>+UNRESOLVED: g++.dg/lto/pr87906 cp_lto_pr87906_0.o-cp_lto_pr87906_1.o
>>link  -O -fPIC -flto 
>>+FAIL: g++.dg/lto/pr87906 cp_lto_pr87906_1.o assemble,  -O -fPIC -flto 
>>
>>/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/lto/pr87906_0.C:6:11:
>>error: expected identifier before numeric constant
>>/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/lto/pr87906_0.C:6:11:
>>error: expected unqualified-id before numeric constant
>>
>>and several more due to the -Dsun default.  How about
>>sed -e 's/sun/moon/g' instead ;-)
>
> Argh. Works for me. 

here's what I've installed after testing on i386-pc-solaris2.11.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-11-08  Rainer Orth  

* g++.dg/lto/pr87906_0.C: Use moon instead of possibly predefined
sun.
* g++.dg/lto/pr87906_1.C: Likewise.

# HG changeset patch
# Parent  9e2733fbb10cd25b0d4aec581864af37c615fdbf
Don't use predefined sun in g++.dg/lto/pr87906

diff --git a/gcc/testsuite/g++.dg/lto/pr87906_0.C b/gcc/testsuite/g++.dg/lto/pr87906_0.C
--- a/gcc/testsuite/g++.dg/lto/pr87906_0.C
+++ b/gcc/testsuite/g++.dg/lto/pr87906_0.C
@@ -3,13 +3,13 @@
 // { dg-extra-ld-options "-shared -nostdlib" }
 
 namespace com {
-namespace sun {
+namespace moon {
 namespace star {}
-} // namespace sun
+} // namespace moon
 } // namespace com
-namespace a = com::sun::star;
+namespace a = com::moon::star;
 namespace com {
-namespace sun {
+namespace moon {
 namespace star {
 namespace uno {
 class a {
@@ -28,7 +28,7 @@ class c {
 class RuntimeException : b {};
 } // namespace uno
 } // namespace star
-} // namespace sun
+} // namespace moon
 } // namespace com
 template  void d(int) { throw a::uno::RuntimeException(); }
 int f;
diff --git a/gcc/testsuite/g++.dg/lto/pr87906_1.C b/gcc/testsuite/g++.dg/lto/pr87906_1.C
--- a/gcc/testsuite/g++.dg/lto/pr87906_1.C
+++ b/gcc/testsuite/g++.dg/lto/pr87906_1.C
@@ -1,5 +1,5 @@
 namespace com {
-namespace sun {
+namespace moon {
 namespace star {
 namespace uno {
 class a {
@@ -15,9 +15,9 @@ class RuntimeException : b {};
 } // namespace uno
 class C : uno::RuntimeException {};
 } // namespace star
-} // namespace sun
+} // namespace moon
 } // namespace com
-using com::sun::star::C;
-using com::sun::star::uno::RuntimeException;
+using com::moon::star::C;
+using com::moon::star::uno::RuntimeException;
 void d() { throw RuntimeException(); }
 void e() { C(); }


Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2018-11-08 Thread Martin Liška
On 11/7/18 11:23 PM, Jeff Law wrote:
> On 10/30/18 6:28 AM, Martin Liška wrote:
>> On 10/30/18 11:03 AM, Jakub Jelinek wrote:
>>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
 +hashtab_chk_error ()
 +{
 +  fprintf (stderr, "hash table checking failed: "
 + "equal operator returns true for a pair "
 + "of values with a different hash value");
>>> BTW, either use internal_error here, or at least if using fprintf
>>> terminate with \n, in your recent mail I saw:
>>> ...different hash valueduring RTL pass: vartrack
>>> ^^
>> Sure, fixed in attached patch.
>>
>> Martin
>>
 +  gcc_unreachable ();
 +}
>>> Jakub
>>>
>>
>> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch
>>
>> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Mon, 29 Oct 2018 09:38:21 +0100
>> Subject: [PATCH] Sanitize equals and hash functions in hash-tables.
>>
>> ---
>>  gcc/hash-table.h | 40 +++-
>>  1 file changed, 39 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>> index bd83345c7b8..694eedfc4be 100644
>> --- a/gcc/hash-table.h
>> +++ b/gcc/hash-table.h
>> @@ -503,6 +503,7 @@ private:
>>  
>>value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
>>value_type *find_empty_slot_for_expand (hashval_t);
>> +  void verify (const compare_type &comparable, hashval_t hash);
>>bool too_empty_p (unsigned int);
>>void expand ();
>>static bool is_deleted (value_type &v)
>> @@ -882,8 +883,12 @@ hash_table
>>if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
>>  expand ();
>>  
>> -  m_searches++;
>> +#if ENABLE_EXTRA_CHECKING
>> +if (insert == INSERT)
>> +  verify (comparable, hash);
>> +#endif
>>  
>> +  m_searches++;
>>value_type *first_deleted_slot = NULL;
>>hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
>>hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
>> @@ -930,6 +935,39 @@ hash_table
>>return &m_entries[index];
>>  }
>>  
>> +#if ENABLE_EXTRA_CHECKING
>> +
>> +/* Report a hash table checking error.  */
>> +
>> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD
>> +static void
>> +hashtab_chk_error ()
>> +{
>> +  fprintf (stderr, "hash table checking failed: "
>> +   "equal operator returns true for a pair "
>> +   "of values with a different hash value\n");
>> +  gcc_unreachable ();
>> +}
> I think an internal_error here is probably still better than a simple
> fprintf, even if the fprintf is terminated with a \n :-)

Fully agree with that, but I see a lot of build errors when using 
internal_error.

> 
> The question then becomes can we bootstrap with this stuff enabled and
> if not, are we likely to soon?  It'd be a shame to put it into
> EXTRA_CHECKING, but then not be able to really use EXTRA_CHECKING
> because we've got too many bugs to fix.

Unfortunately it's blocked with these 2 PRs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847

I'm fine with having the patch in in next stage1 after the problems will
be fixed.

Martin

> 
>> +
>> +/* Verify that all existing elements in th hash table which are
> s/th/the/
> 
> 
> Jeff
> 



[C++ Patch] Fix two grokdeclarator locations

2018-11-08 Thread Paolo Carlini

Hi,

two additional grokdeclarator locations that we can easily fix by using 
declarator->id_loc. Slightly more interesting, testing revealed a latent 
issue in the make_id_declarator uses: cp_parser_member_declaration 
wasn't setting declarator->id_loc, thus I decided to add a location_t 
parameter to make_id_declarator itself and adjust all the callers. 
Tested x86_64-linux.


Thanks, Paolo.

//

/cp
2018-11-08  Paolo Carlini  

* parser.c (make_id_declarator): Add location_t parameter.
(cp_parser_lambda_declarator_opt): Adjust call.
(cp_parser_decomposition_declaration): Likewise.
(cp_parser_alias_declaration): Likewise.
(cp_parser_direct_declarator): Likewise.
(cp_parser_member_declaration): Likewise.
(cp_parser_objc_class_ivars): Likewise.
* decl.c (grokdeclarator): Use declarator->id_loc in two error messages.

/testsuite
2018-11-08  Paolo Carlini  

* g++.dg/cpp0x/nsdmi-union6.C: Test locations too.
* g++.dg/cpp0x/nsdmi6.C: Likewise.
* g++.dg/ext/flexary4.C: Likewise.
* g++.dg/ext/flexary9.C: Likewise.
* g++.dg/other/incomplete2.C: Likewise.
* g++.dg/parse/friend12.C: Likewise.Index: cp/decl.c
===
--- cp/decl.c   (revision 265869)
+++ cp/decl.c   (working copy)
@@ -12408,8 +12408,9 @@ grokdeclarator (const cp_declarator *declarator,
  {
if (unqualified_id)
  {
-   error ("field %qD has incomplete type %qT",
-  unqualified_id, type);
+   error_at (declarator->id_loc,
+ "field %qD has incomplete type %qT",
+ unqualified_id, type);
cxx_incomplete_type_inform (strip_array_types (type));
  }
else
@@ -12423,8 +12424,9 @@ grokdeclarator (const cp_declarator *declarator,
  {
if (friendp)
  {
-   error ("%qE is neither function nor member function; "
-  "cannot be declared friend", unqualified_id);
+   error_at (declarator->id_loc,
+ "%qE is neither function nor member function; "
+ "cannot be declared friend", unqualified_id);
return error_mark_node;
  }
decl = NULL_TREE;
Index: cp/parser.c
===
--- cp/parser.c (revision 265869)
+++ cp/parser.c (working copy)
@@ -1452,7 +1452,7 @@ make_declarator (cp_declarator_kind kind)
 
 static cp_declarator *
 make_id_declarator (tree qualifying_scope, tree unqualified_name,
-   special_function_kind sfk)
+   special_function_kind sfk, location_t id_location)
 {
   cp_declarator *declarator;
 
@@ -1477,7 +1477,8 @@ make_id_declarator (tree qualifying_scope, tree un
   declarator->u.id.qualifying_scope = qualifying_scope;
   declarator->u.id.unqualified_name = unqualified_name;
   declarator->u.id.sfk = sfk;
-  
+  declarator->id_loc = id_location;
+
   return declarator;
 }
 
@@ -10666,7 +10667,8 @@ cp_parser_lambda_declarator_opt (cp_parser* parser
 
 p = obstack_alloc (&declarator_obstack, 0);
 
-declarator = make_id_declarator (NULL_TREE, call_op_identifier, sfk_none);
+declarator = make_id_declarator (NULL_TREE, call_op_identifier, sfk_none,
+LAMBDA_EXPR_LOCATION (lambda_expr));
 
 quals = (LAMBDA_EXPR_MUTABLE_P (lambda_expr)
 ? TYPE_UNQUALIFIED : TYPE_QUAL_CONST);
@@ -10677,7 +10679,6 @@ cp_parser_lambda_declarator_opt (cp_parser* parser
   exception_spec,
return_type,
/*requires_clause*/NULL_TREE);
-declarator->id_loc = LAMBDA_EXPR_LOCATION (lambda_expr);
 declarator->std_attributes = attributes;
 
 fco = grokmethod (&return_type_specs,
@@ -13443,10 +13444,13 @@ cp_parser_decomposition_declaration (cp_parser *pa
   FOR_EACH_VEC_ELT (v, i, e)
 {
   if (i == 0)
-   declarator = make_id_declarator (NULL_TREE, e.get_value (), sfk_none);
+   declarator = make_id_declarator (NULL_TREE, e.get_value (),
+sfk_none, e.get_location ());
   else
-   declarator->u.id.unqualified_name = e.get_value ();
-  declarator->id_loc = e.get_location ();
+   {
+ declarator->u.id.unqualified_name = e.get_value ();
+ declarator->id_loc = e.get_location ();
+   }
   tree elt_pushed_scope;
   tree decl2 = start_decl (declarator, &decl_specs, SD_INITIALIZED,
   NULL_TREE, NULL_TREE, &elt_pushed_scope);
@@ -19264,8 +19268,7 @@ cp_parser_alias_declaration (cp_parser* parser)
   /*declarator=*/

Re: introduce --enable-mingw-full32 to default to --large-address-aware

2018-11-08 Thread Alexandre Oliva
On Nov  7, 2018, JonY <10wa...@gmail.com> wrote:

> On 11/07/2018 08:34 AM, Alexandre Oliva wrote:
>> On Nov  1, 2018, JonY wrote:
>> 
>>> Looks like it causes an error on 64bit:
>>> /usr/libexec/gcc/x86_64-w64-mingw32/ld: unrecognized option
>>> '--large-address-aware'
>> 
>> What does?  The patch I suggested?  The current trunk?
>> 
>> What was the command in this case?  How was the toolchain configured?

> No it's just a quick test to see how x86_64-w64-mingw32 reacts to
> --large-address-aware, it doesn't play well.

I understand, but you're getting a different result from what I am, I'd
like to understand why before attempting to "fix" something that AFAICT
is correct and behaving just as intended.  Maybe there are valid reasons
to drop --large-address-aware altogether on x86_64-w64-mingw32, but I'd
like to understand why, as in, how the error above came about for you,
when it didn't for me.

I built gcc and binutils in a single tree, with x86_64-w64-mingw32 as
the target, and the resulting linker would accept --large-address-aware
in the -mi386-pe emulation, and that emulation was explicitly enabled
when building -m32 binaries.  Does your w64 toolchain deviate from any
of these facts?


> x86_64-mingw32 is not used as far as I know, only with "w64" or "pc".

x86_64-mingw32's canonical form is x86_64-pc-mingw32, and they're
equivalent, so whatever you say or know about the latter should apply to
the shorter form as well.  Likewise, my questions and doubts about the
shorter form apply equally to the canonical one.

> The "w64" carries a special meaning to gcc dating back to the early
> 64bit port. It basically tells gcc to use mingw-w64 specific features
> that are not found on the regular mingw.org CRT at the time.

I see.  So -pc- and -w64- are not supposed to be equivalent indeed.
Thanks.

> This might be affecting the "pc" vendor build, can you check
> x86_64-pc-mingw32 just to see if it is affected?

I did.  Its -m32 mode seems broken to me.  As I wrote, the linker
emulation is never specified, so it would build 64-bit binaries even
when given -m32.  Because it lacks explicit linker emulation
specification, it can't have --large-address-aware specified either.

In my link tests, i686-mingw32 and x86_64-w64-mingw32 both worked (in
that their respective linkers wouldn't reject the --large-address-aware
passed by gcc) with the --large-address-aware patch that is installed in
trunk.  Ditto with the incremental patch I posted last week, that would
have only improved x86_64-pc-mingw32.

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe


Re: [PATCH, ARM, ping2] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-08 Thread Thomas Preudhomme
Ping?

Best regards,

Thomas

On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme
 wrote:
>
> Ping?
>
> Best regards,
>
> Thomas
> On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme
>  wrote:
> >
> > Hi,
> >
> > Please find updated patch to fix PR85434: spilling of stack protector
> > guard's address on ARM. Quite a few changes have been made to the ARM
> > part since last round of review so I think it makes more sense to
> > review it anew. Ran bootstrap + regression testsuite + glibc build +
> > glibc regression testsuite for Arm and Thumb-2 and bootstrap +
> > regression testsuite for Thumb-1. GCC's regression testsuite was run
> > in 3 configurations in all those cases:
> >
> > - default configuration (no RUNTESTFLAGS)
> > - with -fstack-protector-all
> > - with -fPIC -fstack-protector-all (to exercise both codepath in stack
> > protector's split code)
> >
> > None of this show any regression beyond some new scan fail with
> > -fstack-protector-all or -fPIC due to unexpected code sequence for the
> > testcases concerned and some guality swing due to less optimization
> > with new stack protector on.
> >
> > Patch description and ChangeLog below.
> >
> > In case of high register pressure in PIC mode, address of the stack
> > protector's guard can be spilled on ARM targets as shown in PR85434,
> > thus allowing an attacker to control what the canary would be compared
> > against. ARM does lack stack_protect_set and stack_protect_test insn
> > patterns, defining them does not help as the address is expanded
> > regularly and the patterns only deal with the copy and test of the
> > guard with the canary.
> >
> > This problem does not occur for x86 targets because the PIC access and
> > the test can be done in the same instruction. Aarch64 is exempt too
> > because PIC access insn pattern are mov of UNSPEC which prevents it from
> > the second access in the epilogue being CSEd in cse_local pass with the
> > first access in the prologue.
> >
> > The approach followed here is to create new "combined" set and test
> > standard pattern names that take the unexpanded guard and do the set or
> > test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> > to hide the individual instructions being generated to the compiler and
> > split the pattern into generic load, compare and branch instruction
> > after register allocator, therefore avoiding any spilling. This is here
> > implemented for the ARM targets. For targets not implementing these new
> > standard pattern names, the existing stack_protect_set and
> > stack_protect_test pattern names are used.
> >
> > To be able to split PIC access after register allocation, the functions
> > had to be augmented to force a new PIC register load and to control
> > which register it loads into. This is because sharing the PIC register
> > between prologue and epilogue could lead to spilling due to CSE again
> > which an attacker could use to control what the canary gets compared
> > against.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-10-26  Thomas Preud'homme  
> >
> > * target-insns.def (stack_protect_combined_set): Define new standard
> > pattern name.
> > (stack_protect_combined_test): Likewise.
> > * cfgexpand.c (stack_protect_prologue): Try new
> > stack_protect_combined_set pattern first.
> > * function.c (stack_protect_epilogue): Try new
> > stack_protect_combined_test pattern first.
> > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > parameters to control which register to use as PIC register and force
> > reloading PIC register respectively.  Insert in the stream of insns if
> > possible.
> > (legitimize_pic_address): Expose above new parameters in prototype and
> > adapt recursive calls accordingly.  Use pic_reg if non null instead of
> > cached one.
> > (arm_load_pic_register): Add pic_reg parameter and use it if non null.
> > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > prototype.
> > (thumb_legitimize_address): Likewise.
> > (arm_emit_call_insn): Adapt to require_pic_register prototype change.
> > (arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
> > (thumb1_expand_prologue): Likewise.
> > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > change.
> > (arm_load_pic_register): Likewise.
> > * config/arm/predicated.md (guard_addr_operand): New predicate.
> > (guard_operand): New predicate.
> > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> > prototype change.
> > (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
> > prototype change.
> > (stack_protect_combined_set): New expander..
> > (stack_protect_combined_set_insn): New insn_and_split pattern.
> > (stack_protect_set_insn): New insn pattern.
> > (stack_protect_combined_test): New expander.
> > (stack_protect_combined_test_insn): New insn_and_split pattern.
> > (arm_stack_protect_test_insn): New insn pattern.
> > * config/arm

Re: [PATCH, arm] Backport -- Fix ICE during thunk generation with -mlong-calls

2018-11-08 Thread Ramana Radhakrishnan
On 07/11/2018 17:49, Mihail Ionescu wrote:
> Hi All,
> 
> This is a backport from trunk for GCC 8 and 7.
> 
> SVN revision: r264595.
> 
> Regression tested on arm-none-eabi.
> 
> 
> gcc/ChangeLog
> 
> 2018-11-02  Mihail Ionescu  
> 
>   Backport from mainiline
>   2018-09-26  Eric Botcazou  
> 
>   * config/arm/arm.c (arm_reorg): Skip Thumb reorg pass for thunks.
>   (arm32_output_mi_thunk): Deal with long calls.
> 
> gcc/testsuite/ChangeLog
> 
> 2018-11-02  Mihail Ionescu  
> 
>   Backport from mainiline
>   2018-09-17  Eric Botcazou  
> 
>   * g++.dg/other/thunk2a.C: New test.
>   * g++.dg/other/thunk2b.C: Likewise.
> 
> 
> If everything is ok, could someone commit it on my behalf?
> 
> Best regards,
>  Mihail
> 

It is a regression since my rewrite of this code.

Ok to backport to the release branches, it's been on trunk for a while 
and not shown any issues - please give the release managers a day or so 
to object.

regards
Ramana


Re: [PATCH] [aarch64] Correct the maximum shift amount for shifted operands.

2018-11-08 Thread Kyrill Tkachov

Hi Christoph,

On 07/11/18 21:51, christoph.muell...@theobroma-systems.com wrote:

From: Christoph Muellner 

The aarch64 ISA specification allows a left shift amount to be applied
after extension in the range of 0 to 4 (encoded in the imm3 field).



Indeed. That looks correct from my reading of the spec as well (section C6.2.3 
for example).


This is true for at least the following instructions:

 * ADD (extend register)
 * ADDS (extended register)
 * SUB (extended register)

The result of this patch can be seen, when compiling the following code:

uint64_t myadd(uint64_t a, uint64_t b)
{
return a+(((uint8_t)b)<<4);
}

Without the patch the following sequence will be generated:

 :
   0:   d37c1c21 ubfiz   x1, x1, #4, #8
   4:   8b20 add x0, x1, x0
   8:   d65f03c0 ret

With the patch the ubfiz will be merged into the add instruction:

 :
   0:   8b211000 add x0, x0, w1, uxtb #4
   4:   d65f03c0 ret

*** gcc/ChangeLog ***



The patch looks correct to me but can you please clarify how it has been tested?
Patches need to be bootstrapped and the full testsuite run.

I also have a few comments on the ChangeLog


2018-xx-xx  Christoph Muellner 



Two spaces between your name and the email (also, missing an 's' in the email?).


* gcc/config/aarch64/aarch64.c: Correct the maximum shift amount
for shifted operands.


The path should be relative to the ChangeLog location.
Also, please specify the function name being changed.
In this case it should be
* config/aarch64/aarch64.c (aarch64_uxt_size): Correct...


* gcc/testsuite/gcc.target/aarch64/extend.c: Adjust the
testcases to cover the changed shift amount.


This should be in a separate ChangeLog entry as it will go into 
gcc/testsuite/ChangeLog.
The path would be:
* gcc.target/aarch64/extend.c



Signed-off-by: Christoph Muellner 
Signed-off-by: Philipp Tomsich 


GCC doesn't use Signed-off-by tags (though no one will complain about them 
either) but if
Philipp Tomsich wrote any of the code here (which I think is implied by the 
Signed-off-by tag?) then
his name should appear in the ChangeLog entry.

Thanks for the patch!
As I said, the code looks good to me, but you'll need a maintainer to approve it
(I've cc'ed the aarch64 maintainers for you) once the testing is clarified and 
the ChangeLog is fixed.

Kyrill



---
 gcc/config/aarch64/aarch64.c  |  2 +-
 gcc/testsuite/gcc.target/aarch64/extend.c | 16 
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c82c7b6..c85988a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8190,7 +8190,7 @@ aarch64_output_casesi (rtx *operands)
 int
 aarch64_uxt_size (int shift, HOST_WIDE_INT mask)
 {
-  if (shift >= 0 && shift <= 3)
+  if (shift >= 0 && shift <= 4)
 {
   int size;
   for (size = 8; size <= 32; size *= 2)
diff --git a/gcc/testsuite/gcc.target/aarch64/extend.c 
b/gcc/testsuite/gcc.target/aarch64/extend.c
index f399e55..7986c5b 100644
--- a/gcc/testsuite/gcc.target/aarch64/extend.c
+++ b/gcc/testsuite/gcc.target/aarch64/extend.c
@@ -32,8 +32,8 @@ ldr_sxtw0 (char *arr, int i)
 unsigned long long
 adddi_uxtw (unsigned long long a, unsigned int i)
 {
-  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*uxtw #?3" } } */
-  return a + ((unsigned long long)i << 3);
+  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*uxtw #?4" } } */
+  return a + ((unsigned long long)i << 4);
 }

 unsigned long long
@@ -46,8 +46,8 @@ adddi_uxtw0 (unsigned long long a, unsigned int i)
 long long
 adddi_sxtw (long long a, int i)
 {
-  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*sxtw #?3" } } */
-  return a + ((long long)i << 3);
+  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*sxtw #?4" } } */
+  return a + ((long long)i << 4);
 }

 long long
@@ -60,8 +60,8 @@ adddi_sxtw0 (long long a, int i)
 unsigned long long
 subdi_uxtw (unsigned long long a, unsigned int i)
 {
-  /* { dg-final { scan-assembler "sub\tx\[0-9\]+,.*uxtw #?3" } } */
-  return a - ((unsigned long long)i << 3);
+  /* { dg-final { scan-assembler "sub\tx\[0-9\]+,.*uxtw #?4" } } */
+  return a - ((unsigned long long)i << 4);
 }

 unsigned long long
@@ -74,8 +74,8 @@ subdi_uxtw0 (unsigned long long a, unsigned int i)
 long long
 subdi_sxtw (long long a, int i)
 {
-  /* { dg-final { scan-assembler "sub\tx\[0-9\]+,.*sxtw #?3" } } */
-  return a - ((long long)i << 3);
+  /* { dg-final { scan-assembler "sub\tx\[0-9\]+,.*sxtw #?4" } } */
+  return a - ((long long)i << 4);
 }

 long long
--
2.9.5





[PATCH] Fix PR87929

2018-11-08 Thread Richard Biener


Bootstrapped & tested on x86_64-unknown-linux-gnu, applied.

Richard.

>From f3d02105b799975047a51ad80488f9cd928419bf Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Thu, 8 Nov 2018 10:22:09 +0100
Subject: [PATCH] fix-pr87929

2018-11-08  Richard Biener  

PR tree-optimization/87929
* tree-complex.c (expand_complex_comparison): Clean EH.

* gcc.dg/pr87929.c: New testcase.

diff --git a/gcc/testsuite/gcc.dg/pr87929.c b/gcc/testsuite/gcc.dg/pr87929.c
new file mode 100644
index 000..f64f7ada442
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87929.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-fexceptions -fnon-call-exceptions -fsignaling-nans" } */
+
+#define complex __complex__
+#define _Complex_I (1.0iF)
+
+extern void f2c_4d__( complex float *, complex float *);
+extern void abort (void);
+
+void f2c_4c__(void)
+{
+  complex float x,ret_val;
+  x = 1234 + 5678 * _Complex_I;
+  f2c_4d__(&ret_val,&x);
+  if ( x != ret_val ) abort();
+}
diff --git a/gcc/tree-complex.c b/gcc/tree-complex.c
index 49088081bb0..4bf644f9473 100644
--- a/gcc/tree-complex.c
+++ b/gcc/tree-complex.c
@@ -1558,6 +1558,8 @@ expand_complex_comparison (gimple_stmt_iterator *gsi, 
tree ar, tree ai,
 }
 
   update_stmt (stmt);
+  if (maybe_clean_eh_stmt (stmt))
+gimple_purge_dead_eh_edges (gimple_bb (stmt));
 }
 
 /* Expand inline asm that sets some complex SSA_NAMEs.  */


Re: Simplify types of arrays

2018-11-08 Thread Jan Hubicka
> On Wed, 7 Nov 2018, Jan Hubicka wrote:
> 
> > Hi,
> > this patch simplfies types of arrays so we don't propagate duplicates
> > when record/union contains array of pointers.
> > With this we still miss simplification of pointers to arrays of
> > structures (where we need to rebuild array same way as we rebuild
> > pointers) and enumerate types. That should make simplification of types
> > complete. Neither of those two seems very critical for GCC build,
> > with the patch we are down to 24 duplicated types in bootstrap, I will
> > collect data on firefox, but things looks quite good.
> > (from tens of thousdant week ago).
> > 
> > The patch works, but I am somewhat nervous because modyfing type inplace
> > affects its type_hash_canon_hash and friends.
> 
> Indeed - I don't think we want to do this here and this way.  What
> you'd need to do is record uses of a type and replace all uses with
> a simplified copy ...
> 
> > There are pre-existing
> > modifications to function parameters and things seems to just work,
> > but I wonder if we have any strategy on keeping hashes in tree.c
> > consitent across free-lang data? Or are all those hashes unused/freed at
> > this time?
> 
> They are not freed and I don't see why they should.
> 
> > lto-bootstrapped/regtested x86_64-linux.
> 
> Please lets not do this and instead keep the array of pointers
> thing in mind.
> 
> Does it really matter so much?  That is, where are those arrays
> with pointer to complete type referenced from?  Global variables?
> As fields in records they are already fixed up properly, right?

If you have record that contains array of pointers or pointer to array
they are not simplified or made incomplete at the moment.
simplified_type_of looks only on the pointers and references at the
moment.  I have patch doing that from simplified_type_of, so I can give
it testing.

Honza
> 
> Richard.
> 
> > Honza
> > 
> > * tree.c (free_lang_data_in_type): Simplify types of arrays.
> > Index: tree.c
> > ===
> > --- tree.c  (revision 265877)
> > +++ tree.c  (working copy)
> > @@ -5320,6 +5320,8 @@ free_lang_data_in_type (tree type, struc
> >   TREE_PURPOSE (p) = NULL;
> > }
> >  }
> > +  else if (TREE_CODE (type) == ARRAY_TYPE)
> > +TREE_TYPE (type) = fld_simplified_type (TREE_TYPE (type), fld);
> >else if (RECORD_OR_UNION_TYPE_P (type))
> >  {
> >/* Remove members that are not FIELD_DECLs from the field list
> > 
> > 
> 
> -- 
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nuernberg)


Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-11-08 Thread Renlin Li

Hi,

On 11/06/2018 06:58 PM, Jeff Law wrote:

On 11/6/18 3:52 AM, Renlin Li wrote:

Hi Jeff & Peter,

On 11/05/2018 07:41 PM, Jeff Law wrote:

On 11/5/18 12:36 PM, Peter Bergner wrote:

On 11/5/18 1:20 PM, Jeff Law wrote:

On 11/1/18 4:07 PM, Peter Bergner wrote:

On 11/1/18 1:50 PM, Renlin Li wrote:

Is there any update on this issues?
arm-none-linux-gnueabihf native toolchain has been mis-compiled
for a while.


  From the analysis I've done, my commit is just exposing latent issues
in LRA.  Can you try the patch I submitted here to see if it helps?

    https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html

It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
Jeff threw it on his testers and said he saw an arm issue and was
trying to come up with a test case for me to debug.

So I don't think the ARM issues are related to your patch, they may
have
been related the combiner changes that went in around the same time.

Yes, there are issues related to the combiner changes.

But the IRA/LRA change dose cause the arm-none-linux-gnueabihf bootstrap
native toolchain mis-compiled.
And the new patch seems not fix this problem.

That's strange.  I'm bootstrapping arm-linux-gnueabihf daily with qemu +
a suitable root filesystem using Peter's most recent testing patch.




I am trying to extract a test case, but it is a little bit hard as the
toolchain itself is mis-compiled.
And it ICEs when compile test case with it.

What I would suggest doing is to first start with running the testsuite
against the stage1 compiler before/after Peter's changes.  Sometimes
that'll turn up something useful and you can avoid debuging things
through stage2/stage3.


Hi Jeff,
Thanks for the suggestion! I could reproduce it with stage1 compiler.



Hi Peter,

I think I found the problem!

As described in the PR, a hard register is used in
an pre/post modify expression. The hard register is live, but updated.
In this case, we should make it conflicting with all pseudos live at
that point.  Does it make sense?





It fixes the ICE of mis-compiled arm-linux-gnueabihf toolchain described in the
PR.

I attached the patch for discussion.  I haven't give a complete test on arm or
any other targets, yet. (Probably need more adjusting)

I will run arm and aarch64 regression test, cross and native.

Regards,
Renlin

BTW, The pre/post modify expression is generated by auto_inc/auto_dec pass.
somehow, it merges with hard register,  for example function argument registers.
This optimization make the life for RA harder. Probably we don't want that pass
too aggressive. @Wilco.
(This IRA/LRA and the combiner change reveals a lot of issues,
force us to work on it and improve the compiler :) .)

gcc/ChangeLog:

2018-11-08  Renlin Li  
PR middle-end/87899
* lra-lives.c (process_bb_lives): Make hard register of INOUT
type conflict with all live pseudo.










jeff

diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
index 0bf8cd06a302c8a6fcb914b94f953cdaa86597a2..370a7254cac7dbde4e320424e09274cee66c50b9 100644
--- a/gcc/lra-lives.c
+++ b/gcc/lra-lives.c
@@ -878,11 +878,25 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p)
 
   /* See which defined values die here.  */
   for (reg = curr_id->regs; reg != NULL; reg = reg->next)
-	if (reg->type == OP_OUT
-	&& ! reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p)
-	  need_curr_point_incr
-	|= mark_regno_dead (reg->regno, reg->biggest_mode,
-curr_point);
+	if (! reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p)
+	  {
+	if (reg->type == OP_OUT)
+	  need_curr_point_incr
+		|= mark_regno_dead (reg->regno, reg->biggest_mode,
+curr_point);
+
+	// This is a hard register, and it must be live.  Keep it live and
+	// make it conflict with all live pseudo registers.
+	else if (reg->type == OP_INOUT && reg->regno < FIRST_PSEUDO_REGISTER)
+	  {
+		lra_assert (TEST_HARD_REG_BIT (hard_regs_live, reg->regno));
+
+		unsigned int i;
+		EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i)
+		  SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs,
+reg->regno);
+	  }
+	  }
 
   for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_OUT


Re: [EXT] Re: [Driver] Add support for -fuse-ld=lld

2018-11-08 Thread Romain Geissler
On Tue, 6 Nov 2018, H.J. Lu wrote:

> On Sat, Oct 20, 2018 at 3:19 AM Romain Geissler
>  wrote:
> >
> > I would like to raise again the question of supporting -fuse-ld=lld.
> >
>
> LGTM.  But I can't approve it.
>
> --
> H.J.

Hi,

Is there any gcc reviewer that could comment whether or not this small
feature is welcome in gcc ?

Cheers,
Romain


[PR86438] compare-elim: cope with set of in_b

2018-11-08 Thread Alexandre Oliva
When in_a resolves to a register set in the prev_clobber insn, we may
use the SET_SRC for the compare instead.  However, when in_b so
resolves, we proceed to use the reg with its earlier value.  When both
resolve to the same register and prev_clobber is an insn that modifies
the register, this arrangement may cause the compare to match (when it
shouldn't) and the elimination of the compare to incorrectly succeed.

(set (reg 1) (plus (reg 1) (const_int N)))
(set (reg 2) (reg 1))
(set (reg flags) (compare (reg 1) (reg 2)))

in_a: (reg 1)--> (plus (reg 1) (const_int N))
in_b: (reg 2) -> (reg 1) -/> oops

(parallel [
 (set (reg flags) (compare (plus (reg 1) (const_int N))
   (reg 1))) ;; should be (plus...)
 (set (reg 1) (plus (reg 1) (const_int N)))])
(set (reg 2) (reg 1))

This patch arranges for in_b to also undergo SET_SRC substitution
when appropriate, with a shortcut for when in_a and in_b are the same
rtx.

Bootstrapped, now regression-testing, on x86_64- and i686-linux-gnu.  Ok
to install if it passes?


for  gcc/ChangeLog

PR rtl-optimization/86438
* compare-elim.c (try_eliminate_compare): Use SET_SRC instead
of in_b for the compare if in_b is SET_DEST.

for  gcc/testsuite/ChangeLog

PR rtl-optimization/86438
* gcc.dg/torture/pr86438.c: New.
---
 gcc/compare-elim.c |   25 +
 gcc/testsuite/gcc.dg/torture/pr86438.c |   28 
 2 files changed, 45 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr86438.c

diff --git a/gcc/compare-elim.c b/gcc/compare-elim.c
index 50bbaa84b6df..8afbe76c502b 100644
--- a/gcc/compare-elim.c
+++ b/gcc/compare-elim.c
@@ -734,7 +734,7 @@ try_merge_compare (struct comparison *cmp)
 static bool
 try_eliminate_compare (struct comparison *cmp)
 {
-  rtx flags, in_a, in_b, cmp_src;
+  rtx flags, in_a, in_b, cmp_a, cmp_b;
 
   if (try_merge_compare (cmp))
 return true;
@@ -786,7 +786,7 @@ try_eliminate_compare (struct comparison *cmp)
 
   rtx x = XVECEXP (PATTERN (insn), 0, 0);
   if (rtx_equal_p (SET_DEST (x), in_a))
-cmp_src = SET_SRC (x);
+cmp_a = SET_SRC (x);
 
   /* Also check operations with implicit extensions, e.g.:
  [(set (reg:DI)
@@ -800,7 +800,7 @@ try_eliminate_compare (struct comparison *cmp)
   && (GET_CODE (SET_SRC (x)) == ZERO_EXTEND
   || GET_CODE (SET_SRC (x)) == SIGN_EXTEND)
   && GET_MODE (XEXP (SET_SRC (x), 0)) == GET_MODE (in_a))
-cmp_src = XEXP (SET_SRC (x), 0);
+cmp_a = XEXP (SET_SRC (x), 0);
 
   /* Also check fully redundant comparisons, e.g.:
  [(set (reg:SI)
@@ -811,7 +811,7 @@ try_eliminate_compare (struct comparison *cmp)
   && GET_CODE (SET_SRC (x)) == MINUS
   && rtx_equal_p (XEXP (SET_SRC (x), 0), in_a)
   && rtx_equal_p (XEXP (SET_SRC (x), 1), in_b))
-cmp_src = in_a;
+cmp_a = in_a;
 
   else
 return false;
@@ -819,17 +819,26 @@ try_eliminate_compare (struct comparison *cmp)
   /* If the source uses addressing modes with side effects, we can't
  do the merge because we'd end up with a PARALLEL that has two
  instances of that side effect in it.  */
-  if (side_effects_p (cmp_src))
+  if (side_effects_p (cmp_a))
+return false;
+
+  if (in_a == in_b)
+cmp_b = cmp_a;
+  else if (rtx_equal_p (SET_DEST (x), in_b))
+cmp_b = SET_SRC (x);
+  else
+cmp_b = in_b;
+  if (side_effects_p (cmp_b))
 return false;
 
   /* Determine if we ought to use a different CC_MODE here.  */
-  flags = maybe_select_cc_mode (cmp, cmp_src, in_b);
+  flags = maybe_select_cc_mode (cmp, cmp_a, cmp_b);
   if (flags == NULL)
 flags = gen_rtx_REG (cmp->orig_mode, targetm.flags_regnum);
 
   /* Generate a new comparison for installation in the setter.  */
-  rtx y = copy_rtx (cmp_src);
-  y = gen_rtx_COMPARE (GET_MODE (flags), y, in_b);
+  rtx y = copy_rtx (cmp_a);
+  y = gen_rtx_COMPARE (GET_MODE (flags), y, copy_rtx (cmp_b));
   y = gen_rtx_SET (flags, y);
 
   /* Canonicalize instruction to:
diff --git a/gcc/testsuite/gcc.dg/torture/pr86438.c 
b/gcc/testsuite/gcc.dg/torture/pr86438.c
new file mode 100644
index ..d4a23f313a55
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr86438.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+typedef unsigned int u32;
+typedef unsigned long long u64;
+#if __SIZEOF_INT128__
+typedef unsigned __int128 u128;
+#else
+typedef u64 u128;
+#endif
+
+u128 g;
+
+static __attribute__ ((noinline, noclone))
+void check (u64 a, u64 b)
+{
+  if (a != 0 || b != 4)
+__builtin_abort ();
+}
+
+int
+main (void)
+{
+  u64 d = (g ? 5 : 4);
+  u32 f = __builtin_sub_overflow_p (d, (u128) d, (u64) 0);
+  u128 x = g + f + d;
+  check ((x >> 32) >> 32, x);
+  return 0;
+}

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist
Hay q

[RFC][PATCH]Merge VEC_COND_EXPR into MASK_STORE after loop vectorization

2018-11-08 Thread Renlin Li

Hi all,

When allow-store-data-races is enabled, ifcvt would prefer to generated
conditional select and unconditional store to convert certain if statement
into:

_ifc_1 = val
_ifc_2 = A[i]
val = cond? _ifc_1 : _ifc_2
A[i] = val

When the loop got vectorized, this pattern will be turned into
MASK_LOAD, VEC_COND_EXPR and MASK_STORE. This could be improved.

The change here add a post processing function to combine the VEC_COND_EXPR
expression into MASK_STORE, and delete related dead code.

I am a little bit conservative here.
I didn't change the default behavior of ifcvt to always generate MASK_STORE,
although it should be safe in all cases (allow or dis-allow store data race).

MASK_STORE might not well handled in vectorization pass compared with
conditional select. It might be too early and aggressive to do that in ifcvt.
And the performance of MASK_STORE might not good for some platforms.
(We could add --param or target hook to differentiate this ifcvt behavior
on different platforms)

Another reason I did not do that in ifcvt is the data reference
analysis. To create a MASK_STORE, a pointer is created as the first
argument to the internal function call. If the pointer is created out of
array references, e.g. x = &A[i], data reference analysis could not properly
analysis the relationship between MEM_REF (x) and ARRAY_REF (A, i). This
will create a versioned loop beside the vectorized one.
(I have hacks to look through the MEM_REF, and restore the reference back to
ARRAY_REF (A, i).  Maybe we could do analysis on lowered address expression?
I saw we have gimple laddress pass to aid the vectorizer)

The approach here comes a little bit late, on the condition that vector
MASK_STORE is generated by loop vectorizer. In this case, it is definitely
beneficial to do the code transformation.

Any thoughts on the best way to fix the issue?


This patch has been tested with aarch64-none-elf, no regressions.

Regards,
Renlin

gcc/ChangeLog:

2018-11-08  Renlin Li  

* tree-vectorizer.h (combine_sel_mask_store): Declare new function.
* tree-vect-loop.c (combine_sel_mask_store): Define new function.
* tree-vectorizer.c (vectorize_loops): Call it.

gcc/testsuite/ChangeLog:

2018-11-08  Renlin Li  

* gcc.target/aarch64/sve/combine_vcond_mask_store_1.c: New.

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/combine_vcond_mask_store_1.c b/gcc/testsuite/gcc.target/aarch64/sve/combine_vcond_mask_store_1.c
new file mode 100644
index ..64f6b7b00f58ee45bd4a2f91c1a9404911f1a09f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/combine_vcond_mask_store_1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize --param allow-store-data-races=1 -fdump-tree-vect-details" } */
+
+void test ()
+{
+  static int array[100];
+  for (unsigned i = 1; i < 16; ++i)
+{
+  int a = array[i];
+  if (a & 1)
+	array[i] = a + 1;
+  if (array[i] > 10)
+	array[i] = a + 2;
+}
+}
+
+/* { dg-final { scan-tree-dump-times "Combining VEC_COND_EXPR and MASK_STORE" 1 "vect" } } */
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 177b284e9c617a41c33d1387ba5afbed51d8ed00..9e1a167d03ea5bf640e58b3426d42b4e3c74da56 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -8539,6 +8539,166 @@ vect_transform_loop (loop_vec_info loop_vinfo)
   return epilogue;
 }
 
+/*
+   When allow-store-data-races=1, if-conversion will convert certain if
+   statements into:
+   A[i] = cond ? val : A[i].
+   If the loop is successfully vectorized,
+   MASK_LOAD + VEC_COND_EXPR + MASK_STORE will be generated.
+
+   This pattern could be combined into a single MASK_STORE with new mask.
+   The new mask is the combination of original mask and the value selection mask
+   in VEC_COND_EXPR.
+
+   After the transformation, the MASK_LOAD and VEC_COND_EXPR might be dead.  */
+
+void
+combine_sel_mask_store (struct loop *loop)
+{
+  basic_block *bbs = get_loop_body (loop);
+  unsigned nbbs = loop->num_nodes;
+  unsigned i;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+
+  vect_location = find_loop_location (loop);
+  for (i = 0; i < nbbs; i++)
+{
+  bb = bbs[i];
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+	   gsi_next (&gsi))
+	{
+	  gimple *mask_store = gsi_stmt (gsi);
+	  if (!gimple_call_internal_p (mask_store, IFN_MASK_STORE))
+	continue;
+
+	  /*
+	 X = MASK_LOAD (PTR, -, MASK)
+	 VAL = ...
+	 Y = VEC_COND (cond, VAL, X)
+	 MASK_STORE (PTR, -, MASK, Y)
+	  */
+	  tree vec_op = gimple_call_arg (mask_store, 3);
+	  tree store_mask = gimple_call_arg (mask_store, 2);
+	  if (TREE_CODE (vec_op) == SSA_NAME)
+	{
+	  gimple *def = SSA_NAME_DEF_STMT (vec_op);
+	  gassign *assign = dyn_cast  (def);
+	  if (!assign || gimple_assign_rhs_code (assign) != VEC_COND_EXPR)
+		continue;
+
+	  tree sel_cond = gimple_assign_rhs1 (assign);
+	  tree true_val = gimple_assign_rhs2 (assign);
+	

Re: [PATCH 1/4] cgraph: add selftest::symbol_table_test

2018-11-08 Thread Richard Biener
On Wed, Nov 7, 2018 at 5:22 PM David Malcolm  wrote:
>
> This patch adds a selftest fixture for overriding the "symtab" global,
> so that selftests involving symtab nodes can be isolated from each
> other: each selftest can have its own symbol_table instance.
>
> In particular, this ensures that nodes can have a predictable "order"
> and thus predictable dump names within selftests.
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, in
> conjunction with the rest of the patch kit.
>
> OK for trunk?

OK.

> gcc/ChangeLog:
> * cgraph.c: Include "selftest.h".
> (saved_symtab): New variable.
> (selftest::symbol_table_test::symbol_table_test): New ctor.
> (selftest::symbol_table_test::~symbol_table_test): New dtor.
> (selftest::test_symbol_table_test): New test.
> (selftest::cgraph_c_tests): New.
> * cgraph.h (saved_symtab): New decl.
> (selftest::symbol_table_test): New class.
> * selftest-run-tests.c (selftest::run_tests): Call
> selftest::cgraph_c_tests.
> * selftest.h (selftest::cgraph_c_tests): New decl.
> ---
>  gcc/cgraph.c | 67 
> 
>  gcc/cgraph.h | 23 +
>  gcc/selftest-run-tests.c |  1 +
>  gcc/selftest.h   |  1 +
>  4 files changed, 92 insertions(+)
>
> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index b432f7e..b3dd429 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimplify.h"
>  #include "stringpool.h"
>  #include "attribs.h"
> +#include "selftest.h"
>
>  /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this. 
>  */
>  #include "tree-pass.h"
> @@ -3765,4 +3766,70 @@ cgraph_edge::sreal_frequency ()
>: caller->count);
>  }
>
> +/* A stashed copy of "symtab" for use by selftest::symbol_table_test.
> +   This needs to be a global so that it can be a GC root, and thus
> +   prevent the stashed copy from being garbage-collected if the GC runs
> +   during a symbol_table_test.  */
> +
> +symbol_table *saved_symtab;
> +
> +#if CHECKING_P
> +
> +namespace selftest {
> +
> +/* class selftest::symbol_table_test.  */
> +
> +/* Constructor.  Store the old value of symtab, and create a new one.  */
> +
> +symbol_table_test::symbol_table_test ()
> +{
> +  gcc_assert (saved_symtab == NULL);
> +  saved_symtab = symtab;
> +  symtab = new (ggc_cleared_alloc  ()) symbol_table ();
> +}
> +
> +/* Destructor.  Restore the old value of symtab.  */
> +
> +symbol_table_test::~symbol_table_test ()
> +{
> +  gcc_assert (saved_symtab != NULL);
> +  symtab = saved_symtab;
> +  saved_symtab = NULL;
> +}
> +
> +/* Verify that symbol_table_test works.  */
> +
> +static void
> +test_symbol_table_test ()
> +{
> +  /* Simulate running two selftests involving symbol tables.  */
> +  for (int i = 0; i < 2; i++)
> +{
> +  symbol_table_test stt;
> +  tree test_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL,
> +  get_identifier ("test_decl"),
> +  build_function_type_list (void_type_node,
> +NULL_TREE));
> +  cgraph_node *node = cgraph_node::get_create (test_decl);
> +  gcc_assert (node);
> +
> +  /* Verify that the node has order 0 on both iterations,
> +and thus that nodes have predictable dump names in selftests.  */
> +  ASSERT_EQ (node->order, 0);
> +  ASSERT_STREQ (node->dump_name (), "test_decl/0");
> +}
> +}
> +
> +/* Run all of the selftests within this file.  */
> +
> +void
> +cgraph_c_tests ()
> +{
> +  test_symbol_table_test ();
> +}
> +
> +} // namespace selftest
> +
> +#endif /* CHECKING_P */
> +
>  #include "gt-cgraph.h"
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index 71c5453..d326866 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -3350,4 +3350,27 @@ xstrdup_for_dump (const char *transient_str)
>return ggc_strdup (transient_str);
>  }
>
> +extern GTY(()) symbol_table *saved_symtab;
> +
> +#if CHECKING_P
> +
> +namespace selftest {
> +
> +/* An RAII-style class for use in selftests for temporarily using a different
> +   symbol_table, so that such tests can be isolated from each other.  */
> +
> +class symbol_table_test
> +{
> + public:
> +  /* Constructor.  Override "symtab".  */
> +  symbol_table_test ();
> +
> +  /* Constructor.  Restore the saved_symtab.  */
> +  ~symbol_table_test ();
> +};
> +
> +} // namespace selftest
> +
> +#endif /* CHECKING_P */
> +
>  #endif  /* GCC_CGRAPH_H  */
> diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
> index 562ada7..6d65d24 100644
> --- a/gcc/selftest-run-tests.c
> +++ b/gcc/selftest-run-tests.c
> @@ -73,6 +73,7 @@ selftest::run_tests ()
>unique_ptr_tests_cc_tests ();
>opt_proposer_c_tests ();
>json_cc_tests ();
> +  cgraph_c_tests ();
>op

Re: [PATCH 3/4] support %f in pp_format

2018-11-08 Thread Richard Biener
On Wed, Nov 7, 2018 at 5:22 PM David Malcolm  wrote:
>
> Numerous formatted messages from the inliner use %f, mostly as %f, but
> occasionally with length modifiers.
>
> This patch implements the simplest case of "%f" for pp_format (with no
> modifier support) to make it easier to port these messages from fprintf
> to dump_printf_loc.
>
> The selftest has an assertion that %f on 1.0 is printed as "1.00".
> This comes from the host's sprintf, and I believe this is guaranteed by
> POSIX: "If the precision is missing, it shall be taken as 6".  If this is
> an issue I can drop the selftest.
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, in
> conjunction with the rest of the patch kit.
>
> OK for trunk?

OK.

> gcc/c-family/ChangeLog:
> * c-format.c (gcc_dump_printf_char_table): Add entry for %f.
>
> gcc/ChangeLog:
> * pretty-print.c (pp_format): Handle %f.
> (selftest::test_pp_format): Add test of %f.
> * pretty-print.h (pp_double): New macro.
>
> gcc/testsuite/ChangeLog:
> * gcc.dg/format/gcc_diag-10.c: Add coverage for %f.
> ---
>  gcc/c-family/c-format.c   | 3 +++
>  gcc/pretty-print.c| 6 ++
>  gcc/pretty-print.h| 1 +
>  gcc/testsuite/gcc.dg/format/gcc_diag-10.c | 2 ++
>  4 files changed, 12 insertions(+)
>
> diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
> index 385ee1a..8d91a77 100644
> --- a/gcc/c-family/c-format.c
> +++ b/gcc/c-family/c-format.c
> @@ -810,6 +810,9 @@ static const format_char_info 
> gcc_dump_printf_char_table[] =
>/* T requires a "tree" at runtime.  */
>{ "T",   1, STD_C89, { T89_T,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   
> NULL },
>
> +  /* %f requires a "double"; it doesn't support modifiers.  */
> +  { "f",   0, STD_C89, { T89_D,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   
> NULL },
> +
>{ NULL,  0, STD_C89, NOLENGTHS, NULL, NULL, NULL }
>  };
>
> diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
> index 7dd900b..19ef75b 100644
> --- a/gcc/pretty-print.c
> +++ b/gcc/pretty-print.c
> @@ -977,6 +977,7 @@ pp_indent (pretty_printer *pp)
> %ld, %li, %lo, %lu, %lx: long versions of the above.
> %lld, %lli, %llo, %llu, %llx: long long versions.
> %wd, %wi, %wo, %wu, %wx: HOST_WIDE_INT versions.
> +   %f: double
> %c: character.
> %s: string.
> %p: pointer (printed in a host-dependent manner).
> @@ -1307,6 +1308,10 @@ pp_format (pretty_printer *pp, text_info *text)
>   (pp, *text->args_ptr, precision, unsigned, "u");
>   break;
>
> +   case 'f':
> + pp_double (pp, va_arg (*text->args_ptr, double));
> + break;
> +
> case 'Z':
>   {
> int *v = va_arg (*text->args_ptr, int *);
> @@ -2160,6 +2165,7 @@ test_pp_format ()
>ASSERT_PP_FORMAT_2 ("17 12345678", "%wo %x", (HOST_WIDE_INT)15, 
> 0x12345678);
>ASSERT_PP_FORMAT_2 ("0xcafebabe 12345678", "%wx %x", 
> (HOST_WIDE_INT)0xcafebabe,
>   0x12345678);
> +  ASSERT_PP_FORMAT_2 ("1.00 12345678", "%f %x", 1.0, 0x12345678);
>ASSERT_PP_FORMAT_2 ("A 12345678", "%c %x", 'A', 0x12345678);
>ASSERT_PP_FORMAT_2 ("hello world 12345678", "%s %x", "hello world",
>   0x12345678);
> diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
> index 2decc51..a6e60f1 100644
> --- a/gcc/pretty-print.h
> +++ b/gcc/pretty-print.h
> @@ -330,6 +330,7 @@ pp_get_prefix (const pretty_printer *pp) { return 
> pp->prefix; }
>pp_string (PP, pp_buffer (PP)->digit_buffer);\
>  }  \
>while (0)
> +#define pp_double(PP, F)   pp_scalar (PP, "%f", F)
>  #define pp_pointer(PP, P)  pp_scalar (PP, "%p", P)
>
>  #define pp_identifier(PP, ID)  pp_string (PP, (pp_translate_identifiers (PP) 
> \
> diff --git a/gcc/testsuite/gcc.dg/format/gcc_diag-10.c 
> b/gcc/testsuite/gcc.dg/format/gcc_diag-10.c
> index 97a1993..ba2629b 100644
> --- a/gcc/testsuite/gcc.dg/format/gcc_diag-10.c
> +++ b/gcc/testsuite/gcc.dg/format/gcc_diag-10.c
> @@ -183,4 +183,6 @@ void test_dump (tree t, gimple *stmt, cgraph_node *node)
>dump ("%T", t);
>dump ("%G", stmt);
>dump ("%C", node);
> +  dump ("%f", 1.0);
> +  dump ("%4.2f", 1.0); /* { dg-warning "format" } */
>  }
> --
> 1.8.5.3
>


Re: [PATCH 2/4] dump_printf: add "%C" for dumping cgraph_node *

2018-11-08 Thread Richard Biener
On Wed, Nov 7, 2018 at 5:22 PM David Malcolm  wrote:
>
> This patch implements support for %C in dump_printf for dumping
> cgraph_node *.
> (I would have preferred to have a code for printing symtab_node *
> and both subclasses, but there doesn't seem to be a good way for
> -Wformat to handle inheritance, so, failing that, I went with
> this approach).
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, in
> conjunction with the rest of the patch kit.
>
> OK for trunk?

OK

> gcc/c-family/ChangeLog:
> * c-format.c (local_cgraph_node_ptr_node): New variable.
> (gcc_dump_printf_char_table): Add entry for %C.
> (get_pointer_to_named_type): New function, taken from the handling
> code for "gimple *" from...
> (init_dynamic_diag_info): ...here.  Add handling for
> "cgraph_node *".
> * c-format.h (T_CGRAPH_NODE): New.
>
> gcc/ChangeLog:
> * dump-context.h (ASSERT_IS_CGRAPH_NODE): New macro.
> * dumpfile.c (make_item_for_dump_cgraph_node): Move to before...
> (dump_pretty_printer::decode_format): Implement "%C" for
> cgraph_node *.
> (selftest::test_capture_of_dump_calls): Rename "where" to
> "stmt_loc".  Convert test_decl to a function decl and set its
> location.  Add a symbol_table_test RAII instance and a
> cgraph_node, using it to test "%C" and dump_symtab_node.
>
> gcc/testsuite/ChangeLog:
> * gcc.dg/format/gcc_diag-10.c (cgraph_node): New typedef.
> (test_dump): Add testing of %C.
> ---
>  gcc/c-family/c-format.c   |  56 ++-
>  gcc/c-family/c-format.h   |   1 +
>  gcc/dump-context.h|   8 +++
>  gcc/dumpfile.c| 115 
> ++
>  gcc/testsuite/gcc.dg/format/gcc_diag-10.c |   5 +-
>  5 files changed, 135 insertions(+), 50 deletions(-)
>
> diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
> index a1133c7..385ee1a 100644
> --- a/gcc/c-family/c-format.c
> +++ b/gcc/c-family/c-format.c
> @@ -60,6 +60,7 @@ struct function_format_info
>  /* Initialized in init_dynamic_diag_info.  */
>  static GTY(()) tree local_tree_type_node;
>  static GTY(()) tree local_gimple_ptr_node;
> +static GTY(()) tree local_cgraph_node_ptr_node;
>  static GTY(()) tree locus;
>
>  static bool decode_format_attr (tree, function_format_info *, int);
> @@ -803,6 +804,9 @@ static const format_char_info 
> gcc_dump_printf_char_table[] =
>/* E and G require a "gimple *" argument at runtime.  */
>{ "EG",   1, STD_C89, { T89_G,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   
> NULL },
>
> +  /* C requires a "cgraph_node *" argument at runtime.  */
> +  { "C",   1, STD_C89, { T_CGRAPH_NODE,   BADLEN,  BADLEN,  BADLEN,  BADLEN, 
>  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   
> NULL },
> +
>/* T requires a "tree" at runtime.  */
>{ "T",   1, STD_C89, { T89_T,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   
> NULL },
>
> @@ -3879,6 +3883,33 @@ init_dynamic_gfc_info (void)
>  }
>  }
>
> +/* Lookup the type named NAME and return a pointer-to-NAME type if found.
> +   Otherwise, return void_type_node if NAME has not been used yet, or 
> NULL_TREE if
> +   NAME is not a type (issuing an error).  */
> +
> +static tree
> +get_pointer_to_named_type (const char *name)
> +{
> +  tree result;
> +  if ((result = maybe_get_identifier (name)))
> +{
> +  result = identifier_global_value (result);
> +  if (result)
> +   {
> + if (TREE_CODE (result) != TYPE_DECL)
> +   {
> + error ("%qs is not defined as a type", name);
> + result = NULL_TREE;
> +   }
> + else
> +   result = TREE_TYPE (result);
> +   }
> +}
> +  else
> +result = void_type_node;
> +  return result;
> +}
> +
>  /* Determine the types of "tree" and "location_t" in the code being
> compiled for use in GCC's diagnostic custom format attributes.  You
> must have set dynamic_format_types before calling this function.  */
> @@ -3932,25 +3963,12 @@ init_dynamic_diag_info (void)
>/* Similar to the above but for gimple*.  */
>if (!local_gimple_ptr_node
>|| local_gimple_ptr_node == void_type_node)
> -{
> -  if ((local_gimple_ptr_node = maybe_get_identifier ("gimple")))
> -   {
> - local_gimple_ptr_node
> -   = identifier_global_value (local_gimple_ptr_node);
> - if (local_gimple_ptr_node)
> -   {
> - if (TREE_CODE (local_gimple_ptr_node) != TYPE_DECL)
> -   {
> - error ("% is not defined as a type");
> - local_gimple_ptr_node = 0;
> -   }
> - else
> -   local_gimple_ptr_node = TREE_TYPE (local_g

Re: [PATCH 4/4] ipa-inline.c/tree-inline.c: port from fprintf to dump API (PR ipa/86395)

2018-11-08 Thread Richard Biener
On Wed, Nov 7, 2018 at 5:22 PM David Malcolm  wrote:
>
> This patch ports various fprintf calls in the inlining code to using
> the dump API, using the %C format code for printing cgraph_node *.
> I focused on the dump messages that seemed most significant to
> end-users; I didn't port all of the calls.
>
> Doing so makes this information appear in -fopt-info and in
> optimization records, rather than just in the dump_file.
>
> It also changes the affected dumpfile-dumps from being unconditional
> (assuming the dump_file is enabled) to being guarded by the MSG_*
> status.  Hence various tests with dg-final scan-*-dump directives
> need to gain "-all" or "-optimized" suffixes to -fdump-ipa-inline.
>
> The use of %C throughout also slightly changes the dump format for
> several messages, e.g. changing:
>
>  Inlining void inline_me(char*) into int main(int, char**).
>
> to:
>
> ../../src/gcc/testsuite/g++.dg/tree-ssa/inline-1.C:13:8: optimized:  Inlining 
> void inline_me(char*)/0 into int main(int, char**)/2.
>
> amongst other things adding "/order" suffixes to the cgraph node
> names.
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, in
> conjunction with the rest of the patch kit.
>
> OK for trunk?

OK.

Thanks for doing this!
Richard.

> gcc/ChangeLog:
> PR ipa/86395
> * doc/invoke.texi (-fdump-ipa-): Document the "-optimized",
> "-missed", "-note", and "-all" sub-options.
> * ipa-inline.c (caller_growth_limits): Port from fprintf to dump
> API.
> (can_early_inline_edge_p): Likewise.
> (want_early_inline_function_p): Likewise.
> (want_inline_self_recursive_call_p): Likewise.
> (recursive_inlining): Likewise.
> (inline_small_functions): Likewise.
> (flatten_function): Likewise.
> (ipa_inline): Likewise.
> (inline_always_inline_functions): Likewise.
> (early_inline_small_functions): Likewise.
> (early_inliner): Likewise.
> * tree-inline.c (expand_call_inline): Likewise.
>
> gcc/testsuite/ChangeLog:
> PR ipa/86395
> * g++.dg/ipa/devirt-12.C: Add "-all" suffix to
> "-fdump-ipa-inline".
> * g++.dg/ipa/imm-devirt-1.C: Add "-optimized" suffix to
> "-fdump-tree-einline".
> * g++.dg/tree-prof/inline_mismatch_args.C: Add "-all" suffix to
> "-fdump-tree-einline".
> * g++.dg/tree-ssa/inline-1.C: Add "-optimized" suffix to
> "-fdump-tree-einline".
> * g++.dg/tree-ssa/inline-2.C: Likewise.
> * g++.dg/tree-ssa/inline-3.C: Likewise.
> * g++.dg/tree-ssa/inline-4.C: New test, based on inline-1.C, but
> using "-fopt-info-inline".
> * gcc.dg/ipa/fopt-info-inline-1.c: New test.
> * gcc.dg/ipa/inline-4.c:  Add "-all" suffix to
> "-fdump-ipa-inline".  Add "-fopt-info-inline" and dg-optimized
> directive.
> * gcc.dg/ipa/inline-7.c: Add "-optimized" suffix to
> "-fdump-tree-einline".  Add "-fopt-info-inline" and dg-optimized
> directive.  Update scan-tree-dump-times to reflect /order
> suffixes.
> * gcc.dg/ipa/inlinehint-4.c: Update scan-tree-dump-times to
> reflect /order suffixes.
> * gcc.dg/plugin/dump-1.c: Add "-loop" to "-fopt-info-note" to
> avoid getting extra messages from inliner.
> * gcc.dg/plugin/dump-2.c: Likewise.
> * gcc.dg/pr26570.c: Add dg-prune-output to ignore new
> "function body not available" missed optimization messages.
> * gcc.dg/pr71969-2.c: Update scan-tree-dump-times to reflect
> /order suffixes.
> * gcc.dg/pr71969-3.c: Likewise.
> * gcc.dg/tree-ssa/inline-11.c: Add "-all" suffix to
> "-fdump-tree-einline".
> * gcc.dg/tree-ssa/inline-3.c: Add "-optimized" suffix to
> "-fdump-tree-einline".  Update scan-tree-dump-times to reflect
> /order suffixes.
> * gcc.dg/tree-ssa/inline-4.c: Add "-optimized" suffix to
> "-fdump-tree-einline".  Add "-fopt-info-inline" and dg-optimized
> directive.
> * gcc.dg/tree-ssa/inline-8.c: Add "-optimized" suffix to
> "-fdump-tree-einline".
> * gfortran.dg/pr79966.f90: Update scan-ipa-dump to reflect /order
> suffixes.
> ---
>  gcc/doc/invoke.texi|  13 ++
>  gcc/ipa-inline.c   | 191 
> +++--
>  gcc/testsuite/g++.dg/ipa/devirt-12.C   |   2 +-
>  gcc/testsuite/g++.dg/ipa/imm-devirt-1.C|   2 +-
>  .../g++.dg/tree-prof/inline_mismatch_args.C|   2 +-
>  gcc/testsuite/g++.dg/tree-ssa/inline-1.C   |   2 +-
>  gcc/testsuite/g++.dg/tree-ssa/inline-2.C   |   2 +-
>  gcc/testsuite/g++.dg/tree-ssa/inline-3.C   |   2 +-
>  gcc/testsuite/g++.dg/tree-ssa/inline-4.C   |  32 
>  gcc/testsuite/gcc.dg/ipa/fopt-info-inline-1.c  |  44 +
>  gcc/testsuite/gcc.dg/ipa/i

Re: [PATCH] Remove unreachable nodes before IPA profile pass (PR ipa/87706).

2018-11-08 Thread Martin Liška
On 11/8/18 12:19 PM, Jan Hubicka wrote:
>> Hi.
>>
>> In order to fix the warnings mentioned in the PR, we need
>> to run remove_unreachable_nodes after early tree passes. That's
>> however possible only within a IPA pass. Thus I'm calling that
>> before the profile PASS.
>>
>> Patch survives regression tests on ppc64le-linux-gnu and majority
>> of warnings are gone in profiledbootstrap.
>>
>> Ready for trunk?
> 
> I think we want to do that even with no -fprofile-generate because the
> unreachable code otherwise goes into all other IPA passes for no good
> reason.  So perhaps adding it as todo after the early optimization
> metapass?

That fails due to gcc_assert.
So one needs:

diff --git a/gcc/passes.c b/gcc/passes.c
index d838d909941..be92a2f3be3 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -485,7 +485,7 @@ const pass_data pass_data_all_early_optimizations =
   0, /* properties_provided */
   0, /* properties_destroyed */
   0, /* todo_flags_start */
-  0, /* todo_flags_finish */
+  TODO_remove_functions | TODO_rebuild_cgraph_edges, /* todo_flags_finish */
 };
 
 class pass_all_early_optimizations : public gimple_opt_pass
@@ -1989,7 +1989,8 @@ execute_todo (unsigned int flags)
  of IPA pass queue.  */
   if (flags & TODO_remove_functions)
 {
-  gcc_assert (!cfun);
+  gcc_assert (!cfun
+ || strcmp (current_pass->name, "early_optimizations") == 0);
   symtab->remove_unreachable_nodes (dump_file);
 }
 

Or do you prefer to a new pass_remove_functions pass that will be added after
pass_local_optimization_passes ?

Martin

> 
> Honza
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-11-08  Martin Liska  
>>
>>  * tree-profile.c: Run TODO_remove_functions before "profile"
>>  pass in order to remove dead functions that will trigger
>>  -Wmissing-profile.
>> ---
>>  gcc/tree-profile.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>>
> 
>> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
>> index d8f2a3b1ba4..c14ebc556a6 100644
>> --- a/gcc/tree-profile.c
>> +++ b/gcc/tree-profile.c
>> @@ -776,7 +776,7 @@ const pass_data pass_data_ipa_tree_profile =
>>0, /* properties_required */
>>0, /* properties_provided */
>>0, /* properties_destroyed */
>> -  0, /* todo_flags_start */
>> +  TODO_remove_functions, /* todo_flags_start */
>>TODO_dump_symtab, /* todo_flags_finish */
>>  };
>>  
>>
> 



Re: [PATCH] Remove unreachable nodes before IPA profile pass (PR ipa/87706).

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 12:39 PM Martin Liška  wrote:
>
> On 11/8/18 12:19 PM, Jan Hubicka wrote:
> >> Hi.
> >>
> >> In order to fix the warnings mentioned in the PR, we need
> >> to run remove_unreachable_nodes after early tree passes. That's
> >> however possible only within a IPA pass. Thus I'm calling that
> >> before the profile PASS.
> >>
> >> Patch survives regression tests on ppc64le-linux-gnu and majority
> >> of warnings are gone in profiledbootstrap.
> >>
> >> Ready for trunk?
> >
> > I think we want to do that even with no -fprofile-generate because the
> > unreachable code otherwise goes into all other IPA passes for no good
> > reason.  So perhaps adding it as todo after the early optimization
> > metapass?
>
> That fails due to gcc_assert.
> So one needs:
>
> diff --git a/gcc/passes.c b/gcc/passes.c
> index d838d909941..be92a2f3be3 100644
> --- a/gcc/passes.c
> +++ b/gcc/passes.c
> @@ -485,7 +485,7 @@ const pass_data pass_data_all_early_optimizations =
>0, /* properties_provided */
>0, /* properties_destroyed */
>0, /* todo_flags_start */
> -  0, /* todo_flags_finish */
> +  TODO_remove_functions | TODO_rebuild_cgraph_edges, /* todo_flags_finish */
>  };
>
>  class pass_all_early_optimizations : public gimple_opt_pass
> @@ -1989,7 +1989,8 @@ execute_todo (unsigned int flags)
>   of IPA pass queue.  */
>if (flags & TODO_remove_functions)
>  {
> -  gcc_assert (!cfun);
> +  gcc_assert (!cfun
> + || strcmp (current_pass->name, "early_optimizations") == 0);
>symtab->remove_unreachable_nodes (dump_file);
>  }
>
>
> Or do you prefer to a new pass_remove_functions pass that will be added after
> pass_local_optimization_passes ?

Can you make it todo_flags_start of pass_ipa_tree_profile instead?

> Martin
>
> >
> > Honza
> >> Thanks,
> >> Martin
> >>
> >> gcc/ChangeLog:
> >>
> >> 2018-11-08  Martin Liska  
> >>
> >>  * tree-profile.c: Run TODO_remove_functions before "profile"
> >>  pass in order to remove dead functions that will trigger
> >>  -Wmissing-profile.
> >> ---
> >>  gcc/tree-profile.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >>
> >
> >> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
> >> index d8f2a3b1ba4..c14ebc556a6 100644
> >> --- a/gcc/tree-profile.c
> >> +++ b/gcc/tree-profile.c
> >> @@ -776,7 +776,7 @@ const pass_data pass_data_ipa_tree_profile =
> >>0, /* properties_required */
> >>0, /* properties_provided */
> >>0, /* properties_destroyed */
> >> -  0, /* todo_flags_start */
> >> +  TODO_remove_functions, /* todo_flags_start */
> >>TODO_dump_symtab, /* todo_flags_finish */
> >>  };
> >>
> >>
> >
>


Re: [PATCH] Remove unreachable nodes before IPA profile pass (PR ipa/87706).

2018-11-08 Thread Jan Hubicka
> On Thu, Nov 8, 2018 at 12:39 PM Martin Liška  wrote:
> >
> > On 11/8/18 12:19 PM, Jan Hubicka wrote:
> > >> Hi.
> > >>
> > >> In order to fix the warnings mentioned in the PR, we need
> > >> to run remove_unreachable_nodes after early tree passes. That's
> > >> however possible only within a IPA pass. Thus I'm calling that
> > >> before the profile PASS.
> > >>
> > >> Patch survives regression tests on ppc64le-linux-gnu and majority
> > >> of warnings are gone in profiledbootstrap.
> > >>
> > >> Ready for trunk?
> > >
> > > I think we want to do that even with no -fprofile-generate because the
> > > unreachable code otherwise goes into all other IPA passes for no good
> > > reason.  So perhaps adding it as todo after the early optimization
> > > metapass?
> >
> > That fails due to gcc_assert.
> > So one needs:
> >
> > diff --git a/gcc/passes.c b/gcc/passes.c
> > index d838d909941..be92a2f3be3 100644
> > --- a/gcc/passes.c
> > +++ b/gcc/passes.c
> > @@ -485,7 +485,7 @@ const pass_data pass_data_all_early_optimizations =
> >0, /* properties_provided */
> >0, /* properties_destroyed */
> >0, /* todo_flags_start */
> > -  0, /* todo_flags_finish */
> > +  TODO_remove_functions | TODO_rebuild_cgraph_edges, /* todo_flags_finish 
> > */
> >  };
> >
> >  class pass_all_early_optimizations : public gimple_opt_pass
> > @@ -1989,7 +1989,8 @@ execute_todo (unsigned int flags)
> >   of IPA pass queue.  */
> >if (flags & TODO_remove_functions)
> >  {
> > -  gcc_assert (!cfun);
> > +  gcc_assert (!cfun
> > + || strcmp (current_pass->name, "early_optimizations") == 
> > 0);
> >symtab->remove_unreachable_nodes (dump_file);
> >  }
> >
> >
> > Or do you prefer to a new pass_remove_functions pass that will be added 
> > after
> > pass_local_optimization_passes ?
> 
> Can you make it todo_flags_start of pass_ipa_tree_profile instead?

It fails because all_early_optimizations are now gimple pass, so it
should be TODO after pass_local_optimization_passes?

Honza
> 
> > Martin
> >
> > >
> > > Honza
> > >> Thanks,
> > >> Martin
> > >>
> > >> gcc/ChangeLog:
> > >>
> > >> 2018-11-08  Martin Liska  
> > >>
> > >>  * tree-profile.c: Run TODO_remove_functions before "profile"
> > >>  pass in order to remove dead functions that will trigger
> > >>  -Wmissing-profile.
> > >> ---
> > >>  gcc/tree-profile.c | 2 +-
> > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >>
> > >
> > >> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
> > >> index d8f2a3b1ba4..c14ebc556a6 100644
> > >> --- a/gcc/tree-profile.c
> > >> +++ b/gcc/tree-profile.c
> > >> @@ -776,7 +776,7 @@ const pass_data pass_data_ipa_tree_profile =
> > >>0, /* properties_required */
> > >>0, /* properties_provided */
> > >>0, /* properties_destroyed */
> > >> -  0, /* todo_flags_start */
> > >> +  TODO_remove_functions, /* todo_flags_start */
> > >>TODO_dump_symtab, /* todo_flags_finish */
> > >>  };
> > >>
> > >>
> > >
> >


Re: [PATCH] Remove unreachable nodes before IPA profile pass (PR ipa/87706).

2018-11-08 Thread Martin Liška
On 11/8/18 12:46 PM, Jan Hubicka wrote:
>> On Thu, Nov 8, 2018 at 12:39 PM Martin Liška  wrote:
>>>
>>> On 11/8/18 12:19 PM, Jan Hubicka wrote:
> Hi.
>
> In order to fix the warnings mentioned in the PR, we need
> to run remove_unreachable_nodes after early tree passes. That's
> however possible only within a IPA pass. Thus I'm calling that
> before the profile PASS.
>
> Patch survives regression tests on ppc64le-linux-gnu and majority
> of warnings are gone in profiledbootstrap.
>
> Ready for trunk?

 I think we want to do that even with no -fprofile-generate because the
 unreachable code otherwise goes into all other IPA passes for no good
 reason.  So perhaps adding it as todo after the early optimization
 metapass?
>>>
>>> That fails due to gcc_assert.
>>> So one needs:
>>>
>>> diff --git a/gcc/passes.c b/gcc/passes.c
>>> index d838d909941..be92a2f3be3 100644
>>> --- a/gcc/passes.c
>>> +++ b/gcc/passes.c
>>> @@ -485,7 +485,7 @@ const pass_data pass_data_all_early_optimizations =
>>>0, /* properties_provided */
>>>0, /* properties_destroyed */
>>>0, /* todo_flags_start */
>>> -  0, /* todo_flags_finish */
>>> +  TODO_remove_functions | TODO_rebuild_cgraph_edges, /* todo_flags_finish 
>>> */
>>>  };
>>>
>>>  class pass_all_early_optimizations : public gimple_opt_pass
>>> @@ -1989,7 +1989,8 @@ execute_todo (unsigned int flags)
>>>   of IPA pass queue.  */
>>>if (flags & TODO_remove_functions)
>>>  {
>>> -  gcc_assert (!cfun);
>>> +  gcc_assert (!cfun
>>> + || strcmp (current_pass->name, "early_optimizations") == 
>>> 0);
>>>symtab->remove_unreachable_nodes (dump_file);
>>>  }
>>>
>>>
>>> Or do you prefer to a new pass_remove_functions pass that will be added 
>>> after
>>> pass_local_optimization_passes ?
>>
>> Can you make it todo_flags_start of pass_ipa_tree_profile instead?
> 
> It fails because all_early_optimizations are now gimple pass, so it
> should be TODO after pass_local_optimization_passes?

Please read my last email.

Martin

> 
> Honza
>>
>>> Martin
>>>

 Honza
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2018-11-08  Martin Liska  
>
>  * tree-profile.c: Run TODO_remove_functions before "profile"
>  pass in order to remove dead functions that will trigger
>  -Wmissing-profile.
> ---
>  gcc/tree-profile.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
>

> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
> index d8f2a3b1ba4..c14ebc556a6 100644
> --- a/gcc/tree-profile.c
> +++ b/gcc/tree-profile.c
> @@ -776,7 +776,7 @@ const pass_data pass_data_ipa_tree_profile =
>0, /* properties_required */
>0, /* properties_provided */
>0, /* properties_destroyed */
> -  0, /* todo_flags_start */
> +  TODO_remove_functions, /* todo_flags_start */
>TODO_dump_symtab, /* todo_flags_finish */
>  };
>
>

>>>



Re: [EXT] Re: [Driver] Add support for -fuse-ld=lld

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 12:00 PM Romain Geissler
 wrote:
>
> On Tue, 6 Nov 2018, H.J. Lu wrote:
>
> > On Sat, Oct 20, 2018 at 3:19 AM Romain Geissler
> >  wrote:
> > >
> > > I would like to raise again the question of supporting -fuse-ld=lld.
> > >
> >
> > LGTM.  But I can't approve it.
> >
> > --
> > H.J.
>
> Hi,
>
> Is there any gcc reviewer that could comment whether or not this small
> feature is welcome in gcc ?

The patch is OK.

Thanks,
Richard.

> Cheers,
> Romain


Implement {get,set}_range_info() variants that work with value_range's

2018-11-08 Thread Aldy Hernandez
get/set_range_info() currently returns the extremes of a range.  I have 
implemented overloaded variants that return a proper range.  In the 
future we should use actual ranges throughout, and not depend on range 
extremes, as depending on this behavior could causes us to lose precision.


I am also including changes to size_must_be_zero_p() to show how we 
should be using the range API, as opposed to performing error prone 
ad-hoc calculations on ranges and anti-ranges.


Martin, I'm not saying your code is wrong.  There are numerous other 
places in the compiler where we manipulate ranges/anti-ranges directly, 
all of which should be adapted in the future.  Everywhere there is a 
mention of VR_RANGE/VR_ANTI_RANGE in the compiler is suspect.  We should 
ideally be using intersect/union/may_contain_p/null_p, etc.


OK pending another round of tests?  (As I had tested this patch along 
with a whole slew of other upcoming changes ;-)).


Aldy
gcc/

	* gimple-fold.c (size_must_be_zero_p): Use value_range API instead
	of performing ad-hoc calculations.
	* tree-ssanames.c (set_range_info): New overloaded function
	accepting value_range &.
	(get_range_info): Same.
	* tree-ssanames.h (set_range_info_raw): Remove.
	(set_range_info): New prototype.
	(get_range_info): Same.

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 5468b604dec..4d9bdcd097f 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -635,9 +635,8 @@ var_decl_component_p (tree var)
 	  && TREE_CODE (TREE_OPERAND (inner, 0)) == ADDR_EXPR));
 }
 
-/* If the SIZE argument representing the size of an object is in a range
-   of values of which exactly one is valid (and that is zero), return
-   true, otherwise false.  */
+/* Return TRUE if the SIZE argument, representing the size of an
+   object, is in a range of values of which exactly zero is valid.  */
 
 static bool
 size_must_be_zero_p (tree size)
@@ -648,21 +647,19 @@ size_must_be_zero_p (tree size)
   if (TREE_CODE (size) != SSA_NAME || !INTEGRAL_TYPE_P (TREE_TYPE (size)))
 return false;
 
-  wide_int min, max;
-  enum value_range_kind rtype = get_range_info (size, &min, &max);
-  if (rtype != VR_ANTI_RANGE)
-return false;
-
   tree type = TREE_TYPE (size);
   int prec = TYPE_PRECISION (type);
 
-  wide_int wone = wi::one (prec);
-
   /* Compute the value of SSIZE_MAX, the largest positive value that
  can be stored in ssize_t, the signed counterpart of size_t.  */
   wide_int ssize_max = wi::lshift (wi::one (prec), prec - 1) - 1;
-
-  return wi::eq_p (min, wone) && wi::geu_p (max, ssize_max);
+  value_range valid_range (VR_RANGE,
+			   build_int_cst (type, 0),
+			   wide_int_to_tree (type, ssize_max));
+  value_range vr;
+  get_range_info (size, vr);
+  vr.intersect (&valid_range);
+  return vr.null_p ();
 }
 
 /* Fold function call to builtin mem{{,p}cpy,move}.  Try to detect and
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index ff906e831e5..a2c2efb634a 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -398,6 +398,15 @@ set_range_info (tree name, enum value_range_kind range_type,
   set_range_info_raw (name, range_type, min, max);
 }
 
+/* Store range information for NAME from a value_range.  */
+
+void
+set_range_info (tree name, const value_range &vr)
+{
+  wide_int min = wi::to_wide (vr.min ());
+  wide_int max = wi::to_wide (vr.max ());
+  set_range_info (name, vr.kind (), min, max);
+}
 
 /* Gets range information MIN, MAX and returns enum value_range_kind
corresponding to tree ssa_name NAME.  enum value_range_kind returned
@@ -421,6 +430,27 @@ get_range_info (const_tree name, wide_int *min, wide_int *max)
   return SSA_NAME_RANGE_TYPE (name);
 }
 
+/* Gets range information corresponding to ssa_name NAME and stores it
+   in a value_range VR.  Returns the value_range_kind.  */
+
+enum value_range_kind
+get_range_info (const_tree name, value_range &vr)
+{
+  tree min, max;
+  wide_int wmin, wmax;
+  enum value_range_kind kind = get_range_info (name, &wmin, &wmax);
+
+  if (kind == VR_VARYING || kind == VR_UNDEFINED)
+min = max = NULL;
+  else
+{
+  min = wide_int_to_tree (TREE_TYPE (name), wmin);
+  max = wide_int_to_tree (TREE_TYPE (name), wmax);
+}
+  vr = value_range (kind, min, max);
+  return kind;
+}
+
 /* Set nonnull attribute to pointer NAME.  */
 
 void
diff --git a/gcc/tree-ssanames.h b/gcc/tree-ssanames.h
index 18a001a5461..a5ff14e524f 100644
--- a/gcc/tree-ssanames.h
+++ b/gcc/tree-ssanames.h
@@ -69,12 +69,11 @@ struct GTY ((variable_size)) range_info_def {
 /* Sets the value range to SSA.  */
 extern void set_range_info (tree, enum value_range_kind, const wide_int_ref &,
 			const wide_int_ref &);
-extern void set_range_info_raw (tree, enum value_range_kind,
-const wide_int_ref &,
-const wide_int_ref &);
+extern void set_range_info (tree, const value_range &);
 /* Gets the value range from SSA.  */
 extern enum value_range_kind get_range_info (const_tree, wide_int *,
 	 wide_int *

vr_values::{get,update}_value_range: use value_range API

2018-11-08 Thread Aldy Hernandez

As per $SUBJECT.

Depends on get_range_info() API changes I have just submitted.

OK for trunk?

commit 01b8d323f896692d2e999b607f4464861d9ccd8f
Author: Aldy Hernandez 
Date:   Thu Nov 8 12:56:41 2018 +0100

* vr-values.c (vr_values::get_value_range): Use value_range API
instead of piecing together ranges.
(vr_values::update_value_range): Same.

diff --git a/gcc/vr-values.c b/gcc/vr-values.c
index 8c9fd159146..4edc5a467ee 100644
--- a/gcc/vr-values.c
+++ b/gcc/vr-values.c
@@ -121,15 +121,9 @@ vr_values::get_value_range (const_tree var)
 	set_value_range_to_nonnull (vr, TREE_TYPE (sym));
 	  else if (INTEGRAL_TYPE_P (TREE_TYPE (sym)))
 	{
-	  wide_int min, max;
-	  value_range_kind rtype = get_range_info (var, &min, &max);
-	  if (rtype == VR_RANGE || rtype == VR_ANTI_RANGE)
-		set_value_range (vr, rtype,
- wide_int_to_tree (TREE_TYPE (var), min),
- wide_int_to_tree (TREE_TYPE (var), max),
- NULL);
-	  else
-		set_value_range_to_varying (vr);
+	  get_range_info (var, *vr);
+	  if (vr->undefined_p ())
+		vr->set_varying ();
 	}
 	  else
 	set_value_range_to_varying (vr);
@@ -178,17 +172,10 @@ vr_values::update_value_range (const_tree var, value_range *new_vr)
  factor that in.  */
   if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
 {
-  wide_int min, max;
-  value_range_kind rtype = get_range_info (var, &min, &max);
+  value_range nr;
+  value_range_kind rtype = get_range_info (var, nr);
   if (rtype == VR_RANGE || rtype == VR_ANTI_RANGE)
-	{
-	  tree nr_min, nr_max;
-	  nr_min = wide_int_to_tree (TREE_TYPE (var), min);
-	  nr_max = wide_int_to_tree (TREE_TYPE (var), max);
-	  value_range nr;
-	  nr.set_and_canonicalize (rtype, nr_min, nr_max, NULL);
-	  new_vr->intersect (&nr);
-	}
+	new_vr->intersect (&nr);
 }
 
   /* Update the value range, if necessary.  */


expr_not_equal_to: use value_range API

2018-11-08 Thread Aldy Hernandez

All this nonsense:

-  rtype = get_range_info (t, &min, &max);
-  if (rtype == VR_RANGE)
-   {
- if (wi::lt_p (max, w, TYPE_SIGN (TREE_TYPE (t
-   return true;
- if (wi::lt_p (w, min, TYPE_SIGN (TREE_TYPE (t
-   return true;
-   }
-  else if (rtype == VR_ANTI_RANGE
-  && wi::le_p (min, w, TYPE_SIGN (TREE_TYPE (t)))
-  && wi::le_p (w, max, TYPE_SIGN (TREE_TYPE (t

Replaced by an API like Kutulu intended.

+  get_range_info (t, vr);
+  if (!vr.may_contain_p (wide_int_to_tree (TREE_TYPE (t), w)))

Ain't it grand?

OK for trunk, depending on get_range_info changes of course?

Aldy
commit 3a3fa7eb1baba60d17b4b7731972951173c5d615
Author: Aldy Hernandez 
Date:   Thu Nov 8 13:04:59 2018 +0100

* fold-const.c (expr_not_equal_to): Use value_range API.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 5399288dfc5..744f946fa15 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9255,8 +9255,7 @@ tree_expr_nonzero_p (tree t)
 bool
 expr_not_equal_to (tree t, const wide_int &w)
 {
-  wide_int min, max, nz;
-  value_range_kind rtype;
+  value_range vr;
   switch (TREE_CODE (t))
 {
 case INTEGER_CST:
@@ -9265,17 +9264,8 @@ expr_not_equal_to (tree t, const wide_int &w)
 case SSA_NAME:
   if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
 	return false;
-  rtype = get_range_info (t, &min, &max);
-  if (rtype == VR_RANGE)
-	{
-	  if (wi::lt_p (max, w, TYPE_SIGN (TREE_TYPE (t
-	return true;
-	  if (wi::lt_p (w, min, TYPE_SIGN (TREE_TYPE (t
-	return true;
-	}
-  else if (rtype == VR_ANTI_RANGE
-	   && wi::le_p (min, w, TYPE_SIGN (TREE_TYPE (t)))
-	   && wi::le_p (w, max, TYPE_SIGN (TREE_TYPE (t
+  get_range_info (t, vr);
+  if (!vr.may_contain_p (wide_int_to_tree (TREE_TYPE (t), w)))
 	return true;
   /* If T has some known zero bits and W has any of those bits set,
 	 then T is known not to be equal to W.  */


Re: [RFC][PATCH]Merge VEC_COND_EXPR into MASK_STORE after loop vectorization

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 12:02 PM Renlin Li  wrote:
>
> Hi all,
>
> When allow-store-data-races is enabled, ifcvt would prefer to generated
> conditional select and unconditional store to convert certain if statement
> into:
>
> _ifc_1 = val
> _ifc_2 = A[i]
> val = cond? _ifc_1 : _ifc_2
> A[i] = val
>
> When the loop got vectorized, this pattern will be turned into
> MASK_LOAD, VEC_COND_EXPR and MASK_STORE. This could be improved.

I'm somewhat confused - the vectorizer doesn't generate a masked store when
if-conversion didn't create one in the first place.

In particular with allow-store-data-races=1 (what your testcase uses)
there are no
masked loads/stores generated at all.   So at least you need a better testcase
to motivate (one that doesn't load from array[i] so that we know the conditional
stores might trap).

So what I see (with store data races not allowed) from ifcvt is

   [local count: 1006632961]:
  # i_20 = PHI 
  # ivtmp_18 = PHI 
  a_10 = array[i_20];
  _1 = a_10 & 1;
  _2 = a_10 + 1;
  _32 = _1 != 0;
  _33 = &array[i_20];
  .MASK_STORE (_33, 32B, _32, _2);
  prephitmp_8 = _1 != 0 ? _2 : a_10;
  _4 = a_10 + 2;
  _34 = prephitmp_8 > 10;
  .MASK_STORE (_33, 32B, _34, _4);
  i_13 = i_20 + 1;
  ivtmp_5 = ivtmp_18 - 1;
  if (ivtmp_5 != 0)

and what you want to do is merge

  prephitmp_8 = _1 != 0 ? _2 : a_10;
  _34 = prephitmp_8 > 10;

somehow?  But your patch mentions that _4 should be prephitmp_8 so
it wouldn't do anything here?

> The change here add a post processing function to combine the VEC_COND_EXPR
> expression into MASK_STORE, and delete related dead code.
>
> I am a little bit conservative here.
> I didn't change the default behavior of ifcvt to always generate MASK_STORE,
> although it should be safe in all cases (allow or dis-allow store data race).
>
> MASK_STORE might not well handled in vectorization pass compared with
> conditional select. It might be too early and aggressive to do that in ifcvt.
> And the performance of MASK_STORE might not good for some platforms.
> (We could add --param or target hook to differentiate this ifcvt behavior
> on different platforms)
>
> Another reason I did not do that in ifcvt is the data reference
> analysis. To create a MASK_STORE, a pointer is created as the first
> argument to the internal function call. If the pointer is created out of
> array references, e.g. x = &A[i], data reference analysis could not properly
> analysis the relationship between MEM_REF (x) and ARRAY_REF (A, i). This
> will create a versioned loop beside the vectorized one.

Actually for your testcase it won't vectorize because there's compile-time
aliasing (somehow we miss handling of dependence distance zero?!)

> (I have hacks to look through the MEM_REF, and restore the reference back to
> ARRAY_REF (A, i).  Maybe we could do analysis on lowered address expression?
> I saw we have gimple laddress pass to aid the vectorizer)

An old idea of mine is to have dependence analysis fall back to lowered address
form when it fails to match two references.  This would require re-analysis and
eventually storing an alternate "inner reference" in the data-ref structure.

> The approach here comes a little bit late, on the condition that vector
> MASK_STORE is generated by loop vectorizer. In this case, it is definitely
> beneficial to do the code transformation.
>
> Any thoughts on the best way to fix the issue?
>
>
> This patch has been tested with aarch64-none-elf, no regressions.
>
> Regards,
> Renlin
>
> gcc/ChangeLog:
>
> 2018-11-08  Renlin Li  
>
> * tree-vectorizer.h (combine_sel_mask_store): Declare new function.
> * tree-vect-loop.c (combine_sel_mask_store): Define new function.
> * tree-vectorizer.c (vectorize_loops): Call it.
>
> gcc/testsuite/ChangeLog:
>
> 2018-11-08  Renlin Li  
>
> * gcc.target/aarch64/sve/combine_vcond_mask_store_1.c: New.
>


Tweak ALAP calculation in SCHED_PRESSURE_MODEL

2018-11-08 Thread Kyrill Tkachov



This patch fixes a flaw in the relationship between the way that
SCHED_PRESSURE_MODEL calculates the alap and depth vs how it uses
them in model_order_p.  A comment in model_order_p says:

  /* Combine the length of the longest path of satisfied true dependencies
 that leads to each instruction (depth) with the length of the longest
 path of any dependencies that leads from the instruction (alap).
 Prefer instructions with the greatest combined length.  If the combined
 lengths are equal, prefer instructions with the greatest depth.

 The idea is that, if we have a set S of "equal" instructions that each
 have ALAP value X, and we pick one such instruction I, any true-dependent
 successors of I that have ALAP value X - 1 should be preferred over S.
 This encourages the schedule to be "narrow" rather than "wide".
 However, if I is a low-priority instruction that we decided to
 schedule because of its model_classify_pressure, and if there
 is a set of higher-priority instructions T, the aforementioned
 successors of I should not have the edge over T.  */

The expectation was that scheduling an instruction X would give a
greater priority to the highest-priority successor instructions Y than
X had: Y.depth would be X.depth + 1 and Y.alap would be X.alap - 1,
giving an equal combined height, but with the greater depth winning as
a tie-breaker. But this doesn't work if the alap value was ultimately
determined by an anti-dependence.

This is particularly bad when --param max-pending-list-length kicks in,
since we then start adding fake anti-dependencies in order to keep the
list length down.  These fake dependencies tend to be on the critical
path.

The attached patch avoids that by making the alap calculation only
look at true dependencies.  This shouldn't be too bad, since we use
INSN_PRIORITY as the final tie-breaker than that does take
anti-dependencies into account.

This reduces the number of spills in the hot function from 436.cactusADM
by 14% on aarch64 at -O3 (and the number of instructions in general).
SPEC2017 shows a minor improvement on Cortex-A72 (about 0.1% overall).
Thanks to Wilco for the benchmarking.

Bootstrapped and tested on aarch64-none-linux-gnu.

Is this ok for trunk?

Thanks,
Kyrill

2018-11-08  Richard Sandiford  

gcc/
* haifa-sched.c (model_analyze_insns): Only add 1 to the consumer's
ALAP if the dependence is a true dependence.
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 1fdc9df9fb26f23758ec8326cec91eecc4c917c1..01825de440c2e818eceab5ab7411b20b05ee54f1 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -3504,8 +3504,10 @@ model_analyze_insns (void)
 	FOR_EACH_DEP (iter, SD_LIST_FORW, sd_it, dep)
 	  {
 	con = MODEL_INSN_INFO (DEP_CON (dep));
-	if (con->insn && insn->alap < con->alap + 1)
-	  insn->alap = con->alap + 1;
+	unsigned int min_alap
+	  = con->alap + (DEP_TYPE (dep) == REG_DEP_TRUE);
+	if (con->insn && insn->alap < min_alap)
+	  insn->alap = min_alap;
 	  }
 
 	insn->old_queue = QUEUE_INDEX (iter);


[PATCH] Make lambda vector and friends use HWI

2018-11-08 Thread Richard Biener


This completes the transition to HWI for lamda vectors.  Previously
we propagated the HWI use of int_cst_value used throughout the dataref
code to local vars but left the actual lambda representation at int.
That's prone to overflows and silent wrong-code.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-11-08  Richard Biener  

* tree-data-ref.h (lambda_int): New typedef.
(lambda_vector_gcd): Adjust.
(lambda_vector_new): Likewise.
(lambda_matrix_new): Likewise.

* tree-data-ref.c  (print_lambda_vector): Adjust.

diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index 66e780d635a..6019c6168bf 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -393,7 +393,7 @@ print_lambda_vector (FILE * outfile, lambda_vector vector, 
int n)
   int i;
 
   for (i = 0; i < n; i++)
-fprintf (outfile, "%3d ", vector[i]);
+fprintf (outfile, "%3d ", (int)vector[i]);
   fprintf (outfile, "\n");
 }
 
diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h
index 525d27f04b9..439a8b986dd 100644
--- a/gcc/tree-data-ref.h
+++ b/gcc/tree-data-ref.h
@@ -138,7 +138,8 @@ struct dr_alias
space. A vector space is a set that is closed under vector addition
and scalar multiplication.  In this vector space, an element is a list of
integers.  */
-typedef int *lambda_vector;
+typedef HOST_WIDE_INT lambda_int;
+typedef lambda_int *lambda_vector;
 
 /* An integer matrix.  A matrix consists of m vectors of length n (IE
all vectors are the same length).  */
@@ -611,11 +612,11 @@ void split_constant_offset (tree , tree *, tree *);
 
 /* Compute the greatest common divisor of a VECTOR of SIZE numbers.  */
 
-static inline int
+static inline lambda_int
 lambda_vector_gcd (lambda_vector vector, int size)
 {
   int i;
-  int gcd1 = 0;
+  lambda_int gcd1 = 0;
 
   if (size > 0)
 {
@@ -632,7 +633,7 @@ static inline lambda_vector
 lambda_vector_new (int size)
 {
   /* ???  We shouldn't abuse the GC allocator here.  */
-  return ggc_cleared_vec_alloc (size);
+  return ggc_cleared_vec_alloc (size);
 }
 
 /* Clear out vector VEC1 of length SIZE.  */
@@ -686,7 +687,7 @@ lambda_matrix_new (int m, int n, struct obstack 
*lambda_obstack)
   mat = XOBNEWVEC (lambda_obstack, lambda_vector, m);
 
   for (i = 0; i < m; i++)
-mat[i] = XOBNEWVEC (lambda_obstack, int, n);
+mat[i] = XOBNEWVEC (lambda_obstack, lambda_int, n);
 
   return mat;
 }


record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Aldy Hernandez
This one's rather obvious and does not depend on any get_range_info API 
change.


OK for trunk?
* gimple-ssa-evrp-analyze.c
(evrp_range_analyzer::record_ranges_from_incoming_edge): Use
value_range API instead of piecing together ranges.

diff --git a/gcc/gimple-ssa-evrp-analyze.c b/gcc/gimple-ssa-evrp-analyze.c
index 3e5287b1b0b..ec39895cceb 100644
--- a/gcc/gimple-ssa-evrp-analyze.c
+++ b/gcc/gimple-ssa-evrp-analyze.c
@@ -207,7 +207,8 @@ evrp_range_analyzer::record_ranges_from_incoming_edge (basic_block bb)
 	 getting first [64, +INF] and then ~[0, 0] from
 		 conditions like (s & 0x3cc0) == 0).  */
 	  value_range *old_vr = get_value_range (vrs[i].first);
-	  value_range tem (old_vr->kind (), old_vr->min (), old_vr->max ());
+	  value_range tem = *old_vr;
+	  tem.equiv_clear ();
 	  tem.intersect (vrs[i].second);
 	  if (tem.ignore_equivs_equal_p (*old_vr))
 		continue;


Re: [PATCH] Remove extra memory allocation of strings.

2018-11-08 Thread Kyrill Tkachov

Hi Martin,

On 23/10/18 14:17, Martin Liška wrote:

Hello.

As a follow up patch I would like to remove redundant string allocation
on string which is not needed in my opinion.


I think this change is correct, as these functions don't modify the string,
just read it in different ways.

You'll still need approval from a maintainer.

Thanks,
Kyrill


That bootstrap on aarch64-linux.

Martin





Re: [PATCH] [aarch64] Correct the maximum shift amount for shifted operands.

2018-11-08 Thread Christoph Müllner
Hi Kyrill,

> On 08.11.2018, at 11:16, Kyrill Tkachov  wrote:
> 
> Hi Christoph,
> 
> On 07/11/18 21:51, christoph.muell...@theobroma-systems.com wrote:
>> From: Christoph Muellner 
>> 
>> The aarch64 ISA specification allows a left shift amount to be applied
>> after extension in the range of 0 to 4 (encoded in the imm3 field).
>> 
> 
> Indeed. That looks correct from my reading of the spec as well (section 
> C6.2.3 for example).
> 
>> This is true for at least the following instructions:
>> 
>> * ADD (extend register)
>> * ADDS (extended register)
>> * SUB (extended register)
>> 
>> The result of this patch can be seen, when compiling the following code:
>> 
>> uint64_t myadd(uint64_t a, uint64_t b)
>> {
>>return a+(((uint8_t)b)<<4);
>> }
>> 
>> Without the patch the following sequence will be generated:
>> 
>>  :
>>   0:   d37c1c21 ubfiz   x1, x1, #4, #8
>>   4:   8b20 add x0, x1, x0
>>   8:   d65f03c0 ret
>> 
>> With the patch the ubfiz will be merged into the add instruction:
>> 
>>  :
>>   0:   8b211000 add x0, x0, w1, uxtb #4
>>   4:   d65f03c0 ret
>> 
>> *** gcc/ChangeLog ***
>> 
> 
> The patch looks correct to me but can you please clarify how it has been 
> tested?
> Patches need to be bootstrapped and the full testsuite run.

I've tested native (on an aarch64 system) with
> make bootstrap
> make check RUNTESTFLAGS="aarch64.exp"
and haven't found any regressions (compared to this patch not being applied).

Currently I'm running make check without any RUNTESTFLAGS, but that will take
some time to finish (already 1.5 hours running on a 3 GHz machine).

My approach to evaluate the test run is to diff the output of make check
(with and without the patch applied). Is there a better approach for this?
Especially is there a way to parallelise test execution?
I have 32 cores, which could significantly reduce the test duration,
but when using -j32, then my diff approach does not work anymore.
How is this intended to be done?

> I also have a few comments on the ChangeLog
> 
>> 2018-xx-xx  Christoph Muellner 
>> 
> 
> Two spaces between your name and the email (also, missing an 's' in the 
> email?).
> 
>>* gcc/config/aarch64/aarch64.c: Correct the maximum shift amount
>>for shifted operands.
> 
> The path should be relative to the ChangeLog location.
> Also, please specify the function name being changed.
> In this case it should be
>* config/aarch64/aarch64.c (aarch64_uxt_size): Correct...
> 
>>* gcc/testsuite/gcc.target/aarch64/extend.c: Adjust the
>>testcases to cover the changed shift amount.
> 
> This should be in a separate ChangeLog entry as it will go into 
> gcc/testsuite/ChangeLog.
> The path would be:
>* gcc.target/aarch64/extend.c
> 
>> 
>> Signed-off-by: Christoph Muellner 
>> Signed-off-by: Philipp Tomsich 
> 
> GCC doesn't use Signed-off-by tags (though no one will complain about them 
> either) but if
> Philipp Tomsich wrote any of the code here (which I think is implied by the 
> Signed-off-by tag?) then
> his name should appear in the ChangeLog entry.

Thank's for your comments, I've already addressed them locally.
I'll resend once testing is completed.

Thanks,
Christoph

> Thanks for the patch!
> As I said, the code looks good to me, but you'll need a maintainer to approve 
> it
> (I've cc'ed the aarch64 maintainers for you) once the testing is clarified 
> and the ChangeLog is fixed.
> 
> Kyrill
> 
> 
>> ---
>> gcc/config/aarch64/aarch64.c  |  2 +-
>> gcc/testsuite/gcc.target/aarch64/extend.c | 16 
>> 2 files changed, 9 insertions(+), 9 deletions(-)
>> 
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index c82c7b6..c85988a 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -8190,7 +8190,7 @@ aarch64_output_casesi (rtx *operands)
>> int
>> aarch64_uxt_size (int shift, HOST_WIDE_INT mask)
>> {
>> -  if (shift >= 0 && shift <= 3)
>> +  if (shift >= 0 && shift <= 4)
>> {
>>   int size;
>>   for (size = 8; size <= 32; size *= 2)
>> diff --git a/gcc/testsuite/gcc.target/aarch64/extend.c 
>> b/gcc/testsuite/gcc.target/aarch64/extend.c
>> index f399e55..7986c5b 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/extend.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/extend.c
>> @@ -32,8 +32,8 @@ ldr_sxtw0 (char *arr, int i)
>> unsigned long long
>> adddi_uxtw (unsigned long long a, unsigned int i)
>> {
>> -  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*uxtw #?3" } } */
>> -  return a + ((unsigned long long)i << 3);
>> +  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*uxtw #?4" } } */
>> +  return a + ((unsigned long long)i << 4);
>> }
>> 
>> unsigned long long
>> @@ -46,8 +46,8 @@ adddi_uxtw0 (unsigned long long a, unsigned int i)
>> long long
>> adddi_sxtw (long long a, int i)
>> {
>> -  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*sxtw #?3" } } */
>> -  

Re: [PATCH] [aarch64] Correct the maximum shift amount for shifted operands.

2018-11-08 Thread Kyrill Tkachov

Hi Christoph,

On 08/11/18 12:20, Christoph Müllner wrote:

Hi Kyrill,

> On 08.11.2018, at 11:16, Kyrill Tkachov  wrote:
>
> Hi Christoph,
>
> On 07/11/18 21:51, christoph.muell...@theobroma-systems.com wrote:
>> From: Christoph Muellner 
>>
>> The aarch64 ISA specification allows a left shift amount to be applied
>> after extension in the range of 0 to 4 (encoded in the imm3 field).
>>
>
> Indeed. That looks correct from my reading of the spec as well (section 
C6.2.3 for example).
>
>> This is true for at least the following instructions:
>>
>> * ADD (extend register)
>> * ADDS (extended register)
>> * SUB (extended register)
>>
>> The result of this patch can be seen, when compiling the following code:
>>
>> uint64_t myadd(uint64_t a, uint64_t b)
>> {
>>return a+(((uint8_t)b)<<4);
>> }
>>
>> Without the patch the following sequence will be generated:
>>
>>  :
>>   0:   d37c1c21 ubfiz   x1, x1, #4, #8
>>   4:   8b20 add x0, x1, x0
>>   8:   d65f03c0 ret
>>
>> With the patch the ubfiz will be merged into the add instruction:
>>
>>  :
>>   0:   8b211000 add x0, x0, w1, uxtb #4
>>   4:   d65f03c0 ret
>>
>> *** gcc/ChangeLog ***
>>
>
> The patch looks correct to me but can you please clarify how it has been 
tested?
> Patches need to be bootstrapped and the full testsuite run.

I've tested native (on an aarch64 system) with
> make bootstrap
> make check RUNTESTFLAGS="aarch64.exp"
and haven't found any regressions (compared to this patch not being applied).

Currently I'm running make check without any RUNTESTFLAGS, but that will take
some time to finish (already 1.5 hours running on a 3 GHz machine).

My approach to evaluate the test run is to diff the output of make check
(with and without the patch applied). Is there a better approach for this?
Especially is there a way to parallelise test execution?
I have 32 cores, which could significantly reduce the test duration,
but when using -j32, then my diff approach does not work anymore.
How is this intended to be done?


Feel free to add -j32, that will make testing much more practicable.
The way to check the results is to look at the .sum and .log files and compare 
them.

For example, in the build directory in $BUILD/gcc/testsuite

$ find. -name *.sum

./gfortran/gfortran.sum
./gcc/gcc.sum
./g++/g++.sum

You can compare these before and after your patch. These are generated in a 
consistent format, even with a -j make option.

You can also use the scripts in contrib/ (dg-extract-results.sh and 
dg-cmp-results.sh) in the source tree that do some
of the log extracting and analysis for you.

Thanks,
Kyrill



> I also have a few comments on the ChangeLog
>
>> 2018-xx-xx  Christoph Muellner 
>>
>
> Two spaces between your name and the email (also, missing an 's' in the 
email?).
>
>>* gcc/config/aarch64/aarch64.c: Correct the maximum shift amount
>>for shifted operands.
>
> The path should be relative to the ChangeLog location.
> Also, please specify the function name being changed.
> In this case it should be
>* config/aarch64/aarch64.c (aarch64_uxt_size): Correct...
>
>>* gcc/testsuite/gcc.target/aarch64/extend.c: Adjust the
>>testcases to cover the changed shift amount.
>
> This should be in a separate ChangeLog entry as it will go into 
gcc/testsuite/ChangeLog.
> The path would be:
>* gcc.target/aarch64/extend.c
>
>>
>> Signed-off-by: Christoph Muellner 
>> Signed-off-by: Philipp Tomsich 
>
> GCC doesn't use Signed-off-by tags (though no one will complain about them 
either) but if
> Philipp Tomsich wrote any of the code here (which I think is implied by the 
Signed-off-by tag?) then
> his name should appear in the ChangeLog entry.

Thank's for your comments, I've already addressed them locally.
I'll resend once testing is completed.

Thanks,
Christoph

> Thanks for the patch!
> As I said, the code looks good to me, but you'll need a maintainer to approve 
it
> (I've cc'ed the aarch64 maintainers for you) once the testing is clarified 
and the ChangeLog is fixed.
>
> Kyrill
>
>
>> ---
>> gcc/config/aarch64/aarch64.c  |  2 +-
>> gcc/testsuite/gcc.target/aarch64/extend.c | 16 
>> 2 files changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index c82c7b6..c85988a 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -8190,7 +8190,7 @@ aarch64_output_casesi (rtx *operands)
>> int
>> aarch64_uxt_size (int shift, HOST_WIDE_INT mask)
>> {
>> -  if (shift >= 0 && shift <= 3)
>> +  if (shift >= 0 && shift <= 4)
>> {
>>   int size;
>>   for (size = 8; size <= 32; size *= 2)
>> diff --git a/gcc/testsuite/gcc.target/aarch64/extend.c 
b/gcc/testsuite/gcc.target/aarch64/extend.c
>> index f399e55..7986c5b 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/extend.c
>> +++ b/gcc/testsuite/gcc.target/aarch64

[committed] value_range::may_contain_p: Do not access extremes directly

2018-11-08 Thread Aldy Hernandez

This is an obvious patch I will commit once testing completes.

Aldy
* tree-vrp.c (may_contain_p): Do not access m_min/m_max directly.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 17b0b6c6037..7af676a416d 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -250,10 +250,10 @@ value_range::may_contain_p (tree val) const
 
   if (m_kind == VR_ANTI_RANGE)
 {
-  int res = value_inside_range (val, m_min, m_max);
+  int res = value_inside_range (val, min (), max ());
   return res == 0 || res == -2;
 }
-  return value_inside_range (val, m_min, m_max) != 0;
+  return value_inside_range (val, min (), max ()) != 0;
 }
 
 void


Re: [PATCH] [aarch64] Correct the maximum shift amount for shifted operands.

2018-11-08 Thread Christoph Müllner


> On 08.11.2018, at 13:25, Kyrill Tkachov  wrote:
> 
> Hi Christoph,
> 
> On 08/11/18 12:20, Christoph Müllner wrote:
>> Hi Kyrill,
>> 
>> > On 08.11.2018, at 11:16, Kyrill Tkachov  
>> > wrote:
>> >
>> > Hi Christoph,
>> >
>> > On 07/11/18 21:51, christoph.muell...@theobroma-systems.com wrote:
>> >> From: Christoph Muellner 
>> >>
>> >> The aarch64 ISA specification allows a left shift amount to be applied
>> >> after extension in the range of 0 to 4 (encoded in the imm3 field).
>> >>
>> >
>> > Indeed. That looks correct from my reading of the spec as well (section 
>> > C6.2.3 for example).
>> >
>> >> This is true for at least the following instructions:
>> >>
>> >> * ADD (extend register)
>> >> * ADDS (extended register)
>> >> * SUB (extended register)
>> >>
>> >> The result of this patch can be seen, when compiling the following code:
>> >>
>> >> uint64_t myadd(uint64_t a, uint64_t b)
>> >> {
>> >>return a+(((uint8_t)b)<<4);
>> >> }
>> >>
>> >> Without the patch the following sequence will be generated:
>> >>
>> >>  :
>> >>   0:   d37c1c21 ubfiz   x1, x1, #4, #8
>> >>   4:   8b20 add x0, x1, x0
>> >>   8:   d65f03c0 ret
>> >>
>> >> With the patch the ubfiz will be merged into the add instruction:
>> >>
>> >>  :
>> >>   0:   8b211000 add x0, x0, w1, uxtb #4
>> >>   4:   d65f03c0 ret
>> >>
>> >> *** gcc/ChangeLog ***
>> >>
>> >
>> > The patch looks correct to me but can you please clarify how it has been 
>> > tested?
>> > Patches need to be bootstrapped and the full testsuite run.
>> 
>> I've tested native (on an aarch64 system) with
>> > make bootstrap
>> > make check RUNTESTFLAGS="aarch64.exp"
>> and haven't found any regressions (compared to this patch not being applied).
>> 
>> Currently I'm running make check without any RUNTESTFLAGS, but that will take
>> some time to finish (already 1.5 hours running on a 3 GHz machine).
>> 
>> My approach to evaluate the test run is to diff the output of make check
>> (with and without the patch applied). Is there a better approach for this?
>> Especially is there a way to parallelise test execution?
>> I have 32 cores, which could significantly reduce the test duration,
>> but when using -j32, then my diff approach does not work anymore.
>> How is this intended to be done?
> 
> Feel free to add -j32, that will make testing much more practicable.
> The way to check the results is to look at the .sum and .log files and 
> compare them.
> 
> For example, in the build directory in $BUILD/gcc/testsuite
> 
> $ find. -name *.sum
> 
> ./gfortran/gfortran.sum
> ./gcc/gcc.sum
> ./g++/g++.sum
> 
> You can compare these before and after your patch. These are generated in a 
> consistent format, even with a -j make option.
> 
> You can also use the scripts in contrib/ (dg-extract-results.sh and 
> dg-cmp-results.sh) in the source tree that do some
> of the log extracting and analysis for you.

Exactly what I was searching for!

Thanks,
Christoph

> 
> Thanks,
> Kyrill
> 
>> 
>> > I also have a few comments on the ChangeLog
>> >
>> >> 2018-xx-xx  Christoph Muellner 
>> >>
>> >
>> > Two spaces between your name and the email (also, missing an 's' in the 
>> > email?).
>> >
>> >>* gcc/config/aarch64/aarch64.c: Correct the maximum shift amount
>> >>for shifted operands.
>> >
>> > The path should be relative to the ChangeLog location.
>> > Also, please specify the function name being changed.
>> > In this case it should be
>> >* config/aarch64/aarch64.c (aarch64_uxt_size): Correct...
>> >
>> >>* gcc/testsuite/gcc.target/aarch64/extend.c: Adjust the
>> >>testcases to cover the changed shift amount.
>> >
>> > This should be in a separate ChangeLog entry as it will go into 
>> > gcc/testsuite/ChangeLog.
>> > The path would be:
>> >* gcc.target/aarch64/extend.c
>> >
>> >>
>> >> Signed-off-by: Christoph Muellner 
>> >> 
>> >> Signed-off-by: Philipp Tomsich 
>> >
>> > GCC doesn't use Signed-off-by tags (though no one will complain about them 
>> > either) but if
>> > Philipp Tomsich wrote any of the code here (which I think is implied by 
>> > the Signed-off-by tag?) then
>> > his name should appear in the ChangeLog entry.
>> 
>> Thank's for your comments, I've already addressed them locally.
>> I'll resend once testing is completed.
>> 
>> Thanks,
>> Christoph
>> 
>> > Thanks for the patch!
>> > As I said, the code looks good to me, but you'll need a maintainer to 
>> > approve it
>> > (I've cc'ed the aarch64 maintainers for you) once the testing is clarified 
>> > and the ChangeLog is fixed.
>> >
>> > Kyrill
>> >
>> >
>> >> ---
>> >> gcc/config/aarch64/aarch64.c  |  2 +-
>> >> gcc/testsuite/gcc.target/aarch64/extend.c | 16 
>> >> 2 files changed, 9 insertions(+), 9 deletions(-)
>> >>
>> >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> >> index c82c7b6..c85988a 100644
>> >> --- a/gcc/config/aarch64/

Re: [PATCH, testsuite]: Use int128 effective target for gcc.dg/pr87874.c

2018-11-08 Thread Christophe Lyon
On Wed, 7 Nov 2018 at 16:50, Uros Bizjak  wrote:
>
> 2018-11-07  Uros Bizjak  
>
> * gcc.dg/pr87874.c: Compile only for int128 effective target.
>
> Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.
>

This doesn't work on aarch64 -mabi=ilp32:
/gcc/testsuite/gcc.dg/pr87874.c: In function 'em':
/gcc/testsuite/gcc.dg/pr87874.c:19:50: warning: conversion from 'long
long unsigned int' to 'long unsigned int' changes value from
'18446744073709551615' to '4294967295' [-Woverflow]
FAIL: gcc.dg/pr87874.c (test for excess errors)


> Uros.


Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-11-08 Thread Peter Bergner
On 11/8/18 4:57 AM, Renlin Li wrote:
> I think I found the problem!
> 
> As described in the PR, a hard register is used in
> an pre/post modify expression. The hard register is live, but updated.
> In this case, we should make it conflicting with all pseudos live at
> that point.  Does it make sense?

Do you have a reproducer test case I can look at?  I'd like to see the
problematical rtl to help me determine whether your patch is correct
or not.  ...and thank you for debugging this!

Peter



misc VRP cleanups for value_range API

2018-11-08 Thread Aldy Hernandez

Stupid boring changes.

OK?
* tree-vrp.c (value_range::check): Do not access internals
directly.
(value_range::singleton_p): Same.
(value_range::type): Same.
(vrp_finalize): Use value_range API.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 7af676a416d..3ccf2e491d6 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -157,7 +157,7 @@ value_range::check ()
   }
 case VR_UNDEFINED:
 case VR_VARYING:
-  gcc_assert (!m_min && !m_max);
+  gcc_assert (!min () && !max ());
   gcc_assert (!m_equiv || bitmap_empty_p (m_equiv));
   break;
 default:
@@ -291,11 +291,11 @@ bool
 value_range::singleton_p (tree *result) const
 {
   if (m_kind == VR_RANGE
-  && vrp_operand_equal_p (m_min, m_max)
-  && is_gimple_min_invariant (m_min))
+  && vrp_operand_equal_p (min (), max ())
+  && is_gimple_min_invariant (min ()))
 {
   if (result)
-*result = m_min;
+*result = min ();
   return true;
 }
   return false;
@@ -306,8 +306,8 @@ value_range::type () const
 {
   /* Types are only valid for VR_RANGE and VR_ANTI_RANGE, which are
  known to have non-zero min/max.  */
-  gcc_assert (m_min);
-  return TREE_TYPE (m_min);
+  gcc_assert (min ());
+  return TREE_TYPE (min ());
 }
 
 /* Dump value range to FILE.  */
@@ -6558,9 +6558,7 @@ vrp_prop::vrp_finalize (bool warn_array_bounds_p)
 	  && range_includes_zero_p (vr) == 0)
 	set_ptr_nonnull (name);
   else if (!POINTER_TYPE_P (TREE_TYPE (name)))
-	set_range_info (name, vr->kind (),
-			wi::to_wide (vr->min ()),
-			wi::to_wide (vr->max ()));
+	set_range_info (name, *vr);
 }
 
   /* If we're checking array refs, we want to merge information on


implement value_range::domain_p()

2018-11-08 Thread Aldy Hernandez
I believe I've seen this idiom more than once.  I know for sure I've 
used it in our ssa-range branch :).  I'll hunt for the other uses and 
adjust accordingly.


OK?
* tree-vrp.h (value_range::domain_p): New.
* tree-vrp.c (value_range::domain_p): New.
* tree-ssa-strlen.c (strlen_check_and_optimize_stmt): Use
value_range::domain_p instead of doing things ad-hoc.

diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 669c315dce2..ddb61e5a7f3 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -3687,19 +3687,16 @@ strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh)
 			/* Reading a character before the final '\0'
 			   character.  Just set the value range to ~[0, 0]
 			   if we don't have anything better.  */
-			wide_int min, max;
+			value_range vr;
+			get_range_info (lhs, vr);
 			tree type = TREE_TYPE (lhs);
-			enum value_range_kind vr
-			  = get_range_info (lhs, &min, &max);
-			if (vr == VR_VARYING
-			|| (vr == VR_RANGE
-&& min == wi::min_value (TYPE_PRECISION (type),
-			 TYPE_SIGN (type))
-&& max == wi::max_value (TYPE_PRECISION (type),
-			 TYPE_SIGN (type
-			  set_range_info (lhs, VR_ANTI_RANGE,
-	  wi::zero (TYPE_PRECISION (type)),
-	  wi::zero (TYPE_PRECISION (type)));
+			if (vr.domain_p ())
+			  {
+			value_range vr (VR_ANTI_RANGE,
+	build_int_cst (type, 0),
+	build_int_cst (type, 0));
+			set_range_info (lhs, vr);
+			  }
 		  }
 		  }
 	  }
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 3ccf2e491d6..e2126c21974 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -178,6 +178,22 @@ value_range::equal_p (const value_range &other, bool ignore_equivs) const
 	  || vrp_bitmap_equal_p (m_equiv, other.m_equiv)));
 }
 
+/* Return TRUE if value_range spans the entire domain.  */
+
+bool
+value_range::domain_p () const
+{
+  if (varying_p ())
+return true;
+  if (m_kind == VR_RANGE)
+{
+  tree type = TREE_TYPE (type ());
+  value_range vr (VR_RANGE, TYPE_MIN_VALUE (type), TYPE_MAX_VALUE (type));
+  return equal_p (vr, /*ignore_equivs=*/true);
+}
+  return false;
+}
+
 /* Return equality while ignoring equivalence bitmap.  */
 
 bool
diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h
index c251329a195..c5811d50bbf 100644
--- a/gcc/tree-vrp.h
+++ b/gcc/tree-vrp.h
@@ -64,6 +64,7 @@ class GTY((for_user)) value_range
   /* Misc methods.  */
   tree type () const;
   bool null_p () const;
+  bool domain_p () const;
   bool may_contain_p (tree) const;
   bool singleton_p (tree *result = NULL) const;
   void deep_copy (const value_range *);


Re: [PATCH] Come up with htab_hash_string_vptr and use string-specific if possible.

2018-11-08 Thread Michael Matz
Hi,

On Wed, 7 Nov 2018, Martin Liška wrote:

> > Whereever the new function belongs it certainly isn't system.h.  Also 
> > the definition in a header seems excessive.  Sure, it enables inlining 
> > of it, but that seems premature optimization.  It contains a loop, and 
> > inlining anything with loops that aren't very likely to loop just once 
> > or never just blows code for no gain.  Also as the function is leaf 
> > there won't be any second-order effect from inlining.
> 
> Ok, works for me. As you know my main motivation was to provide stronger 
> type declaration that can be used for 'const char *'. So what about 
> providing 2 wrappers and poisoning the implementation?

That seems better.  But still, why declare this in system.h?  It seems 
hash-table.h seems more appropriate.


Ciao,
Michael.

[PATCH] Fix PR87621, failed outer loop vectorization

2018-11-08 Thread Richard Biener


The following fixes another instance of PR87914, this time it is
DOM wrecking loop-header copying presenting vectorization with
wrong loop form.  The loop header copying pass right before
vectorization isn't of great help since nothing cleans up after
it so loop form is still bogus.

I've tried to move loop-header copying after jump threading but
the fallout is too large for the moment.  So this resorts to
instead run CSE on the copied blocks.  I've took the liberty
to cleanup the vec-ch predicate a bit as well.

This then reveals another ICE in reduction handling which this
patch fixes.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

>From ea6b41e717b32c93ab3a27df55632f06fa1d71fc Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Thu, 8 Nov 2018 11:23:12 +0100
Subject: [PATCH] fix-pr87621

PR tree-optimization/87621
* tree-vect-loop.c (vectorizable_reduction): Handle reduction
op with only phi inputs.
* tree-ssa-loop-ch.c: Include tree-ssa-sccvn.h.
(ch_base::copy_headers): Run CSE on copied loop headers.
(pass_ch_vect::process_loop_p): Simplify.

* g++.dg/vect/pr87621.cc: New testcase.

diff --git a/gcc/testsuite/g++.dg/vect/pr87621.cc 
b/gcc/testsuite/g++.dg/vect/pr87621.cc
new file mode 100644
index 000..cfc53be4ee1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr87621.cc
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+
+extern "C" double pow(double, double);
+template 
+T pow(T x, unsigned int n)
+{
+  if (!n)
+return 1;
+
+  T y = 1;
+  while (n > 1)
+{
+  if (n%2)
+   y *= x;
+  x = x*x;
+  n /= 2;
+}
+  return x*y;
+}
+
+void testVec(int* x)
+{
+  for (int i = 0; i < 8; ++i)
+x[i] = pow(x[i], 10);
+}
+
+/* { dg-final { scan-tree-dump "OUTER LOOP VECTORIZED" "vect" { target { 
vect_double && vect_hw_misalign } } } } */
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index c876d62405f..4d4813df3c8 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-inline.h"
 #include "tree-ssa-scopedtables.h"
 #include "tree-ssa-threadedge.h"
+#include "tree-ssa-sccvn.h"
 #include "params.h"
 
 /* Duplicates headers of loops if they are small enough, so that the statements
@@ -297,12 +298,14 @@ ch_base::copy_headers (function *fun)
   bool changed = false;
 
   if (number_of_loops (fun) <= 1)
-  return 0;
+return 0;
 
   bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (fun));
   copied_bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (fun));
   bbs_size = n_basic_blocks_for_fn (fun);
 
+  auto_vec > copied;
+
   FOR_EACH_LOOP (loop, 0)
 {
   int initial_limit = PARAM_VALUE (PARAM_MAX_LOOP_HEADER_INSNS);
@@ -371,6 +374,7 @@ ch_base::copy_headers (function *fun)
  fprintf (dump_file, "Duplication failed.\n");
  continue;
}
+  copied.safe_push (std::make_pair (entry, loop));
 
   /* If the loop has the form "for (i = j; i < j + 10; i++)" then
 this copying can introduce a case where we rely on undefined
@@ -422,7 +426,28 @@ ch_base::copy_headers (function *fun)
 }
 
   if (changed)
-update_ssa (TODO_update_ssa);
+{
+  update_ssa (TODO_update_ssa);
+  /* After updating SSA form perform CSE on the loop header
+copies.  This is esp. required for the pass before
+vectorization since nothing cleans up copied exit tests
+that can now be simplified.  CSE from the entry of the
+region we copied till all loop exit blocks but not
+entering the loop itself.  */
+  for (unsigned i = 0; i < copied.length (); ++i)
+   {
+ edge entry = copied[i].first;
+ loop_p loop = copied[i].second;
+ vec exit_edges = get_loop_exit_edges (loop);
+ bitmap exit_bbs = BITMAP_ALLOC (NULL);
+ for (unsigned j = 0; j < exit_edges.length (); ++j)
+   bitmap_set_bit (exit_bbs, exit_edges[j]->dest->index);
+ bitmap_set_bit (exit_bbs, loop->header->index);
+ do_rpo_vn (cfun, entry, exit_bbs);
+ BITMAP_FREE (exit_bbs);
+ exit_edges.release ();
+   }
+}
   free (bbs);
   free (copied_bbs);
 
@@ -473,24 +498,13 @@ pass_ch_vect::process_loop_p (struct loop *loop)
   if (loop->dont_vectorize)
 return false;
 
-  if (!do_while_loop_p (loop))
-return true;
-
- /* The vectorizer won't handle anything with multiple exits, so skip.  */
+  /* The vectorizer won't handle anything with multiple exits, so skip.  */
   edge exit = single_exit (loop);
   if (!exit)
 return false;
 
-  /* Copy headers iff there looks to be code in the loop after the exit block,
- i.e. the exit block has an edge to another block (besides the latch,
- which should be empty).  */
-  edge_iterator ei;
-  edge e;
-  FOR_EACH_EDGE (e, ei, exit->src->succs)
-if (!loop_exit_edge_p (loop, e)
-   && e->dest != loop->header
-   && e->de

Re: [PATCH 1/6] [RS6000] rs6000_output_call for external call insn assembly output

2018-11-08 Thread Alan Modra
On Wed, Nov 07, 2018 at 06:08:33PM -0600, Segher Boessenkool wrote:
> On Wed, Nov 07, 2018 at 04:07:15PM +1030, Alan Modra wrote:
> > +extern const char *rs6000_output_call (rtx *, unsigned int, bool, const 
> > char *);
> 
> Maybe have a separate rs6000_output_call and rs6000_output_sibcall?  Bare
> boolean function parameters aren't great.  (They can of course both call
> rs6000_output_call_1 or whatever, if that makes sense).
> 
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -21380,6 +21380,37 @@ rs6000_assemble_integer (rtx x, unsigned int size, 
> > int aligned_p)
> >return default_assemble_integer (x, size, aligned_p);
> >  }
> >  
> > +/* Return a template string for assembly to emit when making an
> > +   external call.  FUN is the %z argument, ARG is either NULL or
> > +   a @TLSGD or @TLSLD __tls_get_addr argument specifier.  */
> > +
> > +const char *
> > +rs6000_output_call (rtx *operands, unsigned int fun, bool sibcall,
> > +   const char *arg)
> > +{
> > +  /* -Wformat-overflow workaround, without which gcc thinks that %u
> > +  might produce 10 digits.  FUN is 0 or 1 as of 2018-03.  */
> > +  gcc_assert (fun <= 6);
> 
> So "fun" is the operand number.  Rename it, and use MAX_RECOG_OPERANDS
> instead of 6?  And allow for it to take 2 or 3 chars to print :-)

Eh, so I can't have a little fun?  funop then?  (It's the index of the
call operand that has the function symref or whatever to call.)

> "operands" is unused here, compiling this will warn.

Oops, I did bootstrap each patch individually at one stage..
Probably happened when I edited the patchset, to reduce the size of
later changes.  rs6000_output_call didn't have "operands" to start
with.

> "output" is a lie, this function doesn't output anything.  Hardly the
> only case of this in the rs6000 port, but it is annoying.  What would be
> a good name for this...  "rs6000_template_for_call"?

OK, I went with rs6000_call_template and rs6000_sibcall_template.
Next patch also changed to use rs6000_indirect_call_template and
rs6000_indirect_sibcall_template.  Final patch uses
rs6000_pltseq_template.

> Are there some patterns that can be collapsed to one after this?

No, I don't think so.  Patch 5/6 does some of that for TLS calls.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 3/6] [RS6000] Replace TLSmode with P, and correct tls call mems

2018-11-08 Thread Alan Modra
On Wed, Nov 07, 2018 at 07:11:28PM -0600, Segher Boessenkool wrote:
> On Wed, Nov 07, 2018 at 04:08:26PM +1030, Alan Modra wrote:
> > There is really no need to define a TLSmode mode iterator that is
> > identical (since !TARGET_64BIT == TARGET_32BIT) to the much used P
> > mode iterator.
> 
> Nice :-)
> 
> > It's nonsense to think we might ever want to support
> > 32-bit TLS on 64-bit or vice versa!  The patch also fixes a minor
> > error in the call mems.  All other direct calls use (call (mem:SI ..)).
> 
> You can also replace  with ,  with
> , and l with .  Also, was "TLSmode:"
> needed anywhere?  I don't see any other iterator used in those patterns.

OK, done.  TLSmode: was used in the tls insn pattern names and in the
output templates that now use ptrload.  P: does just as well.

-- 
Alan Modra
Australia Development Lab, IBM


cleanups and unification of value_range dumping code

2018-11-08 Thread Aldy Hernandez
[Richard, you're right.  An overloaded debug() is better than 
this->dump().  Anyone who thinks different is wrong.  End of discussion.]


There's no reason to have multiple ways of dumping a value range.  And 
I'm not even talking about ipa-prop.* which decided that they should 
implement their own version of value_ranges basically to annoy me. 
Attached is a patch to remedy this.


I have made three small user-visible changes to the dumping code-- 
cause, well... you know me... I can't leave things alone.


1. It *REALLY* irks me that bools are dumped with -INF/+INF.  Everyone 
knows that bools should be 0/1.  It's in the Bible.  Look it up.


2. Analogous to #1, what's up with signed 1-bit fields?  Who understands 
[0, +INF] as [0, 0]?  No one; that's right.  (It's still beyond me the 
necessity of signed bit fields in the standard period, but I'll pick my 
battles.)


3. I find it quite convenient to display the tree type along with the 
range as:


[1, 1] int EQUIVALENCES { me, myself, I }

Finally.. As mentioned, I've implement an overloaded debug() because 
it's *so* nice to use gdb and an alias of:


define dd
  call debug($arg0)
end

...and have stuff just work.

I do realize that we still have dump() lying around.  At some point, we 
should start dropping external access to the dump methods for objects 
that are never dumped by the compiler (but by the user at debug time).


OK for trunk?
* gimple-pretty-print.c (dump_ssaname_info): Use value_range
dumper to dump range information.
* tree-vrp.c (value_range::dump_range): New.
(value_range::dump): Add variant for dumping to pretty_printer.
Adjust FILE * version accordingly.
(debug_value_range): Rename to debug.

Remove unused prototypes: dump_value_range, debug_value_range,
dump_all_value_ranges, dump_vr_equiv, debug_vr_equiv.
* tree-vrp.h (value_range::dump): Add new variant that takes
pretty_printer.
(value_range::dump_range): New.

diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 7dfec9120ab..73af30ba92f 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -2122,21 +2122,11 @@ dump_ssaname_info (pretty_printer *buffer, tree node, int spc)
   if (!POINTER_TYPE_P (TREE_TYPE (node))
   && SSA_NAME_RANGE_INFO (node))
 {
-  wide_int min, max, nonzero_bits;
-  value_range_kind range_type = get_range_info (node, &min, &max);
+  value_range vr;
 
-  if (range_type == VR_VARYING)
-	pp_printf (buffer, "# RANGE VR_VARYING");
-  else if (range_type == VR_RANGE || range_type == VR_ANTI_RANGE)
-	{
-	  pp_printf (buffer, "# RANGE ");
-	  pp_printf (buffer, "%s[", range_type == VR_RANGE ? "" : "~");
-	  pp_wide_int (buffer, min, TYPE_SIGN (TREE_TYPE (node)));
-	  pp_printf (buffer, ", ");
-	  pp_wide_int (buffer, max, TYPE_SIGN (TREE_TYPE (node)));
-	  pp_printf (buffer, "]");
-	}
-  nonzero_bits = get_nonzero_bits (node);
+  get_range_info (node, vr);
+  vr.dump (buffer);
+  wide_int nonzero_bits = get_nonzero_bits (node);
   if (nonzero_bits != -1)
 	{
 	  pp_string (buffer, " NONZERO ");
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index b3f6490e1ca..db5e59b59e3 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -327,57 +327,69 @@ value_range::type () const
   return TREE_TYPE (min ());
 }
 
-/* Dump value range to FILE.  */
+void
+value_range::dump_range (pretty_printer *buffer) const
+{
+  tree ttype = type ();
+  pp_printf (buffer, "%s[", (m_kind == VR_ANTI_RANGE) ? "~" : "");
+  if (INTEGRAL_TYPE_P (ttype)
+  && !TYPE_UNSIGNED (ttype)
+  && vrp_val_is_min (min ())
+  && TYPE_PRECISION (ttype) != 1)
+pp_printf (buffer, "-INF");
+  else
+dump_generic_node (buffer, min (), 0, TDF_SLIM, false);
+
+  pp_printf (buffer, ", ");
+
+  if (INTEGRAL_TYPE_P (ttype)
+  && vrp_val_is_max (max ())
+  && TYPE_PRECISION (ttype) != 1)
+pp_printf (buffer, "+INF");
+  else
+dump_generic_node (buffer, max (), 0, TDF_SLIM, false);
+  pp_string (buffer, "] ");
+  dump_generic_node (buffer, ttype, 0, TDF_SLIM, false);
+}
 
 void
-value_range::dump (FILE *file) const
+value_range::dump (pretty_printer *buffer) const
 {
   if (undefined_p ())
-fprintf (file, "UNDEFINED");
+pp_string (buffer, "UNDEFINED");
   else if (m_kind == VR_RANGE || m_kind == VR_ANTI_RANGE)
 {
-  tree type = TREE_TYPE (min ());
-
-  fprintf (file, "%s[", (m_kind == VR_ANTI_RANGE) ? "~" : "");
-
-  if (INTEGRAL_TYPE_P (type)
-	  && !TYPE_UNSIGNED (type)
-	  && vrp_val_is_min (min ()))
-	fprintf (file, "-INF");
-  else
-	print_generic_expr (file, min ());
-
-  fprintf (file, ", ");
-
-  if (INTEGRAL_TYPE_P (type)
-	  && vrp_val_is_max (max ()))
-	fprintf (file, "+INF");
-  else
-	print_generic_expr (file, max ());
-
-  fprintf (file, "]");
-
+  dump_range (buffer);
   

Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules

2018-11-08 Thread Wilco Dijkstra
Hi,

> But the max. error in sinh/cosh/atanh is less than 2 ULP, with some math
> libraries.  It could be < 1 ULP, in theory, so sinh(atanh(x)) less than
> 2 ULP even.

You can't add ULP errors in general - a tiny difference in the input can 
make a huge difference in the result if the derivative is > 1. 

Even with perfect implementations of 0.501ULP on easy functions with
no large derivatives you could get a 2ULP total error if the perfectly rounded
and actual result end up rounding in different directions in the 2nd function...

So you have to measure ULP error since it is quite counter intuitive.

> And signed zeroes.  Yeah.  I think it would have to be
> flag_unsafe_math_optimizations + some more.

Indeed.

Wilco


Re: [PATCH] Come up with htab_hash_string_vptr and use string-specific if possible.

2018-11-08 Thread Martin Liška
On 11/8/18 2:15 PM, Michael Matz wrote:
> Hi,
> 
> On Wed, 7 Nov 2018, Martin Liška wrote:
> 
>>> Whereever the new function belongs it certainly isn't system.h.  Also 
>>> the definition in a header seems excessive.  Sure, it enables inlining 
>>> of it, but that seems premature optimization.  It contains a loop, and 
>>> inlining anything with loops that aren't very likely to loop just once 
>>> or never just blows code for no gain.  Also as the function is leaf 
>>> there won't be any second-order effect from inlining.
>>
>> Ok, works for me. As you know my main motivation was to provide stronger 
>> type declaration that can be used for 'const char *'. So what about 
>> providing 2 wrappers and poisoning the implementation?

Hi.

> 
> That seems better.  But still, why declare this in system.h?  It seems 
> hash-table.h seems more appropriate.

I need to declare it before I'll poison it. As system.h is included very early,
one can guarantee that there will be no usage before the poisoning happens.

Martin

> 
> 
> Ciao,
> Michael.
> 



[PATCH] Instrument only selected files (PR gcov-profile/87442).

2018-11-08 Thread Martin Liška
Hi.

The patch is about possibility to filter which files are instrumented. The usage
is explained in the PR.

Patch can bootstrap and survives regression tests on x86_64-linux-gnu.

Ready for trunk?
Thanks,
Martin

gcc/ChangeLog:

2018-11-08  Martin Liska  

PR gcov-profile/87442
* common.opt: Add -fprofile-filter-files and -fprofile-exclude-files
options.
* doc/invoke.texi: Document them.
* tree-profile.c (parse_profile_filter): New.
(parse_profile_file_filtering): Likewise.
(release_profile_file_filtering): Likewise.
(include_source_file_for_profile): Likewise.
(tree_profiling): Filter source files based on the
newly added options.

gcc/testsuite/ChangeLog:

2018-11-08  Martin Liska  

PR gcov-profile/87442
* gcc.dg/profile-filtering-1.c: New test.
* gcc.dg/profile-filtering-2.c: New test.
---
 gcc/common.opt |  8 +++
 gcc/doc/invoke.texi| 19 +
 gcc/testsuite/gcc.dg/profile-filtering-1.c | 37 ++
 gcc/testsuite/gcc.dg/profile-filtering-2.c | 37 ++
 gcc/tree-profile.c | 84 ++
 5 files changed, 185 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/profile-filtering-1.c
 create mode 100644 gcc/testsuite/gcc.dg/profile-filtering-2.c


diff --git a/gcc/common.opt b/gcc/common.opt
index 5a5d33205a4..6494011cd6e 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2067,6 +2067,14 @@ fprofile-update=
 Common Joined RejectNegative Enum(profile_update) Var(flag_profile_update) Init(PROFILE_UPDATE_SINGLE)
 -fprofile-update=[single|atomic|prefer-atomic]	Set the profile update method.
 
+fprofile-filter-files=
+Common Joined RejectNegative Var(flag_profile_filter_files)
+Instrument only functions from files where names match any regular expression (separated by a semi-colon).
+
+fprofile-exclude-files=
+Common Joined RejectNegative Var(flag_profile_exclude_files)
+Instrument only functions from files where names do not match all the regular expressions (separated by a semi-colon).
+
 Enum
 Name(profile_update) Type(enum profile_update) UnknownError(unknown profile update method %qs)
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 849bb76dc25..c2f29dbd522 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -483,6 +483,7 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{-p  -pg  -fprofile-arcs  --coverage  -ftest-coverage @gol
 -fprofile-abs-path @gol
 -fprofile-dir=@var{path}  -fprofile-generate  -fprofile-generate=@var{path} @gol
+-fprofile-update=@var{method}  -fprofile-filter-files=@var{regex}  -fprofile-exclude-files=@var{regex} @gol
 -fsanitize=@var{style}  -fsanitize-recover  -fsanitize-recover=@var{style} @gol
 -fasan-shadow-offset=@var{number}  -fsanitize-sections=@var{s1},@var{s2},... @gol
 -fsanitize-undefined-trap-on-error  -fbounds-check @gol
@@ -11808,6 +11809,24 @@ when supported by a target, or to @samp{single} otherwise.  The GCC driver
 automatically selects @samp{prefer-atomic} when @option{-pthread}
 is present in the command line.
 
+@item -fprofile-filter-files=@var{regex}
+@opindex fprofile-filter-files
+
+Instrument only functions from files where names match
+any regular expression (separated by a semi-colon).
+
+For example, @option{-fprofile-filter-files=main.c;module.*.c} will instrument
+only @file{main.c} and all C files starting with 'module'.
+
+@item -fprofile-exclude-files=@var{regex}
+@opindex fprofile-exclude-files
+
+Instrument only functions from files where names do not match
+all the regular expressions (separated by a semi-colon).
+
+For example, @option{-fprofile-exclude-files=/usr/*} will prevent instrumentation
+of all files that are located in @file{/usr/} folder.
+
 @item -fsanitize=address
 @opindex fsanitize=address
 Enable AddressSanitizer, a fast memory error detector.
diff --git a/gcc/testsuite/gcc.dg/profile-filtering-1.c b/gcc/testsuite/gcc.dg/profile-filtering-1.c
new file mode 100644
index 000..f123e24b2a8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/profile-filtering-1.c
@@ -0,0 +1,37 @@
+/* { dg-require-profiling "-fprofile-generate" } */
+/* { dg-options "-O2 -fprofile-generate -fprofile-filter-files=.\*filtering-1.c -fdump-tree-optimized" } */
+
+extern void abort (void);
+
+int *p1;
+int *p2;
+int *p3;
+
+int ga = 100;
+
+int
+sub (int i, int j)
+{
+  int k;
+  int l;
+  int m;
+  int n;
+  p1 = &k;
+  p2 = &l;
+  p3 = &m;
+  k = 20;
+  l = 30;
+  m = 40;
+  n = i / j;
+  return n + ga;
+}
+
+int
+main(void)
+{
+  if (sub (99, 33) != 103)
+abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "PROF_edge" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/profile-filtering-2.c b/gcc/testsuite/gcc.dg/profile-filtering-2.c
new file mode 100644
index 000..98bd3aea00f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/profile-filtering-2.c
@@ -0,0 +1,37 @@
+/* { dg-require-profiling "-fprofile-gener

Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-11-08 Thread Renlin Li

Hi Peter,

On 11/08/2018 12:35 PM, Peter Bergner wrote:

On 11/8/18 4:57 AM, Renlin Li wrote:

I think I found the problem!

As described in the PR, a hard register is used in
an pre/post modify expression. The hard register is live, but updated.
In this case, we should make it conflicting with all pseudos live at
that point.  Does it make sense?


Do you have a reproducer test case I can look at?  I'd like to see the
problematical rtl to help me determine whether your patch is correct
or not.  ...and thank you for debugging this!

Peter



Sure! (I was trying to send the mail, but it failed with large attachment.)
I attached the dump file in the bugzilla ticket: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87899
I remove the unrelated dump (tar -xzvf xxx.tgz)

The code you want to check is the following in ira pass:
insn 10905: r1 = r2040
insn 208: use and update r1 with pre_modify
insn 191: use pseudo r2040

I could not create a test case. This dump is created with stage1 compiler 
compiling next stage compiler.
The (not helpful) command line is:


/home/renlin/try-new/./prev-gcc/cc1plus -quiet -nostdinc++ -v -I 
/home/renlin/try-new/prev-arm-none-linux-gnueabihf/libstdc++-v3/include/arm-none-linux-gnueabihf -I 
/home/renlin/try-new/prev-arm-none-linux-gnueabihf/libstdc++-v3/include -I /home/renlin/gcc/libstdc++-v3/libsupc++ -I . -I . -I ../../gcc/gcc -I 
../../gcc/gcc/. -I ../../gcc/gcc/../include -I ../../gcc/gcc/../libcpp/include -I ../../gcc/gcc/../libdecnumber -I ../../gcc/gcc/../libdecnumber/dpd 
-I ../libdecnumber -I ../../gcc/gcc/../libbacktrace -imultilib . -imultiarch arm-linux-gnueabihf -iprefix 
/home/renlin/try-new/prev-gcc/../lib/gcc/arm-none-linux-gnueabihf/9.0.0/ -isystem /home/renlin/try-new/./prev-gcc/include -isystem 
/home/renlin/try-new/./prev-gcc/include-fixed -MMD tree-loop-distribution.d -MF ./.deps/tree-loop-distribution.TPo -MP -MT tree-loop-distribution.o 
-D_GNU_SOURCE -D IN_GCC -D HAVE_CONFIG_H ../../gcc/gcc/tree-loop-distribution.c -quiet -dumpbase tree-loop-distribution.c -mfloat-abi=hard -mfpu=neon 
-mthumb -mtls-dialect=gnu -march=armv7-a+simd -auxbase-strip tree-loop-distribution.s -g -gtoggle -O2 -Wextra -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wsuggest-attribute=format -Woverloaded-virtual -Wpedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -version 
-fno-PIE -fno-checking -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -fno-common -o tree-loop-distribution.s


Regards,
Renlin


Re: [PATCH] Fix PR 86572

2018-11-08 Thread H.J. Lu
On Mon, Nov 5, 2018 at 3:20 PM Bernd Edlinger  wrote:
>
> On 11/5/18 1:28 AM, H.J. Lu wrote:
> > On Sun, Nov 4, 2018 at 10:02 AM Jeff Law  wrote:
> >>
> >> On 10/22/18 9:08 AM, Bernd Edlinger wrote:
> >>> Hi!
> >>>
> >>> This makes c_strlen avoid an unsafe strlen folding of const arguments
> >>> with non-const offset.  Currently a negative out of bounds offset
> >>> makes the strlen function return an extremely large number, and
> >>> at the same time, prevents the VRP machinery, to determine the correct
> >>> range if the strlen function in this case.
> >>>
> >>> Fixed by doing the whole computation in size_t and casting the
> >>> result back to ssize_t.
> >>>
> >>>
> >>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> >>> Is it OK for trunk?
> >>>
> >>>
> >>> Thanks
> >>> Bernd.
> >>>
> >>>
> >>> patch-pr86572.diff
> >>>
> >>> gcc:
> >>> 2018-10-22  Bernd Edlinger  
> >>>
> >>>PR tree-optimization/86572
> >>>* builtins.c (c_strlen): Handle negative offsets in a safe way.
> >>>
> >>> testsuite:
> >>> 2018-10-22  Bernd Edlinger  
> >>>
> >>>PR tree-optimization/86572
> >>>* gcc.dg/pr86572.c: New test.
> >> OK.
> >> jeff
> >
> > This caused:
> >
> > /export/gnu/import/git/gcc-test-ia32/src-trunk/gcc/testsuite/gcc.dg/warn-strlen-no-nul.c:56:1:
> > internal compiler error: verify_gimple failed^M
> > 0x8922dc4 verify_gimple_in_seq(gimple*)^M
> >  ../../src-trunk/gcc/tree-cfg.c:5082^M
> > 0x86899d7 gimplify_body(tree_node*, bool)^M
> >  ../../src-trunk/gcc/gimplify.c:12859^M
> > 0x8689b8b gimplify_function_tree(tree_node*)^M
> >  ../../src-trunk/gcc/gimplify.c:12949^M
> > 0x84f7690 cgraph_node::analyze()^M
> >  ../../src-trunk/gcc/cgraphunit.c:667^M
> > 0x84fa1d8 analyze_functions^M
> >  ../../src-trunk/gcc/cgraphunit.c:1126^M
> > 0x84fadd3 symbol_table::finalize_compilation_unit()^M
> >  ../../src-trunk/gcc/cgraphunit.c:2833^M
> > Please submit a full bug report,^M
> > with preprocessed source if appropriate.^M
> > Please include the complete backtrace with any bug report.^M
> > See  for instructions.^M
> > compiler exited with status 1
> > FAIL: gcc.dg/warn-strlen-no-nul.c (internal compiler error)
> >
> > on i386.
> >
>
> Ah yes thanks.
>
> This is caused by an incorrect folding in string_constant.
> After stripping the type casts in the POINTER_PLUS_EXPR
> we add the offset which is sizetype to what is left
> over from arg1, which is probably even a correctness issue,
> if the type cast was a narrowing one.
>
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu (this time
> tested with RUNTESTFLAGS="--target_board=unix\{-m32,\}")
> Is it OK for trunk?
>

I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87940

to track it.

-- 
H.J.


Re: Implement {get,set}_range_info() variants that work with value_range's

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 12:52 PM Aldy Hernandez  wrote:
>
> get/set_range_info() currently returns the extremes of a range.  I have
> implemented overloaded variants that return a proper range.  In the
> future we should use actual ranges throughout, and not depend on range
> extremes, as depending on this behavior could causes us to lose precision.
>
> I am also including changes to size_must_be_zero_p() to show how we
> should be using the range API, as opposed to performing error prone
> ad-hoc calculations on ranges and anti-ranges.

Yeah, I've been talking this all along but not being heard...

> Martin, I'm not saying your code is wrong.  There are numerous other
> places in the compiler where we manipulate ranges/anti-ranges directly,
> all of which should be adapted in the future.  Everywhere there is a
> mention of VR_RANGE/VR_ANTI_RANGE in the compiler is suspect.  We should
> ideally be using intersect/union/may_contain_p/null_p, etc.

null_p is a bad name btw, I just confused it with empty_p ... (which we have
as undefined_p).  contains_only_zero_p would be less confusing.

> OK pending another round of tests?  (As I had tested this patch along
> with a whole slew of other upcoming changes ;-)).

OK.

Richard.

>
> Aldy


Re: vr_values::{get,update}_value_range: use value_range API

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 1:01 PM Aldy Hernandez  wrote:
>
> As per $SUBJECT.
>
> Depends on get_range_info() API changes I have just submitted.
>
> OK for trunk?

OK


Re: Implement {get,set}_range_info() variants that work with value_range's

2018-11-08 Thread Aldy Hernandez




On 11/8/18 8:59 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 12:52 PM Aldy Hernandez  wrote:


get/set_range_info() currently returns the extremes of a range.  I have
implemented overloaded variants that return a proper range.  In the
future we should use actual ranges throughout, and not depend on range
extremes, as depending on this behavior could causes us to lose precision.

I am also including changes to size_must_be_zero_p() to show how we
should be using the range API, as opposed to performing error prone
ad-hoc calculations on ranges and anti-ranges.


Yeah, I've been talking this all along but not being heard...


My girlfriend says I don't listen.  It could be related.


Martin, I'm not saying your code is wrong.  There are numerous other
places in the compiler where we manipulate ranges/anti-ranges directly,
all of which should be adapted in the future.  Everywhere there is a
mention of VR_RANGE/VR_ANTI_RANGE in the compiler is suspect.  We should
ideally be using intersect/union/may_contain_p/null_p, etc.


null_p is a bad name btw, I just confused it with empty_p ... (which we have
as undefined_p).  contains_only_zero_p would be less confusing.


Yes, a horrible name.  I noticed so as I debugged precisely this bit. 
How about zero_p?


Aldy


Re: [PATCH, testsuite]: Use int128 effective target for gcc.dg/pr87874.c

2018-11-08 Thread Uros Bizjak
On Thu, Nov 8, 2018 at 1:29 PM Christophe Lyon
 wrote:
>
> On Wed, 7 Nov 2018 at 16:50, Uros Bizjak  wrote:
> >
> > 2018-11-07  Uros Bizjak  
> >
> > * gcc.dg/pr87874.c: Compile only for int128 effective target.
> >
> > Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.
> >
>
> This doesn't work on aarch64 -mabi=ilp32:
> /gcc/testsuite/gcc.dg/pr87874.c: In function 'em':
> /gcc/testsuite/gcc.dg/pr87874.c:19:50: warning: conversion from 'long
> long unsigned int' to 'long unsigned int' changes value from
> '18446744073709551615' to '4294967295' [-Woverflow]
> FAIL: gcc.dg/pr87874.c (test for excess errors)

Does attached patch works for you?

Uros.
diff --git a/gcc/testsuite/gcc.dg/pr87874.c b/gcc/testsuite/gcc.dg/pr87874.c
index 1480a5e54937..80debe0b4806 100644
--- a/gcc/testsuite/gcc.dg/pr87874.c
+++ b/gcc/testsuite/gcc.dg/pr87874.c
@@ -16,7 +16,7 @@ em (int u5, int fo, int s7)
   if (es == 0)
 if (nb == *vk)
   {
-const unsigned long int uint64_max = 18446744073709551615ul;
+const unsigned long long int uint64_max = 18446744073709551615ull;
 __int128 ks = uint64_max / 2 + 1;
 
 while (s7 < 1)


Re: expr_not_equal_to: use value_range API

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 1:09 PM Aldy Hernandez  wrote:
>
> All this nonsense:
>
> -  rtype = get_range_info (t, &min, &max);
> -  if (rtype == VR_RANGE)
> -   {
> - if (wi::lt_p (max, w, TYPE_SIGN (TREE_TYPE (t
> -   return true;
> - if (wi::lt_p (w, min, TYPE_SIGN (TREE_TYPE (t
> -   return true;
> -   }
> -  else if (rtype == VR_ANTI_RANGE
> -  && wi::le_p (min, w, TYPE_SIGN (TREE_TYPE (t)))
> -  && wi::le_p (w, max, TYPE_SIGN (TREE_TYPE (t
>
> Replaced by an API like Kutulu intended.
>
> +  get_range_info (t, vr);
> +  if (!vr.may_contain_p (wide_int_to_tree (TREE_TYPE (t), w)))
>
> Ain't it grand?

Well.  The not-so-grand thing is that you possibly ggc-allocate
three INTEGER_CST nodes here.

So, no ...?

Shouldn't this instead use wide-int-range.h?  (yeah, there's
the class-ification still missing there)

Richard.

> OK for trunk, depending on get_range_info changes of course?
>
> Aldy


Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:
>
> This one's rather obvious and does not depend on any get_range_info API
> change.
>
> OK for trunk?

Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
do tem = *old_vr so you modify it in place with equiv_clear().

Thus, operator= should be really deleted or mapped to value_range::set()
in which case tem = *old_vr would do useless bitmap allocation and
copying that you then clear.

It's also two lines of code instead of one.

Richard.


Re: expr_not_equal_to: use value_range API

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:21 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:09 PM Aldy Hernandez  wrote:


All this nonsense:

-  rtype = get_range_info (t, &min, &max);
-  if (rtype == VR_RANGE)
-   {
- if (wi::lt_p (max, w, TYPE_SIGN (TREE_TYPE (t
-   return true;
- if (wi::lt_p (w, min, TYPE_SIGN (TREE_TYPE (t
-   return true;
-   }
-  else if (rtype == VR_ANTI_RANGE
-  && wi::le_p (min, w, TYPE_SIGN (TREE_TYPE (t)))
-  && wi::le_p (w, max, TYPE_SIGN (TREE_TYPE (t

Replaced by an API like Kutulu intended.

+  get_range_info (t, vr);
+  if (!vr.may_contain_p (wide_int_to_tree (TREE_TYPE (t), w)))

Ain't it grand?


Well.  The not-so-grand thing is that you possibly ggc-allocate
three INTEGER_CST nodes here.


Hmmm... I'd really prefer to use a simple API call, instead of having to 
twiddle with the extremes manually.  Ideally no one should be looking 
inside of a value_range.


Do recommend another way of implementing may_contain_p ?

Aldy



So, no ...?

Shouldn't this instead use wide-int-range.h?  (yeah, there's
the class-ification still missing there)

Richard.


OK for trunk, depending on get_range_info changes of course?

Aldy


Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-11-08 Thread Peter Bergner
On 11/8/18 5:48 AM, Richard Biener wrote:
> Err, that looks very much like a hack that manages to hide the issue.

It's true we do not want to hide the issue by adding unneeded conflicts,
since that can lead to unnecessary spills.  However, ...


> Esp. adding conflicts in a loop that says "See which defined values die here."
> is quite fishy.

..the original loop is dealing with some of the gory details you never read
about in academic RA papers.  This code is used to catch the case where an insn
defines a register(s) that is never used.  Because it is never used, it never
ends up in the "live" (ie, live and available) set, which can cause us to miss
some required conflicts.

That said, I still need to look at the RTL from the bad program before
determining whether the patch is correct or not.  Computing accurate
conflict information is a delicate thing.

Peter



Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 3:24 PM Richard Biener
 wrote:
>
> On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:
> >
> > This one's rather obvious and does not depend on any get_range_info API
> > change.
> >
> > OK for trunk?
>
> Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
> do tem = *old_vr so you modify it in place with equiv_clear().
>
> Thus, operator= should be really deleted or mapped to value_range::set()
> in which case tem = *old_vr would do useless bitmap allocation and
> copying that you then clear.
>
> It's also two lines of code instead of one.

And...  (making uses of operator= not link):

/space/rguenther/src/gcc-slpcost/gcc/ipa-prop.c:1781: undefined
reference to `value_range::operator=(value_range const&)'
/space/rguenther/src/gcc-slpcost/gcc/tree-ssa-threadedge.c:169:
undefined reference to `value_range::operator=(value_range const&)'
/space/rguenther/src/gcc-slpcost/gcc/tree-vrp.c:230: undefined
reference to `value_range::operator=(value_range const&)'
/space/rguenther/src/gcc-slpcost/gcc/tree-vrp.c:237: undefined
reference to `value_range::operator=(value_range const&)'
/space/rguenther/src/gcc-slpcost/gcc/tree-vrp.c:562: undefined
reference to `value_range::operator=(value_range const&)'
tree-vrp.o:/space/rguenther/src/gcc-slpcost/gcc/tree-vrp.c:1163: more
undefined references to `value_range::operator=(value_range const&)'
follow

can you investiage all those for the same error?

Since we need to do deep copying for the equiv bitmap I think
operator=() should be
not implemented:

diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h
index c251329a195..ad4b0cd621b 100644
--- a/gcc/tree-vrp.h
+++ b/gcc/tree-vrp.h
@@ -45,6 +45,7 @@ class GTY((for_user)) value_range
   void update (value_range_kind, tree, tree);
   bool operator== (const value_range &) const;
   bool operator!= (const value_range &) const;
+  value_range& operator=(const value_range &);
   void intersect (const value_range *);
   void union_ (const value_range *);

likewise for copy construction.

Richard.

> Richard.


Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:24 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:


This one's rather obvious and does not depend on any get_range_info API
change.

OK for trunk?


Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
do tem = *old_vr so you modify it in place with equiv_clear().


Good point.



Thus, operator= should be really deleted or mapped to value_range::set()
in which case tem = *old_vr would do useless bitmap allocation and
copying that you then clear.


Actually, I was thinking that perhaps the assignment and equality 
operators should disregard the equivalence bitmap.  In this case, copy 
everything EXCEPT the bitmap and set the resulting equivalence bitmap to 
NULL.


It's also annoying to use ::ignore_equivs_equal_p().  Since that seems 
to be the predominant way of comparing ranges, perhaps it should be the 
default.


Aldy


Re: misc VRP cleanups for value_range API

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 1:42 PM Aldy Hernandez  wrote:
>
> Stupid boring changes.
>
> OK?

Well, IMHO using m_min is making clear you are accessing a member
while using min () does not.

So no, please do not make this kind of changes?

Richard.


Re: misc VRP cleanups for value_range API

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:31 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:42 PM Aldy Hernandez  wrote:


Stupid boring changes.

OK?


Well, IMHO using m_min is making clear you are accessing a member
while using min () does not.


There is already prior art here.  I believe I discussed this before and 
committed some bits like this.




So no, please do not make this kind of changes?


It also makes it easier for my work going forward.  So not technically 
needed, but it would make my life easier as I change the min/max 
implementation in the distant future.  Please?


Aldy


Re: implement value_range::domain_p()

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 1:52 PM Aldy Hernandez  wrote:
>
> I believe I've seen this idiom more than once.  I know for sure I've
> used it in our ssa-range branch :).  I'll hunt for the other uses and
> adjust accordingly.

domain_p?!  Isn't that the same as varying_p()?  Also

+  if (m_kind == VR_RANGE)
+{
+  tree type = TREE_TYPE (type ());
+  value_range vr (VR_RANGE, TYPE_MIN_VALUE (type), TYPE_MAX_VALUE (type));

TYPE_MIN/MAX_VALUE is NULL for pointers

+  return equal_p (vr, /*ignore_equivs=*/true);

But equivs essentially are making the range smaller!  The range
is the intersection of itself with all equiv ranges!

Richard.

+}


> OK?


Re: cleanups and unification of value_range dumping code

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 2:32 PM Aldy Hernandez  wrote:
>
> [Richard, you're right.  An overloaded debug() is better than
> this->dump().  Anyone who thinks different is wrong.  End of discussion.]
>
> There's no reason to have multiple ways of dumping a value range.  And
> I'm not even talking about ipa-prop.* which decided that they should
> implement their own version of value_ranges basically to annoy me.
> Attached is a patch to remedy this.
>
> I have made three small user-visible changes to the dumping code--
> cause, well... you know me... I can't leave things alone.
>
> 1. It *REALLY* irks me that bools are dumped with -INF/+INF.  Everyone
> knows that bools should be 0/1.  It's in the Bible.  Look it up.
>
> 2. Analogous to #1, what's up with signed 1-bit fields?  Who understands
> [0, +INF] as [0, 0]?  No one; that's right.  (It's still beyond me the
> necessity of signed bit fields in the standard period, but I'll pick my
> battles.)
>
> 3. I find it quite convenient to display the tree type along with the
> range as:
>
> [1, 1] int EQUIVALENCES { me, myself, I }

the type inbetween the range and equivalences?  seriously?  If so then
_please_ dump the type before the range:

  int [1, 1] EQUIV { ... }

> Finally.. As mentioned, I've implement an overloaded debug() because
> it's *so* nice to use gdb and an alias of:
>
> define dd
>   call debug($arg0)
> end
>
> ...and have stuff just work.
>
> I do realize that we still have dump() lying around.  At some point, we
> should start dropping external access to the dump methods for objects
> that are never dumped by the compiler (but by the user at debug time).

+DEBUG_FUNCTION void
+debug (const value_range *vr)
+{
+  dump_value_range (stderr, vr);
+}
+
+DEBUG_FUNCTION void
+debug (const value_range vr)

^^^ missing a & (const reference?)

+{
+  dump_value_range (stderr, &vr);
+}

also

@@ -2122,21 +2122,11 @@ dump_ssaname_info (pretty_printer *buffer,
tree node, int spc)
   if (!POINTER_TYPE_P (TREE_TYPE (node))
   && SSA_NAME_RANGE_INFO (node))
 {
-  wide_int min, max, nonzero_bits;
-  value_range_kind range_type = get_range_info (node, &min, &max);
+  value_range vr;

-  if (range_type == VR_VARYING)
-   pp_printf (buffer, "# RANGE VR_VARYING");
-  else if (range_type == VR_RANGE || range_type == VR_ANTI_RANGE)
-   {
- pp_printf (buffer, "# RANGE ");
- pp_printf (buffer, "%s[", range_type == VR_RANGE ? "" : "~");
- pp_wide_int (buffer, min, TYPE_SIGN (TREE_TYPE (node)));
- pp_printf (buffer, ", ");
- pp_wide_int (buffer, max, TYPE_SIGN (TREE_TYPE (node)));
- pp_printf (buffer, "]");
-   }
-  nonzero_bits = get_nonzero_bits (node);
+  get_range_info (node, vr);

this is again allocating INTEGER_CSTs for no good reason.  dumping really
shuld avoid possibly messing with the GC state.

wide-int-range again?

Richard.

> OK for trunk?


Re: implement value_range::domain_p()

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:34 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:52 PM Aldy Hernandez  wrote:


I believe I've seen this idiom more than once.  I know for sure I've
used it in our ssa-range branch :).  I'll hunt for the other uses and
adjust accordingly.


domain_p?!  Isn't that the same as varying_p()?  Also


Sigh, yes.  If we kept normalized value_ranges around where [MIN,MAX] 
degraded into VR_VARYING, then yes.  But alas, we have two ways of 
representing the entire domain.  Don't look at me.  That crap was 
already there :).


Another approach would be to call ::set_and_canonicalize() before 
checking varying_p() and teach the canonicalize function that [MIN, MAX] 
is VR_VARYING.  How does that sound?


Aldy



+  if (m_kind == VR_RANGE)
+{
+  tree type = TREE_TYPE (type ());
+  value_range vr (VR_RANGE, TYPE_MIN_VALUE (type), TYPE_MAX_VALUE (type));

TYPE_MIN/MAX_VALUE is NULL for pointers

+  return equal_p (vr, /*ignore_equivs=*/true);

But equivs essentially are making the range smaller!  The range
is the intersection of itself with all equiv ranges!

Richard.

+}



OK?


Re: Implement {get,set}_range_info() variants that work with value_range's

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 3:05 PM Aldy Hernandez  wrote:
>
>
>
> On 11/8/18 8:59 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 12:52 PM Aldy Hernandez  wrote:
> >>
> >> get/set_range_info() currently returns the extremes of a range.  I have
> >> implemented overloaded variants that return a proper range.  In the
> >> future we should use actual ranges throughout, and not depend on range
> >> extremes, as depending on this behavior could causes us to lose precision.
> >>
> >> I am also including changes to size_must_be_zero_p() to show how we
> >> should be using the range API, as opposed to performing error prone
> >> ad-hoc calculations on ranges and anti-ranges.
> >
> > Yeah, I've been talking this all along but not being heard...
>
> My girlfriend says I don't listen.  It could be related.
>
> >> Martin, I'm not saying your code is wrong.  There are numerous other
> >> places in the compiler where we manipulate ranges/anti-ranges directly,
> >> all of which should be adapted in the future.  Everywhere there is a
> >> mention of VR_RANGE/VR_ANTI_RANGE in the compiler is suspect.  We should
> >> ideally be using intersect/union/may_contain_p/null_p, etc.
> >
> > null_p is a bad name btw, I just confused it with empty_p ... (which we have
> > as undefined_p).  contains_only_zero_p would be less confusing.
>
> Yes, a horrible name.  I noticed so as I debugged precisely this bit.
> How about zero_p?

Probably the same ambiguous connotation?  But yes, way better than null_p.

Richard.

> Aldy


Re: expr_not_equal_to: use value_range API

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 3:27 PM Aldy Hernandez  wrote:
>
>
>
> On 11/8/18 9:21 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 1:09 PM Aldy Hernandez  wrote:
> >>
> >> All this nonsense:
> >>
> >> -  rtype = get_range_info (t, &min, &max);
> >> -  if (rtype == VR_RANGE)
> >> -   {
> >> - if (wi::lt_p (max, w, TYPE_SIGN (TREE_TYPE (t
> >> -   return true;
> >> - if (wi::lt_p (w, min, TYPE_SIGN (TREE_TYPE (t
> >> -   return true;
> >> -   }
> >> -  else if (rtype == VR_ANTI_RANGE
> >> -  && wi::le_p (min, w, TYPE_SIGN (TREE_TYPE (t)))
> >> -  && wi::le_p (w, max, TYPE_SIGN (TREE_TYPE (t
> >>
> >> Replaced by an API like Kutulu intended.
> >>
> >> +  get_range_info (t, vr);
> >> +  if (!vr.may_contain_p (wide_int_to_tree (TREE_TYPE (t), w)))
> >>
> >> Ain't it grand?
> >
> > Well.  The not-so-grand thing is that you possibly ggc-allocate
> > three INTEGER_CST nodes here.
>
> Hmmm... I'd really prefer to use a simple API call, instead of having to
> twiddle with the extremes manually.  Ideally no one should be looking
> inside of a value_range.
>
> Do recommend another way of implementing may_contain_p ?

I think many places dealing with get_range_info () should instead
work on the (to be created...) 1:1 copy of value_range ontop of
wide-int-range instead.

So - can you add a wide_int_range class to wide-int-range.h
that implements the same (but with wide-ints rather than trees
obviously) API as value-range?

IIRC you once had sth like that to simpify arg passing to the
workers but otherwise without much functionality?

Richard.

> Aldy
>
> >
> > So, no ...?
> >
> > Shouldn't this instead use wide-int-range.h?  (yeah, there's
> > the class-ification still missing there)
> >
> > Richard.
> >
> >> OK for trunk, depending on get_range_info changes of course?
> >>
> >> Aldy


Re: Implement {get,set}_range_info() variants that work with value_range's

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:41 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 3:05 PM Aldy Hernandez  wrote:




On 11/8/18 8:59 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 12:52 PM Aldy Hernandez  wrote:


get/set_range_info() currently returns the extremes of a range.  I have
implemented overloaded variants that return a proper range.  In the
future we should use actual ranges throughout, and not depend on range
extremes, as depending on this behavior could causes us to lose precision.

I am also including changes to size_must_be_zero_p() to show how we
should be using the range API, as opposed to performing error prone
ad-hoc calculations on ranges and anti-ranges.


Yeah, I've been talking this all along but not being heard...


My girlfriend says I don't listen.  It could be related.


Martin, I'm not saying your code is wrong.  There are numerous other
places in the compiler where we manipulate ranges/anti-ranges directly,
all of which should be adapted in the future.  Everywhere there is a
mention of VR_RANGE/VR_ANTI_RANGE in the compiler is suspect.  We should
ideally be using intersect/union/may_contain_p/null_p, etc.


null_p is a bad name btw, I just confused it with empty_p ... (which we have
as undefined_p).  contains_only_zero_p would be less confusing.


Yes, a horrible name.  I noticed so as I debugged precisely this bit.
How about zero_p?


Probably the same ambiguous connotation?  But yes, way better than null_p.


Well...for starters I started with the nomenclature already in VRP which 
was range_is_null.


Also, how is zero_p() ambiguous?  We want to know if the range is [0, 
0].  That reads pretty obvious to me.


Aldy


Re: Implement {get,set}_range_info() variants that work with value_range's

2018-11-08 Thread Jeff Law
On 11/8/18 7:44 AM, Aldy Hernandez wrote:
> 
> 
> On 11/8/18 9:41 AM, Richard Biener wrote:
>> On Thu, Nov 8, 2018 at 3:05 PM Aldy Hernandez  wrote:
>>>
>>>
>>>
>>> On 11/8/18 8:59 AM, Richard Biener wrote:
 On Thu, Nov 8, 2018 at 12:52 PM Aldy Hernandez 
 wrote:
>
> get/set_range_info() currently returns the extremes of a range.  I
> have
> implemented overloaded variants that return a proper range.  In the
> future we should use actual ranges throughout, and not depend on range
> extremes, as depending on this behavior could causes us to lose
> precision.
>
> I am also including changes to size_must_be_zero_p() to show how we
> should be using the range API, as opposed to performing error prone
> ad-hoc calculations on ranges and anti-ranges.

 Yeah, I've been talking this all along but not being heard...
>>>
>>> My girlfriend says I don't listen.  It could be related.
>>>
> Martin, I'm not saying your code is wrong.  There are numerous other
> places in the compiler where we manipulate ranges/anti-ranges
> directly,
> all of which should be adapted in the future.  Everywhere there is a
> mention of VR_RANGE/VR_ANTI_RANGE in the compiler is suspect.  We
> should
> ideally be using intersect/union/may_contain_p/null_p, etc.

 null_p is a bad name btw, I just confused it with empty_p ... (which
 we have
 as undefined_p).  contains_only_zero_p would be less confusing.
>>>
>>> Yes, a horrible name.  I noticed so as I debugged precisely this bit.
>>> How about zero_p?
>>
>> Probably the same ambiguous connotation?  But yes, way better than
>> null_p.
> 
> Well...for starters I started with the nomenclature already in VRP which
> was range_is_null.
Probably due to using it for NULL pointer test elimination.  But yea, in
retrospect it's an awful name.

> 
> Also, how is zero_p() ambiguous?  We want to know if the range is [0,
> 0].  That reads pretty obvious to me.
Seems reasonable to me.  THe question is will we push that name into all
the other places that use "null" when discussing ranges like ~[0,0] or
ranges that ultimately include [0,0].  Not all of these are in [E]VRP IIRC.

jeff


Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 3:31 PM Aldy Hernandez  wrote:
>
>
>
> On 11/8/18 9:24 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:
> >>
> >> This one's rather obvious and does not depend on any get_range_info API
> >> change.
> >>
> >> OK for trunk?
> >
> > Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
> > do tem = *old_vr so you modify it in place with equiv_clear().
>
> Good point.
>
> >
> > Thus, operator= should be really deleted or mapped to value_range::set()
> > in which case tem = *old_vr would do useless bitmap allocation and
> > copying that you then clear.
>
> Actually, I was thinking that perhaps the assignment and equality
> operators should disregard the equivalence bitmap.  In this case, copy
> everything EXCEPT the bitmap and set the resulting equivalence bitmap to
> NULL.

I think that's equally confusing.

> It's also annoying to use ::ignore_equivs_equal_p().  Since that seems
> to be the predominant way of comparing ranges, perhaps it should be the
> default.

I think a good approach would be to isolate m_equiv more because it is
really an implementation detail of the propagator.  Thus, make

class value_range_with_equiv : public value_range
{
... all the equiv stuff..
}

make the lattice of type value_range_with_equiv and see what tickles
down.

value_range_with_equiv wouldn't implement copy and assignment
(too expensive) and value_range can do with the trivial implementation.

And most consumers/workers can just work on the equiv-less variants.

Richard.

> Aldy


Re: expr_not_equal_to: use value_range API

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:43 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 3:27 PM Aldy Hernandez  wrote:




On 11/8/18 9:21 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:09 PM Aldy Hernandez  wrote:


All this nonsense:

-  rtype = get_range_info (t, &min, &max);
-  if (rtype == VR_RANGE)
-   {
- if (wi::lt_p (max, w, TYPE_SIGN (TREE_TYPE (t
-   return true;
- if (wi::lt_p (w, min, TYPE_SIGN (TREE_TYPE (t
-   return true;
-   }
-  else if (rtype == VR_ANTI_RANGE
-  && wi::le_p (min, w, TYPE_SIGN (TREE_TYPE (t)))
-  && wi::le_p (w, max, TYPE_SIGN (TREE_TYPE (t

Replaced by an API like Kutulu intended.

+  get_range_info (t, vr);
+  if (!vr.may_contain_p (wide_int_to_tree (TREE_TYPE (t), w)))

Ain't it grand?


Well.  The not-so-grand thing is that you possibly ggc-allocate
three INTEGER_CST nodes here.


Hmmm... I'd really prefer to use a simple API call, instead of having to
twiddle with the extremes manually.  Ideally no one should be looking
inside of a value_range.

Do recommend another way of implementing may_contain_p ?


I think many places dealing with get_range_info () should instead
work on the (to be created...) 1:1 copy of value_range ontop of
wide-int-range instead.


I'd prefer to not expose that we're going to use wide_int or any other 
implementation to the users of get_range_info().




So - can you add a wide_int_range class to wide-int-range.h
that implements the same (but with wide-ints rather than trees
obviously) API as value-range?


Hmmm, I don't have time for this release cycle.  Perhaps something to be 
entertained for GCC+1?


Again, I prefer my patch as is.  I cleans up the code, and keeps us from 
introducing problematic bugs.  Anything dealing with anti ranges is 
fraught with peril, as my cleanups to tree-vrp revealed.


If using these INTEGER_CST's causes any measurable performance 
difference, I'd be happy to look into it.


Aldy


Re: misc VRP cleanups for value_range API

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 3:34 PM Aldy Hernandez  wrote:
>
>
>
> On 11/8/18 9:31 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 1:42 PM Aldy Hernandez  wrote:
> >>
> >> Stupid boring changes.
> >>
> >> OK?
> >
> > Well, IMHO using m_min is making clear you are accessing a member
> > while using min () does not.
>
> There is already prior art here.  I believe I discussed this before and
> committed some bits like this.
>
> >
> > So no, please do not make this kind of changes?
>
> It also makes it easier for my work going forward.  So not technically
> needed, but it would make my life easier as I change the min/max
> implementation in the distant future.  Please?

Hmfp :/

OK

> Aldy


Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Jeff Law
On 11/8/18 7:49 AM, Richard Biener wrote:
> On Thu, Nov 8, 2018 at 3:31 PM Aldy Hernandez  wrote:
>>
>>
>>
>> On 11/8/18 9:24 AM, Richard Biener wrote:
>>> On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:

 This one's rather obvious and does not depend on any get_range_info API
 change.

 OK for trunk?
>>>
>>> Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
>>> do tem = *old_vr so you modify it in place with equiv_clear().
>>
>> Good point.
>>
>>>
>>> Thus, operator= should be really deleted or mapped to value_range::set()
>>> in which case tem = *old_vr would do useless bitmap allocation and
>>> copying that you then clear.
>>
>> Actually, I was thinking that perhaps the assignment and equality
>> operators should disregard the equivalence bitmap.  In this case, copy
>> everything EXCEPT the bitmap and set the resulting equivalence bitmap to
>> NULL.
> 
> I think that's equally confusing.
Agreed.  The existence of the bitmaps to me indicates these objects
aren't really good candidates for operator overloads.

> 
>> It's also annoying to use ::ignore_equivs_equal_p().  Since that seems
>> to be the predominant way of comparing ranges, perhaps it should be the
>> default.
> 
> I think a good approach would be to isolate m_equiv more because it is
> really an implementation detail of the propagator.  Thus, make
> 
> class value_range_with_equiv : public value_range
> {
> ... all the equiv stuff..
> }
> 
> make the lattice of type value_range_with_equiv and see what tickles
> down.
> 
> value_range_with_equiv wouldn't implement copy and assignment
> (too expensive) and value_range can do with the trivial implementation.
> 
> And most consumers/workers can just work on the equiv-less variants.
Most likely yes based on my experiences last year with teasing bits of
this stuff apart.

jeff


Re: implement value_range::domain_p()

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 3:40 PM Aldy Hernandez  wrote:
>
>
>
> On 11/8/18 9:34 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 1:52 PM Aldy Hernandez  wrote:
> >>
> >> I believe I've seen this idiom more than once.  I know for sure I've
> >> used it in our ssa-range branch :).  I'll hunt for the other uses and
> >> adjust accordingly.
> >
> > domain_p?!  Isn't that the same as varying_p()?  Also
>
> Sigh, yes.  If we kept normalized value_ranges around where [MIN,MAX]
> degraded into VR_VARYING, then yes.  But alas, we have two ways of
> representing the entire domain.  Don't look at me.  That crap was
> already there :).
>
> Another approach would be to call ::set_and_canonicalize() before
> checking varying_p() and teach the canonicalize function that [MIN, MAX]
> is VR_VARYING.  How does that sound?

But that's still broken for the case where it has equivalences.  I fear that
if you look at users we'll end up with three or more different
varying_p/domain_p
things all being subtly different...

As said in the other thread we should see to separate equivs out
of the way.

> Aldy
>
> >
> > +  if (m_kind == VR_RANGE)
> > +{
> > +  tree type = TREE_TYPE (type ());
> > +  value_range vr (VR_RANGE, TYPE_MIN_VALUE (type), TYPE_MAX_VALUE 
> > (type));
> >
> > TYPE_MIN/MAX_VALUE is NULL for pointers
> >
> > +  return equal_p (vr, /*ignore_equivs=*/true);
> >
> > But equivs essentially are making the range smaller!  The range
> > is the intersection of itself with all equiv ranges!
> >
> > Richard.
> >
> > +}
> >
> >
> >> OK?


Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:49 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 3:31 PM Aldy Hernandez  wrote:




On 11/8/18 9:24 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:


This one's rather obvious and does not depend on any get_range_info API
change.

OK for trunk?


Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
do tem = *old_vr so you modify it in place with equiv_clear().


Good point.



Thus, operator= should be really deleted or mapped to value_range::set()
in which case tem = *old_vr would do useless bitmap allocation and
copying that you then clear.


Actually, I was thinking that perhaps the assignment and equality
operators should disregard the equivalence bitmap.  In this case, copy
everything EXCEPT the bitmap and set the resulting equivalence bitmap to
NULL.


I think that's equally confusing.


I don't think so.  When you ask whether range A and range B are equal, 
you're almost always interested in the range itself, not the equivalence 
table behind it.


We could also get rid of the == operator and just provide:

bool equal_p(bool ignore_equivs);

How does this sound?




It's also annoying to use ::ignore_equivs_equal_p().  Since that seems
to be the predominant way of comparing ranges, perhaps it should be the
default.


I think a good approach would be to isolate m_equiv more because it is
really an implementation detail of the propagator.  Thus, make

class value_range_with_equiv : public value_range
{
... all the equiv stuff..
}

make the lattice of type value_range_with_equiv and see what tickles
down.

value_range_with_equiv wouldn't implement copy and assignment
(too expensive) and value_range can do with the trivial implementation.

And most consumers/workers can just work on the equiv-less variants.


I like this.  Unfortunately, not feasible for this cycle (for me 
anyhow-- patches welcome though :)).  How about equal_p() as described 
above?


Aldy


Re: expr_not_equal_to: use value_range API

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 3:50 PM Aldy Hernandez  wrote:
>
>
>
> On 11/8/18 9:43 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 3:27 PM Aldy Hernandez  wrote:
> >>
> >>
> >>
> >> On 11/8/18 9:21 AM, Richard Biener wrote:
> >>> On Thu, Nov 8, 2018 at 1:09 PM Aldy Hernandez  wrote:
> 
>  All this nonsense:
> 
>  -  rtype = get_range_info (t, &min, &max);
>  -  if (rtype == VR_RANGE)
>  -   {
>  - if (wi::lt_p (max, w, TYPE_SIGN (TREE_TYPE (t
>  -   return true;
>  - if (wi::lt_p (w, min, TYPE_SIGN (TREE_TYPE (t
>  -   return true;
>  -   }
>  -  else if (rtype == VR_ANTI_RANGE
>  -  && wi::le_p (min, w, TYPE_SIGN (TREE_TYPE (t)))
>  -  && wi::le_p (w, max, TYPE_SIGN (TREE_TYPE (t
> 
>  Replaced by an API like Kutulu intended.
> 
>  +  get_range_info (t, vr);
>  +  if (!vr.may_contain_p (wide_int_to_tree (TREE_TYPE (t), w)))
> 
>  Ain't it grand?
> >>>
> >>> Well.  The not-so-grand thing is that you possibly ggc-allocate
> >>> three INTEGER_CST nodes here.
> >>
> >> Hmmm... I'd really prefer to use a simple API call, instead of having to
> >> twiddle with the extremes manually.  Ideally no one should be looking
> >> inside of a value_range.
> >>
> >> Do recommend another way of implementing may_contain_p ?
> >
> > I think many places dealing with get_range_info () should instead
> > work on the (to be created...) 1:1 copy of value_range ontop of
> > wide-int-range instead.
>
> I'd prefer to not expose that we're going to use wide_int or any other
> implementation to the users of get_range_info().

But it's exposed at the moment.  And I don't see it change.  And you
should not allocate memory for no good reason.

> >
> > So - can you add a wide_int_range class to wide-int-range.h
> > that implements the same (but with wide-ints rather than trees
> > obviously) API as value-range?
>
> Hmmm, I don't have time for this release cycle.  Perhaps something to be
> entertained for GCC+1?
>
> Again, I prefer my patch as is.  I cleans up the code, and keeps us from
> introducing problematic bugs.  Anything dealing with anti ranges is
> fraught with peril, as my cleanups to tree-vrp revealed.
>
> If using these INTEGER_CST's causes any measurable performance
> difference, I'd be happy to look into it.

Just leave the code unchanged then in this release?

Richard.

> Aldy


Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 3:54 PM Aldy Hernandez  wrote:
>
>
>
> On 11/8/18 9:49 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 3:31 PM Aldy Hernandez  wrote:
> >>
> >>
> >>
> >> On 11/8/18 9:24 AM, Richard Biener wrote:
> >>> On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:
> 
>  This one's rather obvious and does not depend on any get_range_info API
>  change.
> 
>  OK for trunk?
> >>>
> >>> Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
> >>> do tem = *old_vr so you modify it in place with equiv_clear().
> >>
> >> Good point.
> >>
> >>>
> >>> Thus, operator= should be really deleted or mapped to value_range::set()
> >>> in which case tem = *old_vr would do useless bitmap allocation and
> >>> copying that you then clear.
> >>
> >> Actually, I was thinking that perhaps the assignment and equality
> >> operators should disregard the equivalence bitmap.  In this case, copy
> >> everything EXCEPT the bitmap and set the resulting equivalence bitmap to
> >> NULL.
> >
> > I think that's equally confusing.
>
> I don't think so.  When you ask whether range A and range B are equal,
> you're almost always interested in the range itself, not the equivalence
> table behind it.
>
> We could also get rid of the == operator and just provide:
>
> bool equal_p(bool ignore_equivs);
>
> How does this sound?

Good.

> >
> >> It's also annoying to use ::ignore_equivs_equal_p().  Since that seems
> >> to be the predominant way of comparing ranges, perhaps it should be the
> >> default.
> >
> > I think a good approach would be to isolate m_equiv more because it is
> > really an implementation detail of the propagator.  Thus, make
> >
> > class value_range_with_equiv : public value_range
> > {
> > ... all the equiv stuff..
> > }
> >
> > make the lattice of type value_range_with_equiv and see what tickles
> > down.
> >
> > value_range_with_equiv wouldn't implement copy and assignment
> > (too expensive) and value_range can do with the trivial implementation.
> >
> > And most consumers/workers can just work on the equiv-less variants.
>
> I like this.  Unfortunately, not feasible for this cycle (for me
> anyhow-- patches welcome though :)).  How about equal_p() as described
> above?

Works for me but you still need to sort out the copying, so if you think
splitting is not feasible (I'll give it a try) then please disable assingment
and copy operators and fixup code.

Richard.

>
> Aldy


Fix PR middle-end/87916

2018-11-08 Thread Eric Botcazou
Since expand_thunk no longer overrides the DECL_IGNORED_P setting on the thunk 
from the front-end in every case, duplicate_thunk_for_node may create a new 
thunk without DECL_IGNORED_P set and this doesn't play with inlining and early 
debug info generation; fixed by forcing DECL_IGNORED_P as before in this case.

Bootstrapped/regtested on x86-64/Linux, applied on the mainline as obvious.


2018-11-08  Eric Botcazou  

PR middle-end/87916
* cgraphclones.c (duplicate_thunk_for_node): Also set DECL_IGNORED_P.


2018-11-08  Eric Botcazou  

* g++.dg/other/pr87916.C: New test.

-- 
Eric Botcazou// PR middle-end/87916
// Testcase by Martin Liška 

// { dg-do compile }
// { dg-options "-O2 -g" }
// { dg-additional-options "-fPIC" { target fpic } }

struct a {
  virtual ~a();
};
template  class c {
public:
  class d {
  public:
d(c);
b *operator->();
  };
};
int e, f;
class g {
public:
  class h {
  public:
virtual void j(g &, int &, bool) = 0;
  };
  c k();
  int *l();
  int *m();
};
int *g::l() try {
  for (c::d i(k());;)
i->j(*this, e, true);
} catch (int) {
  return 0;
}
int *g::m() try {
  for (c::d i(k());;)
i->j(*this, f, false);
} catch (int) {
  return 0;
}
struct n : a, g::h {
  void o();
  void j(g &, int &, bool) { o(); }
};
Index: cgraphclones.c
===
--- cgraphclones.c	(revision 265866)
+++ cgraphclones.c	(working copy)
@@ -321,6 +321,10 @@ duplicate_thunk_for_node (cgraph_node *t
 		   "artificial_thunk");
   SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl));
 
+  /* We need to force DECL_IGNORED_P because the new thunk is created after
+ early debug was run.  */
+  DECL_IGNORED_P (new_decl) = 1;
+
   new_thunk = cgraph_node::create (new_decl);
   set_new_clone_decl_and_node_flags (new_thunk);
   new_thunk->definition = true;


Re: GCC options for kernel live-patching (Was: Add a new option to control inlining only on static functions)

2018-11-08 Thread Jan Hubicka
> 2018-11-07  Martin Liska  
> 
>   * common.opt: Add -fipa-stack-alignment flag.
>   * doc/invoke.texi: Document it.
>   * final.c (rest_of_clean_state): Guard stack
>   shrinking with flag.

OK
> gcc/ChangeLog:
> 
> 2018-11-07  Martin Liska  
> 
>   * cgraph.h (ipa_discover_readonly_nonaddressable_vars): Rename
>   to ...
>   (ipa_discover_nonaddressable_vars): ... this.
>   * common.opt: Come up with new flag -fipa-reference-addressable.
>   * doc/invoke.texi: Document it.
>   * ipa-reference.c (propagate): Call the renamed fn.
>   * ipa-visibility.c (whole_program_function_and_variable_visibility):
>   Likewise.
>   * ipa.c (ipa_discover_readonly_nonaddressable_vars): Renamed to
>   ...
>   (ipa_discover_nonaddressable_vars): ... this.  Discove
>   non-addressable variables only with the newly added flag.
>   * opts.c: Enable the newly added flag with -O1 and higher
>   optimization level.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-11-07  Martin Liska  
> 
>   * gcc.dg/tree-ssa/writeonly-2.c: New test.
> ---
>  gcc/cgraph.h|  2 +-
>  gcc/common.opt  |  6 +-
>  gcc/doc/invoke.texi | 10 --
>  gcc/ipa-reference.c |  2 +-
>  gcc/ipa-visibility.c|  2 +-
>  gcc/ipa.c   | 11 +++
>  gcc/opts.c  |  1 +
>  gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c | 20 
>  8 files changed, 44 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c
> 
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index c13d79850fa..bf65d426cda 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -2403,7 +2403,7 @@ void record_references_in_initializer (tree, bool);
>  
>  /* In ipa.c  */
>  void cgraph_build_static_cdtor (char which, tree body, int priority);
> -bool ipa_discover_readonly_nonaddressable_vars (void);
> +bool ipa_discover_nonaddressable_vars (void);

I guess ipa_discover_variable_flags :)
>  
>  /* In varpool.c  */
>  tree ctor_for_folding (tree);
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 2971dc21b1f..6a64b0e27d5 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1718,7 +1718,11 @@ Perform Identical Code Folding for variables.
>  
>  fipa-reference
>  Common Report Var(flag_ipa_reference) Init(0) Optimization
> -Discover readonly and non addressable static variables.
> +Discover read-only and non addressable static variables.
> +
> +fipa-reference-addressable
> +Common Report Var(flag_ipa_reference_addressable) Init(0) Optimization
> +Discover read-only and write-only addressable variables.

I guess this should say
Discover read-only, write-only and non-addressable static variables 

OK with that change.
Honza

>  
>  fipa-matrix-reorg
>  Common Ignore
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index ae260c6ac6d..82c6fa913e8 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,8 +412,8 @@ Objective-C and Objective-C++ Dialects}.
>  -finline-functions  -finline-functions-called-once  -finline-limit=@var{n} 
> @gol
>  -finline-small-functions  -fipa-cp  -fipa-cp-clone @gol
>  -fipa-bit-cp -fipa-vrp @gol
> --fipa-pta  -fipa-profile  -fipa-pure-const  -fipa-reference  -fipa-icf @gol
> --fira-algorithm=@var{algorithm} @gol
> +-fipa-pta  -fipa-profile  -fipa-pure-const  -fipa-reference  
> -fipa-reference-addressable @gol
> +-fipa-icf  -fira-algorithm=@var{algorithm} @gol
>  -fira-region=@var{region}  -fira-hoist-pressure @gol
>  -fira-loop-pressure  -fno-ira-share-save-slots @gol
>  -fno-ira-share-spill-slots @gol
> @@ -7866,6 +7866,7 @@ compilation time.
>  -fipa-pure-const @gol
>  -fipa-profile @gol
>  -fipa-reference @gol
> +-fipa-reference-addressable @gol
>  -fmerge-constants @gol
>  -fmove-loop-invariants @gol
>  -fomit-frame-pointer @gol
> @@ -8895,6 +8896,11 @@ Discover which static variables do not escape the
>  compilation unit.
>  Enabled by default at @option{-O} and higher.
>  
> +@item -fipa-reference-addressable
> +@opindex fipa-reference-addressable
> +Discover read-only and write-only addressable variables.
> +Enabled by default at @option{-O} and higher.
> +
>  @item -fipa-pta
>  @opindex fipa-pta
>  Perform interprocedural pointer analysis and interprocedural modification
> diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
> index 43bbdae5d66..2cdce3cbfa6 100644
> --- a/gcc/ipa-reference.c
> +++ b/gcc/ipa-reference.c
> @@ -705,7 +705,7 @@ propagate (void)
>if (dump_file)
>  cgraph_node::dump_cgraph (dump_file);
>  
> -  remove_p = ipa_discover_readonly_nonaddressable_vars ();
> +  remove_p = ipa_discover_nonaddressable_vars ();
>generate_summary ();
>  
>/* Propagate the local information through the call graph to produce
> diff --git a/gcc/ipa-visibility.c b/gcc/ipa-visibility.c
> index 000207fa

Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:56 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 3:54 PM Aldy Hernandez  wrote:




On 11/8/18 9:49 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 3:31 PM Aldy Hernandez  wrote:




On 11/8/18 9:24 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:


This one's rather obvious and does not depend on any get_range_info API
change.

OK for trunk?


Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
do tem = *old_vr so you modify it in place with equiv_clear().


Good point.



Thus, operator= should be really deleted or mapped to value_range::set()
in which case tem = *old_vr would do useless bitmap allocation and
copying that you then clear.


Actually, I was thinking that perhaps the assignment and equality
operators should disregard the equivalence bitmap.  In this case, copy
everything EXCEPT the bitmap and set the resulting equivalence bitmap to
NULL.


I think that's equally confusing.


I don't think so.  When you ask whether range A and range B are equal,
you're almost always interested in the range itself, not the equivalence
table behind it.

We could also get rid of the == operator and just provide:

 bool equal_p(bool ignore_equivs);

How does this sound?


Good.




It's also annoying to use ::ignore_equivs_equal_p().  Since that seems
to be the predominant way of comparing ranges, perhaps it should be the
default.


I think a good approach would be to isolate m_equiv more because it is
really an implementation detail of the propagator.  Thus, make

class value_range_with_equiv : public value_range
{
... all the equiv stuff..
}

make the lattice of type value_range_with_equiv and see what tickles
down.

value_range_with_equiv wouldn't implement copy and assignment
(too expensive) and value_range can do with the trivial implementation.

And most consumers/workers can just work on the equiv-less variants.


I like this.  Unfortunately, not feasible for this cycle (for me
anyhow-- patches welcome though :)).  How about equal_p() as described
above?


Works for me but you still need to sort out the copying, so if you think
splitting is not feasible (I'll give it a try) then please disable assingment
and copy operators and fixup code.


Are you saying you'll try implementing value_range_with_equiv : 
value_range?  That would be of great help!


In the meantime I could provide equal_p(bool ignore_equivs) and perhaps 
copy(bool ignore_equivs), while disabling assignment and comparison 
operators.


Aldy


Re: implement value_range::domain_p()

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:53 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 3:40 PM Aldy Hernandez  wrote:




On 11/8/18 9:34 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:52 PM Aldy Hernandez  wrote:


I believe I've seen this idiom more than once.  I know for sure I've
used it in our ssa-range branch :).  I'll hunt for the other uses and
adjust accordingly.


domain_p?!  Isn't that the same as varying_p()?  Also


Sigh, yes.  If we kept normalized value_ranges around where [MIN,MAX]
degraded into VR_VARYING, then yes.  But alas, we have two ways of
representing the entire domain.  Don't look at me.  That crap was
already there :).

Another approach would be to call ::set_and_canonicalize() before
checking varying_p() and teach the canonicalize function that [MIN, MAX]
is VR_VARYING.  How does that sound?


But that's still broken for the case where it has equivalences.  I fear that
if you look at users we'll end up with three or more different
varying_p/domain_p
things all being subtly different...


Bah, I give up.  I don't have time to look into possible subtleties wrt 
equivalences right now.  I'll drop this patch.




As said in the other thread we should see to separate equivs out
of the way.


And as I meant to say in the other thread, I'll buy you many beers if 
you can do this for this release :).


Aldy


Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 4:00 PM Aldy Hernandez  wrote:
>
>
>
> On 11/8/18 9:56 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 3:54 PM Aldy Hernandez  wrote:
> >>
> >>
> >>
> >> On 11/8/18 9:49 AM, Richard Biener wrote:
> >>> On Thu, Nov 8, 2018 at 3:31 PM Aldy Hernandez  wrote:
> 
> 
> 
>  On 11/8/18 9:24 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:
> >>
> >> This one's rather obvious and does not depend on any get_range_info API
> >> change.
> >>
> >> OK for trunk?
> >
> > Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
> > do tem = *old_vr so you modify it in place with equiv_clear().
> 
>  Good point.
> 
> >
> > Thus, operator= should be really deleted or mapped to value_range::set()
> > in which case tem = *old_vr would do useless bitmap allocation and
> > copying that you then clear.
> 
>  Actually, I was thinking that perhaps the assignment and equality
>  operators should disregard the equivalence bitmap.  In this case, copy
>  everything EXCEPT the bitmap and set the resulting equivalence bitmap to
>  NULL.
> >>>
> >>> I think that's equally confusing.
> >>
> >> I don't think so.  When you ask whether range A and range B are equal,
> >> you're almost always interested in the range itself, not the equivalence
> >> table behind it.
> >>
> >> We could also get rid of the == operator and just provide:
> >>
> >>  bool equal_p(bool ignore_equivs);
> >>
> >> How does this sound?
> >
> > Good.
> >
> >>>
>  It's also annoying to use ::ignore_equivs_equal_p().  Since that seems
>  to be the predominant way of comparing ranges, perhaps it should be the
>  default.
> >>>
> >>> I think a good approach would be to isolate m_equiv more because it is
> >>> really an implementation detail of the propagator.  Thus, make
> >>>
> >>> class value_range_with_equiv : public value_range
> >>> {
> >>> ... all the equiv stuff..
> >>> }
> >>>
> >>> make the lattice of type value_range_with_equiv and see what tickles
> >>> down.
> >>>
> >>> value_range_with_equiv wouldn't implement copy and assignment
> >>> (too expensive) and value_range can do with the trivial implementation.
> >>>
> >>> And most consumers/workers can just work on the equiv-less variants.
> >>
> >> I like this.  Unfortunately, not feasible for this cycle (for me
> >> anyhow-- patches welcome though :)).  How about equal_p() as described
> >> above?
> >
> > Works for me but you still need to sort out the copying, so if you think
> > splitting is not feasible (I'll give it a try) then please disable 
> > assingment
> > and copy operators and fixup code.
>
> Are you saying you'll try implementing value_range_with_equiv :
> value_range?  That would be of great help!

Yes, I'll experiment with this.  Meanwhile noticed bogus

  /* We're going to need to unwind this range.  We can
 not use VR as that's a stack object.  We have to allocate
 a new range and push the old range onto the stack.  We
 also have to be very careful about sharing the underlying
 bitmaps.  Ugh.  */
  value_range *new_vr = vr_values->allocate_value_range ();
  *new_vr = vr;
  new_vr->equiv_clear ();

...

> In the meantime I could provide equal_p(bool ignore_equivs) and perhaps
> copy(bool ignore_equivs), while disabling assignment and comparison
> operators.

Yes, that sounds good.

Richard.

> Aldy


[Ada] Disable DECL_BIT_FIELD_REPRESENTATIVE machinery in some cases

2018-11-08 Thread Eric Botcazou
We can have quite convoluted layouts in Ada when a representation clause is 
given for a record type with variant part and this doesn't always play nice 
with the DECL_BIT_FIELD_REPRESENTATIVE machinery.  This patch arranges for 
DECL_BIT_FIELD_TYPE to be cleared on the variant part in these cases.

Tested on x86_64-suse-linux, applied on the mainline.


2018-11-08  Eric Botcazou  

* gcc-interface/decl.c (components_to_record): Remove obsolete kludge.
* gcc-interface/utils.c (make_packable_type): Set TYPE_PACKED on the
new type but do not take into account the setting on the old type for
the new fields.  Rename a local variable.
(finish_record_type): Clear DECL_BIT_FIELD_TYPE on a variant part at
offset 0, if any.
(create_field_decl): Tweak comment.

-- 
Eric BotcazouIndex: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 265866)
+++ gcc-interface/decl.c	(working copy)
@@ -8146,23 +8146,7 @@ components_to_record (Node_Id gnat_compo
 
   /* Chain the variant part at the end of the field list.  */
   if (gnu_variant_part)
-{
-  /* We make an exception if the variant part is at offset 0, has a fixed
-	 size, and there is a single rep'ed field placed after it because, in
-	 this case, there is an obvious order of increasing position.  */
-  if (variants_have_rep
-	  && TREE_CODE (DECL_SIZE_UNIT (gnu_variant_part)) == INTEGER_CST
-	  && gnu_rep_list
-	  && gnu_field_list == gnu_rep_list
-	  && !tree_int_cst_lt (DECL_FIELD_OFFSET (gnu_rep_list),
-			   DECL_SIZE_UNIT (gnu_variant_part)))
-	{
-	  DECL_CHAIN (gnu_variant_part) = gnu_field_list;
-	  gnu_field_list = gnu_variant_part;
-	}
-  else
-	gnu_field_list = chainon (gnu_field_list, gnu_variant_part);
-}
+gnu_field_list = chainon (gnu_field_list, gnu_variant_part);
 
   if (cancel_alignment)
 SET_TYPE_ALIGN (gnu_record_type, 0);
Index: gcc-interface/utils.c
===
--- gcc-interface/utils.c	(revision 265866)
+++ gcc-interface/utils.c	(working copy)
@@ -973,6 +973,7 @@ make_packable_type (tree type, bool in_r
  Note that we rely on the pointer equality created here for
  TYPE_NAME to look through conversions in various places.  */
   TYPE_NAME (new_type) = TYPE_NAME (type);
+  TYPE_PACKED (new_type) = 1;
   TYPE_JUSTIFIED_MODULAR_P (new_type) = TYPE_JUSTIFIED_MODULAR_P (type);
   TYPE_CONTAINS_TEMPLATE_P (new_type) = TYPE_CONTAINS_TEMPLATE_P (type);
   TYPE_REVERSE_STORAGE_ORDER (new_type) = TYPE_REVERSE_STORAGE_ORDER (type);
@@ -1018,7 +1019,7 @@ make_packable_type (tree type, bool in_r
   for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
 {
   tree new_field_type = TREE_TYPE (field);
-  tree new_field, new_size;
+  tree new_field, new_field_size;
 
   if (RECORD_OR_UNION_TYPE_P (new_field_type)
 	  && !TYPE_FAT_POINTER_P (new_field_type)
@@ -1034,14 +1035,15 @@ make_packable_type (tree type, bool in_r
 	  && !TYPE_FAT_POINTER_P (new_field_type)
 	  && !TYPE_CONTAINS_TEMPLATE_P (new_field_type)
 	  && TYPE_ADA_SIZE (new_field_type))
-	new_size = TYPE_ADA_SIZE (new_field_type);
+	new_field_size = TYPE_ADA_SIZE (new_field_type);
   else
-	new_size = DECL_SIZE (field);
+	new_field_size = DECL_SIZE (field);
 
+  /* This is a layout with full representation, alignment and size clauses
+	 so we simply pass 0 as PACKED like gnat_to_gnu_field in this case.  */
   new_field
 	= create_field_decl (DECL_NAME (field), new_field_type, new_type,
-			 new_size, bit_position (field),
-			 TYPE_PACKED (type),
+			 new_field_size, bit_position (field), 0,
 			 !DECL_NONADDRESSABLE_P (field));
 
   DECL_INTERNAL_P (new_field) = DECL_INTERNAL_P (field);
@@ -1896,6 +1898,14 @@ finish_record_type (tree record_type, tr
 	DECL_BIT_FIELD (field) = 0;
 	}
 
+  /* Clear DECL_BIT_FIELD_TYPE for a variant part at offset 0, it's simply
+	 not supported by the DECL_BIT_FIELD_REPRESENTATIVE machinery because
+	 the variant part is always the last field in the list.  */
+  if (DECL_INTERNAL_P (field)
+	  && TREE_CODE (TREE_TYPE (field)) == QUAL_UNION_TYPE
+	  && integer_zerop (pos))
+	DECL_BIT_FIELD_TYPE (field) = NULL_TREE;
+
   /* If we still have DECL_BIT_FIELD set at this point, we know that the
 	 field is technically not addressable.  Except that it can actually
 	 be addressed if it is BLKmode and happens to be properly aligned.  */
@@ -2725,9 +2735,9 @@ create_field_decl (tree name, tree type,
 	size = round_up (size, BITS_PER_UNIT);
 }
 
-  /* If we may, according to ADDRESSABLE, make a bitfield if a size is
+  /* If we may, according to ADDRESSABLE, make a bitfield when the size is
  specified for two reasons: first if the size differs from the natural
- size.  Second, if the alignment is insufficient.  There are a number of
+ size; second, if the a

Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-11-08 Thread Peter Bergner
On 11/8/18 4:57 AM, Renlin Li wrote:
> I think I found the problem!
> 
> As described in the PR, a hard register is used in
> an pre/post modify expression. The hard register is live, but updated.
> In this case, we should make it conflicting with all pseudos live at
> that point.  Does it make sense?
[snip]
> It fixes the ICE of mis-compiled arm-linux-gnueabihf toolchain described in 
> the
> PR.
> 
> I attached the patch for discussion.  I haven't give a complete test on arm or
> any other targets, yet. (Probably need more adjusting)

Yes, this is the problem.  We see from the dump, that r2040 does not conflict 
with
hard reg r1:

;; a2040(r1597,l0) conflicts: 
;; total conflict hard regs:
;; conflict hard regs:

...and we have the following RTL:

(insn 10905 179 199 2 (set (reg:SI 1 r1)
(reg/f:SI 2040)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
 (nil))

...

(insn 208 202 182 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 1 r1)
(plus:SI (reg:SI 1 r1)
(const_int 12 [0xc]))) [129 loop_nest.m_vec+0 S4 A32])
(reg:SI 1048)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
 (expr_list:REG_INC (reg:SI 1 r1)
(nil)))

...

(insn 191 189 192 2 (set (mem/f/c:SI (plus:SI (reg/f:SI 2040)
(const_int 8 [0x8])) [367 ddrs_table+0 S4 A32])
(reg/f:SI 1047)) "../../gcc/gcc/tree-loop-distribution.c":2741 647 
{*thumb2_movsi_vfp}
 (nil))

So my patch caused us to (correctly) skip adding a conflict between r1 and
r2040 due to the register copy in insn 10905.  However, they really should
conflict as you found due to the definition of r1 in insn 208 and the fact
we don't add one is a latent bug in LRA.  I think your patch is on the right
track, but not totally there yet.  Imagine instead that the references to r1
and r2040 were swapped, so instead we have:

(insn 10905 179 199 2 (set (reg:SI 2040)
(reg/f:SI 1 r1)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
 (nil))

...

(insn 208 202 182 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 2040)
(plus:SI (reg:SI 2040)
(const_int 12 [0xc]))) [129 loop_nest.m_vec+0 S4 A32])
(reg:SI 1048)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
 (expr_list:REG_INC (reg:SI 2040)
(nil)))

...

(insn 191 189 192 2 (set (mem/f/c:SI (plus:SI (reg/f:SI 1 r1)
(const_int 8 [0x8])) [367 ddrs_table+0 S4 A32])
(reg/f:SI 1047)) "../../gcc/gcc/tree-loop-distribution.c":2741 647 
{*thumb2_movsi_vfp}
 (nil))

Even with your patch, we'd miss adding the conflict between r1 and r2040.
Let me think about how we should solve this one.

And a *BIG* thank you for tracking down the problem!!!

Peter



Re: [PATCH] Remove extra memory allocation of strings.

2018-11-08 Thread James Greenhalgh
On Tue, Oct 23, 2018 at 08:17:43AM -0500, Martin Liška wrote:
> Hello.
> 
> As a follow up patch I would like to remove redundant string allocation
> on string which is not needed in my opinion.
> 
> That bootstrap on aarch64-linux.


OK,

Thanks,
James

> From a21a626055442635057985323bb42ef29526e182 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Mon, 22 Oct 2018 15:18:23 +0200
> Subject: [PATCH] Remove extra memory allocation of strings.
> 
> gcc/ChangeLog:
> 
> 2018-10-22  Martin Liska  
> 
>   * config/aarch64/aarch64.c (aarch64_parse_arch): Do not copy
>   string to a stack buffer.
>   (aarch64_parse_cpu): Likewise.
>   (aarch64_parse_tune): Likewise.


Re: implement value_range::domain_p()

2018-11-08 Thread Richard Biener
On Thu, Nov 8, 2018 at 4:05 PM Aldy Hernandez  wrote:
>
>
>
> On 11/8/18 9:53 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 3:40 PM Aldy Hernandez  wrote:
> >>
> >>
> >>
> >> On 11/8/18 9:34 AM, Richard Biener wrote:
> >>> On Thu, Nov 8, 2018 at 1:52 PM Aldy Hernandez  wrote:
> 
>  I believe I've seen this idiom more than once.  I know for sure I've
>  used it in our ssa-range branch :).  I'll hunt for the other uses and
>  adjust accordingly.
> >>>
> >>> domain_p?!  Isn't that the same as varying_p()?  Also
> >>
> >> Sigh, yes.  If we kept normalized value_ranges around where [MIN,MAX]
> >> degraded into VR_VARYING, then yes.  But alas, we have two ways of
> >> representing the entire domain.  Don't look at me.  That crap was
> >> already there :).
> >>
> >> Another approach would be to call ::set_and_canonicalize() before
> >> checking varying_p() and teach the canonicalize function that [MIN, MAX]
> >> is VR_VARYING.  How does that sound?
> >
> > But that's still broken for the case where it has equivalences.  I fear that
> > if you look at users we'll end up with three or more different
> > varying_p/domain_p
> > things all being subtly different...
>
> Bah, I give up.  I don't have time to look into possible subtleties wrt
> equivalences right now.  I'll drop this patch.
>
> >
> > As said in the other thread we should see to separate equivs out
> > of the way.
>
> And as I meant to say in the other thread, I'll buy you many beers if
> you can do this for this release :).

Well, yall made a mess out of the nicely contained VRP, and topped
it with C++.

Now it's ... a mess.

And for whatever reason all the C++-ification and refactoring had to happen
for GCC 9 :/

> Aldy


Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-08 Thread Jeff Law
On 11/8/18 8:14 AM, Richard Biener wrote:
> On Thu, Nov 8, 2018 at 4:00 PM Aldy Hernandez  wrote:
>>
>>
>>
>> On 11/8/18 9:56 AM, Richard Biener wrote:
>>> On Thu, Nov 8, 2018 at 3:54 PM Aldy Hernandez  wrote:



 On 11/8/18 9:49 AM, Richard Biener wrote:
> On Thu, Nov 8, 2018 at 3:31 PM Aldy Hernandez  wrote:
>>
>>
>>
>> On 11/8/18 9:24 AM, Richard Biener wrote:
>>> On Thu, Nov 8, 2018 at 1:17 PM Aldy Hernandez  wrote:

 This one's rather obvious and does not depend on any get_range_info API
 change.

 OK for trunk?
>>>
>>> Hmm, no - that's broken.  IIRC m_equiv are shared bitmaps if you
>>> do tem = *old_vr so you modify it in place with equiv_clear().
>>
>> Good point.
>>
>>>
>>> Thus, operator= should be really deleted or mapped to value_range::set()
>>> in which case tem = *old_vr would do useless bitmap allocation and
>>> copying that you then clear.
>>
>> Actually, I was thinking that perhaps the assignment and equality
>> operators should disregard the equivalence bitmap.  In this case, copy
>> everything EXCEPT the bitmap and set the resulting equivalence bitmap to
>> NULL.
>
> I think that's equally confusing.

 I don't think so.  When you ask whether range A and range B are equal,
 you're almost always interested in the range itself, not the equivalence
 table behind it.

 We could also get rid of the == operator and just provide:

  bool equal_p(bool ignore_equivs);

 How does this sound?
>>>
>>> Good.
>>>
>
>> It's also annoying to use ::ignore_equivs_equal_p().  Since that seems
>> to be the predominant way of comparing ranges, perhaps it should be the
>> default.
>
> I think a good approach would be to isolate m_equiv more because it is
> really an implementation detail of the propagator.  Thus, make
>
> class value_range_with_equiv : public value_range
> {
> ... all the equiv stuff..
> }
>
> make the lattice of type value_range_with_equiv and see what tickles
> down.
>
> value_range_with_equiv wouldn't implement copy and assignment
> (too expensive) and value_range can do with the trivial implementation.
>
> And most consumers/workers can just work on the equiv-less variants.

 I like this.  Unfortunately, not feasible for this cycle (for me
 anyhow-- patches welcome though :)).  How about equal_p() as described
 above?
>>>
>>> Works for me but you still need to sort out the copying, so if you think
>>> splitting is not feasible (I'll give it a try) then please disable 
>>> assingment
>>> and copy operators and fixup code.
>>
>> Are you saying you'll try implementing value_range_with_equiv :
>> value_range?  That would be of great help!
> 
> Yes, I'll experiment with this.  Meanwhile noticed bogus
> 
>   /* We're going to need to unwind this range.  We can
>  not use VR as that's a stack object.  We have to allocate
>  a new range and push the old range onto the stack.  We
>  also have to be very careful about sharing the underlying
>  bitmaps.  Ugh.  */
>   value_range *new_vr = vr_values->allocate_value_range ();
>   *new_vr = vr;
>   new_vr->equiv_clear ();
I hate this hunk of code.  In fact, it's probably the biggest wart from
the classification & simplification work last year.

The problem is the local stack object can't be pushed onto the unwinding
stack -- it'll be destroyed when we leave scope and we end up popping
total junk later when we restore the range.

You also have to be careful because of the bitmap sharing.

Jeff


Re: implement value_range::domain_p()

2018-11-08 Thread Aldy Hernandez




On 11/8/18 10:24 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 4:05 PM Aldy Hernandez  wrote:




On 11/8/18 9:53 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 3:40 PM Aldy Hernandez  wrote:




On 11/8/18 9:34 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:52 PM Aldy Hernandez  wrote:


I believe I've seen this idiom more than once.  I know for sure I've
used it in our ssa-range branch :).  I'll hunt for the other uses and
adjust accordingly.


domain_p?!  Isn't that the same as varying_p()?  Also


Sigh, yes.  If we kept normalized value_ranges around where [MIN,MAX]
degraded into VR_VARYING, then yes.  But alas, we have two ways of
representing the entire domain.  Don't look at me.  That crap was
already there :).

Another approach would be to call ::set_and_canonicalize() before
checking varying_p() and teach the canonicalize function that [MIN, MAX]
is VR_VARYING.  How does that sound?


But that's still broken for the case where it has equivalences.  I fear that
if you look at users we'll end up with three or more different
varying_p/domain_p
things all being subtly different...


Bah, I give up.  I don't have time to look into possible subtleties wrt
equivalences right now.  I'll drop this patch.



As said in the other thread we should see to separate equivs out
of the way.


And as I meant to say in the other thread, I'll buy you many beers if
you can do this for this release :).


Well, yall made a mess out of the nicely contained VRP, and topped
it with C++.

Now it's ... a mess.


You wish!  VRP has been a mess for quite a long time.



And for whatever reason all the C++-ification and refactoring had to happen
for GCC 9 :/


You're gonna absolutely love what we have in store for GCC 10 ;-).

Aldy


Re: expr_not_equal_to: use value_range API

2018-11-08 Thread Aldy Hernandez




On 11/8/18 9:55 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 3:50 PM Aldy Hernandez  wrote:




On 11/8/18 9:43 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 3:27 PM Aldy Hernandez  wrote:




On 11/8/18 9:21 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 1:09 PM Aldy Hernandez  wrote:


All this nonsense:

-  rtype = get_range_info (t, &min, &max);
-  if (rtype == VR_RANGE)
-   {
- if (wi::lt_p (max, w, TYPE_SIGN (TREE_TYPE (t
-   return true;
- if (wi::lt_p (w, min, TYPE_SIGN (TREE_TYPE (t
-   return true;
-   }
-  else if (rtype == VR_ANTI_RANGE
-  && wi::le_p (min, w, TYPE_SIGN (TREE_TYPE (t)))
-  && wi::le_p (w, max, TYPE_SIGN (TREE_TYPE (t

Replaced by an API like Kutulu intended.

+  get_range_info (t, vr);
+  if (!vr.may_contain_p (wide_int_to_tree (TREE_TYPE (t), w)))

Ain't it grand?


Well.  The not-so-grand thing is that you possibly ggc-allocate
three INTEGER_CST nodes here.


Hmmm... I'd really prefer to use a simple API call, instead of having to
twiddle with the extremes manually.  Ideally no one should be looking
inside of a value_range.

Do recommend another way of implementing may_contain_p ?


I think many places dealing with get_range_info () should instead
work on the (to be created...) 1:1 copy of value_range ontop of
wide-int-range instead.


I'd prefer to not expose that we're going to use wide_int or any other
implementation to the users of get_range_info().


But it's exposed at the moment.  And I don't see it change.  And you
should not allocate memory for no good reason.



So - can you add a wide_int_range class to wide-int-range.h
that implements the same (but with wide-ints rather than trees
obviously) API as value-range?


Hmmm, I don't have time for this release cycle.  Perhaps something to be
entertained for GCC+1?

Again, I prefer my patch as is.  I cleans up the code, and keeps us from
introducing problematic bugs.  Anything dealing with anti ranges is
fraught with peril, as my cleanups to tree-vrp revealed.

If using these INTEGER_CST's causes any measurable performance
difference, I'd be happy to look into it.


Just leave the code unchanged then in this release?


Ok.


Re: [PATCH, testsuite]: Use int128 effective target for gcc.dg/pr87874.c

2018-11-08 Thread Christophe Lyon
On Thu, 8 Nov 2018 at 15:11, Uros Bizjak  wrote:
>
> On Thu, Nov 8, 2018 at 1:29 PM Christophe Lyon
>  wrote:
> >
> > On Wed, 7 Nov 2018 at 16:50, Uros Bizjak  wrote:
> > >
> > > 2018-11-07  Uros Bizjak  
> > >
> > > * gcc.dg/pr87874.c: Compile only for int128 effective target.
> > >
> > > Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.
> > >
> >
> > This doesn't work on aarch64 -mabi=ilp32:
> > /gcc/testsuite/gcc.dg/pr87874.c: In function 'em':
> > /gcc/testsuite/gcc.dg/pr87874.c:19:50: warning: conversion from 'long
> > long unsigned int' to 'long unsigned int' changes value from
> > '18446744073709551615' to '4294967295' [-Woverflow]
> > FAIL: gcc.dg/pr87874.c (test for excess errors)
>
> Does attached patch works for you?
>

Yes, thanks.

> Uros.


Re: introduce --enable-mingw-full32 to default to --large-address-aware

2018-11-08 Thread JonY
On 11/08/2018 09:45 AM, Alexandre Oliva wrote:
> On Nov  7, 2018, JonY <10wa...@gmail.com> wrote:
> 
>> On 11/07/2018 08:34 AM, Alexandre Oliva wrote:
>>> On Nov  1, 2018, JonY wrote:
>>>
 Looks like it causes an error on 64bit:
 /usr/libexec/gcc/x86_64-w64-mingw32/ld: unrecognized option
 '--large-address-aware'
>>>
>>> What does?  The patch I suggested?  The current trunk?
>>>
>>> What was the command in this case?  How was the toolchain configured?
> 
>> No it's just a quick test to see how x86_64-w64-mingw32 reacts to
>> --large-address-aware, it doesn't play well.
> 
> I understand, but you're getting a different result from what I am, I'd
> like to understand why before attempting to "fix" something that AFAICT
> is correct and behaving just as intended.  Maybe there are valid reasons
> to drop --large-address-aware altogether on x86_64-w64-mingw32, but I'd
> like to understand why, as in, how the error above came about for you,
> when it didn't for me.
> 
> I built gcc and binutils in a single tree, with x86_64-w64-mingw32 as
> the target, and the resulting linker would accept --large-address-aware
> in the -mi386-pe emulation, and that emulation was explicitly enabled
> when building -m32 binaries.  Does your w64 toolchain deviate from any
> of these facts?
> 

No, no. By quick I just mean using -Wl,--large-address-aware on an
existing gcc install, nothing complex. Sorry about not making it clear.

> 
>> x86_64-mingw32 is not used as far as I know, only with "w64" or "pc".
> 
> x86_64-mingw32's canonical form is x86_64-pc-mingw32, and they're
> equivalent, so whatever you say or know about the latter should apply to
> the shorter form as well.  Likewise, my questions and doubts about the
> shorter form apply equally to the canonical one.
> 
>> The "w64" carries a special meaning to gcc dating back to the early
>> 64bit port. It basically tells gcc to use mingw-w64 specific features
>> that are not found on the regular mingw.org CRT at the time.
> 
> I see.  So -pc- and -w64- are not supposed to be equivalent indeed.
> Thanks.
> 
>> This might be affecting the "pc" vendor build, can you check
>> x86_64-pc-mingw32 just to see if it is affected?
> 
> I did.  Its -m32 mode seems broken to me.  As I wrote, the linker
> emulation is never specified, so it would build 64-bit binaries even
> when given -m32.  Because it lacks explicit linker emulation
> specification, it can't have --large-address-aware specified either.
> 
> In my link tests, i686-mingw32 and x86_64-w64-mingw32 both worked (in
> that their respective linkers wouldn't reject the --large-address-aware
> passed by gcc) with the --large-address-aware patch that is installed in
> trunk.  Ditto with the incremental patch I posted last week, that would
> have only improved x86_64-pc-mingw32.
> 

I now understand the problem, thanks for the clarification about the
patch. Patch is OK.



signature.asc
Description: OpenPGP digital signature


Re: [PATCH, ARM, ping2] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-08 Thread Kyrill Tkachov

Hi Thomas,

On 08/11/18 09:52, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme
 wrote:

Ping?

Best regards,

Thomas
On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme
 wrote:

Hi,

Please find updated patch to fix PR85434: spilling of stack protector
guard's address on ARM. Quite a few changes have been made to the ARM
part since last round of review so I think it makes more sense to
review it anew. Ran bootstrap + regression testsuite + glibc build +
glibc regression testsuite for Arm and Thumb-2 and bootstrap +
regression testsuite for Thumb-1. GCC's regression testsuite was run
in 3 configurations in all those cases:

- default configuration (no RUNTESTFLAGS)
- with -fstack-protector-all
- with -fPIC -fstack-protector-all (to exercise both codepath in stack
protector's split code)

None of this show any regression beyond some new scan fail with
-fstack-protector-all or -fPIC due to unexpected code sequence for the
testcases concerned and some guality swing due to less optimization
with new stack protector on.

Patch description and ChangeLog below.

In case of high register pressure in PIC mode, address of the stack
protector's guard can be spilled on ARM targets as shown in PR85434,
thus allowing an attacker to control what the canary would be compared
against. ARM does lack stack_protect_set and stack_protect_test insn
patterns, defining them does not help as the address is expanded
regularly and the patterns only deal with the copy and test of the
guard with the canary.

This problem does not occur for x86 targets because the PIC access and
the test can be done in the same instruction. Aarch64 is exempt too
because PIC access insn pattern are mov of UNSPEC which prevents it from
the second access in the epilogue being CSEd in cse_local pass with the
first access in the prologue.

The approach followed here is to create new "combined" set and test
standard pattern names that take the unexpanded guard and do the set or
test. This allows the target to use an opaque pattern (eg. using UNSPEC)
to hide the individual instructions being generated to the compiler and
split the pattern into generic load, compare and branch instruction
after register allocator, therefore avoiding any spilling. This is here
implemented for the ARM targets. For targets not implementing these new
standard pattern names, the existing stack_protect_set and
stack_protect_test pattern names are used.

To be able to split PIC access after register allocation, the functions
had to be augmented to force a new PIC register load and to control
which register it loads into. This is because sharing the PIC register
between prologue and epilogue could lead to spilling due to CSE again
which an attacker could use to control what the canary gets compared
against.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-10-26  Thomas Preud'homme  

* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.  Insert in the stream of insns if
possible.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.  Use pic_reg if non null instead of
cached one.
(arm_load_pic_register): Add pic_reg parameter and use it if non null.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to require_pic_register prototype change.
(arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
(thumb1_expand_prologue): Likewise.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
(arm_load_pic_register): Likewise.
* config/arm/predicated.md (guard_addr_operand): New predicate.
(guard_operand): New predicate.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
prototype change.
(stack_protect_combined_set): New expander..
(stack_protect_combined_set_insn): New insn_and_split pattern.
(stack_protect_set_insn): New insn pattern.
(stack_protect_combined_test): New expander.
(stack_protect_combined_test_insn): New insn_and_split pattern.
(arm_stack_protect_test_insn): New insn pattern.
* config/arm/thumb1.md (thumb1_stack_protect_test_insn): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test):

[Ada] Fix wrong code for loops with convoluted control flow

2018-11-08 Thread Eric Botcazou
This is a regression present since the -faggressive-loop-optimizations option 
was introduced, leading to wrong code in some cases for "while" loops with 
convoluted control flow.  It's caused by a bad interaction between three 
different things: specific support we have in gigi for -fnon-call-exceptions,
loop invariant motion (-ftree-loop-im) and -faggressive-loop-optimizations.

In light of this, we have decided to tidy up this area in gigi in order to 
make the compiler more robust in default mode while losing nothing in terms of 
run-time performance in -gnatp mode.

Tested on x86_64-suse-linux, applied on the mainline, 8 and 7 branches.


2018-11-08  Eric Botcazou  

* fe.h (Suppress_Checks): Declare.
* gcc-interface/misc.c (gnat_init_gcc_eh): Set -fnon-call-exceptions
only if checks are not suppressed and -faggressive-loop-optimizations
only if they are.
* gcc-interface/trans.c (struct loop_info_d): Remove has_checks and
warned_aggressive_loop_optimizations fields.
(gigi): Do not clear warn_aggressive_loop_optimizations here.
(Raise_Error_to_gnu): Do not set has_checks.
(gnat_to_gnu) : Remove support for aggressive
loop optimizations.


2018-11-08  Eric Botcazou  

* gnat.dg/null_pointer_deref1.adb: Remove -gnatp and add pragma.
* gnat.dg/null_pointer_deref2.adb: Likewise.
* gnat.dg/null_pointer_deref3.adb: Likewise.
* gnat.dg/opt74.adb: New test.
* gnat.dg/opt74_pkg.ad[sb]: New helper.
* gnat.dg/warn12.adb: Delete.
* gnat.dg/warn12_pkg.ads: Likewise.


-- 
Eric BotcazouIndex: ada/fe.h
===
--- ada/fe.h	(revision 265866)
+++ ada/fe.h	(working copy)
@@ -193,6 +193,7 @@ extern Boolean In_Same_Source_Unit
 #define GNAT_Mode  opt__gnat_mode
 #define List_Representation_Info   opt__list_representation_info
 #define No_Strict_Aliasing_CP  opt__no_strict_aliasing
+#define Suppress_Checksopt__suppress_checks
 
 typedef enum {
   Front_End_SJLJ, Back_End_ZCX, Back_End_SJLJ
@@ -207,6 +208,7 @@ extern Boolean Generate_SCO_Instance_Tab
 extern Boolean GNAT_Mode;
 extern Int List_Representation_Info;
 extern Boolean No_Strict_Aliasing_CP;
+extern Boolean Suppress_Checks;
 
 #define ZCX_Exceptionsopt__zcx_exceptions
 #define SJLJ_Exceptions   opt__sjlj_exceptions
Index: ada/gcc-interface/misc.c
===
--- ada/gcc-interface/misc.c	(revision 265866)
+++ ada/gcc-interface/misc.c	(working copy)
@@ -392,7 +392,7 @@ gnat_init_gcc_eh (void)
   using_eh_for_cleanups ();
 
   /* Turn on -fexceptions, -fnon-call-exceptions and -fdelete-dead-exceptions.
- The first one triggers the generation of the necessary exception tables.
+ The first one activates the support for exceptions in the compiler.
  The second one is useful for two reasons: 1/ we map some asynchronous
  signals like SEGV to exceptions, so we need to ensure that the insns
  which can lead to such signals are correctly attached to the exception
@@ -402,10 +402,18 @@ gnat_init_gcc_eh (void)
  for such calls to actually raise in Ada.
  The third one is an optimization that makes it possible to delete dead
  instructions that may throw exceptions, most notably loads and stores,
- as permitted in Ada.  */
+ as permitted in Ada.
+ Turn off -faggressive-loop-optimizations because it may optimize away
+ out-of-bound array accesses that we want to be able to catch.
+ If checks are disabled, we use the same settings as the C++ compiler.  */
   flag_exceptions = 1;
-  flag_non_call_exceptions = 1;
   flag_delete_dead_exceptions = 1;
+  if (!Suppress_Checks)
+{
+  flag_non_call_exceptions = 1;
+  flag_aggressive_loop_optimizations = 0;
+  warn_aggressive_loop_optimizations = 0;
+}
 
   init_eh ();
 }
Index: ada/gcc-interface/trans.c
===
--- ada/gcc-interface/trans.c	(revision 265866)
+++ ada/gcc-interface/trans.c	(working copy)
@@ -198,8 +198,6 @@ struct GTY(()) loop_info_d {
   tree high_bound;
   vec *checks;
   bool artificial;
-  bool has_checks;
-  bool warned_aggressive_loop_optimizations;
 };
 
 typedef struct loop_info_d *loop_info;
@@ -679,10 +677,6 @@ gigi (Node_Id gnat_root,
   /* Now translate the compilation unit proper.  */
   Compilation_Unit_to_gnu (gnat_root);
 
-  /* Disable -Waggressive-loop-optimizations since we implement our own
- version of the warning.  */
-  warn_aggressive_loop_optimizations = 0;
-
   /* Then process the N_Validate_Unchecked_Conversion nodes.  We do this at
  the very end to avoid having to second-guess the front-end when we run
  into dummy nodes during the regular processing.  */
@@ -5720,7 +5714,6 @@ Raise_Error_to_gnu (Node_Id gnat_node, t
 	  rci->inserte

Re: cleanups and unification of value_range dumping code

2018-11-08 Thread Aldy Hernandez



On 11/8/18 9:39 AM, Richard Biener wrote:

On Thu, Nov 8, 2018 at 2:32 PM Aldy Hernandez  wrote:


[Richard, you're right.  An overloaded debug() is better than
this->dump().  Anyone who thinks different is wrong.  End of discussion.]

There's no reason to have multiple ways of dumping a value range.  And
I'm not even talking about ipa-prop.* which decided that they should
implement their own version of value_ranges basically to annoy me.
Attached is a patch to remedy this.

I have made three small user-visible changes to the dumping code--
cause, well... you know me... I can't leave things alone.

1. It *REALLY* irks me that bools are dumped with -INF/+INF.  Everyone
knows that bools should be 0/1.  It's in the Bible.  Look it up.

2. Analogous to #1, what's up with signed 1-bit fields?  Who understands
[0, +INF] as [0, 0]?  No one; that's right.  (It's still beyond me the
necessity of signed bit fields in the standard period, but I'll pick my
battles.)

3. I find it quite convenient to display the tree type along with the
range as:

 [1, 1] int EQUIVALENCES { me, myself, I }


the type inbetween the range and equivalences?  seriously?  If so then
_please_ dump the type before the range:

   int [1, 1] EQUIV { ... }


Done in attached patch.




Finally.. As mentioned, I've implement an overloaded debug() because
it's *so* nice to use gdb and an alias of:

 define dd
   call debug($arg0)
 end

...and have stuff just work.

I do realize that we still have dump() lying around.  At some point, we
should start dropping external access to the dump methods for objects
that are never dumped by the compiler (but by the user at debug time).


+DEBUG_FUNCTION void
+debug (const value_range *vr)
+{
+  dump_value_range (stderr, vr);
+}
+
+DEBUG_FUNCTION void
+debug (const value_range vr)

^^^ missing a & (const reference?)

+{
+  dump_value_range (stderr, &vr);
+}


Fixed.



also

@@ -2122,21 +2122,11 @@ dump_ssaname_info (pretty_printer *buffer,
tree node, int spc)
if (!POINTER_TYPE_P (TREE_TYPE (node))
&& SSA_NAME_RANGE_INFO (node))
  {
-  wide_int min, max, nonzero_bits;
-  value_range_kind range_type = get_range_info (node, &min, &max);
+  value_range vr;

-  if (range_type == VR_VARYING)
-   pp_printf (buffer, "# RANGE VR_VARYING");
-  else if (range_type == VR_RANGE || range_type == VR_ANTI_RANGE)
-   {
- pp_printf (buffer, "# RANGE ");
- pp_printf (buffer, "%s[", range_type == VR_RANGE ? "" : "~");
- pp_wide_int (buffer, min, TYPE_SIGN (TREE_TYPE (node)));
- pp_printf (buffer, ", ");
- pp_wide_int (buffer, max, TYPE_SIGN (TREE_TYPE (node)));
- pp_printf (buffer, "]");
-   }
-  nonzero_bits = get_nonzero_bits (node);
+  get_range_info (node, vr);

this is again allocating INTEGER_CSTs for no good reason.  dumping really
shuld avoid possibly messing with the GC state.


H, I'd really hate to have two distinct places that dump value 
ranges.  Is this really a problem?  Cause the times I've run into GC 
problems I'd had to resort to printf() anyhow...


And while we're on the subject of multiple implementations, I'm thinking 
of rewriting ipa-prop to actually use value_range's, not this:


struct GTY(()) ipa_vr
{
  /* The data fields below are valid only if known is true.  */
  bool known;
  enum value_range_kind type;
  wide_int min;
  wide_int max;
};

so perhaps:

struct { bool known; value_range vr; }

Remember that trees are cached, whereas wide_int's are not, and folks 
(not me :)) like to complain that wide_int's take a lot of space.


Would you be viscerally opposed to implementing ipa_vr with 
value_range's?  I really hate having to deal with yet another 
value_range variant.


Aldy
* gimple-pretty-print.c (dump_ssaname_info): Use value_range
dumper to dump range information.
* tree-vrp.c (value_range::dump_range): New.
(value_range::dump): Add variant for dumping to pretty_printer.
Adjust FILE * version accordingly.
(debug_value_range): Rename to debug.

Remove unused prototypes: dump_value_range, debug_value_range,
dump_all_value_ranges, dump_vr_equiv, debug_vr_equiv.
* tree-vrp.h (value_range::dump): Add new variant that takes
pretty_printer.
(value_range::dump_range): New.

diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 7dfec9120ab..73af30ba92f 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -2122,21 +2122,11 @@ dump_ssaname_info (pretty_printer *buffer, tree node, int spc)
   if (!POINTER_TYPE_P (TREE_TYPE (node))
   && SSA_NAME_RANGE_INFO (node))
 {
-  wide_int min, max, nonzero_bits;
-  value_range_kind range_type = get_range_info (node, &min, &max);
+  value_range vr;
 
-  if (range_type == VR_VARYING)
-	pp_printf (buffer, "# RANGE VR_VARYING");
- 

Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-11-08 Thread Renlin Li

Hi Peter,

On 11/08/2018 03:21 PM, Peter Bergner wrote:

On 11/8/18 4:57 AM, Renlin Li wrote:

I think I found the problem!

As described in the PR, a hard register is used in
an pre/post modify expression. The hard register is live, but updated.
In this case, we should make it conflicting with all pseudos live at
that point.  Does it make sense?

[snip]

It fixes the ICE of mis-compiled arm-linux-gnueabihf toolchain described in the
PR.

I attached the patch for discussion.  I haven't give a complete test on arm or
any other targets, yet. (Probably need more adjusting)


Yes, this is the problem.  We see from the dump, that r2040 does not conflict 
with
hard reg r1:

;; a2040(r1597,l0) conflicts: 
;; total conflict hard regs:
;; conflict hard regs:

I think you should look for axxx(r2040, ..)?

Maybe I am wrong (not an expert of RA), from what I observed, it is the LRA
makes the code more complex. It decides to split the live range and spill r2040.
It creates multiple instructions to reload it.
r2944 in LRA dump is the register which starts to go wrong. It is assigned as 
r1.


  Creating newreg=2944 from oldreg=2040, assigning class GENERAL_REGS to 
inheritance r2944
Original reg change 2040->2944 (bb2):
 10905: r1:SI=r2944:SI
Add inheritance<-original before:
 12868: r2944:SI=r2040:SI

The dump is the final state of LRA. I debug it with gdb, and there are some 
temporary steps
which is not observable in the final dump.



...and we have the following RTL:

(insn 10905 179 199 2 (set (reg:SI 1 r1)
 (reg/f:SI 2040)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
  (nil))

...

(insn 208 202 182 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 1 r1)
 (plus:SI (reg:SI 1 r1)
 (const_int 12 [0xc]))) [129 loop_nest.m_vec+0 S4 A32])
 (reg:SI 1048)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
  (expr_list:REG_INC (reg:SI 1 r1)
 (nil)))

...

(insn 191 189 192 2 (set (mem/f/c:SI (plus:SI (reg/f:SI 2040)
 (const_int 8 [0x8])) [367 ddrs_table+0 S4 A32])
 (reg/f:SI 1047)) "../../gcc/gcc/tree-loop-distribution.c":2741 647 
{*thumb2_movsi_vfp}
  (nil))

So my patch caused us to (correctly) skip adding a conflict between r1 and
r2040 due to the register copy in insn 10905.  However, they really should
conflict as you found due to the definition of r1 in insn 208 and the fact
we don't add one is a latent bug in LRA.  I think your patch is on the right
track, but not totally there yet.  Imagine instead that the references to r1
and r2040 were swapped, so instead we have:

(insn 10905 179 199 2 (set (reg:SI 2040)
 (reg/f:SI 1 r1)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
  (nil))

...

(insn 208 202 182 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 2040)
 (plus:SI (reg:SI 2040)
 (const_int 12 [0xc]))) [129 loop_nest.m_vec+0 S4 A32])
 (reg:SI 1048)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
  (expr_list:REG_INC (reg:SI 2040)
 (nil)))

...

(insn 191 189 192 2 (set (mem/f/c:SI (plus:SI (reg/f:SI 1 r1)
 (const_int 8 [0x8])) [367 ddrs_table+0 S4 A32])
 (reg/f:SI 1047)) "../../gcc/gcc/tree-loop-distribution.c":2741 647 
{*thumb2_movsi_vfp}
  (nil))

Even with your patch, we'd miss adding the conflict between r1 and r2040.
Let me think about how we should solve this one.


Yes, I am not confident the patch will be the ultimate fix to the problem.



And a *BIG* thank you for tracking down the problem!!!


Nop.

Regards,
Renlin

Peter



Re: [PATCH] Verify that last argument of __builtin_expect_with_probability is a real cst (PR c/87811).

2018-11-08 Thread Martin Sebor

On 11/07/2018 02:36 AM, Martin Liška wrote:

On 11/5/18 7:00 PM, Martin Sebor wrote:

On 11/01/2018 07:45 AM, Martin Liška wrote:

On 11/1/18 1:15 PM, Jakub Jelinek wrote:

On Thu, Nov 01, 2018 at 01:09:16PM +0100, Martin Liška wrote:

-range 0.0 to 1.0, inclusive.
+range 0.0 to 1.0, inclusive.  The @var{probability} argument must be
+a compiler time constant.


When you say must, I think error_at should be used rather than warning_at.
If others disagree I'm open for leaving it as is.


Error is fine for me as well.




@@ -2474,6 +2481,11 @@ expr_expected_value_1 (tree type, tree op0, enum 
tree_code code,
   *predictor = PRED_BUILTIN_EXPECT_WITH_PROBABILITY;
   *probability = probi;
 }
+  else
+  warning_at (gimple_location (def), 0,
+  "probability argument %qE must be a in the "
+  "range 0.0 to 1.0", prob);


Wrong indentation.

And, no diagnostics for -O0 (which should also be covered by a testcase).


Test for that added.




+/* { dg-options "-O2 -fdump-tree-profile_estimate -frounding-math" } */


Why the -frounding-math options?


I remember I had some issue with:
  tree r = fold_build2_initializer_loc (UNKNOWN_LOCATION,
MULT_EXPR, t, prob, base);

on targets with a non-IEEE floating point arithmetics (s390?).

 I think test

coverage should handle both that and when that option is not used
if that option makes any difference.


It will eventually pop up if we install new tests w/o rounding math.



Jakub




Martin



I noticed a few minor issues in the hunks below:

--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -12046,7 +12046,8 @@
 when testing pointer or floating-point values.

 This function has the same semantics as @code{__builtin_expect},
 but the caller provides the expected probability that @var{exp} == @var{c}.
 The last argument, @var{probability}, is a floating-point value in the
-range 0.0 to 1.0, inclusive.
+range 0.0 to 1.0, inclusive.  The @var{probability} argument must be
+a compiler time constant.

The term is "compile-time constant" but please see below.

--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -2467,6 +2467,13 @@
 expr_expected_value_1 (tree type, tree op0, enum tree_code code,
   base = build_real_from_int_cst (t, base);
   tree r = fold_build2_initializer_loc (UNKNOWN_LOCATION,
 MULT_EXPR, t, prob, base);
+  if (TREE_CODE (r) != REAL_CST)
+{
+  error_at (gimple_location (def),
+"probability argument %qE must be a compile "
+"time constant", prob);
+  return NULL;
}

According to GCC coding conventions, when used as an adjective
the term "compile-time" should be hyphenated.  But the term used
in other diagnostics is either "constant integer" or "constant
integer expressions" so I would suggest to use it instead, here
and in the manual.

@@ -2474,6 +2481,11 @@
 expr_expected_value_1 (tree type, tree op0, enum tree_code code,
   *predictor = PRED_BUILTIN_EXPECT_WITH_PROBABILITY;
   *probability = probi;
 }
+  else
+error_at (gimple_location (def),
+  "probability argument %qE must be a in the "
+  "range 0.0 to 1.0", prob);
+

There's a stray 'a' in the text of the error.

But it's not really meaningful to say

  3.14 must be in the range 0.0 to 1.0

because that simply cannot happen.  We could say "argument 2 must
be in the range" but I would instead suggest to rephrase the error
along the same lines as other similar messages GCC already issues:

  "probability %qE is outside the range [0.0, 1.0]"

Martin


Hi Martin.

Thanks for help with the wording. Please take a look at attached patch
candidate.


Looks good!

Thanks
Martin


Re: [patches] Re: [PATCH] Update soft-fp from glibc.

2018-11-08 Thread Joseph Myers
On Thu, 8 Nov 2018, Kito Cheng wrote:

> Hi Joseph:
> 
> I don't have commit right, could you help me to commit that, thanks :)

Done.

-- 
Joseph S. Myers
jos...@codesourcery.com


[committed][MSP430][0/4] "Obvious" fixes to GCC tests for msp430-elf

2018-11-08 Thread Jozef Lawrynowicz

I've committed some "obvious" changes to GCC tests that fix failures when
running the GCC testsuite for msp430-elf.

Patch 1 adds a couple of new regexp strings to gcc-dg-prune to catch messages
emitted by GNU ld, when the size of an output section is too large for a memory
region, or the memory region overflows.

Patch 2 skips tests expecting NULL pointer checks to be deleted
with the -fdelete-null-pointer-checks flag, for targets which ignore this flag.

Patch 3 extends the regex in some tests which scan the assembler output for
"nop". On MSP430, the nop instruction is an uppercase "NOP".

Patch 4 fixes the calculation of USHRT_MAX in some tests, to prevent integer
overflow for targets where sizeof(short) == sizeof(int). i.e.
  -#define USHRT_MAX (SHRT_MAX * 2 + 1)
  +#define USHRT_MAX (SHRT_MAX * 2U + 1)

Succesfully regtested:
- gcc, g++, objc, gfortran for x86_64-pc-linux-gnu
- gcc, g++ for msp430-elf
- gcc for avr-elf

Applied to trunk.



[committed][MSP430][1/4] Add regexp for memory region overflow to gcc-dg-pune

2018-11-08 Thread Jozef Lawrynowicz

Patch 1 adds a couple of new regexp strings to gcc-dg-prune to catch messages
emitted by GNU ld, when the size of an output section is too large for a memory
region, or the memory region overflows.


>From af810d86c47092e56590f5c13d0633c490f53c9d Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Tue, 6 Nov 2018 12:49:36 +
Subject: [PATCH 1/4] [TESTSUITE] prune regex

2018-11-08  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	* lib/gcc-dg.exp (gcc-dg-prune): Add new regexps for when
	size of an output section is too large for a memory region.
---
 gcc/testsuite/lib/gcc-dg.exp | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index c33a50c..305dd3c 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -394,6 +394,14 @@ proc gcc-dg-prune { system text } {
 return "::unsupported::memory full"
 }
 
+if [regexp "(^|\n)\[^\n\]* section.*will not fit in region" $text] {
+	return "::unsupported::memory full"
+}
+
+if [regexp "(^|\n)\[^\n\]* region.*overflowed by" $text] {
+	return "::unsupported::memory full"
+}
+
 # Likewise, if we see ".text exceeds local store range" or
 # similar.
 if {[string match "spu-*" $system] && \
-- 
2.7.4



[committed][testsuite][MSP430][2/4] Skip tests requiring -fdelete-null-pointer-checks for targets which don't support this flag

2018-11-08 Thread Jozef Lawrynowicz

Patch 2 skips tests expecting NULL pointer checks to be deleted
with the -fdelete-null-pointer-checks flag, for targets which ignore this flag.

>From 55b405a4d2c694ffdbcbd6808e139d88cb2d2447 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Tue, 6 Nov 2018 12:59:33 +
Subject: [PATCH 2/4] [TESTSUITE] nullptr checks

2018-11-08  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	* c-c++-common/pr27336.c: Skip test if the target keeps null pointer
	checks.
	* gcc.dg/addr_equal-1.c: Likewise.
	* gcc.dg/tree-ssa/pr78154.c: Likewise.
	* gcc.dg/tree-ssa/vrp111.c: Likewise.
---
 gcc/testsuite/c-c++-common/pr27336.c| 1 +
 gcc/testsuite/gcc.dg/addr_equal-1.c | 1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr78154.c | 1 +
 gcc/testsuite/gcc.dg/tree-ssa/vrp111.c  | 1 +
 4 files changed, 4 insertions(+)

diff --git a/gcc/testsuite/c-c++-common/pr27336.c b/gcc/testsuite/c-c++-common/pr27336.c
index ce68559..4ecc232 100644
--- a/gcc/testsuite/c-c++-common/pr27336.c
+++ b/gcc/testsuite/c-c++-common/pr27336.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdelete-null-pointer-checks -fdump-tree-vrp1" } */
+/* { dg-skip-if "" { keeps_null_pointer_checks } } */
 
 struct B { int x; };
 extern void g3(struct B *that)  __attribute__((nonnull));
diff --git a/gcc/testsuite/gcc.dg/addr_equal-1.c b/gcc/testsuite/gcc.dg/addr_equal-1.c
index 70fa354..18b0dc9 100644
--- a/gcc/testsuite/gcc.dg/addr_equal-1.c
+++ b/gcc/testsuite/gcc.dg/addr_equal-1.c
@@ -4,6 +4,7 @@
 /* { dg-require-alias "" } */
 /* { dg-options "-O2 -fdelete-null-pointer-checks" } */
 /* { dg-skip-if "" { powerpc-ibm-aix* } } */
+/* { dg-skip-if "function pointers can be NULL" { keeps_null_pointer_checks } } */
 void abort (void);
 extern int undef_var0, undef_var1;
 extern __attribute__ ((weak)) int weak_undef_var0;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr78154.c b/gcc/testsuite/gcc.dg/tree-ssa/pr78154.c
index b561503..3ba8f64 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr78154.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr78154.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-evrp-slim -fdelete-null-pointer-checks" } */
+/* { dg-skip-if "" { keeps_null_pointer_checks } } */
 
 void f(void *d, const void *s, __SIZE_TYPE__ n)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp111.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp111.c
index 6314423..cae2bc7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp111.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp111.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-evrp -fdelete-null-pointer-checks" } */
+/* { dg-skip-if "" { keeps_null_pointer_checks } } */
 
 void foo (void *p) __attribute__((nonnull(1)));
 
-- 
2.7.4



  1   2   >