Re: [RFC] New pragma exec_charset

2017-10-20 Thread Richard Biener
On Thu, Oct 19, 2017 at 7:13 PM, Martin Sebor  wrote:
> On 10/19/2017 09:50 AM, Andreas Krebbel wrote:
>>
>> The TPF operating system uses the GCC S/390 backend.  They set an
>> EBCDIC exec charset for compilation using -fexec-charset.  However,
>> certain libraries require ASCII strings instead.  In order to be able
>> to put calls to that library into the normal code it is required to
>> switch the exec charset within a compilation unit.
>>
>> This is an attempt to implement it by adding a new pragma which could
>> be used like in the following example:
>>
>> int
>> foo ()
>> {
>>   call_with_utf8("hello world");
>>
>> #pragma GCC exec_charset("UTF16")
>>   call_with_utf16("hello world");
>>
>> #pragma GCC exec_charset(pop)
>>   call_with_utf8("hello world");
>> }
>>
>> Does this look reasonable?
>
>
> I'm not an expert on this but at a high level it looks reasonable
> to me.  But based on some small amount of work I did in this area
> I have a couple of questions.
>
> There are a few places in the compiler that already do or that
> should but don't yet handle different execution character sets.
> The former include built-ins like __builtin_isdigit() and
> __builtin_sprintf (in both builtins.c and gimple-ssa-sprintf.c).
> The latter is the -Wformat checking done by the C and C++ front
> ends.  The missing support for the latter is the subject of bug
> 38308.  According to bug 81686, LTO is apparently also missing
> support for exec-charset.
>
> I'm curious how the pragma might interact with these two areas,
> and whether the lack of support for it in the latter is a concern
> (and if not, why not).  For the former, I'm also wondering about
> the interaction of inlining and other interprocedural optimizations
> with the pragma.  Does it propagate through inlined calls as one
> would expect?

How does it work semantically to have different exec charsets?  That is,
if "strings" flow from a region with one -fexec-charset setting to a region
with another one is that undefined behavior?  Do we now require
external function declarations to be in the proper region (declared under
the appropriate exec charset flag)?  This would mean that passing
the exec charset in effect as additional argument isn't a possibility.

Or do we have to treat -fexec-charset similar to -frounding-math, that is,
we can't ever _interpret_ any string in the compiler?  [unless -fexec-charset
is the same everywhere]
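To make the concern about interpreting strings concrete, here is a minimal sketch (not GCC code; the helper names are made up for illustration) of why a call such as isdigit cannot be folded on a constant byte without knowing the exec charset.  The decimal digits are 0x30..0x39 in ASCII but 0xF0..0xF9 in the common EBCDIC code pages (e.g. 037 and 1047).

```c
#include <stdbool.h>

/* Hypothetical helpers, not part of GCC: classify a byte as a decimal
   digit under two different execution character sets.  */

/* ASCII: '0'..'9' are 0x30..0x39.  */
bool
is_digit_ascii (unsigned char c)
{
  return c >= 0x30 && c <= 0x39;
}

/* EBCDIC (code pages such as 037 and 1047): '0'..'9' are 0xF0..0xF9.  */
bool
is_digit_ebcdic (unsigned char c)
{
  return c >= 0xF0 && c <= 0xF9;
}
```

The same byte gets a different answer depending on the charset, so any folding has to know which one the string was translated to.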

I think the -frounding-math route is probably the easiest (and wisest
given the quite low test coverage we'll get) route.  Thus, add a -fmixed-charset
flag and reject any exec-charset attribute/pragma if that flag is not set?
With LTO we could always add this and/or merge -fexec-charset flags
appropriately, injecting -fmixed-charset in case TUs use different settings.

Richard.


> Thanks
> Martin
>


Re: [RFC] New pragma exec_charset

2017-10-20 Thread Jakub Jelinek
On Fri, Oct 20, 2017 at 09:48:38AM +0200, Richard Biener wrote:
> How does it work semantically to have different exec charsets?  That is,
> if "strings" flow from a region with one -fexec-charset setting to a region
> with another one is that undefined behavior?  Do we now require
> external function declarations to be in the proper region (declared under
> the appropriate exec charset flag)?  This would mean that passing
> the exec charset in effect as additional argument isn't a possibility.
> 
> Or do we have to treat -fexec-charset similar to -frounding-math, that is,
> we can't ever _interpret_ any string in the compiler?  [unless -fexec-charset
> is the same everywhere]
> 
> I think the -frounding-math route is probably the easiest (and wisest
> given the quite low test coverage we'll get) route.  Thus, add a -fmixed-charset
> flag and reject any exec-charset attribute/pragma if that flag is not set?
> With LTO we could always add this and/or merge -fexec-charset flags
> appropriately, injecting -fmixed-charset in case TUs use different settings.

It wouldn't have to be an option, simply mark in cfun all functions that
have more than one exec charset and give up on all optimizations/warnings
that require to read the characters and merge that unknown exec_charset
flag during inlining etc.  Though, that might still not be enough, e.g.
the whole function might have one exec charset, but a global const char []
variable might have another one and during optimization we might be looking
at that.  So perhaps it would need to be a per-TU flag merged during LTO.
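A rough sketch of the merge rule described above, assuming a hypothetical per-function or per-TU record (none of these names exist in GCC): once two different exec charsets meet, the result is marked "mixed" and stays that way, and anything that wants to read string contents has to check the flag first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of the exec charset(s) seen in a function or TU.
   NAME is NULL when nothing has been recorded yet.  */
struct charset_info
{
  const char *name;
  bool mixed;
};

/* Merge SRC into DST, e.g. when inlining or when combining TUs for LTO.
   Once two different charsets meet, the result is "mixed" and stays so.  */
void
merge_charset_info (struct charset_info *dst, const struct charset_info *src)
{
  if (src->mixed)
    dst->mixed = true;
  else if (src->name != NULL)
    {
      if (dst->name == NULL)
        dst->name = src->name;
      else if (strcmp (dst->name, src->name) != 0)
        dst->mixed = true;
    }
}

/* Optimizations and warnings that read characters would check this.  */
bool
can_interpret_strings (const struct charset_info *info)
{
  return !info->mixed;
}
```

This only models the bookkeeping; it does not answer the question of global variables or strings flowing between functions.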

Jakub


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-20 Thread Richard Biener
On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor  wrote:
> On 10/19/2017 02:34 AM, Richard Biener wrote:
>>
>> On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor  wrote:
>>>
>>> On 10/18/2017 04:48 AM, Richard Biener wrote:


 On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor  wrote:
>
>
> While testing my latest -Wrestrict changes I noticed a number of
> opportunities to improve the -Warray-bounds warning.  Attached
> is a patch that implements a solution for the following subset
> of these:
>
> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>   bounds index into string literal
> PR tree-optimization/82588 - missing -Warray-bounds on an excessively
>   large index
> PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
>   inner indices
>
>
>> I meant to use size_type_node (size_t), not sizetype.  But
>> I just checked that ptrdiff_type_node is initialized in
>> build_common_tree_nodes and thus always available.
>
>
> I see.  Using ptrdiff_type_node is preferable for the targets
> where ptrdiff_t has a greater precision than size_t (e.g., VMS).
> It makes sense now.  I should remember to change all the other
> places where I introduced ssizetype to use ptrdiff_type_node.
>
>>
>>> As an aside, at some point I would like to get away from a type
>>> based limit in all these warnings and instead use one that can
>>> be controlled by an option so that a user can impose a lower limit
>>> on the maximum size of an object and have all size-related warnings
>>> (and perhaps even optimizations) enforce it and benefit from it.
>>
>>
>> You could add a --param that is initialized from ptrdiff_type_node.
>
>
> Yes, that's an option to consider.  Thanks.
>
>
>>
 +  tree arg = TREE_OPERAND (ref, 0);
 +  tree_code code = TREE_CODE (arg);
 +  if (code == COMPONENT_REF)
 +    {
 +      HOST_WIDE_INT off;
 +      if (tree base = get_addr_base_and_unit_offset (ref, &off))
 +        up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1,
 +                                   TYPE_SIZE_UNIT (TREE_TYPE (base)));
 +      else
 +        return;

 so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be false).
 simply not subtracting anything instead of returning would be conservatively
 correct, no?  Likewise subtracting the offset of the array for all "previous"
 variably indexed components with assuming the lowest value for the index.
 But as above I think compensating for the offset of the array within the
 object is academic ... ;)
>>>
>>>
>>>
>>> I was going to say yes (it gives up) but on second thought I don't
>>> think it does.  Only the major index can be unbounded and the code
>>> does consider the size of the sub-array when checking the major
>>> index.  So, IIUC, I think this works correctly as is (*).  What
>>> doesn't work is VLAs but those are a separate problem.  Let me
>>> know if I misunderstood your question.
>>
>>
>> get_addr_base_and_unit_offset will return NULL if there's any variable
>> component in 'ref'.  So as written it seems to be dead code (you
>> want to pass 'arg'?)
>
>
> Sorry, I'm not sure I understand what you mean.  What do you think
> is dead code?  The call to get_addr_base_and_unit_offset() is also
> made for an array of unspecified bound (up_bound is null) and for
> an array at the end of a struct.  For those the function returns
> non-null, and for the others (arrays of runtime bound) it returns
> null.  (I passed arg instead of ref but I see no difference in
> my tests.)

If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg') it will
return the offset of 'c'.  If you pass a.b[j].c it will still return NULL.
You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset.  The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a HWI
you'll instead get offset == 0 and max_size == -1.
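The bit-offset limitation can be illustrated with a small standalone sketch.  It is not the GCC implementation; it assumes a 64-bit signed integer as a stand-in for HOST_WIDE_INT and a made-up helper name, and only shows where a byte offset stops being representable once it is scaled to bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a byte offset -> bit offset conversion held
   in a signed 64-bit integer (assumed here to match HOST_WIDE_INT).
   Returns false when the bit offset does not fit.  */
bool
byte_offset_to_bits (int64_t bytes, int64_t *bits)
{
  if (bytes < 0 || bytes > INT64_MAX / 8)
    return false;
  *bits = bytes * 8;
  return true;
}
```

Only offsets in the top three bits' worth of the byte range are lost, which is why this matters just for very large structures.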

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).

Thus instead of

+  tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
+
+  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);
+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+    {
+      HOST_WIDE_INT off;
+      if (tree base = get_addr_base_and_unit_offset (arg, &off))
+        {
+          tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));

(not sure why y

Re: [patch 0/5] nios2 address mode improvements

2017-10-20 Thread Richard Biener
On Fri, Oct 20, 2017 at 4:03 AM, Sandra Loosemore
 wrote:
> This is the set of nios2 optimization patches that I've previously
> mentioned in these threads:
>
> https://gcc.gnu.org/ml/gcc/2017-10/msg00016.html
> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00957.html
>
> To give an overview of what this is for
>
> The nios2 backend currently generates quite bad code for memory
> accesses with addresses involving symbolic constants.  Like a typical
> RISC machine, nios2 requires splitting such 32-bit constants into
> HIGH/LO_SUM pairs.  Currently this happens in expand, and address
> expressions involving such constants are always converted to use a
> register indirect form.
>
> One part of the problem is that the backend currently doesn't
> recognize that LO_SUM is a legitimate address form (it's register
> indirect with a constant offset using the %lo relocation).  That's
> fixed in these patches.
>
> A harder problem is that doing the high/lo_sum splitting in expand
> inhibits subsequent optimizations.  One such problem arises when you
> have accesses to multiple fields in a static structure object.  Expand
> sees this as many (symbol + offset) expressions involving the same
> symbol with different constant offsets.  What we should be doing in
> that case is CSE'ing the symbol address computation rather than
> splitting every such expression individually.
>
> This patch series attacks that problem by deferring splitting to the
> split1 pass, which happens after cse and fwprop optimizations.
> Deferring the splitting also requires that TARGET_LEGITIMATE_ADDRESS_P
> accept these symbolic constant expressions until the splitting takes
> place, and that code that might generate 32-bit constants in other
> places (e.g., the movsi expander) must not do so after they are
> supposed to have been split.

How do other targets handle this situation?  Naively I'd have handled
the splitting at reload/LRA time ... (which would make the flag
to test reload_completed)

There are quite a number of targets using lo_sum but I'm not sure they
share the issue with symbolic constants.

Otherwise deferring the splitting of course looks like the correct thing to do.

Richard.

> This patch series also includes general improvements to the cost model
> to get better CSE results -- in particular, the nios2 backend has been
> completely missing an implementation for TARGET_ADDRESS_COST.  I also found
> that making TARGET_LEGITIMIZE_ADDRESS smarter resulted in better
> address cost modeling by the ivopts pass.
>
> All together, this resulted in about a 7% code size improvement on the
> customer-provided test case I was using for tuning purposes.
>
> Patches in this set are broken down as follows:
>
> 1: Switch to LRA.
> 2: Detect when splitting has been completed.
> 3: Add splitters and recognize the new address modes.
> 4: Cost model improvements.
> 5: Test cases.
>
> Part 2 is the piece that relates to the discussion linked above.  As
> implemented, it works fine, but it's maybe not the best design.  I'll
> hold off on committing the entire set for at least a few days in case
> somebody wants to suggest a better solution.
>
> -Sandra
>


Re: [patch 0/5] nios2 address mode improvements

2017-10-20 Thread Richard Biener
On Fri, Oct 20, 2017 at 10:12 AM, Richard Biener
 wrote:
> On Fri, Oct 20, 2017 at 4:03 AM, Sandra Loosemore
>  wrote:
>> This is the set of nios2 optimization patches that I've previously
>> mentioned in these threads:
>>
>> https://gcc.gnu.org/ml/gcc/2017-10/msg00016.html
>> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00957.html
>>
>> To give an overview of what this is for
>>
>> The nios2 backend currently generates quite bad code for memory
>> accesses with addresses involving symbolic constants.  Like a typical
>> RISC machine, nios2 requires splitting such 32-bit constants into
>> HIGH/LO_SUM pairs.  Currently this happens in expand, and address
>> expressions involving such constants are always converted to use a
>> register indirect form.
>>
>> One part of the problem is that the backend currently doesn't
>> recognize that LO_SUM is a legitimate address form (it's register
>> indirect with a constant offset using the %lo relocation).  That's
>> fixed in these patches.
>>
>> A harder problem is that doing the high/lo_sum splitting in expand
>> inhibits subsequent optimizations.  One such problem arises when you
>> have accesses to multiple fields in a static structure object.  Expand
>> sees this as many (symbol + offset) expressions involving the same
>> symbol with different constant offsets.  What we should be doing in
>> that case is CSE'ing the symbol address computation rather than
>> splitting every such expression individually.
>>
>> This patch series attacks that problem by deferring splitting to the
>> split1 pass, which happens after cse and fwprop optimizations.
>> Deferring the splitting also requires that TARGET_LEGITIMATE_ADDRESS_P
>> accept these symbolic constant expressions until the splitting takes
>> place, and that code that might generate 32-bit constants in other
>> places (e.g., the movsi expander) must not do so after they are
>> supposed to have been split.
>
> How do other targets handle this situation?  Naively I'd have handled
> the splitting at reload/LRA time ... (which would make the flag
> to test reload_completed)
>
> There are quite a number of targets using lo_sum but I'm not sure they
> share the issue with symbolic constants.

sparc for example has in sparc_legitimate_address_p:

  /* During reload, accept the HIGH+LO_SUM construct generated by
     sparc_legitimize_reload_address.  */
  if (reload_in_progress
      && GET_CODE (rs1) == HIGH
      && XEXP (rs1, 0) == imm1)
    return 1;

and it seems that sparc_legitimize_reload_address performs the splitting
(sparc uses LRA now so this part looks like dead code to me -- maybe LRA
can do this magically somehow but I see nios2 still uses reload so the
code may be a recipe to follow).

Richard.

> Otherwise deferring the splitting of course looks like the correct thing to do.
>
> Richard.
>
>> This patch series also includes general improvements to the cost model
>> to get better CSE results -- in particular, the nios2 backend has been
>> completely missing an implementation for TARGET_ADDRESS_COST.  I also found
>> that making TARGET_LEGITIMIZE_ADDRESS smarter resulted in better
>> address cost modeling by the ivopts pass.
>>
>> All together, this resulted in about a 7% code size improvement on the
>> customer-provided test case I was using for tuning purposes.
>>
>> Patches in this set are broken down as follows:
>>
>> 1: Switch to LRA.
>> 2: Detect when splitting has been completed.
>> 3: Add splitters and recognize the new address modes.
>> 4: Cost model improvements.
>> 5: Test cases.
>>
>> Part 2 is the piece that relates to the discussion linked above.  As
>> implemented, it works fine, but it's maybe not the best design.  I'll
>> hold off on committing the entire set for at least a few days in case
>> somebody wants to suggest a better solution.
>>
>> -Sandra
>>


Re: [PATCH][PR target/19201] Peephole to improve clearing items in structure for m68k

2017-10-20 Thread Andreas Schwab
On Dec 13 2015, Jeff Law  wrote:

> diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
> index 1eaf58f..444515a 100644
> --- a/gcc/config/m68k/m68k.md
> +++ b/gcc/config/m68k/m68k.md
> @@ -7601,3 +7601,36 @@
>  
>  (include "cf.md")
>  (include "sync.md")
> +
> +;; Convert
> +;;
> +;;   move.l 4(%a0),%a0
> +;;   clr.b (%a0,%a1.l)
> +;;
> +;; into
> +;;
> +;;   add.l 4(%a0),%a1
> +;;   clr.b (%a1)
> +;;
> +;; The latter is smaller.  It is faster on all models except m68060.
> +
> +(define_peephole2
> +  [(set (match_operand:SI 0 "register_operand" "")
> + (mem:SI (plus:SI (match_operand:SI 1 "register_operand" "")
> +  (match_operand:SI 2 "const_int_operand" ""))))
> +   (set (mem:QI (plus:SI (match_operand:SI 3 "register_operand" "")
> +  (match_operand:SI 4 "register_operand" "")))
> + (const_int 0))]
> +  "(optimize_size || !TUNE_68060)
> +   && (operands[0] == operands[3] || operands[0] == operands[4])
> +   && ADDRESS_REG_P (operands[1])
> +   && ADDRESS_REG_P ((operands[0] == operands[3]) ? operands[4] : operands[3])

Shouldn't that use rtx_equal_p?
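For readers less familiar with the distinction: comparing operands with == tests whether they are the same object, while rtx_equal_p compares structure.  The following is only a self-contained illustration with a made-up miniature type, not real rtx code, and it does not settle whether the operands in this peephole can actually be distinct objects naming the same register.

```c
#include <stdbool.h>

/* Hypothetical miniature of a register expression; real GCC rtx objects
   are far richer.  */
struct reg_expr
{
  int mode;
  unsigned regno;
};

/* Identity: true only when both pointers name the same object.  */
bool
same_object_p (const struct reg_expr *a, const struct reg_expr *b)
{
  return a == b;
}

/* Structural equality, in the spirit of rtx_equal_p: two distinct objects
   that describe the same register compare equal.  */
bool
same_register_p (const struct reg_expr *a, const struct reg_expr *b)
{
  return a->mode == b->mode && a->regno == b->regno;
}
```

If two separately allocated expressions can describe the same register, only the structural test would let the peephole condition match.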

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


RE: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in Intel AVX512 configuration

2017-10-20 Thread Shalnov, Sergey
I can't propose a general solution since TARGET_PREFER256 is AVX512 specific.
Sorry for the misunderstanding.
Sergey

-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On 
Behalf Of Kirill Yukhin
Sent: Wednesday, October 18, 2017 8:10 PM
To: Shalnov, Sergey 
Cc: Jakub Jelinek ; 'gcc-patches@gcc.gnu.org' 
; 'ubiz...@gmail.com' ; Senkevich, 
Andrew ; Ivchenko, Alexander 
; Peryt, Sebastian 
Subject: Re: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in 
Intel AVX512 configuration

Hello Sergey,
On 06 Oct 14:20, Shalnov, Sergey wrote:
> Jakub,
> I completely agree with you. I fixed the patch.
> Currently, TARGET_PREFER256 will work on architectures with 512VL. It will 
> not work otherwise.
> 
> I will try to find better solution for this. I think I need to look 
> into register classes to configure available registers for 512F and 512VL in 
> case of TARGET_PREFER_AVX256.
Probably I am missing the point, but IMHO register classes are loosely 
connected to preferred modes of operations.

> I would propose to merge this patch as a temporary solution.
Why not implement a generic solution right now?
I don't think we're in a hurry here to push a temporary solution unless we have 
some reasoning.

I see only a few mentions of TARGET_PREFER_AVX256 in i386.[c|md] and nothing 
looks suspicious to me.

> 
> Sergey

--
Thanks, K


Re: [patch 2/5] add hook to track when splitting is complete

2017-10-20 Thread Richard Biener
On Fri, Oct 20, 2017 at 4:09 AM, Sandra Loosemore
 wrote:
> This patch adds a function to indicate whether the split1 pass has run
> yet.  This is used in part 3 of the patch set to decide whether 32-bit
> symbolic constant expressions are permitted, e.g. in
> TARGET_LEGITIMATE_ADDRESS_P and the movsi expander.
>
> Since there's currently no usable hook for querying the pass manager
> where it is relative to another pass, I implemented this using a
> target-specific pass that runs directly after split1 and does nothing
> but set a flag.

"Nice" hack ;)  The only currently existing way would be to add a property
to the IL state like

const pass_data pass_data_split_all_insns =
{
  RTL_PASS, /* type */
  "split1", /* name */
  OPTGROUP_NONE, /* optinfo_flags */
  TV_NONE, /* tv_id */
  0, /* properties_required */
  PROP_rtl_split_insns, /* properties_provided */
  0, /* properties_destroyed */

and test that via cfun->curr_properties & PROP_rtl_split_insns

Having run split might be an important enough change to warrant this.
Likewise reload_completed and reload_in_progress could be transitioned
to IL properties.
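As a sketch of how such a property test behaves, here is a standalone model of the bookkeeping.  The type names, the PROP_EXAMPLE_* values and the helper functions are all hypothetical; GCC's real PROP_* bits and pass manager code differ.

```c
#include <stdbool.h>

/* Hypothetical property bits; GCC's real PROP_* values live in
   tree-pass.h and differ from these.  */
#define PROP_EXAMPLE_CFG         (1u << 0)
#define PROP_EXAMPLE_SPLIT_INSNS (1u << 1)

/* Hypothetical, trimmed-down pass descriptor and function state.  */
struct example_pass
{
  const char *name;
  unsigned properties_required;
  unsigned properties_provided;
  unsigned properties_destroyed;
};

struct example_function
{
  unsigned curr_properties;
};

/* Update the function's property set after PASS has run on it.  */
void
finish_pass (struct example_function *fn, const struct example_pass *pass)
{
  fn->curr_properties |= pass->properties_provided;
  fn->curr_properties &= ~pass->properties_destroyed;
}

/* What a backend predicate would ask instead of testing a private flag.  */
bool
split1_completed_p (const struct example_function *fn)
{
  return (fn->curr_properties & PROP_EXAMPLE_SPLIT_INSNS) != 0;
}
```

The attraction over a target-specific dummy pass is that the state travels with the function rather than living in a global flag.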

Richard.

> -Sandra
>


Re: [RFC] New pragma exec_charset

2017-10-20 Thread Richard Biener
On Fri, Oct 20, 2017 at 9:53 AM, Jakub Jelinek  wrote:
> On Fri, Oct 20, 2017 at 09:48:38AM +0200, Richard Biener wrote:
>> How does it work semantically to have different exec charsets?  That is,
>> if "strings" flow from a region with one -fexec-charset setting to a region
>> with another one is that undefined behavior?  Do we now require
>> external function declarations to be in the proper region (declared under
>> the appropriate exec charset flag)?  This would mean that passing
>> the exec charset in effect as additional argument isn't a possibility.
>>
>> Or do we have to treat -fexec-charset similar to -frounding-math, that is,
>> we can't ever _interpret_ any string in the compiler?  [unless -fexec-charset
>> is the same everywhere]
>>
>> I think the -frounding-math route is probably the easiest (and wisest
>> given the quite low test coverage we'll get) route.  Thus, add a -fmixed-charset
>> flag and reject any exec-charset attribute/pragma if that flag is not set?
>> With LTO we could always add this and/or merge -fexec-charset flags
>> appropriately, injecting -fmixed-charset in case TUs use different settings.
>
> It wouldn't have to be an option, simply mark in cfun all functions that
> have more than one exec charset and give up on all optimizations/warnings
> that require to read the characters and merge that unknown exec_charset
> flag during inlining etc.  Though, that might still not be enough, e.g.
> the whole function might have one exec charset, but a global const char []
> variable might have another one and during optimization we might be looking
> at that.  So perhaps it would need to be a per-TU flag merged during LTO.

There's also IPA flow of strings between functions so unless mixing exec
charsets invokes undefined behavior I can't see how a per-function flag would
help.

But yes, if we can reliably detect whether multiple exec charsets are used in
a TU we can make this a flag that doesn't have to be set by the user.  But that
means the pragma probably _always_ forces that flag given we have that forced
pre-included file on some targets and the pragma token would occur after that...

Richard.

> Jakub


[patch][i386, AVX] Adding missing CMP* intrinsics

2017-10-20 Thread Peryt, Sebastian
Hi,

This patch, written by Olga Makhotina, adds the missing intrinsics listed below:
_mm512_[mask_]cmpeq_[pd|ps]_mask
_mm512_[mask_]cmple_[pd|ps]_mask
_mm512_[mask_]cmplt_[pd|ps]_mask
_mm512_[mask_]cmpneq_[pd|ps]_mask
_mm512_[mask_]cmpnle_[pd|ps]_mask
_mm512_[mask_]cmpnlt_[pd|ps]_mask
_mm512_[mask_]cmpord_[pd|ps]_mask
_mm512_[mask_]cmpunord_[pd|ps]_mask

20.10.2017  Olga Makhotina  

gcc/
* config/i386/avx512fintrin.h (_mm512_cmpeq_pd_mask,
_mm512_cmple_pd_mask, _mm512_cmplt_pd_mask,
_mm512_cmpneq_pd_mask, _mm512_cmpnle_pd_mask,
_mm512_cmpnlt_pd_mask, _mm512_cmpord_pd_mask,
_mm512_cmpunord_pd_mask, _mm512_mask_cmpeq_pd_mask,
_mm512_mask_cmple_pd_mask, _mm512_mask_cmplt_pd_mask,
_mm512_mask_cmpneq_pd_mask, _mm512_mask_cmpnle_pd_mask,
_mm512_mask_cmpnlt_pd_mask, _mm512_mask_cmpord_pd_mask,
_mm512_mask_cmpunord_pd_mask, _mm512_cmpeq_ps_mask,
_mm512_cmple_ps_mask, _mm512_cmplt_ps_mask,
_mm512_cmpneq_ps_mask, _mm512_cmpnle_ps_mask,
_mm512_cmpnlt_ps_mask, _mm512_cmpord_ps_mask,
_mm512_cmpunord_ps_mask, _mm512_mask_cmpeq_ps_mask,
_mm512_mask_cmple_ps_mask, _mm512_mask_cmplt_ps_mask,
_mm512_mask_cmpneq_ps_mask, _mm512_mask_cmpnle_ps_mask,
_mm512_mask_cmpnlt_ps_mask, _mm512_mask_cmpord_ps_mask,
_mm512_mask_cmpunord_ps_mask): New intrinsics.

20.10.2017  Olga Makhotina  

gcc/testsuite/
* gcc.target/i386/avx512f-vcmpps-1.c (_mm512_cmpeq_ps_mask,
_mm512_cmple_ps_mask, _mm512_cmplt_ps_mask,
_mm512_cmpneq_ps_mask, _mm512_cmpnle_ps_mask,
_mm512_cmpnlt_ps_mask, _mm512_cmpord_ps_mask,
_mm512_cmpunord_ps_mask, _mm512_mask_cmpeq_ps_mask,
_mm512_mask_cmple_ps_mask, _mm512_mask_cmplt_ps_mask,
_mm512_mask_cmpneq_ps_mask, _mm512_mask_cmpnle_ps_mask,
_mm512_mask_cmpnlt_ps_mask, _mm512_mask_cmpord_ps_mask,
_mm512_mask_cmpunord_ps_mask): Test new intrinsics.
* gcc.target/i386/avx512f-vcmpps-2.c (_mm512_cmpeq_ps_mask,
_mm512_cmple_ps_mask, _mm512_cmplt_ps_mask, 
_mm512_cmpneq_ps_mask, _mm512_cmpnle_ps_mask,
_mm512_cmpnlt_ps_mask, _mm512_cmpord_ps_mask,
_mm512_cmpunord_ps_mask, _mm512_mask_cmpeq_ps_mask,
_mm512_mask_cmple_ps_mask, _mm512_mask_cmplt_ps_mask,
_mm512_mask_cmpneq_ps_mask, _mm512_mask_cmpnle_ps_mask,
_mm512_mask_cmpnlt_ps_mask, _mm512_mask_cmpord_ps_mask,
_mm512_mask_cmpunord_ps_mask): Test new intrinsics.
* gcc.target/i386/avx512f-vcmppd-1.c (_mm512_cmpeq_pd_mask,
_mm512_cmple_pd_mask, _mm512_cmplt_pd_mask,
_mm512_cmpneq_pd_mask, _mm512_cmpnle_pd_mask,
_mm512_cmpnlt_pd_mask, _mm512_cmpord_pd_mask,
_mm512_cmpunord_pd_mask, _mm512_mask_cmpeq_pd_mask,
_mm512_mask_cmple_pd_mask, _mm512_mask_cmplt_pd_mask,
_mm512_mask_cmpneq_pd_mask, _mm512_mask_cmpnle_pd_mask,
_mm512_mask_cmpnlt_pd_mask, _mm512_mask_cmpord_pd_mask,
_mm512_mask_cmpunord_pd_mask): Test new intrinsics.
* gcc.target/i386/avx512f-vcmppd-2.c (_mm512_cmpeq_pd_mask,
_mm512_cmple_pd_mask, _mm512_cmplt_pd_mask,
_mm512_cmpneq_pd_mask, _mm512_cmpnle_pd_mask,
_mm512_cmpnlt_pd_mask, _mm512_cmpord_pd_mask,
_mm512_cmpunord_pd_mask, _mm512_mask_cmpeq_pd_mask,
_mm512_mask_cmple_pd_mask, _mm512_mask_cmplt_pd_mask,
_mm512_mask_cmpneq_pd_mask, _mm512_mask_cmpnle_pd_mask,
_mm512_mask_cmpnlt_pd_mask, _mm512_mask_cmpord_pd_mask,
_mm512_mask_cmpunord_pd_mask): Test new intrinsics.

Is it ok for trunk?
 
Thanks,
Sebastian



0001-vcmpp-d-s.patch
Description: 0001-vcmpp-d-s.patch


Re: [patch 0/5] nios2 address mode improvements

2017-10-20 Thread Ramana Radhakrishnan
On Fri, Oct 20, 2017 at 9:18 AM, Richard Biener
 wrote:
> On Fri, Oct 20, 2017 at 10:12 AM, Richard Biener
>> How do other targets handle this situation?  Naively I'd have handled
>> the splitting at reload/LRA time ... (which would make the flag
>> to test reload_completed)
>>
>> There are quite a number of targets using lo_sum but I'm not sure they
>> share the issue with symbolic constants.
>
> sparc for example has in sparc_legitimate_address_p:
>
>   /* During reload, accept the HIGH+LO_SUM construct generated by
>      sparc_legitimize_reload_address.  */
>   if (reload_in_progress
>       && GET_CODE (rs1) == HIGH
>       && XEXP (rs1, 0) == imm1)
>     return 1;
>
> and it seems that sparc_legitimize_reload_address performs the splitting
> (sparc uses LRA now so this part looks like dead code to me -- maybe LRA
> can do this magically somehow but I see nios2 still uses reload so the
> code may be a recipe to follow).

I remember a patch to LRA to get this done with high / lo_sum
targeting MIPS go in during the gcc 7 time frame which seemed to help.

We've had similar issues with high and lo_sums in the Arm and AArch64
ports. Delaying splitting has helped but we've not found a case to go
as fine grained as what Sandra is attempting here, though that may be
something to investigate further on these ports.

Ramana

>
> Richard.
>
>> Otherwise deferring the splitting of course looks like the correct thing to do.
>>
>> Richard.
>>
>>> This patch series also includes general improvements to the cost model
>>> to get better CSE results -- in particular, the nios2 backend has been
>>> completely missing an implementation for TARGET_ADDRESS_COST.  I also found
>>> that making TARGET_LEGITIMIZE_ADDRESS smarter resulted in better
>>> address cost modeling by the ivopts pass.
>>>
>>> All together, this resulted in about a 7% code size improvement on the
>>> customer-provided test case I was using for tuning purposes.
>>>
>>> Patches in this set are broken down as follows:
>>>
>>> 1: Switch to LRA.
>>> 2: Detect when splitting has been completed.
>>> 3: Add splitters and recognize the new address modes.
>>> 4: Cost model improvements.
>>> 5: Test cases.
>>>
>>> Part 2 is the piece that relates to the discussion linked above.  As
>>> implemented, it works fine, but it's maybe not the best design.  I'll
>>> hold off on committing the entire set for at least a few days in case
>>> somebody wants to suggest a better solution.
>>>
>>> -Sandra
>>>


Re: [patch 0/5] nios2 address mode improvements

2017-10-20 Thread Jakub Jelinek
On Thu, Oct 19, 2017 at 08:03:45PM -0600, Sandra Loosemore wrote:
> A harder problem is that doing the high/lo_sum splitting in expand
> inhibits subsequent optimizations.  One such problem arises when you
> have accesses to multiple fields in a static structure object.  Expand
> sees this as many (symbol + offset) expressions involving the same
> symbol with different constant offsets.  What we should be doing in
> that case is CSE'ing the symbol address computation rather than
> splitting every such expression individually.

Do you have the needed relocations for that though?
If not, then you need to do:
  tmp = high (symbol);
  tmp |= lo_sum (symbol); // or +
  a = [tmp + 0];
  b = [tmp + 4];
  c = [tmp + 8];
if you do (like e.g. sparc64 has the %olo relocation), then you can do
  tmp = high (symbol);
  a = [tmp + lo_sum (symbol) + 0];
  b = [tmp + lo_sum (symbol) + 4];
  c = [tmp + lo_sum (symbol) + 8];
If you tried to do:
  tmp = high (symbol);
  a = [tmp + lo_sum (symbol)];
  b = [tmp + lo_sum (symbol + 4)];
  c = [tmp + lo_sum (symbol + 8)];
then this would break if lo_sum (symbol + 4) or lo_sum (symbol + 8)
is < 4.
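The failure case is easy to reproduce with plain integer arithmetic.  The sketch below is a simplification and not target code: it assumes an unsigned 16-bit low part (real %hi/%lo style relocations often sign-extend the low half and adjust the high half to compensate), and the function names are made up.

```c
#include <stdint.h>

/* Simplified split of a 32-bit address into an upper and a lower 16-bit
   part.  Assumption: the low part is treated as unsigned.  */
uint32_t
high_part (uint32_t addr)
{
  return addr & 0xffff0000u;
}

uint32_t
lo_part (uint32_t addr)
{
  return addr & 0x0000ffffu;
}

/* Safe: the full symbol address is formed first, then the offset is
   added.  */
uint32_t
field_addr_safe (uint32_t sym, uint32_t off)
{
  return high_part (sym) + lo_part (sym) + off;
}

/* Unsafe: reuses high (sym) but takes the low part of sym + off.  Any
   carry out of the low 16 bits is lost.  */
uint32_t
field_addr_unsafe (uint32_t sym, uint32_t off)
{
  return high_part (sym) + lo_part (sym + off);
}
```

With sym = 0x1234fffc and off = 4, the low part of sym + 4 wraps to 0, so the unsafe form yields 0x12340000 instead of 0x12350000; for a symbol well inside a 64K block both forms agree.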

Jakub


Fix Ada bootstrap issue

2017-10-20 Thread Eric Botcazou
Because of the recent reorganization of the ada/ directory, the check for the 
presence of a working Ada compiler fails in stage #2 and later if you don't 
have a compiler already installed in the --prefix directory.

Bootstrapped on x86_64-suse-linux, applied on the mainline.


2017-10-20  Nicolas Roche  

* configure.ac (ACX_PROG_GNAT): Append "libgnat" to include search dir
* configure: Regenerate.

-- 
Eric Botcazou
Index: configure.ac
===================================================================
--- configure.ac	(revision 253921)
+++ configure.ac	(working copy)
@@ -362,7 +362,7 @@ rm -f a.out a.exe b.out
 # Find the native compiler
 AC_PROG_CC
 AC_PROG_CXX
-ACX_PROG_GNAT([-I"$srcdir"/ada])
+ACX_PROG_GNAT([-I"$srcdir"/ada/libgnat])
 
 # Do configure tests with the C++ compiler, since that's what we build with.
 AC_LANG(C++)


[RFC PATCH, i386]: Make FP inequality comparisons trapping on qNaN.

2017-10-20 Thread Uros Bizjak
Hello!

Attached patch makes FP inequality comparisons trap on qNaN. There is
an old comment mentioning reversible comparisons, but the middle end
doesn't reverse them anymore (a couple of weeks ago, reversed
comparisons were removed from i386.md):

/* ??? In order to make all comparisons reversible, we do all comparisons
 non-trapping when compiling for IEEE.  Once gcc is able to distinguish
 all forms trapping and nontrapping comparisons, we can make inequality
 comparisons trapping again, since it results in better code when using
 FCOM based compares.  */

FCOM compares allow FP and integer memory operands.

This is also what ICC produces with -fp-model strict, so I see no
reason for GCC to produce different and inferior code.
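For context, the source-level distinction looks roughly like this in C.  This is a small illustration, not part of the patch, and the function names are made up: the relational operators are the ones allowed to raise the "invalid" exception on a quiet NaN under C Annex F, while the C99 isless family is specified as quiet.  Both forms return false when an operand is a NaN; whether the exception flag is actually raised depends on the code the compiler emits, which is what the patch changes.

```c
#include <math.h>
#include <stdbool.h>

/* Ordered comparison: per C Annex F this may raise the "invalid"
   floating-point exception when either operand is a NaN.  */
bool
less_signaling (double x, double y)
{
  return x < y;
}

/* Quiet comparison: isless does not raise "invalid" for quiet NaNs.  */
bool
less_quiet (double x, double y)
{
  return isless (x, y);
}
```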

2017-10-20  Uros Bizjak  

* config/i386/i386.c (ix86_fp_compare_mode): Return CCFPmode
for ordered inequality comparisons even with TARGET_IEEE_FP.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

If there are no comments, I plan to commit the patch to mainline early
next week.

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 45a219741dbb..7ff222be9aaf 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21500,14 +21500,35 @@ ix86_expand_int_compare (enum rtx_code code, rtx op0, rtx op1)
Return the appropriate mode to use.  */
 
 machine_mode
-ix86_fp_compare_mode (enum rtx_code)
-{
-  /* ??? In order to make all comparisons reversible, we do all comparisons
- non-trapping when compiling for IEEE.  Once gcc is able to distinguish
- all forms trapping and nontrapping comparisons, we can make inequality
- comparisons trapping again, since it results in better code when using
- FCOM based compares.  */
-  return TARGET_IEEE_FP ? CCFPUmode : CCFPmode;
+ix86_fp_compare_mode (enum rtx_code code)
+{
+  if (!TARGET_IEEE_FP)
+    return CCFPmode;
+
+  switch (code)
+    {
+    case GT:
+    case GE:
+    case LT:
+    case LE:
+      return CCFPmode;
+
+    case EQ:
+    case NE:
+
+    case LTGT:
+    case UNORDERED:
+    case ORDERED:
+    case UNLT:
+    case UNLE:
+    case UNGT:
+    case UNGE:
+    case UNEQ:
+      return CCFPUmode;
+
+    default:
+      gcc_unreachable ();
+    }
 }
 
 machine_mode


Re: [PATCH GCC][4/7]Choose exit edge/path when removing inner loop's exit statement

2017-10-20 Thread Tom de Vries

On 10/19/2017 10:49 AM, Bin.Cheng wrote:

On Thu, Oct 19, 2017 at 9:31 AM, Tom de Vries  wrote:

On 10/09/2017 03:34 PM, Richard Biener wrote:


On Thu, Oct 5, 2017 at 3:16 PM, Bin Cheng  wrote:


Hi,
Function generate_loops_for_partition chooses arbitrary path when removing
exit condition not in partition.  This is fine for now because it's
impossible to have loop exit condition in case of innermost distribution.
After extending to loop nest distribution, we must choose exit edge/path for
inner loop's exit condition, otherwise an infinite empty loop will be
generated.  Test case added.

Bootstrap and test in patch set on x86_64 and AArch64, is it OK?



Ok.

Richard.


Thanks,
bin
2017-10-04  Bin Cheng  

  * tree-loop-distribution.c (generate_loops_for_partition): Remove
  inner loop's exit stmt by making it always exit the loop, otherwise
  we would generate an infinite empty loop.

gcc/testsuite/ChangeLog
2017-10-04  Bin Cheng  

  * gcc.dg/tree-ssa/ldist-27.c: New test.



Hi,

I've committed patch below to specify the stack size requirements of this
test-case (fixing the test failure for nvptx).

Hi,
Maybe we can simply make the structure a global variable?



Works for me.

Committed as attached.

Thanks,
- Tom
Reduce stack size in gcc.dg/tree-ssa/ldist-27.c

2017-10-20  Tom de Vries  

	* gcc.dg/tree-ssa/ldist-27.c: Remove dg-require-stack-size.
	(main): Move s ...
	(s): ... here.

---
 gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c
index cd9696e..b1fd024 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c
@@ -1,6 +1,5 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
-/* { dg-require-stack-size "(300 + 200 + 300 * 200) * 8" } */
 
 #define M (300)
 #define N (200)
@@ -12,7 +11,8 @@ struct st
   double c[M][N];
 };
 
-int __attribute__ ((noinline)) foo (struct st *s)
+int __attribute__ ((noinline))
+foo (struct st *s)
 {
   int i, j;
   for (i = 0; i != M;)
@@ -30,9 +30,11 @@ L2:
   return 0;
 }
 
-int main (void)
+struct st s;
+
+int
+main (void)
 {
-  struct st s;
   return foo (&s);
 }
 


Re: [PATCH] Remove useless isa attributes from various sse.md patterns

2017-10-20 Thread Uros Bizjak
On Wed, Oct 4, 2017 at 9:39 PM, Jakub Jelinek  wrote:
> Hi!
>
> While working on the previous patch, I've noticed we have quite a few
> seemingly useless isa attributes (first I've noticed isa attribute
> which had one value for all alternatives, which IMHO should just
> been done in insn condition instead).
>
> fma_avx512f isa is only used on insns that have TARGET_AVX512F && ...
> in their conditions, and the isa has been enabled if:
> TARGET_FMA || TARGET_AVX512F
> so that is clearly satisfied always.
>
> The last hunk had isa avx and TARGET_AVX512BW && in the condition,
> that doesn't make any sense to me either.
>
> And the two hunks before that had avx512f isa, and TARGET_AVX512F && ...
> in the condition.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-10-04  Jakub Jelinek  
>
> * config/i386/i386.md (isa): Remove fma_avx512f.
> * config/i386/sse.md (_fmadd__mask,
> _fmadd__mask3,
> _fmsub__mask,
> _fmsub__mask3,
> _fnmadd__mask,
> _fnmadd__mask3,
> _fnmsub__mask,
> _fnmsub__mask3,
> _fmaddsub__mask,
> _fmaddsub__mask3,
> _fmsubadd__mask,
> _fmsubadd__mask3): Remove isa attribute.
> (*vec_widen_umult_even_v16si,
> *vec_widen_smult_even_v16si): Likewise.
> (avx512bw_dbpsadbw): Likewise.

OK for mainline as an obvious patch.

Thanks,
Uros.

> --- gcc/config/i386/i386.md.jj  2017-10-04 09:45:55.0 +0200
> +++ gcc/config/i386/i386.md 2017-10-04 16:28:36.954551561 +0200
> @@ -798,7 +798,7 @@ (define_attr "movu" "0,1" (const_string
>  (define_attr "isa" "base,x64,x64_sse4,x64_sse4_noavx,x64_avx,nox64,
> sse2,sse2_noavx,sse3,sse4,sse4_noavx,avx,noavx,
> avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
> -   fma_avx512f,avx512bw,noavx512bw,avx512dq,noavx512dq,
> +   avx512bw,noavx512bw,avx512dq,noavx512dq,
> avx512vl,noavx512vl,x64_avx512dq,x64_avx512bw"
>(const_string "base"))
>
> @@ -832,8 +832,6 @@ (define_attr "enabled" ""
>  (eq_attr "isa" "fma") (symbol_ref "TARGET_FMA")
>  (eq_attr "isa" "avx512f") (symbol_ref "TARGET_AVX512F")
>  (eq_attr "isa" "noavx512f") (symbol_ref "!TARGET_AVX512F")
> -(eq_attr "isa" "fma_avx512f")
> -  (symbol_ref "TARGET_FMA || TARGET_AVX512F")
>  (eq_attr "isa" "avx512bw") (symbol_ref "TARGET_AVX512BW")
>  (eq_attr "isa" "noavx512bw") (symbol_ref "!TARGET_AVX512BW")
>  (eq_attr "isa" "avx512dq") (symbol_ref "TARGET_AVX512DQ")
> --- gcc/config/i386/sse.md.jj   2017-10-04 15:34:00.0 +0200
> +++ gcc/config/i386/sse.md  2017-10-04 16:21:41.724535506 +0200
> @@ -3720,8 +3720,7 @@ (define_insn "_fmadd__mask
>"@
> vfmadd132\t{%2, %3, %0%{%4%}|%0%{%4%}, %3, 
> %2}
> vfmadd213\t{%3, %2, %0%{%4%}|%0%{%4%}, %2, 
> %3}"
> -  [(set_attr "isa" "fma_avx512f,fma_avx512f")
> -   (set_attr "type" "ssemuladd")
> +  [(set_attr "type" "ssemuladd")
> (set_attr "mode" "")])
>
>  (define_insn "_fmadd__mask3"
> @@ -3735,8 +3734,7 @@ (define_insn "_fmadd__mask
>   (match_operand: 4 "register_operand" "Yk")))]
>"TARGET_AVX512F"
>"vfmadd231\t{%2, %1, %0%{%4%}|%0%{%4%}, %1, 
> %2}"
> -  [(set_attr "isa" "fma_avx512f")
> -   (set_attr "type" "ssemuladd")
> +  [(set_attr "type" "ssemuladd")
> (set_attr "mode" "")])
>
>  (define_insn "*fma_fmsub_"
> @@ -3786,8 +3784,7 @@ (define_insn "_fmsub__mask
>"@
> vfmsub132\t{%2, %3, %0%{%4%}|%0%{%4%}, %3, 
> %2}
> vfmsub213\t{%3, %2, %0%{%4%}|%0%{%4%}, %2, 
> %3}"
> -  [(set_attr "isa" "fma_avx512f,fma_avx512f")
> -   (set_attr "type" "ssemuladd")
> +  [(set_attr "type" "ssemuladd")
> (set_attr "mode" "")])
>
>  (define_insn "_fmsub__mask3"
> @@ -3802,8 +3799,7 @@ (define_insn "_fmsub__mask
>   (match_operand: 4 "register_operand" "Yk")))]
>"TARGET_AVX512F && "
>"vfmsub231\t{%2, %1, %0%{%4%}|%0%{%4%}, %1, 
> %2}"
> -  [(set_attr "isa" "fma_avx512f")
> -   (set_attr "type" "ssemuladd")
> +  [(set_attr "type" "ssemuladd")
> (set_attr "mode" "")])
>
>  (define_insn "*fma_fnmadd_"
> @@ -3853,8 +3849,7 @@ (define_insn "_fnmadd__mas
>"@
> vfnmadd132\t{%2, %3, %0%{%4%}|%0%{%4%}, %3, 
> %2}
> vfnmadd213\t{%3, %2, %0%{%4%}|%0%{%4%}, %2, 
> %3}"
> -  [(set_attr "isa" "fma_avx512f,fma_avx512f")
> -   (set_attr "type" "ssemuladd")
> +  [(set_attr "type" "ssemuladd")
> (set_attr "mode" "")])
>
>  (define_insn "_fnmadd__mask3"
> @@ -3869,8 +3864,7 @@ (define_insn "_fnmadd__mas
>   (match_operand: 4 "register_operand" "Yk")))]
>"TARGET_AVX512F && "
>"vfnmadd231\t{%2, %1, %0%{%4%}|%0%{%4%}, %1, 
> %2}"
> -  [(set_attr "isa" "fma_avx512f")
> -   (set_attr "type" "ssemuladd")
> +  [(set_attr "type" "ssemuladd")
> (set_attr "mode" "")])
>
>  (define_insn "*fma_fnmsub_"
> @@ -3923,8 +3917,7 @@ (define_insn "_fnmsub__mas
>"@
> vfnmsub

Re: [RFC PATCH] Merge libsanitizer from upstream

2017-10-20 Thread Christophe Lyon
Hi,

On 19 October 2017 at 13:17, Jakub Jelinek  wrote:
> On Thu, Oct 19, 2017 at 02:07:24PM +0300, Maxim Ostapenko wrote:
>> > Is the patch (the merge + this incremental) ok for trunk?
>>
>> I think the patch is OK, just wondering about two things:
>
> Richi just approved the patch on IRC, so I'll commit, then we can deal with
> follow-ups.
>

Does anyone else run these tests on arm?
Since you applied this patch, I'm seeing lots of new errors and timeouts.
I have been ignoring regression reports for *san because of randomness
in the results, but the timeouts are a major inconvenience in testing
because they increase latency a lot in getting results, or worse I get no
result at all because the validation job is killed before completion.

Looking at some intermediate logs, I have noticed:
==24797==AddressSanitizer CHECK failed:
/libsanitizer/asan/asan_poisoning.cc:34
"((AddrIsAlignedByGranularity(addr))) != (0)" (0x0, 0x0)
#0 0x408d7d65 in AsanCheckFailed /libsanitizer/asan/asan_rtl.cc:67
#1 0x408ecd5d in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)
/libsanitizer/sanitizer_common/sanitizer_termination.cc:77
#2 0x408d22d5 in __asan::PoisonShadow(unsigned long, unsigned
long, unsigned char) /libsanitizer/asan/asan_poisoning.cc:34
#3 0x4085409b in __asan_register_globals
/libsanitizer/asan/asan_globals.cc:368
#4 0x109eb in _GLOBAL__sub_I_00099_1_ten
(/aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/obj-arm-none-linux-gnueabi/gcc3/gcc/testsuite/gcc/alloca_big_alignment.exe+0x109eb)

in MANY (193 in gcc) tests.

and many others (152 in gcc) just time out individually (eg
c-c++-common/asan/alloca_instruments_all_paddings.c) with no error in
the logs besides Dejagnu's
WARNING: program timed out.


Since I'm using an apparently unusual setup, maybe I have to update it
to cope with the new version,
so I'd like to know if others are seeing the same problems on arm?

I'm using qemu -R 0 to execute the test programs, encapsulated by
proot (similar to chroot, but does not require root privileges).

Am I missing something obvious?

Thanks,

Christophe


>> 1) We have a bunch of GCC local patches, did you include them into a
>> cumulative patch (I guess yes)?
>
> I have done some verification today, diff from upstream r285547 to unpatched
> GCC (with the LLVM Compiler infrastructure two liners removed), attached P1,
> and diff from upstream r315899 to patched GCC (again, two liners removed),
> attached P2 and went through the changes in P1 and verified that except for
> the ubsan backwards compatibility we had that can't work anymore everything
> else was upstreamed, or remained in P2.  So P2 is the current diff from
> upstream, with the sanitizer_common/sanitizer_symbolizer_libbacktrace.cc
> changes now filed upstream.
>
>> 2) Upstream has enabled LSan for x86 and ARM, is it worth to enable them in
>> GCC too?
>
> Maybe, feel free to post patches.  For LSan we need to split off lsan_preinit
> out of liblsan and link it into executables, will handle it next (there is a
> PR about it, just wanted to wait until the merge is in).
>
> Jakub


[PATCH][GRAPHITE] Tame down dumping

2017-10-20 Thread Richard Biener

This tames dumping a bit and adjusts whitespacing and order of dumping.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-10-20  Richard Biener  

* graphite-isl-ast-to-gimple.c
(translate_isl_ast_to_gimple::graphite_copy_stmts_from_block):
Remove return value and simplify, dump copied stmt after lhs
adjustment.
(translate_isl_ast_to_gimple::translate_isl_ast_node_user):
Reduce dump verbosity.
(gsi_insert_earliest): Likewise.
(translate_isl_ast_to_gimple::copy_bb_and_scalar_dependences): Adjust.
* graphite.c (print_global_statistics): Adjust dumping.
(print_graphite_scop_statistics): Likewise.
(print_graphite_statistics): Do not dump loops here.
(graphite_transform_loops): But here.

Index: gcc/graphite-isl-ast-to-gimple.c
===
--- gcc/graphite-isl-ast-to-gimple.c(revision 253926)
+++ gcc/graphite-isl-ast-to-gimple.c(working copy)
@@ -191,7 +191,7 @@ class translate_isl_ast_to_gimple
 
   tree get_rename_from_scev (tree old_name, gimple_seq *stmts, loop_p loop,
 vec iv_map);
-  bool graphite_copy_stmts_from_block (basic_block bb, basic_block new_bb,
+  void graphite_copy_stmts_from_block (basic_block bb, basic_block new_bb,
   vec iv_map);
   edge copy_bb_and_scalar_dependences (basic_block bb, edge next_e,
   vec iv_map);
@@ -791,13 +810,12 @@ translate_isl_ast_node_user (__isl_keep
   isl_ast_expr_free (user_expr);
 
   basic_block old_bb = GBB_BB (gbb);
-  if (dump_file)
+  if (dump_file && (dump_flags & TDF_DETAILS))
 {
   fprintf (dump_file,
   "[codegen] copying from bb_%d on edge (bb_%d, bb_%d)\n",
   old_bb->index, next_e->src->index, next_e->dest->index);
   print_loops_bb (dump_file, GBB_BB (gbb), 0, 3);
-
 }
 
   next_e = copy_bb_and_scalar_dependences (old_bb, next_e, iv_map);
@@ -807,7 +825,7 @@ translate_isl_ast_node_user (__isl_keep
   if (codegen_error_p ())
 return NULL;
 
-  if (dump_file)
+  if (dump_file && (dump_flags & TDF_DETAILS))
 {
   fprintf (dump_file, "[codegen] (after copy) new basic block\n");
   print_loops_bb (dump_file, next_e->src, 0, 3);
@@ -1049,9 +1067,9 @@ gsi_insert_earliest (gimple_seq seq)
 
   if (dump_file)
{
- fprintf (dump_file, "[codegen] inserting statement: ");
+ fprintf (dump_file, "[codegen] inserting statement in BB %d: ",
+  gimple_bb (use_stmt)->index);
  print_gimple_stmt (dump_file, use_stmt, 0, TDF_VOPS | TDF_MEMSYMS);
- print_loops_bb (dump_file, gimple_bb (use_stmt), 0, 3);
}
 }
 }
@@ -1122,7 +1140,7 @@ should_copy_to_new_region (gimple *stmt,
 /* Duplicates the statements of basic block BB into basic block NEW_BB
and compute the new induction variables according to the IV_MAP.  */
 
-bool translate_isl_ast_to_gimple::
+void translate_isl_ast_to_gimple::
 graphite_copy_stmts_from_block (basic_block bb, basic_block new_bb,
vec iv_map)
 {
@@ -1139,7 +1157,6 @@ graphite_copy_stmts_from_block (basic_bl
   /* Create a new copy of STMT and duplicate STMT's virtual
 operands.  */
   gimple *copy = gimple_copy (stmt);
-  gsi_insert_after (&gsi_tgt, copy, GSI_NEW_STMT);
 
   /* Rather than not copying debug stmts we reset them.
  ???  Where we can rewrite uses without inserting new
@@ -1154,12 +1171,6 @@ graphite_copy_stmts_from_block (basic_bl
gcc_unreachable ();
}
 
-  if (dump_file)
-   {
- fprintf (dump_file, "[codegen] inserting statement: ");
- print_gimple_stmt (dump_file, copy, 0);
-   }
-
   maybe_duplicate_eh_stmt (copy, stmt);
   gimple_duplicate_stmt_histograms (cfun, copy, cfun, stmt);
 
@@ -1172,8 +1183,12 @@ graphite_copy_stmts_from_block (basic_bl
  create_new_def_for (old_name, copy, def_p);
}
 
-  if (codegen_error_p ())
-   return false;
+  gsi_insert_after (&gsi_tgt, copy, GSI_NEW_STMT);
+  if (dump_file)
+   {
+ fprintf (dump_file, "[codegen] inserting statement: ");
+ print_gimple_stmt (dump_file, copy, 0);
+   }
 
   /* For each SCEV analyzable SSA_NAME, rename their usage.  */
   ssa_op_iter iter;
@@ -1198,8 +1213,6 @@ graphite_copy_stmts_from_block (basic_bl
 
   update_stmt (copy);
 }
-
-  return true;
 }
 
 
@@ -1236,11 +1249,7 @@ copy_bb_and_scalar_dependences (basic_bl
   gsi_insert_after (&gsi_tgt, ass, GSI_NEW_STMT);
 }
 
-  if (!graphite_copy_stmts_from_block (bb, new_bb, iv_map))
-{
-  set_codegen_error ();
-  return NULL;
-}
+  graphite_copy_stmts_from_block (bb, new_bb, iv_map);
 
   /* Insert out-of SSA copies on the original BB outgoing edges.  */
   gsi_tgt = gsi_last_bb (new_bb);
Index: g

[PATCH] Fix PR82603

2017-10-20 Thread Richard Biener

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2017-10-20  Richard Biener  

PR tree-optimization/82603
* tree-if-conv.c (predicate_mem_writes): Make sure to only
remove false predicated stores.

* gcc.dg/torture/pr82603.c: New testcase.

Index: gcc/tree-if-conv.c
===
--- gcc/tree-if-conv.c  (revision 253926)
+++ gcc/tree-if-conv.c  (working copy)
@@ -2214,7 +2214,8 @@ predicate_mem_writes (loop_p loop)
{
  if (!gimple_assign_single_p (stmt = gsi_stmt (gsi)))
;
- else if (is_false_predicate (cond))
+ else if (is_false_predicate (cond)
+  && gimple_vdef (stmt))
{
  unlink_stmt_vdef (stmt);
  gsi_remove (&gsi, true);
Index: gcc/testsuite/gcc.dg/torture/pr82603.c
===
--- gcc/testsuite/gcc.dg/torture/pr82603.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr82603.c  (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-loop-vectorize" } */
+
+int
+mr (unsigned int lf, int ms)
+{
+  unsigned int sw = 0;
+  char *cu = (char *)&ms;
+
+  while (ms < 1)
+{
+  if (lf == 0)
+   ms = 0;
+  else
+   ms = 0;
+  ms += ((lf > 0) && ((lf > sw) ? 1 : ++*cu));
+}
+
+  if (lf != 0)
+cu = (char *)&sw;
+  *cu = lf;
+
+  return ms;
+}


[PATCH] Fix PR82473

2017-10-20 Thread Richard Biener

The following fixes PR82473 - we were using a random (the first
non-reduction) operand of the reduction stmt to compute ncopies
but that's of course wrong.
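
As a rough illustration of the idea (a standalone sketch, not GCC code): the struct and function names below are hypothetical, and it assumes ncopies is the vectorization factor divided by the number of lanes of the chosen input vector type. For a widening accumulation the widest input has the fewest lanes, so that is the one to pick.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vector type: only the lane count matters.  */
struct vec_type
{
  unsigned subparts;		/* number of elements per vector */
};

/* Pick the input vector type with the fewest lanes, i.e. the widest
   elements, skipping operands that have no vector type (NULL).  */
const struct vec_type *
widest_input_type (const struct vec_type *const *ops, size_t n)
{
  const struct vec_type *best = NULL;
  for (size_t i = 0; i < n; i++)
    if (ops[i] && (!best || best->subparts > ops[i]->subparts))
      best = ops[i];
  return best;
}

/* Number of vector statements needed to cover VF scalar iterations
   (assumed relation, see above).  */
unsigned
ncopies_for (unsigned vf, const struct vec_type *vectype_in)
{
  return vf / vectype_in->subparts;
}
```

With an 8-lane and a 4-lane input and a vectorization factor of 8, taking whichever operand happens to come first could give one copy where two are needed.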

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2017-10-20  Richard Biener  

PR tree-optimization/82473
* tree-vect-loop.c (vectorizable_reduction): Properly get at
the largest input type.

* gcc.dg/torture/pr82473.c: New testcase.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 253926)
+++ gcc/tree-vect-loop.c(working copy)
@@ -5836,9 +5836,12 @@ vectorizable_reduction (gimple *stmt, gi
  reduc_index = i;
  continue;
}
-  else
+  else if (tem)
{
- if (!vectype_in)
+ /* To properly compute ncopies we are interested in the widest
+input type in case we're looking at a widening accumulation.  */
+ if (!vectype_in
+ || TYPE_VECTOR_SUBPARTS (vectype_in) > TYPE_VECTOR_SUBPARTS (tem))
vectype_in = tem;
}
 
Index: gcc/testsuite/gcc.dg/torture/pr82473.c
===
--- gcc/testsuite/gcc.dg/torture/pr82473.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr82473.c  (working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-vectorize" } */
+
+void
+zz (int x9, short int gt)
+{
+  if (0)
+{
+  while (gt < 1)
+   {
+ int pz;
+
+k6:
+ for (pz = 0; pz < 3; ++pz)
+   x9 += gt;
+ ++gt;
+   }
+}
+
+  if (x9 != 0)
+goto k6;
+}


Re: [RFC] New pragma exec_charset

2017-10-20 Thread Andreas Krebbel
On 10/20/2017 10:28 AM, Richard Biener wrote:
> On Fri, Oct 20, 2017 at 9:53 AM, Jakub Jelinek  wrote:
>> On Fri, Oct 20, 2017 at 09:48:38AM +0200, Richard Biener wrote:
>>> How does it work semantically to have different exec charsets?  That is,
>>> if "strings" flow from a region with one -fexec-charset setting to a region
>>> with another one is that undefined behavior?  Do we now require
>>> external function declarations to be in the proper region (declared under
>>> the appropriate exec charset flag)?  This would mean that passing
>>> the exec charset in effect as additional argument isn't a possibility.
>>>
>>> Or do we have to treat -fexec-charset similar to -frounding-math, that is,
>>> we can't ever _interpret_ any string in the compiler?  [unless 
>>> -fexec-charset
>>> is the same everywhere]
>>>
>>> I think the -frounding-math route is probably the easiest (and wisest
>>> given the quite low test coverage we'll get) route.  Thus, add a 
>>> -fmixed-charset
>>> flag and reject any exec-charset attribute/pragma if that flag is not set?
>>> With LTO we could always add this and/or merge -fexec-charset flags
>>> appropriately,
>>> injecting -fmixed-charset in case TUs use different settings.
>>
>> It wouldn't have to be an option, simply mark in cfun all functions that
>> have more than one exec charset and give up on all optimizations/warnings
>> that require to read the characters and merge that unknown exec_charset
>> flag during inlining etc.  Though, that might still not be enough, e.g.
>> the whole function might have one exec charset, but a global const char []
>> variable might have another one and during optimization we might be looking
>> at that.  So perhaps it would need to be a per-TU flag merged during LTO.
> 
> There's also IPA flow of strings between functions so unless mixing
> exec charsets
> invokes undefined behavior I can't see how a per-function flag would help.
> 
> But yes, if we can reliably detect whether multiple exec charsets are
> used in a TU
> we can make this a flag that doesn't have to be set by the user.  But that 
> means
> the pragma probably _always_ forces that flag given we have that
> forced pre-included
> file on some targets and the pragma token would occur after that...

Would it make sense to mark the string literals themselves as not using the
default charset?  Then we could disable all interpretation only for these
strings instead of disabling it for the entire TU.
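
To illustrate why the compiler must not interpret a literal under the wrong charset, here is a small sketch with hypothetical helper names (not GCC code). It relies only on the fact that the digits '0'..'9' are encoded as 0x30..0x39 in ASCII and as 0xF0..0xF9 in EBCDIC, so a folded isdigit-style check gives opposite answers for the same byte.

```c
/* Digit test for a byte taken from an ASCII-encoded string.  */
int
is_digit_ascii (unsigned char c)
{
  return c >= 0x30 && c <= 0x39;
}

/* Digit test for a byte taken from an EBCDIC-encoded string.  */
int
is_digit_ebcdic (unsigned char c)
{
  return c >= 0xF0 && c <= 0xF9;
}
```

A per-literal marker would let such folding stay enabled for strings in the default charset and be skipped only for the marked ones.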

-Andreas-



Re: [PATCH GCC][3/3]Refine CFG and bound information for split loops

2017-10-20 Thread Richard Biener
On Thu, Oct 19, 2017 at 3:26 PM, Bin Cheng  wrote:
> Hi,
> This is a rework of patch at 
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01037.html.
> The new patch doesn't try to handle all cases, instead, it only handles 
> obvious cases.
> It also tries to add tests illustrating different cases handled.
> Bootstrap and test for patch set on x86_64 and AArch64.  Comments?

ENOPATCH

> Thanks,
> bin
> 2017-10-16  Bin Cheng  
>
> * tree-ssa-loop-split.c (compute_new_first_bound): New parameter.
> Compute and return bound information for the second split loop.
> (adjust_loop_split): New function.
> (split_loop): Update use and call above function.
>
> gcc/testsuite/ChangeLog
> 2017-10-16  Bin Cheng  
>
> * gcc.dg/loop-split-1.c: New test.
> * gcc.dg/loop-split-2.c: New test.
> * gcc.dg/loop-split-3.c: New test.


Re: [RFC] New pragma exec_charset

2017-10-20 Thread Richard Biener
On Fri, Oct 20, 2017 at 1:19 PM, Andreas Krebbel
 wrote:
> On 10/20/2017 10:28 AM, Richard Biener wrote:
>> On Fri, Oct 20, 2017 at 9:53 AM, Jakub Jelinek  wrote:
>>> On Fri, Oct 20, 2017 at 09:48:38AM +0200, Richard Biener wrote:
>>>> How does it work semantically to have different exec charsets?  That is,
>>>> if "strings" flow from a region with one -fexec-charset setting to a region
>>>> with another one is that undefined behavior?  Do we now require
>>>> external function declarations to be in the proper region (declared under
>>>> the appropriate exec charset flag)?  This would mean that passing
>>>> the exec charset in effect as additional argument isn't a possibility.
>>>>
>>>> Or do we have to treat -fexec-charset similar to -frounding-math, that is,
>>>> we can't ever _interpret_ any string in the compiler?  [unless
>>>> -fexec-charset
>>>> is the same everywhere]
>>>>
>>>> I think the -frounding-math route is probably the easiest (and wisest
>>>> given the quite low test coverage we'll get) route.  Thus, add a
>>>> -fmixed-charset
>>>> flag and reject any exec-charset attribute/pragma if that flag is not set?
>>>> With LTO we could always add this and/or merge -fexec-charset flags
>>>> appropriately,
>>>> injecting -fmixed-charset in case TUs use different settings.
>>>
>>> It wouldn't have to be an option, simply mark in cfun all functions that
>>> have more than one exec charset and give up on all optimizations/warnings
>>> that require to read the characters and merge that unknown exec_charset
>>> flag during inlining etc.  Though, that might still not be enough, e.g.
>>> the whole function might have one exec charset, but a global const char []
>>> variable might have another one and during optimization we might be looking
>>> at that.  So perhaps it would need to be a per-TU flag merged during LTO.
>>
>> There's also IPA flow of strings between functions so unless mixing
>> exec charsets
>> invokes undefined behavior I can't see how a per-function flag would help.
>>
>> But yes, if we can reliably detect whether multiple exec charsets are
>> used in a TU
>> we can make this a flag that doesn't have to be set by the user.  But that 
>> means
>> the pragma probably _always_ forces that flag given we have that
>> forced pre-included
>> file on some targets and the pragma token would occur after that...
>
> Would it make sense to mark the string literals themselves as not using the
> default charset?  Then we could disable all interpretation only for these
> strings instead of disabling it for the entire TU.

I think that would work, too.  Though I'd then rather explicitly
state the charset the string literal is in
(for efficiency we'd then need some mapping of charset id to actual
charset we store globally somewhere
and which we'd need to stream and merge for LTO - the "default" would
then always get zero and
the default charset being streamed to LTO).  Looks like
tree_base.u.bits is unused for STRING_CST
in the middle-end, you'd have to check FEs if they use a lang-specific
flag though.  Then we could
stick the exec charset number there (32bit index even - whoo).  Bah,
C++ of course uses a single
lang flag (PAREN_STRING_LITERAL_P).  Sticking it in the literals type
would work as well but
I find that a bit ugly.  We could reuse bits.address_space for a max
of 256 exec charsets,
a special value of 255 could indicate 'unknown, too many charsets'
also used in an initial implementation
without providing the actual mapping just distinguishing default from
non-default.
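
A minimal sketch of such a mapping, purely for illustration: all names below are hypothetical and nothing here is existing GCC code. It assumes id 0 is reserved for the default charset, ids up to 254 are handed out on first use, and 255 means "unknown, too many charsets".

```c
#include <string.h>

#define CHARSET_DEFAULT 0	/* id of the -fexec-charset default */
#define CHARSET_UNKNOWN 255	/* too many charsets: give up */
#define CHARSET_MAX     255	/* ids 0..254 are real table entries */

static const char *charset_names[CHARSET_MAX];
static unsigned charset_count;

/* Must be called first so that the default charset gets id 0.  */
void
charset_table_init (const char *default_name)
{
  charset_names[CHARSET_DEFAULT] = default_name;
  charset_count = 1;
}

/* Return the id for NAME, registering it on first use.  NAME is stored
   by pointer, so it must outlive the table.  */
unsigned char
charset_id (const char *name)
{
  for (unsigned i = 0; i < charset_count; i++)
    if (strcmp (charset_names[i], name) == 0)
      return (unsigned char) i;
  if (charset_count == CHARSET_MAX)
    return CHARSET_UNKNOWN;
  charset_names[charset_count] = name;
  return (unsigned char) charset_count++;
}

/* A literal may only be interpreted if its charset is actually known.  */
int
charset_interpretable_p (unsigned char id)
{
  return id != CHARSET_UNKNOWN;
}
```

For LTO the table itself would have to be streamed and the ids remapped when merging, which this sketch does not attempt.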

The interesting part is of course libcpp/cc1 interaction and getting
this all right.

Richard.

> -Andreas-
>


Re: [RFC] New pragma exec_charset

2017-10-20 Thread Richard Biener
On Fri, Oct 20, 2017 at 1:34 PM, Richard Biener
 wrote:
> On Fri, Oct 20, 2017 at 1:19 PM, Andreas Krebbel
>  wrote:
>> On 10/20/2017 10:28 AM, Richard Biener wrote:
>>> On Fri, Oct 20, 2017 at 9:53 AM, Jakub Jelinek  wrote:
>>>> On Fri, Oct 20, 2017 at 09:48:38AM +0200, Richard Biener wrote:
>>>>> How does it work semantically to have different exec charsets?  That is,
>>>>> if "strings" flow from a region with one -fexec-charset setting to a
>>>>> region
>>>>> with another one is that undefined behavior?  Do we now require
>>>>> external function declarations to be in the proper region (declared under
>>>>> the appropriate exec charset flag)?  This would mean that passing
>>>>> the exec charset in effect as additional argument isn't a possibility.
>>>>>
>>>>> Or do we have to treat -fexec-charset similar to -frounding-math, that is,
>>>>> we can't ever _interpret_ any string in the compiler?  [unless
>>>>> -fexec-charset
>>>>> is the same everywhere]
>>>>>
>>>>> I think the -frounding-math route is probably the easiest (and wisest
>>>>> given the quite low test coverage we'll get) route.  Thus, add a
>>>>> -fmixed-charset
>>>>> flag and reject any exec-charset attribute/pragma if that flag is not set?
>>>>> With LTO we could always add this and/or merge -fexec-charset flags
>>>>> appropriately,
>>>>> injecting -fmixed-charset in case TUs use different settings.
>>>>
>>>> It wouldn't have to be an option, simply mark in cfun all functions that
>>>> have more than one exec charset and give up on all optimizations/warnings
>>>> that require to read the characters and merge that unknown exec_charset
>>>> flag during inlining etc.  Though, that might still not be enough, e.g.
>>>> the whole function might have one exec charset, but a global const char []
>>>> variable might have another one and during optimization we might be looking
>>>> at that.  So perhaps it would need to be a per-TU flag merged during LTO.
>>>
>>> There's also IPA flow of strings between functions so unless mixing
>>> exec charsets
>>> invokes undefined behavior I can't see how a per-function flag would help.
>>>
>>> But yes, if we can reliably detect whether multiple exec charsets are
>>> used in a TU
>>> we can make this a flag that doesn't have to be set by the user.  But that 
>>> means
>>> the pragma probably _always_ forces that flag given we have that
>>> forced pre-included
>>> file on some targets and the pragma token would occur after that...
>>
>> Would it make sense to mark the string literals themselves as not using the
>> default charset?  Then we could disable all interpretation only for these
>> strings instead of disabling it for the entire TU.
>
> I think that would work, too.  Though I'd then rather explicitly
> state the charset the string literal is in
> (for efficiency we'd then need some mapping of charset id to actual
> charset we store globally somewhere
> and which we'd need to stream and merge for LTO - the "default" would
> then always get zero and
> the default charset being streamed to LTO).  Looks like
> tree_base.u.bits is unused for STRING_CST
> in the middle-end, you'd have to check FEs if they use a lang-specific
> flag though.  Then we could
> stick the exec charset number there (32bit index even - whoo).  Bah,
> C++ of course uses a single
> lang flag (PAREN_STRING_LITERAL_P).  Sticking it in the literals type
> would work as well but
> I find that a bit ugly.  We could reuse bits.address_space for a max
> of 256 exec charsets,
> a special value of 255 could indicate 'unknown, too many charsets'
> also used in an initial implementation
> without providing the actual mapping just distinguishing default from
> non-default.
>
> The interesting part is of course libcpp/cc1 interaction and getting
> this all right.

Oh, and there are plenty of bits unused for STRING_CST so if the C++
FE could stop using lang specific tree bits we could shrink tree_string
by moving length to tree_base.u.  Re-using address_space would block
this improvement.  Finding a single bit for default vs. non-default wouldn't.

Richard.

> Richard.
>
>> -Andreas-
>>


Re: [RFC PATCH, i386]: Make FP inequality comparisons trapping on qNaN.

2017-10-20 Thread Joseph Myers
On Fri, 20 Oct 2017, Uros Bizjak wrote:

> 2017-10-20  Uros Bizjak  
> 
> * config/i386/i386.c (ix86_fp_compare_mode): Return CCFPmode
> for ordered inequality comparisons even with TARGET_IEEE_FP.

This is PR target/52451.

A testcase (conditional on the fenv_exceptions effective-target) that 
ordered comparisons with quiet NaNs set FE_INVALID would be a good idea, 
but it would need XFAILing for powerpc (bug 58684) and s390 (bug 77918).

-- 
Joseph S. Myers
jos...@codesourcery.com


[arm] Fix architecture selection when building libatomic with automatic FPU selection

2017-10-20 Thread Richard Earnshaw (lists)
Libatomic builds a few functions for Arm with an explicit -march option.
 This option does not specify an FPU, which can lead to problems when
targeting a hard-float or softfp environment since the architecture
appears to be incompatible with the selected ABI.  This is some fallout
from the move to making the FPU be automatically detected from the
CPU/architecture.

The fix is simple enough, just add +fp (the minimum floating point
option) to the architecture.  We don't use anything from the FP
architecture, so it shouldn't really change anything; and if we are
building for -mfloat-abi=soft the canonicalization process will remove
the unnecessary fp attributes anyway.  +fp is essentially the same as
the previous default behaviour of defaulting to the base FP architecture
in these circumstances.

* Makefile.am: (IFUNC_OPTIONS): Set the architecture to
-march=armv7-a+fp on Linux/Arm.
* Makefile.in: Regenerated.

Committed to trunk.
diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am
index d731406..9c45700 100644
--- a/libatomic/Makefile.am
+++ b/libatomic/Makefile.am
@@ -123,7 +123,7 @@ libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix _$(s)_.lo,$(SIZEOBJS)))
 ## On a target-specific basis, include alternates to be selected by IFUNC.
 if HAVE_IFUNC
 if ARCH_ARM_LINUX
-IFUNC_OPTIONS	 = -march=armv7-a -DHAVE_KERNEL64
+IFUNC_OPTIONS	 = -march=armv7-a+fp -DHAVE_KERNEL64
 libatomic_la_LIBADD += $(foreach s,$(SIZES),$(addsuffix _$(s)_1_.lo,$(SIZEOBJS)))
 libatomic_la_LIBADD += $(addsuffix _8_2_.lo,$(SIZEOBJS))
 endif
diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index f6eeab3..0f0382e 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -346,7 +346,7 @@ M_SRC = $(firstword $(filter %/$(M_FILE), $(all_c_files)))
 libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix \
 	_$(s)_.lo,$(SIZEOBJS))) $(am__append_1) $(am__append_2) \
 	$(am__append_3)
-@ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a -DHAVE_KERNEL64
+@ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a+fp -DHAVE_KERNEL64
 @ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=i586
 @ARCH_X86_64_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -mcx16
 libatomic_convenience_la_SOURCES = $(libatomic_la_SOURCES)


Re: [PATCH] Fix nrv-1.c false failure on aarch64.

2017-10-20 Thread Alexandre Oliva
On Oct 19, 2017, "Richard Earnshaw (lists)"  wrote:

> On 19/10/17 09:14, Richard Biener wrote:
>> I guess Alex work on stmt frontiers will fix this instance?

> Don't stmt frontiers just enable you to identify exactly one stopping
> point with each statement, so that you don't keep repeatedly stepping to
> the same line?

There's that, but such stopping points are also ordered WRT debug bind
stmts, so that when you stop at such a recommended point, you observe
the expected side effects.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


[PATCH] rl78 subdi3 improvement

2017-10-20 Thread Sebastian Perta
Hello,

The following patch improves both the speed and code size of 64-bit 
subtraction for RL78:
it emits a library function call instead of emitting inline code for the 
64-bit subtraction at every single use.
The subtraction function added to libgcc is hand-written, so it is more 
efficient than what GCC generates.

The change can easily be seen on the following test case.
long long my_subdi3(long long a, long long b) {
return a - b;
}
I did not add this to the regression suite as it is very simple and there 
are many test cases in the suite which already exercise this, for example 
gcc.c-torture/execute/20041011-1.c, gcc.c-torture/execute/arith-rand-ll.c 
and so on.
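
As an illustration only, the piecewise borrow propagation that the new routine performs can be modelled in portable C.  This is a rough sketch, not the submitted code: the name sub64_model is hypothetical, and the model only mirrors the order of operations (16-bit low word, four 8-bit subtract-with-borrow steps, 16-bit high word with the pending borrow applied first).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of a 64-bit subtraction built from narrow
   pieces: a 16-bit low word, four 8-bit subtract-with-borrow steps,
   and a 16-bit high word.  */
uint64_t
sub64_model (uint64_t a, uint64_t b)
{
  uint64_t result = 0;
  unsigned borrow;

  /* Low word: plain 16-bit subtract, which produces the first borrow.  */
  uint16_t a_lo = (uint16_t) a, b_lo = (uint16_t) b;
  result |= (uint16_t) (a_lo - b_lo);
  borrow = a_lo < b_lo;

  /* Bytes 2..5: 8-bit subtract that consumes and updates the borrow.  */
  for (int i = 2; i < 6; i++)
    {
      unsigned x = (a >> (8 * i)) & 0xff;
      unsigned y = ((b >> (8 * i)) & 0xff) + borrow;
      result |= (uint64_t) ((x - y) & 0xff) << (8 * i);
      borrow = x < y;
    }

  /* High word: apply the pending borrow, then a 16-bit subtract.  */
  uint16_t a_hi = (uint16_t) (a >> 48), b_hi = (uint16_t) (b >> 48);
  uint16_t hi = (uint16_t) (a_hi - borrow - b_hi);
  result |= (uint64_t) hi << 48;
  return result;
}
```

Comparing sub64_model (a, b) against a - b on unsigned 64-bit operands is a quick way to convince oneself that the borrow chain is complete.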

Regression test is OK, tested with the following command:
make -k check-gcc RUNTESTFLAGS=--target_board=rl78-sim

Please let me know if this is OK. Thank you!
Sebastian

Index: gcc/ChangeLog
===
--- gcc/ChangeLog(revision 253893)
+++ gcc/ChangeLog(working copy)
@@ -1,3 +1,7 @@
+2017-10-13  Sebastian Perta  
+
+* config/rl78/rl78.md: New define_expand "subdi3".
+
 2017-10-19  Eric Botcazou  

 PR debug/82509
Index: gcc/config/rl78/rl78.md
===
--- gcc/config/rl78/rl78.md(revision 253893)
+++ gcc/config/rl78/rl78.md(working copy)
@@ -268,6 +268,16 @@
   DONE;"
 )

+(define_expand "subdi3"
+ [(set (match_operand:DI  0 "nonimmediate_operand" "")
+(minus:DI (match_operand:DI 1 "general_operand"  "")
+ (match_operand:DI 2 "general_operand"  "")))
+   ]
+  ""
+  "rl78_emit_libcall (\"__subdi3\", MINUS, DImode, DImode, 3, operands);
+   DONE;"
+)
+
 (define_insn "subsi3_internal_virt"
   [(set (match_operand:SI   0 "nonimmediate_operand" "=v,&vm, vm")
 (minus:SI (match_operand:SI 1 "general_operand"  "0, vim, vim")
Index: libgcc/ChangeLog
===
--- libgcc/ChangeLog(revision 253893)
+++ libgcc/ChangeLog(working copy)
@@ -1,5 +1,10 @@
 2017-10-13  Sebastian Perta  

+* config/rl78/subdi3.S: New assembly file.
+* config/rl78/t-rl78: Added subdi3.S to LIB2ADD.
+
+2017-10-13  Sebastian Perta  
+
 * config/rl78/adddi3.S: New assembly file.
 * config/rl78/t-rl78: Added adddi3.S to LIB2ADD.

Index: libgcc/config/rl78/t-rl78
===
--- libgcc/config/rl78/t-rl78(revision 253893)
+++ libgcc/config/rl78/t-rl78(working copy)
@@ -31,7 +31,8 @@
 $(srcdir)/config/rl78/fpbit-sf.S \
 $(srcdir)/config/rl78/fpmath-sf.S \
 $(srcdir)/config/rl78/cmpsi2.S \
-$(srcdir)/config/rl78/adddi3.S
+$(srcdir)/config/rl78/adddi3.S \
+$(srcdir)/config/rl78/subdi3.S

 LIB2FUNCS_EXCLUDE = _clzhi2 _clzsi2 _ctzhi2 _ctzsi2 \
   _popcounthi2 _popcountsi2 \
Index: libgcc/config/rl78/subdi3.S
===
--- libgcc/config/rl78/subdi3.S(nonexistent)
+++ libgcc/config/rl78/subdi3.S(working copy)
@@ -0,0 +1,58 @@
+;   Copyright (C) 2017 Free Software Foundation, Inc.
+;   Contributed by Sebastian Perta.
+;
+; This file is free software; you can redistribute it and/or modify it
+; under the terms of the GNU General Public License as published by the
+; Free Software Foundation; either version 3, or (at your option) any
+; later version.
+;
+; This file is distributed in the hope that it will be useful, but
+; WITHOUT ANY WARRANTY; without even the implied warranty of
+; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+; General Public License for more details.
+;
+; Under Section 7 of GPL version 3, you are granted additional
+; permissions described in the GCC Runtime Library Exception, version
+; 3.1, as published by the Free Software Foundation.
+;
+; You should have received a copy of the GNU General Public License and
+; a copy of the GCC Runtime Library Exception along with this program;
+; see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+; .
+
+
+#include "vregs.h"
+
+.text
+
+START_FUNC ___subdi3
+
+movw  hl, sp   ; use HL-based addressing (allows for direct subw)
+
+movw  ax, [hl+4]
+subw  ax, [hl+12]
+movw  r8, ax
+
+mov   a, [hl+6]; middle bytes of the result are determined using 8-bit
+subc  a, [hl+14]   ; SUBC insns which both account for and update the carry bit
+mov   r10, a   ; (no SUBWC instruction is available)
+mov   a, [hl+7]
+subc  a, [hl+15]
+mov   r11, a
+
+mov   a, [hl+8]
+subc  a, [hl+16]
+mov   r12, a
+mov   a, [hl+9]
+subc  a, [hl+17]
+mov   r13, a
+
+movw  ax, [hl+10]
+sknc   ; account for the possible carry from the
+decw  ax   ; latest 8-bit operation
+subw  ax, [hl+18]
+movw  r14, ax
+
+ret
+
+END_FUNC ___subdi3
+




Re: [PATCH GCC][3/3]Refine CFG and bound information for split loops

2017-10-20 Thread Bin Cheng







From: Richard Biener 
Sent: 20 October 2017 12:24
To: Bin Cheng
Cc: gcc-patches@gcc.gnu.org; nd
Subject: Re: [PATCH GCC][3/3]Refine CFG and bound information for split loops
    
On Thu, Oct 19, 2017 at 3:26 PM, Bin Cheng  wrote:
> Hi,
> This is a rework of patch at  
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01037.html.
> The new patch doesn't try to handle all cases, instead, it only handles 
> obvious cases.
> It also tries to add tests illustrating different cases handled.
> Bootstrap and test for patch set on x86_64 and AArch64.  Comments?

ENOPATCH

Sorry for the mistake, here it is.

Thanks,
bin

> Thanks,
> bin
> 2017-10-16  Bin Cheng  
>
> * tree-ssa-loop-split.c (compute_new_first_bound): New parameter.
> Compute and return bound information for the second split loop.
> (adjust_loop_split): New function.
> (split_loop): Update use and call above function.
>
> gcc/testsuite/ChangeLog
> 2017-10-16  Bin Cheng  
>
> * gcc.dg/loop-split-1.c: New test.
> * gcc.dg/loop-split-2.c: New test.
> * gcc.dg/loop-split-3.c: New test.
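
For readers unfamiliar with the transformation, the following is a hand-written sketch of what loop splitting does to the loop in loop-split-1.c.  It is an assumption-laden illustration, not compiler output: the names foo_original and foo_split are hypothetical, and the real pass works on GIMPLE rather than on source.  The guard k < len is true in every iteration except the last, so the second loop runs at most once and its back edge is never taken, which is presumably what the "iterates at 0 latch times" dump message refers to.

```c
#include <assert.h>

/* Original shape: the guard `k < len` holds for every iteration
   except the last one.  */
void
foo_original (int *a, int *b, int len)
{
  for (int k = 1; k <= len; k++)
    {
      a[k]++;
      if (k < len)
        b[k]++;
    }
}

/* Hand-written equivalent after splitting at k == len.  */
void
foo_split (int *a, int *b, int len)
{
  int k = 1;

  /* First loop: the guard is known to be true, so it is dropped.  */
  for (; k < len; k++)
    {
      a[k]++;
      b[k]++;
    }

  /* Second loop: the guard is known to be false; at most one
     iteration remains.  */
  for (; k <= len; k++)
    a[k]++;
}
```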
From 3bf8b382682b6a6c6aedf6f085d663e6379f003a Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 2 Aug 2017 14:57:27 +0100
Subject: [PATCH 3/3] lsplit-refine-cfg-niter-bound-20171017.txt

---
 gcc/testsuite/gcc.dg/loop-split-1.c |  40 
 gcc/testsuite/gcc.dg/loop-split-2.c |  34 +++
 gcc/testsuite/gcc.dg/loop-split-3.c |  40 
 gcc/tree-ssa-loop-split.c   | 179 +---
 4 files changed, 282 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/loop-split-1.c
 create mode 100644 gcc/testsuite/gcc.dg/loop-split-2.c
 create mode 100644 gcc/testsuite/gcc.dg/loop-split-3.c

diff --git a/gcc/testsuite/gcc.dg/loop-split-1.c b/gcc/testsuite/gcc.dg/loop-split-1.c
new file mode 100644
index 000..7cf6a37
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-split-1.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fsplit-loops -fdump-tree-lsplit-details" } */
+
+#define NUM (100)
+int x[NUM] = {0, 1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31};
+int y[NUM] = {0, 1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31};
+int r[NUM] = {0, 2, 4, 6, 8, 12, 14, 18, 20, 24, 29, 31};
+
+extern void abort (void);
+int __attribute__((noinline)) foo (int *a, int *b, int len)
+{
+  int k;
+  for (k = 1; k <= len; k++)
+{
+  a[k]++;
+
+  if (k < len)
+	b[k]++;
+}
+}
+
+int main (void)
+{
+  int i;
+
+  foo (x, y, 9);
+
+  for (i = 0; i < NUM; ++i)
+{
+  if (i != 9
+	  && (x[i] != r[i] || y[i] != r[i]))
+	abort ();
+  if (i == 9
+	  && (x[i] != r[i] || y[i] != r[i] - 1))
+	abort ();
+}
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump "The second split loop iterates at 0 latch times." "lsplit" } } */
diff --git a/gcc/testsuite/gcc.dg/loop-split-2.c b/gcc/testsuite/gcc.dg/loop-split-2.c
new file mode 100644
index 000..3659a7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-split-2.c
@@ -0,0 +1,34 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fsplit-loops -fdump-tree-lsplit-details" } */
+
+#define NUM (100)
+int x[NUM] = {0, 1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31};
+int y[NUM] = {0, 1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31};
+int r[NUM] = {1, 1, 4, 5, 8, 11, 14, 17, 20, 23, 29, 31};
+
+extern void abort (void);
+int __attribute__((noinline)) foo (int *a, int *b, int len)
+{
+  int k, i;
+  for (k = 0, i = 1; k < len; k += 2, i += 2)
+{
+  a[k]++;
+
+  if (i < 1 + len)
+	b[k]++;
+}
+}
+
+int main (void)
+{
+  int i;
+
+  foo (x, y, 9);
+
+  for (i = 0; i < NUM; ++i)
+if (x[i] != r[i] || y[i] != r[i])
+  abort ();
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump "The second split loop is never executed." "lsplit" } } */
diff --git a/gcc/testsuite/gcc.dg/loop-split-3.c b/gcc/testsuite/gcc.dg/loop-split-3.c
new file mode 100644
index 000..10e7cfd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-split-3.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fsplit-loops -fdump-tree-lsplit-details" } */
+
+#define NUM (100)
+int x[NUM] = {0, 1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31};
+int y[NUM] = {0, 1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31};
+int r[NUM] = {0, 2, 4, 6, 8, 12, 14, 18, 20, 24, 30, 31};
+
+extern void abort (void);
+int __attribute__((noinline)) foo (int *a, int *b, int start, int end)
+{
+  int k;
+  for (k = start; k >= end; k--)
+{
+  a[k]++;
+
+  if (k > end)
+	b[k]++;
+}
+}
+
+int main (void)
+{
+  int i;
+
+  foo (x, y, 10, 1);
+
+  for (i = 0; i < NUM; ++i)
+{
+  if (i != 1
+	  && (x[i] != r[i] || y[i] != r[i]))
+	abort ();
+  if (i == 1
+	  && (x[i] != r[i] || y[i] != r[i] - 1))
+	abort ();
+}
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump "The second split loop iterates at 0 latch times." "lsplit" } } */
diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index e454cc5..ee398a2 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-sp

Re: [PATCH] Fix nrv-1.c false failure on aarch64.

2017-10-20 Thread Richard Earnshaw (lists)
On 20/10/17 13:45, Alexandre Oliva wrote:
> On Oct 19, 2017, "Richard Earnshaw (lists)" 
> wrote:
> 
>> On 19/10/17 09:14, Richard Biener wrote:
>>> I guess Alex work on stmt frontiers will fix this instance?
> 
>> Don't stmt frontiers just enable you to identify exactly one stopping
>> point with each statement, so that you don't keep repeatedly stepping to
>> the same line?
> 
> There's that, but such stopping points are also ordered WRT debug bind
> stmts, so that when you stop at such a recommended point, you observe
> the expected side effects.
> 

How do you ensure that if all the instructions from statement2 are
scheduled before any of the instructions from statement1?

R.




Re: [testsuite] Fix directives order

2017-10-20 Thread Richard Earnshaw (lists)
On 16/10/17 21:45, Christophe Lyon wrote:
> Hi,
> 
> I have noticed a few testcases where dg-do should be moved as the
> first directive, and others where dg-options should be moved before
> dg-add-options. The attached patch does that. I noticed no difference
> in testing, at least because the arm configs I test do not include
> v8m.
> So, no regression from my point of view, but this should avoid some headaches.
> 
> OK?

This all looks pretty sensible.

OK.

R.

> 
> Thanks,
> 
> Christophe
> 
> 
> dg-order.chlog.txt
> 
> 
> 2017-10-16  Christophe Lyon  
> 
>   * gcc.c-torture/execute/pr23135.c: Move dg-add-options after
>   dg-options.
>   * gcc.dg/torture/pr78305.c: Move dg-do as first directive.
>   * gcc.misc-tests/gcov-3.c: Likewise.
>   * gcc.target/arm/cmse/baseline/cmse-11.c: Move dg-options before 
> dg-add-options.
>   * gcc.target/arm/cmse/baseline/cmse-13.c: Likewise.
>   * gcc.target/arm/cmse/baseline/cmse-2.c: Likewise.
>   * gcc.target/arm/cmse/baseline/cmse-6.c: Likewise.
>   * gcc.target/arm/cmse/baseline/softfp.c: Likewise.
>   * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise.
>   * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Likewise.
>   * gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
>   * gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
>   * gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
>   * gcc.target/arm/cmse/mainline/hard/cmse-5.c: Likewise.
>   * gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
>   * gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.
>   * gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise.
>   * gcc.target/arm/cmse/mainline/soft/cmse-5.c: Likewise.
>   * gcc.target/arm/cmse/mainline/soft/cmse-7.c: Likewise.
>   * gcc.target/arm/cmse/mainline/soft/cmse-8.c: Likewise.
>   * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: Likewise.
>   * gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: Likewise.
>   * gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: Likewise.
>   * gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise.
>   * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: Likewise.
>   * gcc.target/arm/cmse/mainline/softfp/cmse-7.c: Likewise.
>   * gcc.target/arm/cmse/mainline/softfp/cmse-8.c: Likewise.
>   * gcc.target/arm/lp1189445.c: Likewise.
> 
> 
> dg-order.patch.txt
> 
> 
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr23135.c 
> b/gcc/testsuite/gcc.c-torture/execute/pr23135.c
> index 8dd6358..e740ff5 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/pr23135.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr23135.c
> @@ -1,9 +1,8 @@
> -/* { dg-add-options stack_size } */
> -
>  /* Based on execute/simd-1.c, modified by joern.renne...@st.com to
> trigger a reload bug.  Verified for gcc mainline from 20050722 13:00 UTC
> for sh-elf -m4 -O2.  */
>  /* { dg-options "-Wno-psabi" } */
> +/* { dg-add-options stack_size } */
>  
>  #ifndef STACK_SIZE
>  #define STACK_SIZE (256*1024)
> diff --git a/gcc/testsuite/gcc.dg/torture/pr78305.c 
> b/gcc/testsuite/gcc.dg/torture/pr78305.c
> index ccb8c6f..36d3620 100644
> --- a/gcc/testsuite/gcc.dg/torture/pr78305.c
> +++ b/gcc/testsuite/gcc.dg/torture/pr78305.c
> @@ -1,5 +1,5 @@
> -/* { dg-require-effective-target int32plus } */
>  /* { dg-do run } */
> +/* { dg-require-effective-target int32plus } */
>  
>  int main ()
>  {
> diff --git a/gcc/testsuite/gcc.misc-tests/gcov-3.c 
> b/gcc/testsuite/gcc.misc-tests/gcov-3.c
> index eb6e4cc..5b07dd7 100644
> --- a/gcc/testsuite/gcc.misc-tests/gcov-3.c
> +++ b/gcc/testsuite/gcc.misc-tests/gcov-3.c
> @@ -1,10 +1,10 @@
> +/* { dg-do run { target native } } */
>  /* { dg-require-effective-target label_values } */
>  
>  /* Test Gcov with computed gotos.
> This is the same as test gcc.c-torture/execute/980526-1.c */
>  
>  /* { dg-options "-fprofile-arcs -ftest-coverage" } */
> -/* { dg-do run { target native } } */
>  
>  extern void abort (void);
>  extern void exit (int);
> diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c 
> b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
> index 3007409..795544f 100644
> --- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
> +++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
> @@ -1,7 +1,7 @@
>  /* { dg-do compile } */
> +/* { dg-options "-mcmse" }  */
>  /* { dg-require-effective-target arm_arch_v8m_base_ok } */
>  /* { dg-add-options arm_arch_v8m_base } */
> -/* { dg-options "-mcmse" }  */
>  
>  int __attribute__ ((cmse_nonsecure_call)) (*bar) (int);
>  
> diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c 
> b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
> index f2b931b..8ced14b 100644
> --- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
> +++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
> @@ -1,7 +1,7 @@
>  /* { dg-do compile } */
> +/* { dg-options "-mcmse" } */
>  /* { dg-requir

[PATCH] Fix PR82129

2017-10-20 Thread Richard Biener

This fixes another antic iteration issue.  We were choosing a random
expression when intersecting ANTIC_OUT (the one translated along the
first edge).  This leads to oscillations if this expression changes
from iteration to iteration.  The fix is to make sure we always pick
the same expression out of the set of candidate expressions (all
expressions from all edges).  We could instead keep them all, but for
simplicity we keep only a single expression per value in the sets.
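
The idea can be sketched outside the compiler with plain bitmasks.  This is a simplified model under stated assumptions, not the tree-ssa-pre.c code: struct expr_set, intersect_canonical and the value_of table are hypothetical names, sets are limited to 32 IDs, and pruning happens on every call rather than once after all successors are merged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a value/expression set pair.  */
struct expr_set
{
  uint32_t values;       /* bit v set: value v is in the set */
  uint32_t expressions;  /* bit e set: expression e is in the set */
};

/* Intersect DEST with SRC on values while keeping at most one
   expression per surviving value: the one with the lowest ID.
   VALUE_OF[e] is the value number (< 32) of expression e.  */
void
intersect_canonical (struct expr_set *dest, const struct expr_set *src,
                     const unsigned *value_of)
{
  uint32_t seen = 0;

  dest->values &= src->values;            /* intersection of values */
  dest->expressions |= src->expressions;  /* union of expressions */

  /* Walk expression IDs in increasing order; the first expression
     seen for a value wins and all later ones are pruned.  */
  for (unsigned e = 0; e < 32; e++)
    {
      uint32_t ebit = UINT32_C (1) << e;
      if (!(dest->expressions & ebit))
        continue;
      uint32_t vbit = UINT32_C (1) << value_of[e];
      if (!(dest->values & vbit) || (seen & vbit))
        dest->expressions &= ~ebit;
      else
        seen |= vbit;
    }
}
```

Because the surviving expression depends only on IDs and not on which edge is visited first, the result is the same whichever operand is used as the starting set.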

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-10-20  Richard Biener  

PR tree-optimization/82129
* tree-ssa-pre.c (bitmap_set_and): Remove.
(compute_antic_aux): Compute ANTIC_OUT intersection in a way
canonicalizing expressions in the set to those with lowest
ID rather than taking that from the first edge.

* gcc.dg/torture/pr82129.c: New testcase.

Index: gcc/tree-ssa-pre.c
===
--- gcc/tree-ssa-pre.c  (revision 253930)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -537,7 +537,6 @@ static pre_expr bitmap_find_leader (bitm
 static void bitmap_value_insert_into_set (bitmap_set_t, pre_expr);
 static void bitmap_value_replace_in_set (bitmap_set_t, pre_expr);
 static void bitmap_set_copy (bitmap_set_t, bitmap_set_t);
-static void bitmap_set_and (bitmap_set_t, bitmap_set_t);
 static bool bitmap_set_contains_value (bitmap_set_t, unsigned int);
 static void bitmap_insert_into_set (bitmap_set_t, pre_expr);
 static bitmap_set_t bitmap_set_new (void);
@@ -800,36 +799,6 @@ sorted_array_from_bitmap_set (bitmap_set
   return result;
 }
 
-/* Perform bitmapped set operation DEST &= ORIG.  */
-
-static void
-bitmap_set_and (bitmap_set_t dest, bitmap_set_t orig)
-{
-  bitmap_iterator bi;
-  unsigned int i;
-
-  if (dest != orig)
-{
-  bitmap_and_into (&dest->values, &orig->values);
-
-  unsigned int to_clear = -1U;
-  FOR_EACH_EXPR_ID_IN_SET (dest, i, bi)
-   {
- if (to_clear != -1U)
-   {
- bitmap_clear_bit (&dest->expressions, to_clear);
- to_clear = -1U;
-   }
- pre_expr expr = expression_for_id (i);
- unsigned int value_id = get_expr_value_id (expr);
- if (!bitmap_bit_p (&dest->values, value_id))
-   to_clear = i;
-   }
-  if (to_clear != -1U)
-   bitmap_clear_bit (&dest->expressions, to_clear);
-}
-}
-
 /* Subtract all expressions contained in ORIG from DEST.  */
 
 static bitmap_set_t
@@ -2182,17 +2151,54 @@ compute_antic_aux (basic_block block, bo
 
   phi_translate_set (ANTIC_OUT, ANTIC_IN (first), block, first);
 
+  /* If we have multiple successors we need to intersect the ANTIC_OUT
+ sets.  For values that's a simple intersection but for
+expressions it is a union.  Given we want to have a single
+expression per value in our sets we have to canonicalize.
+Avoid randomness and running into cycles like for PR82129 and
+canonicalize the expression we choose to the one with the
+lowest id.  This requires we actually compute the union first.  */
   FOR_EACH_VEC_ELT (worklist, i, bprime)
{
  if (!gimple_seq_empty_p (phi_nodes (bprime)))
{
  bitmap_set_t tmp = bitmap_set_new ();
  phi_translate_set (tmp, ANTIC_IN (bprime), block, bprime);
- bitmap_set_and (ANTIC_OUT, tmp);
+ bitmap_and_into (&ANTIC_OUT->values, &tmp->values);
+ bitmap_ior_into (&ANTIC_OUT->expressions, &tmp->expressions);
  bitmap_set_free (tmp);
}
  else
-   bitmap_set_and (ANTIC_OUT, ANTIC_IN (bprime));
+   {
+ bitmap_and_into (&ANTIC_OUT->values, &ANTIC_IN (bprime)->values);
+ bitmap_ior_into (&ANTIC_OUT->expressions,
+  &ANTIC_IN (bprime)->expressions);
+   }
+   }
+  if (! worklist.is_empty ())
+   {
+ /* Prune expressions not in the value set, canonicalizing to
+expression with lowest ID.  */
+ bitmap_iterator bi;
+ unsigned int i;
+ unsigned int to_clear = -1U;
+ bitmap seen_value = BITMAP_ALLOC (NULL);
+ FOR_EACH_EXPR_ID_IN_SET (ANTIC_OUT, i, bi)
+   {
+ if (to_clear != -1U)
+   {
+ bitmap_clear_bit (&ANTIC_OUT->expressions, to_clear);
+ to_clear = -1U;
+   }
+ pre_expr expr = expression_for_id (i);
+ unsigned int value_id = get_expr_value_id (expr);
+ if (!bitmap_bit_p (&ANTIC_OUT->values, value_id)
+ || !bitmap_set_bit (seen_value, value_id))
+   to_clear = i;
+   }
+ if (to_clear != -1U)
+   bitmap_clear_bit (&ANTIC_OUT->expressions, to_clear);
+ BITMAP_FREE (seen_value);
}
 }
 
Index: gcc/testsuite/gcc.dg/torture/pr82129.c
===

Re: [PATCH][AArch64] Wrong type-attribute for stp and str

2017-10-20 Thread Richard Earnshaw (lists)
On 16/10/17 14:26, Dominik Inführ wrote:
> Hi,
> 
> it seems the type attributes for neon_stp and neon_store1_1reg should be 
> the other way around.
> 

Yes, I agree, but there's more.

Firstly, we have two patterns that are named *aarch64_simd_mov,
with different iterators.  That's slightly confusing.  I think they need
to be renamed as:

*aarch64_simd_mov

and

*aarch64_simd_mov

to break the ambiguity.

Secondly it looks to me as though the attributes on the other one are
also incorrect.  Could you check that one out as well, please.

Thanks,

R.

> Thanks
> Dominik
> 
> ChangeLog:
> 2017-10-16  Dominik Infuehr  
> 
>   * config/aarch64/aarch64-simd.md
>   (*aarch64_simd_mov): Fix type-attribute.
> --
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 49f615cfdbf..409ad3502ff 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -160,8 +160,8 @@
> gcc_unreachable ();
>  }
>  }
> -  [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
> -neon_stp, neon_logic, multiple, multiple,\
> +  [(set_attr "type" "neon_load1_1reg, neon_stp, neon_store1_1reg,\
> +neon_logic, multiple, multiple,\
>  multiple, neon_move")
> (set_attr "length" "4,4,4,4,8,8,8,4")]
>  )
> 



Re: [ARM] PR 67591 ARM v8 Thumb IT blocks are deprecated part 2

2017-10-20 Thread Richard Earnshaw (lists)
On 13/10/17 08:41, Christophe Lyon wrote:
> Hi,
> 
> The attached small patch solves PR 67591 and removes occurrences of
> "IT blocks containing 32-bit Thumb instructions are deprecated in
> ARMv8". It is similar to the patch I committed recently and updates
> the 3 remaining patterns that can generate such instructions. I
> checked gcc.log, g++.log, libstdc++.log and gfortran.log and found no
> occurrence of the warning with this patch applied.
> 
> Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a
> and --with-cpu=cortex-a57 --with-mode=thumb, and also bootstrapped
> successfully on armv8 HW in thumb mode.
> 
> Benchmarking shows no noticeable difference.
> 
> OK for trunk?
> 

OK.

R.

> Thanks,
> 
> Christophe
> 
> 
> depr-it-2.chlog.txt
> 
> 
> 2017-10-13  Christophe Lyon  
> 
>   PR target/67591
>   * config/arm/arm.md (*sub_shiftsi): Add predicable_short_it
>   attribute.
>   (*cmp_ite0): Add enabled_for_depr_it attribute.
>   (*cmp_ite1): Likewise.
> 
> 
> depr-it-2.patch.txt
> 
> 
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index f241f9d..093db74 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -8960,6 +8960,7 @@
>"TARGET_32BIT"
>"sub%?\\t%0, %1, %3%S2"
>[(set_attr "predicable" "yes")
> +   (set_attr "predicable_short_it" "no")
> (set_attr "shift" "3")
> (set_attr "arch" "32,a")
> (set_attr "type" "alus_shift_imm,alus_shift_reg")])
> @@ -9398,6 +9399,7 @@
>}"
>[(set_attr "conds" "set")
> (set_attr "arch" "t2,t2,t2,t2,t2,any,any,any,any")
> +   (set_attr "enabled_for_depr_it" "yes,no,no,no,no,no,no,no,no")
> (set_attr "type" "multiple")
> (set_attr_alternative "length"
>[(const_int 6)
> @@ -9481,6 +9483,7 @@
>}"
>[(set_attr "conds" "set")
> (set_attr "arch" "t2,t2,t2,t2,t2,any,any,any,any")
> +   (set_attr "enabled_for_depr_it" "yes,no,no,no,no,no,no,no,no")
> (set_attr_alternative "length"
>[(const_int 6)
> (const_int 8)
> 



[Ada] Small optimization in Sem_Ch4.Find_Concatenation_Types

2017-10-20 Thread Pierre-Marie de Rodat
The handling of string concatenation is quite inefficient in the front-end
because the compiler generates a lot of implicit concatenation operators for
array types and also enumeration types, and then walks the full list every
time it needs to resolve a concatenation operator, in most cases for strings.

This small patch adds a guard to Find_Concatenation_Types in order to filter
out operators unrelated to string concatenation and thus to avoid doing a
costly full type resolution for each of them.  This saves more than 5% of
the instruction count on x86-64 for a typical testcase.
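
The shape of the guard can be shown in a few lines of C.  This is only a sketch of the filtering condition, with hypothetical names (component_kind, concat_candidate, candidate_is_plausible); the actual change is in Ada and is followed by further checks such as the limited-type and compatibility tests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of an array's component type.  */
enum component_kind { COMP_CHARACTER, COMP_STRING, COMP_OTHER };

struct concat_candidate
{
  bool is_array;                 /* operator's type is an array type */
  enum component_kind component; /* kind of the array's component type */
};

/* Return true if CAND still needs the costly full type resolution.
   When either operand is a string literal, only arrays of characters
   or arrays of strings can match, so everything else is skipped.  */
bool
candidate_is_plausible (const struct concat_candidate *cand,
                        bool left_is_string_literal,
                        bool right_is_string_literal)
{
  bool is_string = left_is_string_literal || right_is_string_literal;

  if (!cand->is_array)
    return false;
  return !is_string
         || cand->component == COMP_CHARACTER
         || cand->component == COMP_STRING;
}
```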

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-10-20  Eric Botcazou  

* sem_ch4.adb (Find_Concatenation_Types): Filter out operators if one
of the operands is a string literal.

Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 253916)
+++ sem_ch4.adb (working copy)
@@ -6431,10 +6431,24 @@
   Op_Id : Entity_Id;
   N : Node_Id)
is
-  Op_Type : constant Entity_Id := Etype (Op_Id);
+  Is_String : constant Boolean := Nkind (L) = N_String_Literal
+or else
+  Nkind (R) = N_String_Literal;
+  Op_Type   : constant Entity_Id := Etype (Op_Id);
 
begin
   if Is_Array_Type (Op_Type)
+
+--  Small but very effective optimization: if at least one operand is a
+--  string literal, then the type of the operator must be either array
+--  of characters or array of strings.
+
+and then (not Is_String
+or else
+  Is_Character_Type (Component_Type (Op_Type))
+or else
+  Is_String_Type (Component_Type (Op_Type)))
+
 and then not Is_Limited_Type (Op_Type)
 
 and then (Has_Compatible_Type (L, Op_Type)


[Ada] Superfluous restriction on aspect Dimension applied to integer type

2017-10-20 Thread Pierre-Marie de Rodat
If the dimensioned root type is an integer type, it is not particularly useful,
and fractional dimensions do not make much sense for such types, so previously
we used to reject dimensions of integer types that were not integer literals.
However, the manipulation of dimensions does not depend on the kind of root
type, so we can accept this usage for rare cases where dimensions are specified
for integer-valued types.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

2017-10-20  Ed Schonberg  

* sem_dim.adb (Extract_Power): Accept dimension values that are not
non-negative integers when the dimensioned base type is an Integer
type.

gcc/testsuite/

2017-10-20  Ed Schonberg  

* gnat.dg/dimensions.adb, gnat.dg/dimensions.ads: New testcase.
Index: sem_dim.adb
===
--- sem_dim.adb (revision 253938)
+++ sem_dim.adb (working copy)
@@ -518,25 +518,17 @@
  Position : Dimension_Position)
   is
   begin
- --  Integer case
+ Dimensions (Position) := Create_Rational_From (Expr, True);
+ Processed (Position) := True;
 
- if Is_Integer_Type (Def_Id) then
+ --  If the dimensioned root type is an integer type, it is not
+ --  particularly useful, and fractional dimensions do not make
+ --  much sense for such types, so previously we used to reject
+ --  dimensions of integer types that were not integer literals.
+ --  However, the manipulation of dimensions does not depend on
+ --  the kind of root type, so we can accept this usage for rare
+ --  cases where dimensions are specified for integer values.
 
---  Dimension value must be an integer literal
-
-if Nkind (Expr) = N_Integer_Literal then
-   Dimensions (Position) := +Whole (UI_To_Int (Intval (Expr)));
-else
-   Error_Msg_N ("integer literal expected", Expr);
-end if;
-
- --  Float case
-
- else
-Dimensions (Position) := Create_Rational_From (Expr, True);
- end if;
-
- Processed (Position) := True;
   end Extract_Power;
 
   
Index: gnat.dg/dimensions.adb
===
--- gnat.dg/dimensions.adb  (revision 0)
+++ gnat.dg/dimensions.adb  (revision 253941)
@@ -0,0 +1,5 @@
+--  { dg-do compile }
+
+package body Dimensions is
+   procedure Dummy is null;
+end Dimensions;
Index: gnat.dg/dimensions.ads
===
--- gnat.dg/dimensions.ads  (revision 0)
+++ gnat.dg/dimensions.ads  (revision 253941)
@@ -0,0 +1,29 @@
+package Dimensions is
+
+   type Mks_Int_Type is new Integer
+ with
+  Dimension_System => (
+(Unit_Name => Meter,Unit_Symbol => 'm',   Dim_Symbol => 'L'),
+(Unit_Name => Kilogram, Unit_Symbol => "kg",  Dim_Symbol => 'M'),
+(Unit_Name => Second,   Unit_Symbol => 's',   Dim_Symbol => 'T'),
+(Unit_Name => Ampere,   Unit_Symbol => 'A',   Dim_Symbol => 'I'),
+(Unit_Name => Kelvin,   Unit_Symbol => 'K',   Dim_Symbol => '@'),
+(Unit_Name => Mole, Unit_Symbol => "mol", Dim_Symbol => 'N'),
+(Unit_Name => Candela,  Unit_Symbol => "cd",  Dim_Symbol => 'J'));
+
+   subtype Int_Length is Mks_Int_Type
+ with
+  Dimension => (Symbol => 'm',
+Meter  => 1,
+others => 0);
+
+   subtype Int_Speed is Mks_Int_Type
+ with
+  Dimension => (
+Meter  =>  1,
+Second => -1,
+others =>  0);
+
+   procedure Dummy;
+
+end Dimensions;


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-20 Thread Martin Sebor

On 10/20/2017 02:08 AM, Richard Biener wrote:

On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor  wrote:

On 10/19/2017 02:34 AM, Richard Biener wrote:


On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor  wrote:


On 10/18/2017 04:48 AM, Richard Biener wrote:



On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor  wrote:



While testing my latest -Wrestrict changes I noticed a number of
opportunities to improve the -Warray-bounds warning.  Attached
is a patch that implements a solution for the following subset
of these:

PR tree-optimization/82596 - missing -Warray-bounds on an out-of-bounds
  index into string literal
PR tree-optimization/82588 - missing -Warray-bounds on an excessively
  large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
  inner indices




I meant to use size_type_node (size_t), not sizetype.  But
I just checked that ptrdiff_type_node is initialized in
build_common_tree_nodes and thus always available.



I see.  Using ptrdiff_type_node is preferable for the targets
where ptrdiff_t has a greater precision than size_t (e.g., VMS).
It makes sense now.  I should remember to change all the other
places where I introduced ssizetype to use ptrdiff_type_node.




As an aside, at some point I would like to get away from a type
based limit in all these warnings and instead use one that can
be controlled by an option so that a user can impose a lower limit
on the maximum size of an object and have all size-related warnings
(and perhaps even optimizations) enforce it and benefit from it.



You could add a --param that is initialized from ptrdiff_type_node.



Yes, that's an option to consider.  Thanks.





+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+   {
+ HOST_WIDE_INT off;
+ if (tree base = get_addr_base_and_unit_offset (ref, &off))
+   up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1,
+  TYPE_SIZE_UNIT (TREE_TYPE (base)));
+ else
+   return;

so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be
false).  Simply not subtracting anything instead of returning would be
conservatively correct, no?  Likewise subtracting the offset of the array
for all "previous" variably indexed components, assuming the lowest value
for the index.  But as above I think compensating for the offset of the
array within the object is academic ... ;)




I was going to say yes (it gives up) but on second thought I don't
think it does.  Only the major index can be unbounded and the code
does consider the size of the sub-array when checking the major
index.  So, IIUC, I think this works correctly as is (*).  What
doesn't work is VLAs but those are a separate problem.  Let me
know if I misunderstood your question.



get_addr_base_and_unit_offset will return NULL if there's any variable
component in 'ref'.  So as written it seems to be dead code (you
want to pass 'arg'?)



Sorry, I'm not sure I understand what you mean.  What do you think
is dead code?  The call to get_addr_base_and_unit_offset() is also
made for an array of unspecified bound (up_bound is null) and for
an array at the end of a struct.  For those the function returns
non-null, and for the others (arrays of runtime bound) it returns
null.  (I passed arg instead of ref but I see no difference in
my tests.)


If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg') it will
return the offset of 'c'.  If you pass a.b[j].c it will still return NULL.
You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset.  The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a HWI
you'll instead get offset == 0 and max_size == -1.

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).


Computing bit offsets defeats the out-of-bounds flexible array
index detection because the computation overflows (the function
sets the offset to zero).  I'll go ahead with the first version
unless you have a different suggestion.

Thanks
Martin
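
The overflow concern can be illustrated outside GCC.  The sketch below
assumes HOST_WIDE_INT is a signed 64-bit integer and uses a hypothetical
helper, bits_from_bytes; it is not the get_ref_base_and_extent code, only
a model of why a byte offset that fits may stop fitting once it is scaled
to bits.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: convert a byte offset to a bit offset, or return
// nullopt when the scaled value does not fit in a signed 64-bit integer.
std::optional<std::int64_t> bits_from_bytes(std::int64_t byte_off)
{
    constexpr std::int64_t max = std::numeric_limits<std::int64_t>::max();
    constexpr std::int64_t min = std::numeric_limits<std::int64_t>::min();
    if (byte_off > max / 8 || byte_off < min / 8)
        return std::nullopt;   // multiplying by 8 bits per byte would overflow
    return byte_off * 8;
}
```

Byte offsets above roughly an eighth of the 64-bit range are exactly the
ones an excessive-index warning cares about, which is why a unit (byte)
offset is the safer representation here.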



Thus instead of

+  tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
+
+  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);
+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+   {
+ HOST_WIDE_INT off;
+ if (tree base = get_addr_base_and_unit_offset (arg, &off))
+   {
+ tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));

(not sure why you're subtracting the size

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-20 Thread Richard Biener
On October 20, 2017 5:43:40 PM GMT+02:00, Martin Sebor  wrote:
>On 10/20/2017 02:08 AM, Richard Biener wrote:
>> On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor 
>wrote:
>>> On 10/19/2017 02:34 AM, Richard Biener wrote:

 On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor 
>wrote:
>
> On 10/18/2017 04:48 AM, Richard Biener wrote:
>>
>>
>> On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor 
>wrote:
>>>
>>>
>>> While testing my latest -Wrestrict changes I noticed a number of
>>> opportunities to improve the -Warray-bounds warning.  Attached
>>> is a patch that implements a solution for the following subset
>>> of these:
>>>
>>> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>>>   bounds index into string literal
>>> PR tree-optimization/82588 - missing -Warray-bounds on an
>excessively
>>>   large index
>>> PR tree-optimization/82583 - missing -Warray-bounds on
>out-of-bounds
>>>   inner indices
>>>
>>>
 I meant to use size_type_node (size_t), not sizetype.  But
 I just checked that ptrdiff_type_node is initialized in
 build_common_tree_nodes and thus always available.
>>>
>>>
>>> I see.  Using ptrdiff_type_node is preferable for the targets
>>> where ptrdiff_t has a greater precision than size_t (e.g., VMS).
>>> It makes sense now.  I should remember to change all the other
>>> places where I introduced ssizetype to use ptrdiff_type_node.
>>>

> As an aside, at some point I would like to get away from a type
> based limit in all these warnings and instead use one that can
> be controlled by an option so that a user can impose a lower limit
> on the maximum size of an object and have all size-related
>warnings
> (and perhaps even optimizations) enforce it and benefit from it.


 You could add a --param that is initialized from ptrdiff_type_node.
>>>
>>>
>>> Yes, that's an option to consider.  Thanks.
>>>
>>>

>> +  tree arg = TREE_OPERAND (ref, 0);
>> +  tree_code code = TREE_CODE (arg);
>> +  if (code == COMPONENT_REF)
>> +   {
>> + HOST_WIDE_INT off;
>> + if (tree base = get_addr_base_and_unit_offset (ref,
>&off))
>> +   up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype,
>> up_bound_p1,
>> +  TYPE_SIZE_UNIT (TREE_TYPE
>> (base)));
>> + else
>> +   return;
>>
>> so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will
>be
>> false).
>> simply not subtracting anything instead of returning would be
>> conservatively
>> correct, no?  Likewise subtracting the offset of the array for
>all
>> "previous"
>> variably indexed components with assuming the lowest value for
>the
>> index.
>> But as above I think compensating for the offset of the array
>within the
>> object
>> is academic ... ;)
>
>
>
> I was going to say yes (it gives up) but on second thought I don't
> think it does.  Only the major index can be unbounded and the code
> does consider the size of the sub-array when checking the major
> index.  So, IIUC, I think this works correctly as is (*).  What
> doesn't work is VLAs but those are a separate problem.  Let me
> know if I misunderstood your question.


 get_addr_base_and_unit_offset will return NULL if there's any
>variable
 component in 'ref'.  So as written it seems to be dead code (you
 want to pass 'arg'?)
>>>
>>>
>>> Sorry, I'm not sure I understand what you mean.  What do you think
>>> is dead code?  The call to get_addr_base_and_unit_offset() is also
>>> made for an array of unspecified bound (up_bound is null) and for
>>> an array at the end of a struct.  For those the function returns
>>> non-null, and for the others (arrays of runtime bound) it returns
>>> null.  (I passed arg instead of ref but I see no difference in
>>> my tests.)
>>
>> If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')
>it will
>> return the offset of 'c'.  If you pass a.b[j].c it will still return
>NULL.
>> You could use get_ref_base_and_extent which will return the offset
>> of a.b[0].c in this case and sets max_size != size - but you are only
>> interested in offset.  The disadvantage of get_ref_base_and_extent
>> is it returns offset in bits thus if the offset is too large for a
>HWI
>> you'll instead get offset == 0 and max_size == -1.
>>
>> Thus I'm saying this is dead code for variable array accesses
>> (even for the array you are warning about).  Yes, for constant index
>> and at-struct-end you'll get sth, but the warning is in VRP because
>> of variable indexes.
>>
>> So I suggest to pass 'arg' and use get_ref_base_and_extent
>> for some extra precision (and possible lossage for very very large
>> structures).
>
>Computing bit offsets defeats the out-of-bounds flexible array
>index detection because the computati

Re: [patch, c++] Add a warning flag for the enum bit-field declaration warning in bug #61414.

2017-10-20 Thread Jason Merrill
On Wed, Oct 18, 2017 at 3:15 PM, Sam van Kampen via gcc-patches
 wrote:
> On Wed, Oct 18, 2017 at 09:46:08AM -0600, Martin Sebor wrote:
>> > Fair enough, I didn't know whether to change the way it currently was
>> > triggered. Do you think it should fall under -Wextra (I don't think it
>> > falls under -Wall, since it isn't "easy to avoid or modify to prevent
>> > the warning" because it may be valid and wanted behavior), or should it
>> > be enabled by no other flag?
>>
>> I think it depends on the implementation of the warning.  With
>> the current (fairly restrictive) behavior I'd say it should be
>> disabled by default.  But if it were to be changed to more closely
>> match the Clang behavior and only warn for bit-field declarations
>> that cannot represent all enumerators of the enumerated type, then
>> including it in -Wall would seem helpful to me.
>>
>> I.e., Clang doesn't warn on this and IIUC that's what the reporter
>> of the bug also expects:
>>
>>   enum E: unsigned { E3 = 15 };
>>
>>   struct S { E i: 4; };
>>
>> (There is value in warning on this as well, but I think most users
>> will not be interested in it, so making the warning a two-level one
>> where level 1 warns same as Clang and level 2 same as GCC does now
>> might give us the best of both worlds).
>
> I see what you mean - that is the behavior I wanted to implement in the
> first place, but Jonathan Wakely rightly pointed out that when an
> enumeration is scoped, all values of its underlying type are valid
> enumeration values, and so the bit-field you declare in 'S' _is_ too
> small to hold all values of 'enum E'.
>
> Here's the corresponding text from draft N4659 of the C++17 standard,
> §10.2/8 [dcl.enum]
>
> For an enumeration whose underlying type is fixed, the values of the
> enumeration are the values of the underlying type. [...] It is possible
> to define an enumeration that has values not defined by any of its
> enumerators.
>
> Still, warning when a bit-field can't hold all enumerators instead of
> all values may be a good idea. I've looked into it, and it does require
> recalculating the maximum and minimum enumerator value, since the bounds
> of the underlying type are saved in TYPE_MIN_VALUE and TYPE_MAX_VALUE
> when the enumeration is scoped, instead of the min/max enumerator value.
>
> Would adding that separate warning level be part of a separate patch, or
> should I add it to this one?

I think making that behavior the default would be appropriate.  The
current behavior for scoped enums seems like a bug; sure, all values
of the underlying type are valid, but that's also true for "unsigned i
: 4", and we don't warn about that.

Jason
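
The distinction between "all enumerators" and "all values" can be
sketched as follows.  This is only an illustration, not the front-end
implementation; bits_needed is a hypothetical helper, and the enum and
struct are the ones from Martin's example.

```cpp
#include <cstdint>

enum E : unsigned { E3 = 15 };

struct S { E i : 4; };   // 4 bits: wide enough for every enumerator of E

// Hypothetical helper: number of bits needed to store an unsigned value.
constexpr unsigned bits_needed(std::uint64_t v)
{
    unsigned n = 1;
    while (v >>= 1)
        ++n;
    return n;
}

static_assert(bits_needed(E3) == 4, "largest enumerator fits in 4 bits");
static_assert(bits_needed(~0u) > 4, "not every value of the underlying type does");
```

Under the behavior proposed above, only the first check would matter for
the warning, just as no warning is given for "unsigned i : 4".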


[Ada] Be more permissive for comparisons with literals in dimension system

2017-10-20 Thread Pierre-Marie de Rodat
The dimension system in GNAT now allows comparing a dimensioned
expression with a literal, but it issues a warning in this case if the
literal is not zero.

The following code compiles with warnings:

 $ gcc -c use_dims.adb

     1. with System.Dim.Mks; use System.Dim.Mks;
     2.
     3. procedure Use_Dims with SPARK_Mode is
     4.    X : Speed := 0.0;
     5.    Y : Speed := 1.0;
                        |
        >>> warning: assumed to be "1.0 m.s**(-1)"

     6. begin
     7.    if X = 0.0 then
     8.       null;
     9.    elsif 0.0 = X then
    10.       null;
    11.    elsif X = 1.0 then
                     |
        >>> warning: assumed to be "1.0 m.s**(-1)"

    12.       null;
    13.    end if;
    14. end Use_Dims;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-10-20  Yannick Moy  

* sem_dim.adb (Analyze_Dimension_Binary_Op): Accept with a warning to
compare a dimensioned expression with a literal.
(Dim_Warning_For_Numeric_Literal): Do not issue a warning for the
special value zero.
* doc/gnat_ugn/gnat_and_program_execution.rst: Update description of
dimensionality system in GNAT.
* gnat_ugn.texi: Regenerate.

Index: doc/gnat_ugn/gnat_and_program_execution.rst
===
--- doc/gnat_ugn/gnat_and_program_execution.rst (revision 253938)
+++ doc/gnat_ugn/gnat_and_program_execution.rst (working copy)
@@ -3611,21 +3611,27 @@
 ``Acceleration``.
 
 The dimensionality checks for relationals use the same rules as
-for "+" and "-"; thus
+for "+" and "-", except when comparing to a literal; thus
 
   .. code-block:: ada
 
-acc > 10.0
+acc > len
 
 is equivalent to
 
   .. code-block:: ada
 
-   acc-10.0 > 0.0
+   acc-len > 0.0
 
-and is thus illegal. Analogously a conditional expression
-requires the same dimension vector for each branch.
+and is thus illegal, but
 
+  .. code-block:: ada
+
+acc > 10.0
+
+is accepted with a warning. Analogously a conditional expression requires the
+same dimension vector for each branch (with no exception for literals).
+
 The dimension vector of a type conversion :samp:`T({expr})` is defined
 as follows, based on the nature of ``T``:
 
Index: sem_dim.adb
===
--- sem_dim.adb (revision 253941)
+++ sem_dim.adb (working copy)
@@ -1577,6 +1577,20 @@
   then
  null;
 
+  --  Numeric literal case. Issue a warning to indicate the
+  --  literal is treated as if its dimension matches the type
+  --  dimension.
+
+  elsif Nkind_In (Original_Node (L), N_Real_Literal,
+ N_Integer_Literal)
+  then
+ Dim_Warning_For_Numeric_Literal (L, Etype (R));
+
+  elsif Nkind_In (Original_Node (R), N_Real_Literal,
+ N_Integer_Literal)
+  then
+ Dim_Warning_For_Numeric_Literal (R, Etype (L));
+
   else
  Error_Dim_Msg_For_Binary_Op (N, L, R);
   end if;
@@ -2724,6 +2738,24 @@
 
procedure Dim_Warning_For_Numeric_Literal (N : Node_Id; Typ : Entity_Id) is
begin
+  --  Consider the literal zero (integer 0 or real 0.0) to be of any
+  --  dimension.
+
+  case Nkind (Original_Node (N)) is
+ when N_Real_Literal =>
+if Expr_Value_R (N) = Ureal_0 then
+   return;
+end if;
+
+ when N_Integer_Literal =>
+if Expr_Value (N) = Uint_0 then
+   return;
+end if;
+
+ when others =>
+null;
+  end case;
+
   --  Initialize name buffer
 
   Name_Len := 0;
Index: gnat_ugn.texi
===
--- gnat_ugn.texi   (revision 253938)
+++ gnat_ugn.texi   (working copy)
@@ -21,7 +21,7 @@
 
 @copying
 @quotation
-GNAT User's Guide for Native Platforms , Oct 14, 2017
+GNAT User's Guide for Native Platforms , Oct 20, 2017
 
 AdaCore
 
@@ -12474,8 +12474,8 @@
 This switch activates warnings for exception usage when pragma Restrictions
 (No_Exception_Propagation) is in effect. Warnings are given for implicit or
 explicit exception raises which are not covered by a local handler, and for
-exception handlers which do not cover a local raise. The default is that these
-warnings are not given.
+exception handlers which do not cover a local raise. The default is that
+these warnings are given for units that contain exception handlers.
 
 @item @code{-gnatw.X}
 
@@ -22901,12 +22901,12 @@
 @code{Acceleration}.
 
 The dimensionality checks for relationals use the same rules as
-for "+" and "-"; thus
+for "+" and "-", except when comparing to a literal; thus
 
 @quotation
 
 @exa

[Ada] Fix inadequate silencing of errors in expression functions

2017-10-20 Thread Pierre-Marie de Rodat
Errors were previously silenced on expression functions, which caused
some bug boxes to be issued inside GNATprove, as the AST could be
ill-formed. This is now fixed. There is no example code as this only has
an effect on GNATprove.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-10-20  Yannick Moy  

* sem_ch6.adb (Analyze_Expression_Function.Freeze_Expr_Types): Remove
inadequate silencing of errors.
* sem_util.adb (Check_Part_Of_Reference): Do not issue an error when
checking the subprogram body generated from an expression function,
when this is done as part of the preanalysis done on expression
functions, as the subprogram body may not yet be attached in the AST.
The error if any will be issued later during the analysis of the body.
(Is_Aliased_View): Trivial rewrite with Is_Formal_Object.

Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 253938)
+++ sem_ch6.adb (working copy)
@@ -442,18 +442,12 @@
   begin
  --  Preanalyze a duplicate of the expression to have available the
  --  minimum decoration needed to locate referenced unfrozen types
- --  without adding any decoration to the function expression. This
- --  preanalysis is performed with errors disabled to avoid reporting
- --  spurious errors on Ghost entities (since the expression is not
- --  fully analyzed).
+ --  without adding any decoration to the function expression.
 
  Push_Scope (Def_Id);
  Install_Formals (Def_Id);
- Ignore_Errors_Enable := Ignore_Errors_Enable + 1;
 
  Preanalyze_Spec_Expression (Dup_Expr, Etype (Def_Id));
-
- Ignore_Errors_Enable := Ignore_Errors_Enable - 1;
  End_Scope;
 
  --  Restore certain attributes of Def_Id since the preanalysis may
Index: sem_util.adb
===
--- sem_util.adb(revision 253938)
+++ sem_util.adb(working copy)
@@ -3354,10 +3354,13 @@
and then not Comes_From_Source (Par)
  then
 --  Continue to examine the context if the reference appears in a
---  subprogram body which was previously an expression function.
+--  subprogram body which was previously an expression function,
+--  unless this is during preanalysis (when In_Spec_Expression is
+--  True), as the body may not yet be inserted in the tree.
 
 if Nkind (Par) = N_Subprogram_Body
   and then Was_Expression_Function (Par)
+  and then not In_Spec_Expression
 then
null;
 
@@ -12545,9 +12548,7 @@
  or else (Present (Renamed_Object (E))
and then Is_Aliased_View (Renamed_Object (E)
 
-   or else ((Is_Formal (E)
-  or else Ekind_In (E, E_Generic_In_Out_Parameter,
-   E_Generic_In_Parameter))
+   or else ((Is_Formal (E) or else Is_Formal_Object (E))
 and then Is_Tagged_Type (Etype (E)))
 
or else (Is_Concurrent_Type (E) and then In_Open_Scopes (E))


[Ada] Mark temporary entity created while removing side effects as internal

2017-10-20 Thread Pierre-Marie de Rodat
Temporary entities created by the frontend are now marked as internal to
simplify their detection in the GNATprove backend. Also, marking them
as internal makes it less likely that extra code for their initialization
will be created if pragma Initialize_Scalars is active (though I didn't
check that). No impact on the testsuite.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-10-20  Piotr Trojanek  

* exp_util.adb (Build_Temporary): Mark created temporary entity as
internal.

Index: exp_util.adb
===
--- exp_util.adb(revision 253941)
+++ exp_util.adb(working copy)
@@ -10978,7 +10978,8 @@
  Related_Nod : Node_Id := Empty) return Entity_Id;
   --  Create an external symbol of the form xxx_FIRST/_LAST if Related_Nod
   --  is present (xxx is taken from the Chars field of Related_Nod),
-  --  otherwise it generates an internal temporary.
+  --  otherwise it generates an internal temporary. The created temporary
+  --  entity is marked as internal.
 
   -
   -- Build_Temporary --
@@ -10990,6 +10991,7 @@
  Related_Nod : Node_Id := Empty) return Entity_Id
   is
  Temp_Nam : Name_Id;
+ Temp_Id  : Entity_Id;
 
   begin
  --  The context requires an external symbol
@@ -11001,13 +11003,17 @@
Temp_Nam := New_External_Name (Chars (Related_Id), "_LAST");
 end if;
 
-return Make_Defining_Identifier (Loc, Temp_Nam);
+Temp_Id := Make_Defining_Identifier (Loc, Temp_Nam);
 
  --  Otherwise generate an internal temporary
 
  else
-return Make_Temporary (Loc, Id, Related_Nod);
+Temp_Id := Make_Temporary (Loc, Id, Related_Nod);
  end if;
+
+ Set_Is_Internal (Temp_Id);
+
+ return Temp_Id;
   end Build_Temporary;
 
   --  Local variables


[Ada] Adjust new implementation of ABE detection to ZFP context

2017-10-20 Thread Pierre-Marie de Rodat
The new implementation of Access-Before-Elaboration detection can create new
raise Program_Error statements at the very end of the front-end processing,
which is too late in order for the first-line mechanism implementing the
No_Exception_Propagation restriction present in the front-end to catch them.

There is a second-line mechanism present in gigi that can catch them, but the
expanded tree must nevertheless be prepared beforehand for their possible
creation; this is achieved by calling Possible_Local_Raise in the few cases
where an ABE scenario could give rise to raising Program_Error.

Since this is a very conservative processing, additional adjustments are made
in order for the warnings tied to the No_Exception_Propagation restriction to
still be issued in a useful way.

ACATS c39006b must now pass again in ZFP mode.

2017-10-20  Eric Botcazou  

* exp_ch11.ads (Warn_If_No_Local_Raise): Declare.
* exp_ch11.adb (Expand_Exception_Handlers): Use Warn_If_No_Local_Raise
to issue the warning on the absence of local raise.
(Possible_Local_Raise): Do not issue the warning for Call_Markers.
(Warn_If_No_Local_Raise): New procedure to issue the warning on the
absence of local raise.
* sem_elab.adb: Add with and use clauses for Exp_Ch11.
(Record_Elaboration_Scenario): Call Possible_Local_Raise in the cases
where a scenario could give rise to raising Program_Error.
* sem_elab.adb: Typo fixes.
* fe.h (Warn_If_No_Local_Raise): Declare.
* gcc-interface/gigi.h (get_exception_label): Change return type.
* gcc-interface/trans.c (gnu_constraint_error_label_stack): Change to
simple vector of Entity_Id.
(gnu_storage_error_label_stack): Likewise.
(gnu_program_error_label_stack): Likewise.
(gigi): Adjust to above changes.
(Raise_Error_to_gnu): Likewise.
(gnat_to_gnu) : Set TREE_USED on the label.
(N_Push_Constraint_Error_Label): Push the label onto the stack.
(N_Push_Storage_Error_Label): Likewise.
(N_Push_Program_Error_Label): Likewise.
(N_Pop_Constraint_Error_Label): Pop the label from the stack and issue
a warning on the absence of local raise.
(N_Pop_Storage_Error_Label): Likewise.
(N_Pop_Program_Error_Label): Likewise.
(push_exception_label_stack): Delete.
(get_exception_label): Change return type to Entity_Id and adjust.
* gcc-interface/utils2.c (build_goto_raise): Change type of first
parameter to Entity_Id and adjust.  Set TREE_USED on the label.
(build_call_raise): Adjust calls to get_exception_label and also
build_goto_raise.
(build_call_raise_column): Likewise.
(build_call_raise_range): Likewise.
* doc/gnat_ugn/building_executable_programs_with_gnat.rst (-gnatw.x):
Document actual default behavior.
Index: doc/gnat_ugn/building_executable_programs_with_gnat.rst
===
--- doc/gnat_ugn/building_executable_programs_with_gnat.rst (revision 253938)
+++ doc/gnat_ugn/building_executable_programs_with_gnat.rst (working copy)
@@ -3898,8 +3898,8 @@
   This switch activates warnings for exception usage when pragma Restrictions
   (No_Exception_Propagation) is in effect. Warnings are given for implicit or
   explicit exception raises which are not covered by a local handler, and for
-  exception handlers which do not cover a local raise. The default is that these
-  warnings are not given.
+  exception handlers which do not cover a local raise. The default is that
+  these warnings are given for units that contain exception handlers.
 
 
 :switch:`-gnatw.X`
Index: exp_ch11.adb
===
--- exp_ch11.adb(revision 253938)
+++ exp_ch11.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2017, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -64,7 +64,7 @@
 
procedure Warn_If_No_Propagation (N : Node_Id);
--  Called for an exception raise that is not a local raise (and thus can
-   --  not be optimized to a goto. Issues warning if No_Exception_Propagation
+   --  not be optimized to a goto). Issues warning if No_Exception_Propagation
--  restriction is set. N is the node for the raise or equivalent call.
 
--

[Ada] Spurious error on partial parameterization

2017-10-20 Thread Pierre-Marie de Rodat
This patch corrects an issue whereby a defaulted formal package actual
generated a spurious type mismatch error upon instantiation instead of being
accepted as per ARM 12.7(4.4/3).

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

2017-10-20  Justin Squirek  

* sem_ch12.adb (Check_Formal_Package_Instance): Add sanity check to
verify a renaming exists for a generic formal before comparing it to
the actual as defaulted formals will not have a renamed_object.

gcc/testsuite/

2017-10-20  Justin Squirek  

* gnat.dg/default_pkg_actual.adb, gnat.dg/default_pkg_actual2.adb: New
testcases.
Index: sem_ch12.adb
===
--- sem_ch12.adb(revision 253941)
+++ sem_ch12.adb(working copy)
@@ -6459,10 +6459,11 @@
  elsif Ekind (E1) = E_Package then
 Check_Mismatch
   (Ekind (E1) /= Ekind (E2)
-or else Renamed_Object (E1) /= Renamed_Object (E2));
+or else (Present (Renamed_Object (E2))
+  and then Renamed_Object (E1) /=
+ Renamed_Object (E2)));
 
  elsif Is_Overloadable (E1) then
-
 --  Verify that the actual subprograms match. Note that actuals
 --  that are attributes are rewritten as subprograms. If the
 --  subprogram in the formal package is defaulted, no check is
Index: ../testsuite/gnat.dg/default_pkg_actual.adb
===
--- ../testsuite/gnat.dg/default_pkg_actual.adb (revision 0)
+++ ../testsuite/gnat.dg/default_pkg_actual.adb (revision 0)
@@ -0,0 +1,32 @@
+--  { dg-do compile }
+
+procedure Default_Pkg_Actual is
+
+   generic
+   package As is
+   end As;
+
+   generic
+  type T is private;
+  with package A0 is new As;
+   package Bs is
+   end Bs;
+
+   generic
+  with package Xa is new As;
+   package Xs is
+  package Xb is new Bs(T => Integer, A0 => Xa);
+   end Xs;
+
+   generic
+  with package Yb is new Bs(T => Integer, others => <>);
+   package Ys is
+   end Ys;
+
+   package A is new As;
+   package X is new Xs(Xa => A);
+   package Y is new Ys(Yb => X.Xb);
+
+begin
+   null;
+end;
Index: ../testsuite/gnat.dg/default_pkg_actual2.adb
===
--- ../testsuite/gnat.dg/default_pkg_actual2.adb(revision 0)
+++ ../testsuite/gnat.dg/default_pkg_actual2.adb(revision 0)
@@ -0,0 +1,27 @@
+--  { dg-do compile }
+
+procedure Default_Pkg_Actual2 is
+
+   generic
+   package P1 is
+   end;
+
+   generic
+  with package FP1a is new P1;
+  with package FP1b is new P1;
+   package P2 is
+   end;
+
+   generic
+  with package FP2 is new P2 (FP1a => <>,  FP1b => <>);
+   package P3 is
+   end;
+
+   package NP1a is new P1;
+   package NP1b is new P1;
+   package NP2  is new P2 (NP1a, NP1b);
+   package NP4  is new P3 (NP2);
+
+begin
+   null;
+end;


[C++ Patch] PR 80955 (Macros expanded in definition of user-defined literals)

2017-10-20 Thread Mukesh Kapoor

Hi,

This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80955.
Handle user-defined literals correctly in lex_string(). An empty string
followed by an identifier is a valid user-defined literal. Don't issue a
warning for this case.

Bootstrapped and tested with 'make check' on x86_64-linux. New test case
added. Ok for trunk?


Mukesh

Index: gcc/testsuite/g++.dg/cpp0x/udlit-macros.C
===
--- gcc/testsuite/g++.dg/cpp0x/udlit-macros.C   (revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/udlit-macros.C   (working copy)
@@ -0,0 +1,7 @@
+// PR c++/80955
+// { dg-do compile { target c++11 } }
+
+using size_t = decltype(sizeof(0));
+#define _zero
+int operator""_zero(const char*, size_t) { return 0; }
+int main() { return ""_zero; }
Index: libcpp/lex.c
===
--- libcpp/lex.c(revision 253775)
+++ libcpp/lex.c(working copy)
@@ -2001,8 +2001,11 @@
   /* If a string format macro, say from inttypes.h, is placed touching
 a string literal it could be parsed as a C++11 user-defined string
 literal thus breaking the program.
-Try to identify macros with is_macro. A warning is issued. */
-  if (is_macro (pfile, cur))
+Try to identify macros with is_macro. A warning is issued.
+Don't do this for a user-defined literal, i.e. an
+empty string followed by an identifier.
+For an empty string "", (cur-base)==2. Bug 80955 */
+  if (is_macro (pfile, cur) && ((cur-base) != 2))
{
  /* Raise a warning, but do not consume subsequent tokens.  */
  if (CPP_OPTION (pfile, warn_literal_suffix) && !pfile->state.skipping)
/libcpp
2017-10-20  Mukesh Kapoor   

PR c++/80955
* lex.c (lex_string): An empty string followed by an identifier is
a valid user-defined literal. Don't issue a warning for this case.

/testsuite
2017-10-20  Mukesh Kapoor   

PR c++/80955
* g++.dg/cpp0x/udlit-macros.C: New.
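
For readers unfamiliar with the construct, the sketch below shows a
literal operator applied to an empty string, together with a model of the
length test added to lex_string().  The suffix _len and the helper
is_empty_literal are hypothetical names chosen for illustration; the
helper assumes an unprefixed literal, with base at the opening quote and
cur just past the closing quote.

```cpp
#include <cstddef>

// A user-defined literal operator; the suffix is written directly after
// the string literal, as in ""_len.
constexpr int operator""_len(const char*, std::size_t len)
{
    return static_cast<int>(len);   // length without the terminating NUL
}

// Hypothetical model of the patch's test: an empty, unprefixed string
// literal spans exactly two characters, the two quotes.
constexpr bool is_empty_literal(const char* base, const char* cur)
{
    return cur - base == 2;
}
```

Unlike the new test case, the sketch does not define a macro with the
same name as the suffix, so it does not depend on the lexer fix.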



[Ada] Spurious warnings and errors on calls on synchronized interfaces

2017-10-20 Thread Pierre-Marie de Rodat
This patch fixes some spurious warnings and errors on dispatching calls to
synchronized operations when the controlling formal of the operation is an
access to interface type.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

2017-10-20  Ed Schonberg  

* sem_util.adb (Is_Controlling_Limited_Procedure): Handle properly the
case where the controlling formal is an anonymous access to interface
type.
* exp_ch9.adb (Extract_Dispatching_Call): If controlling actual is an
access type, handle properly the constructed dereference that
designates the object used in the rewritten synchronized call.
(Parameter_Block_Pack): If the type of the actual is by-copy, its
generated declaration in the parameter block does not need an
initialization even if the type is a null-excluding access type,
because it will be initialized with the value of the actual later on.
(Parameter_Block_Pack): Do not add controlling actual to parameter
block when its type is by-copy.

gcc/testsuite/

2017-10-20  Ed Schonberg  

* gnat.dg/sync_iface_call.adb, gnat.dg/sync_iface_call_pkg.ads,
gnat.dg/sync_iface_call_pkg2.adb, gnat.dg/sync_iface_call_pkg2.ads:
New testcase.
Index: sem_util.adb
===
--- sem_util.adb(revision 253947)
+++ sem_util.adb(working copy)
@@ -13186,18 +13186,30 @@
function Is_Controlling_Limited_Procedure
  (Proc_Nam : Entity_Id) return Boolean
is
+  Param : Node_Id;
   Param_Typ : Entity_Id := Empty;
 
begin
   if Ekind (Proc_Nam) = E_Procedure
 and then Present (Parameter_Specifications (Parent (Proc_Nam)))
   then
- Param_Typ := Etype (Parameter_Type (First (
-Parameter_Specifications (Parent (Proc_Nam);
+ Param := Parameter_Type (First (
+Parameter_Specifications (Parent (Proc_Nam;
 
-  --  In this case where an Itype was created, the procedure call has been
-  --  rewritten.
+ --  The formal may be an anonymous access type.
 
+ if Nkind (Param) = N_Access_Definition then
+Param_Typ := Entity (Subtype_Mark (Param));
+
+ else
+Param_Typ := Etype (Param);
+ end if;
+
+  --  In the case where an Itype was created for a dispatchin call, the
+  --  procedure call has been rewritten. The actual may be an access to
+  --  interface type in which case it is the designated type that is the
+  --  controlling type.
+
   elsif Present (Associated_Node_For_Itype (Proc_Nam))
 and then Present (Original_Node (Associated_Node_For_Itype (Proc_Nam)))
 and then
@@ -13207,6 +13219,10 @@
  Param_Typ :=
Etype (First (Parameter_Associations
   (Associated_Node_For_Itype (Proc_Nam;
+
+ if Ekind (Param_Typ) = E_Anonymous_Access_Type then
+Param_Typ := Directly_Designated_Type (Param_Typ);
+ end if;
   end if;
 
   if Present (Param_Typ) then
Index: exp_ch9.adb
===
--- exp_ch9.adb (revision 253941)
+++ exp_ch9.adb (working copy)
@@ -12909,11 +12909,14 @@
   end if;
 
   --  If the type of the dispatching object is an access type then return
-  --  an explicit dereference.
+  --  an explicit dereference  of a copy of the object, and note that
+  --  this is the controlling actual of the call.
 
   if Is_Access_Type (Etype (Object)) then
- Object := Make_Explicit_Dereference (Sloc (N), Object);
+ Object :=
+   Make_Explicit_Dereference (Sloc (N), New_Copy_Tree (Object));
  Analyze (Object);
+ Set_Is_Controlling_Actual (Object);
   end if;
end Extract_Dispatching_Call;
 
@@ -14561,6 +14564,12 @@
 Object_Definition   =>
   New_Occurrence_Of (Etype (Formal), Loc)));
 
+--  The object is initialized with an explicit assignment
+--  later. Indicate that it does not need an initialization
+--  to prevent spurious warnings if the type excludes null.
+
+Set_No_Initialization (Last (Decls));
+
 if Ekind (Formal) /= E_Out_Parameter then
 
--  Generate:
@@ -14577,16 +14586,23 @@
Expression => New_Copy_Tree (Actual)));
 end if;
 
---  Generate:
+--  If the actual is not controlling, generate:
+
 --Jnn'unchecked_access
 
-Append_To (Params,
-  Make_Attribute_Reference (Loc,
-Attribute_Name => Name_Unchecked_Access,
-Prefix => New_Occurrence_Of (Temp_Nam, Loc)));
+--  and add it to aggegate for access to formals. Note that
+--  the actual may be by-copy but still b

[Ada] Spurious ineffective use_clause warning

2017-10-20 Thread Pierre-Marie de Rodat
This patch corrects an issue whereby a child package included into the body of
a parent forced the checks for ineffective use clauses within the parent's spec
to be performed early, leading to spurious warnings.


-- Source --


--  pp.ads

package PP is
  type Object is null record;
  Undefined : Object;
end;

--  p.ads

with PP;
package P is
   use type PP.Object;
   procedure Force;
end;

--  p.adb

with P.S;
package body P is
   Junk : Boolean := PP.Undefined /= PP.Undefined and P.S.Junk;
   procedure Force is null;
end;

--  p-s.ads

package P.S is
   Junk : Boolean := True;
end;


-- Compilation and output --


$ gnatmake -q -gnatwu p.adb

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-10-20  Justin Squirek  

* sem_ch8.adb (Update_Use_Clause_Chain): Add sanity check to verify
scope stack traversal into the context clause.

Index: sem_ch8.adb
===
--- sem_ch8.adb (revision 253945)
+++ sem_ch8.adb (working copy)
@@ -9108,10 +9108,10 @@
   --  Deal with use clauses within the context area if the current
   --  scope is a compilation unit.
 
-  if Is_Compilation_Unit (Current_Scope) then
-
- pragma Assert (Scope_Stack.Last /= Scope_Stack.First);
-
+  if Is_Compilation_Unit (Current_Scope)
+and then Sloc (Scope_Stack.Table
+(Scope_Stack.Last - 1).Entity) = Standard_Location
+  then
  Update_Chain_In_Scope (Scope_Stack.Last - 1);
   end if;
end Update_Use_Clause_Chain;


[Ada] Use the Monotonic Clock on Linux

2017-10-20 Thread Pierre-Marie de Rodat
The POSIX method of calculating absolute deadlines is adopted in
favor of latching the monotonic clock to a known epoch, as the POSIX
method is simpler and meets all the requirements of the Ada LRM.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-10-20  Doug Rupp  

* libgnarl/s-osinte__linux.ads (Relative_Timed_Wait): Add variable
needed for using monotonic clock.
* libgnarl/s-taprop__linux.adb: Revert previous monotonic clock
changes.
* libgnarl/s-taprop__linux.adb, s-taprop__posix.adb: Unify and factor
out monotonic clock related functions body.
(Timed_Sleep, Timed_Delay, Monotonic_Clock, RT_Resolution,
Compute_Deadline): Move to...
* libgnarl/s-tpopmo.adb: ... here. New separate package body.

Index: libgnarl/s-osinte__linux.ads
===
--- libgnarl/s-osinte__linux.ads(revision 253938)
+++ libgnarl/s-osinte__linux.ads(working copy)
@@ -448,6 +448,9 @@
   abstime : access timespec) return int;
pragma Import (C, pthread_cond_timedwait, "pthread_cond_timedwait");
 
+   Relative_Timed_Wait : constant Boolean := False;
+   --  pthread_cond_timedwait requires an absolute delay time
+
--
-- POSIX.1c  Section 13 --
--
Index: libgnarl/s-taprop__posix.adb
===
--- libgnarl/s-taprop__posix.adb(revision 253938)
+++ libgnarl/s-taprop__posix.adb(working copy)
@@ -145,6 +145,38 @@
package body Specific is separate;
--  The body of this package is target specific
 
+   package Monotonic is
+
+  function Monotonic_Clock return Duration;
+  pragma Inline (Monotonic_Clock);
+  --  Returns "absolute" time, represented as an offset relative to "the
+  --  Epoch", which is Jan 1, 1970. This clock implementation is immune to
+  --  the system's clock changes.
+
+  function RT_Resolution return Duration;
+  pragma Inline (RT_Resolution);
+  --  Returns resolution of the underlying clock used to implement RT_Clock
+
+  procedure Timed_Sleep
+(Self_ID  : ST.Task_Id;
+ Time : Duration;
+ Mode : ST.Delay_Modes;
+ Reason   : System.Tasking.Task_States;
+ Timedout : out Boolean;
+ Yielded  : out Boolean);
+  --  Combination of Sleep (above) and Timed_Delay
+
+  procedure Timed_Delay
+(Self_ID : ST.Task_Id;
+ Time: Duration;
+ Mode: ST.Delay_Modes);
+  --  Implement the semantics of the delay statement.
+  --  The caller should be abort-deferred and should not hold any locks.
+
+   end Monotonic;
+
+   package body Monotonic is separate;
+
--
-- ATCB allocation/deallocation --
--
@@ -183,18 +215,6 @@
pragma Import (C,
  GNAT_pthread_condattr_setup, "__gnat_pthread_condattr_setup");
 
-   procedure Compute_Deadline
- (Time   : Duration;
-  Mode   : ST.Delay_Modes;
-  Check_Time : out Duration;
-  Abs_Time   : out Duration;
-  Rel_Time   : out Duration);
-   --  Helper for Timed_Sleep and Timed_Delay: given a deadline specified by
-   --  Time and Mode, compute the current clock reading (Check_Time), and the
-   --  target absolute and relative clock readings (Abs_Time, Rel_Time). The
-   --  epoch for Time depends on Mode; the epoch for Check_Time and Abs_Time
-   --  is always that of CLOCK_RT_Ada.
-
---
-- Abort_Handler --
---
@@ -253,67 +273,6 @@
   end if;
end Abort_Handler;
 
-   --
-   -- Compute_Deadline --
-   --
-
-   procedure Compute_Deadline
- (Time   : Duration;
-  Mode   : ST.Delay_Modes;
-  Check_Time : out Duration;
-  Abs_Time   : out Duration;
-  Rel_Time   : out Duration)
-   is
-   begin
-  Check_Time := Monotonic_Clock;
-
-  --  Relative deadline
-
-  if Mode = Relative then
- Abs_Time := Duration'Min (Time, Max_Sensible_Delay) + Check_Time;
-
- if Relative_Timed_Wait then
-Rel_Time := Duration'Min (Max_Sensible_Delay, Time);
- end if;
-
- pragma Warnings (Off);
- --  Comparison "OSC.CLOCK_RT_Ada = OSC.CLOCK_REALTIME" is compile
- --  time known.
-
-  --  Absolute deadline specified using the tasking clock (CLOCK_RT_Ada)
-
-  elsif Mode = Absolute_RT
-or else OSC.CLOCK_RT_Ada = OSC.CLOCK_REALTIME
-  then
- pragma Warnings (On);
- Abs_Time := Duration'Min (Check_Time + Max_Sensible_Delay, Time);
-
- if Relative_Timed_Wait then
-Rel_Time := Duration'Min (Max_Sensible_Delay, Time - Check_Time);
- end if;
-
-  --  Absolute deadline specified using the calendar clock, in the
-  --  case where it is not the 

[Patch, fortran] PR82586 - [PDT] ICE: write_symbol(): bad module symbol

2017-10-20 Thread Paul Richard Thomas
Dear All,

The attached patch is pretty clear with the ChangeLogs and is very
nearly obvious.

Bootstrapped and regtested on FC23/x86_64 - OK for trunk?

Paul

2017-10-20  Paul Thomas  

PR fortran/82586
* decl.c (gfc_get_pdt_instance): Remove the error message that
the parameter does not have a corresponding component since
this is now taken care of when the derived type is resolved. Go
straight to error return instead.
(gfc_match_formal_arglist): Make the PDT relevant errors
immediate so that parsing of the derived type can continue.
(gfc_match_derived_decl): Do not check the match status on
return from gfc_match_formal_arglist for the same reason.
* resolve.c (resolve_fl_derived0): Check that each type
parameter has a corresponding component.

2017-10-20  Paul Thomas  

PR fortran/82586
* gfortran.dg/pdt_16.f03 : New test.
* gfortran.dg/pdt_4.f03 : Catch the changed messages.
* gfortran.dg/pdt_8.f03 : Ditto.


-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein
Index: gcc/fortran/decl.c
===
*** gcc/fortran/decl.c  (revision 253847)
--- gcc/fortran/decl.c  (working copy)
*** gfc_get_pdt_instance (gfc_actual_arglist
*** 3242,3254 
param = type_param_name_list->sym;
  
c1 = gfc_find_component (pdt, param->name, false, true, NULL);
if (!pdt->attr.use_assoc && !c1)
!   {
! gfc_error ("The type parameter name list at %L contains a parameter "
!"'%qs' , which is not declared as a component of the type",
!&pdt->declared_at, param->name);
! goto error_return;
!   }
  
kind_expr = NULL;
if (!name_seen)
--- 3242,3251 
param = type_param_name_list->sym;
  
c1 = gfc_find_component (pdt, param->name, false, true, NULL);
+   /* An error should already have been thrown in resolve.c
+(resolve_fl_derived0).  */
if (!pdt->attr.use_assoc && !c1)
!   goto error_return;
  
kind_expr = NULL;
if (!name_seen)
*** gfc_match_formal_arglist (gfc_symbol *pr
*** 5984,5990 
/* The name of a program unit can be in a different namespace,
 so check for it explicitly.  After the statement is accepted,
 the name is checked for especially in gfc_get_symbol().  */
!   if (gfc_new_block != NULL && sym != NULL
  && strcmp (sym->name, gfc_new_block->name) == 0)
{
  gfc_error ("Name %qs at %C is the name of the procedure",
--- 5981,5987 
/* The name of a program unit can be in a different namespace,
 so check for it explicitly.  After the statement is accepted,
 the name is checked for especially in gfc_get_symbol().  */
!   if (gfc_new_block != NULL && sym != NULL && !typeparam
  && strcmp (sym->name, gfc_new_block->name) == 0)
{
  gfc_error ("Name %qs at %C is the name of the procedure",
*** gfc_match_formal_arglist (gfc_symbol *pr
*** 5999,6005 
m = gfc_match_char (',');
if (m != MATCH_YES)
{
! gfc_error ("Unexpected junk in formal argument list at %C");
  goto cleanup;
}
  }
--- 5996,6006 
m = gfc_match_char (',');
if (m != MATCH_YES)
{
! if (typeparam)
!   gfc_error_now ("Expected parameter list in type declaration "
!  "at %C");
! else
!   gfc_error ("Unexpected junk in formal argument list at %C");
  goto cleanup;
}
  }
*** ok:
*** 6016,6023 
  for (q = p->next; q; q = q->next)
if (p->sym == q->sym)
  {
!   gfc_error ("Duplicate symbol %qs in formal argument list "
!  "at %C", p->sym->name);
  
m = MATCH_ERROR;
goto cleanup;
--- 6017,6028 
  for (q = p->next; q; q = q->next)
if (p->sym == q->sym)
  {
!   if (typeparam)
! gfc_error_now ("Duplicate name %qs in parameter "
!"list at %C", p->sym->name);
!   else
! gfc_error ("Duplicate symbol %qs in formal argument "
!"list at %C", p->sym->name);
  
m = MATCH_ERROR;
goto cleanup;
*** gfc_match_derived_decl (void)
*** 9814,9822 
  
if (parameterized_type)
  {
!   m = gfc_match_formal_arglist (sym, 0, 0, true);
!   if (m != MATCH_YES)
!   return m;
m = gfc_match_eos ();
if (m != MATCH_YES)
return m;
--- 9819,9827 
  
if (parameterized_type)
  {
!   /* Ignore error or mismatches to avoid the component declarations
!causing problems later.  */
!   gfc_match_formal_arglist (sym, 0, 0, tru

Re: [C++ Patch] PR 80955 (Macros expanded in definition of user-defined literals)

2017-10-20 Thread Nathan Sidwell

On 10/20/2017 12:37 PM, Mukesh Kapoor wrote:
> Hi,
>
> This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80955.
> Handle user-defined literals correctly in lex_string(). An empty string
> followed by an identifier is a valid user-defined literal. Don't issue a
> warning for this case.


a) why do we trigger on the definition of the operator function, and not 
on the use site?


b) Why is the empty string special cased?  Doesn't the same logic apply to:

int operator "bob"_zero (const char *, size_t) { return 0;}

(that'd be a syntactic error in the C++ parser of course)

nathan

--
Nathan Sidwell


Re: [C++ Patch] PR 80955 (Macros expanded in definition of user-defined literals)

2017-10-20 Thread Mukesh Kapoor

Hi,

On 10/20/2017 10:45 AM, Nathan Sidwell wrote:
> On 10/20/2017 12:37 PM, Mukesh Kapoor wrote:
>> Hi,
>>
>> This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80955.
>> Handle user-defined literals correctly in lex_string(). An empty string
>> followed by an identifier is a valid user-defined literal. Don't issue a
>> warning for this case.
>
> a) why do we trigger on the definition of the operator function, and
> not on the use site?


Actually, the current compiler issues an error (incorrectly) at both 
places: at the definition as well as at its use.




> b) Why is the empty string special cased?  Doesn't the same logic
> apply to:
>
> int operator "bob"_zero (const char *, size_t) { return 0;}


This is not a valid user-defined literal and is already reported as an 
error by the compiler. After my changes it's still reported as an error.
The empty string immediately followed by an identifier is a special case 
because it's a valid user-defined literal in C++. ""_zero is a valid 
user-defined literal.
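
For readers less familiar with the syntax under discussion, here is a small
sketch of a literal operator whose definition starts with the empty string
followed directly by the suffix, together with a use.  The suffix _zero is
taken from the thread; the _len operator and the two wrapper functions are
hypothetical additions for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// In the definition, ""_zero is an empty string literal immediately
// followed by the suffix identifier.
constexpr int operator""_zero(const char*, std::size_t) { return 0; }

// Hypothetical second operator: returns the length of the literal's text.
constexpr std::size_t operator""_len(const char*, std::size_t n) { return n; }

// At a use site, the empty literal with the suffix is equally valid.
int use_empty_literal() { return ""_zero; }

// A non-empty literal is only valid at a use site, not in the definition.
std::size_t length_of_hello() { return "hello"_len; }
```

The form quoted above, operator "bob"_zero, is rejected because only the empty
string may appear in the operator's declaration.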


Mukesh



> (that'd be a syntactic error in the C++ parser of course)
>
> nathan





[PATCH] Define __cpp_lib_byte feature-test macro

2017-10-20 Thread Jonathan Wakely

The recent SD-6 drafts define a macro for std::byte, so this patch
adds it.

* include/c_global/cstddef: Define __cpp_lib_byte feature-test macro.
* testsuite/18_support/byte/requirements.cc: Check macro.

Tested powerpc64le-linux, committed to trunk and gcc-7-branch.
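
As a hedged illustration of how user code might consume such a macro, the
sketch below selects an implementation based on __cpp_lib_byte.  The function
low_nibble and the constant kHaveStdByte are hypothetical names for this
example; only the macro and its value 201603 come from the patch.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__cpp_lib_byte) && __cpp_lib_byte >= 201603
// The library advertises std::byte, so use it and its bitwise operators.
unsigned low_nibble(unsigned char value) {
  std::byte b{value};
  return std::to_integer<unsigned>(b & std::byte{0x0f});
}
constexpr bool kHaveStdByte = true;
#else
// Fallback for older language modes or libraries without std::byte.
unsigned low_nibble(unsigned char value) { return value & 0x0fu; }
constexpr bool kHaveStdByte = false;
#endif
```

Both branches return the same result, so callers do not need to know which one
was compiled.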


commit 22d1d67abd32dc4db49519fbf24c7726999247fc
Author: Jonathan Wakely 
Date:   Fri Oct 20 18:51:10 2017 +0100

Define __cpp_lib_byte feature-test macro

* include/c_global/cstddef: Define __cpp_lib_byte feature-test 
macro.
* testsuite/18_support/byte/requirements.cc: Check macro.

diff --git a/libstdc++-v3/include/c_global/cstddef 
b/libstdc++-v3/include/c_global/cstddef
index 09754ee45da..11d268b7f81 100644
--- a/libstdc++-v3/include/c_global/cstddef
+++ b/libstdc++-v3/include/c_global/cstddef
@@ -57,9 +57,11 @@ namespace std
 }
 #endif
 
-#if __cplusplus > 201402L
+#if __cplusplus >= 201703L
 namespace std
 {
+#define __cpp_lib_byte 201603
+
   /// std::byte
   enum class byte : unsigned char {};
 
diff --git a/libstdc++-v3/testsuite/18_support/byte/requirements.cc 
b/libstdc++-v3/testsuite/18_support/byte/requirements.cc
index 4cb05df0405..74c8b64d6ce 100644
--- a/libstdc++-v3/testsuite/18_support/byte/requirements.cc
+++ b/libstdc++-v3/testsuite/18_support/byte/requirements.cc
@@ -20,6 +20,12 @@
 
 #include <cstddef>
 
+#ifndef __cpp_lib_byte
+# error "Feature-test macro for byte missing"
+#elif __cpp_lib_byte != 201603
+# error "Feature-test macro for byte has wrong value"
+#endif
+
 static_assert( sizeof(std::byte) == sizeof(unsigned char) );
 static_assert( alignof(std::byte) == alignof(unsigned char) );
 


[C++ PATCH] AS_BASETYPE

2017-10-20 Thread Nathan Sidwell
We have a special 'as-base' instance of each class, for use in derived
layouts.  For some classes this is just the regular class itself, but
for others it's a new instance.  This instance is not in the
class-members, and that's now biting me on the modules branch (I'd been
punting on it before).


This patch does a few cleanups, which make sense on trunk.

We do seem to be creating this instance in more cases than necessary --
looking at (CLASSTYPE_NON_LAYOUT_POD_P (t) || CLASSTYPE_EMPTY_P (t)).
The former claims to 'not be the language non-pod', but it actually is
if you're in C++98-land.  I /think/ we only need a special instance for:

a) empty classes
b) classes with virtual bases
c) non-pod classes with tail padding (which could be overlaid in 
derivation).


But that's a cleanup for another day.
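
To make case (c) concrete, here is a small sketch of a non-POD class whose
tail padding can be reused by a derived class.  The class names and the helper
tail_padding_reused are made up for this example, and the layout described is
what the Itanium C++ ABI is expected to produce on typical targets such as
x86_64 GNU/Linux; other ABIs may lay these classes out differently.

```cpp
#include <cassert>
#include <cstddef>

// A user-provided constructor makes Base non-POD for layout purposes.
// With a 4-byte int, its data ends after 5 bytes and the rest of the
// object is tail padding.
struct Base {
  Base() : i(0), c(0) {}
  int i;
  char c;
};

// Derived's member can be placed in Base's tail padding, which is why the
// compiler needs an 'as-base' view of Base that is smaller than sizeof (Base).
struct Derived : Base {
  Derived() : d(0) {}
  char d;
};

// True when the derived class is no larger than its base, i.e. when the
// tail padding was overlaid.
constexpr bool tail_padding_reused() {
  return sizeof(Derived) == sizeof(Base);
}
```

If Base were a plain aggregate without the constructor, the same ABI would
treat it as POD for layout and would not overlay its padding.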

nathan
--
Nathan Sidwell
2017-10-20  Nathan Sidwell  

	* class.c (layout_class_type): Cleanup as-base creation, determine
	mode here.
	(finish_struct_1): ... not here.

Index: class.c
===
--- class.c	(revision 253948)
+++ class.c	(working copy)
@@ -5992,8 +5992,6 @@ layout_class_type (tree t, tree *virtual
   bool last_field_was_bitfield = false;
   /* The location at which the next field should be inserted.  */
   tree *next_field;
-  /* T, as a base class.  */
-  tree base_t;
 
   /* Keep track of the first non-static data member.  */
   non_static_data_members = TYPE_FIELDS (t);
@@ -6218,15 +6216,11 @@ layout_class_type (tree t, tree *virtual
  that the type is laid out they are no longer important.  */
   remove_zero_width_bit_fields (t);
 
-  /* Create the version of T used for virtual bases.  We do not use
- make_class_type for this version; this is an artificial type.  For
- a POD type, we just reuse T.  */
   if (CLASSTYPE_NON_LAYOUT_POD_P (t) || CLASSTYPE_EMPTY_P (t))
 {
-  base_t = make_node (TREE_CODE (t));
-
-  /* Set the size and alignment for the new type.  */
-  tree eoc;
+  /* T needs a different layout as a base (eliding virtual bases
+	 or whatever).  Create that version.  */
+  tree base_t = make_node (TREE_CODE (t));
 
   /* If the ABI version is not at least two, and the last
 	 field was a bit-field, RLI may not be on a byte
@@ -6235,7 +6229,7 @@ layout_class_type (tree t, tree *virtual
 	 indicates the total number of bits used.  Therefore,
 	 rli_size_so_far, rather than rli_size_unit_so_far, is
 	 used to compute TYPE_SIZE_UNIT.  */
-  eoc = end_of_class (t, /*include_virtuals_p=*/0);
+  tree eoc = end_of_class (t, /*include_virtuals_p=*/0);
   TYPE_SIZE_UNIT (base_t)
 	= size_binop (MAX_EXPR,
 		  fold_convert (sizetype,
@@ -6252,7 +6246,8 @@ layout_class_type (tree t, tree *virtual
   SET_TYPE_ALIGN (base_t, rli->record_align);
   TYPE_USER_ALIGN (base_t) = TYPE_USER_ALIGN (t);
 
-  /* Copy the fields from T.  */
+  /* Copy the non-static data members of T. This will include its
+	 direct non-virtual bases & vtable.  */
   next_field = &TYPE_FIELDS (base_t);
   for (field = TYPE_FIELDS (t); field; field = DECL_CHAIN (field))
 	if (TREE_CODE (field) == FIELD_DECL)
@@ -6263,9 +6258,14 @@ layout_class_type (tree t, tree *virtual
 	  }
   *next_field = NULL_TREE;
 
+  /* We use the base type for trivial assignments, and hence it
+	 needs a mode.  */
+  compute_record_mode (base_t);
+
+  TYPE_CONTEXT (base_t) = t;
+
   /* Record the base version of the type.  */
   CLASSTYPE_AS_BASE (t) = base_t;
-  TYPE_CONTEXT (base_t) = t;
 }
   else
 CLASSTYPE_AS_BASE (t) = t;
@@ -6822,11 +6822,6 @@ finish_struct_1 (tree t)
 
   set_class_bindings (t);
 
-  if (CLASSTYPE_AS_BASE (t) != t)
-/* We use the base type for trivial assignments, and hence it
-   needs a mode.  */
-compute_record_mode (CLASSTYPE_AS_BASE (t));
-
   /* With the layout complete, check for flexible array members and
  zero-length arrays that might overlap other members in the final
  layout.  */


Re: [patch 0/5] nios2 address mode improvements

2017-10-20 Thread Sandra Loosemore

On 10/20/2017 02:12 AM, Richard Biener wrote:
> On Fri, Oct 20, 2017 at 4:03 AM, Sandra Loosemore wrote:
>> This is the set of nios2 optimization patches that I've previously
>> mentioned in these threads:
>>
>> https://gcc.gnu.org/ml/gcc/2017-10/msg00016.html
>> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00957.html
>>
>> To give an overview of what this is for
>>
>> The nios2 backend currently generates quite bad code for memory
>> accesses with addresses involving symbolic constants.  Like a typical
>> RISC machine, nios2 requires splitting such 32-bit constants into
>> HIGH/LO_SUM pairs.  Currently this happens in expand, and address
>> expressions involving such constants are always converted to use a
>> register indirect form.
>>
>> One part of the problem is that the backend currently doesn't
>> recognize that LO_SUM is a legitimate address form (it's register
>> indirect with a constant offset using the %lo relocation).  That's
>> fixed in these patches.
>>
>> A harder problem is that doing the high/lo_sum splitting in expand
>> inhibits subsequent optimizations.  One such problem arises when you
>> have accesses to multiple fields in a static structure object.  Expand
>> sees this as many (symbol + offset) expressions involving the same
>> symbol with different constant offsets.  What we should be doing in
>> that case is CSE'ing the symbol address computation rather than
>> splitting every such expression individually.
>>
>> This patch series attacks that problem by deferring splitting to the
>> split1 pass, which happens after cse and fwprop optimizations.
>> Deferring the splitting also requires that TARGET_LEGITIMATE_ADDRESS_P
>> accept these symbolic constant expressions until the splitting takes
>> place, and that code that might generate 32-bit constants in other
>> places (e.g., the movsi expander) must not do so after they are
>> supposed to have been split.
>
> How do other targets handle this situation?  Naively I'd have handled
> the splitting at reload/LRA time ... (which would make the flag
> to test reload_completed)


The problem with this is that the HIGH/LO_SUM split requires a scratch 
register, so it has to be done before register allocation; 
reload_completed is too late.  Early on when I started working on this 
problem, I did various experiments with this, including fiddling with 
the insn patterns and constraints, trying to use TARGET_SECONDARY_RELOAD 
to manage the scratch register, etc, but I could never get it to work 
without completely stomping on all the unsplit symbols before reload.


I also considered a separate target-specific pass to do the splitting, 
but once I got it to work with the regular split mechanism I was quite 
happy with the design except for the "have we split yet?" hook.  :-P



> There are quite a number of targets using lo_sum but I'm not sure they
> share the issue with symbolic constants.


Nios II resembles a simplified MIPS-ish architecture so the MIPS backend
was where I looked first.  It does some complicated stuff with trying to
match up HIGH and LO_SUM pairs after the fact, and I didn't really want
to go there.  I looked at some other backends as well but didn't see
anything I could directly copy.


-Sandra



libgo patch committed: Support 64-bit DWARF in version check, elsewhere

2017-10-20 Thread Ian Lance Taylor
This patch to libgo supports 64-bit DWARF in the byte order check.  It
also fixes 64-bit DWARF to read a 64-bit abbrev offset in the
compilation unit.

This is a backport of https://golang.org/cl/71171, which will be in the
Go 1.10 release, to the gofrontend copy. Doing it now because AIX is
pretty much the only system that uses 64-bit DWARF.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.
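
For context, here is a hedged sketch of what distinguishes the two formats
when reading a compilation unit header, written in C++ rather than Go.  The
names UnitHeader, read_le and parse_unit_header_le are hypothetical and this
is not the libgo code; it assumes a little-endian DWARF version 2 to 4
.debug_info header, where an initial length of 0xffffffff marks the 64-bit
format and widens both the length and the abbrev offset to 8 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fields of a DWARF 2-4 compilation unit header (illustrative only).
struct UnitHeader {
  bool is_64bit;
  std::uint64_t unit_length;
  std::uint16_t version;
  std::uint64_t abbrev_offset;
  std::uint8_t address_size;
};

// Read `count` bytes starting at `pos` as a little-endian unsigned integer.
inline std::uint64_t read_le(const std::vector<std::uint8_t>& bytes,
                             std::size_t pos, std::size_t count) {
  assert(pos + count <= bytes.size());
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < count; ++i)
    value |= static_cast<std::uint64_t>(bytes[pos + i]) << (8 * i);
  return value;
}

// Parse the header, choosing 4-byte or 8-byte offsets from the escape code.
inline UnitHeader parse_unit_header_le(const std::vector<std::uint8_t>& bytes) {
  UnitHeader h{};
  std::size_t pos = 0;
  std::uint64_t length = read_le(bytes, pos, 4);
  pos += 4;
  h.is_64bit = (length == 0xffffffffu);
  if (h.is_64bit) {  // the real length follows as a 64-bit value
    length = read_le(bytes, pos, 8);
    pos += 8;
  }
  h.unit_length = length;
  h.version = static_cast<std::uint16_t>(read_le(bytes, pos, 2));
  pos += 2;
  const std::size_t offset_size = h.is_64bit ? 8 : 4;
  h.abbrev_offset = read_le(bytes, pos, offset_size);
  pos += offset_size;
  h.address_size = static_cast<std::uint8_t>(read_le(bytes, pos, 1));
  return h;
}
```

Reading the abbrev offset with a fixed 4-byte width in the 64-bit case would
misplace every later field, which is the kind of error the patch addresses.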

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 253694)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-44132970e4b6c1186036bf8eda8982fb6e905d6f
+a409ac2c78899e638a014c97891925bec93cb3ad
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/debug/dwarf/entry.go
===
--- libgo/go/debug/dwarf/entry.go   (revision 253311)
+++ libgo/go/debug/dwarf/entry.go   (working copy)
@@ -33,13 +33,13 @@ type abbrevTable map[uint32]abbrev
 
 // ParseAbbrev returns the abbreviation table that starts at byte off
 // in the .debug_abbrev section.
-func (d *Data) parseAbbrev(off uint32, vers int) (abbrevTable, error) {
+func (d *Data) parseAbbrev(off uint64, vers int) (abbrevTable, error) {
if m, ok := d.abbrevCache[off]; ok {
return m, nil
}
 
data := d.abbrev
-   if off > uint32(len(data)) {
+   if off > uint64(len(data)) {
data = nil
} else {
data = data[off:]
Index: libgo/go/debug/dwarf/entry_test.go
===
--- libgo/go/debug/dwarf/entry_test.go  (revision 253311)
+++ libgo/go/debug/dwarf/entry_test.go  (working copy)
@@ -135,3 +135,63 @@ func TestReaderRanges(t *testing.T) {
t.Errorf("saw only %d subprograms, expected %d", i, 
len(subprograms))
}
 }
+
+func Test64Bit(t *testing.T) {
+   // I don't know how to generate a 64-bit DWARF debug
+   // compilation unit except by using XCOFF, so this is
+   // hand-written.
+   tests := []struct {
+   name string
+   info []byte
+   }{
+   {
+   "32-bit little",
+   []byte{0x30, 0, 0, 0, // comp unit length
+   4, 0, // DWARF version 4
+   0, 0, 0, 0, // abbrev offset
+   8, // address size
+   0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   },
+   },
+   {
+   "64-bit little",
+   []byte{0xff, 0xff, 0xff, 0xff, // 64-bit DWARF
+   0x30, 0, 0, 0, 0, 0, 0, 0, // comp unit length
+   4, 0, // DWARF version 4
+   0, 0, 0, 0, 0, 0, 0, 0, // abbrev offset
+   8, // address size
+   0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   },
+   },
+   {
+   "64-bit big",
+   []byte{0xff, 0xff, 0xff, 0xff, // 64-bit DWARF
+   0, 0, 0, 0, 0, 0, 0, 0x30, // comp unit length
+   0, 4, // DWARF version 4
+   0, 0, 0, 0, 0, 0, 0, 0, // abbrev offset
+   8, // address size
+   0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   },
+   },
+   }
+
+   for _, test := range tests {
+   _, err := New(nil, nil, nil, test.info, nil, nil, nil, nil)
+   if err != nil {
+   t.Errorf("%s: %v", test.name, err)
+   }
+   }
+}
Index: libgo/go/debug/dwarf/open.go
===
--- libgo/go/debug/dwarf/open.go(revision 253311)
+++ libgo/go/debug/dwarf/open.go(working copy)
@@ -23,7 +23,7 @@ type Data struct {
str  []byte
 
// parsed data
-   abbrevCache map[uint32]abbrevTable
+   abbrevCache map[uint64]abbrevTable
   

Re: [patch 0/5] nios2 address mode improvements

2017-10-20 Thread Sandra Loosemore

On 10/20/2017 02:56 AM, Jakub Jelinek wrote:
> On Thu, Oct 19, 2017 at 08:03:45PM -0600, Sandra Loosemore wrote:
>> A harder problem is that doing the high/lo_sum splitting in expand
>> inhibits subsequent optimizations.  One such problem arises when you
>> have accesses to multiple fields in a static structure object.  Expand
>> sees this as many (symbol + offset) expressions involving the same
>> symbol with different constant offsets.  What we should be doing in
>> that case is CSE'ing the symbol address computation rather than
>> splitting every such expression individually.
>
> Do you have the needed relocations for that though?
> If not, then you need to do:
>    tmp = high (symbol);
>    tmp |= lo_sum (symbol); // or +
>    a = [tmp + 0];
>    b = [tmp + 4];
>    c = [tmp + 8];
> if you do (like e.g. sparc64 has the %olo relocation), then you can do
>    tmp = high (symbol);
>    a = [tmp + lo_sum (symbol) + 0];
>    b = [tmp + lo_sum (symbol) + 4];
>    c = [tmp + lo_sum (symbol) + 8];
> If you tried to do:
>    tmp = high (symbol);
>    a = [tmp + lo_sum (symbol)];
>    b = [tmp + lo_sum (symbol + 4)];
>    c = [tmp + lo_sum (symbol + 8)];
> then this would break if lo_sum (symbol + 4) or lo_sum (symbol + 8)
> is < 4.


No, nothing that sophisticated -- I'm aiming to produce the first of 
your three choices.  I just want to avoid producing 3 different 
high/lo_sum pairs here, which is what the unpatched nios2 backend does.  :-(
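
The difference between the first and the third form can be checked with plain
integer arithmetic.  The sketch below is a simplification made for this
example: high and lo_sum are modeled as the upper and lower 16 bits of a
32-bit address with an unsigned low part, which is not necessarily how the
nios2 relocations behave, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model: upper and lower 16 bits of a 32-bit address.
inline std::uint32_t high(std::uint32_t addr) { return addr & 0xffff0000u; }
inline std::uint32_t lo_sum(std::uint32_t addr) { return addr & 0x0000ffffu; }

// First form: materialize the full symbol address once, then add the
// field offset.  The symbol computation can be shared by every access.
inline std::uint32_t field_addr_shared(std::uint32_t symbol, std::uint32_t offset) {
  std::uint32_t tmp = high(symbol);
  tmp |= lo_sum(symbol);
  return tmp + offset;
}

// Third form: reuse high (symbol) with lo_sum (symbol + offset).  This loses
// the carry when symbol + offset crosses a 64K boundary.
inline std::uint32_t field_addr_mixed(std::uint32_t symbol, std::uint32_t offset) {
  return high(symbol) + lo_sum(symbol + offset);
}
```

With a symbol at 0x0001fffe and an offset of 4, the low part of symbol + 4
wraps to 2, so the mixed form lands 64K below the intended address while the
shared form does not.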


-Sandra



Re: [patch 0/5] nios2 address mode improvements

2017-10-20 Thread Andrew Pinski
On Thu, Oct 19, 2017 at 7:03 PM, Sandra Loosemore
 wrote:
> This is the set of nios2 optimization patches that I've previously
> mentioned in these threads:
>
> https://gcc.gnu.org/ml/gcc/2017-10/msg00016.html
> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00957.html
>
> To give an overview of what this is for
>
> The nios2 backend currently generates quite bad code for memory
> accesses with addresses involving symbolic constants.  Like a typical
> RISC machine, nios2 requires splitting such 32-bit constants into
> HIGH/LO_SUM pairs.  Currently this happens in expand, and address
> expressions involving such constants are always converted to use a
> register indirect form.
>
> One part of the problem is that the backend currently doesn't
> recognize that LO_SUM is a legitimate address form (it's register
> indirect with a constant offset using the %lo relocation).  That's
> fixed in these patches.
>
> A harder problem is that doing the high/lo_sum splitting in expand
> inhibits subsequent optimizations.  One such problem arises when you
> have accesses to multiple fields in a static structure object.  Expand
> sees this as many (symbol + offset) expressions involving the same
> symbol with different constant offsets.  What we should be doing in
> that case is CSE'ing the symbol address computation rather than
> splitting every such expression individually.
>
> This patch series attacks that problem by deferring splitting to the
> split1 pass, which happens after cse and fwprop optimizations.
> Deferring the splitting also requires that TARGET_LEGITIMATE_ADDRESS_P
> accept these symbolic constant expressions until the splitting takes
> place, and that code that might generate 32-bit constants in other
> places (e.g., the movsi expander) must not do so after they are
> supposed to have been split.
>
> This patch series also includes general improvements to the cost model
> to get better CSE results -- in particular, the nios2 backend has been
> completely missing an implementation for TARGET_ADDRESS_COST.  I also found
> that making TARGET_LEGITIMIZE_ADDRESS smarter resulted in better
> address cost modeling by the ivopts pass.
>
> All together, this resulted in about a 7% code size improvement on the
> customer-provided test case I was using for tuning purposes.

I remember the Sony version of the SPU back end doing something
similar and getting similar improvements, but I don't remember the
exact details.  It might have been because the SPU only had 128-bit
loads, so expanding the loads too soon was missing optimizations.  I do
remember that code not being upstreamed, and I was always disappointed
that it was not.

Thanks,
Andrew

>
> Patches in this set are broken down as follows:
>
> 1: Switch to LRA.
> 2: Detect when splitting has been completed.
> 3: Add splitters and recognize the new address modes.
> 4: Cost model improvements.
> 5: Test cases.
>
> Part 2 is the piece that relates to the discussion linked above.  As
> implemented, it works fine, but it's maybe not the best design.  I'll
> hold off on committing the entire set for at least a few days in case
> somebody wants to suggest a better solution.
>
> -Sandra
>


Re: [Patch, fortran] PR82586 - [PDT] ICE: write_symbol(): bad module symbol

2017-10-20 Thread Paul Richard Thomas
Dear All,

In the last hour, I have added fixes for PRs 82587 and 82589. Please
review them together with 82586.

I will stop work on Gerhard's PDT bugs until this patch is committed.
Fortunately, Steve Kargl has proposed fixes for most of them :-)

Cheers

Paul

2017-10-20  Paul Thomas  

PR fortran/82586
* decl.c (gfc_get_pdt_instance): Remove the error message that
the parameter does not have a corresponding component since
this is now taken care of when the derived type is resolved. Go
straight to error return instead.
(gfc_match_formal_arglist): Make the PDT relevant errors
immediate so that parsing of the derived type can continue.
(gfc_match_derived_decl): Do not check the match status on
return from gfc_match_formal_arglist for the same reason.
* resolve.c (resolve_fl_derived0): Check that each type
parameter has a corresponding component.

PR fortran/82587
* resolve.c (resolve_generic_f): Check that the derived type
can be used before resolving the structure constructor.

PR fortran/82589
* symbol.c (check_conflict): Add the conflicts involving PDT
KIND and LEN attributes.

2017-10-20  Paul Thomas  

PR fortran/82586
* gfortran.dg/pdt_16.f03 : New test.
* gfortran.dg/pdt_4.f03 : Catch the changed messages.
* gfortran.dg/pdt_8.f03 : Ditto.

PR fortran/82587
* gfortran.dg/pdt_17.f03 : New test.

PR fortran/82589
* gfortran.dg/pdt_18.f03 : New test.

On 20 October 2017 at 18:17, Paul Richard Thomas
 wrote:
> Dear All,
>
> The attached patch is pretty clear with the ChangeLogs and is very
> nearly obvious.
>
> Bootstrapped and regtested on FC23/x86_64 - OK for trunk?
>
> Paul
>
> 2017-10-20  Paul Thomas  
>
> PR fortran/82586
> * decl.c (gfc_get_pdt_instance): Remove the error message that
> the parameter does not have a corresponding component since
> this is now taken care of when the derived type is resolved. Go
> straight to error return instead.
> (gfc_match_formal_arglist): Make the PDT relevant errors
> immediate so that parsing of the derived type can continue.
> (gfc_match_derived_decl): Do not check the match status on
> return from gfc_match_formal_arglist for the same reason.
> * resolve.c (resolve_fl_derived0): Check that each type
> parameter has a corresponding component.
>
> 2017-10-20  Paul Thomas  
>
> PR fortran/82586
> * gfortran.dg/pdt_16.f03 : New test.
> * gfortran.dg/pdt_4.f03 : Catch the changed messages.
> * gfortran.dg/pdt_8.f03 : Ditto.
>
>
> --
> "If you can't explain it simply, you don't understand it well enough"
> - Albert Einstein



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein
Index: gcc/fortran/decl.c
===
*** gcc/fortran/decl.c  (revision 253847)
--- gcc/fortran/decl.c  (working copy)
*** gfc_get_pdt_instance (gfc_actual_arglist
*** 3242,3254 
param = type_param_name_list->sym;
  
c1 = gfc_find_component (pdt, param->name, false, true, NULL);
if (!pdt->attr.use_assoc && !c1)
!   {
! gfc_error ("The type parameter name list at %L contains a parameter "
!"'%qs' , which is not declared as a component of the type",
!&pdt->declared_at, param->name);
! goto error_return;
!   }
  
kind_expr = NULL;
if (!name_seen)
--- 3242,3251 
param = type_param_name_list->sym;
  
c1 = gfc_find_component (pdt, param->name, false, true, NULL);
+   /* An error should already have been thrown in resolve.c
+(resolve_fl_derived0).  */
if (!pdt->attr.use_assoc && !c1)
!   goto error_return;
  
kind_expr = NULL;
if (!name_seen)
*** gfc_match_formal_arglist (gfc_symbol *pr
*** 5984,5990 
/* The name of a program unit can be in a different namespace,
 so check for it explicitly.  After the statement is accepted,
 the name is checked for especially in gfc_get_symbol().  */
!   if (gfc_new_block != NULL && sym != NULL
  && strcmp (sym->name, gfc_new_block->name) == 0)
{
  gfc_error ("Name %qs at %C is the name of the procedure",
--- 5981,5987 
/* The name of a program unit can be in a different namespace,
 so check for it explicitly.  After the statement is accepted,
 the name is checked for especially in gfc_get_symbol().  */
!   if (gfc_new_block != NULL && sym != NULL && !typeparam
  && strcmp (sym->name, gfc_new_block->name) == 0)
{
  gfc_error ("Name %qs at %C is the name of the procedure",
*** gfc_match_formal_arglist (gfc_symbol *pr
*** 5999,6005 
m = gfc_match_char (',');
if (m != MATCH_YES)
{
! gfc_error ("Unexpected junk in formal argument list at %C");
  goto cleanup;
  

Re: [Patch, fortran] PR82586 - [PDT] ICE: write_symbol(): bad module symbol

2017-10-20 Thread Steve Kargl
On Fri, Oct 20, 2017 at 07:55:17PM +0100, Paul Richard Thomas wrote:
> 
> In the last hour, I have added fixes for PRs 82587 and 82589. Please
> review them together with 82586.
> 
> I will stop work on Gerhard's PDT bugs until this patch is committed.
> Fortunately, Steve Kargl has proposed fixes for most of them :-)
> 

Looks good to me.  Ok to commit.

-- 
Steve


Re: [PATCH] Improve V?TImode shifts (PR target/82370)

2017-10-20 Thread Kirill Yukhin
Hello Jakub, Uroš, Jakub
On 04 Oct 21:35, Jakub Jelinek wrote:
> Hi!
> 
> The following patch tweaks the TImode vector shifts similarly
> to the earlier vector shift patch, so that for shifts by immediate
> we can accept a memory input.  Additionally, it removes the vec_shl_*
> expander, because the middle-end has dropped that a few years ago,
> and merges the left and right shift patterns using code iterators.
> Apart from the code/names that can be handled by mode attributes,
> the only difference was that one of the insns had
> (set_attr "atom_unit" "sishuf")
> and the other didn't.  I hope that is just an error, I'd really expect
> both vpslldq and vpsrldq to use the same atom unit, isn't that the case?
This was introduced back in 2010. So, I have no info.
What I can see from config/atom.md:
;; if palignr or psrldq
(define_insn_reservation  "atom_sseishft_2" 1
  (and (eq_attr "cpu" "atom")
       (and (eq_attr "type" "sseishft")
            (and (eq_attr "atom_unit" "sishuf")
                 (match_operand 2 "immediate_operand"))))
  "atom-simple-0")

This leads back to initial commit of atom.md.
So, discrimination of psrldq and pslldq looks intentional.

On the other hand, I see in the Software Optimization Guide, Table 14-2, that
PSRLDQ and PSLLDQ occupy the same line, which directs both insns to port-0 (p 14-18).
So, from that point of view, the definition for PSLLDQ, which allows either port-0
or port-1, looks wrong (atom-simple-either reservation).

In the absence of other information, I'd play it safe and leave things as they
are right now.

Maybe Uroš or HJ could shed some light?

--
Thanks, K


Re: [RFC PATCH, i386]: Make FP inequality comparisons trapping on qNaN.

2017-10-20 Thread Uros Bizjak
On Fri, Oct 20, 2017 at 2:15 PM, Joseph Myers  wrote:

> This is PR target/52451.
>
> A testcase (conditional on the fenv_exceptions effective-target) that
> ordered comparisons with quiet NaNs set FE_INVALID would be a good idea,
> but it would need XFAILing for powerpc (bug 58684) and s390 (bug 77918).

Joseph,

thanks for pointing out a PR reference and a suggestion for a testcase.

Please find attached a new version of the patch, including the
comprehensive tests.

2017-10-20  Uros Bizjak  

PR target/52451
* config/i386/i386.c (ix86_fp_compare_mode): Return CCFPmode
for ordered inequality comparisons even with TARGET_IEEE_FP.

2017-10-20  Uros Bizjak  

PR target/52451
* gcc.dg/torture/pr52451.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
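
For readers skimming the attachment, the classification the patch
introduces can be modelled outside the compiler.  The sketch below is not
GCC code: cmp_code, cc_mode and fp_compare_mode are made-up stand-ins for
rtx_code, machine_mode and ix86_fp_compare_mode.  It only mirrors the
switch in the patch, under the assumption that CCFPmode selects a trapping
compare and CCFPUmode a non-trapping one.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's rtx_code and machine_mode.
enum class cmp_code { GT, GE, LT, LE, EQ, NE, LTGT, UNORDERED, ORDERED,
                      UNLT, UNLE, UNGT, UNGE, UNEQ };
enum class cc_mode { CCFP /* trapping */, CCFPU /* non-trapping */ };

// Model of the mode selection: ordered inequalities trap on a quiet NaN,
// everything else stays quiet when IEEE conformance is requested.
cc_mode fp_compare_mode (cmp_code code, bool ieee_fp)
{
  if (!ieee_fp)
    return cc_mode::CCFP;

  switch (code)
    {
    case cmp_code::GT:
    case cmp_code::GE:
    case cmp_code::LT:
    case cmp_code::LE:
      return cc_mode::CCFP;

    case cmp_code::EQ:
    case cmp_code::NE:
    case cmp_code::LTGT:
    case cmp_code::UNORDERED:
    case cmp_code::ORDERED:
    case cmp_code::UNLT:
    case cmp_code::UNLE:
    case cmp_code::UNGT:
    case cmp_code::UNGE:
    case cmp_code::UNEQ:
      return cc_mode::CCFPU;
    }

  // Not reached for valid enumerators; keeps the compiler quiet.
  return cc_mode::CCFPU;
}
```

This matches what the new test expects: <, <=, >, >= raise FE_INVALID on a
quiet NaN operand, while ==, != and the __builtin_is* predicates do not.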
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 253949)
+++ config/i386/i386.c  (working copy)
@@ -21500,14 +21500,35 @@
Return the appropriate mode to use.  */
 
 machine_mode
-ix86_fp_compare_mode (enum rtx_code)
+ix86_fp_compare_mode (enum rtx_code code)
 {
-  /* ??? In order to make all comparisons reversible, we do all comparisons
- non-trapping when compiling for IEEE.  Once gcc is able to distinguish
- all forms trapping and nontrapping comparisons, we can make inequality
- comparisons trapping again, since it results in better code when using
- FCOM based compares.  */
-  return TARGET_IEEE_FP ? CCFPUmode : CCFPmode;
+  if (!TARGET_IEEE_FP)
+return CCFPmode;
+
+  switch (code)
+{
+case GT:
+case GE:
+case LT:
+case LE:
+  return CCFPmode;
+
+case EQ:
+case NE:
+
+case LTGT:
+case UNORDERED:
+case ORDERED:
+case UNLT:
+case UNLE:
+case UNGT:
+case UNGE:
+case UNEQ:
+  return CCFPUmode;
+
+default:
+  gcc_unreachable ();
+}
 }
 
 machine_mode
Index: testsuite/gcc.dg/torture/pr52451.c
===
--- testsuite/gcc.dg/torture/pr52451.c  (nonexistent)
+++ testsuite/gcc.dg/torture/pr52451.c  (working copy)
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-add-options ieee } */
+/* { dg-require-effective-target fenv_exceptions } */
+
+#include <fenv.h>
+
+#define TEST_C_NOEX(CMP, S)\
+  r = nan##S CMP arg##S;   \
+  if (fetestexcept (FE_INVALID))   \
+__builtin_abort ()
+
+#define TEST_B_NOEX(FN, S) \
+  r = __builtin_##FN (nan##S, arg##S); \
+  if (fetestexcept (FE_INVALID))   \
+__builtin_abort ()
+
+#define TEST_C_EX(CMP, S)  \
+  r = nan##S CMP arg##S;   \
+  if (!fetestexcept (FE_INVALID))  \
+__builtin_abort ();\
+  feclearexcept (FE_INVALID)
+
+#define TEST(TYPE, S)  \
+  volatile TYPE nan##S = __builtin_nan##S ("");\
+  volatile TYPE arg##S = 1.0##S;   \
+   \
+  TEST_C_NOEX (==, S); \
+  TEST_C_NOEX (!=, S); \
+   \
+  TEST_B_NOEX (isgreater, S);  \
+  TEST_B_NOEX (isless, S); \
+  TEST_B_NOEX (isgreaterequal, S); \
+  TEST_B_NOEX (islessequal, S);\
+   \
+  TEST_B_NOEX (islessgreater, S);  \
+  TEST_B_NOEX (isunordered, S);\
+   \
+  TEST_C_EX (>, S);\
+  TEST_C_EX (<, S);\
+  TEST_C_EX (>=, S);   \
+  TEST_C_EX (<=, S)
+
+int
+main (void)
+{
+  volatile int r;
+
+  feclearexcept (FE_INVALID);
+
+  TEST (float, f);
+  TEST (double, );
+  TEST (long double, l);
+  
+  return 0;
+}


Re: [C++ Patch] PR 80955 (Macros expanded in definition of user-defined literals)

2017-10-20 Thread Mukesh Kapoor



On 10/20/2017 11:00 AM, Mukesh Kapoor wrote:

Hi,

On 10/20/2017 10:45 AM, Nathan Sidwell wrote:

On 10/20/2017 12:37 PM, Mukesh Kapoor wrote:

Hi,

This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80955.
Handle user-defined literals correctly in lex_string(). An empty 
string followed by an identifier is

a valid user-defined literal. Don't issue a warning for this case.


a) why do we trigger on the definition of the operator function, and 
not on the use site?


Actually, the current compiler issues an error (incorrectly) at both 
places: at the definition as well as at its use.




b) Why is the empty string special cased?  Doesn't the same logic 
apply to:


int operator "bob"_zero (const char *, size_t) { return 0;}


This is not a valid user-defined literal and is already reported as an 
error by the compiler. After my changes it's still reported as an error.
The empty string immediately followed by an identifier is a special 
case because it's a valid user-defined literal in C++. ""_zero is a 
valid user-defined literal.


Sorry, I used incorrect terminology here. An empty string immediately 
followed by an identifier is a valid name for a literal operator; 
""_zero is a valid name for a literal operator.
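
To make the distinction concrete, here is a small sketch of the valid
form.  The names _zero and use_zero are only illustrative; the point is
that the string-literal part of the operator name is empty, while the use
site may apply the suffix to any string literal.

```cpp
#include <cassert>
#include <cstddef>

// ""_zero names a literal operator: an empty string literal immediately
// followed by the suffix identifier.
constexpr int operator ""_zero (const char *, std::size_t) { return 0; }

// Use site: the suffix is applied to an ordinary string literal.
constexpr int use_zero () { return "bob"_zero; }
```

By contrast, writing "bob"_zero after the operator keyword in the
declaration itself is what the compiler rejects.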


Mukesh



Mukesh



(that'd be a syntactic error in the C++ parser of course)

nathan







[PATCH], Update __FLOAT128_HARDWARE__ on power9

2017-10-20 Thread Michael Meissner
This is a simple patch to add a way that the GLIBC team can tell that certain
__float128 built-in functions are available.  While previous patches of mine
set __FAST_FP_FMAF128, which could be used for this purpose, this macro just
bumps __FLOAT128_HARDWARE__ to say that the built-in functions are available in
addition to supporting the basic IEEE 128-bit floating point instructions.

I did a full bootstrap and c/c++/fortran check and there were no regressions on
a little endian Power8 system.  I verified that the updated test
(float128-hw.c) did run.  Can I check this into the trunk?
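
As a rough idea of how a consumer might test for this, assuming the value
2 proposed in this message is what ends up being defined, something like
the following could be used.  The helper names float128_hardware_level and
have_float128_builtins are invented for the sketch.

```cpp
#include <cassert>

// Returns 0 when the macro is absent, otherwise its value.
constexpr int float128_hardware_level ()
{
#if defined (__FLOAT128_HARDWARE__)
  return __FLOAT128_HARDWARE__;
#else
  return 0;
#endif
}

// True only when the compiler advertises the built-in functions in
// addition to the basic IEEE 128-bit instructions (assumed level 2).
constexpr bool have_float128_builtins ()
{
  return float128_hardware_level () >= 2;
}
```

On targets where the macro is not defined at all, both helpers simply
report that nothing is available.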

[gcc]
2017-10-20  Michael Meissner  

* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define
__FLOAT128_HARDWARE__ to be 2 if float128 built-in functions are
available.
* extend.texi (PowerPC Built-in Functions): Document setting
__FLOAT128_HARDWARE__ to 2.

[gcc/testsuite]
2017-10-20  Michael Meissner  

* gcc.target/powerpc/float128-hw.c: Update test to include all 4
FMA variants.  Add check that __float128 to float conversions use
round to odd to convert it to DFmode before converting to SFmode.
Add check for __FLOAT128_HARDWARE__ being at least 2.  Reformat
code.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



[PATCH] Update value of __cpp_lib_chrono feature-test macro

2017-10-20 Thread Jonathan Wakely

Dinka Ranns contributed this feature back in February, but the SD-6
feature-test macro now has a new value to indicate support for this,
so let's update it.

* include/std/chrono (__cpp_lib_chrono): Update macro value to
indicate support for P0505R0.
* testsuite/20_util/duration/arithmetic/constexpr_c++17.cc: Check
for updated macro.

Tested powerpc64le-linux, committed to trunk and gcc-7-branch.
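
For illustration, user code that wants the constexpr duration operators
only when the library advertises them might look like the sketch below.
DURATION_CONSTEXPR and bump are names made up for the example; only the
macro name and value come from the patch.

```cpp
#include <cassert>
#include <chrono>

// Use constexpr only when the library says the P0505R0 operators exist.
#if defined (__cpp_lib_chrono) && __cpp_lib_chrono >= 201611L
# define DURATION_CONSTEXPR constexpr
#else
# define DURATION_CONSTEXPR inline
#endif

// Exercises the compound-assignment and increment operators on a duration.
DURATION_CONSTEXPR std::chrono::nanoseconds
bump (std::chrono::nanoseconds d)
{
  d += std::chrono::nanoseconds { 1 };
  ++d;
  return d;
}
```

The function behaves the same either way; the macro only decides whether
it can also be evaluated at compile time.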


commit 8a7bf24c72d1373571d8f61d8078c874425c8256
Author: Jonathan Wakely 
Date:   Fri Oct 20 19:53:24 2017 +0100

Update value of __cpp_lib_chrono feature-test macro

* include/std/chrono (__cpp_lib_chrono): Update macro value to
indicate support for P0505R0.
* testsuite/20_util/duration/arithmetic/constexpr_c++17.cc: Check
for updated macro.

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index fc058fcd8d8..9491508e637 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -214,8 +214,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 treat_as_floating_point<_Rep>::value;
 #endif // C++17
 
-#if __cplusplus > 201402L
-# define __cpp_lib_chrono 201510
+#if __cplusplus >= 201703L
+# define __cpp_lib_chrono 201611
 
 template
   constexpr __enable_if_is_duration<_ToDur>
diff --git 
a/libstdc++-v3/testsuite/20_util/duration/arithmetic/constexpr_c++17.cc 
b/libstdc++-v3/testsuite/20_util/duration/arithmetic/constexpr_c++17.cc
index 438d50afddf..0ba1b8cc706 100644
--- a/libstdc++-v3/testsuite/20_util/duration/arithmetic/constexpr_c++17.cc
+++ b/libstdc++-v3/testsuite/20_util/duration/arithmetic/constexpr_c++17.cc
@@ -20,6 +20,13 @@
 
 #include 
 #include 
+
+#ifndef __cpp_lib_chrono
+# error "Feature-test macro for constexpr <chrono> missing"
+#elif __cpp_lib_chrono != 201611
+# error "Feature-test macro for constexpr <chrono> has wrong value"
+#endif
+
 constexpr auto test_operators()
 {
   std::chrono::nanoseconds d1 { 1 };


Re: [PATCH, rs6000] Fix incorrect mode usage for vec_select

2017-10-20 Thread Bill Schmidt

> On Mar 9, 2017, at 2:31 PM, Segher Boessenkool  
> wrote:
> 
> On Wed, Mar 08, 2017 at 09:47:32AM -0600, Bill Schmidt wrote:
>> As noted by Jakub in 
>> https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00183.html,
>> the PowerPC back end incorrectly uses vec_select with 2 elements for a mode
>> that has only one.  This is due to faulty mode iterator use:  V1TImode was
>> wrongly included in the VSX_LE mode iterator, but should instead have been
>> in the VSX_LE_128 mode iterator.
>> 
>> This patch fixes that, and with VSX_LE no longer including V1TImode, it is
>> now redundant with VSX_D, so this patch removes VSX_LE altogether.
>> 
>> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
>> Is this ok for trunk?
> 
> Yes, thanks.
> 
>> I am uncertain whether we should backport the fix to gcc 5 and 6, since,
>> although the code is technically incorrect, it works just fine.  The fix
>> is needed in trunk to permit the sanity checking that Jakub has proposed
>> for genrecog.
> 
> Then let's not, not until we backport something that touches this code.
> OTOH a backport of this is approved as well, if you prefer that.

As this was rediscovered with PR81294, I've gone ahead with the backport for 6.
It does not apply to 5, as the faulty code doesn't exist there.
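
For anyone wondering why a rotate is the right replacement: a V1TImode
vector has a single 128-bit element, so there are no two elements for
vec_select to pick from, but swapping its two 64-bit halves is exactly a
rotate by 64 bits.  The sketch below only models that arithmetic; u128 and
rotate_left are invented names, not anything from the back end.

```cpp
#include <cassert>
#include <cstdint>

// A 128-bit value held as two 64-bit halves.
struct u128 { std::uint64_t hi, lo; };

// Rotate V left by N bits (N is reduced modulo 128).
u128 rotate_left (u128 v, unsigned n)
{
  n %= 128;
  if (n >= 64)
    {
      // Rotating by 64 is just swapping the halves.
      std::uint64_t t = v.hi;
      v.hi = v.lo;
      v.lo = t;
      n -= 64;
    }
  if (n == 0)
    return v;

  u128 r;
  r.hi = (v.hi << n) | (v.lo >> (64 - n));
  r.lo = (v.lo << n) | (v.hi >> (64 - n));
  return r;
}
```

With n == 64 the result has hi and lo exchanged, which is the doubleword
swap the little-endian permute needs.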

Thanks!
Bill

2017-10-20  Bill Schmidt  

Backport from mainline
2017-03-09  Bill Schmidt  

* config/rs6000/rs6000.c (rs6000_gen_le_vsx_permute): Use rotate
instead of vec_select for V1TImode.
* config/rs6000/vsx.md (VSX_LE): Remove mode iterator that is no
longer needed.
(VSX_LE_128): Add V1TI to this mode iterator.
(*vsx_le_perm_load_): Change to use VSX_D mode iterator.
(*vsx_le_perm_store_): Likewise.
(pre-reload splitter for VSX stores): Likewise.
(post-reload splitter for VSX stores): Likewise.
(*vsx_xxpermdi2_le_): Likewise.
(*vsx_lxvd2x2_le_): Likewise.
(*vsx_stxvd2x2_le_): Likewise.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 253955)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -9591,7 +9591,7 @@ rs6000_gen_le_vsx_permute (rtx source, machine_mod
 {
   /* Use ROTATE instead of VEC_SELECT on IEEE 128-bit floating point, and
  128-bit integers if they are allowed in VSX registers.  */
-  if (FLOAT128_VECTOR_P (mode) || mode == TImode)
+  if (FLOAT128_VECTOR_P (mode) || mode == TImode || mode == V1TImode)
 return gen_rtx_ROTATE (mode, source, GEN_INT (64));
   else
 {
Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 253955)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -24,15 +24,12 @@
 ;; Iterator for the 2 64-bit vector types
 (define_mode_iterator VSX_D [V2DF V2DI])
 
-;; Iterator for the 2 64-bit vector types + 128-bit types that are loaded with
-;; lxvd2x to properly handle swapping words on little endian
-(define_mode_iterator VSX_LE [V2DF V2DI V1TI])
-
 ;; Mode iterator to handle swapping words on little endian for the 128-bit
 ;; types that goes in a single vector register.
 (define_mode_iterator VSX_LE_128 [(KF   "FLOAT128_VECTOR_P (KFmode)")
  (TF   "FLOAT128_VECTOR_P (TFmode)")
- (TI   "TARGET_VSX_TIMODE")])
+ (TI   "TARGET_VSX_TIMODE")
+ V1TI])
 
 ;; Iterator for the 2 32-bit vector types
 (define_mode_iterator VSX_W [V4SF V4SI])
@@ -300,8 +297,8 @@
 ;; The patterns for LE permuted loads and stores come before the general
 ;; VSX moves so they match first.
 (define_insn_and_split "*vsx_le_perm_load_"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=")
-(match_operand:VSX_LE 1 "memory_operand" "Z"))]
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=")
+(match_operand:VSX_D 1 "memory_operand" "Z"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
@@ -414,8 +411,8 @@
(set_attr "length" "8")])
 
 (define_insn "*vsx_le_perm_store_"
-  [(set (match_operand:VSX_LE 0 "memory_operand" "=Z")
-(match_operand:VSX_LE 1 "vsx_register_operand" "+"))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "=Z")
+(match_operand:VSX_D 1 "vsx_register_operand" "+"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
@@ -422,8 +419,8 @@
(set_attr "length" "12")])
 
 (define_split
-  [(set (match_operand:VSX_LE 0 "memory_operand" "")
-(match_operand:VSX_LE 1 "vsx_register_operand" ""))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "")
+(match_operand:VSX_D 1 "vsx_register_operand" ""))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && !reload_completed"
   [(set (match_dup 2)
  

Re: [PATCH, rs6000] Fix incorrect mode usage for vec_select

2017-10-20 Thread Jakub Jelinek
On Fri, Oct 20, 2017 at 04:41:04PM -0500, Bill Schmidt wrote:
> As this was rediscovered with PR81294, I've gone ahead with the backport for 
> 6.
> It does not apply to 5, as the faulty code doesn't exist there.

Well, 5 is closed anyway.

Jakub


Re: [PATCH, rs6000] Fix incorrect mode usage for vec_select

2017-10-20 Thread Bill Schmidt
On Oct 20, 2017, at 4:47 PM, Jakub Jelinek  wrote:
> 
> On Fri, Oct 20, 2017 at 04:41:04PM -0500, Bill Schmidt wrote:
>> As this was rediscovered with PR81294, I've gone ahead with the backport for 
>> 6.
>> It does not apply to 5, as the faulty code doesn't exist there.
> 
> Well, 5 is closed anyway.
> 
Right, of course -- I was still looking to see if there would be an easy patch 
for Doko
to carry if he so chose, but that's not looking good.

Bill



[PATCH] RFC: Preserving locations for variable-uses and constants (PR 43486)

2017-10-20 Thread David Malcolm
[following up on a discussion at Cauldron]

This is a work-in-progress attempt at retaining source-location
information for uses of variables and for constants: the tree nodes
that don't have an EXPR_LOCATION in our internal representation.

I'm posting the patch now to check that my approach is correct and
get feedback.  It adds new "wrapper" tree nodes around the nodes
that don't have a location_t, effectively decorating them with a
location_t.
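
To illustrate the idea in isolation (this is a toy model, not the patch):
a shared node such as a decl or constant has no room for a per-use
location, so each use gets its own small wrapper that carries the location
and points at the shared node.  All names below (node, make_wrapper,
strip_wrapper, use_location, loc_t) are invented for the sketch.

```cpp
#include <cassert>

typedef unsigned int loc_t;   // stand-in for location_t

enum node_code { CONSTANT, DECL, LOCATION_WRAPPER };

struct node
{
  node_code code;
  loc_t loc;       // meaningful only for LOCATION_WRAPPER in this model
  node *operand;   // the wrapped node, or nullptr
};

// Decorate EXPR with LOC without modifying the shared node itself.
node make_wrapper (node *expr, loc_t loc)
{
  return node { LOCATION_WRAPPER, loc, expr };
}

// Look through any wrappers to the underlying decl or constant.
node *strip_wrapper (node *n)
{
  while (n && n->code == LOCATION_WRAPPER)
    n = n->operand;
  return n;
}

// Location of this particular use, or FALLBACK if it was not wrapped.
loc_t use_location (const node *n, loc_t fallback)
{
  return n->code == LOCATION_WRAPPER ? n->loc : fallback;
}
```

Two uses of the same decl then yield two wrappers with different
locations, and code that needs the decl itself has to strip the wrapper
first, which is where the folding problems mentioned below come from.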

The patch doesn't yet bootstrap, and fails many tests, but it does fix
the missing location information, so that e.g.:

  test.cc:5:38: error: invalid conversion from 'int' to 'const char*' [-fpermissive]
     return callee (first, second, third);
                                        ^

becomes:

  test.cc:5:25: error: invalid conversion from 'int' to 'const char*' [-fpermissive]
     return callee (first, second, third);
                           ^~~~~~

for a mismatching type in a function call involving a variable or constant.

A compound expression already works in this situation, e.g.:

     return callee (first, second * 2, third);
                           ~~~~~~~^~~

These cases are already handled within the C frontend by the
vec<location_t> that's passed around for callsites.

FWIW I posted a patch to add a vec<location_t> to the C++ frontend:
  "[PATCH] C++: use an optional vec<location_t> for callsites"
https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01392.html
which fixes the cases above, but Jason requested at Cauldron that I pursue
the wrapper node approach (as the vec<location_t> is kind of a workaround)
so here's what I have so far.

Limitations:

* The patch as-is preserves the locations during the frontend, and hence
  solves various issues with diagnostics in the frontend, but the
  locations are discarded during gimplification.  PR 43486 requests
  preserving them into gimple, so although this approach would help with
  that PR, it doesn't fully address it.  I'm happy to defer the
  gimplification issue until after GCC 8.

* To simplify things, the patch only touches the C++ frontend.  Similar
  things would need to happen in the C frontend (and presumably others,
  but I care most about C and C++).

* As noted above, it doesn't yet bootstrap, and introduces various test
  regressions; obviously I'd fix all that assuming the direction of the
  patch is acceptable (folding appears to be the main issue: various
  places in the code expect the result of folding to be a decl, and
  go wrong if they see a wrapper node instead, but we still want the
  location information after folding).

Design questions:

* The patch introduces a new kind of tree node, currently called
  DECL_WRAPPER_EXPR (although it's used for wrapping constants as well
  as decls).  Should wrappers be a new kind of tree node, or should they
  reuse an existing TREE_CODE? (e.g. NOP_EXPR, CONVERT_EXPR, etc).
* NOP_EXPR: seems to be for use as an rvalue
* CONVERT_EXPR: for type conversions
* NON_LVALUE_EXPR: "Value is same as argument, but guaranteed not an
  lvalue"
  * but we *do* want to support lvalues here
* VIEW_CONVERT_EXPR: viewing one thing as of a different type
  * can it support lvalues?
* C_MAYBE_CONST_EXPR perhaps (generalized somehow)
  Any suggestions or guidance here?

Memory usage stats:

I tried running this on a non-trivial C++ file ("kdecore.cc" [1]),
but it doesn't yet work well enough to compile it.  So I hacked it
up like this:

   diff --git a/gcc/tree.c b/gcc/tree.c
   index 270e680..5711b2a 100644
   --- a/gcc/tree.c
   +++ b/gcc/tree.c
   @@ -13764,7 +13764,15 @@ maybe_wrap_with_location (tree expr, location_t loc)
  gcc_assert (CONSTANT_CLASS_P (expr)
 || DECL_P (expr)
 || EXCEPTIONAL_CLASS_P (expr));
   +
   +#if 0
  return build1_loc (loc, DECL_USAGE_EXPR, TREE_TYPE (expr), expr);
   +#else
   +  /* Simulate the GGC-effect of building the node... */
   +  (void)build1_loc (loc, DECL_USAGE_EXPR, TREE_TYPE (expr), expr);
   +  /* But don't actually do it.  */
   +  return expr;
   +#endif
}

/* Return the name of combined function FN, for debugging purposes.  */

to simulate the effect of allocating the wrapper nodes, without actually
using those nodes.

With that, -ftime-report's memory stats for "TOTAL" went
from 615999 kB to 617773 kB
i.e. about a 0.3% increase in overall GC-managed allocations.
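
For reference, the figure follows directly from the two totals above; a
throwaway helper (percent_increase is just a name picked for this note)
makes the arithmetic explicit:

```cpp
#include <cassert>

// Relative growth from BEFORE to AFTER, expressed in percent.
double percent_increase (double before, double after)
{
  return (after - before) / before * 100.0;
}
```

With 615999 kB and 617773 kB the difference is 1774 kB, i.e. roughly 0.29%
of the original total.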

I don't have reliable timing information yet.

Thoughts?

Thanks
Dave

[1] https://github.com/davidmalcolm/gcc-build/blob/master/kdecore.cc

gcc/ChangeLog:
PR c++/43486
* builtins.c (fold_builtin_next_arg): Strip off any
DECL_USAGE_EXPR.
* gimplify.c (gimplify_expr): Handle DECL_USAGE_EXPR.
* tree.c (maybe_wrap_with_location): New function.
* tree.def (DECL_USAGE_EXPR): New tree code.
* tree.h (STRIP_DECL_USAGE_EXPR): New macro.
(maybe_wrap_with_location): New decl.

gcc/c-family/ChangeLog:
PR c++/43486
* c-format.c (check_format_

Re: [Patch, fortran] PR82586 - [PDT] ICE: write_symbol(): bad module symbol

2017-10-20 Thread Jerry DeLisle
On 10/20/2017 11:55 AM, Paul Richard Thomas wrote:
> Dear All,
> 
> In the last hour, I have added fixes for PRs 82587 and 82589. Please
> review them together with 82586.
> 
> I will stop work on Gerhard's PDT bugs until this patch is committed.
> Fortunately, Steve Kargl has proposed fixes for most of them :-)
> 
> Cheers
> 
> Paul
> 
> 2017-10-20  Paul Thomas  
> 
> PR fortran/82586
> * decl.c (gfc_get_pdt_instance): Remove the error message that
> the parameter does not have a corresponding component since
> this is now taken care of when the derived type is resolved. Go
> straight to error return instead.
> (gfc_match_formal_arglist): Make the PDT relevant errors
> immediate so that parsing of the derived type can continue.
> (gfc_match_derived_decl): Do not check the match status on
> return from gfc_match_formal_arglist for the same reason.
> * resolve.c (resolve_fl_derived0): Check that each type
> parameter has a corresponding component.
> 
> PR fortran/82587
> * resolve.c (resolve_generic_f): Check that the derived type
> can be used before resolving the structure constructor.
> 
> PR fortran/82589
> * symbol.c (check_conflict): Add the conflicts involving PDT
> KIND and LEN attributes.
> 
> 2017-10-20  Paul Thomas  
> 
> PR fortran/82586
> * gfortran.dg/pdt_16.f03 : New test.
> * gfortran.dg/pdt_4.f03 : Catch the changed messages.
> * gfortran.dg/pdt_8.f03 : Ditto.
> 
> PR fortran/82587
> * gfortran.dg/pdt_17.f03 : New test.
> 
> PR fortran/82589
> * gfortran.dg/pdt_18.f03 : New test.
> 
> On 20 October 2017 at 18:17, Paul Richard Thomas
>  wrote:
>> Dear All,
>>
>> The attached patch is pretty clear with the ChangeLogs and is very
>> nearly obvious.
>>
>> Bootstrapped and regtested on FC23/x86_64 - OK for trunk?
>>
>> Paul

Looks Good to me. OK for trunk.

Jerry

PS The previous patch as well.

>>
>> 2017-10-20  Paul Thomas  
>>
>> PR fortran/82586
>> * decl.c (gfc_get_pdt_instance): Remove the error message that
>> the parameter does not have a corresponding component since
>> this is now taken care of when the derived type is resolved. Go
>> straight to error return instead.
>> (gfc_match_formal_arglist): Make the PDT relevant errors
>> immediate so that parsing of the derived type can continue.
>> (gfc_match_derived_decl): Do not check the match status on
>> return from gfc_match_formal_arglist for the same reason.
>> * resolve.c (resolve_fl_derived0): Check that each type
>> parameter has a corresponding component.
>>
>> 2017-10-20  Paul Thomas  
>>
>> PR fortran/82586
>> * gfortran.dg/pdt_16.f03 : New test.
>> * gfortran.dg/pdt_4.f03 : Catch the changed messages.
>> * gfortran.dg/pdt_8.f03 : Ditto.
>>
>>
>> --
>> "If you can't explain it simply, you don't understand it well enough"
>> - Albert Einstein
> 
> 
> 



Re: [Patch, fortran] PR82586 - [PDT] ICE: write_symbol(): bad module symbol

2017-10-20 Thread Jerry DeLisle
On 10/20/2017 12:17 PM, Steve Kargl wrote:
> On Fri, Oct 20, 2017 at 07:55:17PM +0100, Paul Richard Thomas wrote:
>>
>> In the last hour, I have added fixes for PRs 82587 and 82589. Please
>> review them together with 82586.
>>
>> I will stop work on Gerhard's PDT bugs until this patch is committed.
>> Fortunately, Steve Kargl has proposed fixes for most of them :-)
>>
> 
> Looks good to me.  Ok to commit.
> 

Well if I had scrolled my email down one more line I would have seen Steve
already reviewed it.

Cheers,

Jerry


Re: [PATCH] Derive interface buffers from max name length

2017-10-20 Thread Bernhard Reutner-Fischer
On 19 October 2017 10:03:06 CEST, Bernhard Reutner-Fischer 
 wrote:
>On Sat, Jun 18, 2016 at 09:46:17PM +0200, Bernhard Reutner-Fischer
>wrote:
>> On December 3, 2015 10:46:09 AM GMT+01:00, Janne Blomqvist
> wrote:
>> >On Tue, Dec 1, 2015 at 6:51 PM, Bernhard Reutner-Fischer
>> > wrote:
>> >> On 1 December 2015 at 15:52, Janne Blomqvist
>> > wrote:
>> >>> On Tue, Dec 1, 2015 at 2:54 PM, Bernhard Reutner-Fischer
>> >>>  wrote:
>>  These three functions used a hardcoded buffer of 100 but would be
>> >better
>>  off to base off GFC_MAX_SYMBOL_LEN which denotes the maximum
>length
>> >of a
>>  name in any of our supported standards (63 as of f2003 ff.).
>> >>>
>> >>> Please use xasprintf() instead (and free the result, of course).
>One
>> >>> of my backburner projects is to get rid of these static symbol
>> >>> buffers, and use dynamic buffers (or the symbol table) instead.
>We
>> >>> IIRC already have some ugly hacks by using hashing to get around
>> >>> GFC_MAX_SYMBOL_LEN when handling mangled symbols. Your patch
>doesn't
>> >>> make the situation worse per se, but if you're going to fix it,
>lets
>> >>> do it properly.
>> >>
>> >> I see.
>> >>
>> >> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep
>> >> "^[[:space:]]*char[[:space:]][[:space:]]*[^[;[:space:]]*\[" | wc
>-l
>> >> 142
>> >> /scratch/src/gcc-6.0.mine/gcc/fortran$ git grep "xasprintf" | wc
>-l
>> >> 32
>> >
>> >Yes, that's why it's on the TODO-list rather than on the DONE-list.
>:)
>> >
>> >> What about memory fragmentation when switching to heap-based
>> >allocation?
>> >> Or is there consensus that these are in the noise compared to
>other
>> >> parts of the compiler?
>> >
>> >Heap fragmentation is an issue, yes. I'm not sure it's that
>> >performance-critical, but I don't think there is any consensus. I
>just
>> >want to avoid ugly hacks like symbol hashing to fit within some
>fixed
>> >buffer. Perhaps a good compromise would be something like
>std::string
>> >with small string optimization, but as you have seen there is some
>> >resistance to C++. But this is more relevant for mangled symbols, so
>> >GFC_MAX_MANGLED_SYMBOL_LEN is more relevant here, and there's only a
>> >few of them left. So, well, if you're sure that mangled symbols are
>> >never copied into the buffers your patch modifies, please consider
>> >your original patch Ok as well. Whichever you prefer.
>> >
>> >Performance-wise I think a bigger benefit would be to use the symbol
>> >table more and then e.g. be able to do pointer comparisons rather
>than
>> >strcmp(). But that is certainly much more work.
>> 
>> Hm, worth a look indeed since that would certainly be a step in the
>right direction.
>
>Installed the initial patch as intermediate step as r253881 for now.

JFYI I'm contemplating moving the stack-based allocations to heap-based ones 
now, starting with gfc_match_name and gradually moving to pointer comparisons 
with the stringpool based identifiers. I'll strive to suggest something for 
discussion in smallish steps when it's ready.
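
To sketch what the end state buys us: once every name lives in a pool
that hands out one canonical copy per distinct string, equality of names
becomes equality of pointers.  The class below is only a generic
illustration of interning, written with standard containers; string_pool
and intern are invented names and this is not how GCC's own identifier
pool is implemented.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

class string_pool
{
public:
  // Return the canonical copy of NAME; equal names yield the same pointer.
  // Pointers to unordered_set elements stay valid across rehashing.
  const std::string *intern (const std::string &name)
  {
    return &*pool_.insert (name).first;
  }

private:
  std::unordered_set<std::string> pool_;
};
```

After interning, comparing two names is a single pointer comparison
instead of a strcmp(), and no fixed-size buffer is involved anywhere.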

Cheers,
>
>thanks,
>> 
>> >
>> >> BTW:
>> >> $ git grep APO
>> >> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE",
>"NONE",
>> >NULL };
>> >> io.c:  static const char *delim[] = { "APOSTROPHE", "QUOTE",
>"NONE",
>> >NULL };
>> >
>> >? What are you saying?
>> 
>> delim is duplicated, we should remove one instance.
>> thanks,
>> 



Re: [PATCH 3/4] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-10-20 Thread Martin Sebor

On 10/02/2017 04:15 PM, Jeff Law wrote:

On 08/10/2017 01:29 PM, Martin Sebor wrote:

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 016f68d..1aa9e22 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c

[ ... ]

+
+  if (TREE_CODE (type) == ARRAY_TYPE)
+{
+  /* Return the constant size unless it's zero (that's a
zero-length
+ array likely at the end of a struct).  */
+  tree size = TYPE_SIZE_UNIT (type);
+  if (size && TREE_CODE (size) == INTEGER_CST
+  && !integer_zerop (size))
+return size;
+}

Q. Do we have a canonical test for the trailing array idiom?   In some
contexts isn't it size 1?  ISTM This test needs slight improvement.
Ideally we'd use some canonical test to detect the trailing array idiom
rather than open-coding it here.  You might look at the array index
warnings in tree-vrp.c to see if it's got a canonical test you can call
or factor and use.


You're right, there is an API for this (array_at_struct_end_p,
as Richard pointed out).  I didn't want to use it because it
treats any array at the end of a struct as a flexible array
member, but simple tests show that that's what -Wstringop-
overflow does now, and it wasn't my intention to tighten up
the checking under this change.  It surprises me that no tests
exposed this. Let me relax the check and think about proposing
to tighten it up separately.




What might be even better would be to use the immediate uses of the
memory tag.  For your case there should be only one immediate use and it
should point to the statement which NUL terminates the destination.  Or
maybe that would be worse in that you only want to allow this exception
when the statements are consecutive.


You said "maybe that would be worse" so I hadn't implemented it.
I went ahead and coded it up but with more testing I don't think
it has the desired result.  See below.



I'll have to try this to better understand how it might work.

It's actually quite simple.

Rather than looking at the next statement in the chain via
gsi_next_nondebug you follow the def->use chain for the memory tag
associated with the string copy statement.

/* Get the memory tag that is defined by this statement.  */
defvar = gimple_vdef (gsi_stmt (gsi));

if (num_imm_uses (defvar) == 1)
  {
    imm_use_iterator iter;
    gimple *use_stmt;

    /* Iterate over the immediate uses of the memory tag.  */
    FOR_EACH_IMM_USE_STMT (use_stmt, iter, defvar)
      {
        /* Check if USE_STMT is dst[i] = '\0'.  */
      }
  }



The check that there is a single immediate use is designed to make sure
you get a warning for this scenario:


Thanks for the outline of the solution.  I managed to get it to
work with only a few minor changes(*) but...


strxncpy
read the destination
terminate the destination

Which I think you'd want to consider non-terminated because of the read
of the destination prior to termination.

But avoids warnings for

strxncpy
stuff that doesn't read the destination
terminate the destination


...while it works fine for the basic cases it has the downside
of missing more subtle problems like this one:

  char a[8];

  void f (void) { puts (a); }

  void g (const char *s)
  {
    strncpy (a, s, sizeof a);

    f ();   // assumes a is a string

    a[sizeof a - 1] = '\0';
  }

or this one:

  struct S { char a[8]; };

  void f (const struct S *p) { puts (p->a); }

  void g (struct S *p, const char *s)
  {
    strncpy (p->a, s, sizeof p->a);

    f (p);   // assumes p->a is a string

    p->a[sizeof p->a - 1] = '\0';
  }

I would rather have the test be a little more strict than possibly
miss these kinds of insidious bugs for what seems like an unlikely use
case (other code between the strncpy and the *dst = '\0').  I think
sticking with the original also encourages cleaner code: keeping
the nul-termination as close as possible to the strncpy call.
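
As an illustration of the pattern being encouraged, here is a minimal sketch in
C.  The helper name copy_terminated is hypothetical and not part of the patch;
it simply keeps the terminating store on the statement right after the strncpy
call, so no other code can read the destination while it may be unterminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the patch): copy SRC into DST, which has
   room for SIZE bytes, and nul-terminate on the very next statement.  */
static char *
copy_terminated (char *dst, const char *src, size_t size)
{
  assert (size > 0);
  strncpy (dst, src, size - 1);
  dst[size - 1] = '\0';
  return dst;
}
```

A caller would write copy_terminated (buf, s, sizeof buf) and could then pass
buf to functions such as puts that expect a string.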


You still need to rename strlen_optimize_stmt since after your changes
it does both optimizations and warnings.


I'm not sure I understand why.  It's a pre-existing function that
just dispatches to the built-in handlers.  We don't rename function
callers each time we improve error/warning detection in some
function they call (case in point: all the expanders in builtins.c)
Why do it here?  And what would be a suitable name?  All that comes
to my mind is awkward variations on strlen_optimize_stmt_and_warn.

Actually we often end up renaming functions as their capabilities
change.  If I were to read that name, I'd think its only purpose was to
optimize.  Given it's static with a single use we should just fix it.


Something in compute_objsize I just noticed.

When DEST is an SSA_NAME, you follow the use->def chain back to its
defining statement, then get a new dest from the RHS of that statement:


+  if (TREE_CODE (dest) == SSA_NAME)
+    {
+      gimple *stmt = SSA_NAME_DEF_STMT (dest);
+      if (!is_gimple_assign (stmt))
+        return NULL_TREE;
+
+      dest = gimple_assign_rhs1 (stmt);
+    }


This seems wrong as-written -

Re: [Patch] Edit contrib/ files to download gfortran prerequisites

2017-10-20 Thread Damian Rouson
 
Hi Richard,

Attached is a revised patch that makes the downloading of Fortran prerequisites 
optional via a new --no-fortran flag that can be passed to 
contrib/download_prerequisites as requested in your reply below. 

As Jerry mentioned in his response, he has been working on edits to the 
top-level build machinery, but we need additional guidance to complete his 
work.  Given that there were no responses to his request for guidance and it’s 
not clear when that work will complete, I’m hoping this minor change can be 
approved independently so that this patch doesn’t suffer bit rot in the interim.

Ok for trunk?

Damian




On September 21, 2017 at 12:40:49 AM, Richard Biener 
(richard.guent...@gmail.com) wrote:

> On Wed, Sep 20, 2017 at 10:35 PM, Damian Rouson
> wrote:
> > Attached is a patch that adds the downloading of gfortran prerequisites 
> > OpenCoarrays and MPICH in the contrib/download_prerequisites script. The 
> > patch also provides a useful error message when neither wget nor curl is 
> > available on the target platform. I tested this patch with several choices 
> > for the command-line options on macOS (including --md5 and --sha512) and 
> > Ubuntu Linux (including --sha512). A suggested ChangeLog entry is
> >
> > * contrib/download_prerequisites: Download OpenCoarrays and MPICH.
> > * contrib/prerequisites.sha512: Add sha512 message digests for OpenCoarrays 
> > and MPICH.
> > * contrib/prerequisites.md5: Add md5 message digests for OpenCoarrays and 
> > MPICH.
> >
> >
> > OK for trunk? If so, I’ll ask Jerry to commit this. I don’t have commit 
> > rights.
>  
> Can you make this optional similar to graphite/isl? Also I see no support in
> the toplevel build machinery to build/install the libs as part of GCC
> so how does
> that work in the end?
>  
> Thanks,
> Richard.
>  
> > Damian


downlaod-prereqs.diff
Description: Binary data


Re: [PATCH] Fix path::iterator post-increment and post-decrement

2017-10-20 Thread Jonathan Wakely

On 19/10/17 15:00 +0100, Jonathan Wakely wrote:

I made a dumb mistake in the post-inc and post-dec operators for
the path::iterator type, forgetting that _M_cur is sometimes null (for
a single-element path).

* include/experimental/bits/fs_path.h (path::iterator++(int))
(path::iterator--(int)): Fix for paths with only one component.
* testsuite/experimental/filesystem/path/itr/traversal.cc: Test
post-increment and post-decrement.


And I made another dumb mistake in the new test, incrementing an end
iterator. It was caught by testing with _GLIBCXX_ASSERTIONS though.

Tested powerpc64le-linux, committed to trunk and gcc-7-branch.


commit b978785e751acf12d2429f19130900f419136e34
Author: Jonathan Wakely 
Date:   Sat Oct 21 02:11:37 2017 +0100

Fix invalid path::iterator test

* testsuite/experimental/filesystem/path/itr/traversal.cc: Do not
increment past-the-end iterators.

diff --git a/libstdc++-v3/testsuite/experimental/filesystem/path/itr/traversal.cc b/libstdc++-v3/testsuite/experimental/filesystem/path/itr/traversal.cc
index dbb4d46796d..41a292af4db 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/path/itr/traversal.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/path/itr/traversal.cc
@@ -90,10 +90,9 @@ test03()
   ++iter;
   iter2++;
   VERIFY( iter2 == iter );
-  auto iter3 = iter;
-  --iter3;
+  --iter;
   iter2--;
-  VERIFY( iter2 == iter3 );
+  VERIFY( iter2 == iter );
 }
 }