[PATCH][PR65802] Mark ifn_va_arg with ECF_NOTHROW

2015-04-21 Thread Tom de Vries

Hi,

this patch fixes PR65802.

The problem described in PR65802 is that when compiling the test-case (included 
in the patch below) at -O0, the compiler runs into a gcc_assert ICE in 
redirect_eh_edge_1 during pass_cleanup_eh:

...
  gcc_assert (lookup_stmt_eh_lp (throw_stmt) == old_lp_nr);
...


In more detail, during compilation the ifn_va_arg is marked as a throwing 
function. That causes exception handling code to be generated, with exception 
handling edges:

...
;; basic block 2, loop depth 0, count 0, freq 0, maybe hot
;;  prev block 0, next block 3, flags: (NEW, REACHABLE)
;;  pred:   ENTRY (FALLTHRU)
[LP 1] # .MEM_5 = VDEF <.MEM_4(D)>
# USE = anything
# CLB = anything
_6 = VA_ARG (&cD.2333, 0B);
;;  succ:   7 (EH)
;;  3 (FALLTHRU)
...

After pass_lower_vaarg, the expansion of ifn_va_arg is spread over several basic 
blocks:

...
;;   basic block 2, loop depth 0, count 0, freq 0, maybe hot
;;prev block 0, next block 11, flags: (NEW, REACHABLE)
;;pred:   ENTRY (FALLTHRU)
;;succ:   11 [100.0%]  (FALLTHRU)

;;   basic block 11, loop depth 0, count 0, freq 0, maybe hot
;;prev block 2, next block 12, flags: (NEW)
;;pred:   2 [100.0%]  (FALLTHRU)
  # VUSE <.MEM_4(D)>
  _22 = cD.2333.gp_offsetD.5;
  if (_22 >= 48)
goto  ();
  else
goto  ();
;;succ:   13 (TRUE_VALUE)
;;12 (FALSE_VALUE)

;;   basic block 12, loop depth 0, count 0, freq 0, maybe hot
;;prev block 11, next block 13, flags: (NEW)
;;pred:   11 (FALSE_VALUE)
:
  # VUSE <.MEM_4(D)>
  _23 = cD.2333.reg_save_areaD.8;
  # VUSE <.MEM_4(D)>
  _24 = cD.2333.gp_offsetD.5;
  _25 = (sizetype) _24;
  addr.1_26 = _23 + _25;
  # VUSE <.MEM_4(D)>
  _27 = cD.2333.gp_offsetD.5;
  _28 = _27 + 8;
  # .MEM_29 = VDEF <.MEM_4(D)>
  cD.2333.gp_offsetD.5 = _28;
  goto  ();
;;succ:   14 (FALLTHRU)

;;   basic block 13, loop depth 0, count 0, freq 0, maybe hot
;;prev block 12, next block 14, flags: (NEW)
;;pred:   11 (TRUE_VALUE)
:
  # VUSE <.MEM_4(D)>
  _30 = cD.2333.overflow_arg_areaD.7;
  addr.1_31 = _30;
  _32 = _30 + 8;
  # .MEM_33 = VDEF <.MEM_4(D)>
  cD.2333.overflow_arg_areaD.7 = _32;
;;succ:   14 (FALLTHRU)

;;   basic block 14, loop depth 0, count 0, freq 0, maybe hot
;;prev block 13, next block 15, flags: (NEW)
;;pred:   12 (FALLTHRU)
;;13 (FALLTHRU)
  # .MEM_20 = PHI <.MEM_29(12), .MEM_33(13)>
  # addr.1_21 = PHI <addr.1_26(12), addr.1_31(13)>
:
  # VUSE <.MEM_20>
  _6 = MEM[(intD.9 * * {ref-all})addr.1_21];
;;succ:   15 (FALLTHRU)

;;   basic block 15, loop depth 0, count 0, freq 0, maybe hot
;;prev block 14, next block 3, flags: (NEW)
;;pred:   14 (FALLTHRU)
;;succ:   7 (EH)
;;3 (FALLTHRU)
...

And an ICE is triggered in redirect_eh_edge_1, because the code expects the last 
statement in a BB with an outgoing EH edge to be a throwing statement.


That's obviously not the case, since bb15 is empty. But also all the other 
statements in the expansion are non-throwing.
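
The invariant behind the assert can be pictured with a toy model. This is only a sketch: toy_stmt, toy_bb and toy_eh_edge_ok are made-up names for illustration, not GCC data structures or functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a GIMPLE statement and a basic block.  */
struct toy_stmt { bool can_throw; };

struct toy_bb
{
  const struct toy_stmt *stmts;  /* statements in program order */
  size_t n_stmts;
  bool has_eh_succ;              /* block has an outgoing EH edge */
};

/* A block with an outgoing EH edge must end in a statement that can
   throw.  An empty block with an EH edge (like bb15) violates this.  */
bool
toy_eh_edge_ok (const struct toy_bb *bb)
{
  if (!bb->has_eh_succ)
    return true;
  return bb->n_stmts > 0 && bb->stmts[bb->n_stmts - 1].can_throw;
}
```

In this model the block holding the original VA_ARG call passes the check, while the empty bb15 left behind after lowering does not.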



Looking at the representation before the ifn_va_arg, VA_ARG_EXPR is non-throwing 
(even with -fnon-call-exceptions).


And looking at the situation before the introduction of ifn_va_arg, the 
expansion of VA_ARG_EXPR also didn't contain any throwing statements.



This patch fixes the ICE by marking ifn_va_arg with ECF_NOTHROW.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom
Mark ifn_va_arg with ECF_NOTHROW

2015-04-20  Tom de Vries  

	PR tree-optimization/65802
	* internal-fn.def (VA_ARG): Add ECF_NOTHROW to flags.

	* g++.dg/pr65802.C: New test.
---
 gcc/internal-fn.def|  2 +-
 gcc/testsuite/g++.dg/pr65802.C | 29 +
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/pr65802.C

diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index f557c64..7e19313 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -62,4 +62,4 @@ DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
-DEF_INTERNAL_FN (VA_ARG, 0, NULL)
+DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW, NULL)
diff --git a/gcc/testsuite/g++.dg/pr65802.C b/gcc/testsuite/g++.dg/pr65802.C
new file mode 100644
index 000..26e5317
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr65802.C
@@ -0,0 +1,29 @@
+// { dg-do compile }
+// { dg-options "-O0" }
+
+typedef int tf ();
+
+struct S
+{
+  tf m_fn1;
+} a;
+
+void
+fn1 ()
+{
+  try
+{
+  __builtin_va_list c;
+  {
+	int *d = __builtin_va_arg (c, int *);
+	int **e = &d;
+	__asm__("" : "=d"(e));
+	a.m_fn1 ();
+  }
+  a.m_fn1 ();
+}
+  catch (...)
+{
+
+}
+}
-- 
1.9.1



Re: Ping^3 : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-04-21 Thread Terry Guo
On Tue, Apr 21, 2015 at 11:03 AM, Segher Boessenkool
 wrote:
> On Tue, Apr 21, 2015 at 09:39:16AM +0800, Terry Guo wrote:
>> Is this one ok to trunk?
>
> Probably, if you send the patch + changelog entry :-)
>
> Did you fix the comment?  REG_USERVAR_P and HARD_REGISTER_P can be
> set for more than just register asm.
>
>
> Segher

Sorry for missing the patch. I believe that I addressed your comment.
Please review it again to make sure my understanding is correct. The
patch is attached and here is the URL to it
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01593.html. The
ChangeLog:

gcc/ChangeLog:
2015-04-21  Terry Guo  

   PR rtl-optimization/64818
   * combine.c (can_combine_p): Don't combine if DEST is a user-specified
   register.

gcc/testsuite/ChangeLog:

2015-04-21  Terry Guo  

   PR rtl-optimization/64818
   * gcc.target/arm/pr64818.c: New.


pr64818-combine-user-specified-register.patch-5
Description: Binary data


Re: [PATCH][PR65802] Mark ifn_va_arg with ECF_NOTHROW

2015-04-21 Thread Richard Biener
On Tue, 21 Apr 2015, Tom de Vries wrote:

> Hi,
> 
> this patch fixes PR65802.
> 
> The problem described in PR65802 is that when compiling the test-case
> (included in the patch below) at -O0, the compiler runs into a gcc_assert ICE
> in redirect_eh_edge_1 during pass_cleanup_eh:
> ...
> gcc_assert (lookup_stmt_eh_lp (throw_stmt) == old_lp_nr);
> ...
> 
> 
> In more detail, during compilation the ifn_va_arg is marked as a throwing
> function. That causes exception handling code to be generated, with exception
> handling edges:
> ...
> ;; basic block 2, loop depth 0, count 0, freq 0, maybe hot
> ;;  prev block 0, next block 3, flags: (NEW, REACHABLE)
> ;;  pred:   ENTRY (FALLTHRU)
> [LP 1] # .MEM_5 = VDEF <.MEM_4(D)>
> # USE = anything
> # CLB = anything
> _6 = VA_ARG (&cD.2333, 0B);
> ;;  succ:   7 (EH)
> ;;  3 (FALLTHRU)
> ...
> 
> After pass_lower_vaarg, the expansion of ifn_va_arg is spread over several
> basic blocks:
> ...
> ;;   basic block 2, loop depth 0, count 0, freq 0, maybe hot
> ;;prev block 0, next block 11, flags: (NEW, REACHABLE)
> ;;pred:   ENTRY (FALLTHRU)
> ;;succ:   11 [100.0%]  (FALLTHRU)
> 
> ;;   basic block 11, loop depth 0, count 0, freq 0, maybe hot
> ;;prev block 2, next block 12, flags: (NEW)
> ;;pred:   2 [100.0%]  (FALLTHRU)
>   # VUSE <.MEM_4(D)>
>   _22 = cD.2333.gp_offsetD.5;
>   if (_22 >= 48)
> goto  ();
>   else
> goto  ();
> ;;succ:   13 (TRUE_VALUE)
> ;;12 (FALSE_VALUE)
> 
> ;;   basic block 12, loop depth 0, count 0, freq 0, maybe hot
> ;;prev block 11, next block 13, flags: (NEW)
> ;;pred:   11 (FALSE_VALUE)
> :
>   # VUSE <.MEM_4(D)>
>   _23 = cD.2333.reg_save_areaD.8;
>   # VUSE <.MEM_4(D)>
>   _24 = cD.2333.gp_offsetD.5;
>   _25 = (sizetype) _24;
>   addr.1_26 = _23 + _25;
>   # VUSE <.MEM_4(D)>
>   _27 = cD.2333.gp_offsetD.5;
>   _28 = _27 + 8;
>   # .MEM_29 = VDEF <.MEM_4(D)>
>   cD.2333.gp_offsetD.5 = _28;
>   goto  ();
> ;;succ:   14 (FALLTHRU)
> 
> ;;   basic block 13, loop depth 0, count 0, freq 0, maybe hot
> ;;prev block 12, next block 14, flags: (NEW)
> ;;pred:   11 (TRUE_VALUE)
> :
>   # VUSE <.MEM_4(D)>
>   _30 = cD.2333.overflow_arg_areaD.7;
>   addr.1_31 = _30;
>   _32 = _30 + 8;
>   # .MEM_33 = VDEF <.MEM_4(D)>
>   cD.2333.overflow_arg_areaD.7 = _32;
> ;;succ:   14 (FALLTHRU)
> 
> ;;   basic block 14, loop depth 0, count 0, freq 0, maybe hot
> ;;prev block 13, next block 15, flags: (NEW)
> ;;pred:   12 (FALLTHRU)
> ;;13 (FALLTHRU)
>   # .MEM_20 = PHI <.MEM_29(12), .MEM_33(13)>
>   # addr.1_21 = PHI <addr.1_26(12), addr.1_31(13)>
> :
>   # VUSE <.MEM_20>
>   _6 = MEM[(intD.9 * * {ref-all})addr.1_21];
> ;;succ:   15 (FALLTHRU)
> 
> ;;   basic block 15, loop depth 0, count 0, freq 0, maybe hot
> ;;prev block 14, next block 3, flags: (NEW)
> ;;pred:   14 (FALLTHRU)
> ;;succ:   7 (EH)
> ;;3 (FALLTHRU)
> ...
> 
> And an ICE is triggered in redirect_eh_edge_1, because the code expects the
> last statement in a BB with an outgoing EH edge to be a throwing statement.
> 
> That's obviously not the case, since bb15 is empty. But also all the other
> statements in the expansion are non-throwing.
> 
> 
> Looking at the representation before the ifn_va_arg, VA_ARG_EXPR is
> non-throwing (even with -fnon-call-exceptions).
> 
> And looking at the situation before the introduction of ifn_va_arg, the
> expansion of VA_ARG_EXPR also didn't contain any throwing statements.
> 
> 
> This patch fixes the ICE by marking ifn_va_arg with ECF_NOTHROW.
> 
> Bootstrapped and reg-tested on x86_64.
> 
> OK for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> - Tom
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH][PR65802] Mark ifn_va_arg with ECF_NOTHROW

2015-04-21 Thread Jan Hubicka
> Mark ifn_va_arg with ECF_NOTHROW

You can definitely make it ECF_LEAF too. I wonder if we can make it ECF_CONST or
at least PURE; this would help to keep variadic functions const/pure, which may
be moderately interesting in practice.

Honza
> 
> 2015-04-20  Tom de Vries  
> 
>   PR tree-optimization/65802
>   * internal-fn.def (VA_ARG): Add ECF_NOTHROW to flags.
> 
>   * g++.dg/pr65802.C: New test.
> ---
>  gcc/internal-fn.def|  2 +-
>  gcc/testsuite/g++.dg/pr65802.C | 29 +
>  2 files changed, 30 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/pr65802.C
> 
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index f557c64..7e19313 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -62,4 +62,4 @@ DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | 
> ECF_NOTHROW, NULL)
>  DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>  DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>  DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
> -DEF_INTERNAL_FN (VA_ARG, 0, NULL)
> +DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW, NULL)
> diff --git a/gcc/testsuite/g++.dg/pr65802.C b/gcc/testsuite/g++.dg/pr65802.C
> new file mode 100644
> index 000..26e5317
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr65802.C
> @@ -0,0 +1,29 @@
> +// { dg-do compile }
> +// { dg-options "-O0" }
> +
> +typedef int tf ();
> +
> +struct S
> +{
> +  tf m_fn1;
> +} a;
> +
> +void
> +fn1 ()
> +{
> +  try
> +{
> +  __builtin_va_list c;
> +  {
> + int *d = __builtin_va_arg (c, int *);
> + int **e = &d;
> + __asm__("" : "=d"(e));
> + a.m_fn1 ();
> +  }
> +  a.m_fn1 ();
> +}
> +  catch (...)
> +{
> +
> +}
> +}
> -- 
> 1.9.1
> 



Re: [PATCH][PR65802] Mark ifn_va_arg with ECF_NOTHROW

2015-04-21 Thread Richard Biener
On Tue, 21 Apr 2015, Jan Hubicka wrote:

> > Mark ifn_va_arg with ECF_NOTHROW
> 
> You can definitely make it ECF_LEAF too. I wonder if we can make it ECF_CONST
> or at least PURE; this would help to keep variadic functions const/pure, which
> may be moderately interesting in practice.

Yes to ECF_LEAF but it isn't const or pure as it modifies the valist 
argument so you can't for example DCE va_arg (...) if the result isn't
needed.
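
A small standalone illustration of that point, using only standard <stdarg.h> (second_vararg is a made-up name for this sketch): the first va_arg result is discarded, yet the call cannot be removed, because it advances the va_list to the next argument.

```c
#include <stdarg.h>

/* Return the second of the int arguments that follow COUNT.
   The first va_arg result is thrown away, but the call still has to
   run, because it advances AP to the next argument.  */
int
second_vararg (int count, ...)
{
  va_list ap;
  int second;

  va_start (ap, count);
  (void) va_arg (ap, int);     /* result unused, side effect needed */
  second = va_arg (ap, int);
  va_end (ap);
  return second;
}
```

If the discarded va_arg were deleted as dead code, the function would return the first variadic argument instead of the second.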

Richard.

> Honza
> > 
> > 2015-04-20  Tom de Vries  
> > 
> > PR tree-optimization/65802
> > * internal-fn.def (VA_ARG): Add ECF_NOTHROW to flags.
> > 
> > * g++.dg/pr65802.C: New test.
> > ---
> >  gcc/internal-fn.def|  2 +-
> >  gcc/testsuite/g++.dg/pr65802.C | 29 +
> >  2 files changed, 30 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/g++.dg/pr65802.C
> > 
> > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > index f557c64..7e19313 100644
> > --- a/gcc/internal-fn.def
> > +++ b/gcc/internal-fn.def
> > @@ -62,4 +62,4 @@ DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | 
> > ECF_NOTHROW, NULL)
> >  DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
> >  DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
> >  DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
> > -DEF_INTERNAL_FN (VA_ARG, 0, NULL)
> > +DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW, NULL)
> > diff --git a/gcc/testsuite/g++.dg/pr65802.C b/gcc/testsuite/g++.dg/pr65802.C
> > new file mode 100644
> > index 000..26e5317
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/pr65802.C
> > @@ -0,0 +1,29 @@
> > +// { dg-do compile }
> > +// { dg-options "-O0" }
> > +
> > +typedef int tf ();
> > +
> > +struct S
> > +{
> > +  tf m_fn1;
> > +} a;
> > +
> > +void
> > +fn1 ()
> > +{
> > +  try
> > +{
> > +  __builtin_va_list c;
> > +  {
> > +   int *d = __builtin_va_arg (c, int *);
> > +   int **e = &d;
> > +   __asm__("" : "=d"(e));
> > +   a.m_fn1 ();
> > +  }
> > +  a.m_fn1 ();
> > +}
> > +  catch (...)
> > +{
> > +
> > +}
> > +}
> > -- 
> > 1.9.1
> > 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH][AArch64] Implement -m{cpu,tune,arch}=native using only /proc/cpuinfo

2015-04-21 Thread Kyrill Tkachov


On 21/04/15 05:41, Kumar, Venkataramanan wrote:

Hi Kyrill,

On the AMD Seattle board, I see that CPU implementer is 0x41 and CPU part is
0xd07. CPU variant is 1, but you don't do anything with that.
It matches with cortex-a57 and its features.


Thanks, that's a Cortex-A57.



I will try a bootstrap test as well.


Awesome.
I'd like to have a --with-{arch,tune,cpu}=native configure option at some point
in the future, but I'm not sure at the moment how that would be done without
some refactoring.

Kyrill



Regards,
Venkat.
  


-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On 
Behalf Of Kyrill Tkachov
Sent: Monday, April 20, 2015 9:18 PM
To: GCC Patches
Cc: Marcus Shawcroft; Richard Earnshaw; James Greenhalgh; Evandro Menezes; 
Andrew Pinski; James Greenhalgh
Subject: [PATCH][AArch64] Implement -m{cpu,tune,arch}=native using only 
/proc/cpuinfo

Hi all,

This is an attempt to add native CPU detection to AArch64 GNU/Linux targets.
Similar to other ports we use SPEC rewriting to rewrite 
-m{cpu,tune,arch}=native options into the appropriate CPU/architecture and the 
architecture extension options when appropriate (i.e. +crypto/+crc etc).

For CPU/architecture detection it gets a bit involved, especially when running 
on a big.LITTLE system. My proposed approach is to look at /proc/cpuinfo and 
search for the implementer id and part number fields that uniquely identify 
each core (appropriate identifying information is added to aarch64-cores.def). 
If we find two types of core we have a big.LITTLE system, so search through the 
core definitions extracted from aarch64-cores.def to find if we support such a 
combination (currently only cortex-a57.cortex-a53 and cortex-a72.cortex-a53) 
and make sure that the implementer id field matches up.

I tested this on a 4xCortex-A53 + 2xCortex-A57 big.LITTLE Ubuntu GNU/Linux 
system.
There are two formats for /proc/cpuinfo that I'm aware of. The first (old) one 
has the format:
--
processor: 0
processor: 1
processor: 2
processor: 3
processor: 4
processor: 5
Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer: 0x41
CPU architecture: AArch64
CPU variant: 0x0
CPU part: 0xd03
--

In this format it lists the 6 cores, but the CPU part it reports is only the one
for the core on which /proc/cpuinfo was read (!), in this case one of
the Cortex-A53 cores.
This means we detect a different CPU depending on which core GCC was invoked 
on. Not ideal really, but there's no more information that we can extract.
Given the /proc/cpuinfo above, this patch will rewrite -mcpu=native into 
-mcpu=cortex-a53+fp+simd+crypto+crc

The newer /proc/cpuinfo format proposed at 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=44b82b7700d05a52cd983799d3ecde1a976b3bed
looks like this:

--
processor   : 0
Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part: 0xd03
CPU revision: 0

processor   : 1
Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part: 0xd03
CPU revision: 0

processor   : 2
Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part: 0xd03
CPU revision: 0

processor   : 3
Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part: 0xd03
CPU revision: 0

processor   : 4
Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part: 0xd07
CPU revision: 0

processor   : 5
Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part: 0xd07
CPU revision: 0
--
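
As a rough illustration of the detection step on this newer format, the sketch below collects the distinct "CPU part" values from the text of /proc/cpuinfo. It is not the code from the patch: collect_cpu_parts is a hypothetical helper, and it assumes every "CPU part" line contains a colon followed by a hexadecimal value.

```c
#include <stdlib.h>
#include <string.h>

/* Collect the distinct "CPU part" values found in TEXT (the contents of
   /proc/cpuinfo) into PARTS, which has room for MAX entries.
   Returns the number of distinct parts stored (at most MAX).  */
size_t
collect_cpu_parts (const char *text, unsigned long *parts, size_t max)
{
  size_t n = 0;
  const char *p = text;

  while ((p = strstr (p, "CPU part")) != NULL)
    {
      const char *colon = strchr (p, ':');
      unsigned long part;
      size_t i;

      if (colon == NULL)
        break;
      /* Base 16 accepts the "0x" prefix and skips leading blanks.  */
      part = strtoul (colon + 1, NULL, 16);
      for (i = 0; i < n; i++)
        if (parts[i] == part)
          break;
      if (i == n && n < max)
        parts[n++] = part;
      p = colon + 1;
    }
  return n;
}
```

On the listing above this would yield two values, 0xd03 and 0xd07, which is the big.LITTLE case; a single value would indicate a homogeneous system.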

The Features field is used to detect the architectural features that we map to 
GCC option extensions i.e. +fp,+crypto,+simd,+crc etc.
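
A minimal sketch of such a mapping follows. The exact correspondence used here (fp to +fp, asimd to +simd, the four flags aes, pmull, sha1 and sha2 together to +crypto, crc32 to +crc) is an assumption inferred from the example result earlier in this mail, and has_feature and features_to_extensions are hypothetical helpers, not functions from the patch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if WORD appears as a whole space-separated token in LIST.  */
bool
has_feature (const char *list, const char *word)
{
  size_t len = strlen (word);
  const char *p = list;

  while ((p = strstr (p, word)) != NULL)
    {
      bool starts = (p == list || p[-1] == ' ');
      bool ends = (p[len] == '\0' || p[len] == ' ' || p[len] == '\n');
      if (starts && ends)
        return true;
      p += len;
    }
  return false;
}

/* Build an extension string such as "+fp+simd" from a Features list.
   The mapping below is an assumption made for this sketch.  */
void
features_to_extensions (const char *features, char *out, size_t out_size)
{
  bool crypto = (has_feature (features, "aes")
                 && has_feature (features, "pmull")
                 && has_feature (features, "sha1")
                 && has_feature (features, "sha2"));

  snprintf (out, out_size, "%s%s%s%s",
            has_feature (features, "fp") ? "+fp" : "",
            has_feature (features, "asimd") ? "+simd" : "",
            crypto ? "+crypto" : "",
            has_feature (features, "crc32") ? "+crc" : "");
}
```

Matching whole tokens matters here: a plain substring search for "simd" would also hit "asimd", which is why has_feature checks the characters on both sides of each match.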

Similarly, -march=native would be rewritten into
-march=armv8-a+fp+simd+crypto+crc while -mtune=native into
-mtune=cortex-a57.cortex-a53 (the arch extension options are not valid for
-mtune).

If it detects more than one implementer ID, implementer IDs that do not match
up, some other weirdness in /proc/cpuinfo, or fails to recognise the CPU, it
will bail out and ignore the option entirely (similarly to other ports).

The patch works fine with both /proc/cpuinfo formats although, as mentioned 
above, it will not be able to detect the big.LITTLE combination from the first 
format.

I've filled in the implemente

Re: [PATCH][AArch64] Increase static buffer size in aarch64_rewrite_selected_cpu

2015-04-21 Thread Kyrill Tkachov


On 20/04/15 21:30, James Greenhalgh wrote:

On Mon, Apr 20, 2015 at 05:24:39PM +0100, Kyrill Tkachov wrote:

Hi all,

When trying to compile a testcase with -mcpu=cortex-a57+crypto+nocrc I got
the weird assembler error:
Assembler messages:
Error: missing architectural extension
Error: unrecognized option -mcpu=cortex-a57+crypto+no

The problem is that aarch64_rewrite_selected_cpu, which is used to rewrite -mcpu
for big.LITTLE options, has a limit of 20 characters in what it handles, which
we can exhaust quickly if we specify architectural extensions in a
fine-grained manner.

This patch increases that character limit to 128 and adds an assert to
confirm that no bad things happen.
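
The truncation is easy to reproduce in isolation. The sketch below is not the GCC function: copy_cpu_name is a hypothetical helper that copies a CPU name into a fixed buffer, either truncating silently (modelling the old behaviour) or asserting that the name fits (modelling the behaviour after this patch).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy NAME into OUT, which holds OUT_SIZE bytes including the NUL.
   With CHECK set, refuse to truncate (modelled here with assert);
   without it, truncate silently.  */
void
copy_cpu_name (const char *name, char *out, size_t out_size, int check)
{
  if (check)
    assert (strlen (name) < out_size);
  snprintf (out, out_size, "%s", name);
}
```

With room for 20 characters, "cortex-a57+crypto+nocrc" (23 characters) is cut to "cortex-a57+crypto+no", which matches the string in the assembler error above.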

You've implemented this as a hard ICE, was that intended?


Yeah, the idea is that before this we would silently truncate, i.e. do the wrong
thing. Now, if we exceed the limit we ICE. I don't think it should be a user
error, because it's not really the user's fault that the compiler doesn't handle
crazy long strings, but handling arbitrarily large strings would make this
function more complex than I think is needed for the majority of cases. If you
plan to rewrite this in the future, we can revisit that.





It also fixes another problem: If we pass a big.LITTLE combination with
feature modifiers like: -mcpu=cortex-a57.cortex-a53+nosimd

the code will truncate everything after '.', thus destroying the extensions
that we want to pass.  The patch adds code to stitch the extensions back on
after the LITTLE cpu is removed.
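
One possible shape for that stitching, as a sketch only: drop_little_cpu is a hypothetical name, and the real function in GCC may be structured differently. It keeps the text before '.', skips the LITTLE core name, and re-appends everything from the first '+' onwards.

```c
#include <stdio.h>
#include <string.h>

/* Rewrite "big.little+ext" to "big+ext" in OUT (OUT_SIZE bytes).
   A name without '.' is copied unchanged.  Returns 0 on success and
   -1 if the result would not fit.  */
int
drop_little_cpu (const char *name, char *out, size_t out_size)
{
  const char *dot = strchr (name, '.');
  const char *ext = strchr (name, '+');     /* first feature modifier */
  int n;

  if (dot == NULL || (ext != NULL && ext < dot))
    n = snprintf (out, out_size, "%s", name);
  else
    n = snprintf (out, out_size, "%.*s%s",
                  (int) (dot - name), name, ext != NULL ? ext : "");

  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

For example, "cortex-a57.cortex-a53+nosimd" becomes "cortex-a57+nosimd" rather than just "cortex-a57".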

UGH, I should not be allowed near strings! This code is on my list of
things I'd love to rewrite this year! For now, this is OK and please
also queue it for 5.2 when that opens for patches.


Committed to trunk with r58.

Thanks for looking at it,
Kyrill




Ok for trunk?

Yes, thanks. And sorry again for introducing this in the first place.

James





Re: [PATCH] tetstsuite gcc.target/i386/ avx512*

2015-04-21 Thread Kirill Yukhin
Hello Andreas,
On 19 Apr 21:56, Andreas Tobler wrote:
> Done so and tested on FreeBSD amd64-unknown-freebsd11.0 and CentOS7.1.
> 
> Ok for trunk?
The patch is OK for trunk and for gcc-5 branch (when it is open).

Thanks for fixing this!

--
K


Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-21 Thread Kyrill Tkachov


On 20/04/15 19:02, Jeff Law wrote:

On 04/20/2015 02:25 AM, Kyrill Tkachov wrote:

Hi Jeff,

Hmmm, so what happens if the difference is < 0?   I'd be a bit worried
about that case for the PA (for example).

So how about asserting that the INTVAL is >= 0 prior to returning so
that we catch that case if it ever occurs?

INTVAL being >= 0 is the case that I want to catch with this function.
INTVAL <0 is the usual case on leaf call optimisation. On arm, at least,
it means that x and y use the same base register (i.e. same stack frame)
but the offsets are such that reading SIZE bytes from X will not overlap
with Y, thus not requiring the workaround in this patch.
Thus, asserting that the result is positive is not right here.

What characteristic on pa makes this problematic? Is it the
STACK_GROWS_UPWARD?

Yea or more correctly that {STACK,FRAME}_GROWS_UPWARD and
ARGS_GROW_DOWNWARD.  I think the stormy16 may have downward growing args
too.



Should I then extend this function to do something like:

HOST_WIDE_INT res = INTVAL (sub);
#ifndef STACK_GROWS_DOWNWARD
res = -res;
#endif

return res?

It certainly feels like something is needed for targets where growth is
in the opposite direction -- but my guess is that without a concrete
case that triggers on those targets (just the PA in 64 bit mode and
stormy?) we'll probably get it wrong in one way or another.  Hence my
suggestion that we assert rather than try to handle it and silently
generate incorrect code in the process.


However, this function is expected to return negative numbers
when there is no overlap i.e. in the vast majority of cases when this
bug doesn't manifest. So asserting that it's positive is just
going to ICE at -O2 in almost any code.

From reading config/stormy16/stormy-abi it seems to me that we don't
pass arguments partially in stormy16, so this code would never be called
there. That leaves pa as the potential problematic target.
I don't suppose there's an easy way to test on pa? My checkout of binutils
doesn't seem to include a sim target for it.

Kyrill




Jeff





[PATCH][AArch64] Add zero_extend variants of logical+not ops

2015-04-21 Thread Kyrill Tkachov

Hi all,

We were missing the patterns for the zero-extend versions of the negated-logic
ops bic, orn and eon, leading to redundant zero-extends being generated for
code like:

unsigned long
bar (unsigned int a, unsigned int b)
{
  return a ^ ~b;
}

unsigned long
bar2 (unsigned int a, unsigned int b)
{
  return a & ~b;
}


With this patch for the above we can generate:
bar:
	eon	w0, w1, w0
	ret

bar2:
	bic	w0, w0, w1
	ret


instead of:
bar:
	eon	w0, w1, w0
	uxtw	x0, w0
	ret

bar2:
	bic	w0, w0, w1
	uxtw	x0, w0
	ret
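
The uxtw is redundant because on AArch64 a write to a W register already clears the upper 32 bits of the corresponding X register. The same property can be stated at the C level with the sketch below. It uses fixed-width types and the made-up names eon_widened and bic_widened so that the result does not depend on the size of unsigned long, and it assumes int is 32 bits wide.

```c
#include <stdint.h>

/* Same computations as bar and bar2, with fixed-width types.  The
   operation is done in 32 bits and the result is then zero-extended.  */
uint64_t
eon_widened (uint32_t a, uint32_t b)
{
  return a ^ ~b;
}

uint64_t
bic_widened (uint32_t a, uint32_t b)
{
  return a & ~b;
}
```

In both functions the upper 32 bits of the 64-bit result are always zero, so a single 32-bit eon or bic already produces the full return value.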


Bootstrapped and tested on aarch64-linux.
Ok for trunk?

Thanks,
Kyrill

2015-04-21  Kyrylo Tkachov  

* config/aarch64/aarch64.md (*<NLOGICAL:optab>_one_cmplsidi3_ze):
New pattern.
(*xor_one_cmplsidi3_ze): Likewise.
commit 8ff76787ce2674b918e1e6ed8b09cafb6b7a
Author: Kyrylo Tkachov 
Date:   Mon Mar 2 16:20:10 2015 +

[AArch64] Add zero_extend variants of logical+not ops

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 4aa8f5c..1a7f888 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3058,6 +3058,26 @@ (define_insn "*_one_cmpl3"
(set_attr "simd" "*,yes")]
 )
 
+(define_insn "*<NLOGICAL:optab>_one_cmplsidi3_ze"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	  (NLOGICAL:SI (not:SI (match_operand:SI 1 "register_operand" "r"))
+	   (match_operand:SI 2 "register_operand" "r"))))]
+  ""
+  "<NLOGICAL:nlogical>\\t%w0, %w2, %w1"
+  [(set_attr "type" "logic_reg")]
+)
+
+(define_insn "*xor_one_cmplsidi3_ze"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(zero_extend:DI
+  (not:SI (xor:SI (match_operand:SI 1 "register_operand" "r")
+  (match_operand:SI 2 "register_operand" "r")))))]
+  ""
+  "eon\\t%w0, %w1, %w2"
+  [(set_attr "type" "logic_reg")]
+)
+
 ;; (xor (not a) b) is simplify_rtx-ed down to (not (xor a b)).
 ;; eon does not operate on SIMD registers so the vector variant must be split.
 (define_insn_and_split "*xor_one_cmpl3"


[C++ Patch, committed] PR 65801

2015-04-21 Thread Paolo Carlini

Hi,

I committed the below to trunk as approved by Jason on the audit trail. 
Will go in gcc-5 branch too for 5.2. Tested x86_64-linux.


Thanks,
Paolo.

//
/cp
2015-04-20  Paolo Carlini  

PR c++/65801
* typeck2.c (check_narrowing): In C++11 mode too, -Wno-narrowing
suppresses the diagnostic.

2015-04-20  Paolo Carlini  

PR c++/65801
* doc/invoke.texi ([-Wnarrowing]): Update.

/testsuite
2015-04-20  Paolo Carlini  

PR c++/65801
* g++.dg/cpp0x/Wnarrowing2.C: New.
Index: cp/typeck2.c
===
--- cp/typeck2.c(revision 40)
+++ cp/typeck2.c(working copy)
@@ -957,9 +957,13 @@ check_narrowing (tree type, tree init, tsubst_flag
}
}
   else if (complain & tf_error)
-   error_at (EXPR_LOC_OR_LOC (init, input_location),
- "narrowing conversion of %qE from %qT to %qT inside { }",
- init, ftype, type);
+   {
+ global_dc->pedantic_errors = 1;
+ pedwarn (EXPR_LOC_OR_LOC (init, input_location), OPT_Wnarrowing,
+  "narrowing conversion of %qE from %qT to %qT inside { }",
+  init, ftype, type);
+ global_dc->pedantic_errors = flag_pedantic_errors;
+   }
 }
 
   return cxx_dialect == cxx98 || ok; 
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 40)
+++ doc/invoke.texi (working copy)
@@ -2706,10 +2706,10 @@ int i = @{ 2.2 @}; // error: narrowing from double
 
 This flag is included in @option{-Wall} and @option{-Wc++11-compat}.
 
-With @option{-std=c++11}, @option{-Wno-narrowing} suppresses for
-non-constants the diagnostic required by the standard.  Note that this
-does not affect the meaning of well-formed code; narrowing conversions
-are still considered ill-formed in SFINAE context.
+With @option{-std=c++11}, @option{-Wno-narrowing} suppresses the diagnostic
+required by the standard.  Note that this does not affect the meaning
+of well-formed code; narrowing conversions are still considered
+ill-formed in SFINAE context.
 
 @item -Wnoexcept @r{(C++ and Objective-C++ only)}
 @opindex Wnoexcept
Index: testsuite/g++.dg/cpp0x/Wnarrowing2.C
===
--- testsuite/g++.dg/cpp0x/Wnarrowing2.C(revision 0)
+++ testsuite/g++.dg/cpp0x/Wnarrowing2.C(working copy)
@@ -0,0 +1,5 @@
+// PR c++/65801
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wno-narrowing" }
+
+static struct zai { unsigned int x; } x = {-1};


[PATCH][ARM][stage-1] Initialise cost to COSTS_N_INSNS (1) and increment in arm rtx costs

2015-04-21 Thread Kyrill Tkachov

Hi all,

This is the first of a series to clean up and simplify the arm rtx costs
function. This patch initialises the cost to COSTS_N_INSNS (1) at the top and
increments it when appropriate in the rest of the function. This makes it more
similar to the aarch64 rtx costs function and saves us the trouble of having to
remember to initialise the cost to COSTS_N_INSNS (1) in each case of the switch
statement.

Bootstrapped and tested arm-none-linux-gnueabihf.
Compiled some large programs with no codegen difference, except that some DIV
synthesis algorithms were changed, presumably due to the cost of SDIV/UDIV, which
is now being calculated correctly (before, it was missing the baseline
COSTS_N_INSNS (1)).
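
The effect on the division case can be shown with a reduced model of the two styles. This is a sketch only: old_div_cost and new_div_cost are made-up names, extra_div stands in for the extra_cost table entry, and the COSTS_N_INSNS definition is repeated here so that the example is self-contained.

```c
/* GCC defines COSTS_N_INSNS (N) as ((N) * 4).  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Old style: each case assigns the whole cost.  When optimising for
   speed the baseline instruction is not counted.  */
int
old_div_cost (int speed_p, int extra_div)
{
  int cost;

  cost = COSTS_N_INSNS (speed_p ? extra_div : 1);
  return cost;
}

/* New style: start from one instruction, then add the extra cost.  */
int
new_div_cost (int speed_p, int extra_div)
{
  int cost = COSTS_N_INSNS (1);

  cost += COSTS_N_INSNS (speed_p ? extra_div : 0);
  return cost;
}
```

When optimising for size both versions give COSTS_N_INSNS (1). When optimising for speed the new version is higher by exactly COSTS_N_INSNS (1), which would be consistent with the changed DIV synthesis decisions mentioned above.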

Ok for trunk?

Thanks,
Kyrill

2015-04-21  Kyrylo Tkachov  

* config/arm/arm.c (arm_new_rtx_costs): Initialise cost to
COSTS_N_INSNS (1) and increment it appropriately throughout the
function.
commit 8c4d923b6a2fc902a1a195e2e8c5f934e571d8dd
Author: Kyrylo Tkachov 
Date:   Thu Apr 2 11:44:54 2015 +0100

[ARM] Initialise rtx cost to COSTS_N_INSNS (1) once at the beginning

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 4dfe4a7..00da2b7 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9704,6 +9704,8 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 {
   machine_mode mode = GET_MODE (x);
 
+  *cost = COSTS_N_INSNS (1);
+
   if (TARGET_THUMB1)
 {
   if (speed_p)
@@ -9804,8 +9806,6 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
   bool is_ldm = load_multiple_operation (x, SImode);
   bool is_stm = store_multiple_operation (x, SImode);
 
-  *cost = COSTS_N_INSNS (1);
-
   if (is_ldm || is_stm)
 {
 	  if (speed_p)
@@ -9832,10 +9832,10 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 case UDIV:
   if (TARGET_HARD_FLOAT && GET_MODE_CLASS (mode) == MODE_FLOAT
 	  && (mode == SFmode || !TARGET_VFP_SINGLE))
-	*cost = COSTS_N_INSNS (speed_p
-			   ? extra_cost->fp[mode != SFmode].div : 1);
+	*cost += COSTS_N_INSNS (speed_p
+			   ? extra_cost->fp[mode != SFmode].div : 0);
   else if (mode == SImode && TARGET_IDIV)
-	*cost = COSTS_N_INSNS (speed_p ? extra_cost->mult[0].idiv : 1);
+	*cost += COSTS_N_INSNS (speed_p ? extra_cost->mult[0].idiv : 0);
   else
 	*cost = LIBCALL_COST (2);
   return false;	/* All arguments must be in registers.  */
@@ -9848,7 +9848,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 case ROTATE:
   if (mode == SImode && REG_P (XEXP (x, 1)))
 	{
-	  *cost = (COSTS_N_INSNS (2)
+	  *cost += (COSTS_N_INSNS (1)
 		   + rtx_cost (XEXP (x, 0), code, 0, speed_p));
 	  if (speed_p)
 	*cost += extra_cost->alu.shift_reg;
@@ -9861,7 +9861,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 case ASHIFTRT:
   if (mode == DImode && CONST_INT_P (XEXP (x, 1)))
 	{
-	  *cost = (COSTS_N_INSNS (3)
+	  *cost += (COSTS_N_INSNS (2)
 		   + rtx_cost (XEXP (x, 0), code, 0, speed_p));
 	  if (speed_p)
 	*cost += 2 * extra_cost->alu.shift;
@@ -9869,8 +9869,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	}
   else if (mode == SImode)
 	{
-	  *cost = (COSTS_N_INSNS (1)
-		   + rtx_cost (XEXP (x, 0), code, 0, speed_p));
+	  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
 	  /* Slightly disparage register shifts at -Os, but not by much.  */
 	  if (!CONST_INT_P (XEXP (x, 1)))
 	*cost += (speed_p ? extra_cost->alu.shift_reg : 1
@@ -9882,8 +9881,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	{
 	  if (code == ASHIFT)
 	{
-	  *cost = (COSTS_N_INSNS (1)
-		   + rtx_cost (XEXP (x, 0), code, 0, speed_p));
+	  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
 	  /* Slightly disparage register shifts at -Os, but not by
 	 much.  */
 	  if (!CONST_INT_P (XEXP (x, 1)))
@@ -9895,14 +9893,13 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	  if (arm_arch_thumb2 && CONST_INT_P (XEXP (x, 1)))
 		{
 		  /* Can use SBFX/UBFX.  */
-		  *cost = COSTS_N_INSNS (1);
 		  if (speed_p)
 		*cost += extra_cost->alu.bfx;
 		  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
 		}
 	  else
 		{
-		  *cost = COSTS_N_INSNS (2);
+		  *cost += COSTS_N_INSNS (1);
 		  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
 		  if (speed_p)
 		{
@@ -9919,7 +9916,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	}
 	  else /* Rotates.  */
 	{
-	  *cost = COSTS_N_INSNS (3 + !CONST_INT_P (XEXP (x, 1)));
+	  *cost += COSTS_N_INSNS (2 + !CONST_INT_P (XEXP (x, 1)));
 	  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
 	  if (speed_p)
 		{
@@ -9943,7 +9940,6 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 {
   if (mode == SImode)
 {
-  

Re: [PATCH][combine] Do not call rtx costs on potentially unrecognisable rtxes in combine

2015-04-21 Thread Kyrill Tkachov


On 20/04/15 19:51, Jeff Law wrote:

On 04/20/2015 08:04 AM, Kyrill Tkachov wrote:

Hi all,

I'm trying to reduce the cases where the midend calls the backend rtx
costs on bogus rtl for which the backend
doesn't have patterns or ways of handling. Having to handle these kinds
of rtxes sanely bloats those
functions and makes them harder to maintain.

One of the cases where this occurs is in combine and
distribute_and_simplify_rtx in particular.
Citing the comment at that function:
" See if X is of the form (* (+ A B) C), and if so convert to
 (+ (* A C) (* B C)) and try to simplify.
   Most of the time, this results in no change.  However, if some of
 the operands are the same or inverses of each other, simplifications
 will result."

The problem is that after it applies the distributive law it calls rtx
costs
to figure out whether the rtx became simpler. This rtx can get pretty
complex.
For example, on arm I've seen it try to cost:
(plus:SI (mult:SI (plus:SI (reg:SI 232 [ m1 ])
  (const_int 1 [0x1]))
  (reg:SI 232 [ m1 ]))
  (plus:SI (reg:SI 232 [ m1 ])
  (const_int 1 [0x1])))

which is never going to match anything on arm anyway, so why should the
costs function handle it?
In any case, I believe combine's design is such that it should first be
attempting to call
recog and split on the rtxes, and only if that succeeds should it be
making a target-specific
decision on which rtx to prefer. distribute_and_simplify_rtx goes
against that by calling
rtx costs on an unverified rtx in an attempt to gauge its complexity.

This patch remedies that by removing the call to rtx costs and instead
manually performing
a relatively simple check on whether the resultant rtx was simplified.
That is, using the example
from the comment, whether (+ (* A C) (* B C)) still has + at the top and
* in the two operands.
This should give a good indication of whether any meaningful
simplification was made (the '+' and '*'
operators in the example can be any operators that can be distributed
over).
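
To make the check concrete, here is a toy sketch in C++.  It is not the
combine.c code and does not use GCC's rtx API; Code, Node and
still_distributed are made-up names, and the tree type is an assumption
chosen only to show the shape test described above.

```cpp
// Toy model of the structural check; hypothetical types, not GCC's rtx.
enum Code { REG, CONST_INT, PLUS, MINUS, MULT, AND, IOR };

struct Node {
  Code code;
  const Node *op0;  // null for leaves
  const Node *op1;  // null for leaves
};

// True when X still has the shape (OUTER (INNER ...) (INNER ...)), i.e.
// applying the distributive law did not lead to anything simpler.
constexpr bool
still_distributed (const Node &x, Code outer, Code inner)
{
  return x.code == outer
         && x.op0 != nullptr && x.op1 != nullptr
         && x.op0->code == inner && x.op1->code == inner;
}
```

With outer = PLUS and inner = MULT, a result that still looks like
(+ (* A C) (* B C)) would be rejected as "not simplified", while anything
that folded into a different shape would be kept.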

Initially, I wanted to just return the distributed version and let recog
reject the invalid rtxes
but that caused some code quality regressions on arm where the original
rtx would not recog but
would match a beneficial splitter, whereas the distributed rtx would not.

With this patch I saw almost no codegen differences on arm for the whole
of SPEC2006.
The one exception was 416.gamess where it managed to merge a mul and an
add into an mla
which resulted in a slightly better code sequence. That was in a pretty
large file and I
don't speak Fortran'ese, so I couldn't really reduce a testcase for it,
but my guess is that
before the patch the costs would return some essentially random value
for an arbitrarily complex rtx
that was passed to it, which changed the decision in
distribute_and_simplify_rtx on whether
to return the distributed rtx, which could have impacted further
optimisations in combine.

I tried it on x86_64 as well. Again, there were almost no codegen
differences. The exception
was tonto and wrf where a few instructions were eliminated, but no
significant difference.
The resultant binaries for these two were a tiny bit smaller, with no
impact on runtime.

Therefore I claim that this is a safe thing to do, as it leaves the
target-specific rtx cost
judgements in combine to be made only on valid recog-ed rtxes, and not
having them cancel
optimisations early due to rtx costs not handling arbitrary rtxes well.

Bootstrapped on arm, x86_64, aarch64 (all linux). Tested on arm,aarch64.

Ok for trunk?

Thanks,
Kyrill


2015-04-20  Kyrylo Tkachov  

  * combine.c (distribute_and_simplify_rtx): Do not check rtx costs.
  Look at the rtx codes to see if a simplification occurred.

OK.


Thanks



Though I do wonder if, in practice, we can identify those cases that do
simplify more directly a priori and just punt everything else rather than
this rather convoluted approach.


You mean like calling simplify_binary_operation that returns NULL
if no simplification is possible?

Kyrill



jeff





Re: [PATCH][AArch64] Add zero_extend variants of logical+not ops

2015-04-21 Thread Richard Earnshaw
On 21/04/15 09:44, Kyrill Tkachov wrote:
> Hi all,
> 
> We were missing the patterns for the zero-extend versions of the
> negated-logic ops, bic,orn,eon
> leading to redundant zero-extends being generated for code like:
> 
> unsigned long
> bar (unsigned int a, unsigned int b)
> {
>   return a ^ ~b;
> }
> 
> unsigned long
> bar2 (unsigned int a, unsigned int b)
> {
>   return a & ~b;
> }
> 
> 
> With this patch for the above we can generate:
> bar:
>         eon     w0, w1, w0
>         ret
> 
> bar2:
>         bic     w0, w0, w1
>         ret
> 
> 
> instead of:
> bar:
>         eon     w0, w1, w0
>         uxtw    x0, w0
>         ret
> 
> bar2:
>         bic     w0, w0, w1
>         uxtw    x0, w0
>         ret
> 
> 
> Bootstrapped and tested on aarch64-linux.
> Ok for trunk?
> 
> Thanks,
> Kyrill
> 
> 2015-04-21  Kyrylo Tkachov  
> 
> * config/aarch64/aarch64.md (*_one_cmplsidi3_ze):
> New pattern.
> (*xor_one_cmplsidi3_ze): Likewise.
> 
> aarch64-ze-logic.patch
> 
> 
> commit 8ff76787ce2674b918e1e6ed8b09cafb6b7a
> Author: Kyrylo Tkachov 
> Date:   Mon Mar 2 16:20:10 2015 +
> 
> [AArch64] Add zero_extend variants of logical+not ops
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 4aa8f5c..1a7f888 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -3058,6 +3058,26 @@ (define_insn "*_one_cmpl3"
> (set_attr "simd" "*,yes")]
>  )
>  
> +(define_insn "*_one_cmplsidi3_ze"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> + (zero_extend:DI
> +   (NLOGICAL:SI (not:SI (match_operand:SI 1 "register_operand" "r"))
> +(match_operand:SI 2 "register_operand" "r"))))]
> +  ""
> +  "\\t%w0, %w2, %w1"
> +  [(set_attr "type" "logic_reg")]
> +)
> +
> +(define_insn "*xor_one_cmplsidi3_ze"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +(zero_extend:DI
> +  (not:SI (xor:SI (match_operand:SI 1 "register_operand" "r")
> +  (match_operand:SI 2 "register_operand" "r")))))]
> +  ""
> +  "eon\\t%w0, %w1, %w2"
> +  [(set_attr "type" "logic_reg")]
> +)
> +

I would have thought combine ought to know how to canonicalize this last
case into the form supported above.  That helps if one of the operands
is a constant, since then you can eliminate the NOT entirely.

Anyway, that's probably best held for a follow-up.

OK.


R.



[wwwdocs] PATCH for Re: GCC Plugin Announcement; CTraps - Lightweight dynamic analysis for concurrent code

2015-04-21 Thread Gerald Pfeifer
Hi Brandon,

On Wed, 23 Jan 2013, Brandon Lucia wrote:
> I have implemented a GCC plugin that I have found useful for doing
> dynamic program analysis, debugging, and performance tuning in
> concurrent code.
> 
> The plugin is called CTraps, short for Communication Traps.  The main
> idea behind CTraps is that a compiler pass implemented as a GCC plugin
> instruments instructions that access memory locations that might be
> shared between threads.  The instrumentation inserts a function call
> before such accesses.

I added this to our extensions page at https://gcc.gnu.org/extensions.html
per the patch below.

If you have further updates or changes, just advise.

Gerald

PS: The README file on github felt a bit confusing/not as clear as
your e-mail here.


Index: extensions.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/extensions.html,v
retrieving revision 1.54
diff -u -r1.54 extensions.html
--- extensions.html 20 Apr 2015 22:52:58 -  1.54
+++ extensions.html 21 Apr 2015 10:10:38 -
@@ -12,6 +12,14 @@
 tree. Please direct feedback and bug reports to their respective
 maintainers, not our mailing lists.
 
+https://github.com/blucia0a/CTraps-gcc";>CTraps plugin for 
GCC
+
+CTraps, short for Communication Traps, adds a compiler pass as
+a plugin that instruments instructions that access memory locations
+that might be shared between threads.  It supports dynamic program
+analysis, debugging, and performance tuning in concurrent code.
+
+
 http://gcc-melt.org";>GCC MELT
 
 MELT is a high-level domain specific language to ease the


RE: [PATCH, ping1] Fix removing of df problem in df_finish_pass

2015-04-21 Thread Thomas Preud'homme
Committed. I'll wait a week and then ask for approval for a backport to 5.1.1 
once 5.1 is released.

Best regards,

Thomas

> -Original Message-
> From: Kenneth Zadeck [mailto:zad...@naturalbridge.com]
> Sent: Monday, April 20, 2015 9:26 PM
> To: Thomas Preud'homme; 'Bernhard Reutner-Fischer'; gcc-
> patc...@gcc.gnu.org; 'Paolo Bonzini'; 'Seongbae Park'
> Subject: Re: [PATCH, ping1] Fix removing of df problem in df_finish_pass
> 
> As a dataflow maintainer, I approve this patch for the next release.
> However, you will have to get approval of a release manager to get it
> into 5.0.
> 
> 
> 
> On 04/20/2015 04:22 AM, Thomas Preud'homme wrote:
> > Ping?
> >
> >> -Original Message-
> >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> >> ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
> >> Sent: Tuesday, March 03, 2015 12:02 PM
> >> To: 'Bernhard Reutner-Fischer'; gcc-patches@gcc.gnu.org; 'Paolo
> Bonzini';
> >> 'Seongbae Park'; 'Kenneth Zadeck'
> >> Subject: RE: [PATCH] Fix removing of df problem in df_finish_pass
> >>
> >>> From: Bernhard Reutner-Fischer [mailto:rep.dot@gmail.com]
> >>> Sent: Saturday, February 28, 2015 4:00 AM
> use df_remove_problem rather than manually removing
> problems,
> >>> living
> >>>
> >>> leaving
> >> Indeed. Please find updated changelog below:
> >>
> >> 2015-03-03  Thomas Preud'homme
> 
> >>
> >>* df-core.c (df_finish_pass): Iterate over df-
> >>> problems_by_index[] and
> >>use df_remove_problem rather than manually removing
> >> problems, leaving
> >>holes in df->problems_in_order[].
> >>
> >> Best regards,
> >>
> >> Thomas
> >>
> >>
> >>
> >>
> >
> >





[patch] Document libstdc++ dual ABI

2015-04-21 Thread Jonathan Wakely

This adds some proper documentation for the ABI changes.

Committed to trunk.

commit 738e20c17326a4d966b24d081549991f0a318774
Author: Jonathan Wakely 
Date:   Mon Apr 20 22:49:43 2015 +0100

	* doc/xml/manual/configure.xml: Update descriptions of options
	affecting dual ABI and add cross-references.
	* doc/xml/manual/strings.xml: Clarify that string isn't COW now.
	* doc/xml/manual/using.xml: Document ABI transition.
	* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/configure.xml b/libstdc++-v3/doc/xml/manual/configure.xml
index a6e0c21..56d071e 100644
--- a/libstdc++-v3/doc/xml/manual/configure.xml
+++ b/libstdc++-v3/doc/xml/manual/configure.xml
@@ -385,18 +385,22 @@
 --disable-libstdcxx-dual-abi
  

- Disable support for the new, C++11-conforming std::string
- implementation.  This option changes the library ABI.
+ Disable support for the new, C++11-conforming implementations of
+ std::string, std::list etc. so that the
+ library only provides definitions of types using the old ABI
+ (see ).
+ This option changes the library ABI.

  
 
---with-default-libstdcxx-abi
+--with-default-libstdcxx-abi=OPTION
  

- By default, the new std::string implementation will be
- declared and a macro must be defined to declare the old implementation
- instead. That default can be reversed by configuring the library with
- --with-default-libstdcxx-abi=c++98.
+ Set the default value for the _GLIBCXX_USE_CXX11_ABI
+ macro (see ).
+ The default is OPTION=c++11 which sets the macro to
+ 1,
+ use OPTION=c++98 to set it to 0.
  This option does not change the library ABI.

  
diff --git a/libstdc++-v3/doc/xml/manual/strings.xml b/libstdc++-v3/doc/xml/manual/strings.xml
index 6a94fa2..101f8cd 100644
--- a/libstdc++-v3/doc/xml/manual/strings.xml
+++ b/libstdc++-v3/doc/xml/manual/strings.xml
@@ -353,7 +353,7 @@ stringtok(Container &container, string const &in,
   a vector's memory usage
   (see this FAQ
   entry) but the regular copy constructor cannot be used
-  because libstdc++'s string is Copy-On-Write.
+  because libstdc++'s string is Copy-On-Write in GCC 3.

In C++11 mode you can call
   s.shrink_to_fit() to achieve the same effect as
diff --git a/libstdc++-v3/doc/xml/manual/using.xml b/libstdc++-v3/doc/xml/manual/using.xml
index 0ce4407..8b4af1a 100644
--- a/libstdc++-v3/doc/xml/manual/using.xml
+++ b/libstdc++-v3/doc/xml/manual/using.xml
@@ -875,6 +875,22 @@ g++ -Winvalid-pch -I. -include stdc++.h -H -g -O2 hello.cc -o test.exe
 
 
 
+_GLIBCXX_USE_CXX11_ABI
+
+  
+Defined to the value 1 by default.
+Configurable via  --disable-libstdcxx-dual-abi
+and/or --with-default-libstdcxx-abi.
+ABI-changing.
+When defined to a non-zero value the library headers will use the
+new C++11-conforming ABI introduced in GCC 5, rather than the older
+ABI introduced in GCC 3.4. This changes the definition of several
+class templates, including std::string,
+std::list and some locale facets.
+For more details see .
+
+
+
 _GLIBCXX_CONCEPT_CHECKS
 
   
@@ -922,6 +938,94 @@ g++ -Winvalid-pch -I. -include stdc++.h -H -g -O2 hello.cc -o test.exe
 
   
 
+
+  Dual ABI
+  
+
+ In the GCC 5.1 release libstdc++ introduced a new library ABI that
+  includes new implementations of std::string and
+  std::list. These changes were necessary to conform
+  to the 2011 C++ standard which forbids Copy-On-Write strings and requires
+  lists to keep track of their size.
+
+
+ In order to maintain backwards compatibility for existing code linked
+  to libstdc++ the library's soname has not changed and the old
+  implementations are still supported in parallel with the new ones.
+  This is achieved by defining the new implementations in an inline namespace
+  so they have different names for linkage purposes, e.g. the new version of
+  std::list is actually defined as
+  std::__cxx11::list. Because the symbols
+  for the new implementations have different names the definitions for both
+  versions can be present in the same library.
+
+
+ The _GLIBCXX_USE_CXX11_ABI macro (see
+) controls whether
+  the declarations in the library headers use the old or new ABI.
+  So the decision of which ABI to use can be made separately for each
+  source file being compiled.
+  Using the default configuration options for GCC the default value
+  of the macro is 1 which causes the new ABI to be active,
+  so to use the old ABI you must explicitly define the macro to
+  0 before including any library headers.
+  (Be aware that some GNU/Linux distributions configure GCC 5 differently so
+  that the default value of the macro is 0 and users must
+  define it to 1 to enable the new ABI.)
+
+
+ Although the changes were made for C++11 conformance, the choice of ABI
+  to use is independent of the -std 
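
As an illustration of the macro documented in the patch above, here is a
minimal sketch that reports which ABI a translation unit is compiled
against.  The helper name abi_in_use is made up for this example and is not
part of libstdc++; the only library facts relied on are the
_GLIBCXX_USE_CXX11_ABI macro described above and the __GLIBCXX__ macro that
libstdc++ headers define.

```cpp
#include <string>  // any libstdc++ header makes the ABI macro visible

// Hypothetical helper: describe the std::string/std::list ABI seen here.
constexpr const char *
abi_in_use ()
{
#if !defined(__GLIBCXX__)
  return "not libstdc++";
#elif defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
  return "new (C++11) ABI";
#else
  return "old ABI";
#endif
}
```

Compiling the same file with -D_GLIBCXX_USE_CXX11_ABI=0 should switch the
result to the old ABI, since the macro is then defined before any library
header is included.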

[patch] Document effects of -std=c++14 and -std=c++03 in libstdc++ manual

2015-04-21 Thread Jonathan Wakely

A small doc patch that could also go to the 4.9 and 5 branches.

Committed only to trunk for now.


commit c5a5a32af8b7cb69c14decbfca9c1a3175e7c535
Author: Jonathan Wakely 
Date:   Mon Apr 20 13:20:16 2015 +0100

	* doc/xml/manual/abi.xml: Use uppercase for C++ Standard Library.
	* doc/xml/manual/using.xml: Document newer -std options. Use better
	examples of nested namespaces.

diff --git a/libstdc++-v3/doc/xml/manual/abi.xml b/libstdc++-v3/doc/xml/manual/abi.xml
index ee3a27e..86c591d 100644
--- a/libstdc++-v3/doc/xml/manual/abi.xml
+++ b/libstdc++-v3/doc/xml/manual/abi.xml
@@ -66,7 +66,7 @@
 
 
  Putting all of these ideas together results in the C++ Standard
-library ABI, which is the compilation of a given library API by a
+Library ABI, which is the compilation of a given library API by a
 given compiler ABI. In a nutshell:
 
 
diff --git a/libstdc++-v3/doc/xml/manual/using.xml b/libstdc++-v3/doc/xml/manual/using.xml
index f6f615e..0ce4407 100644
--- a/libstdc++-v3/doc/xml/manual/using.xml
+++ b/libstdc++-v3/doc/xml/manual/using.xml
@@ -13,7 +13,10 @@
 
 
 
-  By default, g++ is equivalent to  g++ -std=gnu++98. The standard library also defaults to this dialect.
+  The standard library conforms to the dialect of C++ specified by the
+  -std option passed to the compiler.
+  By default, g++ is equivalent to
+  g++ -std=gnu++98.
 
 
  
@@ -32,12 +35,14 @@
 
   
 
-  -std=c++98
+  -std=c++98 or -std=c++03
+  
   Use the 1998 ISO C++ standard plus amendments.
 
 
 
-  -std=gnu++98
+  -std=gnu++98 or -std=gnu++03
+  
   As directly above, with GNU extensions.
 
 
@@ -52,6 +57,16 @@
 
 
 
+  -std=c++14
+  Use the 2014 ISO C++ standard.
+
+
+
+  -std=gnu++14
+  As directly above, with GNU extensions.
+
+
+
   -fexceptions
   See exception-free dialect
 
@@ -923,8 +938,8 @@ g++ -Winvalid-pch -I. -include stdc++.h -H -g -O2 hello.cc -o test.exe
   std
 The ISO C++ standards specify that "all library entities are defined
 within namespace std." This includes namespaces nested
-within namespace std, such as namespace
-std::tr1.
+within namespace std, such as namespace
+std::chrono.
 
 
 abi


[C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-21 Thread Marek Polacek
This patch improves -Wlogical-op so that it also warns about cases such as
P && P or P || P.  I made use of what merge_ranges computes: if we have equal
operands with the same ranges, warn -- that seems to work well.
(-Wlogical-op is still enabled by neither -Wall nor -Wextra.)
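
For illustration, a small sketch of the kind of code this is aimed at.  The
function names are made up, and the exact diagnostic text is an assumption
based on the new test below ("logical 'and'/'or' of equal expressions").

```cpp
// Both operands of && are the same test; with the patch, -Wlogical-op is
// expected to flag this line.
constexpr bool
is_positive_twice (int a)
{
  return a > 0 && a > 0;
}

// Likewise for ||.
constexpr bool
is_zero_or_zero (int a)
{
  return a == 0 || a == 0;
}

// Different operands: nothing for the new check to report.
constexpr bool
in_range (int a, int lo, int hi)
{
  return a >= lo && a <= hi;
}
```

The duplicated operand is usually a copy-and-paste slip (the second test was
meant to look at another variable), which is why it seems worth a warning
even though the code is well defined.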

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-04-21  Marek Polacek  

PR c/63357
* c-common.c (warn_logical_operator): Warn if the operands have the
same expressions.

* doc/invoke.texi: Update description of -Wlogical-op.

* c-c++-common/Wlogical-op-1.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 7fe7fa6..6eecc73 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -1772,22 +1772,35 @@ warn_logical_operator (location_t location, enum 
tree_code code, tree type,
 return;
 
   /* If both expressions have the same operand, if we can merge the
- ranges, and if the range test is always false, then warn.  */
+ ranges, ...  */
   if (operand_equal_p (lhs, rhs, 0)
   && merge_ranges (&in_p, &low, &high, in0_p, low0, high0,
-  in1_p, low1, high1)
-  && 0 != (tem = build_range_check (UNKNOWN_LOCATION,
-   type, lhs, in_p, low, high))
-  && integer_zerop (tem))
+  in1_p, low1, high1))
 {
-  if (or_op)
-warning_at (location, OPT_Wlogical_op,
-"logical % "
-"of collectively exhaustive tests is always true");
-  else
-warning_at (location, OPT_Wlogical_op,
-"logical % "
-"of mutually exclusive tests is always false");
+  tem = build_range_check (UNKNOWN_LOCATION, type, lhs, in_p, low, high);
+  /* ... and if the range test is always false, then warn.  */
+  if (tem && integer_zerop (tem))
+   {
+ if (or_op)
+   warning_at (location, OPT_Wlogical_op,
+   "logical % of collectively exhaustive tests is "
+   "always true");
+ else
+   warning_at (location, OPT_Wlogical_op,
+   "logical % of mutually exclusive tests is "
+   "always false");
+   }
+  /* Or warn if the operands have exactly the same range, e.g.
+A > 0 && A > 0.  */
+  else if (low0 == low1 && high0 == high1)
+   {
+ if (or_op)
+   warning_at (location, OPT_Wlogical_op,
+   "logical % of equal expressions");
+ else
+   warning_at (location, OPT_Wlogical_op,
+   "logical % of equal expressions");
+   }
 }
 }
 
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index c20dd4d..8ce233b 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -4936,7 +4936,12 @@ programmer intended to use @code{strcmp}.  This warning 
is enabled by
 @opindex Wno-logical-op
 Warn about suspicious uses of logical operators in expressions.
 This includes using logical operators in contexts where a
-bit-wise operator is likely to be expected.
+bit-wise operator is likely to be expected.  Also warns when
+the operands of a logical operator are the same:
+@smallexample
+extern int a;
+if (a < 0 && a < 0) @{ @dots{} @}
+@end smallexample
 
 @item -Wlogical-not-parentheses
 @opindex Wlogical-not-parentheses
diff --git gcc/testsuite/c-c++-common/Wlogical-op-1.c 
gcc/testsuite/c-c++-common/Wlogical-op-1.c
index e69de29..33d4f38 100644
--- gcc/testsuite/c-c++-common/Wlogical-op-1.c
+++ gcc/testsuite/c-c++-common/Wlogical-op-1.c
@@ -0,0 +1,109 @@
+/* PR c/63357 */
+/* { dg-do compile } */
+/* { dg-options "-Wlogical-op" } */
+
+#ifndef __cplusplus
+# define bool _Bool
+# define true 1
+# define false 0
+#endif
+
+extern int bar (void);
+extern int *p;
+struct R { int a, b; } S;
+
+void
+andfn (int a, int b)
+{
+  if (a && a) {}   /* { dg-warning "logical .and. of equal 
expressions" } */
+  if (!a && !a) {} /* { dg-warning "logical .and. of equal 
expressions" } */
+  if (!!a && !!a) {}   /* { dg-warning "logical .and. of equal 
expressions" } */
+  if (a > 0 && a > 0) {}   /* { dg-warning "logical .and. of equal 
expressions" } */
+  if (a < 0 && a < 0) {}   /* { dg-warning "logical .and. of equal 
expressions" } */
+  if (a == 0 && a == 0) {} /* { dg-warning "logical .and. of equal 
expressions" } */
+  if (a <= 0 && a <= 0) {} /* { dg-warning "logical .and. of equal 
expressions" } */
+  if (a >= 0 && a >= 0) {} /* { dg-warning "logical .and. of equal 
expressions" } */
+  if (a == 0 && !(a != 0)) {}  /* { dg-warning "logical .and. of equal 
expressions" } */
+
+  if (a && a && a) {}  /* { dg-warning "logical .and. of equal 
expressions" } */
+  if ((a + 1) && (a + 1)) {}   /* { dg-warning "logical .and. of equal 
expressions" } */
+  if ((10 * a) && (a * 10)) {} /* { dg-warning "logical .and. of equal 
expressions" } 

[PATCH] Skip preprocessor directives in mklog

2015-04-21 Thread Yury Gribov

Hi all,

contrib/mklog is currently fooled by preprocessor directives inside 
functions into producing an invalid ChangeLog.  The attached patch fixes this.
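
As a rough model of the new behaviour, here is the non-hunk-header branch of
is_top_level rendered in C++ (the script itself is Perl).  This is only a
sketch: it ignores Perl's truthiness corner cases such as the string "0",
and is_space is a made-up stand-in for the \s character class.

```cpp
// Approximation of Perl's \s for ASCII input.
constexpr bool
is_space (char c)
{
  return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f';
}

// Drop the leading diff marker, then reject lines that start with
// whitespace, '{' or (new with this patch) '#'.
constexpr bool
is_top_level (const char *line)
{
  return line[0] != '\0' && line[1] != '\0'
         && !is_space (line[1]) && line[1] != '{' && line[1] != '#';
}
```

With this rule a context line such as " #ifdef FOO" is no longer mistaken
for the start of a top-level declaration.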


Tested with my local mklog testsuite and http://paste.debian.net/167999/ 
.  Ok to commit?


-Y
commit 23a738d05393676e72db82cb527d5fb1b3060e2f
Author: Yury Gribov 
Date:   Tue Apr 21 14:17:23 2015 +0300

2015-04-21  Yury Gribov  

	* mklog: Ignore preprocessor directives.

diff --git a/contrib/mklog b/contrib/mklog
index f7974a7..455614b 100755
--- a/contrib/mklog
+++ b/contrib/mklog
@@ -131,7 +131,6 @@ sub is_unified_hunk_start {
 }
 
 # Check if line is a top-level declaration.
-# TODO: ignore preprocessor directives except maybe #define ?
 sub is_top_level {
 	my ($function, $is_context_diff) = (@_);
 	if (is_unified_hunk_start ($function)
@@ -143,7 +142,7 @@ sub is_top_level {
 	} else {
 		$function =~ s/^.//;
 	}
-	return $function && $function !~ /^[\s{]/;
+	return $function && $function !~ /^[\s{#]/;
 }
 
 # Read contents of .diff file


[PATCH][i386] Properly scale vec_construct cost

2015-04-21 Thread Richard Biener

Hi,

currently the vec_construct cost is simply TYPE_VECTOR_SUBPARTS / 2 + 1,
a reasonable estimate only if other target stmt costs are close to 1.
The idea was that you need that many vector stmts, hence the following
patch, which should fix skewed costs for, for example, bdver2 with a
vec_stmt_cost of 6.
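
To put numbers on that, a tiny sketch of the old and new formulas.  The
function names are made up for illustration; only the arithmetic is taken
from the patch below.

```cpp
// Cost of building a vector from ELEMENTS scalars before the patch.
constexpr int
vec_construct_cost_old (int elements)
{
  return elements / 2 + 1;
}

// The same estimate scaled by the target's per-statement vector cost.
constexpr int
vec_construct_cost_new (int elements, int vec_stmt_cost)
{
  return vec_stmt_cost * (elements / 2 + 1);
}
```

For a four-element vector the old cost is 3 regardless of target; with a
vec_stmt_cost of 6 the new cost is 18, and with a vec_stmt_cost of 1 it is
unchanged.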

Fixing this gets important for a fix for PR62283 which will consider
building vectors up from parts during basic-block vectorization
and relies on the cost model to reject too expensive ones.
For example gcc.dg/vect/bb-slp-14.c will now be vectorized (with
the generic cost model and just SSE2) as

Cost model analysis:
  Vector inside of basic block cost: 2
  Vector prologue cost: 7
  Vector epilogue cost: 0
  Scalar cost of basic block: 10

.LFB7:
        .cfi_startproc
        subq    $24, %rsp
        .cfi_def_cfa_offset 32
        movl    in+12(%rip), %eax
        testl   %edi, %edi
        movd    in+4(%rip), %xmm0
        movd    in(%rip), %xmm1
        movl    %eax, 12(%rsp)
        movd    in+4(%rip), %xmm4
        movd    12(%rsp), %xmm3
        movl    %edi, 12(%rsp)
        punpckldq   %xmm4, %xmm1
        punpckldq   %xmm3, %xmm0
        punpcklqdq  %xmm0, %xmm1
        movd    12(%rsp), %xmm0
        movl    %esi, 12(%rsp)
        movd    12(%rsp), %xmm5
        paddd   .LC2(%rip), %xmm1
        movdqa  %xmm1, %xmm2
        psrlq   $32, %xmm1
        punpckldq   %xmm5, %xmm0
        punpcklqdq  %xmm0, %xmm0
        pmuludq %xmm0, %xmm2
        psrlq   $32, %xmm0
        pmuludq %xmm1, %xmm0
        pshufd  $8, %xmm2, %xmm1
        pshufd  $8, %xmm0, %xmm0
        punpckldq   %xmm0, %xmm1
        movaps  %xmm1, out(%rip)
        je      .L12

vs. the scalar variant

.LFB7:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    in(%rip), %edx
        movl    in+4(%rip), %eax
        movl    in+12(%rip), %ecx
        addl    $23, %edx
        imull   %edi, %edx
        leal    31(%rcx), %r8d
        movl    %edx, out(%rip)
        leal    142(%rax), %edx
        addl    $2, %eax
        imull   %edi, %eax
        imull   %esi, %edx
        movl    %eax, out+8(%rip)
        movl    %r8d, %eax
        imull   %esi, %eax
        testl   %edi, %edi
        movl    %edx, out+4(%rip)
        movl    %eax, out+12(%rip)
        je      .L12

Some excessive PRE across the conditional asm() keeps part
of the scalar computes live (yes, the cost model accounts
for that).  Previously we didn't vectorize the basic-block
because the loads from in[] could not be vectorized.  Now
we will build up a vector from the scalar loads.

The vectorized code is generated from

  :
  vect_cst_.19_43 = {x_10(D), y_13(D), x_10(D), y_13(D)};
  _3 = in[0];
  _5 = in[1];
  _8 = in[3];
  vect_cst_.16_47 = {_3, _5, _5, _8};
  vect_a0_4.15_42 = vect_cst_.16_47 + { 23, 142, 2, 31 };
  vect__11.18_44 = vect_a0_4.15_42 * vect_cst_.19_43;
  MEM[(unsigned int *)&out] = vect__11.18_44;

thus the code we generate for

  _3 = in[0];
  _5 = in[1];
  _8 = in[3];
  vect_cst_.16_47 = {_3, _5, _5, _8};

is quite bad.  It gets better for -mavx but I wonder where we
should try to optimize code generation for constructors...
(we can vectorize the loads by enhancing load permutation support,
of course - another vectorizer improvement I have some partial
patches for).

Well, anyway - below is the "obvious" cost model patch.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Ok for trunk?

Thanks,
Richard.

2015-04-21  Richard Biener  

* config/i386/i386.c (ix86_builtin_vectorization_cost): Scale
vec_construct cost by vec_stmt_cost.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 30)
+++ gcc/config/i386/i386.c  (working copy)
@@ -46731,7 +46731,7 @@ ix86_builtin_vectorization_cost (enum ve
 
   case vec_construct:
elements = TYPE_VECTOR_SUBPARTS (vectype);
-   return elements / 2 + 1;
+   return ix86_cost->vec_stmt_cost * (elements / 2 + 1);
 
   default:
 gcc_unreachable ();


Re: [Patch] pr65779 - [5/6 Regression] undefined local symbol on powerpc

2015-04-21 Thread Alan Modra
On Mon, Apr 20, 2015 at 03:17:21PM +0200, Jakub Jelinek wrote:
> On Mon, Apr 20, 2015 at 10:30:32PM +0930, Alan Modra wrote:
> Zapping is conservatively correct, if you don't know where the var lives in
> or how to compute it, you tell the debugger you don't know it.
> Of course, it is a QoI issue, if there is an easy way how to reconstruct the
> value otherwise, it is always better to do so.

That's what this revised patch does, fix the easy cases.

> > Of course, all this moving for shrink-wrap is senseless in a block
> > that contains a call.
> 
> Yeah, such blocks clearly aren't going to be shrink-wrapped, so there is no
> point to move it that far, right?

It's not where we're moving to, but from.  The first block in the
function has a call, but prepare_shrink_wrap goes ahead regardless,
moving reg copies and initialization out of the block.  Ideally none
of the moves would be committed until we decide that we can shrink
wrap.  The tricky part is that we need to perform the moves in order
to update dataflow info used to decide whether other moves can
happen.  So I think the only way to get back to the original insn
stream is keep info around for an undo.

Anyway, here's the current patch.  The debug_loc info looks much
better, so we should see fewer of those <optimized out> messages from
gdb.  Cures a dozen quality fails on powerpc64 too (all in one
testcase).  Bootstrapped and regression tested powerpc64-linux and
x86_64-linux.
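
The per-debug-insn decision made by the patch below can be summarised with
a toy model.  Operand, DebugFix and choose_debug_fix are made-up names, not
GCC types or functions; is_reg and mode merely stand in for REG_P and
GET_MODE.

```cpp
// Toy stand-in for the src/dest operands of the moved insn.
struct Operand {
  bool is_reg;  // models REG_P
  int mode;     // models GET_MODE
};

enum class DebugFix { UseSource, UnknownLocation };

// A simple reg-reg copy lets the debug insn refer to the source register;
// anything else resets the variable's location to "unknown".
constexpr DebugFix
choose_debug_fix (Operand src, Operand dest)
{
  return (src.is_reg && src.mode == dest.mode)
         ? DebugFix::UseSource
         : DebugFix::UnknownLocation;
}
```

This only covers the easy case mentioned above; the real code also copies
the debug insn into the destination block before rewriting it.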

gcc/
PR debug/65779
* shrink-wrap.c (insn_uses_reg): New function.
(move_insn_for_shrink_wrap): Try to fix up debug insns related
to the moved insn.
gcc/testsuite/
* gcc.dg/pr65779.c: New.

Index: shrink-wrap.c
===
--- shrink-wrap.c   (revision 27)
+++ shrink-wrap.c   (working copy)
@@ -182,6 +182,24 @@ live_edge_for_reg (basic_block bb, int regno, int
   return live_edge;
 }
 
+/* Return true if INSN df shows a use of a reg in the range
+   [REGNO,END_REGNO).  */
+
+static bool
+insn_uses_reg (rtx_insn *insn, unsigned int regno, unsigned int end_regno)
+{
+  df_ref use;
+
+  FOR_EACH_INSN_USE (use, insn)
+{
+  rtx reg = DF_REF_REG (use);
+
+  if (REG_P (reg) && REGNO (reg) >= regno && REGNO (reg) < end_regno)
+   return true;
+}
+  return false;
+}
+
 /* Try to move INSN from BB to a successor.  Return true on success.
USES and DEFS are the set of registers that are used and defined
after INSN in BB.  SPLIT_P indicates whether a live edge from BB
@@ -342,8 +360,11 @@ move_insn_for_shrink_wrap (basic_block bb, rtx_ins
 
   /* At this point we are committed to moving INSN, but let's try to
  move it as far as we can.  */
+  auto_vec live_bbs;
   do
 {
+  if (MAY_HAVE_DEBUG_INSNS)
+   live_bbs.safe_push (bb);
   live_out = df_get_live_out (bb);
   live_in = df_get_live_in (next_block);
   bb = next_block;
@@ -426,6 +447,54 @@ move_insn_for_shrink_wrap (basic_block bb, rtx_ins
SET_REGNO_REG_SET (bb_uses, i);
 }
 
+  /* Try to fix up debug insns in the tail of the entry block and any
+ intervening blocks that use regs set by the insn we are moving.  */
+  if (MAY_HAVE_DEBUG_INSNS)
+{
+  while (!live_bbs.is_empty ())
+   {
+ rtx_insn *dinsn;
+ basic_block tmp_bb = live_bbs.pop ();
+
+ FOR_BB_INSNS_REVERSE (tmp_bb, dinsn)
+   {
+ if (dinsn == insn)
+   break;
+ if (DEBUG_INSN_P (dinsn)
+ && insn_uses_reg (dinsn, dregno, end_dregno))
+   {
+ if (live_bbs.is_empty ())
+   /* Put debug info for the insn we'll be moving
+  into the destination block.  */
+   {
+ rtx_insn *newdinsn
+   = emit_debug_insn_after (copy_rtx (PATTERN (dinsn)),
+bb_note (bb));
+ df_insn_rescan (newdinsn);
+   }
+
+ /* If the insn is a simple reg-reg copy, then reset
+the debug insn to point to src.  */
+ if (REG_P (src) && GET_MODE (src) == GET_MODE (dest))
+   {
+ INSN_VAR_LOCATION_LOC (dinsn)
+   = simplify_replace_rtx (INSN_VAR_LOCATION_LOC (dinsn),
+   dest, src);
+ df_insn_rescan (dinsn);
+   }
+ else
+   {
+ /* Otherwise remove anything about this variable.  */
+ INSN_VAR_LOCATION_LOC (dinsn)
+   = gen_rtx_UNKNOWN_VAR_LOC ();
+ df_insn_rescan_debug_internal (dinsn);
+   }
+ break;
+   }
+   }
+   }
+}
+
   emit_insn_after (PATTERN (insn), bb_note (bb));
   delete_insn (insn);
   return t

Re: [PATCH][i386] Properly scale vec_construct cost

2015-04-21 Thread Uros Bizjak
On Tue, Apr 21, 2015 at 1:30 PM, Richard Biener  wrote:

> Well, anyway - below is the "obvious" cost model patch.
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> Ok for trunk?
>
> Thanks,
> Richard.
>
> 2015-04-21  Richard Biener  
>
> * config/i386/i386.c (ix86_builtin_vectorization_cost): Scale
> vec_construct cost by vec_stmt_cost.

OK.

Thanks,
Uros.


[wwwdocs] Add libstdc++ ABI changes to /gcc-5/changes.html

2015-04-21 Thread Jonathan Wakely

I plan to commit this to wwwdocs later today, it adds a caveat to the
top of the file, with a link to a larger description in the libstdc++
section, which links to the new page I've just added to the manual.

It also clarifies that the deprecations apply to C++, so people who
don't care about C++ can ignore that item.

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.109
diff -u -r1.109 changes.html
--- htdocs/gcc-5/changes.html	20 Apr 2015 08:22:35 -	1.109
+++ htdocs/gcc-5/changes.html	21 Apr 2015 11:45:51 -
@@ -16,15 +16,17 @@
   
 The default mode for C is now -std=gnu11 instead of
 -std=gnu89.
+The C++ runtime library (libstdc++) uses a new ABI by default
+(see below).
 The Graphite framework for loop optimizations no longer requires the
 	CLooG library, only ISL version 0.14 (recommended) or 0.12.2.  The
 installation manual contains more information about requirements to
 build GCC.
-The non-standard type traits
+The non-standard C++0x type traits
 has_trivial_default_constructor,
 has_trivial_copy_constructor and
 has_trivial_copy_assign have been deprecated and will
-be removed in a future version. The standard traits
+be removed in a future version. The standard C++11 traits
 is_trivially_default_constructible,
 is_trivially_copy_constructible and
 is_trivially_copy_assignable should be used instead.
@@ -415,6 +417,11 @@
 
   Runtime Library (libstdc++)
   
+A Dual
+ABI is provided by the library. A new ABI is enabled by default.
+The old ABI is still supported and can be used by defining the macro
+_GLIBCXX_USE_CXX11_ABI to 0 before
+including any C++ standard library headers. 
 A new implementation of std::string is enabled by default,
 using the small string optimization instead of
 copy-on-write reference counting.


[patch, libgomp] Re-factor GOMP_MAP_POINTER handling

2015-04-21 Thread Chung-Lin Tang
Hi,
while investigating some issues in the variable mapping code, I observed
that the GOMP_MAP_POINTER handling is essentially duplicated under the PSET
case.  This patch abstracts and unifies the handling code; it is basically
just a cleanup patch.  I ran the libgomp tests to ensure there are no
regressions.  OK for trunk?

Thanks,
Chung-Lin
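
For readers skimming the diff, the address arithmetic that both duplicated code paths share can be sketched on its own as below. This is a simplified illustration only: `struct mapping` and `translate_pointer` are hypothetical names, the splay-tree lookup is replaced by a single already-found mapping, and `tgt_base` stands in for `n->tgt->tgt_start + n->tgt_offset`.

```c
#include <stdint.h>

/* Hypothetical, flattened stand-in for a found splay-tree entry.  */
struct mapping
{
  uintptr_t host_start;  /* start of the mapped host range */
  uintptr_t tgt_base;    /* device address corresponding to host_start */
};

/* Compute the value to store into the device copy of a pointer.
   The bias is added before locating the target of the array section
   and subtracted again afterwards, as in the patch.  */
uintptr_t
translate_pointer (const struct mapping *m, uintptr_t host_ptr,
		   uintptr_t bias)
{
  uintptr_t biased = host_ptr + bias;         /* address inside the section */
  uintptr_t offset = biased - m->host_start;  /* offset within the mapping */
  uintptr_t tgt_addr = m->tgt_base + offset;  /* device address of section */
  return tgt_addr - bias;                     /* undo the bias */
}
```

With a mapping of host 0x1000 to device 0x8000, a host pointer 0x0FF0 with bias 0x20 translates to 0x7FF0 in this sketch.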

2015-04-21  Chung-Lin Tang  

libgomp/
* target.c (gomp_map_pointer): New function abstracting out
GOMP_MAP_POINTER handling.
(gomp_map_vars): Remove GOMP_MAP_POINTER handling code and use
gomp_map_pointer().
Index: target.c
===
--- target.c	(revision 448412)
+++ target.c	(working copy)
@@ -163,6 +163,60 @@ get_kind (bool is_openacc, void *kinds, int idx)
 		: ((unsigned char *) kinds)[idx];
 }
 
+static void
+gomp_map_pointer (struct target_mem_desc *tgt, uintptr_t host_ptr,
+		  uintptr_t target_offset, uintptr_t bias)
+{
+  struct gomp_device_descr *devicep = tgt->device_descr;
+  struct splay_tree_s *mem_map = &devicep->mem_map;
+  struct splay_tree_key_s cur_node;
+
+  cur_node.host_start = host_ptr;
+  if (cur_node.host_start == (uintptr_t) NULL)
+{
+  cur_node.tgt_offset = (uintptr_t) NULL;
+  /* FIXME: see comment about coalescing host/dev transfers below.  */
+  devicep->host2dev_func (devicep->target_id,
+			  (void *) (tgt->tgt_start + target_offset),
+			  (void *) &cur_node.tgt_offset,
+			  sizeof (void *));
+  return;
+}
+  /* Add bias to the pointer value.  */
+  cur_node.host_start += bias;
+  cur_node.host_end = cur_node.host_start + 1;
+  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+  if (n == NULL)
+{
+  /* Could be possibly zero size array section.  */
+  cur_node.host_end--;
+  n = splay_tree_lookup (mem_map, &cur_node);
+  if (n == NULL)
+	{
+	  cur_node.host_start--;
+	  n = splay_tree_lookup (mem_map, &cur_node);
+	  cur_node.host_start++;
+	}
+}
+  if (n == NULL)
+{
+  gomp_mutex_unlock (&devicep->lock);
+  gomp_fatal ("Pointer target of array section wasn't mapped");
+}
+  cur_node.host_start -= n->host_start;
+  cur_node.tgt_offset
+= n->tgt->tgt_start + n->tgt_offset + cur_node.host_start;
+  /* At this point tgt_offset is target address of the
+ array section.  Now subtract bias to get what we want
+ to initialize the pointer with.  */
+  cur_node.tgt_offset -= bias;
+  /* FIXME: see comment about coalescing host/dev transfers below.  */
+  devicep->host2dev_func (devicep->target_id,
+			  (void *) (tgt->tgt_start + target_offset),
+			  (void *) &cur_node.tgt_offset,
+			  sizeof (void *));
+}
+
 attribute_hidden struct target_mem_desc *
 gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 	   void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds,
@@ -336,54 +390,8 @@ gomp_map_vars (struct gomp_device_descr *devicep,
 	k->host_end - k->host_start);
 		break;
 		  case GOMP_MAP_POINTER:
-		cur_node.host_start
-		  = (uintptr_t) *(void **) k->host_start;
-		if (cur_node.host_start == (uintptr_t) NULL)
-		  {
-			cur_node.tgt_offset = (uintptr_t) NULL;
-			/* FIXME: see above FIXME comment.  */
-			devicep->host2dev_func (devicep->target_id,
-		(void *) (tgt->tgt_start
-			  + k->tgt_offset),
-		(void *) &cur_node.tgt_offset,
-		sizeof (void *));
-			break;
-		  }
-		/* Add bias to the pointer value.  */
-		cur_node.host_start += sizes[i];
-		cur_node.host_end = cur_node.host_start + 1;
-		n = splay_tree_lookup (mem_map, &cur_node);
-		if (n == NULL)
-		  {
-			/* Could be possibly zero size array section.  */
-			cur_node.host_end--;
-			n = splay_tree_lookup (mem_map, &cur_node);
-			if (n == NULL)
-			  {
-			cur_node.host_start--;
-			n = splay_tree_lookup (mem_map, &cur_node);
-			cur_node.host_start++;
-			  }
-		  }
-		if (n == NULL)
-		  {
-			gomp_mutex_unlock (&devicep->lock);
-			gomp_fatal ("Pointer target of array section "
-"wasn't mapped");
-		  }
-		cur_node.host_start -= n->host_start;
-		cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset
-	  + cur_node.host_start;
-		/* At this point tgt_offset is target address of the
-		   array section.  Now subtract bias to get what we want
-		   to initialize the pointer with.  */
-		cur_node.tgt_offset -= sizes[i];
-		/* FIXME: see above FIXME comment.  */
-		devicep->host2dev_func (devicep->target_id,
-	(void *) (tgt->tgt_start
-		  + k->tgt_offset),
-	(void *) &cur_node.tgt_offset,
-	sizeof (void *));
+		gomp_map_pointer (tgt, (uintptr_t) *(void **) k->host_start,
+  k->tgt_offset, sizes[i]);
 		break;
 		  case GOMP_MAP_TO_PSET:
 		/* FIXME: see above FIXME comment.  */
@@ -405,58 +413,12 @@ gomp_map_vars (struct gomp_device_descr *devicep,
 			{
 			  tgt->list[j] = k;
 		

Re: [PATCH] Skip preprocessor directives in mklog

2015-04-21 Thread Tom de Vries

On 21-04-15 13:26, Yury Gribov wrote:

Hi all,

contrib/mklog is currently fooled by preprocessor directives inside functions
and produces an invalid ChangeLog.


Hi Yury,

The effect of the patch on the mklog output using the pastebin input is:
...
@@ -2,11 +2,13 @@

 2015-04-21  x  

-   * builtins.c:
+   * builtins.c (expand_builtin):
* defaults.h:
-   * df-scan.c:
+   * df-scan.c (df_bb_refs_collect):
+   (df_get_exit_block_use_set):
* except.c:
-   * haifa-sched.c:
-   * ira-lives.c:
-   * lra-lives.c:
+   * haifa-sched.c (initiate_bb_reg_pressure_info):
+   * ira-lives.c (process_bb_node_lives):
+   * lra-lives.c (process_bb_lives):
...


So, for instance, for this patch hunk:
...
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9263777..028d793 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6510,10 +6510,8 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
machine_mode mode,

   expand_builtin_eh_return (CALL_EXPR_ARG (exp, 0),
CALL_EXPR_ARG (exp, 1));
   return const0_rtx;
-#ifdef EH_RETURN_DATA_REGNO
 case BUILT_IN_EH_RETURN_DATA_REGNO:
   return expand_builtin_eh_return_data_regno (exp);
-#endif
 case BUILT_IN_EXTEND_POINTER:
   return expand_builtin_extend_pointer (CALL_EXPR_ARG (exp, 0));
 case BUILT_IN_EH_POINTER:
...

with the patch we output:
...
   * builtins.c (expand_builtin):
...

instead of:
...
   * builtins.c:
...

That looks like an improvement to me.

Thanks,
- Tom


 The attached patch fixes this.

Tested with my local mklog testsuite and http://paste.debian.net/167999/ .  Ok
to commit?

-Y

mklog-1.diff


commit 23a738d05393676e72db82cb527d5fb1b3060e2f
Author: Yury Gribov
Date:   Tue Apr 21 14:17:23 2015 +0300

 2015-04-21  Yury Gribov

* mklog: Ignore preprocessor directives.

diff --git a/contrib/mklog b/contrib/mklog
index f7974a7..455614b 100755
--- a/contrib/mklog
+++ b/contrib/mklog
@@ -131,7 +131,6 @@ sub is_unified_hunk_start {
  }

  # Check if line is a top-level declaration.
-# TODO: ignore preprocessor directives except maybe #define ?
  sub is_top_level {
my ($function, $is_context_diff) = (@_);
if (is_unified_hunk_start ($function)
@@ -143,7 +142,7 @@ sub is_top_level {
} else {
$function =~ s/^.//;
}
-   return $function && $function !~ /^[\s{]/;
+   return $function && $function !~ /^[\s{#]/;
  }

  # Read contents of .diff file
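
The one-character change to the regex in is_top_level can be restated outside Perl as follows. This is only a rough C sketch of the same check, assuming the leading diff marker has already been stripped from the line; the function name is reused for readability and this is not part of mklog.

```c
#include <ctype.h>
#include <stddef.h>

/* Return nonzero if LINE looks like a top-level declaration, i.e. it is
   non-empty and does not start with whitespace, '{' or '#'.  The '#'
   case is what the patch adds, so that preprocessor directives are no
   longer mistaken for function headers.  */
int
is_top_level (const char *line)
{
  unsigned char c;

  if (line == NULL || line[0] == '\0')
    return 0;
  c = (unsigned char) line[0];
  return !(isspace (c) || c == '{' || c == '#');
}
```

A line such as "#ifdef EH_RETURN_DATA_REGNO" is rejected by this check, while a line starting with a function name is accepted.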





[PATCH] Fix PR65788

2015-04-21 Thread Richard Biener

The following fixes PR65788.  We need to use UNDEFINED whenever possible
to not get spurious invalid lattice transitions later.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.
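
As background for "invalid lattice transitions", here is a minimal stand-alone sketch of the ordering involved. It is a simplification under my own assumptions: the names `lat_val`, `transition_ok` and `lat_meet` are hypothetical, and the real code in tree-ssa-ccp.c also tracks the constant value and a bit mask.

```c
/* Simplified three-level lattice, ordered from most optimistic
   to least optimistic.  */
enum lat_val { LAT_UNDEFINED = 0, LAT_CONSTANT = 1, LAT_VARYING = 2 };

/* A value may only move down the lattice during propagation.
   Going from VARYING back to UNDEFINED, for example, is invalid.  */
int
transition_ok (enum lat_val old_val, enum lat_val new_val)
{
  return new_val >= old_val;
}

/* Meet of two values: the less optimistic one wins.  (This sketch
   ignores that two different constants meet to VARYING.)  */
enum lat_val
lat_meet (enum lat_val a, enum lat_val b)
{
  return a > b ? a : b;
}
```

In these terms, a statement that is first evaluated to VARYING and only later to UNDEFINED would fail the `transition_ok` check, which is why evaluating to UNDEFINED as early as possible helps.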

2015-04-21  Richard Biener  

PR tree-optimization/65788
* tree-ssa-ccp.c (evaluate_stmt): Evaluate to UNDEFINED early.

Index: gcc/tree-ssa-ccp.c
===
*** gcc/tree-ssa-ccp.c  (revision 27)
--- gcc/tree-ssa-ccp.c  (working copy)
*** evaluate_stmt (gimple stmt)
*** 1756,1761 
--- 1756,1769 
  val.mask = 0;
}
  }
+   /* If the statement result is likely UNDEFINED, make it so.  */
+   else if (likelyvalue == UNDEFINED)
+ {
+   val.lattice_val = UNDEFINED;
+   val.value = NULL_TREE;
+   val.mask = 0;
+   return val;
+ }
  
/* Resort to simplification for bitwise tracking.  */
if (flag_tree_bit_ccp
*** evaluate_stmt (gimple stmt)
*** 1890,1896 
  
if (flag_tree_bit_ccp
&& ((is_constant && TREE_CODE (val.value) == INTEGER_CST)
! || (!is_constant && likelyvalue != UNDEFINED))
&& gimple_get_lhs (stmt)
&& TREE_CODE (gimple_get_lhs (stmt)) == SSA_NAME)
  {
--- 1898,1904 
  
if (flag_tree_bit_ccp
&& ((is_constant && TREE_CODE (val.value) == INTEGER_CST)
! || !is_constant)
&& gimple_get_lhs (stmt)
&& TREE_CODE (gimple_get_lhs (stmt)) == SSA_NAME)
  {
*** evaluate_stmt (gimple stmt)
*** 1918,1939 
}
  }
  
if (!is_constant)
  {
!   /* The statement produced a nonconstant value.  If the statement
!had UNDEFINED operands, then the result of the statement
!should be UNDEFINED.  Otherwise, the statement is VARYING.  */
!   if (likelyvalue == UNDEFINED)
!   {
! val.lattice_val = likelyvalue;
! val.mask = 0;
!   }
!   else
!   {
! val.lattice_val = VARYING;
! val.mask = -1;
!   }
! 
val.value = NULL_TREE;
  }
  
--- 1926,1936 
}
  }
  
+   /* The statement produced a nonconstant value.  */
if (!is_constant)
  {
!   val.lattice_val = VARYING;
!   val.mask = -1;
val.value = NULL_TREE;
  }
  


Re: [PATCH] 65479 - sanitizer stack trace missing frames past #0 on powerpc64

2015-04-21 Thread Peter Bergner
On Tue, 2015-04-21 at 08:22 +0200, Jakub Jelinek wrote:
> > -#if defined(__powerpc__) || defined(__powerpc64__)
> > -  // PCs are always 4 byte aligned.
> > -  return pc - 4;
> > -#elif defined(__sparc__) || defined(__mips__)
> > -  return pc - 8;
> 
> The SPARC/MIPS case is of course needed, because on these architectures
> the call is followed by a delay slot.  But I wonder why you need anything
> special on any other architecture, why pc - 1 isn't good enough for those.
> The point isn't to find a PC of the call instruction, on some targets that
> is very hard and you need to disassemble, but to just find some byte in the
> call instruction.

I wrote the "pc - 4" code for powerpc* and I guess I was just
being pedantic on returning the first address of the instruction.
If using "pc - 1" works, then I'm fine with that.

Peter
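
To make the suggestion concrete, a sketch of the simplified computation could look like the following. It is an illustration of the idea discussed above, not the sanitizer source; the function name `previous_pc` is hypothetical.

```c
#include <stdint.h>

/* Given a return address PC, return some address that lies inside the
   preceding call instruction.  Hypothetical helper for illustration.  */
uintptr_t
previous_pc (uintptr_t pc)
{
#if defined(__sparc__) || defined(__mips__)
  /* The call is followed by a delay slot, so step back over two
     4-byte instructions.  */
  return pc - 8;
#else
  /* Any byte inside the call instruction is good enough for
     symbolization, so stepping back one byte suffices.  */
  return pc - 1;
#endif
}
```

On powerpc* this returns the last byte of the 4-byte call instruction rather than its first address, which should still resolve to the same instruction.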



Re: [PATCH][expmed] Properly account for the cost and latency of shift+add ops when synthesizing mults

2015-04-21 Thread Marcus Shawcroft
On 20 April 2015 at 16:12, Kyrill Tkachov  wrote:

> Thanks,
> I could've sworn I had sent this version out a couple hours ago.
> My mail client has been playing up.
>
> Here it is with 6 tests. For the tests corresponding to f1/f3 in my
> example above I scan that we don't use the 'w1' reg.
>
> I'll give the AArch64 maintainers a day or two to comment on the tests
> before committing.

Using scan-assembler-times is more robust than scan-assembler.
Otherwise, OK by me.
/Marcus

> Thanks,
> Kyrill
>
> 2015-04-20  Kyrylo Tkachov  
>
> * expmed.c: (synth_mult): Only assume overlapping
> shift with previous steps in alg_sub_t_m2 case.
>
> 2015-04-20  Kyrylo Tkachov  
>
> * gcc.target/aarch64/mult-synth_1.c: New test.
> * gcc.target/aarch64/mult-synth_2.c: Likewise.
> * gcc.target/aarch64/mult-synth_3.c: Likewise.
> * gcc.target/aarch64/mult-synth_4.c: Likewise.
> * gcc.target/aarch64/mult-synth_5.c: Likewise.
> * gcc.target/aarch64/mult-synth_6.c: Likewise.
>>
>>
>> jeff
>>
>


Re: [Patch] pr65779 - [5/6 Regression] undefined local symbol on powerpc

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 09:08:04PM +0930, Alan Modra wrote:
> +   if (DEBUG_INSN_P (dinsn)
> +   && insn_uses_reg (dinsn, dregno, end_dregno))
> + {
> +   if (live_bbs.is_empty ())
> + /* Put debug info for the insn we'll be moving
> +into the destination block.  */
> + {
> +   rtx_insn *newdinsn
> + = emit_debug_insn_after (copy_rtx (PATTERN (dinsn)),
> +  bb_note (bb));
> +   df_insn_rescan (newdinsn);
> + }

This isn't safe.  There could be a debug_insn for the same decl anywhere in
between the dinsn and bb_note (bb) on the chosen live path, if there is,
this change will break stuff.

> +   /* If the insn is a simple reg-reg copy, then reset
> +  the debug insn to point to src.  */
> +   if (REG_P (src) && GET_MODE (src) == GET_MODE (dest))
> + {
> +   INSN_VAR_LOCATION_LOC (dinsn)
> + = simplify_replace_rtx (INSN_VAR_LOCATION_LOC (dinsn),
> + dest, src);
> +   df_insn_rescan (dinsn);
> + }
> +   else
> + {
> +   /* Otherwise remove anything about this variable.  */
> +   INSN_VAR_LOCATION_LOC (dinsn)
> + = gen_rtx_UNKNOWN_VAR_LOC ();
> +   df_insn_rescan_debug_internal (dinsn);
> + }

This works (though the simplify_replace_rtx alone is dangerous, you'd better
use propagate_for_debug), but is unnecessarily limiting.  You could just
insert a debug insn with a debug temp before the original insn and replace
all the uses of the reg with the debug temporary.
And, as you are walking all the bbs on the path insn by insn anyway,
you could presumably use the valtrack APIs for that instead.
Thus, call
  dead_debug_local_init (&debug, NULL, NULL);
before walking the first bb, then call
  dead_debug_add on each FOR_EACH_INSN_INFO_USE of the debug insns that
overlaps the dest REG, then
  dead_debug_insert_temp with DEBUG_TEMP_BEFORE_WITH_VALUE, and
finally dead_debug_local_finish.  Of course all this should be guarded with
MAY_HAVE_DEBUG_INSNS.

Jakub


Re: [PATCH] Fix PR65650 (1/n in merging CCP and copyprop)

2015-04-21 Thread Richard Biener
On Thu, 2 Apr 2015, Richard Biener wrote:

> 
> The following makes CCP track copies which avoids pass ordering
> issues between CCP and copyprop as seen from the testcase.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.
> 
> For stage1 I'd like to get rid of copyprop completely, a 2nd patch
> in the series will remove the copyprop instances immediately
> preceding/following CCP.
> 
> CCP needs some TLC and I'm going to apply that during stage1.

With the propagator engine improvement this needs some extra
testcase adjustments.

Re-bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Note that the "obvious" improvement of ccp_lattice_meet to

  if (val1->lattice_val == UNDEFINED
  /* For UNDEFINED M SSA we can't always SSA because its definition
 may not dominate the PHI node.  */
  && (val2->lattice_val != CONSTANT
  || TREE_CODE (val2->value) != SSA_NAME
  || SSA_NAME_IS_DEFAULT_DEF (val2->value)
  || (gimple_bb (SSA_NAME_DEF_STMT (val2->value)) != where
  && dominated_by_p (CDI_DOMINATORS, where,
 gimple_bb (SSA_NAME_DEF_STMT 
(val2->value))

to enable optimistic copy propagation (yes, copyprop doesn't do that)
regresses quite a few gcc.dg/uninit-pred*.c testcases, so I'm at least
not enabling this with this patch.  For example in
gcc.dg/uninit-pred-2_a.c:

int foo (int n, int m, int r)
{
  int flag = 0;
  int v;

  if (n)
{
  v = r;
  flag = 1;
}

  if (m)
g++;
  else
bar();

  /* Wrong guard */
  if (!flag)
    blah(v); /* { dg-warning "uninitialized" "real uninitialized var warning" } */

we see that we can optimistically propagate r into v for the call
due to the PHI

 v_3 = PHI 

and v being uninitialized on one path.  We have similar missed
uninit warnings for optimistic constant propagations already
so I think optimistically propagating copies isn't wrong.
We might want to provide a flag to turn both off, of course.

I'll send a followup enabling optimistic copyprop once I get
my mind around how to fix the testcases.

Richard.

2015-04-16  Richard Biener  

PR tree-optimization/65650
* tree-ssa-ccp.c (valid_lattice_transition): Allow lattice
transitions involving copies.
(set_lattice_value): Adjust for copy lattice state.
(ccp_lattice_meet): Do not merge UNDEFINED and a copy to the copy
if that doesn't dominate the merge point.
(bit_value_unop): Adjust what we treat as varying mask.
(bit_value_binop): Likewise.
(bit_value_assume_aligned): Likewise.
(evaluate_stmt): When we simplified to a SSA name record a copy
instead of dropping to varying.
(visit_assignment): Simplify.

* gimple-match.h (gimple_simplify): Add another callback.
* gimple-fold.c (fold_stmt_1): Adjust caller.
(gimple_fold_stmt_to_constant_1): Likewise - pass valueize
for the 2nd callback.
* gimple-match-head.c (gimple_simplify): Add a callback that is
used to valueize the stmt operands and use it that way.

* gcc.dg/tree-ssa/ssa-ccp-37.c: New testcase.
* gcc.dg/tree-ssa/forwprop-11.c: Adjust.
* gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
* gcc.dg/tree-ssa/ssa-fre-4.c: Likewise.
* gcc.dg/tree-ssa/ssa-fre-5.c: Likewise.
* gcc.dg/tree-ssa/ssa-fre-32.c: Likewise.


Index: gcc/gimple-fold.c
===
*** gcc/gimple-fold.c   (revision 66)
--- gcc/gimple-fold.c   (working copy)
*** fold_stmt_1 (gimple_stmt_iterator *gsi,
*** 3621,3627 
gimple_seq seq = NULL;
code_helper rcode;
tree ops[3] = {};
!   if (gimple_simplify (stmt, &rcode, ops, inplace ? NULL : &seq, 
valueize))
{
  if (replace_stmt_with_simplification (gsi, rcode, ops, &seq, inplace))
changed = true;
--- 3621,3628 
gimple_seq seq = NULL;
code_helper rcode;
tree ops[3] = {};
!   if (gimple_simplify (stmt, &rcode, ops, inplace ? NULL : &seq,
!  valueize, valueize))
{
  if (replace_stmt_with_simplification (gsi, rcode, ops, &seq, inplace))
changed = true;
*** gimple_fold_stmt_to_constant_1 (gimple s
*** 4928,4934 
   edges if there are intermediate VARYING defs.  For this reason
   do not follow SSA edges here even though SCCVN can technically
   just deal fine with that.  */
!   if (gimple_simplify (stmt, &rcode, ops, NULL, gvalueize)
&& rcode.is_tree_code ()
&& (TREE_CODE_LENGTH ((tree_code) rcode) == 0
  || ((tree_code) rcode) == ADDR_EXPR)
--- 4929,4935 
   edges if there are intermediate VARYING defs.  For this reason
   do not follow SSA edges here even though SCCVN can technically
   just deal fine with that.  */
!   if (gimple_simplify (stmt, &rcod

Re: [PATCH][expmed] Properly account for the cost and latency of shift+add ops when synthesizing mults

2015-04-21 Thread Kyrill Tkachov


On 21/04/15 13:46, Marcus Shawcroft wrote:

On 20 April 2015 at 16:12, Kyrill Tkachov  wrote:


Thanks,
I could've sworn I had sent this version out a couple hours ago.
My mail client has been playing up.

Here it is with 6 tests. For the tests corresponding to f1/f3 in my
example above I scan that we don't use the 'w1' reg.

I'll give the AArch64 maintainers a day or two to comment on the tests
before committing.

Using scan-assembler-times is more robust than scan-assembler.
Otherwise, OK by me.
/Marcus


Thanks, I used scan-assembler-times for those tests.
Attached is what I committed with r68.

Kyrill
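
For context, the kind of shift+add decomposition that multiplication synthesis can choose instead of a mul instruction can be written out in plain C as below. This is only an illustration of the arithmetic for the constants 10 and 20 used in the tests; the function names are hypothetical, and the sequence the compiler actually picks depends on the target cost model.

```c
/* x * 10 computed as ((x << 2) + x) << 1.  Unsigned arithmetic is
   used so that the shifts are well defined for all inputs.  */
unsigned int
mul10_shift_add (unsigned int x)
{
  unsigned int t = (x << 2) + x;  /* x * 5 */
  return t << 1;                  /* x * 10 */
}

/* x * 20 computed as ((x << 2) + x) << 2.  */
unsigned int
mul20_shift_add (unsigned int x)
{
  unsigned int t = (x << 2) + x;  /* x * 5 */
  return t << 2;                  /* x * 20 */
}
```

Both functions need only one intermediate value, which is consistent with the tests checking that no second register such as 'w1' shows up.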




Thanks,
Kyrill

2015-04-20  Kyrylo Tkachov  

 * expmed.c: (synth_mult): Only assume overlapping
 shift with previous steps in alg_sub_t_m2 case.

2015-04-20  Kyrylo Tkachov  

 * gcc.target/aarch64/mult-synth_1.c: New test.
 * gcc.target/aarch64/mult-synth_2.c: Likewise.
 * gcc.target/aarch64/mult-synth_3.c: Likewise.
 * gcc.target/aarch64/mult-synth_4.c: Likewise.
 * gcc.target/aarch64/mult-synth_5.c: Likewise.
 * gcc.target/aarch64/mult-synth_6.c: Likewise.


jeff



Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 66)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2015-04-21  Kyrylo Tkachov  
+
+	* expmed.c: (synth_mult): Only assume overlapping
+	shift with previous steps in alg_sub_t_m2 case.
+
 2015-04-21  Richard Biener  
 
 	PR tree-optimization/65788
Index: gcc/testsuite/gcc.target/aarch64/mult-synth_2.c
===
--- gcc/testsuite/gcc.target/aarch64/mult-synth_2.c	(revision 0)
+++ gcc/testsuite/gcc.target/aarch64/mult-synth_2.c	(revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+int
+foo (int x)
+{
+  return x * 25;
+}
+
+/* { dg-final { scan-assembler-times "mul\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */
+/* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/gcc.target/aarch64/mult-synth_3.c
===
--- gcc/testsuite/gcc.target/aarch64/mult-synth_3.c	(revision 0)
+++ gcc/testsuite/gcc.target/aarch64/mult-synth_3.c	(revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+int
+foo (int x)
+{
+  return x * 11;
+}
+
+/* { dg-final { scan-assembler-times "mul\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */
+/* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/gcc.target/aarch64/mult-synth_4.c
===
--- gcc/testsuite/gcc.target/aarch64/mult-synth_4.c	(revision 0)
+++ gcc/testsuite/gcc.target/aarch64/mult-synth_4.c	(revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+long
+foo (int x, int y)
+{
+   return (long)x * 6L;
+}
+
+/* { dg-final { scan-assembler-times "smull\tx\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */
+/* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/gcc.target/aarch64/mult-synth_5.c
===
--- gcc/testsuite/gcc.target/aarch64/mult-synth_5.c	(revision 0)
+++ gcc/testsuite/gcc.target/aarch64/mult-synth_5.c	(revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+int
+foo (int x)
+{
+  return x * 10;
+}
+
+/* { dg-final { scan-assembler-not "\tw1" } } */
+/* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/gcc.target/aarch64/mult-synth_6.c
===
--- gcc/testsuite/gcc.target/aarch64/mult-synth_6.c	(revision 0)
+++ gcc/testsuite/gcc.target/aarch64/mult-synth_6.c	(revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+int
+foo (int x)
+{
+  return x * 20;
+}
+
+/* { dg-final { scan-assembler-not "\tw1" } } */
+/* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/gcc.target/aarch64/mult-synth_1.c
===
--- gcc/testsuite/gcc.target/aarch64/mult-synth_1.c	(revision 0)
+++ gcc/testsuite/gcc.target/aarch64/mult-synth_1.c	(revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */
+
+int
+foo (int x)
+{
+  return x * 100;
+}
+
+/* { dg-final { scan-assembler-times "mul\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */
+/* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(revision 66)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,3 +1,12 @@
+2015-04-21  Kyrylo Tkachov  
+
+	* gcc.target/aarch64/mult-synth_1.c: New test.
+	* gcc.target/aarch64/mult-synth_2.c: Likewise.
+	* gcc.target/aarch64/mult-synth_3.c: Likewi

Re: [patch,wwwdocs] Add gcc-5 caveats for avr.

2015-04-21 Thread Georg-Johann Lay

On 04/20/2015 at 09:02 PM, Gerald Pfeifer wrote:

Hi Johann,

On Mon, 20 Apr 2015, Georg-Johann Lay wrote:

Okay to install?


+The AVR port uses a new scheme to describe supported devices:
+For each supported device the compiler provides a device-specific
+http://gcc.gnu.org/onlinedocs/gcc/Spec-Files.html";>spec
file.
+If the compiler is used together with AVR-LibC, this requires at
+least GCC 5.2 and a version of AVR-LibC which implements
+http://savannah.nongnu.org/bugs/?44574";#44574.

Can you please make the two links https-links?  (Especially the
one to gcc.gnu.org actually redirects.)

Just using "#44574" for a reference, may that be a little confusing,
or is it sufficiently clear to AVR users?

+  A new command option -nodevicelib has been added.

"command-line option"

+If this option is turned on the compiler won't link against AVR-LibC's
+device-specific library libdevice.a by omitting
+-ldevice from the linker's command line.

How about making this "...-nodevicelib prevents the compiler
from linking against"?

+If the compiler had not been
+http://gcc.gnu.org/install/configure.html";>configured
+to be used with AVR-LibC, the compiler will not link against that
+library and the option has no effect.

"was not" (or "is") instead of "had not", and can you please use
https here as well?

Though, really, could this be just simplified to "If the compiler is
not configured for use with AVR-LibC to begin with, this option has no effect"?


Your patch is fine with the above changes or considering them and
deciding not go for one or the other.

Gerald



Thanks for your support.  The new entry also contains more topics.





Index: gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.109
diff -u -p -r1.109 changes.html
--- gcc-5/changes.html	20 Apr 2015 08:22:35 -	1.109
+++ gcc-5/changes.html	21 Apr 2015 13:00:11 -
@@ -28,6 +28,14 @@
 is_trivially_default_constructible,
 is_trivially_copy_constructible and
 is_trivially_copy_assignable should be used instead.
+On AVR, support has been added for the devices ATtiny4/5/9/10/20/40.
+This requires Binutils 2.25 or newer.
+The AVR port uses a new scheme to describe supported devices:
+For each supported device the compiler provides a device-specific
+https://gcc.gnu.org/onlinedocs/gcc/Spec-Files.html";>spec file.
+If the compiler is used together with AVR-LibC, this requires at
+least GCC 5.2 and a version of AVR-LibC which implements
+https://savannah.nongnu.org/bugs/?44574";feature #44574.
   
 
 General Optimizer Improvements
@@ -690,6 +698,57 @@ here.
  
  
 
+AVR
+
+  The compiler no more supports individual devices like ATmega8.
+Specifying, say, -mmcu=atmega8 triggers the usage of the
+device-specific
+https://gcc.gnu.org/onlinedocs/gcc/Spec-Files.html";>spec file
+specs-atmega8 which is part of the installation and describes
+options for the sub-processes like compiler proper, assembler and linker.
+You can add support for a new device -mmcu=mydevice as follows:
+
+  In an empty directory /someplace, create a new
+  directory device-specs.
+  Copy a device spec file from the installed device-specs
+folder, follow the comments in that file and then save it as
+/someplace/device-specs/specs-mydevice.
+  Add -B /someplace -mmcu=mydevice to the
+compiler's command-line options.  Notice that /someplace
+must specify an absolute path and that mydevice must
+not start with "avr".
+  Provided you have a device-specific library
+libmydevice.a available, you can put it at
+/someplace, dito for a device-specific startup
+file crtmydevice.o.
+
+The contents of the device spec files depend on the compiler's
+configuration, in particular on --with-avrlibc=no and
+whether or not it is configured for RTEMS.
+  
+  A new command-line option -nodevicelib has been added.
+It prevents the compiler from linking against AVR-LibC's
+device-specific library libdevice.a.
+  The following three command-line options have been added:
+
+  -mrmw
+  Set if the device supports the read-modify-write instructions
+LAC, LAS, LAT
+and XCH.
+  -mn-flash=size
+  Specify the flash size of the device in units of 64 KiB,
+rounded up to the next integer as needed.  This option affects the
+availability of the
+https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html";>AVR
+  address-spaces.
+  -mskip-bug
+  Set if the device is affected by the respective silicon bug.
+
+In general, you don't need to set these options by ha

[PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h: New definition of EH_RETURN_DATA_REGNO.
* except.c: Remove definition of EH_RETURN_DATA_REGNO.
* builtins.c (expand_builtin): Remove check if
EH_RETURN_DATA_REGNO is defined.
* df-scan.c (df_bb_refs_collect): Likewise.
(df_get_exit_block_use_set): Likewise.
* haifa-sched.c (initiate_bb_reg_pressure_info): Likewise.
* ira-lives.c (process_bb_node_lives): Likewise.
* lra-lives.c (process_bb_lives): Likewise.
---
 gcc/builtins.c| 2 --
 gcc/defaults.h| 6 ++
 gcc/df-scan.c | 4 
 gcc/except.c  | 6 --
 gcc/haifa-sched.c | 2 --
 gcc/ira-lives.c   | 2 --
 gcc/lra-lives.c   | 2 --
 7 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9263777..028d793 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6510,10 +6510,8 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
machine_mode mode,
   expand_builtin_eh_return (CALL_EXPR_ARG (exp, 0),
CALL_EXPR_ARG (exp, 1));
   return const0_rtx;
-#ifdef EH_RETURN_DATA_REGNO
 case BUILT_IN_EH_RETURN_DATA_REGNO:
   return expand_builtin_eh_return_data_regno (exp);
-#endif
 case BUILT_IN_EXTEND_POINTER:
   return expand_builtin_extend_pointer (CALL_EXPR_ARG (exp, 0));
 case BUILT_IN_EH_POINTER:
diff --git a/gcc/defaults.h b/gcc/defaults.h
index 1d54798..911c2f8 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -377,6 +377,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 #endif
 #endif
 
+/* Provide defaults for stuff that may not be defined when using
+   sjlj exceptions.  */
+#ifndef EH_RETURN_DATA_REGNO
+#define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM
+#endif
+
 /* If we have named section and we support weak symbols, then use the
.jcr section for recording java classes which need to be registered
at program start-up time.  */
diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 1700be9..b2e2e5d 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3332,7 +3332,6 @@ df_bb_refs_collect (struct df_collection_rec 
*collection_rec, basic_block bb)
   return;
 }
 
-#ifdef EH_RETURN_DATA_REGNO
   if (bb_has_eh_pred (bb))
 {
   unsigned int i;
@@ -3346,7 +3345,6 @@ df_bb_refs_collect (struct df_collection_rec 
*collection_rec, basic_block bb)
 bb, NULL, DF_REF_REG_DEF, DF_REF_AT_TOP);
}
 }
-#endif
 
   /* Add the hard_frame_pointer if this block is the target of a
  non-local goto.  */
@@ -3751,7 +3749,6 @@ df_get_exit_block_use_set (bitmap exit_block_uses)
  bitmap_set_bit (exit_block_uses, i);
 }
 
-#ifdef EH_RETURN_DATA_REGNO
   /* Mark the registers that will contain data for the handler.  */
   if (reload_completed && crtl->calls_eh_return)
 for (i = 0; ; ++i)
@@ -3761,7 +3758,6 @@ df_get_exit_block_use_set (bitmap exit_block_uses)
  break;
bitmap_set_bit (exit_block_uses, regno);
   }
-#endif
 
 #ifdef EH_RETURN_STACKADJ_RTX
   if ((!HAVE_epilogue || ! epilogue_completed)
diff --git a/gcc/except.c b/gcc/except.c
index 833ec21..7573c88 100644
--- a/gcc/except.c
+++ b/gcc/except.c
@@ -174,12 +174,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "builtins.h"
 
-/* Provide defaults for stuff that may not be defined when using
-   sjlj exceptions.  */
-#ifndef EH_RETURN_DATA_REGNO
-#define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM
-#endif
-
 static GTY(()) int call_site_base;
 
 struct tree_hash_traits : default_hashmap_traits
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index ad2450b..d47cb8c 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -1070,7 +1070,6 @@ initiate_bb_reg_pressure_info (basic_block bb)
   if (NONDEBUG_INSN_P (insn))
setup_ref_regs (PATTERN (insn));
   initiate_reg_pressure_info (df_get_live_in (bb));
-#ifdef EH_RETURN_DATA_REGNO
   if (bb_has_eh_pred (bb))
 for (i = 0; ; ++i)
   {
@@ -1082,7 +1081,6 @@ initiate_bb_reg_pressure_info (basic_block bb)
  mark_regno_birth_or_death (curr_reg_live, curr_reg_pressure,
 regno, true);
   }
-#endif
 }
 
 /* Save current register pressure related info.  */
diff --git a/gcc/ira-lives.c b/gcc/ira-lives.c
index b29f572..2837349 100644
--- a/gcc/ira-lives.c
+++ b/gcc/ira-lives.c
@@ -1319,7 +1319,6 @@ process_bb_node_lives (ira_loop_tree_node_t 
loop_tree_node)
  curr_point++;
}
 
-#ifdef EH_RETURN_DATA_REGNO
   if (bb_has_eh_pred (bb))
for (j = 0; ; ++j)
  {
@@ -1328,7 +1327,6 @@ process_bb_node_lives (ira_loop_tree_node_t 
loop_tree_node)
  break;
make_hard_regno_born (regno);
  }
-#endif
 
   /* Allocnos can't go in stack regs at the start of a basic block
 that is reached by an abnormal edge. Likewise for call
diff 

[PATCH 00/12] Reduce conditional compilation

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

This is a first round of patches to reduce the amount of code within #if /
#ifdef.  This makes it incrementally easier to not break configs other than the
one being built, and moves things slightly closer to using target hooks for
everything.

Each commit was bootstrapped and regtested on x86_64-linux-gnu without
regression, and the whole patch set was run through config-list.mk without
issue.  OK?
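
For readers skimming the series, the recurring transformation can be sketched
roughly as follows.  This is only an illustration, not code from the patches:
EXAMPLE_ADDR_OFFSET and EXAMPLE_LEGACY_OFFSET are hypothetical macros standing
in for a target macro, and the two functions are made up for the comparison.

```c
/* Hypothetical target macro (an assumption for this sketch, not a real GCC
   macro).  A port may define it earlier; otherwise the default below
   applies, in the spirit of the defaults added to defaults.h.  */
#ifndef EXAMPLE_ADDR_OFFSET
#define EXAMPLE_ADDR_OFFSET 0
#endif

/* Old style: the adjustment is only seen by the compiler when the
   (hypothetical) macro EXAMPLE_LEGACY_OFFSET is defined, so the code can
   silently break on configurations that are not being built.  */
long
adjust_with_ifdef (long addr)
{
#ifdef EXAMPLE_LEGACY_OFFSET
  addr += EXAMPLE_LEGACY_OFFSET;
#endif
  return addr;
}

/* New style: the macro always has a value, so the code is parsed and
   type-checked on every configuration.  When the value is a constant 0,
   the compiler can discard the branch as dead code.  */
long
adjust_with_if (long addr)
{
  if (EXAMPLE_ADDR_OFFSET)
    addr += EXAMPLE_ADDR_OFFSET;
  return addr;
}
```

Building the sketch with, say, -DEXAMPLE_ADDR_OFFSET=8 exercises the non-default
path of adjust_with_if without touching the source.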

Trevor Saunders (12):
  add default definition of EH_RETURN_DATA_REGNO
  remove some ifdef HAVE_cc0
  more HAVE_cc0
  always define HAVE_cc0
  make some HAVE_cc0 code always compiled
  provide default for RETURN_ADDR_OFFSET
  provide default for MASK_RETURN_ADDR
  reduce conditional compilation for HARD_FRAME_POINTER_IS_FRAME_POINTER
  remove #if for PIC_OFFSET_TABLE_REGNUM
  remove more ifdefs for HAVE_cc0
  provide default for INSN_SETS_ARE_DELAYED
  add default for INSN_REFERENCES_ARE_DELAYED

 gcc/alias.c   |  7 ++---
 gcc/builtins.c|  2 --
 gcc/caller-save.c |  4 +--
 gcc/cfgcleanup.c  | 26 +---
 gcc/cfgrtl.c  | 12 ++--
 gcc/combine.c | 84 ++-
 gcc/conditions.h  |  6 
 gcc/cprop.c   |  4 +--
 gcc/cse.c | 22 +-
 gcc/defaults.h| 23 ++
 gcc/df-problems.c |  9 ++
 gcc/df-scan.c | 46 +++-
 gcc/emit-rtl.c|  8 ++---
 gcc/except.c  | 26 ++--
 gcc/final.c   | 43 --
 gcc/function.c|  5 ++-
 gcc/gcse.c| 24 ---
 gcc/genconfig.c   |  1 +
 gcc/haifa-sched.c |  5 +--
 gcc/ira-lives.c   |  2 --
 gcc/ira.c | 33 +---
 gcc/jump.c|  3 --
 gcc/loop-invariant.c  |  4 +--
 gcc/lra-constraints.c |  6 ++--
 gcc/lra-lives.c   |  2 --
 gcc/optabs.c  |  2 +-
 gcc/postreload.c  |  4 +--
 gcc/recog.c   |  2 --
 gcc/recog.h   |  2 --
 gcc/reginfo.c |  5 ++-
 gcc/regrename.c   |  5 ++-
 gcc/reload.c  | 12 +++-
 gcc/reload1.c | 10 +++---
 gcc/reorg.c   | 68 ++---
 gcc/resource.c| 15 +++--
 gcc/rtlanal.c |  2 --
 gcc/sched-deps.c  |  5 +--
 gcc/sched-rgn.c   |  4 +--
 gcc/simplify-rtx.c|  5 ++-
 39 files changed, 199 insertions(+), 349 deletions(-)

-- 
2.3.0.80.g18d0fec.dirty



[PATCH 02/12] remove some ifdef HAVE_cc0

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* conditions.h: Define macros even if HAVE_cc0 is undefined.
* emit-rtl.c: Define functions even if HAVE_cc0 is undefined.
* final.c: Likewise.
* jump.c: Likewise.
* recog.c: Likewise.
* recog.h: Declare functions even when HAVE_cc0 is undefined.
* sched-deps.c (sched_analyze_2): Always compile case for cc0.
---
 gcc/conditions.h | 6 --
 gcc/emit-rtl.c   | 2 --
 gcc/final.c  | 2 --
 gcc/jump.c   | 3 ---
 gcc/recog.c  | 2 --
 gcc/recog.h  | 2 --
 gcc/sched-deps.c | 5 +++--
 7 files changed, 3 insertions(+), 19 deletions(-)

diff --git a/gcc/conditions.h b/gcc/conditions.h
index 2308bfc..7cd1e1c 100644
--- a/gcc/conditions.h
+++ b/gcc/conditions.h
@@ -20,10 +20,6 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_CONDITIONS_H
 #define GCC_CONDITIONS_H
 
-/* None of the things in the files exist if we don't use CC0.  */
-
-#ifdef HAVE_cc0
-
 /* The variable cc_status says how to interpret the condition code.
It is set by output routines for an instruction that sets the cc's
and examined by output routines for jump instructions.
@@ -117,6 +113,4 @@ extern CC_STATUS cc_status;
  (cc_status.flags = 0, cc_status.value1 = 0, cc_status.value2 = 0,  \
   CC_STATUS_MDEP_INIT)
 
-#endif
-
 #endif /* GCC_CONDITIONS_H */
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 483eacb..c1974bb 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -3541,7 +3541,6 @@ prev_active_insn (rtx uncast_insn)
   return insn;
 }
 
-#ifdef HAVE_cc0
 /* Return the next insn that uses CC0 after INSN, which is assumed to
set it.  This is the inverse of prev_cc0_setter (i.e., prev_cc0_setter
applied to the result of this function should yield INSN).
@@ -3589,7 +3588,6 @@ prev_cc0_setter (rtx uncast_insn)
 
   return insn;
 }
-#endif
 
 #ifdef AUTO_INC_DEC
 /* Find a RTX_AUTOINC class rtx which matches DATA.  */
diff --git a/gcc/final.c b/gcc/final.c
index 1fa93d9..41f6bd9 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -191,7 +191,6 @@ static rtx last_ignored_compare = 0;
 
 static int insn_counter = 0;
 
-#ifdef HAVE_cc0
 /* This variable contains machine-dependent flags (defined in tm.h)
set and examined by output routines
that describe how to interpret the condition codes properly.  */
@@ -202,7 +201,6 @@ CC_STATUS cc_status;
from before the insn.  */
 
 CC_STATUS cc_prev_status;
-#endif
 
 /* Number of unmatched NOTE_INSN_BLOCK_BEG notes we have seen.  */
 
diff --git a/gcc/jump.c b/gcc/jump.c
index 34b3b7b..bc91550 100644
--- a/gcc/jump.c
+++ b/gcc/jump.c
@@ -1044,8 +1044,6 @@ jump_to_label_p (const rtx_insn *insn)
  && JUMP_LABEL (insn) != NULL && !ANY_RETURN_P (JUMP_LABEL (insn)));
 }
 
-#ifdef HAVE_cc0
-
 /* Return nonzero if X is an RTX that only sets the condition codes
and has no side effects.  */
 
@@ -1094,7 +1092,6 @@ sets_cc0_p (const_rtx x)
 }
   return 0;
 }
-#endif
 
 /* Find all CODE_LABELs referred to in X, and increment their use
counts.  If INSN is a JUMP_INSN and there is at least one
diff --git a/gcc/recog.c b/gcc/recog.c
index a9d3b1f..c3ad86f 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -971,7 +971,6 @@ validate_simplify_insn (rtx insn)
   return ((num_changes_pending () > 0) && (apply_change_group () > 0));
 }
 
-#ifdef HAVE_cc0
 /* Return 1 if the insn using CC0 set by INSN does not contain
any ordered tests applied to the condition codes.
EQ and NE tests do not count.  */
@@ -988,7 +987,6 @@ next_insn_tests_no_inequality (rtx insn)
   return (INSN_P (next)
  && ! inequality_comparisons_p (PATTERN (next)));
 }
-#endif
 
 /* Return 1 if OP is a valid general operand for machine mode MODE.
This is either a register reference, a memory reference,
diff --git a/gcc/recog.h b/gcc/recog.h
index 45ea671..8a38b26 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -112,9 +112,7 @@ extern void validate_replace_rtx_group (rtx, rtx, rtx);
 extern void validate_replace_src_group (rtx, rtx, rtx);
 extern bool validate_simplify_insn (rtx insn);
 extern int num_changes_pending (void);
-#ifdef HAVE_cc0
 extern int next_insn_tests_no_inequality (rtx);
-#endif
 extern bool reg_fits_class_p (const_rtx, reg_class_t, int, machine_mode);
 
 extern int offsettable_memref_p (rtx);
diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
index 5434831..31de6be 100644
--- a/gcc/sched-deps.c
+++ b/gcc/sched-deps.c
@@ -2608,8 +2608,10 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, rtx_insn *insn)
 
   return;
 
-#ifdef HAVE_cc0
 case CC0:
+#ifndef HAVE_cc0
+  gcc_unreachable ();
+#endif
   /* User of CC0 depends on immediately preceding insn.  */
   SCHED_GROUP_P (insn) = 1;
/* Don't move CC0 setter to another block (it can set up the
@@ -2620,7 +2622,6 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, rtx_insn *insn)
sched_deps_info->finish_rhs ();
 
   return;
-#endif
 
 case R

[PATCH 05/12] make some HAVE_cc0 code always compiled

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* cfgrtl.c (rtl_merge_blocks): Change #if HAVE_cc0 to if (HAVE_cc0).
(try_redirect_by_replacing_jump): Likewise.
(rtl_tidy_fallthru_edge): Likewise.
* combine.c (insn_a_feeds_b): Likewise.
(find_split_point): Likewise.
(simplify_set): Likewise.
* cprop.c (cprop_jump): Likewise.
* cse.c (cse_extended_basic_block): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* function.c (emit_use_return_register_into_block): Likewise.
* haifa-sched.c (sched_init): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* loop-invariant.c (find_invariant_insn): Likewise.
* lra-constraints.c (curr_insn_transform): Likewise.
* postreload.c (reload_combine_recognize_const_pattern):
Likewise.
* reload.c (find_reloads): Likewise.
* reorg.c (delete_scheduled_jump): Likewise.
(steal_delay_list_from_target): Likewise.
(steal_delay_list_from_fallthrough): Likewise.
(redundant_insn): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(delete_computation): Likewise.
* sched-rgn.c (add_branch_dependences): Likewise.
---
 gcc/cfgrtl.c  | 12 +++-
 gcc/combine.c | 10 ++
 gcc/cprop.c   |  4 +---
 gcc/cse.c |  4 +---
 gcc/df-problems.c |  4 +---
 gcc/function.c|  5 ++---
 gcc/haifa-sched.c |  3 +--
 gcc/ira.c |  5 ++---
 gcc/loop-invariant.c  |  4 +---
 gcc/lra-constraints.c |  6 ++
 gcc/postreload.c  |  4 +---
 gcc/reload.c  | 10 +++---
 gcc/reorg.c   | 32 
 gcc/sched-rgn.c   |  4 +---
 14 files changed, 29 insertions(+), 78 deletions(-)

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 4c1708f..d93a49e 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -893,10 +893,9 @@ rtl_merge_blocks (basic_block a, basic_block b)
 
   del_first = a_end;
 
-#if HAVE_cc0
   /* If this was a conditional jump, we need to also delete
 the insn that set cc0.  */
-  if (only_sets_cc0_p (prev))
+  if (HAVE_cc0 && only_sets_cc0_p (prev))
{
  rtx_insn *tmp = prev;
 
@@ -905,7 +904,6 @@ rtl_merge_blocks (basic_block a, basic_block b)
prev = BB_HEAD (a);
  del_first = tmp;
}
-#endif
 
   a_end = PREV_INSN (del_first);
 }
@@ -1064,11 +1062,9 @@ try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout)
   /* In case we zap a conditional jump, we'll need to kill
  the cc0 setter too.  */
   kill_from = insn;
-#if HAVE_cc0
-  if (reg_mentioned_p (cc0_rtx, PATTERN (insn))
+  if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, PATTERN (insn))
   && only_sets_cc0_p (PREV_INSN (insn)))
 kill_from = PREV_INSN (insn);
-#endif
 
   /* See if we can create the fallthru edge.  */
   if (in_cfglayout || can_fallthru (src, target))
@@ -1825,12 +1821,10 @@ rtl_tidy_fallthru_edge (edge e)
  delete_insn (table);
}
 
-#if HAVE_cc0
   /* If this was a conditional jump, we need to also delete
 the insn that set cc0.  */
-  if (any_condjump_p (q) && only_sets_cc0_p (PREV_INSN (q)))
+  if (HAVE_cc0 && any_condjump_p (q) && only_sets_cc0_p (PREV_INSN (q)))
q = PREV_INSN (q);
-#endif
 
   q = PREV_INSN (q);
 }
diff --git a/gcc/combine.c b/gcc/combine.c
index 430084e..d71f863 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1141,10 +1141,8 @@ insn_a_feeds_b (rtx_insn *a, rtx_insn *b)
   FOR_EACH_LOG_LINK (links, b)
 if (links->insn == a)
   return true;
-#if HAVE_cc0
-  if (sets_cc0_p (a))
+  if (HAVE_cc0 && sets_cc0_p (a))
 return true;
-#endif
   return false;
 }
 
@@ -4816,7 +4814,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
   break;
 
 case SET:
-#if HAVE_cc0
   /* If SET_DEST is CC0 and SET_SRC is not an operand, a COMPARE, or a
 ZERO_EXTRACT, the most likely reason why this doesn't match is that
 we need to put the operand into a register.  So split at that
@@ -4829,7 +4826,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
  && ! (GET_CODE (SET_SRC (x)) == SUBREG
&& OBJECT_P (SUBREG_REG (SET_SRC (x)
return &SET_SRC (x);
-#endif
 
   /* See if we can split SET_SRC as it stands.  */
   split = find_split_point (&SET_SRC (x), insn, true);
@@ -6582,13 +6578,12 @@ simplify_set (rtx x)
   else
compare_mode = SELECT_CC_MODE (new_code, op0, op1);
 
-#if !HAVE_cc0
   /* If the mode changed, we have to change SET_DEST, the mode in the
 compare, and the mode in the place SET_DEST is used.  If SET_DEST is
 a hard register, just build new versions with the proper mode.  If it
 is a pseudo, we lose unless it is only time we set the pseudo, in
 

[PATCH 04/12] always define HAVE_cc0

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* genconfig.c (main): Always define HAVE_cc0.
* caller-save.c (insert_one_insn): Change ifdef HAVE_cc0 to #if
HAVE_cc0.
* cfgcleanup.c (flow_find_cross_jump): Likewise.
(flow_find_head_matching_sequence): Likewise.
(try_head_merge_bb): Likewise.
* cfgrtl.c (rtl_merge_blocks): Likewise.
(try_redirect_by_replacing_jump): Likewise.
(rtl_tidy_fallthru_edge): Likewise.
* combine.c (do_SUBST_MODE): Likewise.
(insn_a_feeds_b): Likewise.
(combine_instructions): Likewise.
(can_combine_p): Likewise.
(try_combine): Likewise.
(find_split_point): Likewise.
(subst): Likewise.
(simplify_set): Likewise.
(distribute_notes): Likewise.
* cprop.c (cprop_jump): Likewise.
* cse.c (cse_extended_basic_block): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* final.c (final): Likewise.
(final_scan_insn): Likewise.
* function.c (emit_use_return_register_into_block): Likewise.
* gcse.c (insert_insn_end_basic_block): Likewise.
* haifa-sched.c (sched_init): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* loop-invariant.c (find_invariant_insn): Likewise.
* lra-constraints.c (curr_insn_transform): Likewise.
* optabs.c (prepare_cmp_insn): Likewise.
* postreload.c (reload_combine_recognize_const_pattern):
Likewise.
* reload.c (find_reloads): Likewise.
(find_reloads_address_1): Likewise.
* reorg.c (delete_scheduled_jump): Likewise.
(steal_delay_list_from_target): Likewise.
(steal_delay_list_from_fallthrough): Likewise.
(try_merge_delay_insns): Likewise.
(redundant_insn): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(delete_computation): Likewise.
(relax_delay_slots): Likewise.
* sched-deps.c (sched_analyze_2): Likewise.
* sched-rgn.c (add_branch_dependences): Likewise.
---
 gcc/caller-save.c |  2 +-
 gcc/cfgcleanup.c  | 12 ++--
 gcc/cfgrtl.c  |  6 +++---
 gcc/combine.c | 36 ++--
 gcc/cprop.c   |  2 +-
 gcc/cse.c |  2 +-
 gcc/df-problems.c |  4 ++--
 gcc/final.c   | 14 +++---
 gcc/function.c|  2 +-
 gcc/gcse.c|  2 +-
 gcc/genconfig.c   |  1 +
 gcc/haifa-sched.c |  2 +-
 gcc/ira.c |  4 ++--
 gcc/loop-invariant.c  |  2 +-
 gcc/lra-constraints.c |  2 +-
 gcc/optabs.c  |  2 +-
 gcc/postreload.c  |  2 +-
 gcc/reload.c  |  6 +++---
 gcc/reorg.c   | 30 +++---
 gcc/sched-deps.c  |  2 +-
 gcc/sched-rgn.c   |  2 +-
 21 files changed, 69 insertions(+), 68 deletions(-)

diff --git a/gcc/caller-save.c b/gcc/caller-save.c
index 3b01941..fc575eb 100644
--- a/gcc/caller-save.c
+++ b/gcc/caller-save.c
@@ -1400,7 +1400,7 @@ insert_one_insn (struct insn_chain *chain, int before_p, int code, rtx pat)
   rtx_insn *insn = chain->insn;
   struct insn_chain *new_chain;
 
-#ifdef HAVE_cc0
+#if HAVE_cc0
   /* If INSN references CC0, put our insns in front of the insn that sets
  CC0.  This is always safe, since the only way we could be passed an
  insn that references CC0 is for a restore, and doing a restore earlier
diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
index cee152e..58d235e 100644
--- a/gcc/cfgcleanup.c
+++ b/gcc/cfgcleanup.c
@@ -1416,7 +1416,7 @@ flow_find_cross_jump (basic_block bb1, basic_block bb2, rtx_insn **f1,
   i2 = PREV_INSN (i2);
 }
 
-#ifdef HAVE_cc0
+#if HAVE_cc0
   /* Don't allow the insn after a compare to be shared by
  cross-jumping unless the compare is also shared.  */
   if (ninsns && reg_mentioned_p (cc0_rtx, last1) && ! sets_cc0_p (last1))
@@ -1539,7 +1539,7 @@ flow_find_head_matching_sequence (basic_block bb1, basic_block bb2, rtx_insn **f
   i2 = NEXT_INSN (i2);
 }
 
-#ifdef HAVE_cc0
+#if HAVE_cc0
   /* Don't allow a compare to be shared by cross-jumping unless the insn
  after the compare is also shared.  */
   if (ninsns && reg_mentioned_p (cc0_rtx, last1) && sets_cc0_p (last1))
@@ -2330,7 +2330,7 @@ try_head_merge_bb (basic_block bb)
   cond = get_condition (jump, &move_before, true, false);
   if (cond == NULL_RTX)
 {
-#ifdef HAVE_cc0
+#if HAVE_cc0
   if (reg_mentioned_p (cc0_rtx, jump))
move_before = prev_nonnote_nondebug_insn (jump);
   else
@@ -2499,7 +2499,7 @@ try_head_merge_bb (basic_block bb)
   cond = get_condition (jump, &move_before, true, false);
   if (cond == NULL_RTX)
{
-#ifdef HAVE_cc0
+#if HAVE_cc0
  if (reg_mentioned_p (cc0_rtx, jump))
move_before = prev_nonnote_nondebug_insn (jump);
  else
@@ -2522,7 +2522,7 @@ try_head_merge_bb

[PATCH 03/12] more removal of ifdef HAVE_cc0

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* combine.c (find_single_use): Remove HAVE_cc0 ifdef for code
that is trivially dead on non cc0 targets.
(simplify_set): Likewise.
(mark_used_regs_combine): Likewise.
* cse.c (new_basic_block): Likewise.
(fold_rtx): Likewise.
(cse_insn): Likewise.
(cse_extended_basic_block): Likewise.
(set_live_p): Likewise.
* rtlanal.c (canonicalize_condition): Likewise.
* simplify-rtx.c (simplify_binary_operation_1): Likewise.
---
 gcc/combine.c  |  6 --
 gcc/cse.c  | 18 --
 gcc/rtlanal.c  |  2 --
 gcc/simplify-rtx.c |  5 ++---
 4 files changed, 2 insertions(+), 29 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 46cd6db..0a35b8f 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -686,7 +686,6 @@ find_single_use (rtx dest, rtx_insn *insn, rtx_insn **ploc)
   rtx *result;
   struct insn_link *link;
 
-#ifdef HAVE_cc0
   if (dest == cc0_rtx)
 {
   next = NEXT_INSN (insn);
@@ -699,7 +698,6 @@ find_single_use (rtx dest, rtx_insn *insn, rtx_insn **ploc)
*ploc = next;
   return result;
 }
-#endif
 
   if (!REG_P (dest))
 return 0;
@@ -6724,7 +6722,6 @@ simplify_set (rtx x)
   src = SET_SRC (x), dest = SET_DEST (x);
 }
 
-#ifdef HAVE_cc0
   /* If we have (set (cc0) (subreg ...)), we try to remove the subreg
  in SRC.  */
   if (dest == cc0_rtx
@@ -6744,7 +6741,6 @@ simplify_set (rtx x)
  src = SET_SRC (x);
}
 }
-#endif
 
 #ifdef LOAD_EXTEND_OP
   /* If we have (set FOO (subreg:M (mem:N BAR) 0)) with M wider than N, this
@@ -13193,11 +13189,9 @@ mark_used_regs_combine (rtx x)
 case ADDR_VEC:
 case ADDR_DIFF_VEC:
 case ASM_INPUT:
-#ifdef HAVE_cc0
 /* CC0 must die in the insn after it is set, so we don't need to take
special note of it here.  */
 case CC0:
-#endif
   return;
 
 case CLOBBER:
diff --git a/gcc/cse.c b/gcc/cse.c
index 2a33827..d184d27 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -281,7 +281,6 @@ struct qty_table_elem
 /* The table of all qtys, indexed by qty number.  */
 static struct qty_table_elem *qty_table;
 
-#ifdef HAVE_cc0
 /* For machines that have a CC0, we do not record its value in the hash
table since its use is guaranteed to be the insn immediately following
its definition and any other insn is presumed to invalidate it.
@@ -293,7 +292,6 @@ static struct qty_table_elem *qty_table;
 
 static rtx this_insn_cc0, prev_insn_cc0;
 static machine_mode this_insn_cc0_mode, prev_insn_cc0_mode;
-#endif
 
 /* Insn being scanned.  */
 
@@ -884,9 +882,7 @@ new_basic_block (void)
}
 }
 
-#ifdef HAVE_cc0
   prev_insn_cc0 = 0;
-#endif
 }
 
 /* Say that register REG contains a quantity in mode MODE not in any
@@ -3166,10 +3162,8 @@ fold_rtx (rtx x, rtx_insn *insn)
 case EXPR_LIST:
   return x;
 
-#ifdef HAVE_cc0
 case CC0:
   return prev_insn_cc0;
-#endif
 
 case ASM_OPERANDS:
   if (insn)
@@ -3223,7 +3217,6 @@ fold_rtx (rtx x, rtx_insn *insn)
const_arg = folded_arg;
break;
 
-#ifdef HAVE_cc0
  case CC0:
/* The cc0-user and cc0-setter may be in different blocks if
   the cc0-setter potentially traps.  In that case PREV_INSN_CC0
@@ -3247,7 +3240,6 @@ fold_rtx (rtx x, rtx_insn *insn)
const_arg = equiv_constant (folded_arg);
  }
break;
-#endif
 
  default:
folded_arg = fold_rtx (folded_arg, insn);
@@ -4522,11 +4514,9 @@ cse_insn (rtx_insn *insn)
 sets = XALLOCAVEC (struct set, XVECLEN (x, 0));
 
   this_insn = insn;
-#ifdef HAVE_cc0
   /* Records what this insn does to set CC0.  */
   this_insn_cc0 = 0;
   this_insn_cc0_mode = VOIDmode;
-#endif
 
   /* Find all regs explicitly clobbered in this insn,
  to ensure they are not replaced with any other regs
@@ -5541,7 +5531,6 @@ cse_insn (rtx_insn *insn)
}
}
 
-#ifdef HAVE_cc0
   /* If setting CC0, record what it was set to, or a constant, if it
 is equivalent to a constant.  If it is being set to a floating-point
 value, make a COMPARE with the appropriate constant of 0.  If we
@@ -5556,7 +5545,6 @@ cse_insn (rtx_insn *insn)
this_insn_cc0 = gen_rtx_COMPARE (VOIDmode, this_insn_cc0,
 CONST0_RTX (mode));
}
-#endif
 }
 
   /* Now enter all non-volatile source expressions in the hash table
@@ -6604,11 +6592,9 @@ cse_extended_basic_block (struct cse_basic_block_data *ebb_data)
  record_jump_equiv (insn, taken);
}
 
-#ifdef HAVE_cc0
   /* Clear the CC0-tracking related insns, they can't provide
 useful information across basic block boundaries.  */
   prev_insn_cc0 = 0;
-#endif
 }
 
   gcc_assert (next_qty <= max_qty);
@@ -6859,21 +6845,17 @@ static bool
 set_live_p (rtx set, rtx_in

[PATCH 06/12] provide default for RETURN_ADDR_OFFSET

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (RETURN_ADDR_OFFSET): New definition.
* except.c (expand_builtin_extract_return_addr): Remove ifdef
RETURN_ADDR_OFFSET.
(expand_builtin_frob_return_addr): Likewise.
---
 gcc/defaults.h |  5 +
 gcc/except.c   | 14 +++---
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 911c2f8..767901a 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -383,6 +383,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM
 #endif
 
+/* Offset between the eh handler address and entry in eh tables.  */
+#ifndef RETURN_ADDR_OFFSET
+#define RETURN_ADDR_OFFSET 0
+#endif
+
 /* If we have named section and we support weak symbols, then use the
.jcr section for recording java classes which need to be registered
at program start-up time.  */
diff --git a/gcc/except.c b/gcc/except.c
index 7573c88..c98163d 100644
--- a/gcc/except.c
+++ b/gcc/except.c
@@ -2189,9 +2189,8 @@ expand_builtin_extract_return_addr (tree addr_tree)
 #endif
 
   /* Then adjust to find the real return address.  */
-#if defined (RETURN_ADDR_OFFSET)
-  addr = plus_constant (Pmode, addr, RETURN_ADDR_OFFSET);
-#endif
+  if (RETURN_ADDR_OFFSET)
+addr = plus_constant (Pmode, addr, RETURN_ADDR_OFFSET);
 
   return addr;
 }
@@ -2207,10 +2206,11 @@ expand_builtin_frob_return_addr (tree addr_tree)
 
   addr = convert_memory_address (Pmode, addr);
 
-#ifdef RETURN_ADDR_OFFSET
-  addr = force_reg (Pmode, addr);
-  addr = plus_constant (Pmode, addr, -RETURN_ADDR_OFFSET);
-#endif
+  if (RETURN_ADDR_OFFSET)
+{
+  addr = force_reg (Pmode, addr);
+  addr = plus_constant (Pmode, addr, -RETURN_ADDR_OFFSET);
+}
 
   return addr;
 }
-- 
2.3.0.80.g18d0fec.dirty



[PATCH 07/12] provide default for MASK_RETURN_ADDR

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (MASK_RETURN_ADDR): New definition.
* except.c (expand_builtin_extract_return_addr): Remove ifdef
MASK_RETURN_ADDR.
---
 gcc/defaults.h | 4 
 gcc/except.c   | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 767901a..843d7e2 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -388,6 +388,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define RETURN_ADDR_OFFSET 0
 #endif
 
+#ifndef MASK_RETURN_ADDR
+#define MASK_RETURN_ADDR NULL_RTX
+#endif
+
 /* If we have named section and we support weak symbols, then use the
.jcr section for recording java classes which need to be registered
at program start-up time.  */
diff --git a/gcc/except.c b/gcc/except.c
index c98163d..5b24006 100644
--- a/gcc/except.c
+++ b/gcc/except.c
@@ -2184,9 +2184,9 @@ expand_builtin_extract_return_addr (tree addr_tree)
 }
 
   /* First mask out any unwanted bits.  */
-#ifdef MASK_RETURN_ADDR
-  expand_and (Pmode, addr, MASK_RETURN_ADDR, addr);
-#endif
+  rtx mask = MASK_RETURN_ADDR;
+  if (mask)
+expand_and (Pmode, addr, mask, addr);
 
   /* Then adjust to find the real return address.  */
   if (RETURN_ADDR_OFFSET)
-- 
2.3.0.80.g18d0fec.dirty



[PATCH 08/12] reduce conditional compilation for HARD_FRAME_POINTER_IS_FRAME_POINTER

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* alias.c (init_alias_target): Remove ifdef
HARD_FRAME_POINTER_IS_FRAME_POINTER.
* df-scan.c (df_insn_refs_collect): Likewise.
(df_get_regular_block_artificial_uses): Likewise.
(df_get_eh_block_artificial_uses): Likewise.
(df_get_entry_block_def_set): Likewise.
(df_get_exit_block_use_set): Likewise.
* emit-rtl.c (gen_rtx_REG): Likewise.
* ira.c (ira_setup_eliminable_regset): Likewise.
* reginfo.c (init_reg_sets_1): Likewise.
* regrename.c (rename_chains): Likewise.
* reload1.c (reload): Likewise.
(eliminate_regs_in_insn): Likewise.
* resource.c (mark_referenced_resources): Likewise.
(init_resource_info): Likewise.
---
 gcc/alias.c |  7 +++
 gcc/df-scan.c   | 35 +--
 gcc/emit-rtl.c  |  6 +++---
 gcc/ira.c   | 23 ---
 gcc/reginfo.c   |  5 ++---
 gcc/regrename.c |  5 ++---
 gcc/reload1.c   | 10 --
 gcc/resource.c  | 11 +--
 8 files changed, 48 insertions(+), 54 deletions(-)

diff --git a/gcc/alias.c b/gcc/alias.c
index a7160f3..8f48660 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2765,10 +2765,9 @@ init_alias_target (void)
 = unique_base_value (UNIQUE_BASE_VALUE_ARGP);
   static_reg_base_value[FRAME_POINTER_REGNUM]
 = unique_base_value (UNIQUE_BASE_VALUE_FP);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  static_reg_base_value[HARD_FRAME_POINTER_REGNUM]
-= unique_base_value (UNIQUE_BASE_VALUE_HFP);
-#endif
+  if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
+static_reg_base_value[HARD_FRAME_POINTER_REGNUM]
+  = unique_base_value (UNIQUE_BASE_VALUE_HFP);
 }
 
 /* Set MEMORY_MODIFIED when X modifies DATA (that is assumed
diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index b2e2e5d..69332a8 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3247,12 +3247,11 @@ df_insn_refs_collect (struct df_collection_rec *collection_rec,
  regno_reg_rtx[FRAME_POINTER_REGNUM],
  NULL, bb, insn_info,
  DF_REF_REG_USE, 0);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  df_ref_record (DF_REF_BASE, collection_rec,
- regno_reg_rtx[HARD_FRAME_POINTER_REGNUM],
- NULL, bb, insn_info,
- DF_REF_REG_USE, 0);
-#endif
+ if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
+   df_ref_record (DF_REF_BASE, collection_rec,
+  regno_reg_rtx[HARD_FRAME_POINTER_REGNUM],
+  NULL, bb, insn_info,
+  DF_REF_REG_USE, 0);
   break;
 default:
   break;
@@ -3442,9 +3441,9 @@ df_get_regular_block_artificial_uses (bitmap regular_block_artificial_uses)
 reference of the frame pointer.  */
   bitmap_set_bit (regular_block_artificial_uses, FRAME_POINTER_REGNUM);
 
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  bitmap_set_bit (regular_block_artificial_uses, HARD_FRAME_POINTER_REGNUM);
-#endif
+  if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
+   bitmap_set_bit (regular_block_artificial_uses,
+   HARD_FRAME_POINTER_REGNUM);
 
 #if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
   /* Pseudos with argument area equivalences may require
@@ -3494,9 +3493,9 @@ df_get_eh_block_artificial_uses (bitmap eh_block_artificial_uses)
   if (frame_pointer_needed)
{
  bitmap_set_bit (eh_block_artificial_uses, FRAME_POINTER_REGNUM);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
- bitmap_set_bit (eh_block_artificial_uses, HARD_FRAME_POINTER_REGNUM);
-#endif
+ if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
+   bitmap_set_bit (eh_block_artificial_uses,
+   HARD_FRAME_POINTER_REGNUM);
}
 #if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
   if (fixed_regs[ARG_POINTER_REGNUM])
@@ -3580,11 +3579,11 @@ df_get_entry_block_def_set (bitmap entry_block_defs)
   /* Any reference to any pseudo before reload is a potential
 reference of the frame pointer.  */
   bitmap_set_bit (entry_block_defs, FRAME_POINTER_REGNUM);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
+
   /* If they are different, also mark the hard frame pointer as live.  */
-  if (!LOCAL_REGNO (HARD_FRAME_POINTER_REGNUM))
+  if (!HARD_FRAME_POINTER_IS_FRAME_POINTER
+ && !LOCAL_REGNO (HARD_FRAME_POINTER_REGNUM))
bitmap_set_bit (entry_block_defs, HARD_FRAME_POINTER_REGNUM);
-#endif
 }
 
   /* These registers are live everywhere.  */
@@ -3718,11 +3717,11 @@ df_get_exit_block_use_set (bitmap exit_block_uses)
   if ((!reload_completed) || frame_pointer_needed)
 {
   bitmap_set_bit (exit_block_uses, FRAME_POINTER_REGNUM);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
+
   /* If they are different, also mark the hard frame pointer as live.  */
-  if (

[PATCH 10/12] remove more ifdefs for HAVE_cc0

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* caller-save.c (insert_one_insn): Remove ifdef HAVE_cc0.
* cfgcleanup.c (flow_find_cross_jump): Likewise.
(flow_find_head_matching_sequence): Likewise.
(try_head_merge_bb): Likewise.
* combine.c (can_combine_p): Likewise.
(try_combine): Likewise.
(distribute_notes): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* final.c (final): Likewise.
* gcse.c (insert_insn_end_basic_block): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* reorg.c (try_merge_delay_insns): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
* sched-deps.c (sched_analyze_2): Likewise.
---
 gcc/caller-save.c |  4 +---
 gcc/cfgcleanup.c  | 26 --
 gcc/combine.c | 54 +-
 gcc/df-problems.c |  5 +
 gcc/final.c   | 29 ++---
 gcc/gcse.c| 24 +---
 gcc/ira.c |  5 +
 gcc/reorg.c   | 26 +++---
 gcc/sched-deps.c  |  6 +++---
 9 files changed, 69 insertions(+), 110 deletions(-)

diff --git a/gcc/caller-save.c b/gcc/caller-save.c
index fc575eb..76c3a7e 100644
--- a/gcc/caller-save.c
+++ b/gcc/caller-save.c
@@ -1400,18 +1400,16 @@ insert_one_insn (struct insn_chain *chain, int before_p, int code, rtx pat)
   rtx_insn *insn = chain->insn;
   struct insn_chain *new_chain;
 
-#if HAVE_cc0
   /* If INSN references CC0, put our insns in front of the insn that sets
  CC0.  This is always safe, since the only way we could be passed an
  insn that references CC0 is for a restore, and doing a restore earlier
  isn't a problem.  We do, however, assume here that CALL_INSNs don't
  reference CC0.  Guard against non-INSN's like CODE_LABEL.  */
 
-  if ((NONJUMP_INSN_P (insn) || JUMP_P (insn))
+  if (HAVE_cc0 && (NONJUMP_INSN_P (insn) || JUMP_P (insn))
   && before_p
   && reg_referenced_p (cc0_rtx, PATTERN (insn)))
 chain = chain->prev, insn = chain->insn;
-#endif
 
   new_chain = new_insn_chain ();
   if (before_p)
diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
index 58d235e..e5c4747 100644
--- a/gcc/cfgcleanup.c
+++ b/gcc/cfgcleanup.c
@@ -1416,12 +1416,11 @@ flow_find_cross_jump (basic_block bb1, basic_block bb2, rtx_insn **f1,
   i2 = PREV_INSN (i2);
 }
 
-#if HAVE_cc0
   /* Don't allow the insn after a compare to be shared by
  cross-jumping unless the compare is also shared.  */
-  if (ninsns && reg_mentioned_p (cc0_rtx, last1) && ! sets_cc0_p (last1))
+  if (HAVE_cc0 && ninsns && reg_mentioned_p (cc0_rtx, last1)
+  && ! sets_cc0_p (last1))
 last1 = afterlast1, last2 = afterlast2, last_dir = afterlast_dir, ninsns--;
-#endif
 
   /* Include preceding notes and labels in the cross-jump.  One,
  this may bring us to the head of the blocks as requested above.
@@ -1539,12 +1538,11 @@ flow_find_head_matching_sequence (basic_block bb1, basic_block bb2, rtx_insn **f
   i2 = NEXT_INSN (i2);
 }
 
-#if HAVE_cc0
   /* Don't allow a compare to be shared by cross-jumping unless the insn
  after the compare is also shared.  */
-  if (ninsns && reg_mentioned_p (cc0_rtx, last1) && sets_cc0_p (last1))
+  if (HAVE_cc0 && ninsns && reg_mentioned_p (cc0_rtx, last1)
+  && sets_cc0_p (last1))
 last1 = beforelast1, last2 = beforelast2, ninsns--;
-#endif
 
   if (ninsns)
 {
@@ -2330,11 +2328,9 @@ try_head_merge_bb (basic_block bb)
   cond = get_condition (jump, &move_before, true, false);
   if (cond == NULL_RTX)
 {
-#if HAVE_cc0
-  if (reg_mentioned_p (cc0_rtx, jump))
+  if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, jump))
move_before = prev_nonnote_nondebug_insn (jump);
   else
-#endif
move_before = jump;
 }
 
@@ -2499,11 +2495,9 @@ try_head_merge_bb (basic_block bb)
   cond = get_condition (jump, &move_before, true, false);
   if (cond == NULL_RTX)
{
-#if HAVE_cc0
- if (reg_mentioned_p (cc0_rtx, jump))
+ if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, jump))
move_before = prev_nonnote_nondebug_insn (jump);
  else
-#endif
move_before = jump;
}
 }
@@ -2522,12 +2516,10 @@ try_head_merge_bb (basic_block bb)
  /* Try again, using a different insertion point.  */
  move_before = jump;
 
-#if HAVE_cc0
  /* Don't try moving before a cc0 user, as that may invalidate
 the cc0.  */
- if (reg_mentioned_p (cc0_rtx, jump))
+ if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, jump))
break;
-#endif
 
  continue;
}
@@ -2582,12 +2574,10 @@ try_head_merge_bb (basic_block bb)
  /* For the unmerged insns, try a different insertion point.  */
  move_before = jump;
 
-#if HAVE_cc0
  /* Don't try moving before a cc0 user,

[PATCH 09/12] remove #if for PIC_OFFSET_TABLE_REGNUM

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* df-scan.c (df_get_entry_block_def_set): Remove #ifdef
PIC_OFFSET_TABLE_REGNUM.
---
 gcc/df-scan.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 69332a8..4232ec8 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3589,10 +3589,6 @@ df_get_entry_block_def_set (bitmap entry_block_defs)
   /* These registers are live everywhere.  */
   if (!reload_completed)
 {
-#ifdef PIC_OFFSET_TABLE_REGNUM
-  unsigned int picreg = PIC_OFFSET_TABLE_REGNUM;
-#endif
-
 #if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
   /* Pseudos with argument area equivalences may require
 reloading via the argument pointer.  */
@@ -3600,13 +3596,12 @@ df_get_entry_block_def_set (bitmap entry_block_defs)
bitmap_set_bit (entry_block_defs, ARG_POINTER_REGNUM);
 #endif
 
-#ifdef PIC_OFFSET_TABLE_REGNUM
   /* Any constant, or pseudo with constant equivalences, may
 require reloading from memory using the pic register.  */
+  unsigned int picreg = PIC_OFFSET_TABLE_REGNUM;
   if (picreg != INVALID_REGNUM
  && fixed_regs[picreg])
bitmap_set_bit (entry_block_defs, picreg);
-#endif
 }
 
 #ifdef INCOMING_RETURN_ADDR_RTX
-- 
2.3.0.80.g18d0fec.dirty



[PATCH 12/12] add default for INSN_REFERENCES_ARE_DELAYED

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (INSN_REFERENCES_ARE_DELAYED): New definition.
* reorg.c (redundant_insn): Remove ifdef
INSN_REFERENCES_ARE_DELAYED.
* resource.c (mark_referenced_resources): Likewise.
---
 gcc/defaults.h | 4 
 gcc/reorg.c| 4 
 gcc/resource.c | 2 --
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 79cb599..cafcb1e 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1205,6 +1205,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define INSN_SETS_ARE_DELAYED(INSN) false
 #endif
 
+#ifndef INSN_REFERENCES_ARE_DELAYED
+#define INSN_REFERENCES_ARE_DELAYED(INSN) false
+#endif
+
 #ifdef GCC_INSN_FLAGS_H
 /* Dependent default target macro definitions
 
diff --git a/gcc/reorg.c b/gcc/reorg.c
index ae77f0a..d8d8ab69 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -1558,10 +1558,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
  if (INSN_SETS_ARE_DELAYED (seq->insn (0)))
return 0;
 
-#ifdef INSN_REFERENCES_ARE_DELAYED
  if (INSN_REFERENCES_ARE_DELAYED (seq->insn (0)))
return 0;
-#endif
 
  /* See if any of the insns in the delay slot match, updating
 resource requirements as we go.  */
@@ -1658,10 +1656,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
  if (INSN_SETS_ARE_DELAYED (control))
return 0;
 
-#ifdef INSN_REFERENCES_ARE_DELAYED
  if (INSN_REFERENCES_ARE_DELAYED (control))
return 0;
-#endif
 
  if (JUMP_P (control))
annul_p = INSN_ANNULLED_BRANCH_P (control);
diff --git a/gcc/resource.c b/gcc/resource.c
index 5af9376..26d9fca 100644
--- a/gcc/resource.c
+++ b/gcc/resource.c
@@ -392,11 +392,9 @@ mark_referenced_resources (rtx x, struct resources *res,
  include_delayed_effects
  ? MARK_SRC_DEST_CALL : MARK_SRC_DEST);
 
-#ifdef INSN_REFERENCES_ARE_DELAYED
   if (! include_delayed_effects
  && INSN_REFERENCES_ARE_DELAYED (as_a <rtx_insn *> (x)))
return;
-#endif
 
   /* No special processing, just speed up.  */
   mark_referenced_resources (PATTERN (x), res, include_delayed_effects);
-- 
2.3.0.80.g18d0fec.dirty
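
For readers following along, the pattern used by patches 11/12 and 12/12 can be sketched in isolation. This is only an illustration under assumptions: `struct insn` and `can_track_resources` are hypothetical stand-ins, not GCC code. The point is that once a fallback definition exists, the caller no longer needs its own #ifdef.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn; the real type is rtx_insn.  */
struct insn { int uid; };

/* A target header may or may not define the macro.  The fallback below
   plays the role of the new defaults.h entry.  */
#ifndef INSN_REFERENCES_ARE_DELAYED
#define INSN_REFERENCES_ARE_DELAYED(INSN) false
#endif

/* Hypothetical caller: returns 0 when it has to give up on the insn,
   1 otherwise.  No #ifdef is needed around the check any more.  */
static int
can_track_resources (const struct insn *insn)
{
  assert (insn != NULL);
  if (INSN_REFERENCES_ARE_DELAYED (insn))
    return 0;
  return 1;
}
```

On a target that does not override the macro, the condition is a compile-time constant, so the check should cost nothing after optimization while still being parsed and type-checked in every configuration.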



[PATCH 11/12] provide default for INSN_SETS_ARE_DELAYED

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (INSN_SETS_ARE_DELAYED): New definition.
* reorg.c (redundant_insn): Remove ifdef INSN_SETS_ARE_DELAYED.
* resource.c (mark_set_resources): Likewise.
---
 gcc/defaults.h | 4 
 gcc/reorg.c| 4 
 gcc/resource.c | 2 --
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 843d7e2..79cb599 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1201,6 +1201,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define DEFAULT_PCC_STRUCT_RETURN 1
 #endif
 
+#ifndef INSN_SETS_ARE_DELAYED
+#define INSN_SETS_ARE_DELAYED(INSN) false
+#endif
+
 #ifdef GCC_INSN_FLAGS_H
 /* Dependent default target macro definitions
 
diff --git a/gcc/reorg.c b/gcc/reorg.c
index b7228f2..ae77f0a 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -1555,10 +1555,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
 slots because it is difficult to track its resource needs
 correctly.  */
 
-#ifdef INSN_SETS_ARE_DELAYED
  if (INSN_SETS_ARE_DELAYED (seq->insn (0)))
return 0;
-#endif
 
 #ifdef INSN_REFERENCES_ARE_DELAYED
  if (INSN_REFERENCES_ARE_DELAYED (seq->insn (0)))
@@ -1657,10 +1655,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
  /* If this is an INSN or JUMP_INSN with delayed effects, it
 is hard to track the resource needs properly, so give up.  */
 
-#ifdef INSN_SETS_ARE_DELAYED
  if (INSN_SETS_ARE_DELAYED (control))
return 0;
-#endif
 
 #ifdef INSN_REFERENCES_ARE_DELAYED
  if (INSN_REFERENCES_ARE_DELAYED (control))
diff --git a/gcc/resource.c b/gcc/resource.c
index 9a013b3..5af9376 100644
--- a/gcc/resource.c
+++ b/gcc/resource.c
@@ -696,11 +696,9 @@ mark_set_resources (rtx x, struct resources *res, int in_dest,
/* An insn consisting of just a CLOBBER (or USE) is just for flow
   and doesn't actually do anything, so we ignore it.  */
 
-#ifdef INSN_SETS_ARE_DELAYED
   if (mark_type != MARK_SRC_DEST_CALL
  && INSN_SETS_ARE_DELAYED (as_a <rtx_insn *> (x)))
return;
-#endif
 
   x = PATTERN (x);
   if (GET_CODE (x) != USE && GET_CODE (x) != CLOBBER)
-- 
2.3.0.80.g18d0fec.dirty



Re: [AArch64][PR65139] use clobber with match_scratch for aarch64_lshr_sisd_or_int_3

2015-04-21 Thread Richard Earnshaw
On 18/04/15 19:17, Maxim Kuvyrkov wrote:
>> On Apr 18, 2015, at 8:21 PM, Richard Earnshaw 
>>  wrote:
>>
>> On 18/04/15 16:13, Jakub Jelinek wrote:
>>> On Sat, Apr 18, 2015 at 03:07:16PM +0100, Richard Earnshaw wrote:
>>>> You need to ensure that your scratch register cannot overlap op1, since
>>>> the scratch is written before op1 is read.
>>>
>>> -   (clobber (match_scratch:QI 3 "=X,w,X"))]
>>> +   (clobber (match_scratch:QI 3 "=X,&w,X"))]
>>>
>>> incremental diff should ensure that, right?
>>>
>>> Jakub
>>>
>>
>>
>> Sorry, where in the patch is that hunk?
>>
>> I see just:
>>
>> +   (clobber (match_scratch:QI 3 "=X,w,X"))]
> 
> Jakub's suggestion is an incremental patch on top of Kugan's.
> 

Ah, sorry, I thought he was implying it was already in the patch somewhere.

>>
>> And why would early clobbering the scratch be notably better than the
>> original?
>>
> 
> It will still be better.  With this patch we want to allow RA freedom to 
> optimally handle both of the following cases:
> 
> 1. operand[1] dies after the instruction.  In this case we want operand[0] 
> and operand[1] to be assigned to the same reg, and operand[3] to be assigned 
> to a different register to provide a temporary.  In this case we don't care 
> whether operand[3] is early-clobber or not.  This case is not optimally 
> handled with current insn patterns.
> 
> 2. operand[1] lives on after the instruction.  In this case we want 
> operand[0] and operand[3] to be assigned to the same reg, and not clobber 
> operand[1].  By marking operand[3] early-clobber we ensure that operand[1] is 
> in a different register from what operand[0] and operand[3] were assigned to. 
>  This case should be handled equally well before and after the patch.
> 
> My understanding is that Kugan's patch with Jakub's fix on top satisfies both 
> of these cases.
>  

I still don't think it handles all cases efficiently.  If we really want
the result in a different register from both of the inputs, then now we
need two registers for the results, one for the result and another for
the temporary.  In that case we could have used the result register as
the scratch, but now we can't.

Maybe we can provide two alternatives, one that early-clobbers the
result register but doesn't need a scratch and one that doesn't
early-clobber the result, but does need a scratch.

So something like

(define_insn "aarch64_lshr_sisd_or_int_3"
  [(set (match_operand:GPI 0 "register_operand" "=w,&w,w,r")
 (lshiftrt:GPI
   (match_operand:GPI 1 "register_operand" "w,w,w,r")
   (match_operand:QI 2 "aarch64_reg_or_shift_imm_"
  "Us,w,w,rUs")))
   (clobber (match_scratch:QI 3 "=X,X,w,X"))]

... but I haven't tested any of that.

I would also note the conversation in
https://gcc.gnu.org/ml/gcc/2015-04/msg00240.html.  That seems to suggest
we should be wary of using scratch sequences since the register
allocator doesn't account for them properly.

R.

> --
> Maxim Kuvyrkov
> www.linaro.org
> 
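
The overlap problem discussed above can be shown with a toy model. This is a sketch under assumptions, not the actual aarch64 expansion: the register file, the register numbers, and the idea that the scratch holds a negated shift count are all hypothetical. What it does take from the thread is the ordering: the scratch is written first, and operand 1 is read afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file; the indices are hypothetical register numbers.  */
static uint32_t regs[4];

/* Models a two-step shift.  Step 1 writes the scratch, step 2 reads
   operand 1.  Assumption: the scratch holds the negated count.  */
static uint32_t
shift_right_via_scratch (int dst, int op1, int count_reg, int scratch)
{
  assert (dst >= 0 && dst < 4 && op1 >= 0 && op1 < 4);
  assert (count_reg >= 0 && count_reg < 4 && scratch >= 0 && scratch < 4);

  regs[scratch] = 0u - regs[count_reg];     /* step 1: write scratch */
  uint32_t count = 0u - regs[scratch];      /* recover the count */
  regs[dst] = regs[op1] >> (count & 31);    /* step 2: read operand 1 */
  return regs[dst];
}

/* Shift 0x80 right by 4 with the result in r0 and the count in r3,
   for a given choice of operand 1 and scratch registers.  */
static uint32_t
run (int op1, int scratch)
{
  regs[0] = regs[1] = regs[2] = regs[3] = 0;
  regs[op1] = 0x80;
  regs[3] = 4;
  return shift_right_via_scratch (0, op1, 3, scratch);
}
```

In this model `run (1, 2)` gives 8 as expected, `run (1, 1)` gives a wrong value because the scratch clobbers operand 1 before it is read, and `run (1, 0)` still gives 8 because sharing the scratch with the result register is harmless when the result does not alias operand 1. That last case is the one the extra alternative is meant to keep available.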



Re: [PATCH 11/12] provide default for INSN_SETS_ARE_DELAYED

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (INSN_SETS_ARE_DELAYED): New definition.
* reorg.c (redundant_insn): Remove ifdef INSN_SETS_ARE_DELAYED.
* resource.c (mark_set_resources): Likewise.

OK.
Jeff



Re: [PATCH 12/12] add default for INSN_REFERENCES_ARE_DELAYED

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (INSN_REFERENCES_ARE_DELAYED): New definition.
* reorg.c (redundant_insn): Remove ifdef
INSN_REFERENCES_ARE_DELAYED.
* resource.c (mark_referenced_resources): Likewise.

OK.
Jeff



Re: [PATCH 06/12] provide default for RETURN_ADDR_OFFSET

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (RETURN_ADDR_OFFSET): New definition.
* except.c (expand_builtin_extract_return_addr): Remove ifdef
RETURN_ADDR_OFFSET.
(expand_builtin_frob_return_addr): Likewise.

OK.
jeff



Re: [PATCH 07/12] provide default for MASK_RETURN_ADDR

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (MASK_RETURN_ADDR): New definition.
* except.c (expand_builtin_extract_return_addr): Remove ifdef
MASK_RETURN_ADDR.

OK.
jeff



[PATCH, i386]: Some spring cleaning in i386.h

2015-04-21 Thread Uros Bizjak
Hello!

This patch redefines various hard register numbers with ones from
i386.md. Also, the patch reshuffles some defines to group them
together in a better way.

No functional changes.

2015-04-21  Uros Bizjak  

* config/i386/i386.md (ARGP_REG, FRAME_REG, BND2_REG, BND3_REG,
FIRST_PSEUDO_REG): New.
* config/i386/i386.h (STACK_POINTER_REGNUM): Define to SP_REG.
(ARG_POINTER_REGNUM): Define to ARGP_REG.
(FRAME_POINTER_REGNUM): Define to FRAME_REG.
(HARD_FRAME_POINTER_REGNUM): Define to BP_REG.
(FIRST_PSEUDO_REGISTER): Define to FIRST_PSEUDO_REG.
(FIRST_INT_REG): New.
(LAST_INT_REG): New.
(FIRST_*_REG): Define using *_REG.
(LAST_*_REG): Ditto.
(QI_REGNO_P): Define using FIRST_QI_REG and LAST_QI_REG.
(LEGACY_INT_REGNO_P): Define using FIRST_INT_REG and LAST_INT_REG.
(FIRST_FLOAT_REG): Define to FIRST_STACK_REG.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32},
committed to mainline SVN.

Uros.
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 57)
+++ config/i386/i386.h  (working copy)
@@ -957,7 +957,7 @@ extern const char *host_detect_local_cpu (int argc
eliminated during reloading in favor of either the stack or frame
pointer.  */
 
-#define FIRST_PSEUDO_REGISTER 81
+#define FIRST_PSEUDO_REGISTER FIRST_PSEUDO_REG
 
 /* Number of hardware registers that go into the DWARF-2 unwind info.
If not defined, equals FIRST_PSEUDO_REGISTER.  */
@@ -1100,7 +1100,7 @@ extern const char *host_detect_local_cpu (int argc
|| (MODE) == V16SImode || (MODE) == V16SFmode || (MODE) == V32HImode \
|| (MODE) == V4TImode)
 
-#define VALID_AVX512VL_128_REG_MODE(MODE)  \
+#define VALID_AVX512VL_128_REG_MODE(MODE)  \
   ((MODE) == V2DImode || (MODE) == V2DFmode || (MODE) == V16QImode \
|| (MODE) == V4SImode || (MODE) == V4SFmode || (MODE) == V8HImode)
 
@@ -1121,6 +1121,10 @@ extern const char *host_detect_local_cpu (int argc
|| (MODE) == V2SImode || (MODE) == SImode   \
|| (MODE) == V4HImode || (MODE) == V8QImode)
 
+#define VALID_MASK_REG_MODE(MODE) ((MODE) == HImode || (MODE) == QImode)
+
+#define VALID_MASK_AVX512BW_MODE(MODE) ((MODE) == SImode || (MODE) == DImode)
+
 #define VALID_BND_REG_MODE(MODE) \
   (TARGET_64BIT ? (MODE) == BND64mode : (MODE) == BND32mode)
 
@@ -1150,10 +1154,16 @@ extern const char *host_detect_local_cpu (int argc
|| (MODE) == V16SImode || (MODE) == V32HImode || (MODE) == V8DFmode \
|| (MODE) == V16SFmode)
 
-#define VALID_MASK_REG_MODE(MODE) ((MODE) == HImode || (MODE) == QImode)
+#define X87_FLOAT_MODE_P(MODE) \
+  (TARGET_80387 && ((MODE) == SFmode || (MODE) == DFmode || (MODE) == XFmode))
 
-#define VALID_MASK_AVX512BW_MODE(MODE) ((MODE) == SImode || (MODE) == DImode)
+#define SSE_FLOAT_MODE_P(MODE) \
+  ((TARGET_SSE && (MODE) == SFmode) || (TARGET_SSE2 && (MODE) == DFmode))
 
+#define FMA4_VEC_FLOAT_MODE_P(MODE) \
+  (TARGET_FMA4 && ((MODE) == V4SFmode || (MODE) == V2DFmode \
+ || (MODE) == V8SFmode || (MODE) == V4DFmode))
+
 /* Value is 1 if hard register REGNO can hold a value of machine-mode MODE.  */
 
 #define HARD_REGNO_MODE_OK(REGNO, MODE)\
@@ -1198,42 +1208,46 @@ extern const char *host_detect_local_cpu (int argc
register.  The ordinary mov instructions won't work */
 /* #define PC_REGNUM  */
 
+/* Base register for access to arguments of the function.  */
+#define ARG_POINTER_REGNUM ARGP_REG
+
 /* Register to use for pushing function arguments.  */
-#define STACK_POINTER_REGNUM 7
+#define STACK_POINTER_REGNUM SP_REG
 
 /* Base register for access to local variables of the function.  */
-#define HARD_FRAME_POINTER_REGNUM 6
+#define FRAME_POINTER_REGNUM FRAME_REG
+#define HARD_FRAME_POINTER_REGNUM BP_REG
 
-/* Base register for access to local variables of the function.  */
-#define FRAME_POINTER_REGNUM 20
+#define FIRST_INT_REG AX_REG
+#define LAST_INT_REG  SP_REG
 
-/* First floating point reg */
-#define FIRST_FLOAT_REG 8
+#define FIRST_QI_REG AX_REG
+#define LAST_QI_REG  BX_REG
 
 /* First & last stack-like regs */
-#define FIRST_STACK_REG FIRST_FLOAT_REG
-#define LAST_STACK_REG (FIRST_FLOAT_REG + 7)
+#define FIRST_STACK_REG ST0_REG
+#define LAST_STACK_REG  ST7_REG
 
-#define FIRST_SSE_REG (FRAME_POINTER_REGNUM + 1)
-#define LAST_SSE_REG  (FIRST_SSE_REG + 7)
+#define FIRST_SSE_REG XMM0_REG
+#define LAST_SSE_REG  XMM7_REG
 
-#define FIRST_MMX_REG  (LAST_SSE_REG + 1)   /*29*/
-#define LAST_MMX_REG   (FIRST_MMX_REG + 7)
+#define FIRST_MMX_REG  MM0_REG
+#define LAST_MMX_REG   MM7_REG
 
-#define FIRST_REX_INT_REG  (LAST_MMX_REG + 1) /*37*/
-#define LAST_REX_INT_REG   (FIRST_REX_INT_REG + 7)
+#define FIRST_REX_INT_REG  R8_REG
+#define LAST_REX_INT_REG   R15_REG
 
-#define FIRST_REX_SSE_REG  (LAST_REX_INT_REG + 1) /*45*/
-#define LAST_REX_SSE_REG   (FIRST_REX_SSE_REG + 7)
+#define FIRST_R

Re: [PATCH 09/12] remove #if for PIC_OFFSET_TABLE_REGNUM

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* df-scan.c (df_get_entry_block_def_set): Remove #ifdef
PIC_OFFSET_TABLE_REGNUM.

OK.
jeff



Re: [PATCH 08/12] reduce conditional compilation for HARD_FRAME_POINTER_IS_FRAME_POINTER

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* alias.c (init_alias_target): Remove ifdef
HARD_FRAME_POINTER_IS_FRAME_POINTER.
* df-scan.c (df_insn_refs_collect): Likewise.
(df_get_regular_block_artificial_uses): Likewise.
(df_get_eh_block_artificial_uses): Likewise.
(df_get_entry_block_def_set): Likewise.
(df_get_exit_block_use_set): Likewise.
* emit-rtl.c (gen_rtx_REG): Likewise.
* ira.c (ira_setup_eliminable_regset): Likewise.
* reginfo.c (init_reg_sets_1): Likewise.
* regrename.c (rename_chains): Likewise.
* reload1.c (reload): Likewise.
(eliminate_regs_in_insn): Likewise.
* resource.c (mark_referenced_resources): Likewise.
(init_resource_info): Likewise.

OK.
jeff



Re: [PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h: New definition of EH_RETURN_DATA_REGNO.
* except.c: Remove definition of EH_RETURN_DATA_REGNO.
* builtins.c (expand_builtin): Remove check if
EH_RETURN_DATA_REGNO is defined.
* df-scan.c (df_bb_refs_collect): Likewise.
(df_get_exit_block_use_set): Likewise.
* haifa-sched.c (initiate_bb_reg_pressure_info): Likewise.
* ira-lives.c (process_bb_node_lives): Likewise.
* lra-lives.c (process_bb_lives): Likewise.
This one wasn't as obvious as the others, but is clearly OK once the 
full loops being guarded by EH_RETURN_DATA_REGNO are examined.


Jeff



Re: [PATCH 00/12] Reduce conditional compilation

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

Hi,

This is a first round of patches to reduce the amount of code within #if /
#ifdef.  This makes it incrementally easier to not break configs other than the
one being built, and moves things slightly closer to using target hooks for
everything.

each commit bootstrapped and regtested on x86_64-linux-gnu without regression,
and whole patch set run through config-list.mk without issue, ok?
Thanks for tackling this.  It's not particularly deep work, but I do think 
it'll help reduce the long term maintenance costs and make developers' 
lives easier.


Onward to the HAVE_cc0 patches :-)

Jeff

ps.  You hit a good window, my daughter was up late last night and 
is sleeping in a bit, so I've got unexpected time this morning before my 
meetings.




Re: [PATCH 02/12] remove some ifdef HAVE_cc0

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* conditions.h: Define macros even if HAVE_cc0 is undefined.
* emit-rtl.c: Define functions even if HAVE_cc0 is undefined.
* final.c: Likewise.
* jump.c: Likewise.
* recog.c: Likewise.
* recog.h: Declare functions even when HAVE_cc0 is undefined.
* sched-deps.c (sched_analyze_2): Always compile case for cc0.
OK.  Note for anyone else reading at home, some of the functions being 
unconditionally compiled now already had unconditional prototypes in the 
header files. So not everything needed a .h file change.


jeff



Re: [PATCH 03/12] more removal of ifdef HAVE_cc0

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* combine.c (find_single_use): Remove HAVE_cc0 ifdef for code
that is trivially dead on non cc0 targets.
(simplify_set): Likewise.
(mark_used_regs_combine): Likewise.
* cse.c (new_basic_block): Likewise.
(fold_rtx): Likewise.
(cse_insn): Likewise.
(cse_extended_basic_block): Likewise.
(set_live_p): Likewise.
* rtlanal.c (canonicalize_condition): Likewise.
* simplify-rtx.c (simplify_binary_operation_1): Likewise.

OK.  I find myself wondering if the conditionals should look like
if (HAVE_cc0
&& (whatever))

But I doubt it makes any measurable difference.  It's something we can 
always add in the future if we feel the need to avoid the runtime checks 
for things that aren't ever going to happen on most modern targets.
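
As a sketch of the form suggested above (an illustration only: `mentions_cc0` and `needs_earlier_insertion_point` are hypothetical names, and HAVE_cc0 is assumed to always be defined to 0 or 1, as patch 04/12 arranges):

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the build always defines HAVE_cc0 to 0 or 1.  The value 0
   models a typical modern target.  */
#ifndef HAVE_cc0
#define HAVE_cc0 0
#endif

/* Hypothetical stand-in for reg_mentioned_p (cc0_rtx, jump).  */
static bool
mentions_cc0 (int insn_flags)
{
  return (insn_flags & 1) != 0;
}

/* Returns 1 when the insertion point has to move before the previous
   insn, 0 when the jump itself can be used.  */
static int
needs_earlier_insertion_point (int insn_flags)
{
  assert (insn_flags >= 0);
  /* HAVE_cc0 comes first, so when it is 0 the && short-circuits and the
     compiler can fold the whole condition to false.  */
  if (HAVE_cc0
      && mentions_cc0 (insn_flags))
    return 1;
  return 0;
}
```

Unlike an #if block, the guarded code is still compiled on every target, so it cannot silently rot, yet a non-cc0 build should not pay for the call at run time.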


jeff



Re: [PATCH 04/12] always define HAVE_cc0

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* genconfig.c (main): Always define HAVE_cc0.
* caller-save.c (insert_one_insn): Change ifdef HAVE_cc0 to #if
HAVE_cc0.
* cfgcleanup.c (flow_find_cross_jump): Likewise.
(flow_find_head_matching_sequence): Likewise.
(try_head_merge_bb): Likewise.
* cfgrtl.c (rtl_merge_blocks): Likewise.
(try_redirect_by_replacing_jump): Likewise.
(rtl_tidy_fallthru_edge): Likewise.
* combine.c (do_SUBST_MODE): Likewise.
(insn_a_feeds_b): Likewise.
(combine_instructions): Likewise.
(can_combine_p): Likewise.
(try_combine): Likewise.
(find_split_point): Likewise.
(subst): Likewise.
(simplify_set): Likewise.
(distribute_notes): Likewise.
* cprop.c (cprop_jump): Likewise.
* cse.c (cse_extended_basic_block): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* final.c (final): Likewise.
(final_scan_insn): Likewise.
* function.c (emit_use_return_register_into_block): Likewise.
* gcse.c (insert_insn_end_basic_block): Likewise.
* haifa-sched.c (sched_init): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* loop-invariant.c (find_invariant_insn): Likewise.
* lra-constraints.c (curr_insn_transform): Likewise.
* optabs.c (prepare_cmp_insn): Likewise.
* postreload.c (reload_combine_recognize_const_pattern):
Likewise.
* reload.c (find_reloads): Likewise.
(find_reloads_address_1): Likewise.
* reorg.c (delete_scheduled_jump): Likewise.
(steal_delay_list_from_target): Likewise.
(steal_delay_list_from_fallthrough): Likewise.
(try_merge_delay_insns): Likewise.
(redundant_insn): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(delete_computation): Likewise.
(relax_delay_slots): Likewise.
* sched-deps.c (sched_analyze_2): Likewise.
* sched-rgn.c (add_branch_dependences): Likewise.

Doesn't go as far as I'd like, but it's still an improvement.

OK.

jeff



Re: [PATCH 05/12] make some HAVE_cc0 code always compiled

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* cfgrtl.c (rtl_merge_blocks): Change #if HAVE_cc0 to if (HAVE_cc0)
(try_redirect_by_replacing_jump): Likewise.
(rtl_tidy_fallthru_edge): Likewise.
* combine.c (insn_a_feeds_b): Likewise.
(find_split_point): Likewise.
(simplify_set): Likewise.
* cprop.c (cprop_jump): Likewise.
* cse.c (cse_extended_basic_block): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* function.c (emit_use_return_register_into_block): Likewise.
* haifa-sched.c (sched_init): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* loop-invariant.c (find_invariant_insn): Likewise.
* lra-constraints.c (curr_insn_transform): Likewise.
* postreload.c (reload_combine_recognize_const_pattern):
Likewise.
* reload.c (find_reloads): Likewise.
* reorg.c (delete_scheduled_jump): Likewise.
(steal_delay_list_from_target): Likewise.
(steal_delay_list_from_fallthrough): Likewise.
(redundant_insn): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(delete_computation): Likewise.
* sched-rgn.c (add_branch_dependences): Likewise.

OK.  This is what I expected to see a lot of :-0

jeff



Re: [PATCH 10/12] remove more ifdefs for HAVE_cc0

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* caller-save.c (insert_one_insn): Remove ifdef HAVE_cc0.
* cfgcleanup.c (flow_find_cross_jump): Likewise.
(flow_find_head_matching_sequence): Likewise.
(try_head_merge_bb): Likewise.
* combine.c (can_combine_p): Likewise.
(try_combine): Likewise.
(distribute_notes): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* final.c (final): Likewise.
* gcse.c (insert_insn_end_basic_block): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* reorg.c (try_merge_delay_insns): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
* sched-deps.c (sched_analyze_2): Likewise.

OK.

Jeff



Re: [PATCH 00/12] Reduce conditional compilation

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

Hi,

This is a first round of patches to reduce the amount of code within #if /
#ifdef.  This makes it incrementally easier to not break configs other than the
one being built, and moves things slightly closer to using target hooks for
everything.

each commit bootstrapped and regtested on x86_64-linux-gnu without regression,
and whole patch set run through config-list.mk without issue, ok?
So I think after looking at this patchset, any changes of a similar 
nature you want to make should be considered pre-approved.  Just post 
them for archival purposes, but no need for you to wait for review as 
long as they have the same purpose and overall structure as was seen in 
these patches.


jeff



RE: [PATCH 6/13] mips musl support

2015-04-21 Thread Matthew Fortune
Szabolcs Nagy  writes:
> Set up dynamic linker name for mips.
> 
> gcc/Changelog:
> 
> 2015-04-16  Gregor Richards  
> 
>   * config/mips/linux.h (MUSL_DYNAMIC_LINKER): Define.

I understand that mips musl is currently o32 only; is that correct?
There does however appear to be both soft and hard float variants
listed in the musl docs. Do you plan on using the same dynamic linker
name for both float variants? No problem if so but someone must have
decided to have unique names for big and little endian so I thought
it worth checking.

Also, are you aware of the two NaN encoding formats that MIPS has
and the support present in glibc's dynamic linker to deal with it?

I wonder if it would be wise to refuse to target musl unless the
ABI is known to be supported so that we can avoid compatibility
issues when different ABI variants are added in musl.

Thanks,
Matthew


[PATCH][AArch64] Add branch-cost to cpu tuning information.

2015-04-21 Thread Matthew Wahab

The AArch64 backend sets BRANCH_COST to be the constant value 2 for all cpus,
meaning that the compiler thinks that branches cost the same across all cpus.

This patch reworks the handling of branch costs to allow per-cpu values to be
set. The actual branch-cost values are unchanged, since the correct values
will need to be decided for each core.

Tested aarch64-none-linux-gnu with gcc-check.

Ok for trunk?
Matthew
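
The intent can be sketched outside the compiler as follows. This is a simplified model, not the patch itself: `struct tuning`, `branch_cost_for` and the "example-core" numbers are hypothetical, and only the generic costs of 2 and the selection rule come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-core branch costs, mirroring the two fields the patch adds.  */
struct branch_cost { int predictable; int unpredictable; };

/* Cut-down tuning record; the real tune_params has many more fields.  */
struct tuning { const char *name; const struct branch_cost *branch_costs; };

static const struct branch_cost generic_cost = { 2, 2 };
/* Hypothetical core with costlier mispredictions; not from the patch.  */
static const struct branch_cost example_cost = { 1, 3 };

static const struct tuning generic_tuning = { "generic", &generic_cost };
static const struct tuning example_tuning = { "example-core", &example_cost };

/* Same selection rule as aarch64_branch_cost: the unpredictable cost is
   used only when optimizing for speed and the branch is not predictable.  */
static int
branch_cost_for (const struct tuning *tune, bool speed_p, bool predictable_p)
{
  assert (tune != NULL && tune->branch_costs != NULL);
  if (!speed_p || predictable_p)
    return tune->branch_costs->predictable;
  return tune->branch_costs->unpredictable;
}
```

With the generic table every query returns 2, which matches the old constant, so behaviour only changes once a core gets its own table.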

2015-05-21  Matthew Wahab  

* gcc/config/aarch64/aarch64-protos.h (struct cpu_branch_cost): New.
(tune_params): Add field branch_costs.
(aarch64_branch_cost): Declare.
* gcc/config/aarch64/aarch64.c (generic_branch_cost): New.
(generic_tunings): Set field branch_costs to generic_branch_cost.
(cortexa53_tunings): Likewise.
(cortexa57_tunings): Likewise.
(thunderx_tunings): Likewise.
(xgene1_tunings): Likewise.
(aarch64_branch_cost): Define.
* gcc/config/aarch64/aarch64.h (BRANCH_COST): Redefine.

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 8676c5c..77b01fa 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -162,12 +162,20 @@ struct cpu_vector_cost
   const int cond_not_taken_branch_cost;  /* Cost of not taken branch.  */
 };
 
+/* Branch costs.  */
+struct cpu_branch_cost
+{
+  const int predictable;/* Predictable branch or optimizing for size.  */
+  const int unpredictable;  /* Unpredictable branch or optimizing for speed.  */
+};
+
 struct tune_params
 {
   const struct cpu_cost_table *const insn_extra_cost;
   const struct cpu_addrcost_table *const addr_cost;
   const struct cpu_regmove_cost *const regmove_cost;
   const struct cpu_vector_cost *const vec_costs;
+  const struct cpu_branch_cost *const branch_costs;
   const int memmov_cost;
   const int issue_rate;
   const unsigned int fuseable_ops;
@@ -259,6 +267,8 @@ void aarch64_print_operand (FILE *, rtx, char);
 void aarch64_print_operand_address (FILE *, rtx);
 void aarch64_emit_call_insn (rtx);
 
+int aarch64_branch_cost (bool, bool);
+
 /* Initialize builtins for SIMD intrinsics.  */
 void init_aarch64_simd_builtins (void);
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 77a641e..a020316 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -339,12 +339,20 @@ static const struct cpu_vector_cost xgene1_vector_cost =
 #define AARCH64_FUSE_ADRP_LDR	(1 << 3)
 #define AARCH64_FUSE_CMP_BRANCH	(1 << 4)
 
+/* Generic costs for branch instructions.  */
+static const struct cpu_branch_cost generic_branch_cost =
+{
+  2,  /* Predictable.  */
+  2   /* Unpredictable.  */
+};
+
 static const struct tune_params generic_tunings =
 {
   &cortexa57_extra_costs,
   &generic_addrcost_table,
   &generic_regmove_cost,
   &generic_vector_cost,
+  &generic_branch_cost,
   4, /* memmov_cost  */
   2, /* issue_rate  */
   AARCH64_FUSE_NOTHING, /* fuseable_ops  */
@@ -362,6 +370,7 @@ static const struct tune_params cortexa53_tunings =
   &generic_addrcost_table,
   &cortexa53_regmove_cost,
   &generic_vector_cost,
+  &generic_branch_cost,
   4, /* memmov_cost  */
   2, /* issue_rate  */
   (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
@@ -380,6 +389,7 @@ static const struct tune_params cortexa57_tunings =
   &cortexa57_addrcost_table,
   &cortexa57_regmove_cost,
   &cortexa57_vector_cost,
+  &generic_branch_cost,
   4, /* memmov_cost  */
   3, /* issue_rate  */
   (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
@@ -398,6 +408,7 @@ static const struct tune_params thunderx_tunings =
   &generic_addrcost_table,
   &thunderx_regmove_cost,
   &generic_vector_cost,
+  &generic_branch_cost,
   6, /* memmov_cost  */
   2, /* issue_rate  */
   AARCH64_FUSE_CMP_BRANCH, /* fuseable_ops  */
@@ -415,6 +426,7 @@ static const struct tune_params xgene1_tunings =
   &xgene1_addrcost_table,
   &xgene1_regmove_cost,
   &xgene1_vector_cost,
+  &generic_branch_cost,
   6, /* memmov_cost  */
   4, /* issue_rate  */
   AARCH64_FUSE_NOTHING, /* fuseable_ops  */
@@ -5361,6 +5373,19 @@ aarch64_address_cost (rtx x,
   return cost;
 }
 
+int
+aarch64_branch_cost (bool speed_p, bool predictable_p)
+{
+  /* When optimizing for speed, use the cost of unpredictable branches.  */
+  const struct cpu_branch_cost *branch_costs =
+aarch64_tune_params->branch_costs;
+
+  if (!speed_p || predictable_p)
+return branch_costs->predictable;
+  else
+return branch_costs->unpredictable;
+}
+
 /* Return true if the RTX X in mode MODE is a zero or sign extract
usable in an ADD or SUB (extended register) instruction.  */
 static bool
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index bf59e40..93a32f5 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -823,7 +823,8 @@ do {	 \
 #define TRAMPOLINE_SECTION text_section
 
 /* To start with.  */
-#define BRANCH_COST(SPEED_P, PREDICTABLE_P) 2
+#d

Re: [PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 07:40:37AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders 
> >
> >gcc/ChangeLog:
> >
> >2015-04-21  Trevor Saunders  
> >
> > * defaults.h: New definition of EH_RETURN_DATA_REGNO.
> > * except.c: Remove definition of EH_RETURN_DATA_REGNO.
> > * builtins.c (expand_builtin): Remove check if
> > EH_RETURN_DATA_REGNO is defined.
> > * df-scan.c (df_bb_refs_collect): Likewise.
> > (df_get_exit_block_use_set): Likewise.
> > * haifa-sched.c (initiate_bb_reg_pressure_info): Likewise.
> > * ira-lives.c (process_bb_node_lives): Likewise.
> > * lra-lives.c (process_bb_lives): Likewise.
> This one wasn't as obvious as the others, but is clearly OK once the full
> loops being guarded by EH_RETURN_DATA_REGNO are examined.

Except that the bb_has_eh_pred predicate might burn CPU time for basic
blocks with many predecessors.  Though, the question is if there are any
important targets that don't define EH_RETURN_DATA_REGNO already.

Jakub
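
The cost concern above can be illustrated with a simplified model of a predicate that scans a block's predecessor edges. The types and names below are hypothetical stand-ins, not GCC's basic_block or edge structures; the sketch only assumes that the check is a linear walk over the predecessors looking for an EH edge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for an edge and a basic block.  */
#define SKETCH_EDGE_EH 0x1u

struct sketch_edge
{
  unsigned flags;
};

struct sketch_bb
{
  const struct sketch_edge *preds;
  size_t n_preds;
};

/* Linear scan over the predecessor edges.  When no EH edge exists, every
   predecessor is visited, which is the worst case for blocks with many
   predecessors.  */
bool
sketch_bb_has_eh_pred (const struct sketch_bb *bb)
{
  assert (bb != NULL);
  for (size_t i = 0; i < bb->n_preds; i++)
    if (bb->preds[i].flags & SKETCH_EDGE_EH)
      return true;
  return false;
}
```

A compile-time guard avoids this walk entirely on targets where the result cannot matter, which is the trade-off being discussed.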


Re: [PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO

2015-04-21 Thread Jeff Law

On 04/21/2015 08:00 AM, Jakub Jelinek wrote:
> On Tue, Apr 21, 2015 at 07:40:37AM -0600, Jeff Law wrote:
>> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
>>> From: Trevor Saunders 
>>>
>>> gcc/ChangeLog:
>>>
>>> 2015-04-21  Trevor Saunders  
>>>
>>> * defaults.h: New definition of EH_RETURN_DATA_REGNO.
>>> * except.c: Remove definition of EH_RETURN_DATA_REGNO.
>>> * builtins.c (expand_builtin): Remove check if
>>> EH_RETURN_DATA_REGNO is defined.
>>> * df-scan.c (df_bb_refs_collect): Likewise.
>>> (df_get_exit_block_use_set): Likewise.
>>> * haifa-sched.c (initiate_bb_reg_pressure_info): Likewise.
>>> * ira-lives.c (process_bb_node_lives): Likewise.
>>> * lra-lives.c (process_bb_lives): Likewise.
>>
>> This one wasn't as obvious as the others, but is clearly OK once the full
>> loops being guarded by EH_RETURN_DATA_REGNO are examined.
>
> Except that the bb_has_eh_pred predicate might burn CPU time for basic
> blocks with many predecessors.  Though, the question is if there are any
> important targets that don't define EH_RETURN_DATA_REGNO already.

Probably not since they'll blow up elsewhere (I was recently helping
someone with a private port that didn't define EH_RETURN_DATA_REGNO) :-)

jeff


Re: [PATCH][combine] Do not call rtx costs on potentially unrecognisable rtxes in combine

2015-04-21 Thread Jeff Law

On 04/21/2015 03:18 AM, Kyrill Tkachov wrote:
>> Though I do wonder if, in practice, we can identify those cases that do
>> simplify more directly a priori and just punt everything else rather than
>> this rather convoluted approach.
>
> You mean like calling simplify_binary_operation that returns NULL
> if no simplification is possible?

Not entirely sure, just a general sense that we're doing far more work
here than is justified by the potential gains.  The cases we care about
are very limited (negated or duplicated arguments) and I'd be surprised
if they're still showing up in combine.c these days.  I didn't look at
the history of that code, but I suspect it is *very very* old.

I'm not asking you to tackle this problem, it was more meant as an
observation.  But if you want to dig deeper, go for it.  If it were me,
the first thing I'd do is try to construct a testcase that would get me
into that code -- I'd bet it's hard, particularly with the tree and rtl
reassociations we do these days.



Jeff


[patch] [java] bump libgcj soname

2015-04-21 Thread Matthias Klose
bump the libgcj soname on the trunk, as done for every release cycle, and update
the cygwin/mingw32 files.

ok for the trunk?

  Matthias


gcc/

2015-04-21  Matthias Klose  

	* config/i386/cygwin.h (LIBGCJ_SONAME): Set libgcj version to -17.
	* config/i386/mingw32.h (LIBGCJ_SONAME): Set libgcj version to -17.

libjava/

2015-04-21  Matthias Klose  

	* libtool-version: Bump soversion.

Index: gcc/config/i386/cygwin.h
===
--- gcc/config/i386/cygwin.h	(revision 68)
+++ gcc/config/i386/cygwin.h	(working copy)
@@ -154,5 +154,5 @@
 #define LIBGCC_SONAME "cyggcc_s" LIBGCC_EH_EXTN "-1.dll"
 
 /* We should find a way to not have to update this manually.  */
-#define LIBGCJ_SONAME "cyggcj" /*LIBGCC_EH_EXTN*/ "-16.dll"
+#define LIBGCJ_SONAME "cyggcj" /*LIBGCC_EH_EXTN*/ "-17.dll"
 
Index: gcc/config/i386/mingw32.h
===
--- gcc/config/i386/mingw32.h	(revision 68)
+++ gcc/config/i386/mingw32.h	(working copy)
@@ -254,4 +254,4 @@
 #define LIBGCC_SONAME "libgcc_s" LIBGCC_EH_EXTN "-1.dll"
 
 /* We should find a way to not have to update this manually.  */
-#define LIBGCJ_SONAME "libgcj" /*LIBGCC_EH_EXTN*/ "-16.dll"
+#define LIBGCJ_SONAME "libgcj" /*LIBGCC_EH_EXTN*/ "-17.dll"
Index: libjava/libtool-version
===
--- libjava/libtool-version	(revision 68)
+++ libjava/libtool-version	(working copy)
@@ -5,4 +5,4 @@
 # Note: When changing the version here, please do also update LIBGCJ_SONAME
 # in gcc/config/i386/cygwin.h and gcc/config/i386/mingw32.h.
 # CURRENT:REVISION:AGE
-16:0:0
+17:0:0


Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-21 Thread Jeff Law

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:
> From reading config/stormy16/stormy-abi it seems to me that we don't
> pass arguments partially in stormy16, so this code would never be called
> there. That leaves pa as the potentially problematic target.
> I don't suppose there's an easy way to test on pa? My checkout of binutils
> doesn't seem to include a sim target for it.

No simulator, no machines in the testfarm, the box I had access to via
parisc-linux.org seems dead and my ancient PA overheats well before a
bootstrap could complete.  I often regret knowing about the backwards
way many things were done on the PA because it makes me think about
cases that only matter on dead architectures.



Jeff



Re: [PATCH][combine] Do not call rtx costs on potentially unrecognisable rtxes in combine

2015-04-21 Thread Kyrill Tkachov


On 21/04/15 15:06, Jeff Law wrote:
> On 04/21/2015 03:18 AM, Kyrill Tkachov wrote:
>>> Though I do wonder if, in practice, we can identify those cases that do
>>> simplify more directly a priori and just punt everything else rather than
>>> this rather convoluted approach.
>>
>> You mean like calling simplify_binary_operation that returns NULL
>> if no simplification is possible?
>
> Not entirely sure, just a general sense that we're doing far more work
> here than is justified by the potential gains.  The cases we care about
> are very limited (negated or duplicated arguments) and I'd be surprised
> if they're still showing up in combine.c these days.  I didn't look at
> the history of that code, but I suspect it is *very very* old.

I had a look when I was writing that patch and it was
from 2005 (r96681).

> I'm not asking you to tackle this problem, it was more meant as an
> observation.  But if you want to dig deeper, go for it.  If it were me,
> the first thing I'd do is try to construct a testcase that would get me
> into that code -- I'd bet it's hard, particularly with the tree and rtl
> reassociations we do these days.

Yeah, the comment does mention that it's supposed to
trigger rarely. I'm looking at it from the perspective
of cleaning up rtx cost usages though.

Thanks,
Kyrill

> Jeff





Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> bump the libgcj soname on the trunk, as done for every release cycle,

Is that really needed though these days?
Weren't there basically zero changes to libjava (both libjava and
libjava/classpath) in the last 2 or more years?
The few ones were mostly updating Copyright notices, minor configure
changes, but I really haven't seen anything ABI changing for quite a while.

Jakub


Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Matthias Klose
On 04/21/2015 04:11 PM, Jakub Jelinek wrote:
> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
>> bump the libgcj soname on the trunk, as done for every release cycle,
> 
> Is that really needed though these days?
> Weren't there basically zero changes to libjava (both libjava and
> libjava/classpath) in the last 2 or more years?
> The few ones were mostly updating Copyright notices, minor configure
> changes, but I really haven't seen anything ABI changing for quite a while.

yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR

which is defined as

gcjsubdir=gcj-$gcjversion-$libgcj_soversion
dbexecdir='$(toolexeclibdir)/'$gcjsubdir



Re: [PATCH 3/13] aarch64 musl support

2015-04-21 Thread pinskia




> On Apr 20, 2015, at 11:52 AM, Szabolcs Nagy  wrote:
> 
> Set up dynamic linker name for aarch64.
> 
> gcc/Changelog:
> 
> 2015-04-16  Gregor Richards  
>Szabolcs Nagy  
> 
>* config/aarch64/aarch64-linux.h (MUSL_DYNAMIC_LINKER): Define.


I don't think you need to check if defaulting to little or big-endian here, as
the specs always have one or the other passed through.

Also if musl does not support ilp32, you might want to error out. Or even 
define the dynamic linker name even before support goes into musl. 

Thanks,
Andrew

> <03-aarch64.patch>


Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-21 Thread Manuel López-Ibáñez

On 21/04/15 13:16, Marek Polacek wrote:
> (-Wlogical-op still isn't enabled neither by -Wall nor by -Wextra.)

The reason is https://gcc.gnu.org/PR61534

which means we don't want to warn for:

extern int xxx;
#define XXX xxx
int test (void)
{
  if (!XXX && xxx)
    return 4;
  else
    return 0;
}

(gcc/testsuite/gcc.dg/pr40172-3.c, although it should be moved to c-c++-common)

As noted in the PR: The problem is that !XXX becomes XXX == 0, but it has the
location of "!", which is not virtual. If we look at the argument of the
expression, then XXX is actually a var_decl, whose location corresponds to the
declaration and not the use, and it is not virtual either. This is PR43486.

> Bootstrapped/regtested on x86_64-linux, ok for trunk?

Does it pass bootstrap if you enable it? That is, is GCC itself -Wlogical-op
clean?

Cheers,

Manuel.


Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 04:16:18PM +0200, Matthias Klose wrote:
> On 04/21/2015 04:11 PM, Jakub Jelinek wrote:
> > On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> >> bump the libgcj soname on the trunk, as done for every release cycle,
> > 
> > Is that really needed though these days?
> > Weren't there basically zero changes to libjava (both libjava and
> > libjava/classpath) in the last 2 or more years?
> > The few ones were mostly updating Copyright notices, minor configure
> > changes, but I really haven't seen anything ABI changing for quite a while.
> 
> yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR
> 
> which is defined as
> 
> gcjsubdir=gcj-$gcjversion-$libgcj_soversion
> dbexecdir='$(toolexeclibdir)/'$gcjsubdir

But why is that an argument for bumping it?  If both GCC 5 and GCC 6 will
(likely) provide the same ABI in the library, there is no reason not to use
the same directory for those.

Jakub


Re: [PATCH 02/12] remove some ifdef HAVE_cc0

2015-04-21 Thread Richard Biener
On Tue, Apr 21, 2015 at 3:24 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:
>
> 2015-04-21  Trevor Saunders  
>
> * conditions.h: Define macros even if HAVE_cc0 is undefined.
> * emit-rtl.c: Define functions even if HAVE_cc0 is undefined.
> * final.c: Likewise.
> * jump.c: Likewise.
> * recog.c: Likewise.
> * recog.h: Declare functions even when HAVE_cc0 is undefined.
> * sched-deps.c (sched_analyze_2): Always compile case for cc0.
> ---
>  gcc/conditions.h | 6 --
>  gcc/emit-rtl.c   | 2 --
>  gcc/final.c  | 2 --
>  gcc/jump.c   | 3 ---
>  gcc/recog.c  | 2 --
>  gcc/recog.h  | 2 --
>  gcc/sched-deps.c | 5 +++--
>  7 files changed, 3 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/conditions.h b/gcc/conditions.h
> index 2308bfc..7cd1e1c 100644
> --- a/gcc/conditions.h
> +++ b/gcc/conditions.h
> @@ -20,10 +20,6 @@ along with GCC; see the file COPYING3.  If not see
>  #ifndef GCC_CONDITIONS_H
>  #define GCC_CONDITIONS_H
>
> -/* None of the things in the files exist if we don't use CC0.  */
> -
> -#ifdef HAVE_cc0
> -
>  /* The variable cc_status says how to interpret the condition code.
> It is set by output routines for an instruction that sets the cc's
> and examined by output routines for jump instructions.
> @@ -117,6 +113,4 @@ extern CC_STATUS cc_status;
>   (cc_status.flags = 0, cc_status.value1 = 0, cc_status.value2 = 0,  \
>CC_STATUS_MDEP_INIT)
>
> -#endif
> -
>  #endif /* GCC_CONDITIONS_H */
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 483eacb..c1974bb 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -3541,7 +3541,6 @@ prev_active_insn (rtx uncast_insn)
>return insn;
>  }
>
> -#ifdef HAVE_cc0
>  /* Return the next insn that uses CC0 after INSN, which is assumed to
> set it.  This is the inverse of prev_cc0_setter (i.e., prev_cc0_setter
> applied to the result of this function should yield INSN).
> @@ -3589,7 +3588,6 @@ prev_cc0_setter (rtx uncast_insn)
>
>return insn;
>  }
> -#endif
>
>  #ifdef AUTO_INC_DEC
>  /* Find a RTX_AUTOINC class rtx which matches DATA.  */
> diff --git a/gcc/final.c b/gcc/final.c
> index 1fa93d9..41f6bd9 100644
> --- a/gcc/final.c
> +++ b/gcc/final.c
> @@ -191,7 +191,6 @@ static rtx last_ignored_compare = 0;
>
>  static int insn_counter = 0;
>
> -#ifdef HAVE_cc0
>  /* This variable contains machine-dependent flags (defined in tm.h)
> set and examined by output routines
> that describe how to interpret the condition codes properly.  */
> @@ -202,7 +201,6 @@ CC_STATUS cc_status;
> from before the insn.  */
>
>  CC_STATUS cc_prev_status;
> -#endif
>
>  /* Number of unmatched NOTE_INSN_BLOCK_BEG notes we have seen.  */
>
> diff --git a/gcc/jump.c b/gcc/jump.c
> index 34b3b7b..bc91550 100644
> --- a/gcc/jump.c
> +++ b/gcc/jump.c
> @@ -1044,8 +1044,6 @@ jump_to_label_p (const rtx_insn *insn)
>   && JUMP_LABEL (insn) != NULL && !ANY_RETURN_P (JUMP_LABEL (insn)));
>  }
>
> -#ifdef HAVE_cc0
> -
>  /* Return nonzero if X is an RTX that only sets the condition codes
> and has no side effects.  */
>
> @@ -1094,7 +1092,6 @@ sets_cc0_p (const_rtx x)
>  }
>return 0;
>  }
> -#endif
>
>  /* Find all CODE_LABELs referred to in X, and increment their use
> counts.  If INSN is a JUMP_INSN and there is at least one
> diff --git a/gcc/recog.c b/gcc/recog.c
> index a9d3b1f..c3ad86f 100644
> --- a/gcc/recog.c
> +++ b/gcc/recog.c
> @@ -971,7 +971,6 @@ validate_simplify_insn (rtx insn)
>return ((num_changes_pending () > 0) && (apply_change_group () > 0));
>  }
>
> -#ifdef HAVE_cc0
>  /* Return 1 if the insn using CC0 set by INSN does not contain
> any ordered tests applied to the condition codes.
> EQ and NE tests do not count.  */
> @@ -988,7 +987,6 @@ next_insn_tests_no_inequality (rtx insn)
>return (INSN_P (next)
>   && ! inequality_comparisons_p (PATTERN (next)));
>  }
> -#endif
>
>  /* Return 1 if OP is a valid general operand for machine mode MODE.
> This is either a register reference, a memory reference,
> diff --git a/gcc/recog.h b/gcc/recog.h
> index 45ea671..8a38b26 100644
> --- a/gcc/recog.h
> +++ b/gcc/recog.h
> @@ -112,9 +112,7 @@ extern void validate_replace_rtx_group (rtx, rtx, rtx);
>  extern void validate_replace_src_group (rtx, rtx, rtx);
>  extern bool validate_simplify_insn (rtx insn);
>  extern int num_changes_pending (void);
> -#ifdef HAVE_cc0
>  extern int next_insn_tests_no_inequality (rtx);
> -#endif
>  extern bool reg_fits_class_p (const_rtx, reg_class_t, int, machine_mode);
>
>  extern int offsettable_memref_p (rtx);
> diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
> index 5434831..31de6be 100644
> --- a/gcc/sched-deps.c
> +++ b/gcc/sched-deps.c
> @@ -2608,8 +2608,10 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, 
> rtx_insn *insn)
>
>return;
>
> -#ifdef HAVE_cc0
>  case CC0:
> +#ifdef HAVE_cc0

#ifndef ?

> +  gcc_unreachable ();
> +#endif
>/* Us
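
The "#ifndef ?" remark above can be illustrated with a standalone sketch. The names are hypothetical (SKETCH_HAVE_CC0 stands in for HAVE_cc0 and abort () for gcc_unreachable ()); the point is only the direction of the guard once the case label is always compiled: reaching it should trap on targets that lack the feature, not on targets that have it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for HAVE_cc0; defined here to model a target
   that does have cc0.  */
#define SKETCH_HAVE_CC0 1

enum sketch_code { SKETCH_REG, SKETCH_CC0 };

/* The CC0 case is always compiled.  On a target without cc0, reaching it
   would be a bug, so the trap is guarded by #ifndef rather than #ifdef.  */
int
sketch_analyze (enum sketch_code code)
{
  assert (code == SKETCH_REG || code == SKETCH_CC0);
  switch (code)
    {
    case SKETCH_CC0:
#ifndef SKETCH_HAVE_CC0
      abort ();   /* stands in for gcc_unreachable ()  */
#endif
      return 1;   /* cc0-specific handling would go here  */
    default:
      return 0;
    }
}
```

With the guard written as #ifdef instead, the sketch would abort exactly on the targets that legitimately reach the case.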

Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Matthias Klose
On 04/21/2015 04:19 PM, Jakub Jelinek wrote:
> On Tue, Apr 21, 2015 at 04:16:18PM +0200, Matthias Klose wrote:
>> On 04/21/2015 04:11 PM, Jakub Jelinek wrote:
>>> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
>>>> bump the libgcj soname on the trunk, as done for every release cycle,
>>>
>>> Is that really needed though these days?
>>> Weren't there basically zero changes to libjava (both libjava and
>>> libjava/classpath) in the last 2 or more years?
>>> The few ones were mostly updating Copyright notices, minor configure
>>> changes, but I really haven't seen anything ABI changing for quite a while.
>>
>> yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR
>>
>> which is defined as
>>
>> gcjsubdir=gcj-$gcjversion-$libgcj_soversion
>> dbexecdir='$(toolexeclibdir)/'$gcjsubdir
> 
> But why is that an argument for bumping it?  If both GCC 5 and GCC 6 will
> (likely) provide the same ABI in the library, there is no reason not to use
> the same directory for those.

but currently there are different directories used (gcjversion already changed
on the trunk) and compiled into the library.  Do you mean that gcjsubdir should
be just defined as gcj?

Matthias



[PATCH][libstc++v3]Add new dg-require-thread-fence directive.

2015-04-21 Thread Renlin Li

Hi all,

This patch defines a new dg-require-thread-fence directive and updates
three test cases to use it.


The new directive checks whether the target supports a thread fence,
either directly in the target back end or through an external library
function call. A thread fence is required to expand atomic load/store.


On some targets a call to an external __sync_synchronize function is
emitted, and that function may not be implemented. The result is a link
error like this: undefined reference to `__sync_synchronize`. Test cases
gated by this directive are skipped when no thread fence is available,
for example the three test cases updated here. They fail on the
arm-none-eabi target, where __sync_synchronize() isn't implemented and
the target cpu has no memory_barrier.


__sync_synchronize () is used to check whether a thread fence is
available. In GCC, sync_synchronize is expanded as
expand_mem_thread_fence (MEMMODEL_SEQ_CST).
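
As a rough illustration of what the probe depends on, here is a minimal C sketch that needs a full fence in order to link. The function names are hypothetical; __sync_synchronize and __atomic_thread_fence are GCC builtins, and treating the two as equivalent full barriers is an assumption based on the expansion described above. On a target with neither a barrier instruction nor a library __sync_synchronize, linking code like this is what fails.

```c
#include <assert.h>
#include <stddef.h>

/* Full barrier using the legacy builtin that the new check probes for.  */
void
sketch_full_fence_legacy (void)
{
  __sync_synchronize ();
}

/* Full barrier spelled with the __atomic builtin (assumed equivalent).  */
void
sketch_full_fence_atomic (void)
{
  __atomic_thread_fence (__ATOMIC_SEQ_CST);
}

/* Store, fence, load.  Single-threaded here; the fence only matters for
   ordering as observed by other threads.  */
int
sketch_store_then_load (int *slot, int value)
{
  assert (slot != NULL);
  *slot = value;
  sketch_full_fence_legacy ();
  return *slot;
}
```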


Okay to commit?


libstdc++-v3/ChangeLog:

2015-04-21  Renlin Li  

* testsuite/lib/dg-options.exp (dg-require-thread-fence): New.
* testsuite/lib/libstdc++.exp (check_v3_target_thread_fence): New.
* testsuite/29_atomics/atomic_flag/clear/1.cc: Use it.
* testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc: Likewise.
* testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc: Likewise.
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
index 0a4219c..a6e2299 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
@@ -1,4 +1,5 @@
 // { dg-options "-std=gnu++11" }
+// { dg-require-thread-fence "" }
 
 // Copyright (C) 2009-2015 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
index 2ff740b..0655be4 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
@@ -1,4 +1,5 @@
 // { dg-options "-std=gnu++11" }
+// { dg-require-thread-fence "" }
 
 // Copyright (C) 2008-2015 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
index 6ac20c0..a867da2 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
@@ -1,4 +1,5 @@
 // { dg-options "-std=gnu++11" }
+// { dg-require-thread-fence "" }
 
 // Copyright (C) 2008-2015 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp b/libstdc++-v3/testsuite/lib/dg-options.exp
index 38c8206..56ca896 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -115,6 +115,15 @@ proc dg-require-cmath { args } {
 return
 }
 
+proc dg-require-thread-fence { args } {
+if { ![ check_v3_target_thread_fence ] } {
+	upvar dg-do-what dg-do-what
+	set dg-do-what [list [lindex ${dg-do-what} 0] "N" "P"]
+	return
+}
+return
+}
+
 proc dg-require-atomic-builtins { args } {
 if { ![ check_v3_target_atomic_builtins ] } {
 	upvar dg-do-what dg-do-what
diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp
index b2f7d00..9e395e2 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -1221,6 +1221,62 @@ proc check_v3_target_cmath { } {
 return $et_c99_math
 }
 
+proc check_v3_target_thread_fence { } {
+global cxxflags
+global DEFAULT_CXXFLAGS
+global et_thread_fence
+
+global tool
+
+if { ![info exists et_thread_fence_target_name] } {
+	set et_thread_fence_target_name ""
+}
+
+# If the target has changed since we set the cached value, clear it.
+set current_target [current_target_name]
+if { $current_target != $et_thread_fence_target_name } {
+	verbose "check_v3_target_thread_fence: `$et_thread_fence_target_name'" 2
+	set et_thread_fence_target_name $current_target
+	if [info exists et_thread_fence] {
+	verbose "check_v3_target_thread_fence: removing cached result" 2
+	unset et_thread_fence
+	}
+}
+
+if [info exists et_thread_fence] {
+	verbose "check_v3_target_thread_fence: using cached result" 2
+} else {
+	set et_thread_fence 0
+
+	# Set up and preprocess a C++11 test program that depends
+	# on the thread fence to be available.
+	set src thread_fence[pid].cc
+
+	set f [open $src "w"]
+	puts $f "int main() {"
+	puts $f "__sync_synchronize ();"
+	puts $f "return 0;"
+	puts $f "}"
+	close $f
+
+	set cxxflags_saved $cxxflags
+	set cxxflags "$cxxflags $DEFAULT_CXXFLAGS -Werror -std=gnu++11"
+
+	set lines [v3_target_compile $src /dev/null executable ""]
+	set cxxflags $cxxflag

Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 04:29:52PM +0200, Matthias Klose wrote:
> On 04/21/2015 04:19 PM, Jakub Jelinek wrote:
> > On Tue, Apr 21, 2015 at 04:16:18PM +0200, Matthias Klose wrote:
> >> On 04/21/2015 04:11 PM, Jakub Jelinek wrote:
> >>> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> >>>> bump the libgcj soname on the trunk, as done for every release cycle,
> >>>
> >>> Is that really needed though these days?
> >>> Weren't there basically zero changes to libjava (both libjava and
> >>> libjava/classpath) in the last 2 or more years?
> >>> The few ones were mostly updating Copyright notices, minor configure
> >>> changes, but I really haven't seen anything ABI changing for quite a 
> >>> while.
> >>
> >> yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR
> >>
> >> which is defined as
> >>
> >> gcjsubdir=gcj-$gcjversion-$libgcj_soversion
> >> dbexecdir='$(toolexeclibdir)/'$gcjsubdir
> > 
> > But why is that an argument for bumping it?  If both GCC 5 and GCC 6 will
> > (likely) provide the same ABI in the library, there is no reason not to use
> > the same directory for those.
> 
> but currently there are different directories used (gcjversion already changed
> on the trunk) and compiled into the library.  Do you mean that gcjsubdir 
> should
> be just defined as gcj?

Whatever depends on BASE-VER, sure, that is bumped automatically and should
track the gcc version.  But the soname is an unrelated number, and there is
no point in bumping it.  If you have a packaging issue, just solve it on the
packaging side, but really there is no point in yearly bumping the soname of
something that doesn't change at all (and has really been a dead project for
many years).

Jakub


Re: [PATCH] 65479 - sanitizer stack trace missing frames past #0 on powerpc64

2015-04-21 Thread Martin Sebor

>> --- a/libsanitizer/ChangeLog
>> +++ b/libsanitizer/ChangeLog
>> @@ -1,3 +1,15 @@
>> +2015-04-19  Martin Sebor  
>> +
>> +   PR sanitizer/65479
>> +   * libsanitizer/sanitizer_common/sanitizer_stacktrace.h
>> +   (StackTrace::signaled, StackTrace::min_insn_bytes): New data members.
>> +   (StackTrace::StackTrace): Initialize signaled.
>> +   * libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
>> +   (StackTrace::GetPreviousInstructionPc): Rewrite.
>> +   * libsanitizer/sanitizer_common/sanitizer_stacktrace_libcdep.cc
>> +   (StackTrace::Print): Use min_insn_bytes to adjust PC value.
>> +   (BufferedStackTrace::Unwind): Set signaled.
>
> libsanitizer/ should not show up in the ChangeLog entry.
> But as somebody said earlier, the libsanitizer changes really should go
> to LLVM compiler-rt repo first and then be just backported, either
> cherry-picked (probably the case for the 5 branch backport later on) or go in
> full merge from compiler-rt.

Okay, let me submit the sanitizer changes there. Since the
tests will continue to fail without it, the libbacktrace
change can go in later if that's preferable.

>> --- a/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
>> +++ b/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
>> @@ -15,19 +15,33 @@
>>
>>   namespace __sanitizer {
>>
>> -uptr StackTrace::GetPreviousInstructionPc(uptr pc) {
>> -#if defined(__arm__)
>> -  // Cancel Thumb bit.
>> -  pc = pc & (~1);
>> -#endif
>
> Your code loses this, which is undesirable.

The original function fails to return the pc value on ARM
so I just took it out. I didn't look into what the intent
was but all the tests pass with the patch on aarch64 (after
applying the Fedora gcc 5 patch you mentioned yesterday).

>> -#if defined(__powerpc__) || defined(__powerpc64__)
>> -  // PCs are always 4 byte aligned.
>> -  return pc - 4;
>> -#elif defined(__sparc__) || defined(__mips__)
>> -  return pc - 8;
>
> The SPARC/MIPS case is of course needed, because on these architectures
> the call is followed by a delay slot.  But I wonder why you need anything
> special on any other architecture, why pc - 1 isn't good enough for those.
> The point isn't to find a PC of the call instruction, on some targets that
> is very hard and you need to disassemble, but to just find some byte in the
> call instruction.

I forgot about the delay slot. Thanks for the reminder.

>> +const unsigned StackTrace::min_insn_bytes =
>> +#if defined __ia64__
>> +// Intel Itanium has 5 byte instructions.
>> +5
>
> E.g. this is wrong, ia64 doesn't have 5 byte instructions, but has VLIW
> bundles, where in the 16 byte bundle there are up to 3 41-bit instructions
> plus template.  But, ia64 isn't supported by libsanitizer and I doubt there
> are enough users that would be interested in writing support for a dead
> architecture.

I suppose with the sanitizer output referencing the unmodified
PC values on the stack the computation can be simplified to
just subtract (and later add) 1 on all targets. Let me change
that.

Martin
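
The adjustment discussed in this message can be sketched as a small standalone function. It is an illustration only: the enum and function names are hypothetical, and the two offsets are taken from the thread itself (pc - 1 to land on some byte of the call instruction, pc - 8 for the delay-slot case shown in the removed SPARC/MIPS branch). It is not the sanitizer runtime's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical architecture classes, for illustration only.  */
enum sketch_arch { SKETCH_GENERIC, SKETCH_DELAY_SLOT };

/* Map a return address to an address inside the call instruction.
   The goal is not the exact start of the call, only some byte within it:
   1 is subtracted in the generic case, 8 for delay-slot targets as in
   the removed SPARC/MIPS branch.  */
uintptr_t
sketch_previous_pc (uintptr_t return_pc, enum sketch_arch arch)
{
  assert (return_pc >= 8);
  return arch == SKETCH_DELAY_SLOT ? return_pc - 8 : return_pc - 1;
}
```

Since only some byte inside the call is needed for symbolization, no disassembly is required in either case.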


Re: [PATCH 6/13] mips musl support

2015-04-21 Thread Rich Felker
On Tue, Apr 21, 2015 at 01:58:02PM +, Matthew Fortune wrote:
> Szabolcs Nagy  writes:
> > Set up dynamic linker name for mips.
> > 
> > gcc/Changelog:
> > 
> > 2015-04-16  Gregor Richards  
> > 
> > * config/mips/linux.h (MUSL_DYNAMIC_LINKER): Define.
> 
> I understand that mips musl is o32 only currently is that correct?

This is correct. Other ABIs if/when we support them will have
different names.

> There does however appear to be both soft and hard float variants
> listed in the musl docs. Do you plan on using the same dynamic linker
> name for both float variants? No problem if so but someone must have
> decided to have unique names for big and little endian so I thought
> it worth checking.

No, it's supposed to be different (-sf suffix for soft float; see
arch/mips/reloc.h in musl source). If this didn't make it into the
patches it's an omission, probably because we didn't officially
support the sf ABI at all for a long time.

> Also, are you aware of the two nan encoding formats that MIPS has
> and the support present in glibc's dynamic linker to deal with it?

I am aware but somewhat skeptical of treating it as yet another
dimension to ABI and the resulting ABI combinatorics. The vast
majority of programs couldn't care less which is which and whether a
NAN is quiet or signaling. Officially we just use the classic mips ABI
(with qnan/snan swapped vs other archs) but there's no harm in
somebody doing the opposite if they really know what they're doing.

> I wonder if it would be wise to refuse to target musl unless the
> ABI is known to be supported so that we can avoid compatibility
> issues when different ABI variants are added in musl.

Possibly, though this might make bootstrapping new ABIs harder.

Rich


Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting

2015-04-21 Thread Jiong Wang

Jiong Wang writes:

> 2015-04-14 18:24 GMT+01:00 Jeff Law :
>> On 04/14/2015 10:48 AM, Steven Bosscher wrote:

>>>> So I think this stage2/3 binary difference is acceptable?
>>>
>>>
>>> No, they should be identical. If there's a difference, then there's a
>>> bug - which, it seems, you've already found, too.
>>
>> Right.  And so the natural question is how to fix.
>>
>> At first glance it would seem like having this new code ignore dependencies
>> rising from debug insns would work.
>>
>> Which then begs the question, what happens to the debug insn -- it's
>> certainly not going to be correct anymore if the transformation is made.
>
> Exactly.
>
> The debug_insn 2776 in my example is to record the base address of a
> local array. the new code is doing correctly here by not shuffling the
> operands of insn 2556 and 2557 as there is additional reference of
> reg:1473 from debug insn, although the code will still execute correctly
> if we do the transformation.
>
> my understanding to fix this:
>
>   * delete the out-of-date mismatch debug_insn? as there is no guarantee
> to generate accurate debug info under -O2.
>
> IMO, this debug_insn may affect "DW_AT_location" field for variable
> description of "classes" in .debug_info section, but it's omitted in
> the final output already.
>
> <3><38a4d>: Abbrev Number: 137 (DW_TAG_variable)
> <38a4f>   DW_AT_name : (indirect string, offset: 0x18db): classes
> <38a53>   DW_AT_decl_file   : 1
> <38a54>   DW_AT_decl_line   : 548
> <38a56>   DW_AT_type: <0x38cb4>
>
>   * update the debug_insn? if the following change is OK with dwarf standard
>
>from
>
>  insn0: reg0 = fp + reg1
>  debug_insn: var_loc = reg0 + const_off
>  insn1: reg2 = reg0 + const_off
>
>to
>
>  insn0: reg0 = fp + const_off
>  debug_insn: var_loc = reg0 + reg1
>  insn1: reg2 = reg0 + reg1
>
> Thanks,
>

Attached is the new patch, which updates the debug_insn as
described in the second solution above.

Now the stage2/3 binary differences on AArch64 have gone away. Bootstrap OK.

On AArch64, this patch finds 600+ new rtl loop invariants across
spec2k6 float, with a +4.5% perf improvement on 436.cactusADM because four
new invariants are found in the critical function "regex_compile".

Similar improvements may be achievable on other RISC backends like
powerpc/mips, I guess.

One thing to mention: for AArch64, one minor glitch in
aarch64_legitimize_address needs to be fixed to let this patch take
effect. I will send out that patch later as it's a separate issue.
Powerpc/Mips don't have this glitch in the LEGITIMIZE_ADDRESS hook, so
they should be OK, and I verified that the base address of the local array
in the testcase given by Seb on pr62173 is hoisted on ppc64 now. I think
pr62173 is fixed on those 64bit archs by this patch.

Thoughts?

Thanks.
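
The reshuffle described in the quoted text relies on reassociating (fp + reg1) + const_off into (fp + const_off) + reg1, so that the fp + const_off part becomes loop invariant. A minimal C sketch of that equivalence follows; the function names and the constant are hypothetical, and plain integer arithmetic stands in for the RTL address computation.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CONST_OFF 16   /* hypothetical constant offset */

/* Before: reg0 = fp + reg1 changes every iteration, so neither addition
   can be hoisted out of the loop.  */
size_t
sketch_sum_before (size_t fp, const size_t *reg1, size_t n)
{
  assert (reg1 != NULL || n == 0);
  size_t sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      size_t reg0 = fp + reg1[i];
      sum += reg0 + SKETCH_CONST_OFF;
    }
  return sum;
}

/* After: reg0 = fp + const_off is invariant and computed once; only the
   addition of the varying reg1 stays inside the loop.  */
size_t
sketch_sum_after (size_t fp, const size_t *reg1, size_t n)
{
  assert (reg1 != NULL || n == 0);
  size_t reg0 = fp + SKETCH_CONST_OFF;
  size_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += reg0 + reg1[i];
  return sum;
}
```

Both functions compute the same values; the second form is the one in which the invariant addition is exposed for hoisting.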

2015-04-21  Jiong Wang  

gcc/
  * loop-invariant.c (find_defs): Enable DF_DU_CHAIN build.
  (vfp_const_iv): New hash table.
  (need_expensive_addr_check_p): New boolean.
  (init_inv_motion_data): Initialize new variables.
  (free_inv_motion_data): Release hash table.
  (create_new_invariant): Set cheap_address to false for iv in
  vfp_const_iv table.
  (find_invariant_insn): Skip dependencies check for iv in vfp_const_iv
  table.
  (use_for_single_du): New function.
  (reshuffle_insn_with_vfp): Likewise.
  (find_invariants_bb): Call reshuffle_insn_with_vfp.

gcc/testsuite/
   * gcc.dg/pr62173.c: New testcase.

-- 
Regards,
Jiong

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f79b497..f70dfb0 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -203,6 +203,8 @@ typedef struct invariant *invariant_p;
 /* The invariants.  */
 
 static vec invariants;
+static hash_table  > *vfp_const_iv;
+static bool need_expensive_addr_check_p;
 
 /* Check the size of the invariant table and realloc if necessary.  */
 
@@ -695,7 +697,7 @@ find_defs (struct loop *loop)
 
   df_remove_problem (df_chain);
   df_process_deferred_rescans ();
-  df_chain_add_problem (DF_UD_CHAIN);
+  df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN);
   df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
   df_analyze_loop (loop);
   check_invariant_table_size ();
@@ -742,6 +744,9 @@ create_new_invariant (struct def *def, rtx_insn *insn, bitmap depends_on,
 	 See http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01210.html .  */
   inv->cheap_address = address_cost (SET_SRC (set), word_mode,
 	 ADDR_SPACE_GENERIC, speed) < 3;
+
+  if (need_expensive_addr_check_p && vfp_const_iv->find (insn))
+	inv->cheap_address = false;
 }
   else
 {
@@ -952,7 +957,8 @@ find_invariant_insn (rtx_insn *insn, bool always_reached, bool always_executed)
 return;
 
   depends_on = BITMAP_ALLOC (NULL);
-  if (!check_dependencies (insn, depends_on))
+  if (!vfp_const_iv->find (insn)
+  && !check_dependencies (insn, depends_on))
 {
   BITMAP_FREE (depends_on);
   return;
@@ -1007,6 +1013,180 @@ find_invariants_insn (rtx_insn *in

Re: [PATCH] 65479 - sanitizer stack trace missing frames past #0 on powerpc64

2015-04-21 Thread Martin Sebor

On 04/21/2015 06:39 AM, Peter Bergner wrote:

On Tue, 2015-04-21 at 08:22 +0200, Jakub Jelinek wrote:

-#if defined(__powerpc__) || defined(__powerpc64__)
-  // PCs are always 4 byte aligned.
-  return pc - 4;
-#elif defined(__sparc__) || defined(__mips__)
-  return pc - 8;


The SPARC/MIPS case is of course needed, because on these architectures
the call is followed by a delay slot.  But I wonder why you need anything
special on any other architecture, why pc - 1 isn't good enough for those.
The point isn't to find a PC of the call instruction, on some targets that
is very hard and you need to disassemble, but to just find some byte in the
call instruction.


I wrote the "pc - 4" code for powerpc* and I guess I was just
being pedantic on returning the first address of the instruction.
If using "pc - 1" works, then I'm fine with that.


It works fine with the patch and produces sensible output
because the decremented address is only used to look up
the debug info and restored before it's output. Otherwise
(with the unpatched code) we'd end up with odd PC addresses
in the stack trace.
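
A minimal sketch of the adjustment being discussed, assuming a
hypothetical helper name and a boolean standing in for the
__sparc__/__mips__ preprocessor checks (neither is from the sanitizer
sources):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: map a return address to some address inside the
   call instruction, for debug-info lookup only.  The caller still prints
   the original return_pc.  */
uintptr_t
pc_for_lookup (uintptr_t return_pc, bool call_has_delay_slot)
{
  /* With a delay slot (the SPARC/MIPS case above) the return address is
     two 4-byte instructions past the call; elsewhere any byte of the
     call will do, so stepping back one byte is enough.  */
  return call_has_delay_slot ? return_pc - 8 : return_pc - 1;
}
```

Because the decremented value is used only for the lookup, an odd address
such as 0x1003 never shows up in the printed trace.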

Martin



Peter





[patch, avr] extend part-clobbered check to AVR_TINY architecture

2015-04-21 Thread Sivanupandi, Pitchumani
Hi,

While backporting AVR_TINY architecture support to 4.9, the build failed in
libgcc for AVR_TINY.
The failure was due to the same ICE as:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53065

The fix provided for that bug checks whether the mode crosses the callee-saved
register boundary.
The patch below updates that check, as AVR_TINY has a different set of
callee-saved registers (r18 and r19).
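
For readers who want to see the boundary test in isolation, here is a
standalone sketch. It is not the avr.c code: the helper names are made up,
the last callee-saved register is passed in as a parameter instead of coming
from LAST_CALLEE_SAVED_REG, and the early "return 0" cases of the real
function are left out.

```c
#include <stdbool.h>

/* Register numbers of the Y and Z pointer registers on AVR.  */
enum { REG_Y = 28, REG_Z = 30 };

/* True if a value of SIZE bytes starting in register REGNO starts below
   BOUNDARY and extends to or past it.  */
static bool
crosses (unsigned regno, unsigned size, unsigned boundary)
{
  return regno < boundary && regno + size > boundary;
}

/* Hypothetical stand-in for the check in the patch.  LAST_CALLEE_SAVED
   would be 17 normally and 19 for AVR_TINY, per the comment in the diff.  */
bool
part_clobbered (unsigned regno, unsigned size, unsigned last_callee_saved)
{
  return (crosses (regno, size, last_callee_saved + 1)
          || crosses (regno, size, REG_Y)
          || crosses (regno, size, REG_Z));
}
```

With last_callee_saved == 19, a 4-byte value starting in r18 straddles the
19/20 boundary and is reported; with 17 it is not, since r18 is already
above that boundary.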

This patch is against trunk.

NOTE: The ICE is reproducible only with 4.9 + the tiny patch and --with-dwarf2
enabled.

Is this ok for trunk?

diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c
index 68d5ddc..2f441e5 100644
--- a/gcc/config/avr/avr.c
+++ b/gcc/config/avr/avr.c
@@ -11333,9 +11333,10 @@ avr_hard_regno_call_part_clobbered (unsigned regno, 
machine_mode mode)
 return 0;

   /* Return true if any of the following boundaries is crossed:
- 17/18, 27/28 and 29/30.  */
+ 17/18 or 19/20 (if AVR_TINY), 27/28 and 29/30.  */

-  return ((regno < 18 && regno + GET_MODE_SIZE (mode) > 18)
+  return ((regno <= LAST_CALLEE_SAVED_REG &&
+   regno + GET_MODE_SIZE (mode) > (LAST_CALLEE_SAVED_REG + 1))
   || (regno < REG_Y && regno + GET_MODE_SIZE (mode) > REG_Y)
   || (regno < REG_Z && regno + GET_MODE_SIZE (mode) > REG_Z));
 }


Regards,
Pitchumani



Re: [PATCH 3/13] aarch64 musl support

2015-04-21 Thread Szabolcs Nagy


On 21/04/15 15:16, pins...@gmail.com wrote:
> 
> I don't think you need to check if defaulting to little or big-endian here 
> as the specs always have one or the other passing through. 
> 

i was not aware of this

maybe the ifdef is not necessary for other archs either
i will check

> Also if musl does not support ilp32, you might want to error out. Or even 
> define the dynamic linker name even before support goes into musl. 
> 

ok, i guess adding %{mabi=ilp32:_ilp32} won't hurt us



RE: [PATCH 6/13] mips musl support

2015-04-21 Thread Matthew Fortune
Rich Felker  writes:
> On Tue, Apr 21, 2015 at 01:58:02PM +, Matthew Fortune wrote:
> > Szabolcs Nagy  writes:
> > > Set up dynamic linker name for mips.
> > >
> > > gcc/Changelog:
> > >
> > > 2015-04-16  Gregor Richards  
> > >
> > >   * config/mips/linux.h (MUSL_DYNAMIC_LINKER): Define.
> >
> > I understand that mips musl is o32 only currently is that correct?
> 
> This is correct. Other ABIs if/when we support them will have different
> names.
> 
> > There does however appear to be both soft and hard float variants
> > listed in the musl docs. Do you plan on using the same dynamic linker
> > name for both float variants? No problem if so but someone must have
> > decided to have unique names for big and little endian so I thought it
> > worth checking.
> 
> No, it's supposed to be different (-sf suffix for soft float; see
> arch/mips/reloc.h in musl source). If this didn't make it into the
> patches it's an omission, probably because we didn't officially support
> the sf ABI at all for a long time.
> 
> > Also, are you aware of the two nan encoding formats that MIPS has and
> > the support present in glibc's dynamic linker to deal with it?
> 
> I am aware but somewhat skeptical of treating it as yet another
> dimension to ABI and the resulting ABI combinatorics. The vast majority
> of programs couldn't care less which is which and whether a NAN is
> quiet or signaling. Officially we just use the classic mips ABI (with
> qnan/snan swapped vs other archs) but there's no harm in somebody doing
> the opposite if they really know what they're doing.

Couldn't agree more here but I know some people have been concerned about
it so the strict rules were put in place. I will attempt to remember and
copy the musl list when putting out a plan for formally relaxing the nan
encoding rules. The proposal is probably less than 2 weeks away from being
ready to review, it does of course make certain assumptions originating
from glibc as reference but is an independent ABI proposal.
 
> > I wonder if it would be wise to refuse to target musl unless the ABI
> > is known to be supported so that we can avoid compatibility issues
> > when different ABI variants are added in musl.
> 
> Possibly, though this might make bootstrapping new ABIs harder.

Indeed. The other alternative would be to set the dynamic linker name
to something slightly silly for unsupported ABIs like /lib/fixme.so
which would make it possible to bootstrap via the addition of a symlink
but it is clearly not the approved name.
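
To make the naming dimensions in this thread concrete, here is a small
sketch. The function name is hypothetical, and the
"/lib/ld-musl-mips[el][-sf].so.1" pattern is my understanding of the musl
convention rather than something stated in the patch; the /lib/fixme.so
fallback is the placeholder floated above, not an approved name.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: build the dynamic linker path for o32 mips musl.
   Returns the snprintf result.  */
int
musl_mips_ldso_name (char *buf, size_t len, bool little_endian,
                     bool soft_float, bool abi_supported)
{
  /* Unsupported ABI: deliberately silly placeholder so that a bootstrap
     can proceed via a symlink.  */
  if (!abi_supported)
    return snprintf (buf, len, "/lib/fixme.so");

  /* Endianness and float ABI each contribute one suffix.  */
  return snprintf (buf, len, "/lib/ld-musl-mips%s%s.so.1",
                   little_endian ? "el" : "",
                   soft_float ? "-sf" : "");
}
```

A nan-encoding variant would add a third suffix to this scheme, which is the
combinatorics concern raised earlier in the thread.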

thanks,
Matthew


Re: [PATCH 04/12] always define HAVE_cc0

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 07:53:05AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders 
> >
> >gcc/ChangeLog:
> >
> >2015-04-21  Trevor Saunders  
> >
> > * genconfig.c (main): Always define HAVE_cc0.
> > * caller-save.c (insert_one_insn): Change ifdef HAVE_cc0 to #if
> > HAVE_cc0.
> > * cfgcleanup.c (flow_find_cross_jump): Likewise.
> > (flow_find_head_matching_sequence): Likewise.
> > (try_head_merge_bb): Likewise.
> > * cfgrtl.c (rtl_merge_blocks): Likewise.
> > (try_redirect_by_replacing_jump): Likewise.
> > (rtl_tidy_fallthru_edge): Likewise.
> > * combine.c (do_SUBST_MODE): Likewise.
> > (insn_a_feeds_b): Likewise.
> > (combine_instructions): Likewise.
> > (can_combine_p): Likewise.
> > (try_combine): Likewise.
> > (find_split_point): Likewise.
> > (subst): Likewise.
> > (simplify_set): Likewise.
> > (distribute_notes): Likewise.
> > * cprop.c (cprop_jump): Likewise.
> > * cse.c (cse_extended_basic_block): Likewise.
> > * df-problems.c (can_move_insns_across): Likewise.
> > * final.c (final): Likewise.
> > (final_scan_insn): Likewise.
> > * function.c (emit_use_return_register_into_block): Likewise.
> > * gcse.c (insert_insn_end_basic_block): Likewise.
> > * haifa-sched.c (sched_init): Likewise.
> > * ira.c (find_moveable_pseudos): Likewise.
> > * loop-invariant.c (find_invariant_insn): Likewise.
> > * lra-constraints.c (curr_insn_transform): Likewise.
> > * optabs.c (prepare_cmp_insn): Likewise.
> > * postreload.c (reload_combine_recognize_const_pattern):
> > Likewise.
> > * reload.c (find_reloads): Likewise.
> > (find_reloads_address_1): Likewise.
> > * reorg.c (delete_scheduled_jump): Likewise.
> > (steal_delay_list_from_target): Likewise.
> > (steal_delay_list_from_fallthrough): Likewise.
> > (try_merge_delay_insns): Likewise.
> > (redundant_insn): Likewise.
> > (fill_simple_delay_slots): Likewise.
> > (fill_slots_from_thread): Likewise.
> > (delete_computation): Likewise.
> > (relax_delay_slots): Likewise.
> > * sched-deps.c (sched_analyze_2): Likewise.
> > * sched-rgn.c (add_branch_dependences): Likewise.
> Doesn't go as far as I'd like, but it's still an improvement.

Yeah, this one really just enables other nice things.  I really dislike
big patches since there's invariably something wrong somewhere and if
you don't really know the code in question it can be next to impossible
to figure out where the problem is.

Trev

> 
> OK.
> 
> jeff
> 


Re: [PATCH 03/12] more removal of ifdef HAVE_cc0

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 07:51:14AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders 
> >
> >gcc/ChangeLog:
> >
> >2015-04-21  Trevor Saunders  
> >
> > * combine.c (find_single_use): Remove HAVE_cc0 ifdef for code
> > that is trivially dead on non cc0 targets.
> > (simplify_set): Likewise.
> > (mark_used_regs_combine): Likewise.
> > * cse.c (new_basic_block): Likewise.
> > (fold_rtx): Likewise.
> > (cse_insn): Likewise.
> > (cse_extended_basic_block): Likewise.
> > (set_live_p): Likewise.
> > * rtlanal.c (canonicalize_condition): Likewise.
> > * simplify-rtx.c (simplify_binary_operation_1): Likewise.
> OK.  I find myself wondering if the conditionals should look like
> if (HAVE_cc0
> && (whatever))
> 
> But I doubt it makes any measurable difference.  It's something we can
> always add in the future if we feel the need to avoid the runtime checks for
> things that aren't ever going to happen on most modern targets.

 yeah, it seems reasonably likely the branch predictor can deal with
 this for us (I tried to ensure things handled this way didn't do much
 other than a compare).  If not, well, that's what profiling is for :-)
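
 For anyone following along, the pattern under discussion looks roughly
 like this.  It is a sketch with made-up function and parameter names;
 only the HAVE_cc0 macro itself comes from the series.

```c
/* After the series HAVE_cc0 is always defined, to 0 or 1.  The fallback
   below only exists so that this sketch compiles on its own.  */
#ifndef HAVE_cc0
#define HAVE_cc0 0
#endif

/* Hypothetical example: count the insns that use cc0.  The guarded code
   is always parsed and type-checked, unlike code under #ifdef, and when
   HAVE_cc0 is 0 the condition is constant false so the compiler can
   drop the body.  */
int
count_cc0_users (const int *uses_cc0, int n)
{
  int count = 0;
  for (int i = 0; i < n; i++)
    if (HAVE_cc0 && uses_cc0[i])
      count++;
  return count;
}
```

 On a non-cc0 target this returns 0 whatever the input is, but a typo
 inside the guarded statement would still break the build, which is the
 point of the change.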

 Trev

> 
> jeff
> 


Re: [PATCH 02/12] remove some ifdef HAVE_cc0

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 04:14:01PM +0200, Richard Biener wrote:
> On Tue, Apr 21, 2015 at 3:24 PM,   wrote:
> > From: Trevor Saunders 
> >
> > gcc/ChangeLog:
> >
> > 2015-04-21  Trevor Saunders  
> >
> > * conditions.h: Define macros even if HAVE_cc0 is undefined.
> > * emit-rtl.c: Define functions even if HAVE_cc0 is undefined.
> > * final.c: Likewise.
> > * jump.c: Likewise.
> > * recog.c: Likewise.
> > * recog.h: Declare functions even when HAVE_cc0 is undefined.
> > * sched-deps.c (sched_analyze_2): Always compile case for cc0.
> > ---
> >  gcc/conditions.h | 6 --
> >  gcc/emit-rtl.c   | 2 --
> >  gcc/final.c  | 2 --
> >  gcc/jump.c   | 3 ---
> >  gcc/recog.c  | 2 --
> >  gcc/recog.h  | 2 --
> >  gcc/sched-deps.c | 5 +++--
> >  7 files changed, 3 insertions(+), 19 deletions(-)
> >
> > diff --git a/gcc/conditions.h b/gcc/conditions.h
> > index 2308bfc..7cd1e1c 100644
> > --- a/gcc/conditions.h
> > +++ b/gcc/conditions.h
> > @@ -20,10 +20,6 @@ along with GCC; see the file COPYING3.  If not see
> >  #ifndef GCC_CONDITIONS_H
> >  #define GCC_CONDITIONS_H
> >
> > -/* None of the things in the files exist if we don't use CC0.  */
> > -
> > -#ifdef HAVE_cc0
> > -
> >  /* The variable cc_status says how to interpret the condition code.
> > It is set by output routines for an instruction that sets the cc's
> > and examined by output routines for jump instructions.
> > @@ -117,6 +113,4 @@ extern CC_STATUS cc_status;
> >   (cc_status.flags = 0, cc_status.value1 = 0, cc_status.value2 = 0,  \
> >CC_STATUS_MDEP_INIT)
> >
> > -#endif
> > -
> >  #endif /* GCC_CONDITIONS_H */
> > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> > index 483eacb..c1974bb 100644
> > --- a/gcc/emit-rtl.c
> > +++ b/gcc/emit-rtl.c
> > @@ -3541,7 +3541,6 @@ prev_active_insn (rtx uncast_insn)
> >return insn;
> >  }
> >
> > -#ifdef HAVE_cc0
> >  /* Return the next insn that uses CC0 after INSN, which is assumed to
> > set it.  This is the inverse of prev_cc0_setter (i.e., prev_cc0_setter
> > applied to the result of this function should yield INSN).
> > @@ -3589,7 +3588,6 @@ prev_cc0_setter (rtx uncast_insn)
> >
> >return insn;
> >  }
> > -#endif
> >
> >  #ifdef AUTO_INC_DEC
> >  /* Find a RTX_AUTOINC class rtx which matches DATA.  */
> > diff --git a/gcc/final.c b/gcc/final.c
> > index 1fa93d9..41f6bd9 100644
> > --- a/gcc/final.c
> > +++ b/gcc/final.c
> > @@ -191,7 +191,6 @@ static rtx last_ignored_compare = 0;
> >
> >  static int insn_counter = 0;
> >
> > -#ifdef HAVE_cc0
> >  /* This variable contains machine-dependent flags (defined in tm.h)
> > set and examined by output routines
> > that describe how to interpret the condition codes properly.  */
> > @@ -202,7 +201,6 @@ CC_STATUS cc_status;
> > from before the insn.  */
> >
> >  CC_STATUS cc_prev_status;
> > -#endif
> >
> >  /* Number of unmatched NOTE_INSN_BLOCK_BEG notes we have seen.  */
> >
> > diff --git a/gcc/jump.c b/gcc/jump.c
> > index 34b3b7b..bc91550 100644
> > --- a/gcc/jump.c
> > +++ b/gcc/jump.c
> > @@ -1044,8 +1044,6 @@ jump_to_label_p (const rtx_insn *insn)
> >   && JUMP_LABEL (insn) != NULL && !ANY_RETURN_P (JUMP_LABEL 
> > (insn)));
> >  }
> >
> > -#ifdef HAVE_cc0
> > -
> >  /* Return nonzero if X is an RTX that only sets the condition codes
> > and has no side effects.  */
> >
> > @@ -1094,7 +1092,6 @@ sets_cc0_p (const_rtx x)
> >  }
> >return 0;
> >  }
> > -#endif
> >
> >  /* Find all CODE_LABELs referred to in X, and increment their use
> > counts.  If INSN is a JUMP_INSN and there is at least one
> > diff --git a/gcc/recog.c b/gcc/recog.c
> > index a9d3b1f..c3ad86f 100644
> > --- a/gcc/recog.c
> > +++ b/gcc/recog.c
> > @@ -971,7 +971,6 @@ validate_simplify_insn (rtx insn)
> >return ((num_changes_pending () > 0) && (apply_change_group () > 0));
> >  }
> >
> > -#ifdef HAVE_cc0
> >  /* Return 1 if the insn using CC0 set by INSN does not contain
> > any ordered tests applied to the condition codes.
> > EQ and NE tests do not count.  */
> > @@ -988,7 +987,6 @@ next_insn_tests_no_inequality (rtx insn)
> >return (INSN_P (next)
> >   && ! inequality_comparisons_p (PATTERN (next)));
> >  }
> > -#endif
> >
> >  /* Return 1 if OP is a valid general operand for machine mode MODE.
> > This is either a register reference, a memory reference,
> > diff --git a/gcc/recog.h b/gcc/recog.h
> > index 45ea671..8a38b26 100644
> > --- a/gcc/recog.h
> > +++ b/gcc/recog.h
> > @@ -112,9 +112,7 @@ extern void validate_replace_rtx_group (rtx, rtx, rtx);
> >  extern void validate_replace_src_group (rtx, rtx, rtx);
> >  extern bool validate_simplify_insn (rtx insn);
> >  extern int num_changes_pending (void);
> > -#ifdef HAVE_cc0
> >  extern int next_insn_tests_no_inequality (rtx);
> > -#endif
> >  extern bool reg_fits_class_p (const_rtx, reg_class_t, int, machine_mode);
> >
> >  extern int offsettable_memref_p (rtx);
> > diff --git 

Re: [PATCH 00/12] Reduce conditional compilation

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 07:57:19AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders 
> >
> >Hi,
> >
> >This is a first round of patches to reduce the amount of code within #if /
> >#ifdef.  This makes it incrementally easier to not break configs other than
> >the one being built, and moves things slightly closer to using target hooks
> >for everything.
> >
> >each commit bootstrapped and regtested on x86_64-linux-gnu without
> >regression, and whole patch set run through config-list.mk without issue, ok?
> So I think after looking at this patchset, any changes of a similar nature
> you want to make should be considered pre-approved.  Just post them for
> archival purposes, but no need for you to wait for review as long as they
> have the same purpose and overall structure as was seen in these patches.

thanks!  It's also always nice to have someone double check your logic
:-)

Trev

> 
> jeff
> 


[WIP] OpenMP 4 NVPTX support

2015-04-21 Thread Jakub Jelinek
Hi!

Attached is a minimal patch to get at least a trivial OpenMP 4.0 testcase
offloading to NVPTX (the first patch).  The second patch is WIP, just the first
few changes needed to make libgomp build for NVPTX (several weeks of work
at least).

The following seems to work and the output suggests that it was offloaded to
a non-SHM arch:

int
main ()
{
  int v = 0;
  int *w = 0;
  int x = 0;
#pragma omp target
  {
v = 6;
w = &v;
x = 1; // omp_is_initial_device ();
  }
  __builtin_printf ("%d %p %p %d\n", v, &v, w, x);
  return 0;
}

but an already tiny bit more complicated testcase:

extern void *malloc (__SIZE_TYPE__);
extern void free (void *);

int
main ()
{
  int v = 0;
  int *w = 0;
  int x = 0;
#pragma omp target
  {
v = 6;
w = &v;
char *p = malloc (64);
x = 1; // omp_is_initial_device ();
free (p);
  }
  __builtin_printf ("%d %p %p %d\n", v, &v, w, x);
  return 0;
}

suggests that while it is nice that when building the nvptx accel compiler
we build libgcc.a, libc.a, libm.a, libgfortran.a (and in the future hopefully
libgomp.a), nothing attempts to link those in :(.

Is the plan to link those in at mkoffload time (haven't seen any attempt
of mkoffload to invoke the nvptx-none-ld linker though), or link those in
somehow at link_ptx time in the plugin?
In either case, it isn't clear to me how things will work (if at all) in the
case where multiple shared libraries (or an executable and at least one shared
library) have their own offloading bits, and you try to e.g. call an
offloaded function defined in the shared library from an offloaded kernel in
the executable.  If any library needs some global singleton and is
linked multiple times, I have no idea what the PTX JIT will do.

Once that is resolved, another thing will be to figure out how to
efficiently implement the TLS libgomp needs for its ICVs and other state
- right now it uses either __thread, or pthread_getspecific, neither of
these is usable of course.  I've been thinking about an array of those
structures in .shared memory indexed by %tid.x, but I guess that runs into
the issue that the array would need to be declared fixed size and there is a
very small size limitation on .shared memory size.
So perhaps a file scope .shared pointer to global memory, where whoever
launches an OpenMP 4.0 kernel (either the libgomp-plugin-nvptx.so.1 doing
GOMP_run, or later on dynamic parallelism from GOMP_target in the nvptx
libgomp.a) allocates the memory and some wrapper sets the .shared variable
to that allocated memory, then calls the kernel?
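
A host-side C sketch of that last idea, just to pin down the shape.
Everything here is hypothetical: the struct, the function names, and the
serial loop standing in for a real kernel launch.  On the device the
file-scope pointer would live in .shared memory and the index would be
%tid.x.

```c
#include <stdlib.h>

/* Hypothetical per-thread state, standing in for libgomp's ICVs.  */
struct thread_state
{
  int nthreads;
  int thread_num;
};

/* Stand-in for the file-scope .shared pointer to global memory.  */
static struct thread_state *thread_states;

typedef void (*kernel_fn) (unsigned tid, void *data);

/* What the kernel side would use instead of __thread or
   pthread_getspecific.  */
static struct thread_state *
current_state (unsigned tid)
{
  return &thread_states[tid];
}

/* Wrapper: the launcher allocates the array, publishes the pointer,
   runs the kernel once per thread, then tears the state down.  */
int
launch_with_state (kernel_fn kernel, unsigned num_threads, void *data)
{
  struct thread_state *mem = calloc (num_threads, sizeof *mem);
  if (!mem)
    return -1;
  thread_states = mem;
  for (unsigned tid = 0; tid < num_threads; tid++)
    {
      mem[tid].nthreads = (int) num_threads;
      mem[tid].thread_num = (int) tid;
    }
  for (unsigned tid = 0; tid < num_threads; tid++)
    kernel (tid, data);          /* serial stand-in for the launch */
  thread_states = NULL;
  free (mem);
  return 0;
}

/* Sample kernel: add this thread's number to the int that DATA points to.  */
void
sum_thread_nums (unsigned tid, void *data)
{
  *(int *) data += current_state (tid)->thread_num;
}
```

The array size is chosen at launch time, so nothing has to be declared with
a fixed size in .shared memory; only the pointer lives there.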

Jakub
--- libgomp/plugin/plugin-nvptx.c.jj2015-04-21 08:38:00.0 +0200
+++ libgomp/plugin/plugin-nvptx.c   2015-04-21 16:55:25.247470080 +0200
@@ -978,8 +978,8 @@ event_add (enum ptx_event_type type, CUe
 
 void
 nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
- size_t *sizes, unsigned short *kinds, int num_gangs, int num_workers,
- int vector_length, int async, void *targ_mem_desc)
+   size_t *sizes, unsigned short *kinds, int num_gangs,
+   int num_workers, int vector_length, int async, void *targ_mem_desc)
 {
   struct targ_fn_descriptor *targ_fn = (struct targ_fn_descriptor *) fn;
   CUfunction function;
@@ -1137,7 +1137,6 @@ nvptx_host2dev (void *d, const void *h,
   CUresult r;
   CUdeviceptr pb;
   size_t ps;
-  struct nvptx_thread *nvthd = nvptx_thread ();
 
   if (!s)
 return 0;
@@ -1162,7 +1161,8 @@ nvptx_host2dev (void *d, const void *h,
 GOMP_PLUGIN_fatal ("invalid size");
 
 #ifndef DISABLE_ASYNC
-  if (nvthd->current_stream != nvthd->ptx_dev->null_stream)
+  struct nvptx_thread *nvthd = nvptx_thread ();
+  if (nvthd && nvthd->current_stream != nvthd->ptx_dev->null_stream)
 {
   CUevent *e;
 
@@ -1202,7 +1202,6 @@ nvptx_dev2host (void *h, const void *d,
   CUresult r;
   CUdeviceptr pb;
   size_t ps;
-  struct nvptx_thread *nvthd = nvptx_thread ();
 
   if (!s)
 return 0;
@@ -1227,7 +1226,8 @@ nvptx_dev2host (void *h, const void *d,
 GOMP_PLUGIN_fatal ("invalid size");
 
 #ifndef DISABLE_ASYNC
-  if (nvthd->current_stream != nvthd->ptx_dev->null_stream)
+  struct nvptx_thread *nvthd = nvptx_thread ();
+  if (nvthd && nvthd->current_stream != nvthd->ptx_dev->null_stream)
 {
   CUevent *e;
 
@@ -1559,7 +1559,8 @@ GOMP_OFFLOAD_get_name (void)
 unsigned int
 GOMP_OFFLOAD_get_caps (void)
 {
-  return GOMP_OFFLOAD_CAP_OPENACC_200;
+  return GOMP_OFFLOAD_CAP_OPENACC_200
+| GOMP_OFFLOAD_CAP_OPENMP_400;
 }
 
 int
@@ -1759,7 +1760,7 @@ GOMP_OFFLOAD_openacc_parallel (void (*fn
   void *targ_mem_desc)
 {
   nvptx_exec (fn, mapnum, hostaddrs, devaddrs, sizes, kinds, num_gangs,
-   num_workers, vector_length, async, targ_mem_desc);
+ num_workers, vector_length, async, targ_mem_desc);
 }
 
 void
@@ -1889,3 +1890,27 @@ GOMP_OFFLOAD_openacc_set_cuda_stream (in
 {
   return nvptx_set_cuda_stream (async, stream);
 }
+
+void

Re: [PATCH][doc] Improve pipeline description docs a bit

2015-04-21 Thread Sandra Loosemore

On 04/20/2015 04:31 AM, Kyrill Tkachov wrote:

Hi all,

This patch attempts to improve the pipeline description documentation.
It fixes some grammar errors, typos and clarifies some concepts.

The sections on the syntactic constructs are formatted to have a
small description, an example, a description of syntax elements and some
elaboration.

Is this ok for trunk?

Thanks,
Kyrill

2014-04-20  Kyrylo Tkachov  

* doc/md.texi (Specifying processor pipeline description):
Improve wording.
Clarify some constructs.


H.  I guess overall this is an improvement, but I still see quite a 
few things that need tweaking (and I wasn't even looking very hard).



+latency time}.  Instructions may not complete execution until all inputs
+to the instruction have been evaluated and are available for use.
+Taking data dependence delays into account is simple.


I don't think the above sentence adds anything and could be deleted.


+The data dependence (true, output, and anti-dependence) delay between two
+instructions is modelled as being constant.  In most cases this approach is
+adequate.  The second kind of interlock delays is a reservation delay.
+The reservation delay means that two or more executing instructions will require


s/will require/require/


+
+The define_automaton construct declares the names of automata.
+It takes the following form:

 @smallexample
 (define_automaton @var{automata-names})
 @end smallexample

 @var{automata-names} is a string giving names of the automata.  The
-names are separated by commas.  All the automata should have unique names.
-The automaton name is used in the constructions @code{define_cpu_unit} and
-@code{define_query_cpu_unit}.
+names are separated by commas.  All the automata must have unique names.
+The automaton name is used to bind @code{define_cpu_unit} and
+@code{define_query_cpu_unit} constructs to specific automata.
+
+This construct declares the names of automata.


You already said that a few sentences above; delete this one.


+The define_query_cpu_unit construct can be used to define units


Add @code{} markup here.


-@var{default_latency} is a number giving latency time of the
+@var{default_latency} is a number giving the latency of the
 instruction.  There is an important difference between the old
 description and the automaton based pipeline description.  The latency
-time is used for all dependencies when we use the old description.  In
-the automaton based pipeline description, the given latency time is only
-used for true dependencies.  The cost of anti-dependencies is always
-zero and the cost of output dependencies is the difference between
-latency times of the producing and consuming insns (if the difference
-is negative, the cost is considered to be zero).  You can always
-change the default costs for any description by using the target hook
+is used for all types of dependencies when we used the old description.  In
+the automaton based pipeline description, the  latency is only taken into
+account when analysing true dependencies (i.e. not output or
+anti-dependencies).  The cost of anti-dependencies is always zero and the
+cost of output dependencies is the difference between the latencies
+of the producing and consuming insns (if the difference is negative, the
+cost is considered to be zero).  You can always change the default cost
+between any pair of insns by using the target hook
 @code{TARGET_SCHED_ADJUST_COST} (@pxref{Scheduling}).


Here I am confused.  What is the "old description"?  If this is a 
leftover of some obsolete way of doing things, the references to it 
should be deleted.
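
For reference, the default cost rule that the quoted paragraph describes can
be written down as a few lines of C.  This is a sketch of the documented
rule only, with made-up names; it is not the scheduler's code and it ignores
any adjustment made through TARGET_SCHED_ADJUST_COST.

```c
enum dep_kind { DEP_TRUE, DEP_OUTPUT, DEP_ANTI };

/* Hypothetical helper: default dependence cost in the automaton-based
   description, given the latencies of the producing and consuming
   insns.  */
int
default_dep_cost (enum dep_kind kind, int producer_latency,
                  int consumer_latency)
{
  switch (kind)
    {
    case DEP_TRUE:
      /* Only true dependencies see the full latency.  */
      return producer_latency;
    case DEP_ANTI:
      /* Anti-dependencies are always free.  */
      return 0;
    case DEP_OUTPUT:
      {
        /* Difference of the latencies, clamped at zero.  */
        int diff = producer_latency - consumer_latency;
        return diff > 0 ? diff : 0;
      }
    }
  return 0;
}
```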



+construct.  You must avoid having more than one
+@code{define_insn_reservation} matching any one RTL insn, as the behaviour is


s/behaviour/behavior/


+The following construct is used to describe a bypass i.e. an exception
+in the execution latency between a pair of instructions:


@dfn{bypass} ??


 @var{guard} is an optional string giving the name of a C function which
-defines an additional guard for the bypass.  The function will get the
+defines an additional guard for the bypass.  The function will take the
 two insns as parameters.  If the function returns zero the bypass will
 be ignored for this case.  The additional guard is necessary to


s/will take/takes/
s/will be ignored/is ignored/


+If there is more one bypass with the same output and input insns, the
+chosen bypass is the first bypass with a guard function in its definition that
+returns nonzero.  If there is no such bypass, then a bypass without a guard
+function is chosen.  These constructs can be used to describe, for example,
+forwarding paths in a processor pipeline.


I don't understand what the last sentence has to do with the rest of 
this paragraph.  If this is part of the general discussion of what 
define_bypass does, it should be moved up to the paragraph where the 
concept of a bypass is introduced.
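
The selection rule in the quoted paragraph (first guarded bypass whose guard
returns nonzero, otherwise an unguarded one) can be sketched as below.  The
types and names are invented for the example, insns are reduced to plain
ints, and the sketch picks the first unguarded bypass as the fallback, which
the quoted text does not specify.

```c
#include <stddef.h>

/* Hypothetical model: a guard takes the producing and consuming insns
   and returns nonzero if the bypass applies.  */
typedef int (*bypass_guard) (int producer, int consumer);

struct bypass
{
  int latency;
  bypass_guard guard;   /* NULL when the bypass has no guard */
};

/* Return the chosen bypass among N candidates for the same output and
   input insns, or NULL if none applies.  */
const struct bypass *
choose_bypass (const struct bypass *b, size_t n, int producer, int consumer)
{
  const struct bypass *unguarded = NULL;
  for (size_t i = 0; i < n; i++)
    {
      if (b[i].guard)
        {
          if (b[i].guard (producer, consumer))
            return &b[i];
        }
      else if (!unguarded)
        unguarded = &b[i];
    }
  return unguarded;
}

/* Sample guards for trying the function out.  */
int
guard_never (int producer, int consumer)
{
  (void) producer;
  (void) consumer;
  return 0;
}

int
guard_same_parity (int producer, int consumer)
{
  return (producer - consumer) % 2 == 0;
}
```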



-@var{unit-names} is a string giving names o

[PATCH][AARCH64]Use mov for add with large immediate.

2015-04-21 Thread Renlin Li

Hi all,

This is a simple patch to generate a move instruction to temporarily 
hold the large immediate for an add instruction.


The GCC regression tests have been run using an aarch64-none-elf toolchain. No 
new issues.


Okay for trunk?

Regards,
Renlin Li

gcc/ChangeLog:

2015-04-21  Renlin Li  

* config/aarch64/aarch64.md (add3): Use mov when allowed.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1f4169e..9ea1939 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1414,18 +1414,28 @@
   "
   if (! aarch64_plus_operand (operands[2], VOIDmode))
 {
-  rtx subtarget = ((optimize && can_create_pseudo_p ())
-		   ? gen_reg_rtx (mode) : operands[0]);
   HOST_WIDE_INT imm = INTVAL (operands[2]);
-
-  if (imm < 0)
-	imm = -(-imm & ~0xfff);
+  if (aarch64_move_imm (imm, mode)
+	  && can_create_pseudo_p ())
+  {
+	rtx tmp = gen_reg_rtx (mode);
+	emit_move_insn (tmp, operands[2]);
+	operands[2] = tmp;
+  }
   else
-imm &= ~0xfff;
+  {
+	rtx subtarget = ((optimize && can_create_pseudo_p ())
+			 ? gen_reg_rtx (mode) : operands[0]);
+
+	if (imm < 0)
+	  imm = -(-imm & ~0xfff);
+	else
+	  imm &= ~0xfff;
 
-  emit_insn (gen_add3 (subtarget, operands[1], GEN_INT (imm)));
-  operands[1] = subtarget;
-  operands[2] = GEN_INT (INTVAL (operands[2]) - imm);
+	emit_insn (gen_add3 (subtarget, operands[1], GEN_INT (imm)));
+	operands[1] = subtarget;
+	operands[2] = GEN_INT (INTVAL (operands[2]) - imm);
+  }
 }
   "
 )
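
For reference, the arithmetic of the fallback path (the branch kept in the
"else" arm of the diff) can be tried out in plain C.  The struct and
function names here are made up for the sketch; only the masking with
0xfff is taken from the expander.

```c
#include <stdint.h>

/* Hypothetical container for the two addends.  */
struct add_split
{
  int64_t high;   /* added first, low 12 bits of its magnitude clear */
  int64_t low;    /* remainder, magnitude below 4096 */
};

/* Split IMM so that high + low == imm, as the two-add sequence does.
   IMM is assumed not to be INT64_MIN, since -imm would overflow.  */
struct add_split
split_add_immediate (int64_t imm)
{
  struct add_split s;
  if (imm < 0)
    s.high = -(-imm & ~(int64_t) 0xfff);
  else
    s.high = imm & ~(int64_t) 0xfff;
  s.low = imm - s.high;
  return s;
}
```

For 0x12345 this gives 0x12000 and 0x345, i.e. two adds.  The new path in
the patch instead moves the whole constant into a temporary register when
aarch64_move_imm accepts it, and adds that register.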


Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread David Malcolm
On Thu, 2015-04-16 at 10:26 -0700, Mike Stump wrote:
> On Apr 16, 2015, at 8:01 AM, David Malcolm  wrote:
> > Attached is a work-in-progress patch for a new
> >  -Wmisleading-indentation
> > warning I've been experimenting with, for GCC 6.
> 
> Seems like a nice idea in general.
> 
> Does it also handle:
> 
> if (cond);
>   stmt;
> 
> ?  Would be good to add that to the test suite, as that is another hard to 
> spot common error that should be caught.

Not yet, but I agree that it would be a good thing to issue a warning
for.

> I do think that it is reasonable to warn for things like:
> 
>   stmt;
> stmt;
> 
> one of those two lines is likely misindented, though, maybe you want to start 
> with the high payback things first.

> > An issue here is how to determine (i), or if it's OK to default to 8
> 
> Yes, 8 is the proper value to default it to.
> 
> > and have a command-line option (param?) to override it? (though what about,
> > say, each header file?)
> 
> I’ll abstain from this.  The purist in me says no option for other
> than 8, life goes on.  20 years ago, someone was confused over hard v
> soft tabbing and what exactly the editor key TAB does.  That confusion
> is over, the 8 people have won.  Catering to other than 8 gives the
> impression that the people that lost still have a chance at
> winning.  :-)
> 
> > Thoughts on this, and on the patch?
> 
> Would be nice to have a stricter version that warns about all wildly 
> inconsistently or wrongly indented lines.
> 
> {
>   stmt;
> stmt;  // must be same as above
> }
> 
> {
> stmt; // must be indented at least 1
> }
> 
> if (cond)
> stmt;  // must be indented at least 1

I think I want to make a distinction between

(A) classic C "gotchas", like the one in my mail and the:

  if (cond);
stmt;

one you mentioned above

vs

(B) wrong/inconsistent indentation.

I think (A) is high-value, since it detects subtly wrong code, likely to
have misled the reader, whereas I don't find (B) as interesting.   I
think (A) is "misleading", whereas (B) is "wrong"; the ugliness of the
(B) cases tends to give me a "this code is ugly; beware, danger Will
Robinson!" reaction, whereas (A) is less ugly and thus more dangerous.

(if that makes sense; this may just be my own visceral reaction to the
erroneous code).

Or to put it another way, I hope to make (A) good enough to go into
-Wall, whereas I think (B) would meet more resistance. 
Also, I think autogenerated code is more likely to run into (B) than
(A).

I have the patch working now for the C++ frontend.  Am attaching the
work-in-progress (sans ChangeLog).  This one (v2) bootstrapped and
regtested on x86_64-unknown-linux-gnu (Fedora 20), with:
  63 new "PASS" results in gcc.sum
  189 new "PASS" results in g++.sum
for the new test cases (relative to a control build of r48).

I also moved the visual-parser.c/h to c-family, to make use of the
-ftabstop option Tom mentioned in another mail.
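
To show why the tab stop matters, here is a simplified sketch of the
column computation and of the kind of comparison involved.  Neither
function is taken from visual-parser.c; the names and the exact rule are
assumptions made for illustration, and the real patch looks at more context
than three column numbers.

```c
#include <stdbool.h>

/* Visual column (0-based) of the first non-blank character in LINE,
   expanding tabs to the next multiple of TABSTOP.  */
int
visual_column (const char *line, int tabstop)
{
  int col = 0;
  for (; *line == ' ' || *line == '\t'; line++)
    {
      if (*line == '\t')
        col = (col / tabstop + 1) * tabstop;
      else
        col++;
    }
  return col;
}

/* Simplified rule: the statement after an unbraced guarded body looks
   guarded if the body is indented past the guard and the next statement
   lines up with the body.  */
bool
looks_guarded (int guard_col, int body_col, int next_col)
{
  return body_col > guard_col && next_col == body_col;
}
```

A line starting with one tab lands in column 8 with the default tab stop
but in column 4 with a tab stop of 4, so the same source can line up with a
space-indented body under one setting and not under the other.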

I also made it identify the kind of clause, so error messages say things
like:

./Wmisleading-indentation-1.c:10:7: warning: statement is indented as if
it were guarded by... [-Wmisleading-indentation]
./Wmisleading-indentation-1.c:8:3: note: ...this 'if' clause, but it is
not

which makes it easier to read, especially when dealing with nesting.

This hasn't yet had any performance/leak fixes so it isn't ready as is.
I plan to look at making it warn about the:

  if (cond);
stmt;

gotcha next, before trying to optimize it.

(and no ChangeLog yet)

Dave
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 80c91f0..8154469 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1143,7 +1143,8 @@ C_COMMON_OBJS = c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o \
   c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o \
   c-family/c-semantics.o c-family/c-ada-spec.o \
   c-family/c-cilkplus.o \
-  c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o
+  c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o \
+  c-family/visual-parser.o
 
 # Language-independent object files.
 # We put the insn-*.o files first so that a parallel make will build
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 983f4a8..88f1f94 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -554,6 +554,10 @@ Wmemset-transposed-args
 C ObjC C++ ObjC++ Var(warn_memset_transposed_args) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
 Warn about suspicious calls to memset where the third argument is constant literal zero and the second is not
 
+Wmisleading-indentation
+C C++ Common Var(warn_misleading_indentation) Warning
+Warn when the indentation of the code does not reflect the block structure
+
 Wmissing-braces
 C ObjC C++ ObjC++ Var(warn_missing_braces) Warning LangEnabledBy(C ObjC,Wall)
 Warn about possibly missing braces around initializers
diff --git a/gcc/c-family/visual-parser.c b/gcc/c-family/visual-parser.c
new file mode 100644
index 000..b1fcb8b
--- /dev/null

Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 12:07:00PM -0400, David Malcolm wrote:
> On Thu, 2015-04-16 at 10:26 -0700, Mike Stump wrote:
> > On Apr 16, 2015, at 8:01 AM, David Malcolm  wrote:
> > > Attached is a work-in-progress patch for a new
> > >  -Wmisleading-indentation
> > > warning I've been experimenting with, for GCC 6.
> > 
> > Seems like a nice idea in general.
> > 
> > Does it also handle:
> > 
> > if (cone);
> >   stmt;
> > 
> > ?  Would be good to add that to the test suite, as that is another hard to 
> > spot common error that should be caught.
> 
> Not yet, but I agree that it would be a good thing to issue a warning
> for.
> 
> > I do think that it is reasonable to warn for things like:
> > 
> >   stmt;
> > stmt;
> > 
> > one of those two lines is likely misindented, though, maybe you want to 
> > start with the high payback things first.
> 
> > > An issue here is how to determine (i), or if it's OK to default to 8
> > 
> > Yes, 8 is the proper value to default it to.
> > 
> > > and have a command-line option (param?) to override it? (though what 
> > > about,
> > > say, each header file?)
> > 
> > I’ll abstain from this.  The purist in me says no option for other
> > than 8, life goes on.  20 years ago, someone was confused over hard v
> > soft tabbing and what exactly the editor key TAB does.  That confusion
> > is over, the 8 people have won.  Catering to other than 8 gives the
> > impression that the people that lost still have a chance at
> > winning.  :-)
> > 
> > > Thoughts on this, and on the patch?
> > 
> > Would be nice to have a stricter version that warns about all wildly 
> > inconsistently or wrongly indented lines.
> > 
> > {
> >   stmt;
> > stmt;  // must be same as above
> > }
> > 
> > {
> > stmt; // must be indented at least 1
> > }
> > 
> > if (cond)
> > stmt;  // must be indented at least 1
> 
> I think I want to make a distinction between
> 
> (A) classic C "gotchas", like the one in my mail and the:
> 
>   if (cond);
>     stmt;
> 
> one you mentioned above
> 
> vs
> 
> (B) wrong/inconsistent indentation.
> 
> I think (A) is high-value, since it detects subtly wrong code, likely to
> have misled the reader, whereas I don't find (B) as interesting.   I
> think (A) is "misleading", whereas (B) is "wrong"; the ugliness of the
> (B) cases tends to give me a "this code is ugly; beware, danger Will
> Robinson!" reaction, whereas (A) is less ugly and thus more dangerous.

So, while I was working on ifdef stuff in gcc I found the following
pattern

#ifdef FOO
if (FOO)
#endif
  bar ();

  which you may want to handle somehow.  In that sort of case one side
  of the ifdef will necessarily have the (B) type of misindentation.
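
  A compilable sketch of that pattern, assuming placeholder names (FOO,
  bar, run and the calls counter are stand-ins for illustration, not
  code from gcc):

```c
/* Build with -DFOO=0 or -DFOO=1 to get the guarded call, or without
   -DFOO to get the unconditional one.  */
static int calls;

static void
bar (void)
{
  calls++;
}

int
run (void)
{
  calls = 0;
#ifdef FOO
  if (FOO)
#endif
    bar ();  /* correctly indented only when FOO is defined */
  return calls;
}
```

  When FOO is not defined, the call to bar () is unconditional but still
  indented as if it were guarded, so a warning that looks only at the
  preprocessed token stream would have to cope with this.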

  Trev

> 
> (if that makes sense; this may just be my own visceral reaction to the
> erroneous code).
> 
> Or to put it another way, I hope to make (A) good enough to go into
> -Wall, whereas I think (B) would meet more resistance. 
> Also, I think autogenerated code is more likely to run into (B) than
> (A).
> 
> I have the patch working now for the C++ frontend.  Am attaching the
> work-in-progress (sans ChangeLog).  This one (v2) bootstrapped and
> regrtested on x86_64-unknown-linux-gnu (Fedora 20), with:
>   63 new "PASS" results in gcc.sum
>   189 new "PASS" results in g++.sum
> for the new test cases (relative to a control build of r48).
> 
> I also moved the visual-parser.c/h to c-family, to make use of the
> -ftabstop option Tom mentioned in another mail.
> 
> I also made it identify the kind of clause, so error messages say things
> like:
> 
> ./Wmisleading-indentation-1.c:10:7: warning: statement is indented as if
> it were guarded by... [-Wmisleading-indentation]
> ./Wmisleading-indentation-1.c:8:3: note: ...this 'if' clause, but it is
> not
> 
> which makes it easier to read, especially when dealing with nesting.
> 
> This hasn't yet had any performance/leak fixes so it isn't ready as is.
> I plan to look at making it warn about the:
> 
>   if (cond);
>     stmt;
> 
> gotcha next, before trying to optimize it.
> 
> (and no ChangeLog yet)
> 
> Dave


Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread Manuel López-Ibáñez

On 21/04/15 18:07, David Malcolm wrote:

On Thu, 2015-04-16 at 10:26 -0700, Mike Stump wrote:

Does it also handle:

if (cone);
   stmt;

?  Would be good to add that to the test suite, as that is another hard to spot 
common error that should be caught.


Not yet, but I agree that it would be a good thing to issue a warning
for.


GCC already warns for the above:

test.c:3:9: warning: suggest braces around empty body in an ‘if’ statement
[-Wempty-body]
   if (a);
         ^
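
For reference, a minimal function of the kind that triggers this
diagnostic (the name always_one is hypothetical; -Wempty-body is also
enabled by -Wextra):

```c
/* The ';' after the condition is the entire body of the 'if', so the
   increment below runs whatever the value of A.  */
int
always_one (int a)
{
  int n = 0;
  if (a);
    n++;  /* not guarded, despite the indentation */
  return n;
}
```

The function returns 1 for every argument, which is rarely what the
indentation is meant to express.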

Cheers,

Manuel.


Re: [RFC] Dynamically aligning the stack

2015-04-21 Thread Steve Ellcey
On Tue, 2015-04-14 at 10:08 -0700, H.J. Lu wrote:

> We have done just that in GCC 4.4 to implement dynamic stack
> alignment on x86 :-).  Some of x86 backend changes for dynamic
> stack alignment are x86 psABI specific.  Others are historical,
> like -mstackrealign, which was the old attempt for dynamic stack
> alignment.

I am a bit confused about the history of stack alignment on x86.  So I
guess -mpreferred-stack-boundary=X came first and is now
obsolete/deprecated.  But I thought -mstackrealign=X was the current
method of aligning the stack, but based on this comment and the patches
you pointed me at I guess this is also obsolete (or at least deprecated)
and that -mincoming-stack-boundary=X is the current option that should
be used.  But I am not sure how this option works.

Obviously it tells GCC what assumption to make about stack alignment at
the start of a function but how do you tell GCC what alignment you want
for the function?  Or does GCC figure that out for itself based on the
instructions and data types it sees in the function?

Steve Ellcey
sell...@imgtec.com




Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Andrew Hughes
----- Original Message -----
> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> > bump the libgcj soname on the trunk, as done for every release cycle,
> 
> Is that really needed though these days?
> Weren't there basically zero changes to libjava (both libjava and
> libjava/classpath) in the last 2 or more years?
> The few ones were mostly updating Copyright notices, minor configure
> changes, but I really haven't seen anything ABI changing for quite a while.
> 

On the Classpath side, there's a bunch of stuff to merge in that would
change the ABI. It's a matter of finding a good point at which to do it
and time to do so. I keep missing the right point in the gcc lifecycle.

>   Jakub
> 

-- 
Andrew :)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222

PGP Key: rsa4096/248BDC07 (hkp://keys.gnupg.net)
Fingerprint = EC5A 1F5E C0AD 1D15 8F1F  8F91 3B96 A578 248B DC07



Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread Mike Stump
On Apr 21, 2015, at 9:07 AM, David Malcolm  wrote:
> I think I want to make a distinction between
> 
> (A) classic C "gotchas", like the one in my mail and the:
> 
>  if (cond);
>    stmt;
> 
> one you mentioned above
> 
> vs
> 
> (B) wrong/inconsistent indentation.
> 
> I think (A) is high-value, since it detects subtly wrong code, likely to
> have misled the reader, whereas I don't find (B) as interesting.

Ok.  I don’t have any problem with that.  Going for the high-value cases only 
makes the problem space smaller, makes it more likely that you can implement it 
and do a good job, and avoids the false positives and all sorts of what-ifs 
that the other class would expose you to.

I like your work and your plan.

Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 01:04:04PM -0400, Andrew Hughes wrote:
> ----- Original Message -----
> > On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> > > bump the libgcj soname on the trunk, as done for every release cycle,
> > 
> > Is that really needed though these days?
> > Weren't there basically zero changes to libjava (both libjava and
> > libjava/classpath) in the last 2 or more years?
> > The few ones were mostly updating Copyright notices, minor configure
> > changes, but I really haven't seen anything ABI changing for quite a while.
> > 
> 
> On the Classpath side, there's a bunch of stuff to merge in that would
> change the ABI. It's a matter of finding a good point at which to do it
> and time to do so. I keep missing the right point in the gcc lifecycle.

Now might be a good time (any time in the next 6.5 months or so), and if that
is done, I certainly have no issue with bumping the soname.

Jakub

