Re: match.pd: Three new patterns

2015-06-23 Thread Richard Biener
On Mon, 22 Jun 2015, Marek Polacek wrote:

> On Fri, Jun 19, 2015 at 05:51:53PM +0200, Marc Glisse wrote:
> > On Fri, 19 Jun 2015, Marek Polacek wrote:
> > 
> > >+/* x + y - (x | y) -> x & y */
> > >+(simplify
> > >+ (minus (plus @0 @1) (bit_ior @0 @1))
> > >+ (if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_SATURATING (type))
> > >+  (bit_and @0 @1)))
> > >+
> > >+/* (x + y) - (x & y) -> x | y */
> > >+(simplify
> > >+ (minus (plus @0 @1) (bit_and @0 @1))
> > >+ (if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_SATURATING (type))
> > >+  (bit_ior @0 @1)))
> > 
> > It could be macroized so they are handled by the same piece of code, but
> > that's not important for a couple lines.
>  
> Yeah, that could be done, but I didn't see much value in doing that.
> 
> > As far as I can tell, TYPE_SATURATING is for fixed point numbers only, are
> > we allowed to use bit_ior/bit_and on those? I never know what kind of
> > integers are supposed to be supported, so I would have checked
> > TYPE_OVERFLOW_UNDEFINED (type) || TYPE_OVERFLOW_WRAPS (type) since those are
> > the 2 cases where we know it is safe (for TYPE_OVERFLOW_TRAPS it is never
> > clear if we are supposed to preserve traps or just avoid introducing new
> > ones). Well, the reviewer will know, I'll shut up :-)
>  
> I think you're right about TYPE_SATURATING so I've dropped that and instead
> replaced it with TYPE_OVERFLOW_TRAPS.  That should do the right thing
> together with TYPE_OVERFLOW_SANITIZED.

Are you sure?  The point is that if the minus or the plus in the original
expression saturate the result isn't correct, no?
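
To make the saturation concern concrete, here is a minimal C sketch. It
models an 8-bit unsigned saturating type by hand and compares it with
ordinary wrapping arithmetic. The helpers sat_add8 and sat_sub8 are
hypothetical and are not GCC's fixed-point implementation; they only
illustrate why the fold is unsafe once plus or minus clamps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit unsigned saturating add: clamps at 255.  */
static uint8_t sat_add8 (uint8_t a, uint8_t b)
{
  unsigned sum = (unsigned) a + b;
  return sum > 255 ? 255 : (uint8_t) sum;
}

/* Hypothetical 8-bit unsigned saturating subtract: clamps at 0.  */
static uint8_t sat_sub8 (uint8_t a, uint8_t b)
{
  return a < b ? 0 : (uint8_t) (a - b);
}

/* (x + y) - (x | y) with wrapping (modulo 256) arithmetic.  */
static uint8_t wrap_form (uint8_t x, uint8_t y)
{
  return (uint8_t) ((uint8_t) (x + y) - (x | y));
}

/* The same expression with saturating plus and minus.  */
static uint8_t sat_form (uint8_t x, uint8_t y)
{
  return sat_sub8 (sat_add8 (x, y), (uint8_t) (x | y));
}

/* Self-check for x = 200, y = 100, where x + y exceeds 255.  */
static void check_counterexample (void)
{
  assert (wrap_form (200, 100) == (200 & 100)); /* 64 in both cases */
  assert (sat_form (200, 100) == 19);           /* 255 - 236 */
}
```

With x = 200 and y = 100 the wrapping form gives 64, which is x & y, while
the saturating form gives 19, so replacing the expression with x & y would
change the result for such a type.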

> > (I still believe that the necessity for TYPE_OVERFLOW_SANITIZED here points
> > to a design issue in ubsan, but it is way too late to discuss that)
> 
> I think delayed folding would help here a bit.  Also, we've been talking
> about doing the signed overflow sanitization earlier, but so far I didn't
> implement that.  And -ftrapv should be merged into the ubsan infrastructure
> some day.
> 
> > It is probably not worth the trouble adding the variant:
> > x+(y-(x&y)) -> x|y
> > since it decomposes as
> > y-(x&y) -> y&~x
> > x+(y&~x) -> x|y
> > x+(y-(x|y)) -> x-(x&~y) -> x&y is less likely to happen because the first
> > transform y-(x|y) -> -(x&~y) increases the number of insns. Bah, we can't
> > handle everything...
> 
> That sounds about right ;).  Thanks!
> 
> So, Richi, is this variant ok as well?  I also added one ubsan test.
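
The two patterns and the decomposition steps quoted above can be checked
mechanically for a wrapping type. The following is only a sketch: it
assumes 8-bit unsigned operands and modulo-256 arithmetic, and the function
name check_identities is made up for illustration. It says nothing about
trapping, sanitized or saturating types.

```c
#include <assert.h>
#include <stdint.h>

/* Assert each identity for all 8-bit x and y under wrapping arithmetic
   and return the number of (x, y) pairs that were checked.  */
static unsigned check_identities (void)
{
  unsigned checked = 0;

  for (unsigned i = 0; i < 256; i++)
    for (unsigned j = 0; j < 256; j++)
      {
        uint8_t x = (uint8_t) i, y = (uint8_t) j;

        /* The two patterns from the patch.  */
        assert ((uint8_t) ((x + y) - (x | y)) == (uint8_t) (x & y));
        assert ((uint8_t) ((x + y) - (x & y)) == (uint8_t) (x | y));

        /* The decomposition steps.  */
        assert ((uint8_t) (y - (x & y)) == (uint8_t) (y & ~x));
        assert ((uint8_t) (x + (y & ~x)) == (uint8_t) (x | y));
        assert ((uint8_t) (y - (x | y)) == (uint8_t) -(x & ~y));
        assert ((uint8_t) (x - (x & ~y)) == (uint8_t) (x & y));

        checked++;
      }
  return checked;
}
```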

As said, removing TYPE_SATURATING doesn't sound correct.  I'm not sure
about TYPE_OVERFLOW_TRAPS - we're certainly removing traps elsewhere
(look for the scarce use of this flag in fold-const.c and match.pd
where I only preserved those that were originally in fold-const.c).

So, TYPE_OVERFLOW_TRAPS is your choice but TYPE_SATURATING is
required IMHO.

Richard.

> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2015-06-22  Marek Polacek  
> 
>   * match.pd ((x + y) - (x | y) -> x & y,
>   (x + y) - (x & y) -> x | y): New patterns.
> 
>   * gcc.dg/fold-minus-4.c: New test.
>   * gcc.dg/fold-minus-5.c: New test.
>   * c-c++-common/ubsan/overflow-add-5.c: New test.
> 
> diff --git gcc/match.pd gcc/match.pd
> index badb80a..6d520ef 100644
> --- gcc/match.pd
> +++ gcc/match.pd
> @@ -343,6 +343,18 @@ along with GCC; see the file COPYING3.  If not see
>   (plus:c (bit_and @0 @1) (bit_ior @0 @1))
>   (plus @0 @1))
>  
> +/* (x + y) - (x | y) -> x & y */
> +(simplify
> + (minus (plus @0 @1) (bit_ior @0 @1))
> + (if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_OVERFLOW_TRAPS (type))
> +  (bit_and @0 @1)))
> +
> +/* (x + y) - (x & y) -> x | y */
> +(simplify
> + (minus (plus @0 @1) (bit_and @0 @1))
> + (if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_OVERFLOW_TRAPS (type))
> +  (bit_ior @0 @1)))
> +
>  /* (x | y) - (x ^ y) -> x & y */
>  (simplify
>   (minus (bit_ior @0 @1) (bit_xor @0 @1))
> diff --git gcc/testsuite/c-c++-common/ubsan/overflow-add-5.c gcc/testsuite/c-c++-common/ubsan/overflow-add-5.c
> index e69de29..905a60a 100644
> --- gcc/testsuite/c-c++-common/ubsan/overflow-add-5.c
> +++ gcc/testsuite/c-c++-common/ubsan/overflow-add-5.c
> @@ -0,0 +1,30 @@
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=signed-integer-overflow" } */
> +
> +int __attribute__ ((noinline))
> +foo (int i, int j)
> +{
> +  return (i + j) - (i | j);
> +}
> +
> +/* { dg-output "signed integer overflow: 2147483647 \\+ 1 cannot be represented in type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*signed integer overflow: -2147483648 - 2147483647 cannot be represented in type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
> +
> +int __attribute__ ((noinline))
> +bar (int i, int j)
> +{
> +  return (i + j) - (i & j);
> +}
> +
> +/* { dg-output "\[^\n\r]*signed integer overflow: 2147483647 \\+ 1 cannot be represented in type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*signed integer overflow: -2147483648 - 1 cannot be represented in type 'int'" } */
> +
> +int
> +main ()
> +{
> +  in

Re: match.pd: Three new patterns

2015-06-23 Thread Marek Polacek
On Tue, Jun 23, 2015 at 09:56:33AM +0200, Richard Biener wrote:
> > I think you're right about TYPE_SATURATING so I've dropped that and instead
> > replaced it with TYPE_OVERFLOW_TRAPS.  That should do the right thing
> > together with TYPE_OVERFLOW_SANITIZED.
> 
> Are you sure?  The point is that if the minus or the plus in the original
> expression saturate the result isn't correct, no?
 
Yes, but I thought that TYPE_SATURATING is only true for fixed-point, i.e.
those _Accum/_Sat/_Fract (?), and you can't do bitwise & or | on them, which
means that the TYPE_SATURATING check wouldn't be necessary.

> As said, removing TYPE_SATURATING doesn't sound correct.  I'm not sure
> about TYPE_OVERFLOW_TRAPS - we're certainly removing traps elsewhere
> (look for the scarce use of this flag in fold-const.c and match.pd
> where I only preserved those that were originally in fold-const.c).
> 
> So, TYPE_OVERFLOW_TRAPS is your choice but TYPE_SATURATING is
> required IMHO.

Ok, I guess I'll add TYPE_SATURATING back, even though I'm not clear
on that one, and commit.

Thanks,

Marek


[nvptx] gcc/testsuite/gcc.target/nvptx/

2015-06-23 Thread Thomas Schwinge
Hi!

Written and internally approved by Bernd nearly a year ago; now committed
to trunk in r224822:

commit 5b988b24f5e557e19242d50179aa4e3e0c3752d9
Author: tschwinge 
Date:   Tue Jun 23 08:17:23 2015 +

[nvptx] gcc/testsuite/gcc.target/nvptx/

We don't claim to support "K&R C" for nvptx, but needed this corresponding
functionality ("incomplete prototypes") to support the Fortran
libgomp/openacc_lib.h file.

gcc/testsuite/
* gcc.target/nvptx/nvptx.exp: New file.
* gcc.target/nvptx/proto-1.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@224822 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog  |  5 
 gcc/testsuite/gcc.target/nvptx/nvptx.exp | 42 
 gcc/testsuite/gcc.target/nvptx/proto-1.c | 13 ++
 3 files changed, 60 insertions(+)

diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog
index d5329af..f17ae0d 100644
--- gcc/testsuite/ChangeLog
+++ gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2015-06-23  Thomas Schwinge  
+
+   * gcc.target/nvptx/nvptx.exp: New file.
+   * gcc.target/nvptx/proto-1.c: Likewise.
+
 2015-06-23  Bin Cheng  
 
PR tree-optimization/66449
diff --git gcc/testsuite/gcc.target/nvptx/nvptx.exp gcc/testsuite/gcc.target/nvptx/nvptx.exp
new file mode 100644
index 000..402c8d1
--- /dev/null
+++ gcc/testsuite/gcc.target/nvptx/nvptx.exp
@@ -0,0 +1,42 @@
+# Specific regression driver for nvptx.
+# Copyright (C) 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+# 
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+# 
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't a nvptx target.
+if ![istarget nvptx*-*-*] then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \
+   "" $DEFAULT_CFLAGS
+
+# All done.
+dg-finish
diff --git gcc/testsuite/gcc.target/nvptx/proto-1.c gcc/testsuite/gcc.target/nvptx/proto-1.c
new file mode 100644
index 000..5f77359
--- /dev/null
+++ gcc/testsuite/gcc.target/nvptx/proto-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+
+int f(void)
+{
+  const int dev = 4;
+
+  /* Check that without an explicit prototype, we deduce from call site the
+ signature for the (mandatory in PTX) prototype.  */
+  /* extern int acc_on_device_(int *); */
+  /* { dg-final { scan-assembler-not "\\\.callprototype" } } */
+  /* { dg-final { scan-assembler "\\\.extern \\\.func \\\(\[^,\n\r\]+\\\)acc_on_device_ \\\(\[^,\n\r\]+\\\);" } } */
+  return !acc_on_device_(&dev);
+}


Grüße,
 Thomas




[Patch SRA] Fix PR66119 by calling get_move_ratio in SRA

2015-06-23 Thread James Greenhalgh

Hi,

The problem in PR66119 is that we assume MOVE_RATIO will be constant
for a compilation run, such that we only need to read it once at compiler
startup if we want to set up defaults for
--param sra-max-scalarization-size-Ospeed and
--param sra-max-scalarization-size-Osize.

This assumption is faulty. Some targets may have MOVE_RATIO set up
to use state which depends on the selected processor - which may vary
per function for a switchable target.

This patch fixes the issue by always calling get_move_ratio in the SRA
code, ensuring that an up-to-date value is used.

Unfortunately, this means we have to use 0 as a sentinel value for
the parameter - indicating no user override of the feature - and
therefore cannot use it to disable scalarization. However, there
are other ways to disable scalarization (-fno-tree-sra) so this is not
a great loss.
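
As a rough illustration of the sentinel approach, the sketch below uses
made-up stand-ins: fake_move_ratio plays the role of get_move_ratio,
current_cpu models per-function target state, and the two constants stand
in for UNITS_PER_WORD and BITS_PER_UNIT. None of these names or values come
from GCC; the point is only that the default is computed at the time of use
rather than once at startup.

```c
#include <assert.h>

enum { UNITS_PER_WORD_SKETCH = 8, BITS_PER_UNIT_SKETCH = 8 };

/* Hypothetical per-function target state; a switchable target may
   change this between functions.  */
static int current_cpu = 0;

/* Stand-in for get_move_ratio: the result depends on the CPU that is
   selected at the time of the call.  */
static unsigned fake_move_ratio (int speed_p)
{
  unsigned base = current_cpu == 0 ? 4 : 16;
  return speed_p ? base : base / 2;
}

/* USER_PARAM is the --param value in storage units; 0 means "not set".
   Returns the scalarization limit in bits.  */
static unsigned max_scalarization_bits (unsigned user_param, int speed_p)
{
  unsigned bits = user_param * BITS_PER_UNIT_SKETCH;

  if (bits == 0)
    bits = fake_move_ratio (speed_p) * UNITS_PER_WORD_SKETCH
           * BITS_PER_UNIT_SKETCH;
  return bits;
}

/* The default follows the CPU; an explicit value does not.  */
static void check_sentinel (void)
{
  current_cpu = 0;
  assert (max_scalarization_bits (0, 1) == 256);
  current_cpu = 1;
  assert (max_scalarization_bits (0, 1) == 1024);
  assert (max_scalarization_bits (10, 1) == 80);
}
```

The cost mentioned above is visible here as well: passing 0 selects the
target default, so it cannot also mean "never scalarize".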

Bootstrapped and tested on x86-64 and AArch64 with no issues.

OK for trunk (and 5.2 after a few days watching for fallout)?

Thanks,
James

---
gcc/

2015-06-23  James Greenhalgh  

PR tree-optimization/66119
* doc/invoke.texi (sra-max-scalarization-size-Osize): Mention that
"0" is used as a sentinel value.
(sra-max-scalarization-size-Ospeed): Likewise.
* toplev.c (process_options): Don't set up default values for
the sra_max_scalarization_size_{speed,size} parameters.
* tree-sra.c (analyze_all_variable_accesses): If no values
have been set for the sra_max_scalarization_size_{speed,size}
parameters, call get_move_ratio to get target defaults.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b99ab1c..fc9dad7 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10894,7 +10894,9 @@ variables.  These parameters control the maximum size, in storage units,
 of aggregate which is considered for replacement when compiling for
 speed
 (@option{sra-max-scalarization-size-Ospeed}) or size
-(@option{sra-max-scalarization-size-Osize}) respectively.
+(@option{sra-max-scalarization-size-Osize}) respectively.  The
+value 0 indicates that the compiler should use an appropriate size
+for the target processor, this is the default behaviour.
 
 @item tm-max-aggregate-size
 When making copies of thread-local variables in a transaction, this
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 2f43a89..902bfc7 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1301,20 +1301,6 @@ process_options (void)
  so we can correctly initialize debug output.  */
   no_backend = lang_hooks.post_options (&main_input_filename);
 
-  /* Set default values for parameters relation to the Scalar Reduction
- of Aggregates passes (SRA and IP-SRA).  We must do this here, rather
- than in opts.c:default_options_optimization as historically these
- tuning heuristics have been based on MOVE_RATIO, which on some
- targets requires other symbols from the backend.  */
-  maybe_set_param_value
-(PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED,
- get_move_ratio (true) * UNITS_PER_WORD,
- global_options.x_param_values, global_options_set.x_param_values);
-  maybe_set_param_value
-(PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE,
- get_move_ratio (false) * UNITS_PER_WORD,
- global_options.x_param_values, global_options_set.x_param_values);
-
   /* Some machines may reject certain combinations of options.  */
   targetm.target_option.override ();
 
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 8e34244..e2419af 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2549,12 +2549,24 @@ analyze_all_variable_accesses (void)
   bitmap tmp = BITMAP_ALLOC (NULL);
   bitmap_iterator bi;
   unsigned i;
+  bool optimize_speed_p = !optimize_function_for_size_p (cfun);
+
   unsigned max_scalarization_size
-= (optimize_function_for_size_p (cfun)
-	? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)
-	: PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED))
+= (optimize_speed_p
+	? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED)
+	: PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE))
   * BITS_PER_UNIT;
 
+  /* If the user didn't set PARAM_SRA_MAX_SCALARIZATION_SIZE_<...>,
+ fall back to a target default.  This means that zero cannot be
+ used to disable scalarization as we've taken it as a sentinel
+ value.  This is not ideal, but see PR66119 for the reason we
+ can't simply set the target defaults ahead of time during option
+ handling.  */
+  if (!max_scalarization_size)
+max_scalarization_size = get_move_ratio (optimize_speed_p)
+			 * UNITS_PER_WORD * BITS_PER_UNIT;
+
   EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi)
 if (bitmap_bit_p (should_scalarize_away_bitmap, i)
 	&& !bitmap_bit_p (cannot_scalarize_away_bitmap, i))


Re: match.pd: Three new patterns

2015-06-23 Thread Richard Biener
On Tue, 23 Jun 2015, Marek Polacek wrote:

> On Tue, Jun 23, 2015 at 09:56:33AM +0200, Richard Biener wrote:
> > > I think you're right about TYPE_SATURATING so I've dropped that and 
> > > instead
> > > replaced it with TYPE_OVERFLOW_TRAPS.  That should do the right thing
> > > together with TYPE_OVERFLOW_SANITIZED.
> > 
> > Are you sure?  The point is that if the minus or the plus in the original
> > expression saturate the result isn't correct, no?
>  
> Yes, but I thought that TYPE_SATURATING is only true for fixed-point, i.e.
> those _Accum/_Sat/_Fract (?), and you can't do bitwise & or | on them, which
> means that the TYPE_SATURATING check wouldn't be necessary.

Who says you can't do bitwise ops on them?  I can't see that being
enforced in the GIMPLE checking in tree-cfg.c.  Yes, there is no
such thing as a "saturating" bitwise and but bitwise and should
just work fine.

You can check with an arm cross compiler what the C FE does when you use
bitwise ops but I believe the regular and/ior md patterns work
just fine (there are no special modes/registers but they seem
to be shared with regular registers, just special operations
are available).

Richard.

> 
> > As said, removing TYPE_SATURATING doesn't sound correct.  I'm not sure
> > about TYPE_OVERFLOW_TRAPS - we're certainly removing traps elsewhere
> > (look for the scarce use of this flag in fold-const.c and match.pd
> > where I only preserved those that were originally in fold-const.c).
> > 
> > So, TYPE_OVERFLOW_TRAPS is your choice but TYPE_SATURATING is
> > required IMHO.
> 
> Ok, I guess I'll add TYPE_SATURATING back, even though I'm not clear
> on that one, and commit.
> 
> Thanks,
> 
>   Marek
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)


Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-06-23 Thread Ramana Radhakrishnan
On Mon, Jun 22, 2015 at 7:11 PM, Alexander Monakov  wrote:
> On Mon, 22 Jun 2015, Jiong Wang wrote:
>> I've done a quick experiment: -fno-plt doesn't work on AArch64.
>>
>> That's because although this patch forces the function address into a
>> register, the combine pass, which runs later, combines it back, as AArch64
>> has defined such an insn pattern.
>>
>> For x86, it's not combined back. From the rtl dump, that's because the rtl
>> pre pass has moved the address load instruction into another basic block and
>> the combine pass doesn't combine across basic blocks. Also, the x86 backend
>> does some checks on flag_plt in the newly added ix86_nopic_noplt_attribute_p,
>> which could help generate correct insns.
>>
>> The fix I can think of for AArch64 is to allow the call-symbol pattern
>> only under "flag_plt == true", so that a call via register can't be combined
>> into a direct call to the symbol.
>>
>> Or would it be better to prohibit such combining in the combine pass? A
>> generic fix in combine may fix other broken targets too.
>
> My colleagues at ISP RAS (CC'ed) have been looking on arm (and aarch64) no-plt
> codegen.  We also saw the problem with the combine pass you describe.  I think
> your description of why it's not observed on x86 is incorrect; the newly added
> ix86_nopic_noplt_attribute_p should not have anything to do with that.  It's
> just that the GOT load insn has a REG_EQUAL note, and the combine pass can use
> it to replace the register in the indirect branch, producing a direct branch
> to a symbol (i.e. a PLT jump).


>
> Actually we are not hitting the same problem on x86 by pure luck.  Early RTL
> passes manage to lose the REG_EQUAL note, so by the time combine runs, the
> register annotation is lost.  It's possible to reproduce the arm/aarch64
> problem on x86 with -fno-gcse and the following hack:
>
> diff --git a/gcc/cse.c b/gcc/cse.c
> index 2a33827..88cff96 100644
> --- a/gcc/cse.c
> +++ b/gcc/cse.c
> @@ -6634,6 +6634,9 @@ cse_main (rtx_insn *f ATTRIBUTE_UNUSED, int nregs)
>int *rc_order = XNEWVEC (int, last_basic_block_for_fn (cfun));
>int i, n_blocks;
>
> +  if (!flag_gcse)
> +return 0;
> +
>df_set_flags (DF_LR_RUN_DCE);
>df_note_add_problem ();
>df_analyze ();
>
> Regarding fixing the issue, I also think that combine pass might be a better
> place (than the backends).  I'd appreciate comments from maintainers.
>
>

Note that on AArch64 the GOT slot can be accessed with a single PC-relative
instruction followed by a load, so I don't expect there to be any more
work to do in the AArch64 backend other than massaging this into
an indirect call in the "call" related patterns.

So you'd get something like

adrp x0, :got:a
ldr x0, [x0, :got_lo12:a]
blr x0

and in the tiny model

ldr x0, :got:a
blr x0

if your elf module is small enough.

> If you try disabling the REG_EQUAL note generation [*], you'll probably find a
> performance regression on arm32 (and probably on aarch64 as well?
> we only

IMHO disabling the REG_EQUAL note generation is the wrong way to go about this.

> tried arm32 so far).  The main reason for that is that GCC emits pretty bad
> code for a GOT load.  Instead of using two add instructions and one ldr for
> the GOT slot access, like the PLT stubs do, it uses three(!) ldr instructions
> and one add.  The first ldr is for loading the GOT address, and the second is
> for the offset of the GOT slot.  As I understand, to fix that, GCC has to
> learn using the GOT_PREL relocation type.

Irrespective of combine, as a first step we should fix the predicates
and the call expanders to prevent this sort of replacement in the
backends. Tightening the predicates in the call patterns will achieve
the same for you and then we can investigate the use of GOT_PREL. My
recollection of this is that you need to work out when it's more
beneficial to use GOT_PREL over GOT but it's been a while since I
looked in that area.

>
> [*] To do that, we hacked arm legitimize_pic_address not to emit REG_EQUAL
> note under !flag_plt.
>
> Alexander


Re: *Ping* [patch, fortran] Warn about constant integer divisions

2015-06-23 Thread Janne Blomqvist
On Sun, Jun 21, 2015 at 4:57 PM, Thomas Koenig  wrote:
> *ping*
>
> https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00966.html
>
>
>> Hello world,
>>
>> the attached patch emits a warning for constant integer division.
>> While correct according to the standard, I cannot really think
>> of a legitimate reason why people would want to write 3/5 where
>> they could have written 0 , so my preference would be to put
>> this under -Wconversion (like in the attached patch).
>>
>> However, I am open to discussion on that.  It is easy enough to
>> change.
>>
>> Regression-tested.  Opinions?  Comments?  Would somebody rather
>> have -Wconversion-extra?  OK for trunk?

I'm a bit uncomfortable about this. IIRC I have code where I'm
iterating over some kind of grid, and I'm using integer division and
relying on truncation to calculate array indices. I can certainly
imagine that others have used it as well, and even that it's not a
particularly uncommon pattern.
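
A small sketch of the pattern I mean, written in C rather than Fortran
purely for illustration (integer division truncates toward zero in both).
The function name and the cell width are invented for the example.

```c
#include <assert.h>

/* Map a non-negative coordinate to a cell index on a grid whose cells
   are CELL_WIDTH units wide.  Integer division truncates, so every
   coordinate inside one cell maps to the same index.  */
static int cell_index (int coord, int cell_width)
{
  assert (coord >= 0 && cell_width > 0);
  return coord / cell_width;
}
```

Here cell_index (4, 5) is 0 and cell_index (14, 5) is 2; the truncation is
the whole point, not an accident.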

Furthermore, I think it's confusing that you have it under
-Wconversion, as there is no type conversion going on.
-Winteger-truncation maybe?

Any other opinions?

-- 
Janne Blomqvist


[Patch AArch64 0/4] Add "-moverride" option for overriding tuning parameters

2015-06-23 Thread James Greenhalgh
Hi,

This patch set adds support for a new command line option "-moverride".
The purpose of this command line is to allow expert-level users of the
compiler, and those comfortable with experimenting with the compiler,
*unsupported* full access to the tuning structures used in the AArch64
back-end.

For now, we only give command-line access to the set of fusion pairs to
enable and to whether or not to use the Cortex-A57 FMA register renaming
pass, though in future we can expand this further.

With this patch, you might write something like:

  -moverride=fuse=adrp+add.cmp+branch:tune=rename_fma_regs

To enable fusion of adrp+add and cmp+branch and to enable the
cortex-a57-fma-steering pass.

The registration of a new sub-option is table driven: you add to
aarch64_tuning_override_functions an option name and a function which
parses the string it is given and mutates the tuning parameters accordingly.

Expanding this for some of the other options (or groups of options) is
therefore fairly easy, but I haven't done it yet.
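
To give a flavour of the table-driven design, here is a much reduced,
self-contained sketch. It is not the code from the patch: struct
tuning_sketch, override_table, parse_override and the bit values are all
invented for illustration, and it assumes that sub-options are separated
by ':' and list items by '.', as in the example string above, and that
each sub-option replaces the corresponding mask.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, much reduced tuning state.  */
struct tuning_sketch { unsigned fusible_ops; unsigned extra_flags; };

struct name_bit { const char *name; unsigned bit; };

static const struct name_bit fusion_names[] = {
  { "mov+movk", 1u << 0 }, { "adrp+add", 1u << 1 }, { "movk+movk", 1u << 2 },
  { "adrp+ldr", 1u << 3 }, { "cmp+branch", 1u << 4 }, { NULL, 0 }
};

static const struct name_bit tune_names[] = {
  { "rename_fma_regs", 1u << 0 }, { NULL, 0 }
};

/* OR together the bits named in the '.'-separated list [S, S + LEN).
   Returns 0 and leaves *OUT alone if a name is unknown.  */
static int parse_bits (const char *s, size_t len,
                       const struct name_bit *tab, unsigned *out)
{
  const char *end = s + len;
  unsigned bits = 0;

  while (s < end)
    {
      const char *dot = memchr (s, '.', (size_t) (end - s));
      size_t n = dot ? (size_t) (dot - s) : (size_t) (end - s);
      const struct name_bit *e;

      for (e = tab; e->name; e++)
        if (strlen (e->name) == n && memcmp (e->name, s, n) == 0)
          break;
      if (!e->name)
        return 0;
      bits |= e->bit;
      s += n + (dot ? 1 : 0);
    }
  *out = bits;
  return 1;
}

static int override_fuse (const char *v, size_t len, struct tuning_sketch *t)
{
  return parse_bits (v, len, fusion_names, &t->fusible_ops);
}

static int override_tune (const char *v, size_t len, struct tuning_sketch *t)
{
  return parse_bits (v, len, tune_names, &t->extra_flags);
}

/* The table: adding a sub-option means adding one row here.  */
static const struct override_entry
{
  const char *name;
  int (*handler) (const char *, size_t, struct tuning_sketch *);
} override_table[] = {
  { "fuse", override_fuse }, { "tune", override_tune }, { NULL, NULL }
};

/* Parse "name=value:name=value...".  Returns 1 on success.  */
static int parse_override (const char *s, struct tuning_sketch *t)
{
  while (*s)
    {
      const char *colon = strchr (s, ':');
      size_t len = colon ? (size_t) (colon - s) : strlen (s);
      const char *eq = memchr (s, '=', len);
      const struct override_entry *e;

      if (!eq)
        return 0;
      for (e = override_table; e->name; e++)
        if (strlen (e->name) == (size_t) (eq - s)
            && memcmp (e->name, s, (size_t) (eq - s)) == 0)
          break;
      if (!e->name || !e->handler (eq + 1, len - (size_t) (eq + 1 - s), t))
        return 0;
      s += len + (colon ? 1 : 0);
    }
  return 1;
}
```

The dispatch loop never needs to change when a sub-option is added, which
is what makes expanding the set of overridable parameters cheap.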

The patch set first refactors the fusion and pass tuning structures
to drive them through definitions in tables
( config/aarch64/aarch64-fusion-pairs.def,
  config/aarch64/aarch64-tuning-flags.def ). We then de-constify the
tune_params structure, so that it can be modified. Finally we wire up the
new option, and add the parsing code to give the desired behaviour.

I've bootstrapped and tested the patch set on aarch64-none-linux-gnu
with BOOT_CFLAGS set to the example string above, and again in the
standard configuration with no issues.

OK for trunk?

Thanks,
James

---
[Patch AArch64 1/4] Define candidates for instruction fusion in a .def file

gcc/

2015-06-23  James Greenhalgh  

* config/aarch64/aarch64-fusion-pairs.def: New.
* config/aarch64/aarch64-protos.h (aarch64_fusion_pairs): New.
* config/aarch64/aarch64.c (AARCH64_FUSE_NOTHING): Move to
aarch64_fusion_pairs.
(AARCH64_FUSE_MOV_MOVK): Likewise.
(AARCH64_FUSE_ADRP_ADD): Likewise.
(AARCH64_FUSE_MOVK_MOVK): Likewise.
(AARCH64_FUSE_ADRP_LDR): Likewise.
(AARCH64_FUSE_CMP_BRANCH): Likewise.

---
[Patch AArch64 2/4] Control the FMA steering pass in tuning
 structures rather than as core property

gcc/

2015-06-23  James Greenhalgh  

* config/aarch64/aarch64.h (AARCH64_FL_USE_FMA_STEERING_PASS): Delete.
(aarch64_tune_flags): Likewise.
(AARCH64_TUNE_FMA_STEERING): Likewise.
* config/aarch64/aarch64-cores.def (cortex-a57): Remove reference
to AARCH64_FL_USE_FMA_STEERING_PASS.
(cortex-a57.cortex-a53): Likewise.
(cortex-a72): Use cortexa72_tunings.
(cortex-a72.cortex-a53): Likewise.
(exynos-m1): Likewise.
* config/aarch64/aarch64-protos.h (tune_params): Add
a field: extra_tuning_flags.
* config/aarch64/aarch64-tuning-flags.def: New.
* config/aarch64/aarch64-protos.h (AARCH64_EXTRA_TUNING_OPTION): New.
(aarch64_extra_tuning_flags): Likewise.
(aarch64_tune_params): Declare here.
* config/aarch64/aarch64.c (generic_tunings): Set extra_tuning_flags.
(cortexa53_tunings): Likewise.
(cortexa57_tunings): Likewise.
(thunderx_tunings): Likewise.
(xgene1_tunings): Likewise.
(cortexa72_tunings): New.
* config/aarch64/cortex-a57-fma-steering.c: Include aarch64-protos.h.
 (gate): Check against aarch64_tune_params.
* config/aarch64/t-aarch64 (cortex-a57-fma-steering.o): Depend on
aarch64-protos.h.

---
[Patch AArch64 3/4] De-const-ify struct tune_params

gcc/

2015-06-23  James Greenhalgh  

* config/aarch64/aarch64-protos.h (tune_params): Remove
const from members.
(aarch64_tune_params): Remove const, change to no longer be
a pointer.
* config/aarch64/aarch64.c (aarch64_tune_params): Remove const,
change to no longer be a pointer, initialize to generic_tunings.
(aarch64_min_divisions_for_recip_mul): Change dereference of
aarch64_tune_params to member access.
(aarch64_reassociation_width): Likewise.
(aarch64_rtx_mult_cost): Likewise.
(aarch64_address_cost): Likewise.
(aarch64_branch_cost): Likewise.
(aarch64_rtx_costs): Likewise.
(aarch64_register_move_cost): Likewise.
(aarch64_memory_move_cost): Likewise.
(aarch64_sched_issue_rate): Likewise.
(aarch64_builtin_vectorization_cost): Likewise.
(aarch64_override_options): Take a copy of the selected tuning
struct in to aarch64_tune_params, rather than just setting
a pointer, change dereferences of aarch64_tune_params to member
accesses.
(aarch64_override_options_after_change): Change dereferences of
aarch64_tune_params to member access.
(aarch64_macro_fusion_p): Likewise.
(aarch_macro_fusion_pair_p): Likewise.
* config/aarch64/cortex-a57-fma-steering.c (gate):

[Patch AArch64 1/4] Define candidates for instruction fusion in a .def file

2015-06-23 Thread James Greenhalgh

Hi,

This patch moves the instruction fusion pairs from a set of #defines
to an enum which we can generate from a .def file.

We'll use that .def file again, and the friendly names it introduces,
shortly.
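
For anyone unfamiliar with the technique, here is a self-contained sketch
of it. To keep it in one file, the list lives in a macro
(FUSION_PAIRS_SKETCH) instead of a .def file that is #included several
times, and all SKETCH_ names, fusion_table and fusion_lookup are invented
for the example. The name-to-value table at the end is only a guess at the
kind of reuse the friendly names make possible.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the contents of the .def file.  */
#define FUSION_PAIRS_SKETCH(X) \
  X ("mov+movk", MOV_MOVK, 0) \
  X ("adrp+add", ADRP_ADD, 1) \
  X ("movk+movk", MOVK_MOVK, 2) \
  X ("adrp+ldr", ADRP_LDR, 3) \
  X ("cmp+branch", CMP_BRANCH, 4)

/* First expansion: one enumerator per pair.  */
#define AS_ENUM(str, name, index) SKETCH_FUSE_##name = (1 << index),
/* Second expansion: "| SKETCH_FUSE_name" for every pair.  */
#define AS_OR(str, name, index) | SKETCH_FUSE_##name

enum sketch_fusion_pairs
{
  SKETCH_FUSE_NOTHING = 0,
  FUSION_PAIRS_SKETCH (AS_ENUM)
  SKETCH_FUSE_ALL = 0 FUSION_PAIRS_SKETCH (AS_OR)
};

/* Third expansion: a table mapping friendly names to enum values.  */
#define AS_ROW(str, name, index) { str, SKETCH_FUSE_##name },
static const struct { const char *name; unsigned value; } fusion_table[] = {
  FUSION_PAIRS_SKETCH (AS_ROW)
};

/* Look up a friendly name; unknown names map to SKETCH_FUSE_NOTHING.  */
static unsigned fusion_lookup (const char *name)
{
  for (size_t i = 0; i < sizeof fusion_table / sizeof fusion_table[0]; i++)
    if (strcmp (fusion_table[i].name, name) == 0)
      return fusion_table[i].value;
  return SKETCH_FUSE_NOTHING;
}

/* The enum and the table stay in sync because both come from one list.  */
static void check_fusion_sketch (void)
{
  assert (SKETCH_FUSE_ALL == 31);
  assert (fusion_lookup ("adrp+ldr") == SKETCH_FUSE_ADRP_LDR);
}
```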

OK?

Thanks,
James

---
2015-06-23  James Greenhalgh  

* config/aarch64/aarch64-fusion-pairs.def: New.
* config/aarch64/aarch64-protos.h (aarch64_fusion_pairs): New.
* config/aarch64/aarch64.c (AARCH64_FUSE_NOTHING): Move to
aarch64_fusion_pairs.
(AARCH64_FUSE_MOV_MOVK): Likewise.
(AARCH64_FUSE_ADRP_ADD): Likewise.
(AARCH64_FUSE_MOVK_MOVK): Likewise.
(AARCH64_FUSE_ADRP_LDR): Likewise.
(AARCH64_FUSE_CMP_BRANCH): Likewise.

diff --git a/gcc/config/aarch64/aarch64-fusion-pairs.def b/gcc/config/aarch64/aarch64-fusion-pairs.def
new file mode 100644
index 000..a7b00f6
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-fusion-pairs.def
@@ -0,0 +1,38 @@
+/* Copyright (C) 2015 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+/* Pairs of instructions which can be fused. before including this file,
+   define a macro:
+
+ AARCH64_FUSION_PAIR (name, internal_name, index_bit)
+
+   Where:
+
+ NAME is a string giving a friendly name for the instructions to fuse.
+ INTERNAL_NAME gives the internal name suitable for appending to
+ AARCH64_FUSE_ to give an enum name.
+ INDEX_BIT is the bit to set in the bitmask of supported fusion
+ operations.  */
+
+AARCH64_FUSION_PAIR ("mov+movk", MOV_MOVK, 0)
+AARCH64_FUSION_PAIR ("adrp+add", ADRP_ADD, 1)
+AARCH64_FUSION_PAIR ("movk+movk", MOVK_MOVK, 2)
+AARCH64_FUSION_PAIR ("adrp+ldr", ADRP_LDR, 3)
+AARCH64_FUSION_PAIR ("cmp+branch", CMP_BRANCH, 4)
+
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 965a11b..4bdcc46 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -189,6 +189,26 @@ struct tune_params
   const int min_div_recip_mul_df;
 };
 
+#define AARCH64_FUSION_PAIR(x, name, index) \
+  AARCH64_FUSE_##name = (1 << index),
+/* Supported fusion operations.  */
+enum aarch64_fusion_pairs
+{
+  AARCH64_FUSE_NOTHING = 0,
+#include "aarch64-fusion-pairs.def"
+
+/* Hacky macro to build AARCH64_FUSE_ALL.  The sequence below expands
+   to:
+   AARCH64_FUSE_ALL = 0 | AARCH64_FUSE_index1 | AARCH64_FUSE_index2 ...  */
+#undef AARCH64_FUSION_PAIR
+#define AARCH64_FUSION_PAIR(x, name, y) \
+  | AARCH64_FUSE_##name
+
+  AARCH64_FUSE_ALL = 0
+#include "aarch64-fusion-pairs.def"
+};
+#undef AARCH64_FUSION_PAIR
+
 HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
 int aarch64_get_condition_code (rtx);
 bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 17bae08..5fe487b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -319,13 +319,6 @@ static const struct cpu_vector_cost xgene1_vector_cost =
   1 /* cond_not_taken_branch_cost  */
 };
 
-#define AARCH64_FUSE_NOTHING	(0)
-#define AARCH64_FUSE_MOV_MOVK	(1 << 0)
-#define AARCH64_FUSE_ADRP_ADD	(1 << 1)
-#define AARCH64_FUSE_MOVK_MOVK	(1 << 2)
-#define AARCH64_FUSE_ADRP_LDR	(1 << 3)
-#define AARCH64_FUSE_CMP_BRANCH	(1 << 4)
-
 /* Generic costs for branch instructions.  */
 static const struct cpu_branch_cost generic_branch_cost =
 {


[Patch AArch64 3/4] De-const-ify struct tune_params

2015-06-23 Thread James Greenhalgh

Hi,

If we want to overwrite parts of this structure, we're going to need it
to be more malleable than it is presently.

Run through and remove const from each of the members, and make
aarch64_tune_params a non-const tuning structure we can modify rather
than a pointer. Change the -mtune parsing code to take a
copy of the tuning structure in use rather than just taking the
reference from within the processor struct. Change all the current
users of aarch64_tune_params so that they no longer dereference a
pointer.
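
In miniature, the change looks like the sketch below. The struct and
variable names (tune_sketch, active_tune and so on) and the values are
invented for illustration; only the shape mirrors the patch: the per-core
tables stay const, and the compiler works on a mutable copy.

```c
#include <assert.h>

struct tune_sketch
{
  int issue_rate;
  unsigned fusible_ops;
};

/* Read-only per-core defaults, like the existing tuning tables.  */
static const struct tune_sketch generic_sketch = { 2, 0 };
static const struct tune_sketch big_core_sketch = { 3, 0x12 };

/* Mutable working copy; previously this would have been a pointer to
   one of the const tables.  */
static struct tune_sketch active_tune;

/* Copy the selected defaults instead of pointing at them.  */
static void select_tuning (const struct tune_sketch *selected)
{
  active_tune = *selected;
}

/* Overrides change the copy and leave the defaults untouched.  */
static void check_copy_semantics (void)
{
  select_tuning (&big_core_sketch);
  active_tune.fusible_ops |= 0x1;
  assert (active_tune.fusible_ops == 0x13);
  assert (big_core_sketch.fusible_ops == 0x12);
  select_tuning (&generic_sketch);
  assert (active_tune.issue_rate == 2);
}
```

Users then write active_tune.issue_rate where they used to write
ptr->issue_rate, which is the mechanical part of this patch.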

Checked on aarch64-none-linux-gnu with no issues.

OK?

Thanks,
James

---
2015-06-23  James Greenhalgh  

* config/aarch64/aarch64-protos.h (tune_params): Remove
const from members.
(aarch64_tune_params): Remove const, change to no longer be
a pointer.
* config/aarch64/aarch64.c (aarch64_tune_params): Remove const,
change to no longer be a pointer, initialize to generic_tunings.
(aarch64_min_divisions_for_recip_mul): Change dereference of
aarch64_tune_params to member access.
(aarch64_reassociation_width): Likewise.
(aarch64_rtx_mult_cost): Likewise.
(aarch64_address_cost): Likewise.
(aarch64_branch_cost): Likewise.
(aarch64_rtx_costs): Likewise.
(aarch64_register_move_cost): Likewise.
(aarch64_memory_move_cost): Likewise.
(aarch64_sched_issue_rate): Likewise.
(aarch64_builtin_vectorization_cost): Likewise.
(aarch64_override_options): Take a copy of the selected tuning
struct in to aarch64_tune_params, rather than just setting
a pointer, change dereferences of aarch64_tune_params to member
accesses.
(aarch64_override_options_after_change): Change dereferences of
aarch64_tune_params to member access.
(aarch64_macro_fusion_p): Likewise.
(aarch_macro_fusion_pair_p): Likewise.
* config/aarch64/cortex-a57-fma-steering.c (gate): Likewise.

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 7ece346..09e3077 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -171,23 +171,23 @@ struct cpu_branch_cost
 
 struct tune_params
 {
-  const struct cpu_cost_table *const insn_extra_cost;
-  const struct cpu_addrcost_table *const addr_cost;
-  const struct cpu_regmove_cost *const regmove_cost;
-  const struct cpu_vector_cost *const vec_costs;
-  const struct cpu_branch_cost *const branch_costs;
-  const int memmov_cost;
-  const int issue_rate;
-  const unsigned int fusible_ops;
-  const int function_align;
-  const int jump_align;
-  const int loop_align;
-  const int int_reassoc_width;
-  const int fp_reassoc_width;
-  const int vec_reassoc_width;
-  const int min_div_recip_mul_sf;
-  const int min_div_recip_mul_df;
-  const unsigned int extra_tuning_flags;
+  const struct cpu_cost_table *insn_extra_cost;
+  const struct cpu_addrcost_table *addr_cost;
+  const struct cpu_regmove_cost *regmove_cost;
+  const struct cpu_vector_cost *vec_costs;
+  const struct cpu_branch_cost *branch_costs;
+  int memmov_cost;
+  int issue_rate;
+  unsigned int fusible_ops;
+  int function_align;
+  int jump_align;
+  int loop_align;
+  int int_reassoc_width;
+  int fp_reassoc_width;
+  int vec_reassoc_width;
+  int min_div_recip_mul_sf;
+  int min_div_recip_mul_df;
+  unsigned int extra_tuning_flags;
 };
 
 #define AARCH64_FUSION_PAIR(x, name, index) \
@@ -228,7 +228,7 @@ enum aarch64_extra_tuning_flags
 };
 #undef AARCH64_EXTRA_TUNING_OPTION
 
-extern const struct tune_params *aarch64_tune_params;
+extern struct tune_params aarch64_tune_params;
 
 HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
 int aarch64_get_condition_code (rtx);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 96327a2..aa457db 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -164,9 +164,6 @@ unsigned aarch64_architecture_version;
 /* The processor for which instructions should be scheduled.  */
 enum aarch64_processor aarch64_tune = cortexa53;
 
-/* The current tuning set.  */
-const struct tune_params *aarch64_tune_params;
-
 /* Mask to specify which instructions we are allowed to generate.  */
 unsigned long aarch64_isa_flags = 0;
 
@@ -493,6 +490,9 @@ static const struct processor *selected_arch;
 static const struct processor *selected_cpu;
 static const struct processor *selected_tune;
 
+/* The current tuning set.  */
+struct tune_params aarch64_tune_params = generic_tunings;
+
 #define AARCH64_CPU_DEFAULT_FLAGS ((selected_cpu) ? selected_cpu->flags : 0)
 
 /* An ISA extension in the co-processor and main instruction set space.  */
@@ -544,8 +544,8 @@ static unsigned int
 aarch64_min_divisions_for_recip_mul (enum machine_mode mode)
 {
   if (GET_MODE_UNIT_SIZE (mode) == 4)
-return aarch64_tune_params->min_div_recip_mul_sf;
-  return aarch64_tune_params

[Patch AArch64 4/4] Add -moverride tuning command, and wire it up for control of fusion and fma-steering

2015-06-23 Thread James Greenhalgh

Hi,

This final patch adds support for the new command line option
"-moverride". The purpose of this option is to give expert-level users
of the compiler, and those comfortable with experimenting with the compiler,
*unsupported* full access to the tuning structures used in the AArch64
back-end.

For now, we only enable command-line control over which fusion pairs to
enable and whether or not to use the Cortex-A57 FMA register renaming
pass, though in future we can expand this further.

With this patch, you might write something like:

  -moverride=fuse=adrp+add.cmp+branch:tune=rename_fma_regs

To enable fusion of adrp+add and cmp+branch and to enable the
fma-rename pass.
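
To make the string format concrete, here is a minimal stand-alone sketch of how such a string could be split: ':' separates "key=list" items and '.' separates the flag names within a list. This is an illustration under assumptions, not the code in the patch. All sk_ names and flag values are hypothetical, an override simply replaces the previous mask, and unknown flag names are silently ignored, whereas the patch reports an error.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values; the real ones come from the .def files.  */
#define SK_FUSE_ADRP_ADD        (1u << 0)
#define SK_FUSE_CMP_BRANCH      (1u << 1)
#define SK_TUNE_RENAME_FMA_REGS (1u << 0)

struct sk_flag_desc
{
  const char *name;
  unsigned int flag;
};

static const struct sk_flag_desc sk_fuse_flags[] =
{
  { "adrp+add", SK_FUSE_ADRP_ADD },
  { "cmp+branch", SK_FUSE_CMP_BRANCH },
  { NULL, 0 }
};

static const struct sk_flag_desc sk_tune_flags[] =
{
  { "rename_fma_regs", SK_TUNE_RENAME_FMA_REGS },
  { NULL, 0 }
};

struct sk_overrides
{
  unsigned int fuse;
  unsigned int tune;
};

/* Look up one token of LEN characters in TABLE; return 0 if unknown.  */
static unsigned int
sk_lookup (const char *tok, size_t len, const struct sk_flag_desc *table)
{
  for (; table->name != NULL; table++)
    if (strlen (table->name) == len && strncmp (table->name, tok, len) == 0)
      return table->flag;
  return 0;
}

/* OR together the flags named in the '.'-separated list [LIST, END).  */
static unsigned int
sk_parse_flag_list (const char *list, const char *end,
		    const struct sk_flag_desc *table)
{
  unsigned int result = 0;
  while (list < end)
    {
      const char *dot = memchr (list, '.', (size_t) (end - list));
      const char *tok_end = dot ? dot : end;
      result |= sk_lookup (list, (size_t) (tok_end - list), table);
      list = dot ? dot + 1 : end;
    }
  return result;
}

/* Parse "key=list:key=list".  Return 0 on success, -1 if an item has
   no '=' or names a key other than "fuse" or "tune".  */
int
sk_parse_override (const char *str, struct sk_overrides *out)
{
  const char *end = str + strlen (str);
  while (str < end)
    {
      const char *colon = memchr (str, ':', (size_t) (end - str));
      const char *item_end = colon ? colon : end;
      const char *eq = memchr (str, '=', (size_t) (item_end - str));
      if (eq == NULL)
	return -1;
      size_t key_len = (size_t) (eq - str);
      if (key_len == 4 && strncmp (str, "fuse", 4) == 0)
	out->fuse = sk_parse_flag_list (eq + 1, item_end, sk_fuse_flags);
      else if (key_len == 4 && strncmp (str, "tune", 4) == 0)
	out->tune = sk_parse_flag_list (eq + 1, item_end, sk_tune_flags);
      else
	return -1;
      str = colon ? colon + 1 : end;
    }
  return 0;
}
```

Fed the example string above, the sketch sets both fusion bits and the rename_fma_regs bit. The length check in sk_lookup mirrors the one in aarch64_parse_one_option_token, so that a token matches only a flag name of exactly the same length.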

I've bootstrapped and tested the patch set on aarch64-none-linux-gnu
with BOOT_CFLAGS set to the example string above, and again in the
standard configuration with no issues.

OK?

Thanks,
James

---
2015-06-23  James Greenhalgh  

* config/aarch64/aarch64.opt: (override): New.
* doc/invoke.texi (override): Document.
* config/aarch64/aarch64.c (aarch64_flag_desc): New.
(aarch64_fusible_pairs): Likewise.
(aarch64_tuning_flags): Likewise.
(aarch64_tuning_override_function): Likewise.
(aarch64_tuning_override_functions): Likewise.
(aarch64_parse_one_option_token): Likewise.
(aarch64_parse_boolean_options): Likewise.
(aarch64_parse_fuse_string): Likewise.
(aarch64_parse_tune_string): Likewise.
(aarch64_parse_one_override_token): Likewise.
(aarch64_parse_override_string): Likewise.
(aarch64_override_options): Parse the -moverride string if it
is present.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index aa457db..207c18b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -170,6 +170,36 @@ unsigned long aarch64_isa_flags = 0;
 /* Mask to specify which instruction scheduling options should be used.  */
 unsigned long aarch64_tune_flags = 0;
 
+/* Support for command line parsing of boolean flags in the tuning
+   structures.  */
+struct aarch64_flag_desc
+{
+  const char* name;
+  unsigned int flag;
+};
+
+#define AARCH64_FUSION_PAIR(name, internal_name, y) \
+  { name, AARCH64_FUSE_##internal_name },
+static const struct aarch64_flag_desc aarch64_fusible_pairs[] =
+{
+  { "none", AARCH64_FUSE_NOTHING },
+#include "aarch64-fusion-pairs.def"
+  { "all", AARCH64_FUSE_ALL },
+  { NULL, AARCH64_FUSE_NOTHING }
+};
+#undef AARCH64_FUSION_PAIR
+
+#define AARCH64_EXTRA_TUNING_OPTION(name, internal_name, y) \
+  { name, AARCH64_EXTRA_TUNE_##internal_name },
+static const struct aarch64_flag_desc aarch64_tuning_flags[] =
+{
+  { "none", AARCH64_EXTRA_TUNE_NONE },
+#include "aarch64-tuning-flags.def"
+  { "all", AARCH64_EXTRA_TUNE_ALL },
+  { NULL, AARCH64_EXTRA_TUNE_NONE }
+};
+#undef AARCH64_EXTRA_TUNING_OPTION
+
 /* Tuning parameters.  */
 
 static const struct cpu_addrcost_table generic_addrcost_table =
@@ -452,6 +482,24 @@ static const struct tune_params xgene1_tunings =
   (AARCH64_EXTRA_TUNE_NONE)	/* tune_flags.  */
 };
 
+/* Support for fine-grained override of the tuning structures.  */
+struct aarch64_tuning_override_function
+{
+  const char* name;
+  void (*parse_override)(const char*, struct tune_params*);
+};
+
+static void aarch64_parse_fuse_string (const char*, struct tune_params*);
+static void aarch64_parse_tune_string (const char*, struct tune_params*);
+
+static const struct aarch64_tuning_override_function
+  aarch64_tuning_override_functions[] =
+{
+  { "fuse", aarch64_parse_fuse_string },
+  { "tune", aarch64_parse_tune_string },
+  { NULL, NULL }
+};
+
 /* A processor implementing AArch64.  */
 struct processor
 {
@@ -7142,6 +7190,178 @@ aarch64_parse_tune (void)
   return;
 }
 
+/* Parse TOKEN, which has length LENGTH to see if it is an option
+   described in FLAG.  If it is, return the index bit for that fusion type.
+   If not, error (printing OPTION_NAME) and return zero.  */
+
+static unsigned int
+aarch64_parse_one_option_token (const char *token,
+size_t length,
+const struct aarch64_flag_desc *flag,
+const char *option_name)
+{
+  for (; flag->name != NULL; flag++)
+{
+  if (length == strlen (flag->name)
+	  && !strncmp (flag->name, token, length))
+	return flag->flag;
+}
+
+  error ("unknown flag passed in -moverride=%s (%s)", option_name, token);
+  return 0;
+}
+
+/* Parse OPTION which is a '.'-separated list of flags to enable.
+   FLAGS gives the list of flags we understand, INITIAL_STATE gives any
+   default state we inherit from the CPU tuning structures.  OPTION_NAME
+   gives the top-level option we are parsing in the -moverride string,
+   for use in error messages.  */
+
+static unsigned int
+aarch64_parse_boolean_options (const char *option,
+			   const struct aarch64_flag_desc *flags,
+			   unsigned int initial_state,
+			   const char *option_name)
+{
+  const char separator = '.';
+  const char

[Patch AArch64 2/4] Control the FMA steering pass in tuning structures rather than as core property

2015-06-23 Thread James Greenhalgh

Hi,

The FMA steering pass should be enabled through the tuning structures
rather than be an intrinsic property of the core.  This patch moves
the control of the pass to the tuning structures - turning it off for
everything other than a Cortex-A57 system (i.e. -mcpu=cortex-a57
or -mcpu=cortex-a57.cortex-a53).

Some CPUs share the cortexa57 tuning structs, but do not use this
steering pass. For those I've taken a copy of the cortexa57 tuning
structures and called it cortexa72.
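
For readers who have not met the .def-file idiom used here, the following is a self-contained sketch of the same idea. It makes some assumptions: a list macro stands in for #include "aarch64-tuning-flags.def", every sk_ name is hypothetical, and the second option exists only to make the "all" mask visible.

```c
/* Stand-in for the .def file: one X (...) entry per tuning option.
   "hypothetical_flag" is invented for this sketch.  */
#define SK_TUNING_OPTIONS(X) \
  X ("rename_fma_regs", RENAME_FMA_REGS, 0) \
  X ("hypothetical_flag", HYPOTHETICAL, 1)

enum sk_extra_tuning_flags
{
  SK_EXTRA_TUNE_NONE = 0,
/* First expansion: one enumerator per option.  */
#define SK_AS_ENUM(name, internal, index) \
  SK_EXTRA_TUNE_##internal = (1 << index),
  SK_TUNING_OPTIONS (SK_AS_ENUM)
#undef SK_AS_ENUM
/* Second expansion: "0 | flag0 | flag1 ..." to build the "all" mask.  */
#define SK_AS_MASK(name, internal, index) | SK_EXTRA_TUNE_##internal
  SK_EXTRA_TUNE_ALL = (0 SK_TUNING_OPTIONS (SK_AS_MASK))
#undef SK_AS_MASK
};

/* Hypothetical, cut-down tuning structure.  */
struct sk_tune_params
{
  unsigned int extra_tuning_flags;
};

/* The a57 set asks for the steering pass; its a72 copy does not.  */
const struct sk_tune_params sk_cortexa57_tunings
  = { SK_EXTRA_TUNE_RENAME_FMA_REGS };
const struct sk_tune_params sk_cortexa72_tunings
  = { SK_EXTRA_TUNE_NONE };

/* Pass gate: a property of the tuning set, not of the core's flags.  */
int
sk_fma_steering_gate (const struct sk_tune_params *tune)
{
  return (tune->extra_tuning_flags & SK_EXTRA_TUNE_RENAME_FMA_REGS) != 0;
}
```

With the gate keyed off the tuning set, a core that borrows another core's costs but not its steering behaviour only needs its own copy of the tuning structure.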

Tested with a compiler build and all known values of -mcpu to make sure
the pass runs in the expected configurations.

OK?

Thanks,
James

---
2015-06-23  James Greenhalgh  

* config/aarch64/aarch64.h (AARCH64_FL_USE_FMA_STEERING_PASS): Delete.
(aarch64_tune_flags): Likewise.
(AARCH64_TUNE_FMA_STEERING): Likewise.
* config/aarch64/aarch64-cores.def (cortex-a57): Remove reference
to AARCH64_FL_USE_FMA_STEERING_PASS.
(cortex-a57.cortex-a53): Likewise.
(cortex-a72): Use cortexa72_tunings.
(cortex-a72.cortex-a53): Likewise.
(exynos-m1): Likewise.
* config/aarch64/aarch64-protos.h (tune_params): Add
a field: extra_tuning_flags.
* config/aarch64/aarch64-tuning-flags.def: New.
* config/aarch64/aarch64-protos.h (AARCH64_EXTRA_TUNING_OPTION): New.
(aarch64_extra_tuning_flags): Likewise.
(aarch64_tune_params): Declare here.
* config/aarch64/aarch64.c (generic_tunings): Set extra_tuning_flags.
(cortexa53_tunings): Likewise.
(cortexa57_tunings): Likewise.
(thunderx_tunings): Likewise.
(xgene1_tunings): Likewise.
(cortexa72_tunings): New.
* config/aarch64/cortex-a57-fma-steering.c: Include aarch64-protos.h.
 (gate): Check against aarch64_tune_params.
* config/aarch64/t-aarch64 (cortex-a57-fma-steering.o): Depend on
aarch64-protos.h.

diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index dfc9cc8..c4e22fe 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -40,13 +40,13 @@
 /* V8 Architecture Processors.  */
 
 AARCH64_CORE("cortex-a53",  cortexa53, cortexa53, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53, "0x41", "0xd03")
-AARCH64_CORE("cortex-a57",  cortexa57, cortexa57, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_USE_FMA_STEERING_PASS, cortexa57, "0x41", "0xd07")
-AARCH64_CORE("cortex-a72",  cortexa72, cortexa57, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd08")
-AARCH64_CORE("exynos-m1",   exynosm1,  cortexa57, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57, "0x53", "0x001")
+AARCH64_CORE("cortex-a57",  cortexa57, cortexa57, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07")
+AARCH64_CORE("cortex-a72",  cortexa72, cortexa57, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa72, "0x41", "0xd08")
+AARCH64_CORE("exynos-m1",   exynosm1,  cortexa57, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa72, "0x53", "0x001")
 AARCH64_CORE("thunderx",thunderx,  thunderx,  8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  "0x43", "0x0a1")
 AARCH64_CORE("xgene1",  xgene1,xgene1,8,  AARCH64_FL_FOR_ARCH8, xgene1, "0x50", "0x000")
 
 /* V8 big.LITTLE implementations.  */
 
-AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_USE_FMA_STEERING_PASS, cortexa57, "0x41", "0xd07.0xd03")
-AARCH64_CORE("cortex-a72.cortex-a53",  cortexa72cortexa53, cortexa53, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd08.0xd03")
+AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07.0xd03")
+AARCH64_CORE("cortex-a72.cortex-a53",  cortexa72cortexa53, cortexa53, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa72, "0x41", "0xd08.0xd03")
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 4bdcc46..7ece346 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -187,6 +187,7 @@ struct tune_params
   const int vec_reassoc_width;
   const int min_div_recip_mul_sf;
   const int min_div_recip_mul_df;
+  const unsigned int extra_tuning_flags;
 };
 
 #define AARCH64_FUSION_PAIR(x, name, index) \
@@ -209,6 +210,26 @@ enum aarch64_fusion_pairs
 };
 #undef AARCH64_FUSION_PAIR
 
+#define AARCH64_EXTRA_TUNING_OPTION(x, name, index) \
+  AARCH64_EXTRA_TUNE_##name = (1 << index),
+/* Supported tuning flags.  */
+enum aarch64_extra_tuning_flags
+{
+  AARCH64_EXTRA_TUNE_NONE = 0,
+#include "aarch64-tuning-flags.def"
+
+/* Hacky macro to build the "all" flag mask.
+   Expands to 0 | AARCH64_TUNE_index0 | AARCH64_TUNE_index1 , etc.  */
+#undef AARCH64_EXTRA_TUNING_OPTION
+#define AARCH64_EXTRA_TUNING_OPTION(x, name, y) \
+  | 

Re: [Patch SRA] Fix PR66119 by calling get_move_ratio in SRA

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 09:18:52AM +0100, James Greenhalgh wrote:
> This patch fixes the issue by always calling get_move_ratio in the SRA
> code, ensuring that an up-to-date value is used.
> 
> Unfortunately, this means we have to use 0 as a sentinel value for
> the parameter - indicating no user override of the feature - and
> therefore cannot use it to disable scalarization. However, there
> are other ways to disable scalarization (-fno-tree-sra) so this is not
> a great loss.

You can handle even that.

> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 8e34244..e2419af 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -2549,12 +2549,24 @@ analyze_all_variable_accesses (void)
>bitmap tmp = BITMAP_ALLOC (NULL);
>bitmap_iterator bi;
>unsigned i;
> +  bool optimize_speed_p = !optimize_function_for_size_p (cfun);
> +
>unsigned max_scalarization_size
> -= (optimize_function_for_size_p (cfun)
> - ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)
> - : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED))
> += (optimize_speed_p
> + ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED)
> + : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE))
>* BITS_PER_UNIT;
>  
> +  /* If the user didn't set PARAM_SRA_MAX_SCALARIZATION_SIZE_<...>,
> + fall back to a target default.  This means that zero cannot be
> + used to disable scalarization as we've taken it as a sentinel
> + value.  This is not ideal, but see PR66119 for the reason we
> + can't simply set the target defaults ahead of time during option
> + handling.  */
> +  if (!max_scalarization_size)

  enum compiler_param param
= optimize_function_for_size_p (cfun)
  ? PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE
  : PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED;
  unsigned max_scalarization_size = PARAM_VALUE (param) * BITS_PER_UNIT;
  if (!max_scalarization_size && !global_options_set.x_param_values[param])

Then it will handle explicit --param sra-max-scalarization-size-Os*=0
differently from implicit 0.

Or you could allow value of -1 for those params and make that the default
- -1 would mean the special get_move_ratio value, 0 would disable, > 0
would be the requested scalarization.
OT, shouldn't max_scalarization_size be at least unsigned HOST_WIDE_INT,
so that it doesn't overflow for larger values (0x4000 etc.)?
Probably need some cast in the multiplication to avoid UB in the compiler.
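
A minimal sketch of the -1 variant, with the widening done before the multiplication. The names are hypothetical: sk_target_default_bytes stands in for whatever the target derives from get_move_ratio, and unsigned long long stands in for unsigned HOST_WIDE_INT.

```c
#include <limits.h>

/* Hypothetical stand-in for the target default, in bytes.  */
unsigned long long
sk_target_default_bytes (void)
{
  return 16;
}

/* PARAM_VALUE is -1 when the user gave nothing, 0 to disable
   scalarization, and > 0 for an explicit size in bytes.  */
unsigned long long
sk_max_scalarization_bits (int param_value)
{
  unsigned long long bytes
    = (param_value < 0
       ? sk_target_default_bytes ()
       : (unsigned long long) param_value);

  /* Widen first, then multiply, so large byte counts cannot overflow
     a narrower type.  */
  return bytes * CHAR_BIT;
}
```

With this shape, 0 keeps its meaning as "disabled" and the target default is only consulted when nothing was requested.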

Jakub


[PATCH COMMITTED] MAINTAINERS (Write After Approval): Add myself.

2015-06-23 Thread Ludovic Courtès
FYI.

Index: ChangeLog
===
--- ChangeLog	(revision 224824)
+++ ChangeLog	(revision 224825)
@@ -1,3 +1,7 @@
+2015-06-23  Ludovic Courtès  
+
+	* MAINTAINERS (Write After Approval): Add myself.
+
 2015-06-22  Andreas Tobler  
 
 	* MAINTAINERS (OS Port Maintainers): Add myself.
Index: MAINTAINERS
===
--- MAINTAINERS	(revision 224824)
+++ MAINTAINERS	(revision 224825)
@@ -367,6 +367,7 @@
 Josh Conner	
 R. Kelley Cook	
 Christian Cornelssen
+Ludovic Courtès	
 Cary Coutant	
 Lawrence Crowl	
 Ian Dall	


[PATCH, i386]; Fix PR 66560, Fails to generate ADDSUBPS

2015-06-23 Thread Uros Bizjak
Hello!

Attached patch introduces combiner splitters to handle every possible
ADDSUB permutation of vec_merge and vec_select/vec_concat operands.
These combiners handle swapped PLUS and MINUS operators, and account
for commutative operands of PLUS RTX. As shown in the attached
testcases, there are quite a few ways to create ADDSUB.
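
For reference, the equivalence the splitters rely on can be modelled in scalar C. This is only a sketch with hypothetical names, using plain arrays rather than vector types: the four-lane ADDSUB subtracts in even lanes and adds in odd lanes, and the two-input shuffle picks indices 0..3 from the first operand and 4..7 from the second.

```c
/* Scalar model of a four-lane ADDSUB: even lanes subtract, odd lanes
   add.  */
void
sk_addsub4 (const float x[4], const float y[4], float out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = (i % 2 == 0) ? x[i] - y[i] : x[i] + y[i];
}

/* Scalar model of a two-input, four-lane shuffle: selector values
   0..3 pick from A, 4..7 pick from B.  */
void
sk_shuffle4 (const float a[4], const float b[4], const int sel[4],
	     float out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = sel[i] < 4 ? a[sel[i]] : b[sel[i] - 4];
}
```

Shuffling (x - y, x + y) with { 0, 5, 2, 7 } and shuffling (x + y, x - y) with { 4, 1, 6, 3 } both produce the sk_addsub4 result, which is why the testcases below expect one ADDSUB instruction per function.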

2015-06-23  Uros Bizjak  

PR target/66560
* config/i386/predicates.md (addsub_vm_operator): New predicate.
(addsub_vs_operator): Ditto.
(addsub_vs_parallel): Ditto.
* config/i386/sse.md (ssedoublemode): Add V4SF and V2DF modes.
(avx_addsubv4df3, avx_addsubv8sf3, sse3_addsubv2df3, sse3_addsubv4sf3):
Put minus RTX before plus and adjust vec_merge selector.
(*avx_addsubv4df3_1, *avx_addsubv4df3_1s, *sse3_addsubv2df3_1)
(*sse_addsubv2df3_1s, *avx_addsubv8sf3_1, *avx_addsubv8sf3_1s)
(*sse3_addsubv4sf3_1, *sse_addsubv4sf3_1s): Remove insn patterns.
(addsub vec_merge splitters): New combiner splitters.
(addsub vec_select/vec_concat splitters): Ditto.

testsuite/ChangeLog:

2015-06-23  Uros Bizjak  

PR target/66560
* gcc.target/i386/pr66560-1.c: New test.
* gcc.target/i386/pr66560-2.c: Ditto.
* gcc.target/i386/pr66560-3.c: Ditto.
* gcc.target/i386/pr66560-4.c: Ditto.

Patch was tested on x86_64-linux-gnu {,-m32} and was committed to mainline SVN.

Uros.
Index: testsuite/gcc.target/i386/pr66560-1.c
===
--- testsuite/gcc.target/i386/pr66560-1.c   (revision 0)
+++ testsuite/gcc.target/i386/pr66560-1.c   (revision 0)
@@ -0,0 +1,35 @@
+/* PR target/66560 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4" } */
+
+typedef float v4sf __attribute__((vector_size(16)));
+typedef int v4si __attribute__((vector_size(16)));
+v4sf foo1 (v4sf x, v4sf y)
+{
+  v4sf tem0 = x - y;
+  v4sf tem1 = x + y;
+  return __builtin_shuffle (tem0, tem1, (v4si) { 0, 5, 2, 7 });
+}
+
+v4sf foo2 (v4sf x, v4sf y)
+{
+  v4sf tem0 = x - y;
+  v4sf tem1 = y + x;
+  return __builtin_shuffle (tem0, tem1, (v4si) { 0, 5, 2, 7 });
+}
+
+v4sf foo3 (v4sf x, v4sf y)
+{
+  v4sf tem0 = x + y;
+  v4sf tem1 = x - y;
+  return __builtin_shuffle (tem0, tem1, (v4si) { 4, 1, 6, 3 });
+}
+
+v4sf foo4 (v4sf x, v4sf y)
+{
+  v4sf tem0 = y + x;
+  v4sf tem1 = x - y;
+  return __builtin_shuffle (tem0, tem1, (v4si) { 4, 1, 6, 3 });
+}
+
+/* { dg-final { scan-assembler-times "addsubps" 4 } } */
Index: testsuite/gcc.target/i386/pr66560-2.c
===
--- testsuite/gcc.target/i386/pr66560-2.c   (revision 0)
+++ testsuite/gcc.target/i386/pr66560-2.c   (revision 0)
@@ -0,0 +1,35 @@
+/* PR target/66560 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4" } */
+
+typedef double v2df __attribute__((vector_size(16)));
+typedef long long v2di __attribute__((vector_size(16)));
+v2df foo1 (v2df x, v2df y)
+{
+  v2df tem0 = x - y;
+  v2df tem1 = x + y;
+  return __builtin_shuffle (tem0, tem1, (v2di) { 0, 3 });
+}
+
+v2df foo2 (v2df x, v2df y)
+{
+  v2df tem0 = x - y;
+  v2df tem1 = y + x;
+  return __builtin_shuffle (tem0, tem1, (v2di) { 0, 3 });
+}
+
+v2df foo3 (v2df x, v2df y)
+{
+  v2df tem0 = x + y;
+  v2df tem1 = x - y;
+  return __builtin_shuffle (tem0, tem1, (v2di) { 2, 1 });
+}
+
+v2df foo4 (v2df x, v2df y)
+{
+  v2df tem0 = y + x;
+  v2df tem1 = x - y;
+  return __builtin_shuffle (tem0, tem1, (v2di) { 2, 1 });
+}
+
+/* { dg-final { scan-assembler-times "addsubpd" 4 } } */
Index: testsuite/gcc.target/i386/pr66560-3.c
===
--- testsuite/gcc.target/i386/pr66560-3.c   (revision 0)
+++ testsuite/gcc.target/i386/pr66560-3.c   (revision 0)
@@ -0,0 +1,35 @@
+/* PR target/66560 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx" } */
+
+typedef float v8sf __attribute__((vector_size(32)));
+typedef int v8si __attribute__((vector_size(32)));
+v8sf foo1 (v8sf x, v8sf y)
+{
+  v8sf tem0 = x - y;
+  v8sf tem1 = x + y;
+  return __builtin_shuffle (tem0, tem1, (v8si) { 0, 9, 2, 11, 4, 13, 6, 15 });
+}
+
+v8sf foo2 (v8sf x, v8sf y)
+{
+  v8sf tem0 = x - y;
+  v8sf tem1 = y + x;
+  return __builtin_shuffle (tem0, tem1, (v8si) { 0, 9, 2, 11, 4, 13, 6, 15 });
+}
+
+v8sf foo3 (v8sf x, v8sf y)
+{
+  v8sf tem0 = x + y;
+  v8sf tem1 = x - y;
+  return __builtin_shuffle (tem0, tem1, (v8si) { 8, 1, 10, 3, 12, 5, 14, 7 });
+}
+
+v8sf foo4 (v8sf x, v8sf y)
+{
+  v8sf tem0 = y + x;
+  v8sf tem1 = x - y;
+  return __builtin_shuffle (tem0, tem1, (v8si) { 8, 1, 10, 3, 12, 5, 14, 7 });
+}
+
+/* { dg-final { scan-assembler-times "vaddsubps" 4 } } */
Index: testsuite/gcc.target/i386/pr66560-4.c
===
--- testsuite/gcc.target/i386/pr66560-4.c   (revision 0)
+++ testsuite/gcc.target/i386/pr66560-4.c   (revision 0)
@@ -0,0 +1,35 @@
+/* PR target/66560 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx" } */
+
+typedef doub

Re: [Ping, Patch, fortran, 64674, v3] [OOP] ICE in ASSOCIATE with class array

2015-06-23 Thread Andre Vehreschild
Hi Paul,

thanks for the review. Submitted as r224827.

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/testsuite/gfortran.dg/associate_18.f08
===
--- gcc/testsuite/gfortran.dg/associate_18.f08	(Revision 0)
+++ gcc/testsuite/gfortran.dg/associate_18.f08	(Revision 224827)
@@ -0,0 +1,80 @@
+! { dg-do run }
+!
+! Contributed by Antony Lewis  
+!Andre Vehreschild  
+! Check that associating array-sections/scalars is working
+! with class arrays.
+!
+
+program associate_18
+  Type T
+integer :: map = 1
+  end Type T
+
+  class(T), allocatable :: av(:)
+  class(T), allocatable :: am(:,:)
+  class(T), pointer :: pv(:)
+  class(T), pointer :: pm(:,:)
+
+  integer :: iv(5) = 17
+  integer :: im(4,5) = 23
+  integer :: expect(20) = 23
+  integer :: c
+
+  allocate(av(2))
+  associate(i => av(1))
+i%map = 2
+  end associate
+  if (any (av%map /= [2,1])) call abort()
+  deallocate(av)
+
+  allocate(am(3,4))
+  associate(pam => am(2:3, 2:3))
+pam%map = 7
+pam(1,2)%map = 8
+  end associate
+  if (any (reshape(am%map, [12]) /= [1,1,1, 1,7,7, 1,8,7, 1,1,1])) call abort()
+  deallocate(am)
+
+  allocate(pv(2))
+  associate(i => pv(1))
+i%map = 2
+  end associate
+  if (any (pv%map /= [2,1])) call abort()
+  deallocate(pv)
+
+  allocate(pm(3,4))
+  associate(ppm => pm(2:3, 2:3))
+ppm%map = 7
+ppm(1,2)%map = 8
+  end associate
+  if (any (reshape(pm%map, [12]) /= [1,1,1, 1,7,7, 1,8,7, 1,1,1])) call abort()
+  deallocate(pm)
+
+  associate(i => iv(1))
+i = 7
+  end associate
+  if (any (iv /= [7, 17, 17, 17, 17])) call abort()
+
+  associate(pam => im(2:3, 2:3))
+pam = 9
+pam(1,2) = 10
+do c = 1, 2
+pam(2, c) = 0
+end do
+  end associate
+  if (any (reshape(im, [20]) /= [23,23,23,23, 23,9,0,23, &
+23,10,0,23, 23,23,23,23, 23,23,23,23])) call abort()
+
+  expect(2:3) = 9
+  do c = 1, 5
+im = 23
+associate(pam => im(:, c))
+  pam(2:3) = 9
+end associate
+if (any (reshape(im, [20]) /= expect)) call abort()
+! Shift expect
+expect = [expect(17:), expect(:16)]
+  end do
+end program
+
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 224826)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,8 @@
+2015-06-23  Andre Vehreschild  
+
+	PR fortran/64674
+	* gfortran.dg/associate_18.f08: New test.
+
 2015-06-23  Uros Bizjak  
 
 	PR target/66560
Index: gcc/fortran/parse.c
===
--- gcc/fortran/parse.c	(Revision 224826)
+++ gcc/fortran/parse.c	(Arbeitskopie)
@@ -3958,6 +3958,8 @@
   for (a = new_st.ext.block.assoc; a; a = a->next)
 {
   gfc_symbol* sym;
+  gfc_ref *ref;
+  gfc_array_ref *array_ref;
 
   if (gfc_get_sym_tree (a->name, NULL, &a->st, false))
 	gcc_unreachable ();
@@ -3974,6 +3976,84 @@
 	 for parsing component references on the associate-name
 	 in case of association to a derived-type.  */
   sym->ts = a->target->ts;
+
+  /* Check if the target expression is array valued.  This can not always
+	 be done by looking at target.rank, because that might not have been
+	 set yet.  Therefore traverse the chain of refs, looking for the last
+	 array ref and evaluate that.  */
+  array_ref = NULL;
+  for (ref = a->target->ref; ref; ref = ref->next)
+	if (ref->type == REF_ARRAY)
+	  array_ref = &ref->u.ar;
+  if (array_ref || a->target->rank)
+	{
+	  gfc_array_spec *as;
+	  int dim, rank = 0;
+	  if (array_ref)
+	{
+	  /* Count the dimensions that have a non-scalar extent.  */
+	  for (dim = 0; dim < array_ref->dimen; ++dim)
+		if (array_ref->dimen_type[dim] != DIMEN_ELEMENT
+		&& !(array_ref->dimen_type[dim] == DIMEN_UNKNOWN
+			 && array_ref->end[dim] == NULL
+			 && array_ref->start[dim] != NULL))
+		  ++rank;
+	}
+	  else
+	rank = a->target->rank;
+	  /* When the rank is greater than zero then sym will be an array.  */
+	  if (sym->ts.type == BT_CLASS)
+	{
+	  if ((!CLASS_DATA (sym)->as && rank != 0)
+		  || (CLASS_DATA (sym)->as
+		  && CLASS_DATA (sym)->as->rank != rank))
+		{
+		  /* Don't just (re-)set the attr and as in the sym.ts,
+		 because this modifies the target's attr and as.  Copy the
+		 data and do a build_class_symbol.  */
+		  symbol_attribute attr = CLASS_DATA (a->target)->attr;
+		  int corank = gfc_get_corank (a->target);
+		  gfc_typespec type;
+
+		  if (rank || corank)
+		{
+		  as = gfc_get_array_spec ();
+		  as->type = AS_DEFERRED;
+		  as->rank = rank;
+		  as->corank = corank;
+		  attr.dimension = rank ? 1 : 0;
+		  attr.codimension = corank ? 1 : 0;
+		}
+		  else
+		{
+		  as = NULL;
+		  attr.dimension = attr.codimension = 0;
+		}
+		  attr.class_ok = 0;
+		  type = CLASS_DATA (sym)->ts;
+		  if (!gfc_build_clas

Re: [PATCH][ARM] PR/65711: Don't pass '-dynamic-linker' when '-shared' is used

2015-06-23 Thread Richard Biener
On Mon, May 18, 2015 at 9:09 PM, Ludovic Courtès  wrote:
> Ramana Radhakrishnan  skribis:
>
>> On Thu, Apr 23, 2015 at 9:29 AM, Ludovic Courtès  wrote:
>>> As discussed at .
>>>
>>> Patch is for both 4.8 and 4.9 (possibly 5.1 too, I haven’t checked.)
>>>
>>
>> OK for trunk. This is also ok for all release branches if no
>> objections in 24 hours.
>
> OK, thank you.
>
> I haven’t applied for write-after-approval so perhaps you should commit
> it yourself?

So you just committed to the already closed 4.8 branch.  Please always
check gcc.gnu.org/ for branch status.

Richard.

> Ludo’.


Re: [PATCH][ARM] PR/65711: Don't pass '-dynamic-linker' when '-shared' is used

2015-06-23 Thread Ludovic Courtès
Ramana Radhakrishnan  skribis:

> On Thu, Apr 23, 2015 at 9:29 AM, Ludovic Courtès  wrote:
>> As discussed at .
>>
>> Patch is for both 4.8 and 4.9 (possibly 5.1 too, I haven’t checked.)
>>
>
> OK for trunk. This is also ok for all release branches if no
> objections in 24 hours.

[...]

>> gcc/
>> 2015-04-23  Ludovic Courtès  
>>
>> PR 65711
>> * config/arm/linux-elf.h (LINUX_TARGET_LINK_SPEC): Move
>> '-dynamic-linker' within %{!shared: ...}.

Committed to gcc-4_8-branch, gcc-4_9-branch, gcc-5-branch, and trunk.
Please let me know if there’s anything I missed.

Thanks,
Ludo’.


Re: [PATCH] Check dominator info in compute_dominance_frontiers

2015-06-23 Thread Richard Biener
On Mon, Jun 22, 2015 at 7:10 PM, Tom de Vries  wrote:
> On 22/06/15 13:47, Richard Biener wrote:
>>>
>>> (eventually also for the case where we
>>> >>end up only computing the fast-query stuff).
>>>
>
> Like this?
> ...
> diff --git a/gcc/dominance.c b/gcc/dominance.c
> index 9c66ca2..58fc6fd 100644
> --- a/gcc/dominance.c
> +++ b/gcc/dominance.c
> @@ -679,6 +679,12 @@ calculate_dominance_info (enum cdi_direction dir)
>free_dom_info (&di);
>dom_computed[dir_index] = DOM_NO_FAST_QUERY;
>  }
> +  else
> +{
> +#if ENABLE_CHECKING
> +  verify_dominators (CDI_DOMINATORS);
> +#endif
> +}
>
>compute_dom_fast_query (dir);

Yeah.

Richard.

> ...
>
> Thanks,
> - Tom


Re: [PATCH][ARM] PR/65711: Don't pass '-dynamic-linker' when '-shared' is used

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 11:30:25AM +0200, Ludovic Courtès wrote:
> Ramana Radhakrishnan  skribis:
> 
> > On Thu, Apr 23, 2015 at 9:29 AM, Ludovic Courtès  wrote:
> >> As discussed at .
> >>
> >> Patch is for both 4.8 and 4.9 (possibly 5.1 too, I haven’t checked.)
> >>
> >
> > OK for trunk. This is also ok for all release branches if no
> > objections in 24 hours.
> 
> [...]
> 
> >> gcc/
> >> 2015-04-23  Ludovic Courtès  
> >>
> >> PR 65711
> >> * config/arm/linux-elf.h (LINUX_TARGET_LINK_SPEC): Move
> >> '-dynamic-linker' within %{!shared: ...}.
> 
> Committed to gcc-4_8-branch, gcc-4_9-branch, gcc-5-branch, and trunk.
> Please let me know if there’s anything I missed.

See richi's mail.  4.8 branch has already been closed, and 4.9 branch
is frozen, so you should have asked for RM permission.
Also, in the ChangeLog entries, one should write it in the form
PR component/bugno,
so
PR target/65711
in your case.

Jakub


Re: [i386, PATCH, 3/3] IA MCU psABI support: testsuite.

2015-06-23 Thread Kirill Yukhin
Hello,
This patch introduces tests for the new psABI.

gcc/testsuite/
* gcc.target/i386/iamcu/abi-iamcu.exp: New file.
* gcc.target/i386/iamcu/args.h: Likewise.
* gcc.target/i386/iamcu/asm-support.S: Likewise.
* gcc.target/i386/iamcu/defines.h: Likewise.
* gcc.target/i386/iamcu/macros.h: Likewise.
* gcc.target/i386/iamcu/test_3_element_struct_and_unions.c: Likewise.
* gcc.target/i386/iamcu/test_basic_64bit_returning.c: Likewise.
* gcc.target/i386/iamcu/test_basic_alignment.c: Likewise.
* gcc.target/i386/iamcu/test_basic_array_size_and_align.c: Likewise.
* gcc.target/i386/iamcu/test_basic_returning.c: Likewise.
* gcc.target/i386/iamcu/test_basic_sizes.c: Likewise.
* gcc.target/i386/iamcu/test_basic_struct_size_and_align.c: Likewise.
* gcc.target/i386/iamcu/test_basic_union_size_and_align.c: Likewise.
* gcc.target/i386/iamcu/test_bitfields.c: Likewise.
* gcc.target/i386/iamcu/test_complex_returning.c: Likewise.
* gcc.target/i386/iamcu/test_passing_floats.c: Likewise.
* gcc.target/i386/iamcu/test_passing_integers.c: Likewise.
* gcc.target/i386/iamcu/test_passing_structs.c: Likewise.
* gcc.target/i386/iamcu/test_passing_structs_and_unions.c: Likewise.
* gcc.target/i386/iamcu/test_passing_unions.c: Likewise.
* gcc.target/i386/iamcu/test_struct_returning.c: Likewise.
* gcc.target/i386/iamcu/test_varargs.c: Likewise.

The new tests pass when run for a 32-bit target.
Is it ok for trunk?

--
Thanks, K

diff --git a/gcc/testsuite/gcc.target/i386/iamcu/abi-iamcu.exp 
b/gcc/testsuite/gcc.target/i386/iamcu/abi-iamcu.exp
new file mode 100644
index 000..b5b3261
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/iamcu/abi-iamcu.exp
@@ -0,0 +1,42 @@
+# Copyright (C) 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+# The Intel MCU psABI testsuite needs one additional assembler file for
+# most testcases.  For simplicity we will just link it into each test.
+
+load_lib c-torture.exp
+load_lib target-supports.exp
+load_lib torture-options.exp
+
+if { (![istarget x86_64-*-linux*] && ![istarget i?86-*-linux*])
+ || ![is-effective-target ia32] } then {
+  return
+}
+
+
+torture-init
+set-torture-options $C_TORTURE_OPTIONS
+set additional_flags "-miamcu -W -Wall -Wno-abi"
+
+foreach src [lsort [glob -nocomplain $srcdir/$subdir/test_*.c]] {
+if {[runtest_file_p $runtests $src]} {
+   c-torture-execute [list $src \
+   $srcdir/$subdir/asm-support.S] \
+   $additional_flags
+}
+}
+
+torture-finish
diff --git a/gcc/testsuite/gcc.target/i386/iamcu/args.h 
b/gcc/testsuite/gcc.target/i386/iamcu/args.h
new file mode 100644
index 000..f8abde4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/iamcu/args.h
@@ -0,0 +1,77 @@
+#ifndef INCLUDED_ARGS_H
+#define INCLUDED_ARGS_H
+
+/* This defines the calling sequences for integers and floats.  */
+#define I0 eax
+#define I1 edx
+#define I2 ecx
+
+typedef unsigned int size_t;
+
+extern void (*callthis)(void);
+extern unsigned long eax,ebx,ecx,edx,esi,edi,esp,ebp;
+extern unsigned long sret_eax;
+extern volatile unsigned long volatile_var;
+extern void snapshot (void);
+extern void snapshot_ret (void);
+extern void *iamcu_memset (void *, int, size_t);
+#define WRAP_CALL(N) \
+  (callthis = (void (*)()) (N), (typeof (&N)) snapshot)
+#define WRAP_RET(N) \
+  (callthis = (void (*)()) (N), (typeof (&N)) snapshot_ret)
+
+/* Clear all scratch integer registers.  */
+#define clear_int_hardware_registers \
+  asm __volatile__ ("xor %%eax, %%eax\n\t" \
+   "xor %%edx, %%edx\n\t" \
+   "xor %%ecx, %%ecx\n\t" \
+   ::: "eax", "edx", "ecx");
+
+/* Clear all scratch integer registers, excluding the one used to return
+   aggregate.  */
+#define clear_non_sret_int_hardware_registers \
+  asm __volatile__ ("xor %%edx, %%edx\n\t" \
+   "xor %%ecx, %%ecx\n\t" \
+   ::: "edx", "ecx");
+
+/* This is the list of registers available for passing arguments. Not all of
+   these are used or even really available.  */
+struct IntegerRegisters
+{
+  unsigned long eax, ebx, ecx, edx, esi, edi;
+};
+
+/* Implemented in scalarargs.c  */
+extern struct IntegerRegiste

Re: [PATCH][ARM] PR/65711: Don't pass '-dynamic-linker' when '-shared' is used

2015-06-23 Thread Ludovic Courtès
Richard Biener  skribis:

> On Mon, May 18, 2015 at 9:09 PM, Ludovic Courtès  wrote:
>> Ramana Radhakrishnan  skribis:
>>
>>> On Thu, Apr 23, 2015 at 9:29 AM, Ludovic Courtès  wrote:
>>>> As discussed at .
>>>>
>>>> Patch is for both 4.8 and 4.9 (possibly 5.1 too, I haven’t checked.)

>>>
>>> OK for trunk. This is also ok for all release branches if no
>>> objections in 24 hours.
>>
>> OK, thank you.
>>
>> I haven’t applied for write-after-approval so perhaps you should commit
>> it yourself?
>
> So you just committed to the already closed 4.8 branch.  Please always
> check gcc.gnu.org/ for branch status.

My bad, sorry about that!

Ludo’.


Re: match.pd: Three new patterns

2015-06-23 Thread Marek Polacek
On Tue, Jun 23, 2015 at 10:22:35AM +0200, Richard Biener wrote:
> Who says you can't do bitwise ops on them?  I can't see that being
> enforced in the GIMPLE checking in tree-cfg.c.  Yes, there is no
> such thing as a "saturating" bitwise and but bitwise and should
> just work fine.
> 
> You can check with a arm cross what the C FE does when you use
> bitwise ops but I believe the regular and/ior md patterns work
> just fine (there are no special modes/registers but they seem
> to be shared with regular registers, just special operations
> are available).

Ok ;).  Applied the following then.

2015-06-23  Marek Polacek  

* match.pd ((x + y) - (x | y) -> x & y,
(x + y) - (x & y) -> x | y): New patterns.

* gcc.dg/fold-minus-4.c: New test.
* gcc.dg/fold-minus-5.c: New test.
* c-c++-common/ubsan/overflow-add-5.c: New test.

diff --git gcc/match.pd gcc/match.pd
index badb80a..9c88e3e 100644
--- gcc/match.pd
+++ gcc/match.pd
@@ -343,6 +343,20 @@ along with GCC; see the file COPYING3.  If not see
  (plus:c (bit_and @0 @1) (bit_ior @0 @1))
  (plus @0 @1))
 
+/* (x + y) - (x | y) -> x & y */
+(simplify
+ (minus (plus @0 @1) (bit_ior @0 @1))
+ (if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_OVERFLOW_TRAPS (type)
+  && !TYPE_SATURATING (type))
+  (bit_and @0 @1)))
+
+/* (x + y) - (x & y) -> x | y */
+(simplify
+ (minus (plus @0 @1) (bit_and @0 @1))
+ (if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_OVERFLOW_TRAPS (type)
+  && !TYPE_SATURATING (type))
+  (bit_ior @0 @1)))
+
 /* (x | y) - (x ^ y) -> x & y */
 (simplify
  (minus (bit_ior @0 @1) (bit_xor @0 @1))
diff --git gcc/testsuite/c-c++-common/ubsan/overflow-add-5.c gcc/testsuite/c-c++-common/ubsan/overflow-add-5.c
index e69de29..905a60a 100644
--- gcc/testsuite/c-c++-common/ubsan/overflow-add-5.c
+++ gcc/testsuite/c-c++-common/ubsan/overflow-add-5.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=signed-integer-overflow" } */
+
+int __attribute__ ((noinline))
+foo (int i, int j)
+{
+  return (i + j) - (i | j);
+}
+
+/* { dg-output "signed integer overflow: 2147483647 \\+ 1 cannot be represented in type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*signed integer overflow: -2147483648 - 2147483647 cannot be represented in type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+
+int __attribute__ ((noinline))
+bar (int i, int j)
+{
+  return (i + j) - (i & j);
+}
+
+/* { dg-output "\[^\n\r]*signed integer overflow: 2147483647 \\+ 1 cannot be represented in type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*signed integer overflow: -2147483648 - 1 cannot be represented in type 'int'" } */
+
+int
+main ()
+{
+  int r = foo (__INT_MAX__, 1);
+  asm volatile ("" : "+g" (r));
+  r = bar (__INT_MAX__, 1);
+  asm volatile ("" : "+g" (r));
+  return 0;
+}
diff --git gcc/testsuite/gcc.dg/fold-minus-4.c gcc/testsuite/gcc.dg/fold-minus-4.c
index e69de29..2d76b4f 100644
--- gcc/testsuite/gcc.dg/fold-minus-4.c
+++ gcc/testsuite/gcc.dg/fold-minus-4.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-cddce1" } */
+
+int
+fn1 (int a, int b)
+{
+  int tem1 = a + b;
+  int tem2 = a & b;
+  return tem1 - tem2;
+}
+
+int
+fn2 (int a, int b)
+{
+  int tem1 = b + a;
+  int tem2 = a & b;
+  return tem1 - tem2;
+}
+
+int
+fn3 (int a, int b)
+{
+  int tem1 = a + b;
+  int tem2 = b & a;
+  return tem1 - tem2;
+}
+
+int
+fn4 (int a, int b)
+{
+  int tem1 = b + a;
+  int tem2 = b & a;
+  return tem1 - tem2;
+}
+
+/* { dg-final { scan-tree-dump-not " & " "cddce1" } } */
+/* { dg-final { scan-tree-dump-not " \\+ " "cddce1" } } */
diff --git gcc/testsuite/gcc.dg/fold-minus-5.c gcc/testsuite/gcc.dg/fold-minus-5.c
index e69de29..a31e1cc 100644
--- gcc/testsuite/gcc.dg/fold-minus-5.c
+++ gcc/testsuite/gcc.dg/fold-minus-5.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-cddce1" } */
+
+int
+fn1 (int a, int b)
+{
+  int tem1 = a + b;
+  int tem2 = a | b;
+  return tem1 - tem2;
+}
+
+int
+fn2 (int a, int b)
+{
+  int tem1 = b + a;
+  int tem2 = a | b;
+  return tem1 - tem2;
+}
+
+int
+fn3 (int a, int b)
+{
+  int tem1 = a + b;
+  int tem2 = b | a;
+  return tem1 - tem2;
+}
+
+int
+fn4 (int a, int b)
+{
+  int tem1 = b + a;
+  int tem2 = b | a;
+  return tem1 - tem2;
+}
+
+/* { dg-final { scan-tree-dump-not " \\+ " "cddce1" } } */
+/* { dg-final { scan-tree-dump-not " \\| " "cddce1" } } */
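
For readers following along, here is a small sanity check of the two
identities behind the new patterns.  It is only a sketch, not part of the
applied patch: it uses 8-bit unsigned operands so that wraparound is well
defined and the whole operand space can be enumerated, and the function name
count_identity_failures is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Count the 8-bit operand pairs for which either identity fails.
   Unsigned arithmetic is used so that wraparound is well defined.  */
unsigned
count_identity_failures (void)
{
  unsigned failures = 0;

  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        uint8_t a = (uint8_t) x;
        uint8_t b = (uint8_t) y;
        uint8_t sum = (uint8_t) (a + b);

        /* (x + y) - (x | y) -> x & y */
        if ((uint8_t) (sum - (a | b)) != (uint8_t) (a & b))
          failures++;

        /* (x + y) - (x & y) -> x | y */
        if ((uint8_t) (sum - (a & b)) != (uint8_t) (a | b))
          failures++;
      }

  return failures;
}
```

Both identities follow from x + y == (x | y) + (x & y), which also holds
modulo 2^N.  The guards in the pattern matter only for types where the
intermediate plus or minus is observable (sanitized, trapping or saturating
arithmetic).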

Marek


Re: [PATCH] c/66516 - missing diagnostic on taking the address of a builtin function

2015-06-23 Thread Marek Polacek
On Mon, Jun 22, 2015 at 07:59:20PM -0600, Martin Sebor wrote:
> >It seems like this patch regresses pr59630.c testcase; I don't see
> >the testcase being addressed in this patch.
> 
> Thanks for the review and for pointing out this regression!
> I missed it among all the C test suite failures (I see 157
> of them in 24 distinct tests on x86_64.)

You might want to use contrib/test_summary and then compare its
outputs.

> pr59630 is marked ice-on-valid-code even though the call via
> the converted pointer is clearly invalid (UB). What's more
> relevant, though, is that the test case is one of those that
> (while they both compile and link with the unpatched GCC) are
> not intended to compile with the patch (and don't compile with
> Clang).

Right, just turn dg-warning into dg-error.

> In this simple case, the call to __builtin_abs(0) is folded
> into the constant 0, but in more involved cases GCC emits
> a call to abs. It's not clear to me from the manual or from
> the builtin tests I've seen whether this is by design or
> an accident of the implementation
> 
> Is it intended that programs be able to take the address of
> the builtins that correspond to libc functions and make calls
> to the underlying libc functions via such pointers? (If so,
> the patch will need some tweaking.)

I don't think so, at least clang doesn't allow e.g.
size_t (*fp) (const char *) = __builtin_strlen;

Marek


Re: [PATCH] c/66516 - missing diagnostic on taking the address of a builtin function

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 12:18:30PM +0200, Marek Polacek wrote:
> > Is it intended that programs be able to take the address of
> > the builtins that correspond to libc functions and make calls
> > to the underlying libc functions via such pointers? (If so,
> > the patch will need some tweaking.)
> 
> I don't think so, at least clang doesn't allow e.g.
> size_t (*fp) (const char *) = __builtin_strlen;

Well, clang is irrelevant here, __builtin_strlen etc. is a GNU
extension, so it matters what we decide about it.  As this used to work
for decades (if the builtin function has a libc fallback), suddenly
rejecting it could break various programs that e.g. just
#define strlen __builtin_strlen
or similar.  Can't we really reject it just for the functions
that don't have a unique fallback?

Jakub


Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-06-23 Thread Alexander Monakov
On Tue, 23 Jun 2015, Ramana Radhakrishnan wrote:
> > If you try disabling the REG_EQUAL note generation [*], you'll probably find a
> > performance regression on arm32 (and probably on aarch64 as well?
> > we only
> 
> IMHO disabling the REG_EQUAL note generation is the wrong way to go about this.

Of course.  I only mentioned that as a way to look at no-plt codegen that we
used, lacking a solution to the combine problem.

Alexander


Re: [PATCH] Enable two UNSIGNED_FLOAT simplifications in simplify_unary_operation_1

2015-06-23 Thread Renlin Li

Hi Christophe,

Yes, we have also noticed this failure.

Here is a simple patch that removes the -mfloat-abi option for
hard-float toolchains, so the default ABI is used.

For non-hard-float toolchains, the softfp ABI is specified.

I have checked with the arm-none-eabi and arm-none-linux-gnueabihf
toolchains; this patch should resolve the problem.


Okay to commit?


gcc/testsuite/ChangeLog:

2015-06-23  Renlin Li  

* gcc.target/arm/unsigned-float.c: Different options for hf toolchain.


On 16/06/15 14:33, Christophe Lyon wrote:

On 20 March 2015 at 18:03, Renlin Li  wrote:

Hi all,

This is a simple patch to enable two simplifications for UNSIGNED_FLOAT
expression.

The following rtx patterns can be simplified when the integer x can be
represented in float mode without precision loss:

float_truncate (float x) --> float x
float_extend (float x) --> float x

Those two simplifications are also applicable to UNSIGNED_FLOAT expression.

For example, compile the following code using aarch64-none-elf toolchain
with -O1 flag.
double
f1 (uint16_t x)
{
   return (double)(float)x;
}
Before the change, the compiler generates the following code:
f1:
 uxth    w0, w0
 ucvtf   s0, w0
 fcvt    d0, s0
 ret
After the change, the following simplified asm code snippet is generated:
f1:
 uxth    w0, w0
 ucvtf   d0, w0
 ret
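
The precision argument above can be checked exhaustively for uint16_t.  The
following is a minimal sketch, not part of the patch: it assumes float is
IEEE binary32 (24-bit significand), so every 16-bit unsigned value is exactly
representable, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The conversion chain from the example: integer -> float -> double.  */
double
via_float (uint16_t x)
{
  return (double) (float) x;
}

/* The simplified form: integer -> double directly.  */
double
direct (uint16_t x)
{
  return (double) x;
}

/* Count the uint16_t values for which the two forms disagree.  */
unsigned
count_mismatches (void)
{
  unsigned mismatches = 0;

  for (uint32_t i = 0; i <= UINT16_MAX; i++)
    if (via_float ((uint16_t) i) != direct ((uint16_t) i))
      mismatches++;

  return mismatches;
}
```

For a wider source type such as uint32_t the intermediate float can round,
which is why the simplification is conditional on the integer being
representable in float mode.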


aarch64-none-elf regression test runs Okay. x86_64 bootstraps Okay.
Okay to commit?

gcc/ChangeLog:

2015-03-20  Renlin Li  

 * simplify-rtx.c (simplify_unary_operation_1): Fix a typo. Enable two
 simplifications for UNSIGNED_FLOAT.

gcc/testsuite/ChangeLog:

2015-03-20  Renlin Li  

 * gcc.target/aarch64/unsigned-float.c: New.
 * gcc.target/arm/unsigned-float.c: New.

This new test fails on ARM targets defaulting to hard-float which have
no softfp multilib.
I'm not sure about the best way to fix this.

Note that dg-require-effective-target arm_vfp_ok passes, but the
testcase fails because it includes stdint.h, leading to:
sysroot-arm-none-linux-gnueabihf/usr/include/gnu/stubs.h:7:29: fatal
error: gnu/stubs-soft.h: No such file or directory

Christophe.

diff --git a/gcc/testsuite/gcc.target/arm/unsigned-float.c b/gcc/testsuite/gcc.target/arm/unsigned-float.c
index bb05c85..b9ed681 100644
--- a/gcc/testsuite/gcc.target/arm/unsigned-float.c
+++ b/gcc/testsuite/gcc.target/arm/unsigned-float.c
@@ -1,7 +1,8 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_vfp_ok } */
-/* { dg-options "-march=armv7-a -O1 -mfloat-abi=softfp" } */
 /* { dg-skip-if "need fp instructions" { *-*-* } { "-mfloat-abi=soft" } { "" } } */
+/* { dg-options "-march=armv7-a -O1" } */
+/* { dg-additional-options "-mfloat-abi=softfp" { target { ! { arm_hf_eabi } } } } */
 
 #include <stdint.h>
 


Re: [gomp4.1] Handle linear clause on worksharing loop

2015-06-23 Thread Ilya Verbin
On Thu, Jun 18, 2015 at 15:15:21 +0200, Jakub Jelinek wrote:
> This patch adds support for linear clause on OpenMP 4.1 worksharing loops.
> 
> 2015-06-18  Jakub Jelinek  
> 
>   * gimplify.c (gimplify_scan_omp_clauses): For linear clause
>   on worksharing loop combined with parallel add shared clause
>   on the parallel.
>   * omp-low.c (lower_rec_input_clauses): Set lastprivate_firstprivate
>   flag for linear that needs copyin and copyout.
>   (expand_omp_for_generic, expand_omp_for_static_nochunk,
>   expand_omp_for_static_chunk): Handle linear clauses on worksharing
>   loop.
>   (lower_omp_for): Adjust OMP_CLAUSE_DECL and OMP_CLAUSE_LINEAR_STEP
>   so that expand_omp_for_* can use it during expansion for linear
>   adjustments.
> gcc/c-family/
>   * c-omp.c (c_omp_split_clauses): Fix up a comment.  Put
>   OMP_CLAUSE_LINEAR on OMP_FOR if not combined with OMP_SIMD.
> libgomp/
>   * testsuite/libgomp.c/pr66199-3.c: New test.
>   * testsuite/libgomp.c/pr66199-4.c: New test.
>   * testsuite/libgomp.c/linear-1.c: New test.
>   * testsuite/libgomp.c/linear-2.c: New test.
>   * testsuite/libgomp.c++/linear-1.C: New test.
>   * testsuite/libgomp.c++/linear-2.C: New test.

Have you seen this (using mic emul)?

FAIL: libgomp.c/linear-2.c execution test
FAIL: libgomp.c++/linear-2.C execution test

 -- Ilya


Re: [gomp4.1] Handle linear clause on worksharing loop

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 02:06:27PM +0300, Ilya Verbin wrote:
> On Thu, Jun 18, 2015 at 15:15:21 +0200, Jakub Jelinek wrote:
> > This patch adds support for linear clause on OpenMP 4.1 worksharing loops.
> > 
> > 2015-06-18  Jakub Jelinek  
> > 
> > * gimplify.c (gimplify_scan_omp_clauses): For linear clause
> > on worksharing loop combined with parallel add shared clause
> > on the parallel.
> > * omp-low.c (lower_rec_input_clauses): Set lastprivate_firstprivate
> > flag for linear that needs copyin and copyout.
> > (expand_omp_for_generic, expand_omp_for_static_nochunk,
> > expand_omp_for_static_chunk): Handle linear clauses on worksharing
> > loop.
> > (lower_omp_for): Adjust OMP_CLAUSE_DECL and OMP_CLAUSE_LINEAR_STEP
> > so that expand_omp_for_* can use it during expansion for linear
> > adjustments.
> > gcc/c-family/
> > * c-omp.c (c_omp_split_clauses): Fix up a comment.  Put
> > OMP_CLAUSE_LINEAR on OMP_FOR if not combined with OMP_SIMD.
> > libgomp/
> > * testsuite/libgomp.c/pr66199-3.c: New test.
> > * testsuite/libgomp.c/pr66199-4.c: New test.
> > * testsuite/libgomp.c/linear-1.c: New test.
> > * testsuite/libgomp.c/linear-2.c: New test.
> > * testsuite/libgomp.c++/linear-1.C: New test.
> > * testsuite/libgomp.c++/linear-2.C: New test.
> 
> Have you seen this (using mic emul)?
> 
> FAIL: libgomp.c/linear-2.c execution test
> FAIL: libgomp.c++/linear-2.C execution test

I admit I've been testing gomp-4.1 branch only without mic emul/ptx,
and for linear-2.c I've only eyeballed the omplower and ssa dumps,
so yes, it is possible something is broken for distribute.
Though, I'd more expect it to be broken on the (non-existing) PTX
OpenMP support than mic emul, where we use only a single team.

Jakub


Re: [AArch64] Implement -fpic for -mcmodel=small

2015-06-23 Thread Marcus Shawcroft
On 20 May 2015 at 11:21, Jiong Wang  wrote:

> gcc/
>   * config/aarch64/aarch64.md: (ldr_got_small_): Support new GOT relocation
>   modifiers.
>   (ldr_got_small_sidi): Ditto.
>   * config/aarch64/iterators.md (got_modifier): New mode iterator.
>   * config/aarch64/aarch64-opts.h (aarch64_code_model): New model.
>   * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Support -fpic.
>   (aarch64_rtx_costs): Add costs for new instruction sequences.
>   (initialize_aarch64_code_model): Initialize new model.
>   (aarch64_classify_symbol): Recognize new model.
>   (aarch64_asm_preferred_eh_data_format): Support new model.
>   (aarch64_load_symref_appropriately): Generate new instruction sequences for -fpic.
>   (TARGET_USE_PSEUDO_PIC_REG): New definition.
>   (aarch64_use_pseudo_pic_reg): New function.
>
> gcc/testsuite/
>   * gcc.target/aarch64/pic-small.c: New testcase.


Rather than thread tests against aarch64_cmodel throughout the
existing code can we instead extend classify_symbol with a new symbol
classification?

Cheers
/Marcus


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-23 Thread Ilya Verbin
On Sat, Jun 20, 2015 at 00:35:14 +0300, Ilya Verbin wrote:
> Given that a mapped variable in 4.1 can have different kinds across nested data
> regions, we need to store map-type not only for each var, but also for each
> structured mapping.  Here is my WIP patch, is it sane? :)
> Attached testcase works OK on the device with non-shared memory.

A bit updated version with a fix for GOMP_MAP_TO_PSET.
make check-target-libgomp passed.


include/gcc/
* gomp-constants.h (GOMP_MAP_ALWAYS_TO_P,
GOMP_MAP_ALWAYS_FROM_P): Define.
libgomp/
* libgomp.h (struct target_var_desc): New.
(struct target_mem_desc): Replace array of splay_tree_key with array of
target_var_desc.
(struct splay_tree_key_s): Move copy_from to target_var_desc.
* oacc-mem.c (gomp_acc_remove_pointer): Use copy_from from
target_var_desc.
* oacc-parallel.c (GOACC_parallel): Use copy_from from target_var_desc.
* target.c (gomp_map_vars_existing): Copy data to device if map-type is
'always to' or 'always tofrom'.
(gomp_map_vars): Use key from target_var_desc.  Set copy_from and
always_copy_from.
(gomp_copy_from_async): Use key and copy_from from target_var_desc.
(gomp_unmap_vars): Copy data from device if always_copy_from is set.
(gomp_offload_image_to_device): Do not use copy_from.
* testsuite/libgomp.c/target-11.c: New test.
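
To make the idea of a per-mapping descriptor concrete, here is a rough
sketch.  It is not the libgomp code: the enum values are hypothetical
stand-ins for the kinds in gomp-constants.h, the struct omits the splay key,
and the copy-back policy in copy_back_at_region_end is an assumption based on
the ChangeLog wording above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the map kinds; the real values live in
   include/gomp-constants.h.  */
enum map_kind
{
  MAP_TO, MAP_FROM, MAP_TOFROM,
  MAP_ALWAYS_TO, MAP_ALWAYS_FROM, MAP_ALWAYS_TOFROM
};

/* Same shape as GOMP_MAP_ALWAYS_TO_P in the patch.  */
bool
map_always_to_p (enum map_kind k)
{
  return k == MAP_ALWAYS_TO || k == MAP_ALWAYS_TOFROM;
}

/* Same shape as GOMP_MAP_ALWAYS_FROM_P in the patch.  */
bool
map_always_from_p (enum map_kind k)
{
  return k == MAP_ALWAYS_FROM || k == MAP_ALWAYS_TOFROM;
}

/* Per-mapping state, one per entry of a structured mapping, so that the
   same variable can carry different kinds in nested data regions.  */
struct var_desc
{
  bool copy_from;
  bool always_copy_from;
};

/* Build the per-mapping descriptor for one map kind.  */
struct var_desc
describe (enum map_kind k)
{
  struct var_desc d;
  d.copy_from = (k == MAP_FROM || k == MAP_TOFROM || map_always_from_p (k));
  d.always_copy_from = map_always_from_p (k);
  return d;
}

/* Assumed policy: copy back unconditionally for the "always" kinds,
   otherwise only when the last reference to the mapping goes away.  */
bool
copy_back_at_region_end (const struct var_desc *d, unsigned remaining_refs)
{
  return d->always_copy_from || (d->copy_from && remaining_refs == 0);
}
```

With a single flag stored in the splay key, an inner "always from" mapping
and an outer "tofrom" mapping of the same variable could not be told apart;
keeping the flags next to each list entry avoids that.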


diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index 1849478..42bec04 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -107,6 +107,12 @@ enum gomp_map_kind
 #define GOMP_MAP_POINTER_P(X) \
   ((X) == GOMP_MAP_POINTER)
 
+#define GOMP_MAP_ALWAYS_TO_P(X) \
+  (((X) == GOMP_MAP_ALWAYS_TO) || ((X) == GOMP_MAP_ALWAYS_TOFROM))
+
+#define GOMP_MAP_ALWAYS_FROM_P(X) \
+  (((X) == GOMP_MAP_ALWAYS_FROM) || ((X) == GOMP_MAP_ALWAYS_TOFROM))
+
 
 /* Asynchronous behavior.  Keep in sync with
libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t.  */
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 87d6c40..8e6d4ac 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -636,6 +636,15 @@ typedef struct splay_tree_node_s *splay_tree_node;
 typedef struct splay_tree_s *splay_tree;
 typedef struct splay_tree_key_s *splay_tree_key;
 
+struct target_var_desc {
+  /* Splay key.  */
+  splay_tree_key key;
+  /* True if data should be copied from device to host at the end.  */
+  bool copy_from;
+  /* True if data always should be copied from device to host at the end.  */
+  bool always_copy_from;
+};
+
 struct target_mem_desc {
   /* Reference count.  */
   uintptr_t refcount;
@@ -655,9 +664,9 @@ struct target_mem_desc {
   /* Corresponding target device descriptor.  */
   struct gomp_device_descr *device_descr;
 
-  /* List of splay keys to remove (or decrease refcount)
+  /* List of target items to remove (or decrease refcount)
  at the end of region.  */
-  splay_tree_key list[];
+  struct target_var_desc list[];
 };
 
 struct splay_tree_key_s {
@@ -673,8 +682,6 @@ struct splay_tree_key_s {
   uintptr_t refcount;
   /* Asynchronous reference count.  */
   uintptr_t async_refcount;
-  /* True if data should be copied from device to host at the end.  */
-  bool copy_from;
 };
 
 #include "splay-tree.h"
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 90d43eb..c0fcb07 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -651,7 +651,7 @@ gomp_acc_remove_pointer (void *h, bool force_copyfrom, int async, int mapnum)
 }
 
   if (force_copyfrom)
-t->list[0]->copy_from = 1;
+t->list[0].copy_from = 1;
 
   gomp_mutex_unlock (&acc_dev->lock);
 
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index d899946..8ea3dd1 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -135,8 +135,8 @@ GOACC_parallel (int device, void (*fn) (void *),
 
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
-devaddrs[i] = (void *) (tgt->list[i]->tgt->tgt_start
-   + tgt->list[i]->tgt_offset);
+devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
+   + tgt->list[i].key->tgt_offset);
 
   acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs, sizes, kinds,
  num_gangs, num_workers, vector_length, async,
diff --git a/libgomp/target.c b/libgomp/target.c
index fb8487a..b1640c1 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -161,6 +161,12 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep, splay_tree_key oldn,
  (void *) newn->host_start, (void *) newn->host_end,
  (void *) oldn->host_start, (void *) oldn->host_end);
 }
+
+  if (GOMP_MAP_ALWAYS_TO_P (kind))
+devicep->host2dev_func (devicep->target_id,
+   (void *) (oldn->tgt->tgt_start + oldn->tgt_offset),
+   (void *) newn->host_start,
+

Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 02:40:43PM +0300, Ilya Verbin wrote:
> On Sat, Jun 20, 2015 at 00:35:14 +0300, Ilya Verbin wrote:
> > Given that a mapped variable in 4.1 can have different kinds across nested data
> > regions, we need to store map-type not only for each var, but also for each
> > structured mapping.  Here is my WIP patch, is it sane? :)
> > Attached testcase works OK on the device with non-shared memory.
> 
> A bit updated version with a fix for GOMP_MAP_TO_PSET.
> make check-target-libgomp passed.

Ok, thanks.

> include/gcc/
>   * gomp-constants.h (GOMP_MAP_ALWAYS_TO_P,
>   GOMP_MAP_ALWAYS_FROM_P): Define.
> libgomp/
>   * libgomp.h (struct target_var_desc): New.
>   (struct target_mem_desc): Replace array of splay_tree_key with array of
>   target_var_desc.
>   (struct splay_tree_key_s): Move copy_from to target_var_desc.
>   * oacc-mem.c (gomp_acc_remove_pointer): Use copy_from from
>   target_var_desc.
>   * oacc-parallel.c (GOACC_parallel): Use copy_from from target_var_desc.
>   * target.c (gomp_map_vars_existing): Copy data to device if map-type is
>   'always to' or 'always tofrom'.
>   (gomp_map_vars): Use key from target_var_desc.  Set copy_from and
>   always_copy_from.
>   (gomp_copy_from_async): Use key and copy_from from target_var_desc.
>   (gomp_unmap_vars): Copy data from device if always_copy_from is set.
>   (gomp_offload_image_to_device): Do not use copy_from.
>   * testsuite/libgomp.c/target-11.c: New test.

> +  /* Set dd on target to 0 for the further check.  */
> +  #pragma omp target map(always to: dd)
> + { dd; }

This reminds me that:
  if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN))
remove = true;
in gimplify.c is not what we want: if it has GOMP_MAP_KIND_ALWAYS,
then we shouldn't remove it even when it is not mentioned inside of the
region's body, because it then has side-effects.

Jakub


RFA: FT32: Fix building gcc.

2015-06-23 Thread Nicholas Clifton

Hi Guys,

  It seems that the FT32 port of GCC does not have a maintainer at the
  moment.  Nevertheless I have a patch to fix a couple of build time
  problems compiling gcc for the FT32.  Is this OK to apply ?

Cheers
  Nick

gcc/ChangeLog
2015-06-23  Nick Clifton  

* config/ft32/ft32.c: Include emit-rtl.h for the definition of
crtl.
(ft32_print_operand): Cast the result of INTVAL in order to make
sure that the correct value is printed.
* config/ft32/ft32.h (STACK_GROWS_DOWNWARD): Define to an
integer.

Index: gcc/config/ft32/ft32.c
===
--- gcc/config/ft32/ft32.c  (revision 224834)
+++ gcc/config/ft32/ft32.c  (working copy)
@@ -59,8 +59,8 @@
 #include "basic-block.h"
 #include "df.h"
 #include "builtins.h"
+#include "emit-rtl.h"

-
 #include 

 #define LOSE_AND_RETURN(msgid, x)   \
@@ -199,7 +199,7 @@
   return;

 case 'm':
-  fprintf (file, "%d", -INTVAL(x));
+  fprintf (file, "%ld", (long) (- INTVAL(x)));
   return;

 case 'd':   // a DW spec, from an integer alignment (for BLKmode insns)

Index: gcc/config/ft32/ft32.h
===
--- gcc/config/ft32/ft32.h  (revision 224834)
+++ gcc/config/ft32/ft32.h  (working copy)
@@ -248,7 +248,7 @@

 /* Define this macro if pushing a word onto the stack moves the stack
pointer to a smaller address.  */
-#define STACK_GROWS_DOWNWARD
+#define STACK_GROWS_DOWNWARD 1

 #define INITIAL_FRAME_POINTER_OFFSET(DEPTH) (DEPTH) = 0
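
For context on the 'm' operand change: INTVAL yields a host-wide integer, so
passing it to a "%d" conversion mismatches the argument type on hosts where
that type is wider than int.  The sketch below shows the same cast-then-print
idiom in isolation.  The typedef wide_int_t and the function format_negated
are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the compiler's host-wide integer type.  */
typedef long long wide_int_t;

/* Print the negated value with a conversion that matches the argument
   type: cast to long and use "%ld", as the patch does.  Returns the
   snprintf result.  */
int
format_negated (char *buf, size_t size, wide_int_t value)
{
  return snprintf (buf, size, "%ld", (long) (-value));
}
```

The cast narrows the value on hosts where long is 32 bits, which is fine for
the small constants printed here but would not be a general solution.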



[PATCH 1/2] Add mask to specify which LEON3 targets support CASA

2015-06-23 Thread Daniel Cederman
Not all LEON3 targets support the CASA instruction. This patch provides a mask
that can be used to specify which LEON3 targets support CASA.

gcc/ChangeLog:

2015-06-22  Daniel Cederman  

* config/sparc/sparc.c (sparc_option_override): Mark CPU targets
  leon3 and leon3v7 as supporting the CASA instruction
* config/sparc/sparc.opt: Add mask specifying that the LEON3
  supports the CASA instruction (MASK_LEON_CASA)
* config/sparc/sync.md: Only generate CASA for V9 and targets
  with the MASK_LEON_CASA mask
---
 gcc/config/sparc/sparc.c   | 4 ++--
 gcc/config/sparc/sparc.opt | 3 +++
 gcc/config/sparc/sync.md   | 6 +++---
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 995a769..205e3cb 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -1280,8 +1280,8 @@ sparc_option_override (void)
 { "supersparc",MASK_ISA, MASK_V8 },
 { "hypersparc",MASK_ISA, MASK_V8|MASK_FPU },
 { "leon",  MASK_ISA, MASK_V8|MASK_LEON|MASK_FPU },
-{ "leon3", MASK_ISA, MASK_V8|MASK_LEON3|MASK_FPU },
-{ "leon3v7",   MASK_ISA, MASK_LEON3|MASK_FPU },
+{ "leon3", MASK_ISA, MASK_V8|MASK_LEON3|MASK_FPU|MASK_LEON_CASA },
+{ "leon3v7",   MASK_ISA, MASK_LEON3|MASK_FPU|MASK_LEON_CASA },
 { "sparclite", MASK_ISA, MASK_SPARCLITE },
 /* The Fujitsu MB86930 is the original sparclite chip, with no FPU.  */
 { "f930",  MASK_ISA|MASK_FPU, MASK_SPARCLITE },
diff --git a/gcc/config/sparc/sparc.opt b/gcc/config/sparc/sparc.opt
index 5c7f546..e6caa95 100644
--- a/gcc/config/sparc/sparc.opt
+++ b/gcc/config/sparc/sparc.opt
@@ -228,6 +228,9 @@ Mask(LEON)
 Mask(LEON3)
 ;; Generate code for LEON3
 
+Mask(LEON_CASA)
+;; Generate CAS instruction for LEON
+
 Mask(SPARCLITE)
 ;; Generate code for SPARClite
 
diff --git a/gcc/config/sparc/sync.md b/gcc/config/sparc/sync.md
index 2fabff5..8e1baee 100644
--- a/gcc/config/sparc/sync.md
+++ b/gcc/config/sparc/sync.md
@@ -181,7 +181,7 @@
(match_operand:SI 5 "const_int_operand" "") ;; is_weak
(match_operand:SI 6 "const_int_operand" "") ;; mod_s
(match_operand:SI 7 "const_int_operand" "")];; mod_f
-  "(TARGET_V9 || TARGET_LEON3)
+  "(TARGET_V9 || TARGET_LEON_CASA)
&& (mode != DImode || TARGET_ARCH64 || TARGET_V8PLUS)"
 {
   sparc_expand_compare_and_swap (operands);
@@ -197,7 +197,7 @@
 [(match_operand:I48MODE 2 "register_operand" "")
  (match_operand:I48MODE 3 "register_operand" "")]
 UNSPECV_CAS))])]
-  "TARGET_V9 || TARGET_LEON3"
+  "TARGET_V9 || TARGET_LEON_CASA"
   "")
 
 (define_insn "*atomic_compare_and_swap_1"
@@ -220,7 +220,7 @@
  [(match_operand:SI 2 "register_operand" "r")
   (match_operand:SI 3 "register_operand" "0")]
  UNSPECV_CAS))]
-  "TARGET_LEON3"
+  "TARGET_LEON_CASA"
 {
   if (TARGET_SV_MODE)
 return "casa\t%1 0xb, %2, %0"; /* ASI for supervisor data space.  */
-- 
2.4.3
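
A stripped-down model of the mechanism may help when reviewing: each CPU name
maps to a set of feature bits, and the CASA patterns are enabled only if the
CASA bit is set (or the target is V9).  This is an illustrative sketch rather
than the sparc.c code.  The FEAT_* names, the table layout and the
"example-no-casa" entry are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits, mirroring the MASK_* flags in spirit only.  */
enum
{
  FEAT_V8 = 1 << 0,
  FEAT_LEON3 = 1 << 1,
  FEAT_FPU = 1 << 2,
  FEAT_LEON_CASA = 1 << 3
};

struct cpu_entry
{
  const char *name;
  unsigned enable;
};

static const struct cpu_entry cpu_table[] = {
  { "leon3", FEAT_V8 | FEAT_LEON3 | FEAT_FPU | FEAT_LEON_CASA },
  { "leon3v7", FEAT_LEON3 | FEAT_FPU | FEAT_LEON_CASA },
  /* Hypothetical LEON3 variant without CASA, for illustration only.  */
  { "example-no-casa", FEAT_V8 | FEAT_LEON3 | FEAT_FPU },
};

/* Look up the feature bits for a CPU name; 0 if the name is unknown.  */
unsigned
cpu_features (const char *name)
{
  for (size_t i = 0; i < sizeof cpu_table / sizeof cpu_table[0]; i++)
    if (strcmp (cpu_table[i].name, name) == 0)
      return cpu_table[i].enable;
  return 0;
}

/* Mirrors the insn condition "TARGET_V9 || TARGET_LEON_CASA".  */
int
can_emit_casa (unsigned features, int is_v9)
{
  return is_v9 || (features & FEAT_LEON_CASA) != 0;
}
```

Separating the CASA bit from the LEON3 bit is what lets a later CPU entry
share the LEON3 scheduling and code generation while leaving the atomic
patterns disabled.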



[PATCH] Make muser-mode the default for LEON3

2015-06-23 Thread Daniel Cederman
The muser-mode flag causes the CASA instruction for LEON3 to use the
user mode ASI. This is the correct behavior for almost all LEON3 targets.
For this reason it makes sense to make user mode the default. This patch
adds a flag for supervisor mode that can be used on the very few LEON3 targets
that require CASA to use the supervisor ASI.

gcc/ChangeLog:

2015-06-22  Daniel Cederman  

* config/sparc/sparc.opt: Add supervisor mode flag (-msv-mode) and
  make user mode the default
* config/sparc/sync.md: Only use supervisor ASI for CASA when in
  supervisor mode
* doc/invoke.texi: Document msv-mode flag
---
 gcc/config/sparc/sparc.opt |  8 ++--
 gcc/config/sparc/sync.md   |  6 +++---
 gcc/doc/invoke.texi| 13 -
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/gcc/config/sparc/sparc.opt b/gcc/config/sparc/sparc.opt
index 93d24a6..5c7f546 100644
--- a/gcc/config/sparc/sparc.opt
+++ b/gcc/config/sparc/sparc.opt
@@ -113,9 +113,13 @@ mrelax
 Target
 Optimize tail call instructions in assembler and linker
 
+msv-mode
+Target RejectNegative Report Mask(SV_MODE)
+Generate code that can only run in supervisor mode
+
 muser-mode
-Target Report Mask(USER_MODE)
-Do not generate code that can only run in supervisor mode
+Target RejectNegative Report InverseMask(SV_MODE)
+Do not generate code that can only run in supervisor mode (default)
 
 mcpu=
 Target RejectNegative Joined Var(sparc_cpu_and_features) Enum(sparc_processor_type) Init(PROCESSOR_V7)
diff --git a/gcc/config/sparc/sync.md b/gcc/config/sparc/sync.md
index 7d00b10..2fabff5 100644
--- a/gcc/config/sparc/sync.md
+++ b/gcc/config/sparc/sync.md
@@ -222,10 +222,10 @@
  UNSPECV_CAS))]
   "TARGET_LEON3"
 {
-  if (TARGET_USER_MODE)
-return "casa\t%1 0xa, %2, %0"; /* ASI for user data space.  */
-  else
+  if (TARGET_SV_MODE)
 return "casa\t%1 0xb, %2, %0"; /* ASI for supervisor data space.  */
+  else
+return "casa\t%1 0xa, %2, %0"; /* ASI for user data space.  */
 }
   [(set_attr "type" "multi")])
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b99ab1c..211e8e9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1008,7 +1008,7 @@ See RS/6000 and PowerPC Options.
 -mhard-quad-float  -msoft-quad-float @gol
 -mstack-bias  -mno-stack-bias @gol
 -munaligned-doubles  -mno-unaligned-doubles @gol
--muser-mode  -mno-user-mode @gol
+-muser-mode  -msv-mode @gol
 -mv8plus  -mno-v8plus  -mvis  -mno-vis @gol
 -mvis2  -mno-vis2  -mvis3  -mno-vis3 @gol
 -mcbcond -mno-cbcond @gol
@@ -21300,13 +21300,16 @@ Specifying this option avoids some rare compatibility problems with code
 generated by other compilers.  It is not the default because it results
 in a performance loss, especially for floating-point code.
 
+@item -msv-mode
+@opindex msv-mode
+Generate code that can only run in supervisor mode.  This is relevant
+only for the @code{casa} instruction emitted for the LEON3 processor.
+
 @item -muser-mode
-@itemx -mno-user-mode
 @opindex muser-mode
-@opindex mno-user-mode
 Do not generate code that can only run in supervisor mode.  This is relevant
-only for the @code{casa} instruction emitted for the LEON3 processor.  The
-default is @option{-mno-user-mode}.
+only for the @code{casa} instruction emitted for the LEON3 processor.  This
+is the default.
 
 @item -mno-faster-structs
 @itemx -mfaster-structs
-- 
2.4.3
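
The effect of the sync.md hunk can be summarised as a single selection
between two output templates.  The sketch below is only an illustration of
that selection; the function name casa_template and its boolean parameter are
hypothetical, while the two template strings are the ones from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Pick the casa output template.  With the patch, user mode (ASI 0xa) is
   the default and supervisor mode (ASI 0xb) must be requested.  */
const char *
casa_template (bool sv_mode)
{
  if (sv_mode)
    return "casa\t%1 0xb, %2, %0";	/* ASI for supervisor data space.  */
  else
    return "casa\t%1 0xa, %2, %0";	/* ASI for user data space.  */
}
```

Since -msv-mode sets the mask and -muser-mode is declared with InverseMask on
the same bit, leaving both options off gives the user-mode template.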



[PATCH 2/2] Add leon3r0 and leon3r0v7 CPU targets

2015-06-23 Thread Daniel Cederman
Early variants of LEON3, revision 0, do not support the CASA instruction.
This patch adds two new targets, leon3r0 and leon3r0v7, that are equivalent
to leon3 and leon3v7, except that they do not support CASA.

gcc/ChangeLog:

2015-06-22  Daniel Cederman  

* config.gcc: Add leon3r0[v7] targets
* config/sparc/leon.md: Add leon3r0[v7] to FPU timing
* config/sparc/sparc-opts.h (enum processor_type): Add leon3r0[v7] targets
* config/sparc/sparc.c (sparc_option_override): Add leon3r0[v7] as targets
  without CASA support
* config/sparc/sparc.h: Add leon3r0[v7] targets
* config/sparc/sparc.md: Add leon3r0[v7] targets
* config/sparc/sparc.opt: Add leon3r0[v7] targets
* doc/invoke.texi: Add leon3r0[v7] targets
---
 gcc/config.gcc|  6 ++
 gcc/config/sparc/leon.md  | 14 +++---
 gcc/config/sparc/sparc-opts.h |  2 ++
 gcc/config/sparc/sparc.c  |  4 
 gcc/config/sparc/sparc.h  | 44 ---
 gcc/config/sparc/sparc.md |  2 ++
 gcc/config/sparc/sparc.opt|  6 ++
 gcc/doc/invoke.texi   | 22 +++---
 8 files changed, 59 insertions(+), 41 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 805638d..b10a1c9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3322,10 +3322,7 @@ if test x$with_cpu = x ; then
  with_cpu=leon
  ;;
*-leon[3-9]*)
- with_cpu=leon3
- ;;
-   *-leon[3-9]v7*)
- with_cpu=leon3v7
+ with_cpu="`echo ${target} | sed 's/.*-\(leon[a-z0-9]*\).*$/\1/'`"
  ;;
*)
  with_cpu="`echo ${target} | sed 's/-.*$//'`"
@@ -4198,6 +4195,7 @@ case "${target}" in
"" | sparc | sparcv9 | sparc64 \
| v7 | cypress \
| v8 | supersparc | hypersparc | leon | leon3 | leon3v7 \
+   | leon3r0 | leon3r0v7 \
| sparclite | f930 | f934 | sparclite86x \
| sparclet | tsc701 \
| v9 | ultrasparc | ultrasparc3 | niagara | niagara2 \
diff --git a/gcc/config/sparc/leon.md b/gcc/config/sparc/leon.md
index aca92fc..3441a74 100644
--- a/gcc/config/sparc/leon.md
+++ b/gcc/config/sparc/leon.md
@@ -29,11 +29,11 @@
 
 ;; Use a double reservation to work around the load pipeline hazard on UT699.
 (define_insn_reservation "leon3_load" 1
-  (and (eq_attr "cpu" "leon3,leon3v7") (eq_attr "type" "load,sload"))
+  (and (eq_attr "cpu" "leon3,leon3v7,leon3r0,leon3r0v7") (eq_attr "type" "load,sload"))
   "leon_memory*2")
 
 (define_insn_reservation "leon_store" 2
-  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "store"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7,leon3r0,leon3r0v7") (eq_attr "type" "store"))
   "leon_memory*2")
 
 ;; This describes Gaisler Research's FPU
@@ -44,21 +44,21 @@
 (define_cpu_unit "grfpu_ds" "grfpu")
 
 (define_insn_reservation "leon_fp_alu" 4
-  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fp,fpcmp,fpmul"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7,leon3r0,leon3r0v7") (eq_attr "type" "fp,fpcmp,fpmul"))
   "grfpu_alu, nothing*3")
 
 (define_insn_reservation "leon_fp_divs" 16
-  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fpdivs"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7,leon3r0,leon3r0v7") (eq_attr "type" "fpdivs"))
   "grfpu_ds*14, nothing*2")
 
 (define_insn_reservation "leon_fp_divd" 17
-  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fpdivd"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7,leon3r0,leon3r0v7") (eq_attr "type" "fpdivd"))
   "grfpu_ds*15, nothing*2")
 
 (define_insn_reservation "leon_fp_sqrts" 24
-  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fpsqrts"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7,leon3r0,leon3r0v7") (eq_attr "type" "fpsqrts"))
   "grfpu_ds*22, nothing*2")
 
 (define_insn_reservation "leon_fp_sqrtd" 25
-  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fpsqrtd"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7,leon3r0,leon3r0v7") (eq_attr "type" "fpsqrtd"))
   "grfpu_ds*23, nothing*2")
diff --git a/gcc/config/sparc/sparc-opts.h b/gcc/config/sparc/sparc-opts.h
index 7679d0d..24a2b64 100644
--- a/gcc/config/sparc/sparc-opts.h
+++ b/gcc/config/sparc/sparc-opts.h
@@ -30,6 +30,8 @@ enum processor_type {
   PROCESSOR_SUPERSPARC,
   PROCESSOR_HYPERSPARC,
   PROCESSOR_LEON,
+  PROCESSOR_LEON3R0,
+  PROCESSOR_LEON3R0V7,
   PROCESSOR_LEON3,
   PROCESSOR_LEON3V7,
   PROCESSOR_SPARCLITE,
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 205e3cb..862e88d 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -1280,6 +1280,8 @@ sparc_option_override (void)
 { "supersparc",MASK_ISA, MASK_V8 },
 { "hypersparc",MASK_ISA, MASK_V8|MASK_FPU },
 { "leon",  MASK_ISA, MASK_V8|MASK_LEON|MASK_FPU },
+{ "leon3r0",   MASK_ISA, MASK_V8

[PATCH] Use leon3 target for native LEON on Linux

2015-06-23 Thread Daniel Cederman
Linux requires LEON version 3 or above with CASA support.

gcc/ChangeLog:

2015-06-23  Daniel Cederman  

* config/sparc/driver-sparc.c: map /proc/cpuinfo with CPU LEON
  to leon3
---
 gcc/config/sparc/driver-sparc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/sparc/driver-sparc.c b/gcc/config/sparc/driver-sparc.c
index 778de2c..5969735 100644
--- a/gcc/config/sparc/driver-sparc.c
+++ b/gcc/config/sparc/driver-sparc.c
@@ -73,6 +73,7 @@ static const struct cpu_names {
   { "UltraSparc T2",   "niagara2" },
   { "UltraSparc T3",   "niagara3" },
   { "UltraSparc T4",   "niagara4" },
+  { "LEON","leon3" },
 #endif
   { NULL,  NULL }
   };
-- 
2.4.3



Re: [PATCH][ARM] PR/65711: Don't pass '-dynamic-linker' when '-shared' is used

2015-06-23 Thread Ludovic Courtès
Jakub Jelinek  skribis:

> On Tue, Jun 23, 2015 at 11:30:25AM +0200, Ludovic Courtès wrote:
>> Ramana Radhakrishnan  skribis:
>> 
>> > On Thu, Apr 23, 2015 at 9:29 AM, Ludovic Courtès  wrote:
>> >> As discussed at .
>> >>
>> >> Patch is for both 4.8 and 4.9 (possibly 5.1 too, I haven’t checked.)
>> >>
>> >
>> > OK for trunk. This is also ok for all release branches if no
>> > objections in 24 hours.
>> 
>> [...]
>> 
>> >> gcc/
>> >> 2015-04-23  Ludovic Courtès  
>> >>
>> >> PR 65711
>> >> * config/arm/linux-elf.h (LINUX_TARGET_LINK_SPEC): Move
>> >> '-dynamic-linker' within %{!shared: ...}.
>> 
>> Committed to gcc-4_8-branch, gcc-4_9-branch, gcc-5-branch, and trunk.
>> Please let me know if there’s anything I missed.
>
> See richi's mail.  4.8 branch has already been closed, and 4.9 branch
> is frozen, so you should have asked for RM permission.

Noted.  That part of the process was not clear to me, apologies.

> Also, in the ChangeLog entries, one should write it in the form
> PR component/bugno,
> so
>   PR target/65711
> in your case.

OK.

Thanks,
Ludo’.


Re: [PATCH] Make muser-mode the default for LEON3

2015-06-23 Thread Sebastian Huber
Instead of introducing a new option which may conflict with an existing 
one, is it not possible to simply use -mno-user-mode?


On 23/06/15 14:22, Daniel Cederman wrote:

The muser-mode flag causes the CASA instruction for LEON3 to use the
user mode ASI. This is the correct behavior for almost all LEON3 targets.
For this reason it makes sense to make user mode the default. This patch
adds a flag for supervisor mode that can be used on the very few LEON3 targets
that require CASA to use the supervisor ASI.

gcc/ChangeLog:

2015-06-22  Daniel Cederman  

* config/sparc/sparc.opt: Add supervisor mode flag (-msv-mode) and
  make user mode the default
* config/sparc/sync.md: Only use supervisor ASI for CASA when in
  supervisor mode
* doc/invoke.texi: Document msv-mode flag


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.



Re: [PATCH] Make muser-mode the default for LEON3

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 02:22:34PM +0200, Daniel Cederman wrote:
> The muser-mode flag causes the CASA instruction for LEON3 to use the
> user mode ASI. This is the correct behavior for almost all LEON3 targets.
> For this reason it makes sense to make user mode the default. This patch
> adds a flag for supervisor mode that can be used on the very few LEON3 targets
> that requires CASA to use the supervisor ASI.

Why are you adding a new option and without deprecation removing a
previously accepted (at least since 4.8) option?
For just changing the default, you really don't need to add a new option
or remove -mno-user-mode, just change the default, which can be done
e.g. by checking if the bit has been explicitly set and if not, use the
desired default, or if you want to change the Mask() name, just
make it InverseMask, but keep the options as they are.

Jakub


[gomp4] Additional testing for deviceptr clause.

2015-06-23 Thread James Norris

Hi!

The following patch adds additional testing of the deviceptr
clause.

Patch applied to gomp-4_0-branch.

Thanks!
Jim
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c
index e271a37..e62c315 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c
@@ -28,5 +28,26 @@ int main (void)
 abort ();
 #endif
 
+  a_1 = a_2 = 0;
+
+#pragma acc data deviceptr (a)
+#pragma acc parallel copyout (a_1, a_2)
+  {
+a_1 = a;
+a_2 = &a;
+  }
+
+  if (a != A)
+abort ();
+  if (a_1 != a)
+abort ();
+#if ACC_MEM_SHARED
+  if (a_2 != &a)
+abort ();
+#else
+  if (a_2 == &a)
+abort ();
+#endif
+
   return 0;
 }


Re: [PATCH] Make muser-mode the default for LEON3

2015-06-23 Thread Daniel Cederman



On 2015-06-23 14:34, Jakub Jelinek wrote:

On Tue, Jun 23, 2015 at 02:22:34PM +0200, Daniel Cederman wrote:

The muser-mode flag causes the CASA instruction for LEON3 to use the
user mode ASI. This is the correct behavior for almost all LEON3 targets.
For this reason it makes sense to make user mode the default. This patch
adds a flag for supervisor mode that can be used on the very few LEON3 targets
that require CASA to use the supervisor ASI.


Why are you adding a new option and without deprecation removing a
previously accepted (at least since 4.8) option?
For just changing the default, you really don't need to add a new option
or remove -mno-user-mode, just change the default, which can be done
e.g. by checking if the bit has been explicitly set and if not, use the
desired default, or if you want to change the Mask() name, just
make it InverseMask, but keep the options as they are.

Jakub



How does one check if the bit has been explicitly set? It was not 
obvious to me, which is why I took a similar approach to a patch I found 
for another CPU target. If it is possible to change the default without 
adding another flag then that is obviously better and I will update my 
patch.


Best regards,
Daniel Cederman



Re: [gomp4] Generate sequential loop for OpenACC loop directive inside kernels

2015-06-23 Thread Chung-Lin Tang
On 2015/6/16 05:05 PM, Tom de Vries wrote:
> On 16/06/15 10:59, Chung-Lin Tang wrote:
>> This patch adjusts omp-low.c:expand_omp_for_generic() to expand to a 
>> "sequential"
>> loop form (without the OMP runtime calls), used for loop directives inside
>> OpenACC kernels constructs. Tom mentions that this allows the kernels 
>> parallelization
>> to work when '#pragma acc loop' makes the front-ends create OMP_FOR, which 
>> the
>> loop analysis phases don't understand.
>>
>> Tested and committed to gomp-4_0-branch.
>>
> 
> Hi Chung-Lin,
> 
> can you commit a test-case to exercise the code?
> 
> Thanks,
> - Tom

Just committed the attached testcase patch to gomp-4_0-branch.

Chung-Lin

2015-06-23  Chung-Lin Tang  

gcc/testsuite/
* c-c++-common/goacc/kernels-loop.c (ACC_LOOP): Add #ifndef/#define.
(main): Tag loops inside kernels construct with '#pragma ACC_LOOP'.
* c-c++-common/goacc/kernels-loop-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-3.c: Likewise.
* c-c++-common/goacc/kernels-loop-n.c: Likewise.
* c-c++-common/goacc/kernels-loop-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-2-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-3-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-n-acc-loop.c: New test.

Index: gcc/testsuite/c-c++-common/goacc/kernels-loop-3-acc-loop.c
===
--- gcc/testsuite/c-c++-common/goacc/kernels-loop-3-acc-loop.c	(revision 0)
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop-3-acc-loop.c	(revision 0)
@@ -0,0 +1,20 @@
+/* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+/* { dg-additional-options "-fdump-tree-parloops_oacc_kernels-all" } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+/* Check that loops with '#pragma acc loop' tagged gets properly parallelized.  */
+#define ACC_LOOP acc loop
+#include "kernels-loop-3.c"
+
+/* Check that only one loop is analyzed, and that it can be parallelized.  */
+/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops_oacc_kernels" } } */
+/* { dg-final { scan-tree-dump-not "FAILED:" "parloops_oacc_kernels" } } */
+
+/* Check that the loop has been split off into a function.  */
+/* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
+
+/* { dg-final { scan-tree-dump-times "(?n)pragma omp target oacc_parallel.*num_gangs\\(32\\)" 1 "parloops_oacc_kernels" } } */
+
+/* { dg-final { cleanup-tree-dump "parloops_oacc_kernels" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
===
--- gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c	(revision 224836)
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c	(working copy)
@@ -8,6 +8,10 @@
 #define N (1024 * 512)
 #define COUNTERTYPE unsigned int
 
+#ifndef ACC_LOOP
+#define ACC_LOOP
+#endif
+
 int
 main (void)
 {
@@ -21,18 +25,21 @@ main (void)
 
 #pragma acc kernels copyout (a[0:N])
   {
+#pragma ACC_LOOP
 for (COUNTERTYPE i = 0; i < N; i++)
   a[i] = i * 2;
   }
 
 #pragma acc kernels copyout (b[0:N])
   {
+#pragma ACC_LOOP
 for (COUNTERTYPE i = 0; i < N; i++)
   b[i] = i * 4;
   }
 
 #pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
   {
+#pragma ACC_LOOP
 for (COUNTERTYPE ii = 0; ii < N; ii++)
   c[ii] = a[ii] + b[ii];
   }
Index: gcc/testsuite/c-c++-common/goacc/kernels-loop.c
===
--- gcc/testsuite/c-c++-common/goacc/kernels-loop.c	(revision 224836)
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop.c	(working copy)
@@ -8,6 +8,10 @@
 #define N (1024 * 512)
 #define COUNTERTYPE unsigned int
 
+#ifndef ACC_LOOP
+#define ACC_LOOP
+#endif
+
 int
 main (void)
 {
@@ -27,6 +31,7 @@ main (void)
 
 #pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
   {
+#pragma ACC_LOOP
 for (COUNTERTYPE ii = 0; ii < N; ii++)
   c[ii] = a[ii] + b[ii];
   }
Index: gcc/testsuite/c-c++-common/goacc/kernels-loop-2-acc-loop.c
===
--- gcc/testsuite/c-c++-common/goacc/kernels-loop-2-acc-loop.c	(revision 0)
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop-2-acc-loop.c	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+/* { dg-additional-options "-fdump-tree-parloops_oacc_kernels-all" } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+/* Check that loops with '#pragma acc loop' tagged gets properly parallelized.  */
+#define ACC_LOOP acc loop
+#include "kernels-loop-2.c"
+
+/* Check that only three loops are analyzed, and that all can be
+   parallelized.  */
+/* { dg-final { scan-tree-dump-times "SUCCESS: may be paralleliz

RFA: Add support for -fstack-usage to various ports

2015-06-23 Thread Nick Clifton
Hi Guys,

  The patch below adds support for the -fstack-usage option to the BFIN,
  FT32, H8300, IQ2000 and M32C ports.  It also adjusts the expected
  output in the gcc.dg/stack-usage-1.c test for the V850 and MN10300 to
  match the actual results generated by these toolchains.
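
  For reference, the H8300 hunk below computes the reported size as rounded
  locals plus saved registers plus an optional frame pointer slot.  A
  standalone sketch of that arithmetic follows.  The UNITS_PER_WORD value and
  this simplified round_frame_size are assumptions made for illustration,
  static_stack_size is a hypothetical name, and __builtin_popcount is the
  GCC/Clang builtin the patch itself uses:

```c
#include <stdbool.h>

/* Assumed word size in bytes, for illustration only.  */
#define UNITS_PER_WORD 2

/* Simplified stand-in: round SIZE up to a multiple of the word size.  */
static unsigned int
round_frame_size (unsigned int size)
{
  return (size + UNITS_PER_WORD - 1) & ~(unsigned int) (UNITS_PER_WORD - 1);
}

/* Hypothetical helper mirroring the expression in the H8300 hunk:
   locals + one word per saved register + one word for the frame
   pointer when it is needed.  SAVED_REGS is a bit mask of registers.  */
static unsigned int
static_stack_size (unsigned int locals, unsigned int saved_regs,
		   bool frame_pointer_needed)
{
  return round_frame_size (locals)
	 + (unsigned int) __builtin_popcount (saved_regs) * UNITS_PER_WORD
	 + (frame_pointer_needed ? UNITS_PER_WORD : 0);
}
```

  With these assumed values, 5 bytes of locals, three saved registers and a
  frame pointer give 6 + 6 + 2 = 14 bytes.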

  Tested with no regressions on bfin-elf, ft32-elf, h8300-elf,
  iq2000-elf, m32c-elf, mn10300-elf and v850-elf toolchains.

  OK to apply ?

Cheers
  Nick

gcc/ChangeLog
2015-06-23  Nick Clifton  

* config/bfin/bfin.c (bfin_expand_prologue): Set
current_function_static_stack_size if flag_stack_usage_info is
set. 
* config/ft32/ft32.c (ft32_expand_prologue): Likewise.
* config/h8300/h8300.c (h8300_expand_prologue): Likewise.
* config/iq2000/iq2000.c (iq2000_expand_prologue): Likewise.
* config/m32c/m32c.c (m32c_emit_prologue): Likewise.

gcc/testsuite/ChangeLog
2015-06-23  Nick Clifton  

* gcc.dg/stack-usage-1.c: Add SIZE values for V850, MN10300,
H8300 and M32R targets.

Index: gcc/config/bfin/bfin.c
===
--- gcc/config/bfin/bfin.c  (revision 224834)
+++ gcc/config/bfin/bfin.c  (working copy)
@@ -1090,6 +1090,9 @@
   tree attrs = TYPE_ATTRIBUTES (TREE_TYPE (current_function_decl));
   bool all = lookup_attribute ("saveall", attrs) != NULL_TREE;
 
+  if (flag_stack_usage_info)
+current_function_static_stack_size = frame_size;
+
   if (fkind != SUBROUTINE)
 {
   expand_interrupt_handler_prologue (spreg, fkind, all);
Index: gcc/config/ft32/ft32.c
===
--- gcc/config/ft32/ft32.c  (revision 224834)
+++ gcc/config/ft32/ft32.c  (working copy)
@@ -456,6 +456,9 @@
 
   ft32_compute_frame ();
 
+  if (flag_stack_usage_info)
+current_function_static_stack_size = cfun->machine->size_for_adjusting_sp;
+
   if (!must_link () && (cfun->machine->callee_saved_reg_size == 4))
 {
   insn =
Index: gcc/config/h8300/h8300.c
===
--- gcc/config/h8300/h8300.c(revision 224834)
+++ gcc/config/h8300/h8300.c(working copy)
@@ -896,6 +896,12 @@
 
   /* Leave room for locals.  */
   h8300_emit_stack_adjustment (-1, round_frame_size (get_frame_size ()), true);
+
+  if (flag_stack_usage_info)
+current_function_static_stack_size
+  = round_frame_size (get_frame_size ())
+  + (__builtin_popcount (saved_regs) * UNITS_PER_WORD)
+  + (frame_pointer_needed ? UNITS_PER_WORD : 0);
 }
 
 /* Return nonzero if we can use "rts" for the function currently being
Index: gcc/config/iq2000/iq2000.c
===
--- gcc/config/iq2000/iq2000.c  (revision 224834)
+++ gcc/config/iq2000/iq2000.c  (working copy)
@@ -2072,6 +2072,9 @@
}
 }
 
+  if (flag_stack_usage_info)
+current_function_static_stack_size = cfun->machine->total_size;
+
   emit_insn (gen_blockage ());
 }
 
Index: gcc/config/m32c/m32c.c
===
--- gcc/config/m32c/m32c.c  (revision 224834)
+++ gcc/config/m32c/m32c.c  (working copy)
@@ -4123,6 +4123,9 @@
   && !m32c_function_needs_enter ())
 cfun->machine->use_rts = 1;
 
+  if (flag_stack_usage_info)
+current_function_static_stack_size = frame_size;
+  
   if (frame_size > 254)
 {
   extra_frame_size = frame_size - 254;
Index: gcc/config/m32r/m32r.c
===
--- gcc/config/m32r/m32r.c  (revision 224834)
+++ gcc/config/m32r/m32r.c  (working copy)
@@ -1665,6 +1665,9 @@
   if (! current_frame_info.initialized)
 m32r_compute_frame_size (get_frame_size ());
 
+  if (flag_stack_usage_info)
+current_function_static_stack_size = current_frame_info.total_size;
+
   gmask = current_frame_info.gmask;
 
   /* These cases shouldn't happen.  Catch them now.  */
Index: gcc/testsuite/gcc.dg/stack-usage-1.c
===
--- gcc/testsuite/gcc.dg/stack-usage-1.c(revision 224834)
+++ gcc/testsuite/gcc.dg/stack-usage-1.c(working copy)
@@ -81,6 +81,14 @@
 #  define SIZE 254
 #elif defined (__nios2__)
 #  define SIZE 252
+#elif defined (__v850__)
+#define SIZE 260
+#elif defined (__mn10300__)
+#define SIZE 252
+#elif defined (__H8300SX__) || defined (__H8300S__) || defined (__H8300H__) || defined (__H8300__)
+#define SIZE 252
+#elif defined (__M32R__)
+#define SIZE 252
 #else
 #  define SIZE 256
 #endif


Re: [PATCH] Make muser-mode the default for LEON3

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 02:48:45PM +0200, Daniel Cederman wrote:
> How does one check if the bit has been explicitly set? It was not obvious to

if (TARGET_USER_MODE_P (target_flags_explicit))

> me, which is why I took a similar approach to a patch I found for another
> CPU target. If it is possible to change the default without adding another
> flag then that is obviously better and I will update my patch.

Or you can just change the default target_flags, supposedly with
TargetVariable
int target_flags = MASK_USER_MODE
in the opt file, there are really many possibilities.

Jakub
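
To make the "explicitly set" idea concrete, here is a minimal standalone sketch. It is not GCC code: struct options, set_user_mode, override_options and casa_asi are hypothetical names, the mask value is arbitrary, and the ASI numbers are assumed to follow the usual SPARC V8 user/supervisor data assignments. It only shows how a second bit mask recording what the user asked for lets the default change while both -muser-mode and -mno-user-mode keep working:

```c
#include <stdbool.h>

/* Hypothetical bit value for the user-mode option.  */
#define MASK_USER_MODE 0x1u

/* Assumed ASI numbers (user data / supervisor data), for illustration.  */
#define ASI_USER_DATA       0x0a
#define ASI_SUPERVISOR_DATA 0x0b

struct options
{
  unsigned int flags;           /* Current option bits.  */
  unsigned int flags_explicit;  /* Bits the user set on the command line.  */
};

/* Record -muser-mode (ON true) or -mno-user-mode (ON false).  */
static void
set_user_mode (struct options *opts, bool on)
{
  if (on)
    opts->flags |= MASK_USER_MODE;
  else
    opts->flags &= ~MASK_USER_MODE;
  opts->flags_explicit |= MASK_USER_MODE;
}

/* Apply the new default only when the user expressed no preference.  */
static void
override_options (struct options *opts)
{
  if (!(opts->flags_explicit & MASK_USER_MODE))
    opts->flags |= MASK_USER_MODE;
}

/* Pick the ASI that CASA would use under the final option set.  */
static int
casa_asi (const struct options *opts)
{
  return (opts->flags & MASK_USER_MODE) ? ASI_USER_DATA : ASI_SUPERVISOR_DATA;
}
```

With no option given, the sketch ends up in user mode; an explicit -mno-user-mode still selects the supervisor ASI, so no new option is required.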


[nvptx] add select

2015-06-23 Thread Nathan Sidwell
I've committed this PTX patch to add support for the selp instruction.  It's 
pretty much a direct implementation of 'r = a ? b : c'.  This is sufficient for 
combine(?) to generate selp instructions such as:


selp.u32 %r22, %r25, %r26, %r28;
selp.u32 %r22, %r25, 4, %r27;
selp.f32 %r22, %r25, %r26, %r28;
selp.f32 %r22, %r25, 0f40a0, %r27;
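
A small C model may help when reading the patterns in the patch below. It assumes the PTX semantics 'selp d, a, b, p' meaning d = p ? a : b; the function names simply echo the pattern names and int stands in for any of the supported modes. The point is that sel_false tests the predicate against zero with eq, so it emits the two value operands in swapped order:

```c
#include <stdbool.h>

/* Model of PTX 'selp d, a, b, p': the result is A if P is true, else B.  */
static int
selp (int a, int b, bool p)
{
  return p ? a : b;
}

/* sel_true: (if_then_else (ne p 0) op2 op3) is emitted as
   'selp %0, %2, %3, %1'.  */
static int
sel_true (bool p, int op2, int op3)
{
  return selp (op2, op3, p);
}

/* sel_false: (if_then_else (eq p 0) op2 op3) is emitted as
   'selp %0, %3, %2, %1', i.e. with the value operands swapped.  */
static int
sel_false (bool p, int op2, int op3)
{
  return selp (op3, op2, p);
}
```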

Approved by Bernd off list.
--
Nathan Sidwell - Director, Sourcery Services - Mentor Embedded
2015-06-22  Nathan Sidwell  

	* config/nvptx/nvptx.md (sel_true, sel_false): New
	conditional selects.
	(setcc_int, setcc_float): Reformat.

Index: config/nvptx/nvptx.md
===
--- config/nvptx/nvptx.md	(revision 224757)
+++ config/nvptx/nvptx.md	(working copy)
@@ -873,35 +873,71 @@
   ""
   "%.\\tselp%t0 %0,-1,0,%1;")
 
+(define_insn "sel_true"
+  [(set (match_operand:HSDIM 0 "nvptx_register_operand" "=R")
+(if_then_else:HSDIM
+	  (ne (match_operand:BI 1 "nvptx_register_operand" "R") (const_int 0))
+	  (match_operand:HSDIM 2 "nvptx_nonmemory_operand" "Ri")
+	  (match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")))]
+  ""
+  "%.\\tselp%t0\\t%0, %2, %3, %1;")
+
+(define_insn "sel_true"
+  [(set (match_operand:SDFM 0 "nvptx_register_operand" "=R")
+(if_then_else:SDFM
+	  (ne (match_operand:BI 1 "nvptx_register_operand" "R") (const_int 0))
+	  (match_operand:SDFM 2 "nvptx_nonmemory_operand" "RF")
+	  (match_operand:SDFM 3 "nvptx_nonmemory_operand" "RF")))]
+  ""
+  "%.\\tselp%t0\\t%0, %2, %3, %1;")
+
+(define_insn "sel_false"
+  [(set (match_operand:HSDIM 0 "nvptx_register_operand" "=R")
+(if_then_else:HSDIM
+	  (eq (match_operand:BI 1 "nvptx_register_operand" "R") (const_int 0))
+	  (match_operand:HSDIM 2 "nvptx_nonmemory_operand" "Ri")
+	  (match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")))]
+  ""
+  "%.\\tselp%t0\\t%0, %3, %2, %1;")
+
+(define_insn "sel_false"
+  [(set (match_operand:SDFM 0 "nvptx_register_operand" "=R")
+(if_then_else:SDFM
+	  (eq (match_operand:BI 1 "nvptx_register_operand" "R") (const_int 0))
+	  (match_operand:SDFM 2 "nvptx_nonmemory_operand" "RF")
+	  (match_operand:SDFM 3 "nvptx_nonmemory_operand" "RF")))]
+  ""
+  "%.\\tselp%t0\\t%0, %3, %2, %1;")
+
 (define_insn "setcc_int"
   [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
 	(match_operator:SI 1 "nvptx_comparison_operator"
-			   [(match_operand:HSDIM 2 "nvptx_register_operand" "R")
-			(match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")]))]
+	  [(match_operand:HSDIM 2 "nvptx_register_operand" "R")
+	   (match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")]))]
   ""
   "%.\\tset%t0%c1 %0,%2,%3;")
 
 (define_insn "setcc_int"
   [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
 	(match_operator:SI 1 "nvptx_float_comparison_operator"
-			   [(match_operand:SDFM 2 "nvptx_register_operand" "R")
-			(match_operand:SDFM 3 "nvptx_nonmemory_operand" "RF")]))]
+	   [(match_operand:SDFM 2 "nvptx_register_operand" "R")
+	(match_operand:SDFM 3 "nvptx_nonmemory_operand" "RF")]))]
   ""
   "%.\\tset%t0%c1 %0,%2,%3;")
 
 (define_insn "setcc_float"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
 	(match_operator:SF 1 "nvptx_comparison_operator"
-			   [(match_operand:HSDIM 2 "nvptx_register_operand" "R")
-			(match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")]))]
+	   [(match_operand:HSDIM 2 "nvptx_register_operand" "R")
+	(match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")]))]
   ""
   "%.\\tset%t0%c1 %0,%2,%3;")
 
 (define_insn "setcc_float"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
 	(match_operator:SF 1 "nvptx_float_comparison_operator"
-			   [(match_operand:SDFM 2 "nvptx_register_operand" "R")
-			(match_operand:SDFM 3 "nvptx_nonmemory_operand" "RF")]))]
+	   [(match_operand:SDFM 2 "nvptx_register_operand" "R")
+	(match_operand:SDFM 3 "nvptx_nonmemory_operand" "RF")]))]
   ""
   "%.\\tset%t0%c1 %0,%2,%3;")
 


Re: [AArch64] Implement -fpic for -mcmodel=small

2015-06-23 Thread Jiong Wang

Marcus Shawcroft writes:

> On 20 May 2015 at 11:21, Jiong Wang  wrote:
>
>> gcc/
>>   * config/aarch64/aarch64.md: (ldr_got_small_): Support new GOT 
>> relocation
>>   modifiers.
>>   (ldr_got_small_sidi): Ditto.
>>   * config/aarch64/iterators.md (got_modifier): New mode iterator.
>>   * config/aarch64/aarch64-otps.h (aarch64_code_model): New model.
>>   * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Support 
>> -fpic.
>>   (aarch64_rtx_costs): Add costs for new instruction sequences.
>>   (initialize_aarch64_code_model): Initialize new model.
>>   (aarch64_classify_symbol): Recognize new model.
>>   (aarch64_asm_preferred_eh_data_format): Support new model.
>>   (aarch64_load_symref_appropriately): Generate new instruction sequences 
>> for -fpic.
>>   (TARGET_USE_PSEUDO_PIC_REG): New definition.
>>   (aarch64_use_pseudo_pic_reg): New function.
>>
>> gcc/testsuite/
>>   * gcc.target/aarch64/pic-small.c: New testcase.
>
>
> Rather than thread tests against aarch64_cmodel throughout the
> existing code can we instead extend classify_symbol with a new symbol
> classification?

Yes, we can. As -fPIC/-fpic allow 4G/32K GOT table sizes respectively, we
could name the corresponding symbol classifications "SYMBOL_GOT_4G" and
"SYMBOL_GOT_32K".

But can we let this patch go in and create another patch to improve
this? There are several other TLS patches that may need a rebase if we
change this immediately.
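
To illustrate the classification idea discussed above, here is a rough sketch rather than the real aarch64 code. Only the two SYMBOL_GOT_* names come from this thread; SYMBOL_NON_PIC, classify_symbol and got_size_limit are hypothetical, and pic_level is assumed to be 1 for -fpic and 2 for -fPIC:

```c
/* Hypothetical classification; only the two GOT names come from the thread.  */
enum symbol_type
{
  SYMBOL_NON_PIC,  /* Placeholder for every non-GOT case.  */
  SYMBOL_GOT_32K,  /* -fpic: GOT limited to 32K.  */
  SYMBOL_GOT_4G    /* -fPIC: GOT limited to 4G.  */
};

/* Classify once, from the PIC level (assumed: 1 = -fpic, 2 = -fPIC),
   instead of testing the code model at every use site.  */
static enum symbol_type
classify_symbol (int pic_level)
{
  switch (pic_level)
    {
    case 1:
      return SYMBOL_GOT_32K;
    case 2:
      return SYMBOL_GOT_4G;
    default:
      return SYMBOL_NON_PIC;
    }
}

/* Maximum GOT size in bytes implied by a classification (0 if none).  */
static unsigned long long
got_size_limit (enum symbol_type type)
{
  switch (type)
    {
    case SYMBOL_GOT_32K:
      return 32ULL * 1024;
    case SYMBOL_GOT_4G:
      return 4ULL * 1024 * 1024 * 1024;
    default:
      return 0;
    }
}
```

Later code would then switch on the classification alone, which is the kind of structure the review comment asks for.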

Thanks

--
Regards,
Jiong



[c-family PATCH] Fix for -Wlogical-op

2015-06-23 Thread Marek Polacek
While looking at something else I noticed that we're using == for
INTEGER_CSTs comparison.  That isn't going to work well, so use
tree_int_cst_equal instead.  Because of that we weren't diagnosing
the following test.
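
The underlying problem can be shown outside the compiler. In the sketch below, struct int_cst is a hypothetical stand-in for a constant node (not GCC's tree type), and int_cst_equal only imitates the spirit of tree_int_cst_equal: two separately created nodes holding the same value are different pointers, so == misses them while a value comparison does not:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an integer constant node; not GCC's tree.  */
struct int_cst
{
  long value;
};

/* Allocate a fresh node holding VALUE; returns NULL on allocation failure.  */
static struct int_cst *
make_int_cst (long value)
{
  struct int_cst *node = malloc (sizeof *node);
  if (node != NULL)
    node->value = value;
  return node;
}

/* Compare by value rather than by identity.  A NULL node only equals
   another NULL node.  */
static bool
int_cst_equal (const struct int_cst *a, const struct int_cst *b)
{
  if (a == b)
    return true;
  if (a == NULL || b == NULL)
    return false;
  return a->value == b->value;
}
```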

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-06-23  Marek Polacek  

* c-common.c (warn_logical_operator): Use tree_int_cst_equal
when comparing INTEGER_CSTs.

* c-c++-common/Wlogical-op-3.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index c39a36d..9fcd9d6 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -1838,7 +1838,8 @@ warn_logical_operator (location_t location, enum tree_code code, tree type,
}
   /* Or warn if the operands have exactly the same range, e.g.
 A > 0 && A > 0.  */
-  else if (low0 == low1 && high0 == high1)
+  else if (tree_int_cst_equal (low0, low1)
+  && tree_int_cst_equal (high0, high1))
{
  if (or_op)
warning_at (location, OPT_Wlogical_op,
diff --git gcc/testsuite/c-c++-common/Wlogical-op-3.c gcc/testsuite/c-c++-common/Wlogical-op-3.c
index e69de29..83b5df4 100644
--- gcc/testsuite/c-c++-common/Wlogical-op-3.c
+++ gcc/testsuite/c-c++-common/Wlogical-op-3.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-Wlogical-op" } */
+
+void
+fn1 (int a)
+{
+  const int x = a;
+  if (x && x) {} /* { dg-warning "logical .and. of equal expressions" } */
+  if (x && (int) x) {} /* { dg-warning "logical .and. of equal expressions" } */
+  if ((int) x && x) {} /* { dg-warning "logical .and. of equal expressions" } */
+  if ((int) x && (int) x) {} /* { dg-warning "logical .and. of equal expressions" } */
+}
+
+void
+fn2 (int a)
+{
+  const int x = a;
+  if (x || x) {} /* { dg-warning "logical .or. of equal expressions" } */
+  if (x || (int) x) {} /* { dg-warning "logical .or. of equal expressions" } */
+  if ((int) x || x) {} /* { dg-warning "logical .or. of equal expressions" } */
+  if ((int) x || (int) x) {} /* { dg-warning "logical .or. of equal expressions" } */
+}

Marek


Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-06-23 Thread Jeff Law

On 06/23/2015 02:29 AM, Ramana Radhakrishnan wrote:


If you try disabling the REG_EQUAL note generation [*], you'll probably find a
performance regression on arm32 (and probably on aarch64 as well?
we only


IMHO disabling the REG_EQUAL note generation is the wrong way to go about this.

Agreed.


Irrespective of combine, as a first step we should fix the predicates
and the call expanders to prevent this sort of replacement in the
backends. Tightening the predicates in the call patterns will achieve
the same for you and then we can investigate the use of GOT_PREL. My
recollection of this is that you need to work out when it's more
beneficial to use GOT_PREL over GOT but it's been a while since I
looked in that area.

Also agreed.  This is primarily a backend issue with the call patterns.

This is similar to the situation on the PA with the 32bit SOM runtime 
where direct and indirect calls have different calling conventions. 
Those different calling conventions, combined with the early loading of 
the parameter registers, in effect restrict us from transforming an 
indirect call into a direct call (combine) or vice-versa (cse).


The way we handled this was to split the calls into two patterns, one 
for direct calls and one for indirect calls, and to tighten their 
predicates appropriately.


Jeff



Re: [c-family PATCH] Fix for -Wlogical-op

2015-06-23 Thread Jeff Law

On 06/23/2015 07:12 AM, Marek Polacek wrote:

While looking at something else I noticed that we're using == for
INTEGER_CSTs comparison.  That isn't going to work well, so use
tree_int_cst_equal instead.  Because of that we weren't diagnosing
the following test.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-06-23  Marek Polacek  

* c-common.c (warn_logical_operator): Use tree_int_cst_equal
when comparing INTEGER_CSTs.

* c-c++-common/Wlogical-op-3.c: New test.

OK.
jeff



Re: [AArch64] Implement -fpic for -mcmodel=small

2015-06-23 Thread Marcus Shawcroft
On 23 June 2015 at 14:02, Jiong Wang  wrote:
>
> Marcus Shawcroft writes:
>
>> On 20 May 2015 at 11:21, Jiong Wang  wrote:
>>
>>> gcc/
>>>   * config/aarch64/aarch64.md: (ldr_got_small_): Support new GOT 
>>> relocation
>>>   modifiers.
>>>   (ldr_got_small_sidi): Ditto.
>>>   * config/aarch64/iterators.md (got_modifier): New mode iterator.
>>>   * config/aarch64/aarch64-otps.h (aarch64_code_model): New model.
>>>   * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Support 
>>> -fpic.
>>>   (aarch64_rtx_costs): Add costs for new instruction sequences.
>>>   (initialize_aarch64_code_model): Initialize new model.
>>>   (aarch64_classify_symbol): Recognize new model.
>>>   (aarch64_asm_preferred_eh_data_format): Support new model.
>>>   (aarch64_load_symref_appropriately): Generate new instruction sequences 
>>> for -fpic.
>>>   (TARGET_USE_PSEUDO_PIC_REG): New definition.
>>>   (aarch64_use_pseudo_pic_reg): New function.
>>>
>>> gcc/testsuite/
>>>   * gcc.target/aarch64/pic-small.c: New testcase.
>>
>>
>> Rather than thread tests against aarch64_cmodel throughout the
>> existing code can we instead extend classify_symbol with a new symbol
>> classification?
>
> Yes, we can. As -fPIC/-fpic allow 4G/32K GOT table size, we may name
> corresponding symbol classification as "SYMBOL_GOT_4G",
> "SYMBOL_GOT_32K".
>
> But can we let this patch go in and create a another patch to improve
> this? there are several other TLS patches may needs rebase if we change
> this immedaitely.

We can wait for a proper solution that fits with the code already in place.


Re: RFA: FT32: Fix building gcc.

2015-06-23 Thread Jeff Law

On 06/23/2015 06:10 AM, Nicholas Clifton wrote:

Hi Guys,

   It seems that the FT32 port of GCC does not have a maintainer at the
   moment.  Nevertheless I have a patch to fix a couple of build time
   problems compiling gcc for the FT32.  Is this OK to apply ?

Cheers
   Nick

gcc/ChangeLog
2015-06-23  Nick Clifton  

 * config/ft32/ft32.c: Include emit-rtl.h for the definition of
 crtl.
 (ft32_print_operand): Cast the result of INTVAL in order to make
 sure that the correct value is printed.
 * config/ft32/ft32.h (STACK_GROWS_DOWNWARD): Define to an
 integer.

James Bowman is the maintainer, though that isn't reflected in the 
MAINTAINERS file.


OK for the trunk.

jeff



Re: RFA: Add support for -fstack-usage to various ports

2015-06-23 Thread Jeff Law

On 06/23/2015 06:56 AM, Nick Clifton wrote:

Hi Guys,

   The patch below adds support for the -fstack-usage option to the BFIN,
   FT32, H8300, IQ2000 and M32C ports.  It also adjusts the expected
   output in the gcc.dg/stack-usage-1.c test for the V850 and MN10300 to
   match the actual results generated by these toolchains.

   Tested with no regressions on bfin-elf, ft32-elf, h8300-elf,
   iq2000-elf, m32c-elf, mn10300-elf and v850-elf toolchains.

   OK to apply ?

Cheers
   Nick

gcc/ChangeLog
2015-06-23  Nick Clifton  

* config/bfin/bfin.c (bfin_expand_prologue): Set
current_function_static_stack_size if flag_stack_usage_info is
 set.
* config/ft32/ft32.c (ft32_expand_prologue): Likewise.
* config/h8300/h8300.c (h8300_expand_prologue): Likewise.
* config/iq2000/iq2000.c (iq2000_expand_prologue): Likewise.
* config/m32c/m32c.c (m32c_emit_prologue): Likewise.

gcc/testsuite/ChangeLog
2015-06-23  Nick Clifton  

* gcc.dg/stack-usage-1.c: Add SIZE values for V850, MN10300,
H8300 and M32R targets.

OK.
jeff



[PATCH] Fix PR66636

2015-06-23 Thread Richard Biener

The following fixes an ICE with 188.ammp and AVX2.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-06-23  Richard Biener  

PR tree-optimization/66636
* tree-vect-stmts.c (vectorizable_store): Properly compute the
def type for further defs for strided stores.

* gcc.dg/vect/pr66636.c: New testcase.

Index: gcc/tree-vect-stmts.c
===
*** gcc/tree-vect-stmts.c   (revision 224834)
--- gcc/tree-vect-stmts.c   (working copy)
*** vectorizable_store (gimple stmt, gimple_
*** 5365,5371 
  if (slp)
vec_oprnd = vec_oprnds[j];
  else
!   vec_oprnd = vect_get_vec_def_for_stmt_copy (dt, vec_oprnd);
}
  
  for (i = 0; i < nstores; i++)
--- 5365,5375 
  if (slp)
vec_oprnd = vec_oprnds[j];
  else
!   {
! vect_is_simple_use (vec_oprnd, NULL, loop_vinfo,
! bb_vinfo, &def_stmt, &def, &dt);
! vec_oprnd = vect_get_vec_def_for_stmt_copy (dt, vec_oprnd);
!   }
}
  
  for (i = 0; i < nstores; i++)
Index: gcc/testsuite/gcc.dg/vect/pr66636.c
===
*** gcc/testsuite/gcc.dg/vect/pr66636.c (revision 0)
--- gcc/testsuite/gcc.dg/vect/pr66636.c (working copy)
***
*** 0 
--- 1,29 
+ /* { dg-additional-options "-mavx2" { target avx_runtime } } */
+ 
+ #include "tree-vect.h"
+ 
+ extern void abort (void);
+ 
+ struct X { double x; double y; };
+ 
+ void foo (struct X *x, double px, int s)
+ {
+   int i;
+   for (i = 0; i < 256; ++i)
+ {
+   x[i*s].x = px;
+   x[i*s].y = i + px;
+ }
+ }
+ 
+ int main()
+ {
+   struct X x[512];
+   int i;
+   check_vect ();
+   foo (x, 1., 2);
+   if (x[0].x != 1. || x[0].y != 1.
+   || x[510].x != 1. || x[510].y != 256.)
+ abort ();
+   return 0;
+ }
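
For reference, the values the testcase checks can be derived from a scalar
version of the strided loop.  The sketch below is not part of the patch:
reference_store is a hypothetical helper, and it uses a std::vector rather
than the fixed-size array from the test.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct X { double x; double y; };

// Scalar reference for the strided store in foo (): element i*s gets
// x = px and y = i + px, for i in [0, n).  Untouched elements stay zero.
static std::vector<X>
reference_store (int n, int s, double px)
{
  std::vector<X> v (static_cast<std::size_t> (n) * s, X{0.0, 0.0});
  for (int i = 0; i < n; ++i)
    {
      v[i * s].x = px;
      v[i * s].y = i + px;
    }
  return v;
}
```

With n = 256, s = 2 and px = 1.0 the last store is i = 255, i.e. element
510 with y = 255 + 1, which is what main () compares against.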


C++ PATCH for c++/65879 (error on member function of nested class of anonymous class)

2015-06-23 Thread Jason Merrill
It doesn't make sense to complain about a function using its own 
enclosing class.  There were two problems here:


1) The function should have been marked as internal because its class is 
internal.

2) We shouldn't bother looking at 'this' for no-linkage types.

Tested x86_64-pc-linux-gnu, applying to trunk.  Applying #2 to 5 and 4.9 
as well.

commit 6ad33315431b2e6dc2664a36f0a3b308a9eafd40
Author: Jason Merrill 
Date:   Mon Jun 22 16:07:13 2015 -0400

	PR c++/65879
	* decl.c (grokfndecl): Check the linkage of ctype, not just
	TYPE_ANONYMOUS_P.
	* tree.c (no_linkage_check): Skip the 'this' pointer.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d14ffe2..a8fc1a5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7794,7 +7794,7 @@ grokfndecl (tree ctype,
 
   /* Members of anonymous types and local classes have no linkage; make
  them internal.  If a typedef is made later, this will be changed.  */
-  if (ctype && (TYPE_ANONYMOUS_P (ctype)
+  if (ctype && (!TREE_PUBLIC (TYPE_MAIN_DECL (ctype))
 		|| decl_function_context (TYPE_MAIN_DECL (ctype))))
 publicp = 0;
 
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index a9c9214..bc8428d 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -2299,14 +2299,14 @@ no_linkage_check (tree t, bool relaxed_p)
   return no_linkage_check (TYPE_PTRMEM_CLASS_TYPE (t), relaxed_p);
 
 case METHOD_TYPE:
-  r = no_linkage_check (TYPE_METHOD_BASETYPE (t), relaxed_p);
-  if (r)
-	return r;
-  /* Fall through.  */
 case FUNCTION_TYPE:
   {
-	tree parm;
-	for (parm = TYPE_ARG_TYPES (t);
+	tree parm = TYPE_ARG_TYPES (t);
+	if (TREE_CODE (t) == METHOD_TYPE)
+	  /* The 'this' pointer isn't interesting; a method has the same
+	 linkage (or lack thereof) as its enclosing class.  */
+	  parm = TREE_CHAIN (parm);
+	for (;
 	 parm && parm != void_list_node;
 	 parm = TREE_CHAIN (parm))
 	  {
diff --git a/gcc/testsuite/g++.dg/abi/anon2.C b/gcc/testsuite/g++.dg/abi/anon2.C
index cee9237..396edd3 100644
--- a/gcc/testsuite/g++.dg/abi/anon2.C
+++ b/gcc/testsuite/g++.dg/abi/anon2.C
@@ -23,9 +23,9 @@ namespace N2 {
 typedef struct { } B;
 struct C {
   // { dg-final { scan-assembler-not ".weak\(_definition\)?\[ \t\]_?_ZN2N23._31C3fn1ENS0_1BE" { target c++11 } } }
-  static void fn1 (B) { } // { dg-error "no linkage" "" { target { ! c++11 } } }
+  static void fn1 (B) { }
   // { dg-final { scan-assembler-not ".weak\(_definition\)?\[ \t\]_?_ZN2N23._31C3fn2ES1_" { target c++11 } } }
-  static void fn2 (C) { } // { dg-error "no linkage" "" { target { ! c++11 } } }
+  static void fn2 (C) { }
 };
   } const D;
 
diff --git a/gcc/testsuite/g++.dg/other/anon7.C b/gcc/testsuite/g++.dg/other/anon7.C
new file mode 100644
index 000..12c1ab2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/other/anon7.C
@@ -0,0 +1,10 @@
+// PR c++/65879
+
+static struct
+{
+  void f();
+  struct Inner
+  {
+void g();
+  };
+} x;


Fwd: [PATCH] Add CFI entries for ARM Linux idiv0 / ldiv0

2015-06-23 Thread James Lemke

Ping..


-------- Forwarded Message --------
Subject: [PATCH] Add CFI entries for ARM Linux idiv0 / ldiv0
Date: Tue, 16 Jun 2015 17:25:49 -0400
From: James Lemke 
To: gcc-patches@gcc.gnu.org

A divide by zero exception was not giving a proper traceback for LINUX
ARM_EABI.  The attached patch fixes the problem on trunk (and several
local branches).

Tested on gcc-trunk for arm-none-linux-gnueabi.

OK to commit?

--
Jim Lemke, GNU Tools Sourcerer
Mentor Graphics / CodeSourcery
Orillia, Ontario



2015-06-16  James Lemke  

	libgcc/config/arm/
	* lib1funcs.S (aeabi_idiv0, aeabi_ldiv0): Add CFI entries for
	Linux ARM_EABI.

Index: libgcc/config/arm/lib1funcs.S
===
--- libgcc/config/arm/lib1funcs.S	(revision 224523)
+++ libgcc/config/arm/lib1funcs.S	(working copy)
@@ -1336,23 +1336,30 @@ LSYM(Lover12):
 #define SIGFPE	8
 
 #ifdef __ARM_EABI__
+	cfi_start	__aeabi_ldiv0, LSYM(Lend_aeabi_ldiv0)
 	WEAK aeabi_idiv0
 	WEAK aeabi_ldiv0
 	ARM_FUNC_START aeabi_idiv0
 	ARM_FUNC_START aeabi_ldiv0
+	do_push	{r1, lr}
+98:	cfi_push 98b - __aeabi_ldiv0, 0xe, -0x4, 0x8
 #else
+	cfi_start	__div0, LSYM(Lend_div0)
 	ARM_FUNC_START div0
+	do_push	{r1, lr}
+98:	cfi_push 98b - __div0, 0xe, -0x4, 0x8
 #endif
 
-	do_push	{r1, lr}
 	mov	r0, #SIGFPE
 	bl	SYM(raise) __PLT__
-	RETLDM	r1
+	RETLDM	r1 unwind=98b
 
 #ifdef __ARM_EABI__
+	cfi_end	LSYM(Lend_aeabi_ldiv0)
 	FUNC_END aeabi_ldiv0
 	FUNC_END aeabi_idiv0
 #else
+	cfi_end	LSYM(Lend_div0)
 	FUNC_END div0
 #endif
 	



C++ PATCH for c++/66542 (missing error with deleted dtor and static variable)

2015-06-23 Thread Jason Merrill
In expand_static_init we were diagnosing a deleted dtor if there was 
also no initializer, but not if there was, and nothing later on was 
diagnosing it either.  Fixed thus.
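
To illustrate the kind of code affected, here is a small sketch (not taken
from the patch).  The struct mirrors the new testcase; the extra constructor
and the helper a_is_destructible are made up for the example, and the
ill-formed definitions are left in comments so the snippet still compiles.

```cpp
#include <cassert>
#include <type_traits>

struct A
{
  A () {}
  explicit A (int) {}
  ~A () = delete;
};

// The destructor is deleted, so A is not destructible.
static_assert (!std::is_destructible<A>::value,
	       "A has a deleted destructor");

// A variable with static storage duration needs a callable destructor,
// so definitions like these should be rejected whether or not the
// variable has an initializer:
//   static A a;
//   static A b (1);

// Hypothetical helper so the property can also be checked at run time.
static bool
a_is_destructible ()
{
  return std::is_destructible<A>::value;
}
```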


Tested x86_64-pc-linux-gnu, applying to trunk.

commit 78593d02fb6af72a8f97e52cbfbbe9f49b29e9db
Author: Jason Merrill 
Date:   Mon Jun 22 14:00:30 2015 -0400

	PR c++/66542
	* decl.c (expand_static_init): Make sure the destructor is callable
	here even if we have an initializer.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index c934ff9..d14ffe2 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7163,12 +7163,12 @@ expand_static_init (tree decl, tree init)
   gcc_assert (TREE_STATIC (decl));
 
   /* Some variables require no dynamic initialization.  */
-  if (!init
-  && TYPE_HAS_TRIVIAL_DESTRUCTOR (TREE_TYPE (decl)))
+  if (TYPE_HAS_TRIVIAL_DESTRUCTOR (TREE_TYPE (decl)))
 {
   /* Make sure the destructor is callable.  */
   cxx_maybe_build_cleanup (decl, tf_warning_or_error);
-  return;
+  if (!init)
+	return;
 }
 
   if (DECL_THREAD_LOCAL_P (decl) && DECL_GNU_TLS_P (decl)
diff --git a/gcc/testsuite/g++.dg/cpp0x/deleted12.C b/gcc/testsuite/g++.dg/cpp0x/deleted12.C
new file mode 100644
index 000..770bb9c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/deleted12.C
@@ -0,0 +1,10 @@
+// PR c++/66542
+// { dg-do compile { target c++11 } }
+
+struct A
+{
+  A() {}
+  ~A() = delete;		// { dg-message "declared here" }
+};
+
+static A a;			// { dg-error "deleted" }


Re: [PATCH] Make muser-mode the default for LEON3

2015-06-23 Thread Daniel Cederman



On 2015-06-23 14:58, Jakub Jelinek wrote:
> On Tue, Jun 23, 2015 at 02:48:45PM +0200, Daniel Cederman wrote:
> > How does one check if the bit has been explicitly set? It was not obvious to
>
> if (TARGET_USER_MODE_P (target_flags_explicit))
>
> > me, which is why I took a similar approach to a patch I found for another
> > CPU target. If it is possible to change the default without adding another
> > flag then that is obviously better and I will update my patch.
>
> Or you can just change the default target_flags, supposedly with
> TargetVariable
> int target_flags = MASK_USER_MODE
> in the opt file, there are really many possibilities.
>
> Jakub



Thanks! I went with your suggestion in the previous mail: I removed the
new -msv-mode option and inverted the user mode mask.


Best regards,
Daniel Cederman


[PATCH] Make muser-mode the default for LEON3

2015-06-23 Thread Daniel Cederman
The muser-mode flag causes the CASA instruction for LEON3 to use the
user mode ASI. This is the correct behavior for almost all LEON3 targets.
For this reason it makes sense to make user mode the default.
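
As a rough model of the intended behaviour (a sketch, not the generated
option code): with an inverse mask, user mode is active whenever the
supervisor bit is clear, so an all-zero target_flags selects the user ASI.
The bit value of MASK_SV_MODE and the helper names below are assumptions
for the example; the two output templates are the ones from sync.md.

```cpp
#include <cassert>
#include <string>

// Hypothetical bit position; the real value is generated from sparc.opt.
const unsigned MASK_SV_MODE = 1u << 0;

// With an inverse mask, user mode is on when the supervisor bit is clear.
static bool
user_mode_p (unsigned target_flags)
{
  return (target_flags & MASK_SV_MODE) == 0;
}

// Pick the CASA output template the way the sync.md pattern does.
static std::string
casa_template (unsigned target_flags)
{
  return user_mode_p (target_flags)
	 ? "casa\t%1 0xa, %2, %0"   /* ASI for user data space.  */
	 : "casa\t%1 0xb, %2, %0";  /* ASI for supervisor data space.  */
}
```

With no flags set, casa_template (0) yields the 0xa (user) form, which is
the new default; setting the supervisor bit yields the 0xb form.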

gcc/ChangeLog:

2015-06-23  Daniel Cederman  

* config/sparc/sparc.opt: Rename mask from USER_MODE to SV_MODE
  and make it inverse to change default
* config/sparc/sync.md: Only use supervisor ASI for CASA when in
  supervisor mode
* doc/invoke.texi: Document change of default
---
 gcc/config/sparc/sparc.opt | 4 ++--
 gcc/config/sparc/sync.md   | 6 +++---
 gcc/doc/invoke.texi| 4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/config/sparc/sparc.opt b/gcc/config/sparc/sparc.opt
index 93d24a6..85bf0bd 100644
--- a/gcc/config/sparc/sparc.opt
+++ b/gcc/config/sparc/sparc.opt
@@ -114,8 +114,8 @@ Target
 Optimize tail call instructions in assembler and linker
 
 muser-mode
-Target Report Mask(USER_MODE)
-Do not generate code that can only run in supervisor mode
+Target Report InverseMask(SV_MODE)
+Do not generate code that can only run in supervisor mode (default)
 
 mcpu=
 Target RejectNegative Joined Var(sparc_cpu_and_features) Enum(sparc_processor_type) Init(PROCESSOR_V7)
diff --git a/gcc/config/sparc/sync.md b/gcc/config/sparc/sync.md
index 7d00b10..2fabff5 100644
--- a/gcc/config/sparc/sync.md
+++ b/gcc/config/sparc/sync.md
@@ -222,10 +222,10 @@
  UNSPECV_CAS))]
   "TARGET_LEON3"
 {
-  if (TARGET_USER_MODE)
-return "casa\t%1 0xa, %2, %0"; /* ASI for user data space.  */
-  else
+  if (TARGET_SV_MODE)
 return "casa\t%1 0xb, %2, %0"; /* ASI for supervisor data space.  */
+  else
+return "casa\t%1 0xa, %2, %0"; /* ASI for user data space.  */
 }
   [(set_attr "type" "multi")])
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b99ab1c..86b2a73 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -21305,8 +21305,8 @@ in a performance loss, especially for floating-point code.
 @opindex muser-mode
 @opindex mno-user-mode
 Do not generate code that can only run in supervisor mode.  This is relevant
-only for the @code{casa} instruction emitted for the LEON3 processor.  The
-default is @option{-mno-user-mode}.
+only for the @code{casa} instruction emitted for the LEON3 processor.  This
+is the default.
 
 @item -mno-faster-structs
 @itemx -mfaster-structs
-- 
2.4.3



C++ PATCH for c++/66501 (wrong code with array move assignment)

2015-06-23 Thread Jason Merrill
build_vec_init was assuming that if a class has a trivial copy 
assignment, then an array assignment is trivial.  But overload 
resolution might not choose the copy assignment operator.  So this patch 
changes build_vec_init to check for any non-trivial assignment operator.
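
A reduced illustration of the distinction (a sketch separate from the
testcase in the patch; Elt, Holder and moves_for_array_assignment are
names made up here): the element type has a trivial copy assignment, yet
assigning from an rvalue selects the user-provided move assignment, so
the array assignment inside Holder must not be treated as a plain copy.

```cpp
#include <cassert>
#include <type_traits>

static int move_assignments = 0;

struct Elt
{
  int v = 0;
  Elt () = default;
  Elt (const Elt &) = default;
  Elt &operator= (const Elt &) = default;	// trivial copy assignment
  Elt &operator= (Elt &&other)			// non-trivial move assignment
  {
    v = other.v;
    ++move_assignments;
    return *this;
  }
};

struct Holder { Elt elts[2]; };

static_assert (std::is_trivially_copy_assignable<Elt>::value,
	       "copy assignment is trivial");
static_assert (!std::is_trivially_move_assignable<Elt>::value,
	       "move assignment is not trivial");

// Assign a temporary Holder and report how many element-wise move
// assignments ran.
static int
moves_for_array_assignment ()
{
  move_assignments = 0;
  Holder h;
  h = Holder ();
  return move_assignments;
}
```

On a conforming compiler each of the two array elements is move-assigned,
so the helper returns 2; treating the array assignment as trivial would
skip those calls.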


Tested x86_64-pc-linux-gnu, applying to trunk.

commit 2f1cd98c72127d70e198ed99c67cc5d031f052b6
Author: Jason Merrill 
Date:   Mon Jun 22 15:13:58 2015 -0400

	PR c++/66501
	* class.c (type_has_nontrivial_assignment): New.
	* init.c (build_vec_init): Use it.
	* cp-tree.h: Declare it.
	* method.c (trivial_fn_p): Templates aren't trivial.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 9da532e..88f1022 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -5136,6 +5136,24 @@ type_has_non_user_provided_default_constructor (tree t)
   return false;
 }
 
+/* Return true if TYPE has some non-trivial assignment operator.  */
+
+bool
+type_has_nontrivial_assignment (tree type)
+{
+  gcc_assert (TREE_CODE (type) != ARRAY_TYPE);
+  if (CLASS_TYPE_P (type))
+for (tree fns
+	   = lookup_fnfields_slot_nolazy (type, ansi_assopname (NOP_EXPR));
+	 fns; fns = OVL_NEXT (fns))
+  {
+	tree fn = OVL_CURRENT (fns);
+	if (!trivial_fn_p (fn))
+	  return true;
+  }
+  return false;
+}
+
 /* TYPE is being used as a virtual base, and has a non-trivial move
assignment.  Return true if this is due to there being a user-provided
move assignment in TYPE or one of its subobjects; if there isn't, then
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index b53aa90..8eb7474 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5295,6 +5295,7 @@ extern tree in_class_defaulted_default_constructor (tree);
 extern bool user_provided_p			(tree);
 extern bool type_has_user_provided_constructor  (tree);
 extern bool type_has_non_user_provided_default_constructor (tree);
+extern bool type_has_nontrivial_assignment	(tree);
 extern bool vbase_has_user_provided_move_assign (tree);
 extern tree default_init_uninitialized_part (tree);
 extern bool trivial_default_constructor_is_constexpr (tree);
diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index fc30fef..08c6c0e 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -3460,8 +3460,7 @@ build_vec_init (tree base, tree maxindex, tree init,
   && TREE_CODE (atype) == ARRAY_TYPE
   && TREE_CONSTANT (maxindex)
   && (from_array == 2
-	  ? (!CLASS_TYPE_P (inner_elt_type)
-	 || !TYPE_HAS_COMPLEX_COPY_ASSIGN (inner_elt_type))
+	  ? !type_has_nontrivial_assignment (inner_elt_type)
 	  : !TYPE_NEEDS_CONSTRUCTING (type))
   && ((TREE_CODE (init) == CONSTRUCTOR
 	   /* Don't do this if the CONSTRUCTOR might contain something
diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 79e4bbc..da03c36 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -476,6 +476,8 @@ type_set_nontrivial_flag (tree ctype, special_function_kind sfk)
 bool
 trivial_fn_p (tree fn)
 {
+  if (TREE_CODE (fn) == TEMPLATE_DECL)
+return false;
   if (!DECL_DEFAULTED_FN (fn))
 return false;
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/rv-array1.C b/gcc/testsuite/g++.dg/cpp0x/rv-array1.C
new file mode 100644
index 000..9075764
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/rv-array1.C
@@ -0,0 +1,55 @@
+// PR c++/66501
+// { dg-do run { target c++11 } }
+
+int total_size;
+
+struct Object
+{
+  int size = 0;
+
+  Object () = default;
+
+  ~Object () {
+total_size -= size;
+  }
+
+  Object (const Object &) = delete;
+  Object & operator= (const Object &) = delete;
+
+  Object (Object && b) {
+size = b.size;
+b.size = 0;
+  }
+
+  Object & operator= (Object && b) {
+if (this != & b) {
+  total_size -= size;
+  size = b.size;
+  b.size = 0;
+}
+return * this;
+  }
+
+  void grow () {
+size ++;
+total_size ++;
+  }
+};
+
+struct Container {
+  Object objects[2];
+};
+
+int main (void)
+{
+  Container container;
+
+  // grow some objects in the container
+  for (auto & object : container.objects)
+object.grow ();
+
+  // now empty it
+  container = Container ();
+
+  return total_size;
+}


[gomp4.1] Taskloop C++ random access iterator support

2015-06-23 Thread Jakub Jelinek
Hi!

I've committed following patch to add support for C++ random access
iterators in taskloop constructs.
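
For readers who want to see the kind of loop this enables, here is a
minimal sketch (not one of the new tests; sum_with_taskloop is a made-up
name).  It assumes a compiler with taskloop support and -fopenmp; without
OpenMP the pragmas are ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <vector>

// Sum a vector with a taskloop whose iteration variable is a C++
// random access iterator rather than an integer.
static long
sum_with_taskloop (const std::vector<int> &v)
{
  long total = 0;
#pragma omp parallel
#pragma omp single
#pragma omp taskloop shared(total)
  for (std::vector<int>::const_iterator it = v.begin (); it < v.end (); ++it)
    {
      // Several tasks may update the shared accumulator concurrently.
#pragma omp atomic
      total += *it;
    }
  return total;
}
```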

2015-06-23  Jakub Jelinek  

* tree.h (OMP_CLAUSE_PRIVATE_TASKLOOP_IV,
OMP_CLAUSE_LASTPRIVATE_TASKLOOP_IV): Define.
* gimplify.c (gimplify_omp_for): Handle gimplification of
OMP_TASKLOOP with C++ random access iterator clauses.
* omp-low.c (scan_sharing_clauses): Ignore
OMP_CLAUSE_SHARED with OMP_CLAUSE_SHARED_FIRSTPRIVATE if
it is a global var outside of the outer taskloop for.
(lower_lastprivate_clauses): Handle
OMP_CLAUSE_LASTPRIVATE_TASKLOOP_IV lastprivate if the
decl is global outside of outer taskloop for.
(lower_send_clauses): Look beyond the outer taskloop for.
gcc/cp/
* semantics.c (handle_omp_for_class_iterator): Handle
OMP_TASKLOOP class iterators.
(finish_omp_for): Adjust handle_omp_for_class_iterator
caller.
libgomp/
* testsuite/libgomp.c++/taskloop-6.C: New test.
* testsuite/libgomp.c++/taskloop-7.C: New test.
* testsuite/libgomp.c++/taskloop-8.C: New test.
* testsuite/libgomp.c++/taskloop-9.C: New test.

--- gcc/tree.h.jj   2015-06-17 21:02:00.0 +0200
+++ gcc/tree.h  2015-06-22 15:19:37.501110534 +0200
@@ -1356,6 +1356,12 @@ extern void protected_set_expr_location
 #define OMP_CLAUSE_PRIVATE_OUTER_REF(NODE) \
   TREE_PRIVATE (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PRIVATE))
 
+/* True if a PRIVATE clause is for a C++ class IV on taskloop construct
+   (thus should be private on the outer taskloop and firstprivate on
+   task).  */
+#define OMP_CLAUSE_PRIVATE_TASKLOOP_IV(NODE) \
+  TREE_PROTECTED (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PRIVATE))
+
 /* True on a LASTPRIVATE clause if a FIRSTPRIVATE clause for the same
decl is present in the chain.  */
 #define OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE(NODE) \
@@ -1367,6 +1373,12 @@ extern void protected_set_expr_location
 #define OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ(NODE) \
   (OMP_CLAUSE_CHECK (NODE))->omp_clause.gimple_reduction_init
 
+/* True if a LASTPRIVATE clause is for a C++ class IV on taskloop construct
+   (thus should be lastprivate on the outer taskloop and firstprivate on
+   task).  */
+#define OMP_CLAUSE_LASTPRIVATE_TASKLOOP_IV(NODE) \
+  TREE_PROTECTED (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LASTPRIVATE))
+
 /* True on a SHARED clause if a FIRSTPRIVATE clause for the same
decl is present in the chain (this can happen only for taskloop
with FIRSTPRIVATE/LASTPRIVATE on it originally.  */
--- gcc/gimplify.c.jj   2015-06-18 15:16:18.0 +0200
+++ gcc/gimplify.c  2015-06-23 10:03:28.908079507 +0200
@@ -7230,7 +7230,8 @@ gimplify_omp_for (tree *expr_p, gimple_s
{
  TREE_OPERAND (t, 1)
= get_initialized_tmp_var (TREE_OPERAND (t, 1),
-  pre_p, NULL);
+  gimple_seq_empty_p (for_pre_body)
+  ? pre_p : &for_pre_body, NULL);
  tree c = build_omp_clause (input_location,
 OMP_CLAUSE_FIRSTPRIVATE);
  OMP_CLAUSE_DECL (c) = TREE_OPERAND (t, 1);
@@ -7250,7 +7251,9 @@ gimplify_omp_for (tree *expr_p, gimple_s
 
  if (!is_gimple_constant (*tp))
{
- *tp = get_initialized_tmp_var (*tp, pre_p, NULL);
+ gimple_seq *seq = gimple_seq_empty_p (for_pre_body)
+   ? pre_p : &for_pre_body;
+ *tp = get_initialized_tmp_var (*tp, seq, NULL);
  tree c = build_omp_clause (input_location,
 OMP_CLAUSE_FIRSTPRIVATE);
  OMP_CLAUSE_DECL (c) = *tp;
@@ -7683,7 +7686,6 @@ gimplify_omp_for (tree *expr_p, gimple_s
  {
  /* These clauses are allowed on task, move them there.  */
  case OMP_CLAUSE_SHARED:
- case OMP_CLAUSE_PRIVATE:
  case OMP_CLAUSE_FIRSTPRIVATE:
  case OMP_CLAUSE_DEFAULT:
  case OMP_CLAUSE_IF:
@@ -7694,6 +7696,26 @@ gimplify_omp_for (tree *expr_p, gimple_s
*gtask_clauses_ptr = c;
gtask_clauses_ptr = &OMP_CLAUSE_CHAIN (c);
break;
+ case OMP_CLAUSE_PRIVATE:
+   if (OMP_CLAUSE_PRIVATE_TASKLOOP_IV (c))
+ {
+   /* We want private on outer for and firstprivate
+  on task.  */
+   *gtask_clauses_ptr
+ = build_omp_clause (OMP_CLAUSE_LOCATION (c),
+ OMP_CLAUSE_FIRSTPRIVATE);
+   OMP_CLAUSE_DECL (*gtask_clauses_ptr) = OMP_CLAUSE_DECL (c);
+   lang_hooks.decls.omp_finish_clause (*gtask_clauses_ptr, NULL);
+   gtask_clauses_ptr = &OMP_CLAUSE_CHAIN (*gtask_clauses_ptr);
+   *gforo_clauses_ptr = c;
+   gforo_clau

[00/12] Share hash traits between hash_table and hash_map

2015-06-23 Thread Richard Sandiford
Following on from: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01066.html
this series unifies the key hashing traits for hash_maps in all but one case
(which needs to use the values rather than keys to represent deleted and
empty entries).

It also consolidates the various tree, integer and string hashers
so that there's only one copy of each.

This series is a net reduction of 121 lines, despite using the more
verbose out-of-class function definitions and having new copyright
notices.  The series linked above is a net reduction of 419 lines.

Series bootstrapped & regression-tested on x86_64-linux-gnu.
Also tested with config-list.mk.

Thanks,
Richard



[01/12] Add hash_map traits that use existing hash_table-like traits

2015-06-23 Thread Richard Sandiford
This patch defines a class that converts hash_table-style traits into
hash_map traits.  It can be used as the default traits for all hash_maps
that don't specify their own traits (i.e. this patch does work on its own).

By the end of the series this class replaces default_hashmap_traits.
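
To make the idea concrete outside of GCC, here is a standalone toy version
of the adapter.  Everything prefixed with toy_ is invented for this sketch
(the key traits, the entry type and the sentinel values -1 and -2); only
the shape of the forwarding functions follows the patch.

```cpp
#include <cassert>

// Toy hash_table-style traits for non-negative int keys; -1 and -2 are
// reserved as the "empty" and "deleted" markers.
struct toy_int_hash
{
  typedef int value_type;
  static unsigned hash (const int &k) { return static_cast<unsigned> (k); }
  static bool equal (const int &a, const int &b) { return a == b; }
  static bool is_empty (const int &k) { return k == -1; }
  static bool is_deleted (const int &k) { return k == -2; }
  static void mark_empty (int &k) { k = -1; }
  static void mark_deleted (int &k) { k = -2; }
};

// Adapter: turns key traits H into map traits by forwarding every
// entry-level query to the entry's key.
template <typename H>
struct toy_hashmap_traits
{
  static unsigned hash (const typename H::value_type &k)
  { return H::hash (k); }
  static bool equal_keys (const typename H::value_type &a,
			  const typename H::value_type &b)
  { return H::equal (a, b); }
  template <typename T> static bool is_empty (const T &e)
  { return H::is_empty (e.m_key); }
  template <typename T> static bool is_deleted (const T &e)
  { return H::is_deleted (e.m_key); }
  template <typename T> static void mark_empty (T &e)
  { H::mark_empty (e.m_key); }
  template <typename T> static void mark_deleted (T &e)
  { H::mark_deleted (e.m_key); }
};

// Toy map entry; the adapter only relies on the m_key member.
struct toy_entry
{
  int m_key;
  const char *m_value;
};
```

The point of the design is that the map never needs its own notion of
empty or deleted slots: whatever the key traits already define is reused.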


gcc/
* hash-map-traits.h: Include hash-traits.h.
(simple_hashmap_traits): New class.
* mem-stats.h (hash_map): Change the default traits to
simple_hashmap_traits >.

Index: gcc/hash-map-traits.h
===
--- gcc/hash-map-traits.h   2015-06-23 15:42:24.132002236 +0100
+++ gcc/hash-map-traits.h   2015-06-23 15:42:24.128002280 +0100
@@ -23,6 +23,8 @@ #define HASH_MAP_TRAITS_H
 /* Bacause mem-stats.h uses default hashmap traits, we have to
put the class to this separate header file.  */
 
+#include "hash-traits.h"
+
 /* implement default behavior for traits when types allow it.  */
 
 struct default_hashmap_traits
@@ -101,4 +103,75 @@ struct default_hashmap_traits
 }
 };
 
+/* Implement hash_map traits for a key with hash traits H.  Empty and
+   deleted map entries are represented as empty and deleted keys.  */
+
+template <typename H>
+struct simple_hashmap_traits
+{
+  static inline hashval_t hash (const typename H::value_type &);
+  static inline bool equal_keys (const typename H::value_type &,
+                                 const typename H::value_type &);
+  template <typename T> static inline void remove (T &);
+  template <typename T> static inline bool is_empty (const T &);
+  template <typename T> static inline bool is_deleted (const T &);
+  template <typename T> static inline void mark_empty (T &);
+  template <typename T> static inline void mark_deleted (T &);
+};
+
+template <typename H>
+inline hashval_t
+simple_hashmap_traits <H>::hash (const typename H::value_type &h)
+{
+  return H::hash (h);
+}
+
+template <typename H>
+inline bool
+simple_hashmap_traits <H>::equal_keys (const typename H::value_type &k1,
+                                       const typename H::value_type &k2)
+{
+  return H::equal (k1, k2);
+}
+
+template <typename H>
+template <typename T>
+inline void
+simple_hashmap_traits <H>::remove (T &entry)
+{
+  H::remove (entry.m_key);
+}
+
+template <typename H>
+template <typename T>
+inline bool
+simple_hashmap_traits <H>::is_empty (const T &entry)
+{
+  return H::is_empty (entry.m_key);
+}
+
+template <typename H>
+template <typename T>
+inline bool
+simple_hashmap_traits <H>::is_deleted (const T &entry)
+{
+  return H::is_deleted (entry.m_key);
+}
+
+template <typename H>
+template <typename T>
+inline void
+simple_hashmap_traits <H>::mark_empty (T &entry)
+{
+  H::mark_empty (entry.m_key);
+}
+
+template <typename H>
+template <typename T>
+inline void
+simple_hashmap_traits <H>::mark_deleted (T &entry)
+{
+  H::mark_deleted (entry.m_key);
+}
+
 #endif // HASH_MAP_TRAITS_H
Index: gcc/mem-stats.h
===
--- gcc/mem-stats.h 2015-06-23 15:42:24.132002236 +0100
+++ gcc/mem-stats.h 2015-06-23 15:42:24.128002280 +0100
@@ -3,7 +3,7 @@ #define GCC_MEM_STATS_H
 
 /* Forward declaration.  */
 template
+typename Traits = simple_hashmap_traits > >
 class hash_map;
 
 #define LOCATION_LINE_EXTRA_SPACE 30



[02/12] Move tree operand hashers to a new header file

2015-06-23 Thread Richard Sandiford
There were three tree operand hashers, so move them to their own
header file.

The typedefs in this and subsequent patches are temporary and
get removed in patch 12.


gcc/
* tree-hash-traits.h: New file.
(tree_operand_hash): New class.
* sanopt.c: Include tree-hash-traits.h.
(sanopt_tree_map_traits): Use tree_operand_hash.
* tree-if-conv.c: Include tree-hash-traits.h.
(phi_args_hash_traits): Use tree_operand_hash.
* tree-ssa-uncprop.c: Include tree-hash-traits.h.
(val_ssa_equiv_hash_traits): Use tree_operand_hash.

Index: gcc/tree-hash-traits.h
===
--- /dev/null   2015-06-02 17:27:28.541944012 +0100
+++ gcc/tree-hash-traits.h  2015-06-23 15:44:07.966809173 +0100
@@ -0,0 +1,42 @@
+/* Traits for hashing trees.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#ifndef tree_hash_traits_h
+#define tree_hash_traits_h
+
+/* Hash for trees based on operand_equal_p.  */
+struct tree_operand_hash : ggc_ptr_hash 
+{
+  static inline hashval_t hash (const_tree);
+  static inline bool equal_keys (const_tree, const_tree);
+};
+
+inline hashval_t
+tree_operand_hash::hash (const_tree t)
+{
+  return iterative_hash_expr (t, 0);
+}
+
+inline bool
+tree_operand_hash::equal_keys (const_tree t1, const_tree t2)
+{
+  return operand_equal_p (t1, t2, 0);
+}
+
+#endif
Index: gcc/sanopt.c
===
--- gcc/sanopt.c2015-06-23 15:44:07.970809082 +0100
+++ gcc/sanopt.c2015-06-23 15:44:07.962809243 +0100
@@ -50,6 +50,7 @@ Software Foundation; either version 3, o
 #include "ubsan.h"
 #include "params.h"
 #include "tree-ssa-operands.h"
+#include "tree-hash-traits.h"
 
 
 /* This is used to carry information about basic blocks.  It is
@@ -98,20 +99,7 @@ maybe_get_single_definition (tree t)
   return NULL_TREE;
 }
 
-/* Traits class for tree hash maps below.  */
-
-struct sanopt_tree_map_traits : default_hashmap_traits
-{
-  static inline hashval_t hash (const_tree ref)
-  {
-return iterative_hash_expr (ref, 0);
-  }
-
-  static inline bool equal_keys (const_tree ref1, const_tree ref2)
-  {
-return operand_equal_p (ref1, ref2, 0);
-  }
-}; 
+typedef simple_hashmap_traits <tree_operand_hash> sanopt_tree_map_traits;
 
 /* Tree triplet for vptr_check_map.  */
 struct sanopt_tree_triplet
Index: gcc/tree-if-conv.c
===
--- gcc/tree-if-conv.c  2015-06-23 15:44:07.970809082 +0100
+++ gcc/tree-if-conv.c  2015-06-23 15:44:07.966809173 +0100
@@ -135,6 +135,7 @@ Software Foundation; either version 3, o
 #include "expr.h"
 #include "insn-codes.h"
 #include "optabs.h"
+#include "tree-hash-traits.h"
 
 /* List of basic blocks in if-conversion-suitable order.  */
 static basic_block *ifc_bbs;
@@ -1594,27 +1595,9 @@ convert_scalar_cond_reduction (gimple re
   return rhs;
 }
 
-/* Helpers for PHI arguments hashtable map.  */
+typedef simple_hashmap_traits <tree_operand_hash> phi_args_hash_traits;
 
-struct phi_args_hash_traits : default_hashmap_traits
-{
-  static inline hashval_t hash (tree);
-  static inline bool equal_keys (tree, tree);
-};
-
-inline hashval_t
-phi_args_hash_traits::hash (tree value)
-{
-  return iterative_hash_expr (value, 0);
-}
-
-inline bool
-phi_args_hash_traits::equal_keys (tree value1, tree value2)
-{
-  return operand_equal_p (value1, value2, 0);
-}
-
-  /* Produce condition for all occurrences of ARG in PHI node.  */
+/* Produce condition for all occurrences of ARG in PHI node.  */
 
 static tree
 gen_phi_arg_condition (gphi *phi, vec *occur,
Index: gcc/tree-ssa-uncprop.c
===
--- gcc/tree-ssa-uncprop.c  2015-06-23 15:44:07.970809082 +0100
+++ gcc/tree-ssa-uncprop.c  2015-06-23 15:44:07.966809173 +0100
@@ -50,6 +50,7 @@ the Free Software Foundation; either ver
 #include "domwalk.h"
 #include "tree-pass.h"
 #include "tree-ssa-propagate.h"
+#include "tree-hash-traits.h"
 
 /* The basic structure describing an equivalency created by traversing
an edge.  Traversing the edge effectively means that we can assume
@@ -294,25 +295,11 @@ struct equiv_hash_elt
 
 /* Value to ssa name equivalence hashtable helpers.  */
 
-struct val_ssa_equiv_hash_traits : default_hashmap_traits
+struct val_

[03/12] Move decl hasher to header file

2015-06-23 Thread Richard Sandiford
Like the previous patch, but for decl hashers.  There's only one copy
of this so far, but the idea seems general.
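
The reasoning behind hashing on the UID rather than on the pointer can be
shown with a toy model (toy_decl and both helpers are invented for this
sketch; they are not GCC types): the pointer value depends on where a node
happens to be allocated, while the UID does not.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a decl node; only the uid matters here.
struct toy_decl { unsigned uid; };

// Pointer-based hash: usable, but the value changes with the allocation
// address, so table order can differ from run to run.
static std::uintptr_t
hash_by_pointer (const toy_decl *d)
{
  return reinterpret_cast<std::uintptr_t> (d);
}

// UID-based hash: the same decl number always hashes the same way.
static unsigned
hash_by_uid (const toy_decl *d)
{
  return d->uid;
}
```

Two allocations that model "the same decl in two different runs" get equal
UID hashes but different pointer hashes, which is why traversal order is
only predictable with the former.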


gcc/
* tree-hash-traits.h (tree_decl_hash): New class.
* tree-ssa-strlen.c: Include tree-hash-traits.h.
(stridxlist_hash_traits): Use tree_decl_hash.

Index: gcc/tree-hash-traits.h
===
--- gcc/tree-hash-traits.h  2015-06-23 15:45:22.993947116 +0100
+++ gcc/tree-hash-traits.h  2015-06-23 15:45:22.989947161 +0100
@@ -39,4 +39,18 @@ tree_operand_hash::equal_keys (const_tre
   return operand_equal_p (t1, t2, 0);
 }
 
+/* Hasher for tree decls.  Pointer equality is enough here, but the DECL_UID
+   is a better hash than the pointer value and gives a predictable traversal
+   order.  */
+struct tree_decl_hash : ggc_ptr_hash 
+{
+  static inline hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_decl_hash::hash (tree t)
+{
+  return DECL_UID (t);
+}
+
 #endif
Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c   2015-06-23 15:45:22.993947116 +0100
+++ gcc/tree-ssa-strlen.c   2015-06-23 15:45:22.989947161 +0100
@@ -73,6 +73,7 @@ the Free Software Foundation; either ver
 #include "ipa-ref.h"
 #include "cgraph.h"
 #include "ipa-chkp.h"
+#include "tree-hash-traits.h"
 
 /* A vector indexed by SSA_NAME_VERSION.  0 means unknown, positive value
is an index into strinfo vector, negative value stands for
@@ -155,20 +156,7 @@ struct decl_stridxlist_map
   struct stridxlist list;
 };
 
-/* stridxlist hashtable helpers.  */
-
-struct stridxlist_hash_traits : default_hashmap_traits
-{
-  static inline hashval_t hash (tree);
-};
-
-/* Hash a from tree in a decl_stridxlist_map.  */
-
-inline hashval_t
-stridxlist_hash_traits::hash (tree item)
-{
-  return DECL_UID (item);
-}
+typedef simple_hashmap_traits <tree_decl_hash> stridxlist_hash_traits;
 
 /* Hash table for mapping decls to a chained list of offset -> idx
mappings.  */



[04/12] Move ssa_name hasher to header file

2015-06-23 Thread Richard Sandiford
Another tree hasher, this time for SSA names.  Again there's only one copy
at the moment, but the idea seems general.


gcc/
* tree-hash-traits.h (tree_ssa_name_hash): New class.
* sese.c: Include tree-hash-traits.h.
(rename_map_hasher): Use tree_ssa_name_hash.

Index: gcc/tree-hash-traits.h
===
--- gcc/tree-hash-traits.h  2015-06-23 15:46:11.453390373 +0100
+++ gcc/tree-hash-traits.h  2015-06-23 15:46:11.449390427 +0100
@@ -53,4 +53,18 @@ tree_decl_hash::hash (tree t)
   return DECL_UID (t);
 }
 
+/* Hash for SSA_NAMEs in the same function.  Pointer equality is enough
+   here, but the SSA_NAME_VERSION is a better hash than the pointer
+   value and gives a predictable traversal order.  */
+struct tree_ssa_name_hash : ggc_ptr_hash 
+{
+  static inline hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_ssa_name_hash::hash (tree t)
+{
+  return SSA_NAME_VERSION (t);
+}
+
 #endif
Index: gcc/sese.c
===
--- gcc/sese.c  2015-06-23 15:46:11.453390373 +0100
+++ gcc/sese.c  2015-06-23 15:46:11.449390427 +0100
@@ -63,6 +63,7 @@ the Free Software Foundation; either ver
 #include "value-prof.h"
 #include "sese.h"
 #include "tree-ssa-propagate.h"
+#include "tree-hash-traits.h"
 
 /* Helper function for debug_rename_map.  */
 
@@ -78,22 +79,7 @@ debug_rename_map_1 (tree_node *const &ol
   return true;
 }
 
-
-/* Hashtable helpers.  */
-
-struct rename_map_hasher : default_hashmap_traits
-{
-  static inline hashval_t hash (tree);
-};
-
-/* Computes a hash function for database element ELT.  */
-
-inline hashval_t
-rename_map_hasher::hash (tree old_name)
-{
-  return SSA_NAME_VERSION (old_name);
-}
-
+typedef simple_hashmap_traits <tree_ssa_name_hash> rename_map_hasher;
 typedef hash_map rename_map_type;
 
 



[05/12] Move TREE_HASH hasher to header file

2015-06-23 Thread Richard Sandiford
One more tree hasher, this time based on TREE_HASH.


gcc/
* tree-hash-traits.h (tree_hash): New class.
* except.c: Include tree-hash-traits.h.
(tree_hash_traits): Use tree_hash.

Index: gcc/tree-hash-traits.h
===
--- gcc/tree-hash-traits.h  2015-06-23 15:47:41.132358999 +0100
+++ gcc/tree-hash-traits.h  2015-06-23 15:47:41.128359041 +0100
@@ -67,4 +67,16 @@ tree_ssa_name_hash::hash (tree t)
   return SSA_NAME_VERSION (t);
 }
 
+/* Hasher for general trees, based on their TREE_HASH.  */
+struct tree_hash : ggc_ptr_hash 
+{
+  static hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_hash::hash (tree t)
+{
+  return TREE_HASH (t);
+}
+
 #endif
Index: gcc/except.c
===
--- gcc/except.c2015-06-23 15:47:41.132358999 +0100
+++ gcc/except.c2015-06-23 15:47:41.128359041 +0100
@@ -161,14 +161,11 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "cfgloop.h"
 #include "builtins.h"
+#include "tree-hash-traits.h"
 
 static GTY(()) int call_site_base;
 
-struct tree_hash_traits : default_hashmap_traits
-{
-  static hashval_t hash (tree t) { return TREE_HASH (t); }
-};
-
+struct tree_hash_traits : simple_hashmap_traits <tree_hash> {};
 static GTY (()) hash_map *type_to_runtime_map;
 
 /* Describe the SjLj_Function_Context structure.  */



[06/12] Consolidate string hashers

2015-06-23 Thread Richard Sandiford
This patch replaces various string hashers with a single copy
in hash-traits.h.
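
For readers outside GCC, the same "hash and compare by content, never free" idea can be sketched with the standard library. Everything below is an assumption for illustration: the FNV-1a loop merely stands in for htab_hash_string, and the *_sketch names and note_name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <unordered_set>

// Content-based hash for C strings.  This FNV-1a loop is only a stand-in
// for htab_hash_string, which is not reproduced here.
struct c_string_hash_sketch
{
  std::size_t operator() (const char *s) const
  {
    std::size_t h = 2166136261u;
    for (; *s != '\0'; ++s)
      h = (h ^ static_cast<unsigned char> (*s)) * 16777619u;
    return h;
  }
};

// Compare the characters, not the pointers.
struct c_string_equal_sketch
{
  bool operator() (const char *a, const char *b) const
  {
    return std::strcmp (a, b) == 0;
  }
};

// The set stores raw pointers and never frees them, in the spirit of
// nofree_string_hash.
typedef std::unordered_set<const char *, c_string_hash_sketch,
                           c_string_equal_sketch> name_set_sketch;

// Return true if NAME was already present; insert it otherwise.
inline bool
note_name (name_set_sketch &names, const char *name)
{
  return !names.insert (name).second;
}
```

Two different buffers holding the same characters count as the same element, which is what each of the replaced per-file hashers implemented by hand.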


gcc/
* hash-traits.h (string_hash, nofree_string_hash): New classes.
* genmatch.c (capture_id_map_hasher): Use nofree_string_hash.
* passes.c (pass_registry_hasher): Likewise.
* config/alpha/alpha.c (string_traits): Likewise.
* config/i386/winnt.c (i386_find_on_wrapper_list): Likewise.
* config/m32c/m32c.c (pragma_traits): Likewise.
* config/mep/mep.c (pragma_traits): Likewise.

gcc/java/
* jcf-io.c (memoized_class_lookups): Use nofree_string_hash.
(find_class): Likewise.

Index: gcc/hash-traits.h
===
--- gcc/hash-traits.h   2015-06-23 15:48:30.751788389 +0100
+++ gcc/hash-traits.h   2015-06-23 15:48:30.743788520 +0100
@@ -121,6 +121,27 @@ pointer_hash ::is_empty (Type *e)
   return e == NULL;
 }
 
+/* Hasher for "const char *" strings, using string rather than pointer
+   equality.  */
+
+struct string_hash : pointer_hash 
+{
+  static inline hashval_t hash (const char *);
+  static inline bool equal (const char *, const char *);
+};
+
+inline hashval_t
+string_hash::hash (const char *id)
+{
+  return htab_hash_string (id);
+}
+
+inline bool
+string_hash::equal (const char *id1, const char *id2)
+{
+  return strcmp (id1, id2) == 0;
+}
+
 /* Remover and marker for entries in gc memory.  */
 
 template
@@ -190,6 +211,11 @@ struct ggc_ptr_hash : pointer_hash ,
 template 
 struct ggc_cache_ptr_hash : pointer_hash , ggc_cache_remove  {};
 
+/* Traits for string elements that should not be freed when an element
+   is deleted.  */
+
+struct nofree_string_hash : string_hash, typed_noop_remove  {};
+
 template  struct default_hash_traits;
 
 template 
Index: gcc/genmatch.c
===
--- gcc/genmatch.c  2015-06-23 15:48:30.751788389 +0100
+++ gcc/genmatch.c  2015-06-23 15:48:30.743788520 +0100
@@ -392,26 +392,7 @@ get_operator (const char *id)
   return 0;
 }
 
-
-/* Helper for the capture-id map.  */
-
-struct capture_id_map_hasher : default_hashmap_traits
-{
-  static inline hashval_t hash (const char *);
-  static inline bool equal_keys (const char *, const char *);
-};
-
-inline hashval_t
-capture_id_map_hasher::hash (const char *id)
-{
-  return htab_hash_string (id);
-}
-
-inline bool
-capture_id_map_hasher::equal_keys (const char *id1, const char *id2)
-{
-  return strcmp (id1, id2) == 0;
-}
+typedef simple_hashmap_traits <nofree_string_hash> capture_id_map_hasher;
 
 typedef hash_map cid_map_t;
 
Index: gcc/passes.c
===
--- gcc/passes.c2015-06-23 15:48:30.751788389 +0100
+++ gcc/passes.c2015-06-23 15:48:30.747788453 +0100
@@ -861,29 +861,7 @@ pass_manager::register_dump_files (opt_p
   while (pass);
 }
 
-/* Helper for pass_registry hash table.  */
-
-struct pass_registry_hasher : default_hashmap_traits
-{
-  static inline hashval_t hash (const char *);
-  static inline bool equal_keys (const char *, const char *);
-};
-
-/* Pass registry hash function.  */
-
-inline hashval_t
-pass_registry_hasher::hash (const char *name)
-{
-  return htab_hash_string (name);
-}
-
-/* Hash equal function  */
-
-inline bool
-pass_registry_hasher::equal_keys (const char *s1, const char *s2)
-{
-  return !strcmp (s1, s2);
-}
+typedef simple_hashmap_traits <nofree_string_hash> pass_registry_hasher;
 
 static hash_map
   *name_to_pass_map;
Index: gcc/config/alpha/alpha.c
===
--- gcc/config/alpha/alpha.c2015-06-23 15:48:30.751788389 +0100
+++ gcc/config/alpha/alpha.c2015-06-23 15:48:30.747788453 +0100
@@ -4808,13 +4808,7 @@ alpha_multipass_dfa_lookahead (void)
 
 struct GTY(()) alpha_links;
 
-struct string_traits : default_hashmap_traits
-{
-  static bool equal_keys (const char *const &a, const char *const &b)
-  {
-return strcmp (a, b) == 0;
-  }
-};
+typedef simple_hashmap_traits <nofree_string_hash> string_traits;
 
 struct GTY(()) machine_function
 {
Index: gcc/config/i386/winnt.c
===
--- gcc/config/i386/winnt.c 2015-06-23 15:48:30.751788389 +0100
+++ gcc/config/i386/winnt.c 2015-06-23 15:48:30.739788568 +0100
@@ -709,29 +709,6 @@ i386_pe_record_stub (const char *name)
 
 #ifdef CXX_WRAP_SPEC_LIST
 
-/* Hashtable helpers.  */
-
-struct wrapped_symbol_hasher : nofree_ptr_hash 
-{
-  static inline hashval_t hash (const char *);
-  static inline bool equal (const char *, const char *);
-  static inline void remove (const char *);
-};
-
-inline hashval_t
-wrapped_symbol_hasher::hash (const char *v)
-{
-  return htab_hash_string (v);
-}
-
-/*  Hash table equality helper function.  */
-
-inline bool
-wrapped_symbol_hasher::equal (const char *x, const char *y)
-{
-  return !strcmp (x, y);
-}
-
 /* Search for a function named TARGET in the list of library wrappers
we are using, returni

[07/12] Use new string hasher for MIPS

2015-06-23 Thread Richard Sandiford
Use the string hasher from patch 6 for MIPS.  I split this out because
local_alias_traits doesn't actually need to use SYMBOL_REF rtxes as
the map keys, since the only data used is the symbol name.
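
A minimal sketch of the "key on the name, create on first use" pattern follows. It is not the MIPS code: symbol_sketch, alias_map_sketch and local_alias_for are hypothetical, std::string keys are used to sidestep the lifetime of raw "const char *" keys, and the name-encoding step of the real function is left out.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a SYMBOL_REF: just a name.
struct symbol_sketch
{
  std::string name;
};

// Function name -> local alias symbol.
typedef std::unordered_map<std::string, symbol_sketch> alias_map_sketch;

// Return the local alias for FUNC, creating it on first use.
inline const symbol_sketch &
local_alias_for (alias_map_sketch &aliases, const symbol_sketch &func)
{
  std::pair<alias_map_sketch::iterator, bool> result
    = aliases.insert (std::make_pair (func.name, symbol_sketch ()));
  if (result.second)
    // First request for this name: build the alias, using the same
    // __fn_local_ prefix as the patch.
    result.first->second.name = "__fn_local_" + func.name;
  return result.first->second;
}
```

Two distinct symbol objects with the same name share one alias, which is the property the map needs once only the name is used as the key.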


gcc/
* config/mips/mips.c (mips16_flip_traits): Use it.
(local_alias_traits, mips16_local_aliases): Convert from a map of
rtxes to a map of symbol names.
(mips16_local_alias): Update accordingly.

Index: gcc/config/mips/mips.c
===
--- gcc/config/mips/mips.c  2015-06-23 15:49:32.187081876 +0100
+++ gcc/config/mips/mips.c  2015-06-23 15:49:32.183081933 +0100
@@ -1265,15 +1265,7 @@ static int mips_register_move_cost (mach
 static unsigned int mips_function_arg_boundary (machine_mode, const_tree);
 static machine_mode mips_get_reg_raw_mode (int regno);
 
-struct mips16_flip_traits : default_hashmap_traits
-{
-  static hashval_t hash (const char *s) { return htab_hash_string (s); }
-  static bool
-  equal_keys (const char *a, const char *b)
-  {
-return !strcmp (a, b);
-  }
-};
+struct mips16_flip_traits : simple_hashmap_traits <nofree_string_hash> {};
 
 /* This hash table keeps track of implicit "mips16" and "nomips16" attributes
for -mflip_mips16.  It maps decl names onto a boolean mode setting.  */
@@ -6601,30 +6593,13 @@ mips_load_call_address (enum mips_call_t
 }
 }
 
-struct local_alias_traits : default_hashmap_traits
-{
-  static hashval_t hash (rtx);
-  static bool equal_keys (rtx, rtx);
-};
+struct local_alias_traits : simple_hashmap_traits <nofree_string_hash> {};
 
 /* Each locally-defined hard-float MIPS16 function has a local symbol
associated with it.  This hash table maps the function symbol (FUNC)
to the local symbol (LOCAL). */
-static GTY (()) hash_map *mips16_local_aliases;
-
-/* Hash table callbacks for mips16_local_aliases.  */
-
-hashval_t
-local_alias_traits::hash (rtx func)
-{
-  return htab_hash_string (XSTR (func, 0));
-}
-
-bool
-local_alias_traits::equal_keys (rtx func1, rtx func2)
-{
-  return rtx_equal_p (func1, func2);
-}
+static GTY (()) hash_map
+  *mips16_local_aliases;
 
 /* FUNC is the symbol for a locally-defined hard-float MIPS16 function.
Return a local alias for it, creating a new one if necessary.  */
@@ -6635,23 +6610,23 @@ mips16_local_alias (rtx func)
   /* Create the hash table if this is the first call.  */
   if (mips16_local_aliases == NULL)
 mips16_local_aliases
-  = hash_map::create_ggc (37);
+  = hash_map::create_ggc (37);
 
   /* Look up the function symbol, creating a new entry if need be.  */
   bool existed;
-  rtx *slot = &mips16_local_aliases->get_or_insert (func, &existed);
+  const char *func_name = XSTR (func, 0);
+  rtx *slot = &mips16_local_aliases->get_or_insert (func_name, &existed);
   gcc_assert (slot != NULL);
 
   if (!existed)
 {
-  const char *func_name, *local_name;
   rtx local;
 
   /* Create a new SYMBOL_REF for the local symbol.  The choice of
 __fn_local_* is based on the __fn_stub_* names that we've
 traditionally used for the non-MIPS16 stub.  */
   func_name = targetm.strip_name_encoding (XSTR (func, 0));
-  local_name = ACONCAT (("__fn_local_", func_name, NULL));
+  const char *local_name = ACONCAT (("__fn_local_", func_name, NULL));
   local = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (local_name));
   SYMBOL_REF_FLAGS (local) = SYMBOL_REF_FLAGS (func) | SYMBOL_FLAG_LOCAL;
 



[08/12] Add common traits for integer hash keys

2015-06-23 Thread Richard Sandiford
Several places define hash traits for integers, using particular integer
values as "empty" and "deleted" markers.  This patch defines them in terms
of a single int_hash class.

I also needed to extend gengtype to accept "+" in template arguments.
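
To make the int_hash idea concrete, here is a standalone sketch modelled on the class in the diff below. The template parameter list is an assumption (the archive dropped it), the return type is simplified to unsigned int, and int_hash_sketch, two_marker_hash and no_delete_hash are hypothetical names.

```cpp
#include <cassert>
#include <climits>

// Integer hasher in which Empty (and optionally Deleted) are spare values
// that never occur as real keys.
template <typename Type, Type Empty, Type Deleted = Empty>
struct int_hash_sketch
{
  static unsigned int hash (Type x) { return static_cast<unsigned int> (x); }
  static bool equal (Type x, Type y) { return x == y; }

  static void mark_empty (Type &x) { x = Empty; }
  static bool is_empty (Type x) { return x == Empty; }

  static void mark_deleted (Type &x)
  {
    // Deleting is only possible when a separate marker was supplied.
    assert (Empty != Deleted);
    x = Deleted;
  }

  static bool is_deleted (Type x)
  {
    return Empty != Deleted && x == Deleted;
  }
};

// Two spare values: entries can be removed.
typedef int_hash_sketch<int, INT_MIN, INT_MIN + 1> two_marker_hash;

// One spare value: entries can never be removed.
typedef int_hash_sketch<int, 0> no_delete_hash;
```

The "INT_MIN + 1" argument shows why a template argument may need to contain "+", which is presumably what the gengtype change accommodates.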


gcc/
* gengtype-parse.c (require_template_declaration): Allow '+' in
template parameters.  Consolidate cases.
* hash-traits.h (int_hash): New class.
* alias.c (alias_set_hash): New structure.
(alias_set_traits): Use it.
* symbol-summary.h (function_summary::map_hash): New class.
(function_summary::summary_hashmap_traits): Use it.
* tree-inline.h (dependence_hash): New class.
(dependence_hasher): Use it.
* tree-ssa-reassoc.c (oecount_hasher): Use int_hash.
* value-prof.c (profile_id_hash): New class.
(profile_id_traits): Use it.

Index: gcc/gengtype-parse.c
===
--- gcc/gengtype-parse.c2015-06-23 15:50:56.686110247 +0100
+++ gcc/gengtype-parse.c2015-06-23 15:50:56.678110339 +0100
@@ -274,17 +274,13 @@ require_template_declaration (const char
  str = concat (str, "enum ", (char *) 0);
  continue;
}
-  if (token () == NUM)
+  if (token () == NUM
+ || token () == ':'
+ || token () == '+')
{
  str = concat (str, advance (), (char *) 0);
  continue;
}
-  if (token () == ':')
-   {
- advance ();
- str = concat (str, ":", (char *) 0);
- continue;
-   }
   if (token () == '<')
{
  advance ();
Index: gcc/hash-traits.h
===
--- gcc/hash-traits.h   2015-06-23 15:50:56.686110247 +0100
+++ gcc/hash-traits.h   2015-06-23 15:50:56.674110387 +0100
@@ -57,6 +57,68 @@ typed_noop_remove ::remove (Type &
 }
 
 
+/* Hasher for integer type Type in which Empty is a spare value that can be
+   used to mark empty slots.  If Deleted != Empty then Deleted is another
+   spare value that can be used for deleted slots; if Deleted == Empty then
+   hash table entries cannot be deleted.  */
+
+template 
+struct int_hash : typed_noop_remove 
+{
+  typedef Type value_type;
+  typedef Type compare_type;
+
+  static inline hashval_t hash (value_type);
+  static inline bool equal (value_type existing, value_type candidate);
+  static inline void mark_deleted (Type &);
+  static inline void mark_empty (Type &);
+  static inline bool is_deleted (Type);
+  static inline bool is_empty (Type);
+};
+
+template 
+inline hashval_t
+int_hash ::hash (value_type x)
+{
+  return x;
+}
+
+template 
+inline bool
+int_hash ::equal (value_type x, value_type y)
+{
+  return x == y;
+}
+
+template 
+inline void
+int_hash ::mark_deleted (Type &x)
+{
+  gcc_assert (Empty != Deleted);
+  x = Deleted;
+}
+
+template 
+inline void
+int_hash ::mark_empty (Type &x)
+{
+  x = Empty;
+}
+
+template 
+inline bool
+int_hash ::is_deleted (Type x)
+{
+  return Empty != Deleted && x == Deleted;
+}
+
+template 
+inline bool
+int_hash ::is_empty (Type x)
+{
+  return x == Empty;
+}
+
 /* Pointer hasher based on pointer equality.  Other types of pointer hash
can inherit this and override the hash and equal functions with some
other form of equality (such as string equality).  */
Index: gcc/alias.c
===
--- gcc/alias.c 2015-06-23 15:50:56.686110247 +0100
+++ gcc/alias.c 2015-06-23 15:50:56.678110339 +0100
@@ -143,31 +143,8 @@ Software Foundation; either version 3, o
However, this is no actual entry for alias set zero.  It is an
error to attempt to explicitly construct a subset of zero.  */
 
-struct alias_set_traits : default_hashmap_traits
-{
-  template
-  static bool
-  is_empty (T &e)
-  {
-return e.m_key == INT_MIN;
-  }
-
-  template
-  static bool
-  is_deleted (T &e)
-  {
-return e.m_key == (INT_MIN + 1);
-  }
-
-  template static void mark_empty (T &e) { e.m_key = INT_MIN; }
-
-  template
-  static void
-  mark_deleted (T &e)
-  {
-e.m_key = INT_MIN + 1;
-  }
-};
+struct alias_set_hash : int_hash <int, INT_MIN, INT_MIN + 1> {};
+struct alias_set_traits : simple_hashmap_traits <alias_set_hash> {};
 
 struct GTY(()) alias_set_entry_d {
   /* The alias set number, as stored in MEM_ALIAS_SET.  */
Index: gcc/symbol-summary.h
===
--- gcc/symbol-summary.h2015-06-23 15:50:56.686110247 +0100
+++ gcc/symbol-summary.h2015-06-23 15:50:56.678110339 +0100
@@ -200,45 +200,8 @@ class GTY((user)) function_summary 
   bool m_ggc;
 
 private:
-  struct summary_hashmap_traits: default_hashmap_traits
-  {
-static const int deleted_value = -1;
-static const int empty_value = 0;
-
-static hashval_t
-hash (const int v)
-{
-  return (hashval_t)v;
-}
-
-template
-static bool
-is_deleted (Type &e)
-{
-  return 

[09/12] Remove all but one use of default_hashmap_traits

2015-06-23 Thread Richard Sandiford
After the previous patches in the series, there are three remaining hash
traits that use the key to represent empty and deleted entries.  This patch
makes them use simple_hashmap_traits.
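
The delegation that simple_hashmap_traits performs can be sketched as follows: the map traits forward every question about an entry to a hash_table-style hasher applied to the entry's key. This is an illustration under assumptions, not the header's code; ptr_key_hash_sketch, simple_map_traits_sketch and entry_sketch are hypothetical, and only the m_key/m_value field names follow the diffs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical key hasher: hashes, compares and marks pointer keys.
struct ptr_key_hash_sketch
{
  typedef const int *value_type;

  static std::uintptr_t hash (value_type p)
  {
    return reinterpret_cast<std::uintptr_t> (p) >> 3;
  }
  static bool equal (value_type a, value_type b) { return a == b; }
  static void mark_empty (value_type &p) { p = NULL; }
  static bool is_empty (value_type p) { return p == NULL; }
  static void mark_deleted (value_type &p)
  {
    p = reinterpret_cast<value_type> (1);
  }
  static bool is_deleted (value_type p)
  {
    return p == reinterpret_cast<value_type> (1);
  }
};

// Map traits that represent empty and deleted entries through the key.
template <typename H>
struct simple_map_traits_sketch
{
  typedef typename H::value_type key_type;

  static std::uintptr_t hash (const key_type &k) { return H::hash (k); }
  static bool equal_keys (const key_type &a, const key_type &b)
  {
    return H::equal (a, b);
  }
  template <typename Entry> static bool is_empty (const Entry &e)
  {
    return H::is_empty (e.m_key);
  }
  template <typename Entry> static bool is_deleted (const Entry &e)
  {
    return H::is_deleted (e.m_key);
  }
  template <typename Entry> static void mark_empty (Entry &e)
  {
    H::mark_empty (e.m_key);
  }
  template <typename Entry> static void mark_deleted (Entry &e)
  {
    H::mark_deleted (e.m_key);
  }
};

// Hypothetical map entry with the field names used in the diffs.
struct entry_sketch
{
  const int *m_key;
  int m_value;
};
```

With this split, a hasher written once for a hash_table can be reused for a map without restating the empty and deleted logic.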


gcc/
* ipa-icf.h (symbol_compare_hash): New class.
(symbol_compare_hashmap_traits): Use it.
* mem-stats.h (mem_alloc_description::mem_location_hash): New class.
(mem_alloc_description::mem_alloc_hashmap_traits): Use it.
(mem_alloc_description::reverse_mem_map_t): Remove redundant
default_hashmap_traits.
* sanopt.c (sanopt_tree_triplet_hash): New class.
(sanopt_tree_triplet_map_traits): Use it.

Index: gcc/ipa-icf.h
===
--- gcc/ipa-icf.h   2015-06-23 15:52:24.937095524 +0100
+++ gcc/ipa-icf.h   2015-06-23 15:52:24.929095617 +0100
@@ -87,10 +87,10 @@ enum sem_item_type
 
 /* Hash traits for symbol_compare_collection map.  */
 
-struct symbol_compare_hashmap_traits: default_hashmap_traits
+struct symbol_compare_hash : nofree_ptr_hash 
 {
   static hashval_t
-  hash (const symbol_compare_collection *v)
+  hash (value_type v)
   {
 inchash::hash hstate;
 hstate.add_int (v->m_references.length ());
@@ -107,8 +107,7 @@ struct symbol_compare_hashmap_traits: de
   }
 
   static bool
-  equal_keys (const symbol_compare_collection *a,
- const symbol_compare_collection *b)
+  equal (value_type a, value_type b)
   {
 if (a->m_references.length () != b->m_references.length ()
|| a->m_interposables.length () != b->m_interposables.length ())
@@ -126,6 +125,8 @@ struct symbol_compare_hashmap_traits: de
 return true;
   }
 };
+typedef simple_hashmap_traits <symbol_compare_hash>
+  symbol_compare_hashmap_traits;
 
 
 /* Semantic item usage pair.  */
Index: gcc/mem-stats.h
===
--- gcc/mem-stats.h 2015-06-23 15:52:24.937095524 +0100
+++ gcc/mem-stats.h 2015-06-23 15:52:24.929095617 +0100
@@ -238,10 +238,10 @@ struct mem_usage_pair
 class mem_alloc_description
 {
 public:
-  struct mem_alloc_hashmap_traits: default_hashmap_traits
+  struct mem_location_hash : nofree_ptr_hash 
   {
 static hashval_t
-hash (const mem_location *l)
+hash (value_type l)
 {
inchash::hash hstate;
 
@@ -253,18 +253,18 @@ struct mem_usage_pair
 }
 
 static bool
-equal_keys (const mem_location *l1, const mem_location *l2)
+equal (value_type l1, value_type l2)
 {
   return l1->m_filename == l2->m_filename
&& l1->m_function == l2->m_function
&& l1->m_line == l2->m_line;
 }
   };
+  typedef simple_hashmap_traits <mem_location_hash> mem_alloc_hashmap_traits;
 
   /* Internal class type definitions.  */
   typedef hash_map  mem_map_t;
-  typedef hash_map , default_hashmap_traits>
-reverse_mem_map_t;
+  typedef hash_map  > reverse_mem_map_t;
   typedef hash_map  > 
reverse_object_map_t;
   typedef std::pair  mem_list_t;
 
Index: gcc/sanopt.c
===
--- gcc/sanopt.c2015-06-23 15:52:24.937095524 +0100
+++ gcc/sanopt.c2015-06-23 15:52:24.929095617 +0100
@@ -109,8 +109,11 @@ struct sanopt_tree_triplet
 
 /* Traits class for tree triplet hash maps below.  */
 
-struct sanopt_tree_triplet_map_traits : default_hashmap_traits
+struct sanopt_tree_triplet_hash : typed_noop_remove 
 {
+  typedef sanopt_tree_triplet value_type;
+  typedef sanopt_tree_triplet compare_type;
+
   static inline hashval_t
   hash (const sanopt_tree_triplet &ref)
   {
@@ -122,41 +125,39 @@ struct sanopt_tree_triplet_map_traits :
   }
 
   static inline bool
-  equal_keys (const sanopt_tree_triplet &ref1, const sanopt_tree_triplet &ref2)
+  equal (const sanopt_tree_triplet &ref1, const sanopt_tree_triplet &ref2)
   {
 return operand_equal_p (ref1.t1, ref2.t1, 0)
   && operand_equal_p (ref1.t2, ref2.t2, 0)
   && operand_equal_p (ref1.t3, ref2.t3, 0);
   }
 
-  template
   static inline void
-  mark_deleted (T &e)
+  mark_deleted (sanopt_tree_triplet &ref)
   {
-e.m_key.t1 = reinterpret_cast (1);
+ref.t1 = reinterpret_cast (1);
   }
 
-  template
   static inline void
-  mark_empty (T &e)
+  mark_empty (sanopt_tree_triplet &ref)
   {
-e.m_key.t1 = NULL;
+ref.t1 = NULL;
   }
 
-  template
   static inline bool
-  is_deleted (T &e)
+  is_deleted (const sanopt_tree_triplet &ref)
   {
-return e.m_key.t1 == (void *) 1;
+return ref.t1 == (void *) 1;
   }
 
-  template
   static inline bool
-  is_empty (T &e)
+  is_empty (const sanopt_tree_triplet &ref)
   {
-return e.m_key.t1 == NULL;
+return ref.t1 == NULL;
   }
 };
+typedef simple_hashmap_traits <sanopt_tree_triplet_hash>
+  sanopt_tree_triplet_map_traits;
 
 /* This is used to carry various hash maps and variables used
in sanopt_optimize_walker.  */



[10/12] Add helper class for valued-based empty and deleted slots

2015-06-23 Thread Richard Sandiford
part_traits in cfgexpand.c needs to use the value rather than the key to
represent empty and deleted slots.  What it's doing is pretty generic,
so this patch adds a helper class to hash-map-traits.h.
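
The following sketch shows why marking slots through the value is useful: every key, including 0, stays available as a real key. It hard-wires pointer values (NULL for empty, address 1 for deleted), as the removed part_traits did, whereas the helper class in the patch delegates to the default hash traits of the value type. The names value_marked_traits_sketch and part_entry_sketch are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Map traits for integer keys with no spare values.  Empty and deleted
// slots are recorded in the value, assumed here to be a Pointee pointer.
template <typename Key, typename Pointee>
struct value_marked_traits_sketch
{
  static unsigned int hash (Key k) { return static_cast<unsigned int> (k); }
  static bool equal_keys (Key a, Key b) { return a == b; }

  template <typename Entry> static bool is_empty (const Entry &e)
  {
    return e.m_value == NULL;
  }
  template <typename Entry> static bool is_deleted (const Entry &e)
  {
    return e.m_value == reinterpret_cast<Pointee *> (1);
  }
  template <typename Entry> static void mark_empty (Entry &e)
  {
    e.m_value = NULL;
  }
  template <typename Entry> static void mark_deleted (Entry &e)
  {
    e.m_value = reinterpret_cast<Pointee *> (1);
  }
};

// Hypothetical entry type with the field names used in the diffs.
struct part_entry_sketch
{
  std::size_t m_key;
  int *m_value;
};
```

The cost is that NULL and address 1 can no longer be stored as ordinary values, which is acceptable when the values are always pointers to real objects.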


gcc/
* hash-map-traits.h (unbounded_hashmap_traits): New class.
(unbounded_int_hashmap_traits): Likewise.
* cfgexpand.c (part_traits): Use unbounded_int_hashmap_traits.

Index: gcc/hash-map-traits.h
===
--- gcc/hash-map-traits.h   2015-06-23 15:54:04.515950631 +0100
+++ gcc/hash-map-traits.h   2015-06-23 15:54:04.511950679 +0100
@@ -174,4 +174,84 @@ simple_hashmap_traits ::mark_deleted
   H::mark_deleted (entry.m_key);
 }
 
+/* Implement traits for a hash_map with values of type Value for cases
+   in which the key cannot represent empty and deleted slots.  Instead
+   record empty and deleted entries in Value.  Derived classes must
+   implement the hash and equal_keys functions.  */
+
+template 
+struct unbounded_hashmap_traits
+{
+  template  static inline void remove (T &);
+  template  static inline bool is_empty (const T &);
+  template  static inline bool is_deleted (const T &);
+  template  static inline void mark_empty (T &);
+  template  static inline void mark_deleted (T &);
+};
+
+template 
+template 
+inline void
+unbounded_hashmap_traits ::remove (T &entry)
+{
+  default_hash_traits ::remove (entry.m_value);
+}
+
+template 
+template 
+inline bool
+unbounded_hashmap_traits ::is_empty (const T &entry)
+{
+  return default_hash_traits ::is_empty (entry.m_value);
+}
+
+template 
+template 
+inline bool
+unbounded_hashmap_traits ::is_deleted (const T &entry)
+{
+  return default_hash_traits ::is_deleted (entry.m_value);
+}
+
+template 
+template 
+inline void
+unbounded_hashmap_traits ::mark_empty (T &entry)
+{
+  default_hash_traits ::mark_empty (entry.m_value);
+}
+
+template 
+template 
+inline void
+unbounded_hashmap_traits ::mark_deleted (T &entry)
+{
+  default_hash_traits ::mark_deleted (entry.m_value);
+}
+
+/* Implement traits for a hash_map from integer type Key to Value in
+   cases where Key has no spare values for recording empty and deleted
+   slots.  */
+
+template 
+struct unbounded_int_hashmap_traits : unbounded_hashmap_traits 
+{
+  static inline hashval_t hash (Key);
+  static inline bool equal_keys (Key, Key);
+};
+
+template 
+inline hashval_t
+unbounded_int_hashmap_traits ::hash (Key k)
+{
+  return k;
+}
+
+template 
+inline bool
+unbounded_int_hashmap_traits ::equal_keys (Key k1, Key k2)
+{
+  return k1 == k2;
+}
+
 #endif // HASH_MAP_TRAITS_H
Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c 2015-06-23 15:54:04.515950631 +0100
+++ gcc/cfgexpand.c 2015-06-23 15:54:04.511950679 +0100
@@ -612,25 +612,7 @@ stack_var_cmp (const void *a, const void
   return 0;
 }
 
-struct part_traits : default_hashmap_traits
-{
-  template
-static bool
-is_deleted (T &e)
-{ return e.m_value == reinterpret_cast (1); }
-
-  template static bool is_empty (T &e) { return e.m_value == NULL; }
-  template
-static void
-mark_deleted (T &e)
-{ e.m_value = reinterpret_cast (1); }
-
-  template
-static void
-mark_empty (T &e)
-  { e.m_value = NULL; }
-};
-
+struct part_traits : unbounded_int_hashmap_traits  {};
 typedef hash_map part_hashmap;
 
 /* If the points-to solution *PI points to variables that are in a partition



Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-23 Thread Ilya Verbin
On Tue, Jun 23, 2015 at 13:51:39 +0200, Jakub Jelinek wrote:
> > +  /* Set dd on target to 0 for the further check.  */
> > +  #pragma omp target map(always to: dd)
> > +   { dd; }
> 
> This reminds me that:
>   if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN))
> remove = true;
> in gimplify.c is not what we want; if it has GOMP_MAP_KIND_ALWAYS,
> then we shouldn't remove it even when it is not mentioned inside of the
> region's body, because it then has side-effects.

OK for gomp-4_1-branch?


gcc/
* gimplify.c (gimplify_adjust_omp_clauses): Don't remove map clause if
it has map-type-modifier always.
libgomp/
* testsuite/libgomp.c/target-11.c (main): Remove dd from target region.


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 9b2347a..74fe60b 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6870,7 +6870,8 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree *list_p)
  if (!DECL_P (decl))
break;
  n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
- if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN))
+ if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN)
+ && !(OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_FLAG_ALWAYS))
remove = true;
  else if (DECL_SIZE (decl)
   && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST
diff --git a/libgomp/testsuite/libgomp.c/target-11.c b/libgomp/testsuite/libgomp.c/target-11.c
index 4562d88..0fd183b 100644
--- a/libgomp/testsuite/libgomp.c/target-11.c
+++ b/libgomp/testsuite/libgomp.c/target-11.c
@@ -13,7 +13,7 @@ int main ()
 
   /* Set dd on target to 0 for the further check.  */
   #pragma omp target map(always to: dd)
-   { dd; }
+   ;
 
   dd = 1;
   #pragma omp target map(tofrom: aa) map(always to: bb) \


  -- Ilya


[11/12] Remove default_hashmap_traits

2015-06-23 Thread Richard Sandiford
The previous patches removed all uses of default_hashmap_traits,
so this patch deletes the definition.


gcc/
* hash-map-traits.h (default_hashmap_traits): Delete.

Index: gcc/hash-map-traits.h
===
--- gcc/hash-map-traits.h   2015-06-23 15:55:43.054817986 +0100
+++ gcc/hash-map-traits.h   2015-06-23 15:55:43.050818009 +0100
@@ -25,84 +25,6 @@ #define HASH_MAP_TRAITS_H
 
 #include "hash-traits.h"
 
-/* implement default behavior for traits when types allow it.  */
-
-struct default_hashmap_traits
-{
-  /* Hashes the passed in key.  */
-
-  template
-  static hashval_t
-  hash (T *p)
-{
-  return uintptr_t (p) >> 3;
-}
-
-  /* If the value converts to hashval_t just use it.  */
-
-  template static hashval_t hash (T v) { return v; }
-
-  /* Return true if the two keys passed as arguments are equal.  */
-
-  template
-  static bool
-  equal_keys (const T &a, const T &b)
-{
-  return a == b;
-}
-
-  /* Called to dispose of the key and value before marking the entry as
- deleted.  */
-
-  template static void remove (T &v) { v.~T (); }
-
-  /* Mark the passed in entry as being deleted.  */
-
-  template
-  static void
-  mark_deleted (T &e)
-{
-  mark_key_deleted (e.m_key);
-}
-
-  /* Mark the passed in entry as being empty.  */
-
-  template
-  static void
-  mark_empty (T &e)
-{
-  mark_key_empty (e.m_key);
-}
-
-  /* Return true if the passed in entry is marked as deleted.  */
-
-  template
-  static bool
-  is_deleted (T &e)
-{
-  return e.m_key == (void *)1;
-}
-
-  /* Return true if the passed in entry is marked as empty.  */
-
-  template static bool is_empty (T &e) { return e.m_key == NULL; }
-
-private:
-  template
-  static void
-  mark_key_deleted (T *&k)
-{
-  k = reinterpret_cast (1);
-}
-
-  template
-  static void
-  mark_key_empty (T *&k)
-{
-  k = static_cast (0);
-}
-};
-
 /* Implement hash_map traits for a key with hash traits H.  Empty and
deleted map entries are represented as empty and deleted keys.  */
 




[12/12] Simplify uses of hash_map

2015-06-23 Thread Richard Sandiford
At this point all hash_map traits know what kind of key they're
dealing with, so we can make that a traits typedef, like it is for
hash_table traits.  Then, if we make the default hash traits for
T be T, we can use hash_table-style traits as the first template
parameter to hash_map, without the need for a third.  That is, if
foo_hash hashes elements of type foo_type:

  typedef simple_hashmap_traits <foo_hash> foo_traits;
  hash_map <foo_type, value_type, foo_traits> x;

becomes just:

  hash_map <foo_hash, value_type> x;

just like a hash_table of foo_types would be:

  hash_table <foo_hash> y;

This patch makes that simplification.
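
The mechanism can be sketched outside GCC as follows: if the default traits for T are T itself, a hasher class can be passed directly as the first template parameter, and the key type is then read from the traits. This is a toy model under assumptions; name_hash_sketch, default_traits_sketch and map_front_sketch are hypothetical, and no real table is implemented.

```cpp
#include <cassert>
#include <cstring>

// hash_table-style hasher: it names the element type it hashes.
struct name_hash_sketch
{
  typedef const char *value_type;

  static unsigned int hash (value_type s)
  {
    unsigned int h = 0;
    for (; *s != '\0'; ++s)
      h = h * 31u + static_cast<unsigned char> (*s);
    return h;
  }
  static bool equal (value_type a, value_type b)
  {
    return std::strcmp (a, b) == 0;
  }
};

// By default, the traits for T are T itself...
template <typename T>
struct default_traits_sketch : T {};

// ...with explicit specializations for types that are not hashers.
template <>
struct default_traits_sketch<int>
{
  typedef int value_type;
  static unsigned int hash (int x) { return static_cast<unsigned int> (x); }
  static bool equal (int a, int b) { return a == b; }
};

// Toy map front end: it only shows how the key type and the comparison
// are derived from the first template parameter.
template <typename KeyId, typename Value>
struct map_front_sketch
{
  typedef default_traits_sketch<KeyId> traits;
  typedef typename traits::value_type key_type;

  static bool same_key (key_type a, key_type b)
  {
    return traits::equal (a, b);
  }
};
```

Both a plain key type (int) and a hasher (name_hash_sketch) work as the first parameter, so no separate traits argument is needed.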


gcc/
* hash-map-traits.h (simple_hashmap_traits::key_type): New typedef.
(unbounded_int_hashmap_traits::key_type): Likewise.
* hash-map.h (hash_map): Get the key type from the traits.
* hash-traits.h (default_hash_traits): By default, inherit from the
template parameter.
* alias.c (alias_set_traits): Delete.
(alias_set_entry_d::children): Use alias_set_hash as the first
template parameter.
(record_alias_subset): Update accordingly.
* except.c (tree_hash_traits): Delete.
(type_to_runtime_map): Use tree_hash as the first template parameter.
(init_eh): Update accordingly.
* genmatch.c (capture_id_map_hasher): Delete.
(cid_map_t): Use nofree_string_hash as first template parameter.
* ipa-icf.h (symbol_compare_hashmap_traits): Delete.
* ipa-icf.c (sem_item_optimizer::subdivide_classes_by_sensitive_refs):
Use symbol_compare_hash as the first template parameter in
subdivide_hash_map.
* mem-stats.h (mem_usage_pair::mem_alloc_hashmap_traits): Delete.
(mem_usage_pair::mem_map_t): Use mem_location_hash as the first
template parameter.
* passes.c (pass_registry_hasher): Delete.
(name_to_pass_map): Use nofree_string_hash as the first template
parameter.
(register_pass_name): Update accordingly.
* sanopt.c (sanopt_tree_map_traits): Delete.
(sanopt_tree_triplet_map_traits): Delete.
(sanopt_ctx::asan_check_map): Use tree_operand_hash as the first
template parameter.
(sanopt_ctx::vptr_check_map): Use sanopt_tree_triplet_hash as
the first template parameter.
* sese.c (rename_map_hasher): Delete.
(rename_map_type): Use tree_ssa_name_hash as the first template
parameter.
* symbol-summary.h (function_summary::summary_hashmap_traits): Delete.
(function_summary::m_map): Use map_hash as the first template
parameter.
(function_summary::release): Update accordingly.
* tree-if-conv.c (phi_args_hash_traits): Delete.
(predicate_scalar_phi): Use tree_operand_hash as the first template
parameter to phi_arg_map.
* tree-inline.h (dependence_hasher): Delete.
(copy_body_data::dependence_map): Use dependence_hash as the first
template parameter.
* tree-inline.c (remap_dependence_clique): Update accordingly.
* tree-ssa-strlen.c (stridxlist_hash_traits): Delete.
(decl_to_stridxlist_htab): Use tree_decl_hash as the first template
parameter.
(addr_stridxptr): Update accordingly.
* value-prof.c (profile_id_traits): Delete.
(cgraph_node_map): Use profile_id_hash as the first template
parameter.
(init_node_map): Update accordingly.
* config/alpha/alpha.c (string_traits): Delete.
(machine_function::links): Use nofree_string_hash as the first
template parameter.
(alpha_use_linkage, alpha_write_linkage): Update accordingly.
* config/m32c/m32c.c (pragma_traits): Delete.
(pragma_htab): Use nofree_string_hash as the first template parameter.
(m32c_note_pragma_address): Update accordingly.
* config/mep/mep.c (pragma_traits): Delete.
(pragma_htab): Use nofree_string_hash as the first template parameter.
(mep_note_pragma_flag): Update accordingly.
* config/mips/mips.c (mips16_flip_traits): Delete.
(mflip_mips16_htab): Use nofree_string_hash as the first template
parameter.
(mflip_mips16_use_mips16_p): Update accordingly.
(local_alias_traits): Delete.
(mips16_local_aliases): Use nofree_string_hash as the first template
parameter.
(mips16_local_alias): Update accordingly.

Index: gcc/hash-map-traits.h
===
--- gcc/hash-map-traits.h   2015-06-23 15:56:38.990174759 +0100
+++ gcc/hash-map-traits.h   2015-06-23 15:56:38.986174805 +0100
@@ -31,9 +31,9 @@ #define HASH_MAP_TRAITS_H
 template 
 struct simple_hashmap_traits
 {
-  static inline hashval_t hash (const typename H::value_type &);
-  static inline bool equal_keys (const typename H::value_type &,
-const typename H::value_type &);
+  typedef typename H::value_type key_type;
+  static inline has

Re: [PATCH] Add CFI entries for ARM Linux idiv0 / ldiv0

2015-06-23 Thread Ramana Radhakrishnan



On 16/06/15 22:25, James Lemke wrote:

> A divide by zero exception was not giving a proper traceback for LINUX
> ARM_EABI.  The attached patch fixes the problem on trunk (and several
> local branches).
>
> Tested on gcc-trunk for arm-none-linux-gnueabi.
>
> OK to commit?
>
> 2015-06-16  James Lemke
>
> libgcc/config/arm/
> * lib1funcs.S (aeabi_idiv0, aeabi_ldiv0): Add CFI entries for
> Linux ARM_EABI.


s/for Linux ARM EABI//

given you handle both __ARM_EABI__ and the not __ARM_EABI__ targets in 
the source.


This is OK if no regressions.

Thanks,
Ramana


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 05:54:48PM +0300, Ilya Verbin wrote:
> On Tue, Jun 23, 2015 at 13:51:39 +0200, Jakub Jelinek wrote:
> > > +  /* Set dd on target to 0 for the further check.  */
> > > +  #pragma omp target map(always to: dd)
> > > + { dd; }
> > 
> > This reminds me that:
> >   if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN))
> > remove = true;
> > in gimplify.c is not what we want; if it has GOMP_MAP_KIND_ALWAYS,
> > then we shouldn't remove it even when it is not mentioned inside of the
> > region's body, because it then has side-effects.
> 
> OK for gomp-4_1-branch?
> 
> 
> gcc/
>   * gimplify.c (gimplify_adjust_omp_clauses): Don't remove map clause if
>   it has map-type-modifier always.
> libgomp/
>   * testsuite/libgomp.c/target-11.c (main): Remove dd from target region.

GOMP_MAP_RELEASE uses GOMP_MAP_FLAG_ALWAYS for something different from
always, because "always release" and "always delete" are not meaningful.
But as neither release nor delete can appear on a map clause in a target
region, it doesn't matter (at least for now).
So the patch is ok, thanks.
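
For reference, the removal condition after the patch can be modelled as a small predicate. This is only a sketch: the flag values and the names MAP_FLAG_ALWAYS_SKETCH, MAP_KIND_TO_SKETCH and should_drop_map_clause are hypothetical, and the later branches of the real if/else chain are not modelled.

```cpp
#include <cassert>

// Hypothetical flag values, chosen only for this sketch; the real
// constants live in GCC's headers and are not reproduced here.
const unsigned int MAP_FLAG_ALWAYS_SKETCH = 1u << 6;
const unsigned int MAP_KIND_TO_SKETCH = 1u;

// A map clause on a target region is dropped only if the variable is not
// used in the region body and the clause lacks the "always" modifier,
// because an "always" mapping has side effects of its own.
inline bool
should_drop_map_clause (bool in_target_region, bool seen_in_body,
                        unsigned int map_kind)
{
  return in_target_region
         && !seen_in_body
         && !(map_kind & MAP_FLAG_ALWAYS_SKETCH);
}
```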

Jakub


[committed] Use abort in parloops-exit-first-loop-alt-{3,4}.c

2015-06-23 Thread Tom de Vries

Hi,

committed attached patch as trivial.

Thanks,
- Tom
Use abort in parloops-exit-first-loop-alt-{3,4}.c

2015-06-23  Tom de Vries  

	* testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c (main): Use
	abort.
	* testsuite/libgomp.c/parloops-exit-first-loop-alt-4.c (main): Same.
---
 libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c | 9 ++++++++-
 libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-4.c | 6 +++++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c
index 43b9194..cb5bf9c 100644
--- a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c
+++ b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c
@@ -3,6 +3,8 @@
 
 /* Variable bound, reduction.  */
 
+#include <stdlib.h>
+
 #define N 4000
 
 unsigned int *a;
@@ -25,9 +27,14 @@ main (void)
   unsigned int res;
   unsigned int array[N];
   int i;
+
   for (i = 0; i < N; ++i)
 array[i] = i % 7;
   a = &array[0];
+
   res = f (N);
-  return !(res == 11995);
+  if (res != 11995)
+abort ();
+
+  return 0;
 }
diff --git a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-4.c b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-4.c
index 8599a89..ac420fa 100644
--- a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-4.c
+++ b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-4.c
@@ -3,6 +3,8 @@
 
 /* Constant bound, reduction.  */
 
+#include <stdlib.h>
+
 #define N 4000
 
 unsigned int *a;
@@ -29,5 +31,7 @@ main (void)
 array[i] = i % 7;
   a = &array[0];
   res = f ();
-  return !(res == 11995);
+  if (res != 11995)
+abort ();
+  return 0;
 }
-- 
1.9.1



Re: [patch] Delete temporary response file

2015-06-23 Thread Jeff Law

On 06/22/2015 11:37 AM, Eric Botcazou wrote:

Hi,

when you pass a response file at link time and you use the GNU linker, then
collect2 creates another, temporary response file and passes it to the linker.
But it fails to delete the file after it is done.  This can easily be seen
with the following manipulation:

eric@polaris:~/build/gcc/native> cat t.c
int main (void) { return 0; }
eric@polaris:~/build/gcc/native> cat t.resp
-L/usr/lib64
eric@polaris:~/build/gcc/native> gcc -c t.c
eric@polaris:~/build/gcc/native> export TMPDIR=$PWD
eric@polaris:~/build/gcc/native> gcc -o t t.o @t.resp
eric@polaris:~/build/gcc/native> ls cc*
ccVSQ6W5

The problem is that do_wait is not invoked by tlink_execute, only collect_wait
is, so the cleanup code present therein is never invoked.
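
The idea of moving the cleanup into the one routine that every caller reaches can be sketched as follows. This is a minimal illustration in standard C, not collect2's code: the function names and the file name passed in are assumptions made for the example.

```c
#include <stdio.h>

/* Path of the temporary response file, or NULL if none is pending.
   Hypothetical state for this sketch.  */
static const char *response_file = NULL;

/* Write the linker arguments to a temporary response file.  */
int
write_response_file (const char *path, const char *args)
{
  FILE *f = fopen (path, "w");
  if (f == NULL)
    return -1;
  fputs (args, f);
  fclose (f);
  response_file = path;
  return 0;
}

/* Common wait routine.  Every wrapper ends up here, so doing the
   cleanup at this level means no caller can skip it.  */
int
wait_for_child (int status)
{
  if (response_file != NULL)
    {
      remove (response_file);
      response_file = NULL;
    }
  return status;
}

/* Helper for checking the result: 1 if PATH can be opened.  */
int
file_exists (const char *path)
{
  FILE *f = fopen (path, "r");
  if (f == NULL)
    return 0;
  fclose (f);
  return 1;
}
```

If the unlink lived only in a higher-level wrapper, any caller using the lower-level wait directly would leave the file behind, which matches the behaviour shown in the transcript.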

Tested on x86_64-suse-linux, OK for the mainline?


2015-06-22  Tristan Gingold  

* collect2.c (collect_wait): Unlink the response file here instead of...
(do_wait): ...here.
(utils_cleanup): ...and here.

OK.
jeff


Re: [PATCH] c/66516 - missing diagnostic on taking the address of a builtin function

2015-06-23 Thread Martin Sebor

On 06/23/2015 04:29 AM, Jakub Jelinek wrote:

On Tue, Jun 23, 2015 at 12:18:30PM +0200, Marek Polacek wrote:

Is it intended that programs be able to take the address of
the builtins that correspond to libc functions and make calls
to the underlying libc functions via such pointers? (If so,
the patch will need some tweaking.)


I don't think so, at least clang doesn't allow e.g.
size_t (*fp) (const char *) = __builtin_strlen;


Well, clang is irrelevant here, __builtin_strlen etc. is a GNU
extension, so it matters what we decide about it.  As this used to work
for decades (if the builtin function has a libc fallback), suddenly
rejecting it could break various programs that e.g. just
#define strlen __builtin_strlen
or similar.  Can't we really reject it just for the functions
that don't have a unique fallback?


Let me look into it.

Martin


Re: [Patch SRA] Fix PR66119 by calling get_move_ratio in SRA

2015-06-23 Thread James Greenhalgh

On Tue, Jun 23, 2015 at 09:52:01AM +0100, Jakub Jelinek wrote:
> On Tue, Jun 23, 2015 at 09:18:52AM +0100, James Greenhalgh wrote:
> > This patch fixes the issue by always calling get_move_ratio in the SRA
> > code, ensuring that an up-to-date value is used.
> >
> > Unfortunately, this means we have to use 0 as a sentinel value for
> > the parameter - indicating no user override of the feature - and
> > therefore cannot use it to disable scalarization. However, there
> > are other ways to disable scalarization (-fno-tree-sra) so this is not
> > a great loss.
>
> You can handle even that.
>



>   enum compiler_param param
> = optimize_function_for_size_p (cfun)
>   ? PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE
>   : PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED;
>   unsigned max_scalarization_size = PARAM_VALUE (param) * BITS_PER_UNIT;
>   if (!max_scalarization_size && !global_options_set.x_param_values[param])
>
> Then it will handle explicit --param sra-max-scalarization-size-Os*=0
> differently from implicit 0.

Ah hah! OK, I've respun the patch removing this extra justification in
the documentation and reshuffling the logic a little.

> OT, shouldn't max_scalarization_size be at least unsigned HOST_WIDE_INT,
> so that it doesn't overflow for larger values (0x4000 etc.)?
> Probably need some cast in the multiplication to avoid UB in the compiler.

I've increased the size of max_scalarization_size to a UHWI in this spin.
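
As a rough sketch of the two points above (telling an explicit 0 apart from "not set", and widening before the multiplication), assuming hypothetical names and an 8-bit unit rather than the real PARAM_VALUE / global_options_set machinery:

```c
#include <stdbool.h>

/* Assumed value for the sketch; the real BITS_PER_UNIT is target-defined.  */
#define SKETCH_BITS_PER_UNIT 8

/* Resolve the scalarization limit in bits.  VALUE is only trusted when
   SET_BY_USER is true, so an explicit 0 still disables scalarization,
   while an unset parameter falls back to the target default.
   The arithmetic is done in a 64-bit type so large byte counts do not
   wrap when converted to bits.  */
unsigned long long
max_scalarization_bits (unsigned value, bool set_by_user,
                        unsigned target_default_bytes)
{
  unsigned long long bytes = set_by_user ? value : target_default_bytes;
  return bytes * SKETCH_BITS_PER_UNIT;
}
```

Here the "set" flag plays the role of the sentinel, so no parameter value has to be reserved for it.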

Bootstrapped and tested on AArch64 and x86-64 with no issues and checked
to see the PR is fixed.

OK for trunk, and gcc-5 in a few days?

Thanks,
James

---
gcc/

2015-06-23  James Greenhalgh  

PR tree-optimization/66119
* toplev.c (process_options): Don't set up default values for
the sra_max_scalarization_size_{speed,size} parameters.
* tree-sra.c (analyze_all_variable_accesses): If no values
have been set for the sra_max_scalarization_size_{speed,size}
parameters, call get_move_ratio to get target defaults.

diff --git a/gcc/toplev.c b/gcc/toplev.c
index 2f43a89..902bfc7 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1301,20 +1301,6 @@ process_options (void)
  so we can correctly initialize debug output.  */
   no_backend = lang_hooks.post_options (&main_input_filename);
 
-  /* Set default values for parameters relation to the Scalar Reduction
- of Aggregates passes (SRA and IP-SRA).  We must do this here, rather
- than in opts.c:default_options_optimization as historically these
- tuning heuristics have been based on MOVE_RATIO, which on some
- targets requires other symbols from the backend.  */
-  maybe_set_param_value
-(PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED,
- get_move_ratio (true) * UNITS_PER_WORD,
- global_options.x_param_values, global_options_set.x_param_values);
-  maybe_set_param_value
-(PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE,
- get_move_ratio (false) * UNITS_PER_WORD,
- global_options.x_param_values, global_options_set.x_param_values);
-
   /* Some machines may reject certain combinations of options.  */
   targetm.target_option.override ();
 
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 8e34244..5f573f6 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2549,11 +2549,20 @@ analyze_all_variable_accesses (void)
   bitmap tmp = BITMAP_ALLOC (NULL);
   bitmap_iterator bi;
   unsigned i;
-  unsigned max_scalarization_size
-= (optimize_function_for_size_p (cfun)
-	? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)
-	: PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED))
-  * BITS_PER_UNIT;
+  bool optimize_speed_p = !optimize_function_for_size_p (cfun);
+
+  enum compiler_param param = optimize_speed_p
+			? PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED
+			: PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE;
+
+  /* If the user didn't set PARAM_SRA_MAX_SCALARIZATION_SIZE_<...>,
+ fall back to a target default.  */
+  unsigned HOST_WIDE_INT max_scalarization_size
+= global_options_set.x_param_values[param]
+  ? PARAM_VALUE (param)
+  : get_move_ratio (optimize_speed_p) * UNITS_PER_WORD;
+
+  max_scalarization_size *= BITS_PER_UNIT;
 
   EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi)
 if (bitmap_bit_p (should_scalarize_away_bitmap, i)


Remove redundant AND from count reduction loop

2015-06-23 Thread Richard Sandiford
We vectorise:

int
f (int *a, int n)
{
  int count = 0;
  for (int i = 0; i < n; ++i)
if (a[i] < 255)
  count += 1;
  return count;
}

using an add reduction of a VEC_COND_EXPR .
This leads to the main loop having an AND with a loop invariant {1, 1, ...}.
E.g. on aarch64:

        movi    v2.4s, 0x1
.L4:
        lsl     x5, x4, 4
        add     x4, x4, 1
        cmp     w2, w4
        ldr     q1, [x0, x5]
        cmge    v1.4s, v3.4s, v1.4s
        and     v1.16b, v2.16b, v1.16b
        add     v0.4s, v0.4s, v1.4s
        bhi     .L4

This patch converts an ADD of that VEC_COND_EXPR into a SUB of COND:

.L4:
        lsl     x5, x4, 4
        add     x4, x4, 1
        cmp     w2, w4
        ldr     q1, [x0, x5]
        cmge    v1.4s, v2.4s, v1.4s
        sub     v0.4s, v0.4s, v1.4s
        bhi     .L4
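
A scalar analogue of the rewrite may make the identity easier to see. This is only a sketch of the arithmetic, assuming each vector lane behaves like the scalar mask below; it is not what the vectoriser emits, and the function names are made up for the example.

```c
#include <stddef.h>

/* The form the vectoriser starts from: add (cond ? 1 : 0).  */
int
count_add (const int *a, size_t n)
{
  int count = 0;
  for (size_t i = 0; i < n; ++i)
    count += (a[i] < 255) ? 1 : 0;
  return count;
}

/* The form after the rewrite: treat the comparison result as 0 or -1,
   as a vector compare lane would be, and subtract it.  No AND with 1
   is needed.  */
int
count_sub (const int *a, size_t n)
{
  int count = 0;
  for (size_t i = 0; i < n; ++i)
    {
      int mask = -(a[i] < 255);   /* 0 or -1 */
      count -= mask;
    }
  return count;
}
```

Both loops return the same count for any input, since adding 1 and subtracting -1 are the same operation.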

At the moment the simplification is done during forwprop4, after the last
dce pass, and so the VEC_COND_EXPR survives until expand.  Richi says
this is a known problem.  Of course, the expression gets deleted by
rtl dce, but it means that a scan-tree-dump of the *.optimized output
can't easily tell that the optimisation has triggered.  I've therefore
added a scan-assembler test instead.

Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested
on aarch64-elf.  OK to install?

Thanks,
Richard


gcc/
* match.pd: Add patterns for vec_conds between -1 and 0, and
between 1 and 0.

gcc/testsuite/
* gcc.target/aarch64/vect-add-sub-cond.c: New test.

Index: gcc/match.pd
===
--- gcc/match.pd	2015-06-23 11:42:23.644645975 +0100
+++ gcc/match.pd	2015-06-23 11:42:23.760644655 +0100
@@ -973,6 +973,36 @@ along with GCC; see the file COPYING3.
   (cnd @0 @2 @1)))
 
 
+/* Vector comparisons are defined to produce all-one or all-zero results.  */
+(simplify
+ (vec_cond @0 integer_all_onesp@1 integer_zerop@2)
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+   (convert @0)))
+
+/* We could instead convert all instances of the vec_cond to negate,
+   but that isn't necessarily a win on its own.  */
+(simplify
+ (plus:c @3 (vec_cond @0 integer_each_onep@1 integer_zerop@2))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+  (minus @3 (convert @0))))
+
+(simplify
+ (plus:c @3 (view_convert_expr
+(vec_cond @0 integer_each_onep@1 integer_zerop@2)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+  (minus @3 (convert @0))))
+
+(simplify
+ (minus @3 (vec_cond @0 integer_each_onep@1 integer_zerop@2))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+  (plus @3 (convert @0))))
+
+(simplify
+ (minus @3 (view_convert_expr
+   (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+  (plus @3 (convert @0))))
+
 /* Simplifications of comparisons.  */
 
 /* We can simplify a logical negation of a comparison to the
Index: gcc/testsuite/gcc.target/aarch64/vect-add-sub-cond.c
===
--- /dev/null   2015-06-02 17:27:28.541944012 +0100
+++ gcc/testsuite/gcc.target/aarch64/vect-add-sub-cond.c	2015-06-23 12:06:27.120203685 +0100
@@ -0,0 +1,94 @@
+/* Make sure that vector comparison results are not unnecessarily ANDed
+   with vectors of 1.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#define COUNT1(X) if (X) count += 1
+#define COUNT2(X) if (X) count -= 1
+#define COUNT3(X) count += (X)
+#define COUNT4(X) count -= (X)
+
+#define COND1(X) (X)
+#define COND2(X) ((X) ? 1 : 0)
+#define COND3(X) ((X) ? -1 : 0)
+#define COND4(X) ((X) ? 0 : 1)
+#define COND5(X) ((X) ? 0 : -1)
+
+#define TEST_LT(X, Y) ((X) < (Y))
+#define TEST_LE(X, Y) ((X) <= (Y))
+#define TEST_GT(X, Y) ((X) > (Y))
+#define TEST_GE(X, Y) ((X) >= (Y))
+#define TEST_EQ(X, Y) ((X) == (Y))
+#define TEST_NE(X, Y) ((X) != (Y))
+
+#define COUNT_LOOP(ID, TYPE, CMP_ARRAY, TEST, COUNT) \
+  TYPE \
+  reduc_##ID (__typeof__ (CMP_ARRAY[0]) x) \
+  { \
+TYPE count = 0; \
+for (unsigned int i = 0; i < 1024; ++i) \
+  COUNT (TEST (CMP_ARRAY[i], x)); \
+return count; \
+  }
+
+#define COND_LOOP(ID, ARRAY, CMP_ARRAY, TEST, COND) \
+  void \
+  plus_##ID (__typeof__ (CMP_ARRAY[0]) x) \
+  { \
+for (unsigned int i = 0; i < 1024; ++i) \
+  ARRAY[i] += COND (TEST (CMP_ARRAY[i], x)); \
+  } \
+  void \
+  plusc_##ID (void) \
+  { \
+for (unsigned int i = 0; i < 1024; ++i) \
+  ARRAY[i] += COND (TEST (CMP_ARRAY[i], 10)); \
+  } \
+  void \
+  minus_##ID (__typeof__ (CMP_ARRAY[0]) x) \
+  { \
+for (unsigned int i = 0; i < 1024; ++i) \
+  ARRAY[i] -= COND (TEST (CMP_ARRAY[i], x)); \
+  } \
+  void \
+  minusc_##ID (void) \
+  { \
+for (unsigned int i = 0; i < 1024; ++i) \
+  ARRAY[i] += COND (TEST (CMP_ARRAY[i], 1)); \
+  }
+
+#define ALL_LOOPS(ID, ARRAY, CMP_ARRAY, TEST) \
+  typedef __typeof__(ARRAY[0]) ID##_type; \
+  COUNT_LOOP (ID##_1, ID##_type, CMP_ARRAY, 

[PATCH 2/3][AArch64 nofp] Clarify docs for +nofp/-mgeneral-regs-only

2015-06-23 Thread Alan Lawrence

James Greenhalgh wrote:



-Generate code which uses only the general registers.
+Generate code which uses only the general registers.  Equivalent to feature


The ARMARM uses "general-purpose registers" to refer to these registers,
we should match that style.

s/Equivalent to feature/This is equivalent to the feature/


Done.


-Feature modifiers used with @option{-march} and @option{-mcpu} can be one
-the following:
+Feature modifiers used with @option{-march} and @option{-mcpu} can be any of
+the following, or their inverses @option{no@var{feature}}:


s/inverses/inverse/


The grammar is quite difficult here, so have gone for "and their inverses" as 
the set of possibilities definitely includes 3 inverses.


 
+As stated above, @option{crypto} implies @option{simd} implies @option{fp}.


Drop the "As stated above".


To my eye, beginning a sentence in lowercase looks very odd in pdf, and still a 
bit odd in html. Have changed to "That is"...?
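
The implication chain being documented (crypto implies simd implies fp, and conversely nofp implies nosimd implies nocrypto) can be modelled with a small bitmask sketch. The names and bit values below are hypothetical and chosen for the example; this is not how aarch64.c represents feature flags.

```c
/* Hypothetical feature bits for this sketch.  */
enum { F_FP = 1, F_SIMD = 2, F_CRYPTO = 4 };

/* Enable feature F and everything it depends on.  */
unsigned
enable_feature (unsigned set, unsigned f)
{
  set |= f;
  if (set & F_CRYPTO)
    set |= F_SIMD;      /* crypto implies simd */
  if (set & F_SIMD)
    set |= F_FP;        /* simd implies fp */
  return set;
}

/* Disable feature F and everything that depends on it.  */
unsigned
disable_feature (unsigned set, unsigned f)
{
  set &= ~f;
  if (!(set & F_FP))
    set &= ~F_SIMD;     /* nofp implies nosimd */
  if (!(set & F_SIMD))
    set &= ~F_CRYPTO;   /* nosimd implies nocrypto */
  return set;
}
```

Under this model, disabling fp from a fully enabled set leaves nothing enabled, which is the behaviour the "Conversely" sentence in the new text describes.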


Tested with make pdf & make html.

gcc/ChangeLog (unchanged):

* doc/invoke.texi: Clarify AArch64 feature modifiers (no)fp, (no)simd
and (no)crypto.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d8e982c3aa338819df3785696c493a66c1f5b674..0579bf2ecf993bb56987e0bb9686925537ab61e3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12359,7 +12359,10 @@ Generate big-endian code.  This is the default when GCC is configured for an
 
 @item -mgeneral-regs-only
 @opindex mgeneral-regs-only
-Generate code which uses only the general registers.
+Generate code which uses only the general-purpose registers.  This is equivalent
+to feature modifier @option{nofp} of @option{-march} or @option{-mcpu}, except
+that @option{-mgeneral-regs-only} takes precedence over any conflicting feature
+modifier regardless of sequence.
 
 @item -mlittle-endian
 @opindex mlittle-endian
@@ -12498,20 +12501,22 @@ over the appropriate part of this option.
 @subsubsection @option{-march} and @option{-mcpu} Feature Modifiers
 @cindex @option{-march} feature modifiers
 @cindex @option{-mcpu} feature modifiers
-Feature modifiers used with @option{-march} and @option{-mcpu} can be one
-the following:
+Feature modifiers used with @option{-march} and @option{-mcpu} can be any of
+the following and their inverses @option{no@var{feature}}:
 
 @table @samp
 @item crc
 Enable CRC extension.
 @item crypto
-Enable Crypto extension.  This implies Advanced SIMD is enabled.
+Enable Crypto extension.  This also enables Advanced SIMD and floating-point
+instructions.
 @item fp
-Enable floating-point instructions.
+Enable floating-point instructions.  This is on by default for all possible
+values for options @option{-march} and @option{-mcpu}.
 @item simd
-Enable Advanced SIMD instructions.  This implies floating-point instructions
-are enabled.  This is the default for all current possible values for options
-@option{-march} and @option{-mcpu=}.
+Enable Advanced SIMD instructions.  This also enables floating-point
+instructions.  This is on by default for all possible values for options
+@option{-march} and @option{-mcpu}.
 @item lse
 Enable Large System Extension instructions.
 @item pan
@@ -12522,6 +12527,10 @@ Enable Limited Ordering Regions support.
 Enable ARMv8.1 Advanced SIMD instructions.
 @end table
 
+That is, @option{crypto} implies @option{simd} implies @option{fp}.
+Conversely, @option{nofp} (or equivalently, @option{-mgeneral-regs-only})
+implies @option{nosimd} implies @option{nocrypto}.
+
 @node Adapteva Epiphany Options
 @subsection Adapteva Epiphany Options
 


[PATCH 1/3][AArch64 nofp] Fix ICEs with +nofp/-mgeneral-regs-only and improve error messages

2015-06-23 Thread Alan Lawrence

James Greenhalgh wrote:

Submissions on this list should be one patch per mail, it makes
tracking review easier.


OK, here's a respin of the first patch.  I've added a third patch after finding
another route to an ICE.



+void
+aarch64_err_no_fpadvsimd (machine_mode mode, const char *msg)
+{
+  const char *mc = FLOAT_MODE_P (mode) ? "floating point" : "vector";


GCC coding conventions, this should be
floating-point (https://gcc.gnu.org/codingconventions.html).


Done.


+  if (!TARGET_FLOAT
+  && fndecl && TREE_PUBLIC (fndecl)
+  && fntype && fntype != error_mark_node)
+{
+  const_tree args, type = TREE_TYPE (fntype);
+  machine_mode mode; /* To pass pointer as argument; never used.  */
+  int nregs; /* Likewise.  */


Do these need annotations to avoid errors in a Werror build? I don't see
any mention of what testing this patch has been through?


Dropped the args, added ATTRIBUTE_UNUSED to the others - the attribute isn't 
necessary at the moment but might become so if inlining became more aggressive.


This version has been bootstrapped on aarch64 linux.


-  if (cum->aapcs_nvrn > 0)
-   sorry ("%qs and floating point or vector arguments",
-  "-mgeneral-regs-only");
+  gcc_assert (cum->aapcs_nvrn == 0);


This promotes an error to an ICE? Can we really never get to this
point through the error control flow?


Indeed - the new checks in init_cumulative_args and aarch64_layout_arg mean we 
never get here. (If said new checks were sorry(), we would still get here, but 
since they are error() we do not.)



@@ -7920,9 +7941,7 @@ aarch64_setup_incoming_varargs (cumulative_args_t cum_v, machine_mode mode,
 
   if (!TARGET_FLOAT)

 {
-  if (local_cum.aapcs_nvrn > 0)
-   sorry ("%qs and floating point or vector arguments",
-  "-mgeneral-regs-only");
+  gcc_assert (local_cum.aapcs_nvrn == 0);


As above?


Similarly because of change from sorry() -> error().


diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 11123d6..99cefec 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -981,10 +981,7 @@
   ""
   "
 if (!TARGET_FLOAT)
- {
-   sorry (\"%qs and floating point code\", \"-mgeneral-regs-only\");
-   FAIL;
- }
+  aarch64_err_no_fpadvsimd (mode, \"code\");


You've dropped the FAIL?


(*2)

This usually gets called from emit_move_insn_1, via a call to emit_insn 
(GEN_FCN...). If we FAIL, we return NULL, and emit_insn then returns whatever 
insn was last in the BB; if we don't FAIL, we return a move to a general 
register (which is still a valid bit of RTL!). So either seems valid, but 
keeping the FAIL generates fewer instances of the error message, which can 
already get quite numerous. So reinstated FAIL. (Also changed "" quotes to {} 
braces.)


Bootstrap + check-gcc on aarch64-none-linux-gnu.

(ChangeLog's identical to v1)

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_err_no_fpadvsimd): New.

* config/aarch64/aarch64.md (mov/GPF, movtf): Use
aarch64_err_no_fpadvsimd.

* config/aarch64/aarch64.c (aarch64_err_no_fpadvsimd): New.
(aarch64_layout_arg, aarch64_init_cumulative_args): Use
aarch64_err_no_fpadvsimd if !TARGET_FLOAT and we need FP regs.
(aarch64_expand_builtin_va_start, aarch64_setup_incoming_varargs):
Turn error into assert, test TARGET_FLOAT.
(aarch64_gimplify_va_arg_expr): Use aarch64_err_no_fpadvsimd, test
TARGET_FLOAT.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/mgeneral-regs_1.c: New file.
* gcc.target/aarch64/mgeneral-regs_2.c: New file.
* gcc.target/aarch64/nofp_1.c: New file.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 965a11b7bee188819796e2b17017a87dca80..ac92c5924a4cfc5941fe8eeb31281e18bd21a5a0 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -259,6 +259,7 @@ unsigned aarch64_dbx_register_number (unsigned);
 unsigned aarch64_trampoline_size (void);
 void aarch64_asm_output_labelref (FILE *, const char *);
 void aarch64_elf_asm_named_section (const char *, unsigned, tree);
+void aarch64_err_no_fpadvsimd (machine_mode, const char *);
 void aarch64_expand_epilogue (bool);
 void aarch64_expand_mov_immediate (rtx, rtx);
 void aarch64_expand_prologue (void);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a79bb6a96572799181a5bff3c3818e294f87cb7a..3193a15970e5524e0f3a8a5505baea5582e55731 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -522,6 +522,16 @@ static const char * const aarch64_condition_codes[] =
   "hi", "ls", "ge", "lt", "gt", "le", "al", "nv"
 };
 
+void
+aarch64_err_no_fpadvsimd (machine_mode mode, const char *msg)
+{
+  const char *mc = FLOAT_MODE_P (mode) ? "floating-point" : "vector";
+  if (TARGET_GENERAL_REGS_ONLY)
+error ("%qs is incom

[PATCH 3/3][AArch64 nofp] Fix another ICE with +nofp/-mgeneral-regs-only

2015-06-23 Thread Alan Lawrence
This fixes another ICE, obtained with the attached testcase - yes, there was a 
way to get hold of a float, without passing an argument or going through 
movsf/movdf!


Bootstrapped + check-gcc on aarch64-none-linux-gnu.

gcc/ChangeLog:

* config/aarch64/aarch64.md (2):
Condition on TARGET_FLOAT.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/mgeneral-regs_3.c: New.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99cefece8093791ccf17cb071a4e9997bda8fd89..bcaafda5ea46f136dc90f34aa8f2dfaddabd09f5 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4106,7 +4106,7 @@
 (define_insn "2"
   [(set (match_operand:GPF 0 "register_operand" "=w,w")
 (FLOATUORS:GPF (match_operand: 1 "register_operand" "w,r")))]
-  ""
+  "TARGET_FLOAT"
   "@
cvtf\t%0, %1
cvtf\t%0, %1"
diff --git a/gcc/testsuite/gcc.target/aarch64/mgeneral-regs_3.c b/gcc/testsuite/gcc.target/aarch64/mgeneral-regs_3.c
new file mode 100644
index ..225d9eaa45530d88315a146f3fae72d86fe66373
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/mgeneral-regs_3.c
@@ -0,0 +1,11 @@
+/* { dg-options "-mgeneral-regs-only -O2" } */
+
+extern void abort (void);
+
+int
+test (int i, ...)
+{
+  float f = (float) i; /* { dg-error "'-mgeneral-regs-only' is incompatible with floating point code" } */
+  if (f != f) abort ();
+  return 2;
+}


PING: Re: [patch] PR debug/66482: Do not ICE in gen_formal_parameter_die

2015-06-23 Thread Aldy Hernandez

On 06/12/2015 10:07 AM, Aldy Hernandez wrote:

Hi.

This is now a P2, as it is causing a secondary target bootstrap to fail 
(s390).


Aldy


Sigh.  I must say my head is spinning with this testcase and what we do
with it (-O3), even prior to the debug-early work:

void f(int p) {}
int g() {
   void f(int p);
   g();
   return 0;
}

The inliner recursively inlines this function up to a certain depth, but
the useless inlining gets cleaned up shortly afterwards.  However, the
BLOCK_SOURCE_LOCATION are still set throughout which is technically
correct.

Eventually late dwarf gets a hold of all this and we end up calling
dwarf2out_abstract_function to build debug info for the abstract
instance of a function for which we have already generated a DIE.
Basically, a similar issue to what we encountered for template parameter
packs.  Or at least, that's my understanding, because as I've said, I
admit to being slightly confused here.

Since technically this is all going away when we remove
dwarf2out_abstract_function, I suggest we remove the assert and avoid
sudden death.  It's not like we generated useful debugging info for this
testcase anyhow.

Aldy




Re: [PATCH] Add CFI entries for ARM Linux idiv0 / ldiv0

2015-06-23 Thread James Lemke



Tested on gcc-trunk for arm-none-linux-gnueabi.

OK to commit?



2015-06-16  James Lemke  

libgcc/config/arm/
* lib1funcs.S (aeabi_idiv0, aeabi_ldiv0): Add CFI entries for
Linux ARM_EABI.


s/for Linux ARM EABI//

given you handle both __ARM_EABI__ and the not __ARM_EABI__ targets in
the source.

This is OK if no regressions.


I saw no regressions for arm-none-linux-gnueabi.
However, I don't have access to a non-eabi linux target.
Shall I commit with the non-eabi portions or remove them?
Jim.

--
Jim Lemke, GNU Tools Sourcerer
Mentor Graphics / CodeSourcery
Orillia, Ontario


Re: *Ping* [patch, fortran] Warn about constant integer divisions

2015-06-23 Thread Jerry DeLisle
On 06/23/2015 01:36 AM, Janne Blomqvist wrote:
> On Sun, Jun 21, 2015 at 4:57 PM, Thomas Koenig  wrote:
>> *ping*
>>
>> https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00966.html
>>
>>
>>> Hello world,
>>>
>>> the attached patch emits a warning for constant integer division.
>>> While correct according to the standard, I cannot really think
>>> of a legitimate reason why people would want to write 3/5 where
>>> they could have written 0 , so my preference would be to put
>>> this under -Wconversion (like in the attached patch).
>>>
>>> However, I am open to discussion on that.  It is easy enough to
>>> change.
>>>
>>> Regression-tested.  Opinions?  Comments?  Would somebody rather
>>> have -Wconversion-extra?  OK for trunk?
> 
> I'm a bit uncomfortable about this. IIRC I have code where I'm
> iterating over some kind of grid, and I'm using integer division and
> relying on truncation to calculate array indices. I can certainly
> imagine that others have used it as well, and even that it's not a
> particularly uncommon pattern.
> 
> Furthermore, I think it's confusing that you have it under
> -Wconversion, as there is no type conversion going on.
> -Winteger-truncation maybe?
> 
> Any other opinions?
> 

I am not sure it is worth warning about. I don't think it justifies its own
compilation warning option. I have no objection to -Wconversion, since 3/5 is
converted to zero in a sense. It would help users catch a missing decimal point
when they meant 3./5.
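
For illustration, C integer division truncates in the same way, so the two cases in this thread can be shown in C (this is an analogue written for the example, not the Fortran testcase from the patch; the function names are assumptions).

```c
/* Deliberate use of truncation: map a position to a grid cell.  */
int
cell_index (int position, int cell_width)
{
  return position / cell_width;
}

/* Probably a mistake: both operands are integers, so the quotient is
   truncated to 0 before being converted to double.  */
double
ratio_int (void)
{
  return 3 / 5;
}

/* What was likely intended: one real operand gives a real quotient.  */
double
ratio_real (void)
{
  return 3. / 5;
}
```

The first function is the kind of code that relies on truncation with non-constant operands; the proposed warning targets only the second kind, where both operands are constants.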

Regards,

Jerry



Re: [PATCH] Add CFI entries for ARM Linux idiv0 / ldiv0

2015-06-23 Thread Ramana Radhakrishnan
On Tue, Jun 23, 2015 at 5:18 PM, James Lemke  wrote:
>
>>> Tested on gcc-trunk for arm-none-linux-gnueabi.
>>>
>>> OK to commit?


>>> 2015-06-16  James Lemke  
>>>
>>> libgcc/config/arm/
>>> * lib1funcs.S (aeabi_idiv0, aeabi_ldiv0): Add CFI entries for
>>> Linux ARM_EABI.
>>
>>
>> s/for Linux ARM EABI//
>>
>> given you handle both __ARM_EABI__ and the not __ARM_EABI__ targets in
>> the source.
>>
>> This is OK if no regressions.
>
>
> I saw no regressions for arm-none-linux-gnueabi.
> However, I don't have access to a non-eabi linux target.
> Shall I commit with the non-eabi portions or remove them?
> Jim.

I have no access to a non-EABI arm target to test this, not sure if
there are any in-tree anymore (probably vxworks?).  It looks sane on
a read - just apply it and look after regressions if anything gets
reported in bugzilla.


regards
Ramana

>
>
> --
> Jim Lemke, GNU Tools Sourcerer
> Mentor Graphics / CodeSourcery
> Orillia, Ontario


Re: [PATCH] Add CFI entries for ARM Linux idiv0 / ldiv0

2015-06-23 Thread James Lemke

On 06/23/2015 12:22 PM, Ramana Radhakrishnan wrote:

I have no access to a non-EABI arm target to test this, not sure if
there are any in-tree anymore (probably vxworks?).  It looks sane on
a read - just apply it and look after regressions if anything gets
reported in bugzilla.


Thanks Ramana.  I will change the CL entry as you suggested.
Jim.

--
Jim Lemke, GNU Tools Sourcerer
Mentor Graphics / CodeSourcery
Orillia, Ontario


Re: [PATCH] [ARM] Post-indexed addressing for NEON memory access

2015-06-23 Thread Ramana Radhakrishnan
On Fri, Jun 19, 2015 at 7:04 PM, Charles Baylis
 wrote:
> On 18 June 2014 at 11:06, Ramana Radhakrishnan
>  wrote:
>> On Tue, Jun 17, 2014 at 4:03 PM, Charles Baylis
>>  wrote:
>>> Your mention of larger vector modes prompted me to check that the
>>> patch has the desired result with them. In fact, the costs are
>>> estimated incorrectly which means the post_modify pattern is not used.
>>> The attached patch fixes that. (used in combination with my original
>>> patch)
>>>
>>>
>>> 2014-06-15  Charles Baylis  
>>>
>>> * config/arm/arm.c (arm_new_rtx_costs): Reduce cost for mem with
>>> embedded side effects.
>>
>> I'm not too thrilled with putting in more special cases that are not
>> table driven in there. Can you file a PR with some testcases that show
>> this so that we don't forget and CC me on it please ?
>
> I created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61551 at the time.
>
> I've come back to look at this again and would like to fix it in this
> release cycle. I still don't really understand what you mean by
> table-driven in this context. Do you still hold this view, and if so,
> could you describe what you'd like to see instead of this patch?

By table-driven I mean something similar to the existing rtx_costs
infrastructure with a walker and a data structure holding costs for
each addressing mode which are tweakable on a per-core basis.

Thus for example I'd expect this to be on a per access mode basis
along with different costs with respect to post_inc, reg indirect,
pre_inc etc. that were in a data structure and then a worker function
that peeled rtx's to obtain the appropriate cost from said
data-structure.
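
To make that concrete, here is a minimal sketch of the kind of structure meant. Everything in it is hypothetical: the enum names, the table layout and the cost numbers are invented for illustration and do not correspond to the existing arm.c cost tables.

```c
/* Addressing modes and access modes covered by the sketch.  */
enum addr_mode { ADDR_REG_INDIRECT, ADDR_PRE_INC, ADDR_POST_INC,
                 ADDR_MODE_COUNT };
enum access_mode { ACCESS_SCALAR, ACCESS_VECTOR, ACCESS_MODE_COUNT };

/* One table per core: extra cost of each addressing mode, indexed by
   the mode of the access.  */
struct addr_cost_table
{
  int cost[ACCESS_MODE_COUNT][ADDR_MODE_COUNT];
};

/* Invented numbers for two imaginary cores.  */
const struct addr_cost_table sketch_core_a = { { { 0, 1, 1 }, { 0, 2, 0 } } };
const struct addr_cost_table sketch_core_b = { { { 0, 0, 0 }, { 0, 3, 1 } } };

/* The worker only looks values up; tuning a core means editing its
   table, not adding another special case to the cost function.  */
int
address_cost (const struct addr_cost_table *table,
              enum access_mode access, enum addr_mode addr)
{
  return table->cost[access][addr];
}
```

A real version would have the worker peel the rtx to find the access mode and addressing form before indexing the table selected for the current core.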



Ramana


Re: [PATCH][testsuite] Fix TORTURE_OPTIONS overriding

2015-06-23 Thread James Greenhalgh

On Thu, Jun 18, 2015 at 11:10:01AM +0100, Richard Biener wrote:
>
> Currently when doing
>
> make check-gcc RUNTESTFLAGS="TORTURE_OPTIONS=\\\"{ -O3 } { -O2 }\\\"
> dg-torture.exp"
>
> you get -O3 and -O2 but also the two LTO torture option combinations.
> That's undesired (those are the most expensive anyway).  The following
> patch avoids this by setting LTO_TORTURE_OPTIONS only when
> TORTURE_OPTIONS isn't specified.
>
> Tested with and without TORTURE_OPTIONS for C and fortran tortures.
>
> Seems the instruction in c-torture.exp how to override TORTURE_OPTIONS
> is off, RUNTESTFLAGS="TORTURE_OPTIONS=\\\"{ { -O3 } { -O2 } }\\\"
> certainly doesn't do what it should.

This patch causes issues for ARM and AArch64 cross multilib
testing. There are two issues, one is that we now clobber
gcc_force_conventional_output after setting it in the conditional this patch
moved (hits all targets, see the new x86-64 failures like pr61848.c).

The other is that we no longer protect environment settings before calling
check_effective_target_lto, which results in our cross --specs files no
longer being on the path.

I've fixed these issues by rearranging the file again, but I'm not
sure if what I've done is sensible and does not cause other issues. This
seems to bring back the tests I'd lost overnight, and doesn't cause
issues elsewhere.

I've run some cross-tests to ensure this brings back the missing tests,
and a full x86-64 testrun to make sure I haven't dropped any from there.

OK for trunk?

Thanks,
James

---
2015-06-23  James Greenhalgh  

* lib/c-torture.exp: Don't call check_effective_target_lto
before setting up environment correctly.
* lib/gcc-dg.exp: Likewise, and protect
gcc_force_conventional_output.

diff --git a/gcc/testsuite/lib/c-torture.exp b/gcc/testsuite/lib/c-torture.exp
index 607e7d0..c88c439 100644
--- a/gcc/testsuite/lib/c-torture.exp
+++ b/gcc/testsuite/lib/c-torture.exp
@@ -21,6 +21,20 @@ load_lib file-format.exp
 load_lib target-libpath.exp
 load_lib target-utils.exp
 
+global GCC_UNDER_TEST
+if ![info exists GCC_UNDER_TEST] {
+set GCC_UNDER_TEST "[find_gcc]"
+}
+
+global orig_environment_saved
+
+# This file may be sourced, so don't override environment settings
+# that have been previously setup.
+if { $orig_environment_saved == 0 } {
+append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST]
+set_ld_library_path_env_vars
+}
+
 # The default option list can be overridden by
 # TORTURE_OPTIONS="{ list1 } ... { listN }"
 
@@ -68,20 +82,6 @@ if [info exists ADDITIONAL_TORTURE_OPTIONS] {
 	[concat $C_TORTURE_OPTIONS $ADDITIONAL_TORTURE_OPTIONS]
 }
 
-global GCC_UNDER_TEST
-if ![info exists GCC_UNDER_TEST] {
-set GCC_UNDER_TEST "[find_gcc]"
-}
-
-global orig_environment_saved
-
-# This file may be sourced, so don't override environment settings
-# that have been previously setup.
-if { $orig_environment_saved == 0 } {
-append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST]
-set_ld_library_path_env_vars
-}
-
 #
 # c-torture-compile -- runs the Tege C-torture test
 #
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 00ca0c5..d463f81 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -46,6 +46,19 @@ if ![info exists GCC_UNDER_TEST] {
 set GCC_UNDER_TEST "[find_gcc]"
 }
 
+# This file may be sourced, so don't override environment settings
+# that have been previously setup.
+if { $orig_environment_saved == 0 } {
+append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST]
+set_ld_library_path_env_vars
+}
+
+# Some torture-options cause intermediate code output, unusable for
+# testing using e.g. scan-assembler.  In this variable are the options
+# how to force it, when needed.
+global gcc_force_conventional_output
+set gcc_force_conventional_output ""
+
 set LTO_TORTURE_OPTIONS ""
 if [info exists TORTURE_OPTIONS] {
 set DG_TORTURE_OPTIONS $TORTURE_OPTIONS
@@ -92,19 +105,6 @@ if [info exists ADDITIONAL_TORTURE_OPTIONS] {
 
 global orig_environment_saved
 
-# This file may be sourced, so don't override environment settings
-# that have been previously setup.
-if { $orig_environment_saved == 0 } {
-append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST]
-set_ld_library_path_env_vars
-}
-
-# Some torture-options cause intermediate code output, unusable for
-# testing using e.g. scan-assembler.  In this variable are the options
-# how to force it, when needed.
-global gcc_force_conventional_output
-set gcc_force_conventional_output ""
-
 # Deduce generated files from tool flags, return finalcode string
 proc schedule-cleanups { opts } {
 global additional_sources


[PATCH] backport FreeBSD add functionality to build PIE executables

2015-06-23 Thread Andreas Tobler

Hi all,

I'm going to commit this patch to 5.1 in the next few days unless someone
objects.

The patch has been in my 5.1 tree for a long time and I regularly post
results.


Thanks,
Andreas

2015-06-22  Andreas Tobler  

Backport from mainline
2015-05-18  Andreas Tobler  

* config/freebsd-spec.h (FBSD_STARTFILE_SPEC): Add the bits to build
pie executables.
(FBSD_ENDFILE_SPEC): Likewise.
* config/i386/freebsd.h (STARTFILE_SPEC): Remove and use the one from
config/freebsd-spec.h.
(ENDFILE_SPEC): Likewise.

2015-06-22  Andreas Tobler  

Backport from mainline
2015-05-12  Andreas Tobler  

* lib/target-supports.exp (check_effective_target_pie): Add *-*-freebsd*
to the family of pie capable targets.

Index: config/freebsd-spec.h
===
--- config/freebsd-spec.h   (revision 224751)
+++ config/freebsd-spec.h   (working copy)
@@ -66,8 +66,9 @@
   "%{!shared: \
  %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \
   %{!p:%{profile:gcrt1.o%s} \
-%{!profile:crt1.o%s \
-   crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}"
+%{!profile: \
+%{pie: Scrt1.o%s;:crt1.o%s} \
+   crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
 
 /* Provide a ENDFILE_SPEC appropriate for FreeBSD.  Here we tack on
the magical crtend.o file (see crtstuff.c) which provides part of 
@@ -76,7 +77,7 @@
`crtn.o'.  */
 
 #define FBSD_ENDFILE_SPEC \
-  "%{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s"
+  "%{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s"
 
 /* Provide a LIB_SPEC appropriate for FreeBSD as configured and as
required by the user-land thread model.  Before __FreeBSD_version
Index: config/i386/freebsd.h
===
--- config/i386/freebsd.h   (revision 224751)
+++ config/i386/freebsd.h   (working copy)
@@ -59,29 +59,16 @@
 #define SUBTARGET_EXTRA_SPECS \
   { "fbsd_dynamic_linker", FBSD_DYNAMIC_LINKER }
 
-/* Provide a STARTFILE_SPEC appropriate for FreeBSD.  Here we add
-   the magical crtbegin.o file (see crtstuff.c) which provides part 
-   of the support for getting C++ file-scope static object constructed 
-   before entering `main'.  */
-   
-#undef STARTFILE_SPEC
-#define STARTFILE_SPEC \
-  "%{!shared: \
- %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \
-  %{!p:%{profile:gcrt1.o%s} \
-%{!profile:crt1.o%s \
-   crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}"
+/* Use the STARTFILE_SPEC from config/freebsd-spec.h.  */
 
-/* Provide a ENDFILE_SPEC appropriate for FreeBSD.  Here we tack on
-   the magical crtend.o file (see crtstuff.c) which provides part of 
-   the support for getting C++ file-scope static object constructed 
-   before entering `main', followed by a normal "finalizer" file, 
-   `crtn.o'.  */
+#undef  STARTFILE_SPEC
+#define STARTFILE_SPEC FBSD_STARTFILE_SPEC
 
-#undef ENDFILE_SPEC
-#define ENDFILE_SPEC \
-  "%{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s"
+/* Use the ENDFILE_SPEC from config/freebsd-spec.h.  */
 
+#undef  ENDFILE_SPEC
+#define ENDFILE_SPEC FBSD_ENDFILE_SPEC
+
 /* Provide a LINK_SPEC appropriate for FreeBSD.  Here we provide support
for the special GCC options -static and -shared, which allow us to
link things in one of these three modes by applying the appropriate
Index: testsuite/lib/target-supports.exp
===
--- testsuite/lib/target-supports.exp   (revision 224751)
+++ testsuite/lib/target-supports.exp   (working copy)
@@ -952,6 +952,7 @@
 
 proc check_effective_target_pie { } {
 if { [istarget *-*-darwin\[912\]*]
+|| [istarget *-*-freebsd*]
 || [istarget *-*-linux*]
 || [istarget *-*-gnu*] } {
return 1;
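
For readers less familiar with spec strings, the new selection logic can be read as a chain of if/else-if tests: in `%{a:X;b|c:Y;:Z}` the first alternative whose switch was given wins, and the empty switch is the default. A minimal C sketch of that reading follows. The function names are hypothetical and exist only for illustration; the driver does not contain them.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of
   %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}.
   The first matching alternative wins, so -static takes priority.  */
const char *
pick_crtbegin (bool is_static, bool is_shared, bool is_pie)
{
  if (is_static)
    return "crtbeginT.o";
  if (is_shared || is_pie)
    return "crtbeginS.o";
  return "crtbegin.o";
}

/* Hypothetical mirror of %{shared|pie:crtendS.o%s;:crtend.o%s}.  */
const char *
pick_crtend (bool is_shared, bool is_pie)
{
  return (is_shared || is_pie) ? "crtendS.o" : "crtend.o";
}

/* Hypothetical mirror of %{pie: Scrt1.o%s;:crt1.o%s}.  In the spec this
   choice is only reached when none of -shared, -pg, -p or -profile
   was given.  */
const char *
pick_crt1 (bool is_pie)
{
  return is_pie ? "Scrt1.o" : "crt1.o";
}
```

With -pie alone this gives Scrt1.o, crtbeginS.o and crtendS.o, which are the position-independent variants a PIE link needs.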


[PATCH] backport Contribute FreeBSD unwind support (x86_64 and x86)

2015-06-23 Thread Andreas Tobler

Hi all,

Here is the next one. I'm going to commit this patch to 5.1 in the next
few days.

It is also in my tree, and test results are posted for
amd64/i386-*-freebsd11.0.

Thanks,

Andreas


2015-06-22  Andreas Tobler  

Backport from mainline
2015-05-27  John Marino 

* config.host (i[34567]86-*-freebsd*, x86_64-*-freebsd*): Set
md_unwind_header
* config/i386/freebsd-unwind.h: New.
Index: config/i386/freebsd-unwind.h
===
--- config/i386/freebsd-unwind.h(revision 0)
+++ config/i386/freebsd-unwind.h(working copy)
@@ -0,0 +1,173 @@
+/* DWARF2 EH unwinding support for FreeBSD: AMD x86-64 and x86.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+   Contributed by John Marino 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/* Do code reading to identify a signal frame, and set the frame
+   state data appropriately.  See unwind-dw2.c for the structs. */
+
+#include 
+#include 
+#include 
+#include 
+
+#define REG_NAME(reg)  sf_uc.uc_mcontext.mc_## reg
+
+#ifdef __x86_64__
+#define MD_FALLBACK_FRAME_STATE_FOR x86_64_freebsd_fallback_frame_state
+
+static _Unwind_Reason_Code
+x86_64_freebsd_fallback_frame_state
+(struct _Unwind_Context *context, _Unwind_FrameState *fs)
+{
+  struct sigframe *sf;
+  long new_cfa;
+
+  /* Prior to FreeBSD 9, the signal trampoline was located immediately
+ before the ps_strings.  To support non-executable stacks on AMD64,
+ the sigtramp was moved to a shared page for FreeBSD 9.  Unfortunately
+ this means looking frame patterns again (sys/amd64/amd64/sigtramp.S)
+ rather than using the robust and convenient KERN_PS_STRINGS trick.
+
+ :  lea 0x10(%rsp),%rdi
+ :  pushq   $0x0
+ :  mov $0x1a1,%rax
+ :  syscall
+
+ If we can't find this pattern, we're at the end of the stack.
+  */
+
+  if (!(   *(unsigned int *)(context->ra)  == 0x247c8d48
+&& *(unsigned int *)(context->ra +  4) == 0x48006a10
+&& *(unsigned int *)(context->ra +  8) == 0x01a1c0c7
+&& *(unsigned int *)(context->ra + 12) == 0x050f0000 ))
+return _URC_END_OF_STACK;
+
+  sf = (struct sigframe *) context->cfa;
+  new_cfa = sf->REG_NAME(rsp);
+  fs->regs.cfa_how = CFA_REG_OFFSET;
+  /* Register 7 is rsp  */
+  fs->regs.cfa_reg = 7;
+  fs->regs.cfa_offset = new_cfa - (long) context->cfa;
+
+  /* The SVR4 register numbering macros aren't usable in libgcc.  */
+  fs->regs.reg[0].how = REG_SAVED_OFFSET;
+  fs->regs.reg[0].loc.offset = (long)&sf->REG_NAME(rax) - new_cfa;
+  fs->regs.reg[1].how = REG_SAVED_OFFSET;
+  fs->regs.reg[1].loc.offset = (long)&sf->REG_NAME(rdx) - new_cfa;
+  fs->regs.reg[2].how = REG_SAVED_OFFSET;
+  fs->regs.reg[2].loc.offset = (long)&sf->REG_NAME(rcx) - new_cfa;
+  fs->regs.reg[3].how = REG_SAVED_OFFSET;
+  fs->regs.reg[3].loc.offset = (long)&sf->REG_NAME(rbx) - new_cfa;
+  fs->regs.reg[4].how = REG_SAVED_OFFSET;
+  fs->regs.reg[4].loc.offset = (long)&sf->REG_NAME(rsi) - new_cfa;
+  fs->regs.reg[5].how = REG_SAVED_OFFSET;
+  fs->regs.reg[5].loc.offset = (long)&sf->REG_NAME(rdi) - new_cfa;
+  fs->regs.reg[6].how = REG_SAVED_OFFSET;
+  fs->regs.reg[6].loc.offset = (long)&sf->REG_NAME(rbp) - new_cfa;
+  fs->regs.reg[8].how = REG_SAVED_OFFSET;
+  fs->regs.reg[8].loc.offset = (long)&sf->REG_NAME(r8) - new_cfa;
+  fs->regs.reg[9].how = REG_SAVED_OFFSET;
+  fs->regs.reg[9].loc.offset = (long)&sf->REG_NAME(r9) - new_cfa;
+  fs->regs.reg[10].how = REG_SAVED_OFFSET;
+  fs->regs.reg[10].loc.offset = (long)&sf->REG_NAME(r10) - new_cfa;
+  fs->regs.reg[11].how = REG_SAVED_OFFSET;
+  fs->regs.reg[11].loc.offset = (long)&sf->REG_NAME(r11) - new_cfa;
+  fs->regs.reg[12].how = REG_SAVED_OFFSET;
+  fs->regs.reg[12].loc.offset = (long)&sf->REG_NAME(r12) - new_cfa;
+  fs->regs.reg[13].how = REG_SAVED_OFFSET;
+  fs->regs.reg[13].loc.offset = (long)&sf->REG_NAME(r13) - new_cfa;
+  fs->regs.reg[14].how = REG_SAVED_OFFSET;
+  fs->regs.reg[14].loc.offset = (long)&sf->REG_NAME(r14) - new_cfa;
+  fs->regs.reg[15].how = REG_SAVED_OFFSET;
+  fs->regs.reg[15].loc.offset = (long)&sf->REG_NAME(r15) - new
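
As a cross-check of the magic constants in the trampoline test, here is a small sketch that spells out the four instructions byte by byte and reads them back as little-endian 32-bit words. The byte values are the standard x86-64 encodings of the instructions quoted in the comment; the names sigtramp_pattern, le32_at and looks_like_sigtramp are made up for this illustration and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Encodings of the four amd64 sigtramp instructions from the comment.  */
const unsigned char sigtramp_pattern[16] = {
  0x48, 0x8d, 0x7c, 0x24, 0x10,             /* lea 0x10(%rsp),%rdi */
  0x6a, 0x00,                               /* pushq $0x0 */
  0x48, 0xc7, 0xc0, 0xa1, 0x01, 0x00, 0x00, /* mov $0x1a1,%rax */
  0x0f, 0x05                                /* syscall */
};

/* Read four bytes as a little-endian 32-bit word, which is what an
   unsigned int load does on x86.  */
uint32_t
le32_at (const unsigned char *p)
{
  return (uint32_t) p[0]
	 | ((uint32_t) p[1] << 8)
	 | ((uint32_t) p[2] << 16)
	 | ((uint32_t) p[3] << 24);
}

/* Byte-wise equivalent of the four word comparisons.  */
bool
looks_like_sigtramp (const void *ra)
{
  return memcmp (ra, sigtramp_pattern, sizeof sigtramp_pattern) == 0;
}
```

Under this encoding the words at offsets 0, 4, 8 and 12 come out as 0x247c8d48, 0x48006a10, 0x01a1c0c7 and 0x050f0000. A memcmp against the byte array is arguably easier to audit than four word constants, at the cost of one library call.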

[PATCH] backport libjava signal handling for FreeBSD (amd64/i386)

2015-06-23 Thread Andreas Tobler

Hi again,

number three.

This one is also on my list to be committed to 5.1.

Results are in the usual place.

Thanks,
Andreas

2015-06-22  Andreas Tobler  

Backport from mainline
2015-05-28  Andreas Tobler  

* configure.host: Add bits for FreeBSD amd64 and i386.
* configure.ac: Add signal handler for FreeBSD (amd64/i386)
* configure: Regenerate.
* include/freebsd-signal.h: New file.

2015-05-26  Andreas Tobler  

* testsuite/libjava.jni/jni.exp (gcj_jni_get_cxxflags_invocation): Add
libiconv for FreeBSD to cxxflags.


Index: configure
===
--- configure   (revision 224751)
+++ configure   (working copy)
@@ -24442,6 +24442,9 @@
  powerpc*-*-aix*)
 SIGNAL_HANDLER=include/aix-signal.h
 ;;
+ i?86-*-freebsd* | x86_64-*-freebsd*)
+SIGNAL_HANDLER=include/freebsd-signal.h
+;;
  *)
 SIGNAL_HANDLER=include/default-signal.h
 ;;
Index: configure.ac
===
--- configure.ac(revision 224751)
+++ configure.ac(working copy)
@@ -1755,6 +1755,9 @@
  powerpc*-*-aix*)
 SIGNAL_HANDLER=include/aix-signal.h
 ;;
+ i?86-*-freebsd* | x86_64-*-freebsd*)
+SIGNAL_HANDLER=include/freebsd-signal.h
+;;
  *)
 SIGNAL_HANDLER=include/default-signal.h
 ;;
Index: configure.host
===
--- configure.host  (revision 224751)
+++ configure.host  (working copy)
@@ -338,6 +338,8 @@
;;
   *-*-freebsd*)
slow_pthread_self=
+   can_unwind_signal=yes
+   DIVIDESPEC=-fuse-divide-subroutine
;;
   *-mingw*)
 libgcj_flags="${libgcj_flags} -fno-omit-frame-pointer"
Index: include/freebsd-signal.h
===
--- include/freebsd-signal.h(revision 0)
+++ include/freebsd-signal.h(working copy)
@@ -0,0 +1,48 @@
+/* freebsd-signal.h - Catch runtime signals and turn them into exceptions,
+   on a FreeBSD system.  */
+
+/* Copyright (C) 2015 Free Software Foundation
+
+   This file is part of libgcj.
+
+This software is copyrighted work licensed under the terms of the
+Libgcj License.  Please consult the file "LIBGCJ_LICENSE" for
+details.  */
+
+/* This file is really more of a specification.  The rest of the system
+   should be arranged so that this Just Works.  */
+
+#ifndef JAVA_SIGNAL_H
+# define JAVA_SIGNAL_H 1
+
+#include 
+#include 
+
+# define HANDLE_SEGV 1
+# define HANDLE_FPE  1
+
+# define SIGNAL_HANDLER(_name) \
+  static void _name (int _dummy __attribute__ ((unused)))
+
+# define MAKE_THROW_FRAME(_exception)
+
+# define INIT_SEGV \
+  do { \
+struct sigaction sa;   \
+sa.sa_handler = catch_segv;\
+sigemptyset (&sa.sa_mask); \
+sa.sa_flags = SA_NODEFER;  \
+sigaction (SIGBUS, &sa, NULL); \
+sigaction (SIGSEGV, &sa, NULL);\
+} while (0)
+
+# define INIT_FPE  \
+  do { \
+struct sigaction sa;   \
+sa.sa_handler = catch_fpe; \
+sigemptyset (&sa.sa_mask); \
+sa.sa_flags = SA_NODEFER;  \
+sigaction (SIGFPE, &sa, NULL); \
+} while (0)
+
+#endif /* JAVA_SIGNAL_H */
Index: testsuite/libjava.jni/jni.exp
===
--- testsuite/libjava.jni/jni.exp   (revision 224751)
+++ testsuite/libjava.jni/jni.exp   (working copy)
@@ -274,6 +274,11 @@
 eval lappend cxxflags "-shared-libgcc -lgcj $libiconv"
   }
 
+  # FreeBSD needs -liconv linked, otherwise we get some unresolved.
+  if { [istarget "*-*-freebsd*"] } {
+eval lappend cxxflags "$libiconv"
+  }
+
   # Make sure libgcc unwinder is used on 64-bit Solaris 10+/x86 rather than
   # the libc one.
   if { [istarget "*-*-solaris*"] } {
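
For anyone who wants to try the sigaction setup from INIT_FPE outside of libgcj, here is a minimal standalone sketch. It installs a handler the same way (empty mask, SA_NODEFER) and delivers the signal with raise() instead of a real arithmetic fault. The names fpe_seen, install_fpe_handler and raise_and_check are hypothetical. The handler here only sets a flag, whereas the libgcj handler turns the signal into a Java exception; presumably SA_NODEFER is used there because that handler does not return normally, so the signal must not stay blocked.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stddef.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler.  */
static volatile sig_atomic_t fpe_seen = 0;

static void
catch_fpe (int signo)
{
  (void) signo;
  fpe_seen = 1;
}

/* Same shape as INIT_FPE: empty mask, SA_NODEFER, SIGFPE.
   Returns 0 on success, -1 on failure, like sigaction itself.  */
int
install_fpe_handler (void)
{
  struct sigaction sa;
  sa.sa_handler = catch_fpe;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_NODEFER;
  return sigaction (SIGFPE, &sa, NULL);
}

/* Install the handler, send SIGFPE to ourselves, and report whether
   the handler ran (1) or not (0); -1 if installation failed.  */
int
raise_and_check (void)
{
  fpe_seen = 0;
  if (install_fpe_handler () != 0)
    return -1;
  raise (SIGFPE);
  return fpe_seen;
}
```

Returning from the handler is fine here because the signal comes from raise(); returning from a handler for a hardware-generated SIGFPE is undefined.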


[PATCH] libgomp sysctl check

2015-06-23 Thread Andreas Tobler

Hi all,

this patch fixes a long standing bug in the libgomp configury.
The initial patch was correct, but the commit I did was wrong.

Ok for trunk?

Thanks,

Andreas

2015-06-23  Andreas Tobler  

* configure.ac: Fix check for header 
* configure: Regenerate.
* config.h.in: Likewise.

Index: configure.ac
===
--- configure.ac(revision 224759)
+++ configure.ac(working copy)
@@ -170,7 +170,7 @@
 AC_STDC_HEADERS
 AC_HEADER_TIME
 ACX_HEADER_STRING
-AC_CHECK_HEADERS(pthread.h unistd.h semaphore.h sys/loadavg.h sys/time.h sys/time.h)
+AC_CHECK_HEADERS(pthread.h unistd.h semaphore.h sys/loadavg.h sys/sysctl.h sys/time.h)


 GCC_HEADER_STDINT(gstdint.h)
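
To illustrate why the missing header check matters: AC_CHECK_HEADERS defines HAVE_SYS_SYSCTL_H in config.h only when sys/sysctl.h is listed and found, so any code guarded by that macro is silently compiled out otherwise. A minimal sketch of such a guarded CPU-count query follows. The function name online_cpu_count and the sysconf fallback are assumptions made for this example, not code taken from libgomp.

```c
#include <stddef.h>
#include <unistd.h>

#ifdef HAVE_SYS_SYSCTL_H
# include <sys/types.h>
# include <sys/sysctl.h>
#endif

/* Return the number of CPUs, or 1 if it cannot be determined.  */
int
online_cpu_count (void)
{
#ifdef HAVE_SYS_SYSCTL_H
  /* BSD-style query; only compiled when configure found the header.  */
  int mib[2] = { CTL_HW, HW_NCPU };
  int ncpu = 0;
  size_t len = sizeof ncpu;
  if (sysctl (mib, 2, &ncpu, &len, NULL, 0) == 0 && ncpu > 0)
    return ncpu;
#endif
#ifdef _SC_NPROCESSORS_ONLN
  /* Fallback, available on most Unix-like systems as an extension.  */
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  if (n > 0)
    return (int) n;
#endif
  return 1;
}
```

With the duplicated sys/time.h in the header list, HAVE_SYS_SYSCTL_H is never defined and the first branch is dead code even on systems that have the header.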



Re: [PATCH] libgomp sysctl check

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 07:36:22PM +0200, Andreas Tobler wrote:
> Hi all,
> 
> this patch fixes a long standing bug in the libgomp configury.
> The initial patch was correct, but the commit I did was wrong.
> 
> Ok for trunk?
> 
> Thanks,
> 
> Andreas
> 
> 2015-06-23  Andreas Tobler  
> 
>   * configure.ac: Fix check for header 
> * configure: Regenerate.
> * config.h.in: Likewise.

The last 2 lines are weirdly indented (should be one tab before *),
the first line lacks a full stop at the end.
Ok with those changes.

> --- configure.ac  (revision 224759)
> +++ configure.ac  (working copy)
> @@ -170,7 +170,7 @@
>  AC_STDC_HEADERS
>  AC_HEADER_TIME
>  ACX_HEADER_STRING
> -AC_CHECK_HEADERS(pthread.h unistd.h semaphore.h sys/loadavg.h sys/time.h
> sys/time.h)
> +AC_CHECK_HEADERS(pthread.h unistd.h semaphore.h sys/loadavg.h sys/sysctl.h
> sys/time.h)
> 
>  GCC_HEADER_STDINT(gstdint.h)

Jakub


Re: [PATCH] backport FreeBSD add functionality to build PIE executables

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 07:26:09PM +0200, Andreas Tobler wrote:
> Hi all,
> 
> I'm going to commit this patch to 5.1 in the next days unless someone
> objects.
> 
> The patch is in my 5.1 tree since a longer time and I regularly post
> results.

Note that Richard announced a plan to do 5.2-rc2 on July 3rd, so you should
either do that before then, or after 5.2 is released (a general comment for
all the patches).  Otherwise you'd need an exception while the branch is frozen.

Jakub


Re: [PATCH] backport FreeBSD add functionality to build PIE executables

2015-06-23 Thread Andreas Tobler

On 23.06.15 19:50, Jakub Jelinek wrote:
> On Tue, Jun 23, 2015 at 07:26:09PM +0200, Andreas Tobler wrote:
> > Hi all,
> >
> > I'm going to commit this patch to 5.1 in the next days unless someone
> > objects.
> >
> > The patch is in my 5.1 tree since a longer time and I regularly post
> > results.
>
> Note, Richard announced plan to do 5.2-rc2 on July, 3rd, so either you
> should do that before, or after 5.2 is released (general comment for all the
> patches).  Or you'd need an exception when the branch is frozen.


By 'in the next days' I meant 26.6 at the latest. Is this early enough?

I'm ready to commit right now, but I wanted to give people some time to 
object :)


Thank you!

Andreas




Re: [PATCH] libgomp sysctl check

2015-06-23 Thread Andreas Tobler

On 23.06.15 19:47, Jakub Jelinek wrote:
> On Tue, Jun 23, 2015 at 07:36:22PM +0200, Andreas Tobler wrote:
> > Hi all,
> >
> > this patch fixes a long standing bug in the libgomp configury.
> > The initial patch was correct, but the commit I did was wrong.
> >
> > Ok for trunk?
> >
> > Thanks,
> >
> > Andreas
> >
> > 2015-06-23  Andreas Tobler
> >
> > * configure.ac: Fix check for header
> >  * configure: Regenerate.
> >  * config.h.in: Likewise.
>
> The last 2 lines are weirdly indented (should be one tab before *),
> the first line lacks full stop at the end.
> Ok with those changes.


Committed with issues fixed.

Thanks,
Andreas





