Re: [v3 patch] refactoring - pull out common data members as _Context

2015-09-21 Thread Tim Shen
Hi,

As the changes grow
(https://github.com/innocentim/gcc/commits/master), it's getting
harder to rebase them onto svn trunk. Can we start slowly reviewing
and checking these in? Should I post them one by one to the lis?

These patches typically break one giant piece of code (mainly
_Executor::_M_dfs) into different smaller ones and fix mostly known
issues and standard conformance issues.


-- 
Regards,
Tim Shen


Re: [PR middle-end/60832] Do not convert widest_int to tree just for printing it.

2015-09-21 Thread Richard Biener
On Sun, 20 Sep 2015, Manuel López-Ibáñez wrote:

> In do_warn_aggressive_loop_optimizations, we convert to a tree just to print a
> widest_int. Apart from overly complicated, this results in printing '3u'
> instead of just '3'.
> 
> Unfortunately, adding a printf-like conversion specifier would require making
> pretty-print.c link with wide-int.cc, which will include a lot of new
> dependencies into several other programs (gcov-tool for example). It would be
> possible to add the conversion specifier to every FE pretty-printer, but this
> still would require updating c-format.c, which is far from trivial. A simpler
> approach is to convert to a string rather than to a tree.

I'd add the pretty printer overloads to wide-int-print.[ch], but yes
it would require a change in c-format.c.

> In addition, "iteration 3 invokes undefined behavior within this loop"
> seems to me clearer than  "iteration 3 invokes undefined behavior;
> containing loop".
> 
> Boot&tested in x86_64-linux-gnu
> 
> OK?

Ok.  Let's remember this case and if more of these pop up try working
on a "proper" solution.

Thanks,
Richard.

> gcc/testsuite/ChangeLog:
> 
> 2015-09-20  Manuel López-Ibáñez  
> 
> PR middle-end/60832
> * gcc.dg/pr53265.c: Update.
> 
> gcc/ChangeLog:
> 
> 2015-09-20  Manuel López-Ibáñez  
> 
> PR middle-end/60832
> * tree-ssa-loop-niter.c (do_warn_aggressive_loop_optimizations):
> Print i_bound without converting it to a tree.
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

[SH][committed] Remove SH5/SH64 documentation

2015-09-21 Thread Oleg Endo
Hi,

As announced here https://gcc.gnu.org/ml/gcc/2015-08/msg00101.html this
patch removes some SH5/SH64 related documentation.
Tested with make info dvi pdf.
Committed as r227959.

Cheers,
Oleg

gcc/ChangeLog:
* doc/invoke.texi (SH Options): Undocument SH5/SH64 related
options.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 227958)
+++ gcc/doc/invoke.texi	(working copy)
@@ -983,16 +983,12 @@
 -m3  -m3e @gol
 -m4-nofpu  -m4-single-only  -m4-single  -m4 @gol
 -m4a-nofpu -m4a-single-only -m4a-single -m4a -m4al @gol
--m5-64media  -m5-64media-nofpu @gol
--m5-32media  -m5-32media-nofpu @gol
--m5-compact  -m5-compact-nofpu @gol
 -mb  -ml  -mdalign  -mrelax @gol
--mbigtable -mfmovd -mhitachi -mrenesas -mno-renesas -mnomacsave @gol
+-mbigtable -mfmovd -mrenesas -mno-renesas -mnomacsave @gol
 -mieee -mno-ieee -mbitops  -misize  -minline-ic_invalidate -mpadstruct @gol
 -mspace -mprefergot  -musermode -multcost=@var{number} -mdiv=@var{strategy} @gol
 -mdivsi3_libfunc=@var{name} -mfixed-range=@var{register-range} @gol
--mindexed-addressing -mgettrcost=@var{number} -mpt-fixed @gol
--maccumulate-outgoing-args -minvalid-symbols @gol
+-maccumulate-outgoing-args @gol
 -matomic-model=@var{atomic-model} @gol
 -mbranch-cost=@var{num} -mzdcbranch -mno-zdcbranch @gol
 -mcbranch-force-delay-slot @gol
@@ -20805,33 +20801,6 @@
 @option{-dsp} to the assembler.  GCC doesn't generate any DSP
 instructions at the moment.
 
-@item -m5-32media
-@opindex m5-32media
-Generate 32-bit code for SHmedia.
-
-@item -m5-32media-nofpu
-@opindex m5-32media-nofpu
-Generate 32-bit code for SHmedia in such a way that the
-floating-point unit is not used.
-
-@item -m5-64media
-@opindex m5-64media
-Generate 64-bit code for SHmedia.
-
-@item -m5-64media-nofpu
-@opindex m5-64media-nofpu
-Generate 64-bit code for SHmedia in such a way that the
-floating-point unit is not used.
-
-@item -m5-compact
-@opindex m5-compact
-Generate code for SHcompact.
-
-@item -m5-compact-nofpu
-@opindex m5-compact-nofpu
-Generate code for SHcompact in such a way that the
-floating-point unit is not used.
-
 @item -mb
 @opindex mb
 Compile code for the processor in big-endian mode.
@@ -21006,68 +20975,10 @@
 @item -mdiv=@var{strategy}
 @opindex mdiv=@var{strategy}
 Set the division strategy to be used for integer division operations.
-For SHmedia @var{strategy} can be one of: 
+@var{strategy} can be one of: 
 
 @table @samp
 
-@item fp 
-Performs the operation in floating point.  This has a very high latency,
-but needs only a few instructions, so it might be a good choice if
-your code has enough easily-exploitable ILP to allow the compiler to
-schedule the floating-point instructions together with other instructions.
-Division by zero causes a floating-point exception.
-
-@item inv
-Uses integer operations to calculate the inverse of the divisor,
-and then multiplies the dividend with the inverse.  This strategy allows
-CSE and hoisting of the inverse calculation.  Division by zero calculates
-an unspecified result, but does not trap.
-
-@item inv:minlat
-A variant of @samp{inv} where, if no CSE or hoisting opportunities
-have been found, or if the entire operation has been hoisted to the same
-place, the last stages of the inverse calculation are intertwined with the
-final multiply to reduce the overall latency, at the expense of using a few
-more instructions, and thus offering fewer scheduling opportunities with
-other code.
-
-@item call
-Calls a library function that usually implements the @samp{inv:minlat}
-strategy.
-This gives high code density for @code{m5-*media-nofpu} compilations.
-
-@item call2
-Uses a different entry point of the same library function, where it
-assumes that a pointer to a lookup table has already been set up, which
-exposes the pointer load to CSE and code hoisting optimizations.
-
-@item inv:call
-@itemx inv:call2
-@itemx inv:fp
-Use the @samp{inv} algorithm for initial
-code generation, but if the code stays unoptimized, revert to the @samp{call},
-@samp{call2}, or @samp{fp} strategies, respectively.  Note that the
-potentially-trapping side effect of division by zero is carried by a
-separate instruction, so it is possible that all the integer instructions
-are hoisted out, but the marker for the side effect stays where it is.
-A recombination to floating-point operations or a call is not possible
-in that case.
-
-@item inv20u
-@itemx inv20l
-Variants of the @samp{inv:minlat} strategy.  In the case
-that the inverse calculation is not separated from the multiply, they speed
-up division where the dividend fits into 20 bits (plus sign where applicable)
-by inserting a test to skip a number of operations in this case; this test
-slows down the case of larger dividends.  @samp{inv20u} assumes the case of a such
-a small dividend to be unlikely, and @samp{inv20l} assumes it to be likely.
-
-@end table
-
-For targets other than SHmedia @var{strat

[PATCH] [ARM] Replacing variable swaps that use a temporary variable with a call to std::swap in gcc/config/arm/arm.c

2015-09-21 Thread Bilyan Borisov

Replacing variable swaps that use a temporary variable with a call to 
std::swap. Tested against arm-none-eabi target including a variant with neon 
enabled.

2015-XX-XX  Bilyan Borisov  

* config/arm/arm.c (thumb_output_move_mem_multiple): Replaced
operands[4] operands[5] swap with std::swap, removed tmp variable.
(arm_evpc_neon_vzip): Replaced in0/in1 and
out0/out1 swaps with std::swap, removed x variable.
(arm_evpc_neon_vtrn): Replaced in0/int1 and
out0/out1 swaos with std::swap, removed x variable.
(arm_expand_vec_perm_const_1): Replaced
d->op0/d->op1 swap with std::swap, removed x variable.
(arm_evpc_neon_vuzp): Replaced in0/in1 and
out0/out1 swaps with std::swap, removed x variable.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index fa4e083adfe215b5820237f3cc6b449dbdefc778..dc549fe470e5a82d740cf8014057ee2cc1d54085 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25362,17 +25362,12 @@ thumb_load_double_from_address (rtx *operands)
 const char *
 thumb_output_move_mem_multiple (int n, rtx *operands)
 {
-  rtx tmp;
-
   switch (n)
 {
 case 2:
   if (REGNO (operands[4]) > REGNO (operands[5]))
-	{
-	  tmp = operands[4];
-	  operands[4] = operands[5];
-	  operands[5] = tmp;
-	}
+	std::swap (operands[4], operands[5]);
+
   output_asm_insn ("ldmia\t%1!, {%4, %5}", operands);
   output_asm_insn ("stmia\t%0!, {%4, %5}", operands);
   break;
@@ -27885,7 +27880,7 @@ static bool
 arm_evpc_neon_vuzp (struct expand_vec_perm_d *d)
 {
   unsigned int i, odd, mask, nelt = d->nelt;
-  rtx out0, out1, in0, in1, x;
+  rtx out0, out1, in0, in1;
   rtx (*gen)(rtx, rtx, rtx, rtx);
 
   if (GET_MODE_UNIT_SIZE (d->vmode) >= 8)
@@ -27929,14 +27924,14 @@ arm_evpc_neon_vuzp (struct expand_vec_perm_d *d)
   in1 = d->op1;
   if (BYTES_BIG_ENDIAN)
 {
-  x = in0, in0 = in1, in1 = x;
+  std::swap (in0, in1);
   odd = !odd;
 }
 
   out0 = d->target;
   out1 = gen_reg_rtx (d->vmode);
   if (odd)
-x = out0, out0 = out1, out1 = x;
+std::swap (out0, out1);
 
   emit_insn (gen (out0, in0, in1, out1));
   return true;
@@ -27948,7 +27943,7 @@ static bool
 arm_evpc_neon_vzip (struct expand_vec_perm_d *d)
 {
   unsigned int i, high, mask, nelt = d->nelt;
-  rtx out0, out1, in0, in1, x;
+  rtx out0, out1, in0, in1;
   rtx (*gen)(rtx, rtx, rtx, rtx);
 
   if (GET_MODE_UNIT_SIZE (d->vmode) >= 8)
@@ -27996,14 +27991,14 @@ arm_evpc_neon_vzip (struct expand_vec_perm_d *d)
   in1 = d->op1;
   if (BYTES_BIG_ENDIAN)
 {
-  x = in0, in0 = in1, in1 = x;
+  std::swap (in0, in1);
   high = !high;
 }
 
   out0 = d->target;
   out1 = gen_reg_rtx (d->vmode);
   if (high)
-x = out0, out0 = out1, out1 = x;
+std::swap (out0, out1);
 
   emit_insn (gen (out0, in0, in1, out1));
   return true;
@@ -28089,7 +28084,7 @@ static bool
 arm_evpc_neon_vtrn (struct expand_vec_perm_d *d)
 {
   unsigned int i, odd, mask, nelt = d->nelt;
-  rtx out0, out1, in0, in1, x;
+  rtx out0, out1, in0, in1;
   rtx (*gen)(rtx, rtx, rtx, rtx);
 
   if (GET_MODE_UNIT_SIZE (d->vmode) >= 8)
@@ -28134,14 +28129,14 @@ arm_evpc_neon_vtrn (struct expand_vec_perm_d *d)
   in1 = d->op1;
   if (BYTES_BIG_ENDIAN)
 {
-  x = in0, in0 = in1, in1 = x;
+  std::swap (in0, in1);
   odd = !odd;
 }
 
   out0 = d->target;
   out1 = gen_reg_rtx (d->vmode);
   if (odd)
-x = out0, out0 = out1, out1 = x;
+std::swap (out0, out1);
 
   emit_insn (gen (out0, in0, in1, out1));
   return true;
@@ -28264,14 +28259,11 @@ arm_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
   if (d->perm[0] >= d->nelt)
 {
   unsigned i, nelt = d->nelt;
-  rtx x;
 
   for (i = 0; i < nelt; ++i)
 	d->perm[i] = (d->perm[i] + nelt) & (2 * nelt - 1);
 
-  x = d->op0;
-  d->op0 = d->op1;
-  d->op1 = x;
+  std::swap (d->op0, d->op1);
 }
 
   if (TARGET_NEON)


[PATCH] Fix PR67651

2015-09-21 Thread Richard Biener

The following fixes a very old bug in RTL nonzero_address_p which fails
to consider -fno-delete-null-pointer-checks.  Fixing that makes the
testcase

extern unsigned int _vector_table;

int main(void)
{
  unsigned int *vector_base = &_vector_table;

  if (vector_base == 0) {
  return 1;
  } else {
  return 2;
  }
}

not optimize the check with that flag.  We appearantly never got this
right, so I'll install this on trunk only.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-09-21  Richard Biener  

PR middle-end/67651
* rtlanal.c (nonzero_address_p): SYMBOL_REFs may have zero
address with -fno-delete-null-pointer-checks.

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 227899)
+++ gcc/rtlanal.c   (working copy)
@@ -723,7 +723,7 @@ nonzero_address_p (const_rtx x)
   switch (code)
 {
 case SYMBOL_REF:
-  return !SYMBOL_REF_WEAK (x);
+  return flag_delete_null_pointer_checks && !SYMBOL_REF_WEAK (x);
 
 case LABEL_REF:
   return true;


Re: [ARM] Fix PR middle-end/65958

2015-09-21 Thread Eric Botcazou
> On targets using thumb1, I can see:
> - the new test failing (you should probably add a dg-skip or an
> effective-target directive)

OK, I have added:

/* { dg-skip-if "" { arm_thumb1 } } */

as in the ivopts-orig_biv-inc.c test.

> - gcc.dg/pr48134.c now fails, saying:
> sorry, unimplemented: -fstack-check=specific for THUMB1
> while the error message is the same as for your new test, I'm
> wondering why/how it actually passed before your patch?

Yeah, it's a bug, -fstack-check=specific should have been rejected before for 
all ARM targets since it was not implemented.  Will fix.

-- 
Eric Botcazou


Re: [PATCH tree-inline] do not say "called from here" with UNKNOWN_LOCATION

2015-09-21 Thread Richard Biener
On Fri, Sep 18, 2015 at 8:47 PM, Manuel López-Ibáñez
 wrote:
> And now with the patch.
>
> On 18 September 2015 at 20:40, Manuel López-Ibáñez
>  wrote:
>> In https://sourceware.org/ml/libc-alpha/2014-12/msg00300.html, we give a
>> "called from here" note without actually having a location, which looks
>> strange. I haven't been able to generate such a testcase. In this patch, we
>> assert this cannot happen when checking and simply skip the extra note in
>> release mode.
>>
>> Boot&tested on x86_64-linux-gnu
>>
>> OK?

Hmm.

input_location is set from the call stmt:

  /* FIXME: instantiate_decl isn't called by inlinable_function_p.  */
  saved_location = input_location;
  input_location = gimple_location (stmt);

it would be nice to get rid of that.

If the call is artificially generated it might have no location so I
think the assert
is bogus.  Also in the case of UNKNOWN_LOCATION it would be nice to at
least note the function we are failing to inline to (thus, use
DECL_SOURCE_LOCATION
of cfun->decl).  So better add a diag_location and compute that upfront to avoid
repeating the check.

Did you investigate the glibc case on whether caller or callee are artificial?

Richard.

>>
>> gcc/testsuite/ChangeLog:
>>
>> 2015-09-18  Manuel López-Ibáñez  
>>
>> * gcc.target/i386/inline_error.c (int bar): Use dg-message for note.
>> * gcc.target/i386/pr57756.c (static __inline int caller): Likewise.
>> * gcc.target/i386/pr59789.c (f1): Likewise.
>> * gcc.target/i386/intrinsics_5.c (__m128i foo): Likewise.
>> * gcc.target/i386/intrinsics_6.c: Likewise.
>> * gcc.dg/winline-5.c (int t): Likewise.
>> * gcc.dg/winline-9.c (t): Likewise.
>> * gcc.dg/always_inline2.c (q): Likewise.
>> * gcc.dg/winline-2.c (inline int t): Likewise.
>> * gcc.dg/winline-6.c: Likewise.
>> * gcc.dg/winline-10.c (void g): Likewise.
>> * gcc.dg/pr49243.c (void parse): Likewise.
>> * gcc.dg/always_inline3.c (q2): Likewise.
>> * gcc.dg/winline-3.c: Likewise.
>> * gcc.dg/winline-7.c (inline void *t): Likewise.
>>
>> gcc/ChangeLog:
>>
>> 2015-09-18  Manuel López-Ibáñez  
>>
>> * tree-inline.c (expand_call_inline): Use inform for extra note.
>> Do not give "called from here" with UNKNOWN_LOCATION.


Re: [PATCH] 2015-09-03 Benedikt Huber Philipp Tomsich

2015-09-21 Thread Marcus Shawcroft
Hi,

Thanks for your work on this.  There are a bunch of predominantly
style nits in line below.  My none nit comments on this patch are:

This should be left turned off for all cores where we have not seen
benchmark numbers to indicate that this optimization is a benefit, we
can take patches for each core in the future once numbers are
available.

Given the absence of numbers for some of the cores and the significant
impact on one core this should remain disabled by default on generic.

We don't necessarily want a proliferation of user orientated tuning
options without good reason, instead the per core default tuning
behaviour should make the right choice per core, hence I think -mrecip
should be dropped.  We already have the -moverride= mechanism to
provide a developer orientated mechanism to override the per core
tuning flag.

Further comments inline

Thanks

/Marcus

On 7 September 2015 at 11:40, Benedikt Huber
 wrote:


--- a/gcc/config/aarch64/aarch64-tuning-flags.def
+++ b/gcc/config/aarch64/aarch64-tuning-flags.def
@@ -29,4 +29,5 @@
  AARCH64_TUNE_ to give an enum name. */

 AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
+AARCH64_EXTRA_TUNING_OPTION ("mrecip_default_enabled", MRECIP_DEFAULT_ENABLED)

This name is exposed via the -moverride=tune= option, perhaps a
better name would be:
+AARCH64_EXTRA_TUNING_OPTION ("recip_sqrt", RECIP_SQRT)


> +/* Add builtins for reciprocal square root. */
> +void
> +aarch64_add_builtin_rsqrt (void)
> +{
> +  tree fndecl = NULL;
> +  tree ftype = NULL;
> +
> +  tree V2SF_type_node = build_vector_type (float_type_node, 2);
> +  tree V2DF_type_node = build_vector_type (double_type_node, 2);
> +  tree V4SF_type_node = build_vector_type (float_type_node, 4);
> +
> +  ftype = build_function_type_list (double_type_node, double_type_node, 
> NULL_TREE);

Line length exceeds 80 characters.

> +  fndecl = add_builtin_function ("__builtin_aarch64_rsqrt_df",
> +ftype, AARCH64_BUILTIN_RSQRT_DF, BUILT_IN_MD, NULL, NULL_TREE);
> +  aarch64_builtin_decls[AARCH64_BUILTIN_RSQRT_DF] = fndecl;
> +
> +  ftype = build_function_type_list (float_type_node, float_type_node, 
> NULL_TREE);

Line length exceeds 80 characters.

> +/* Function to expand reciprocal square root builtins. */
> +static rtx
> +aarch64_expand_builtin_rsqrt (int fcode, tree exp, rtx target)
> +{
> +  rtx pat;
> +  tree arg0 = CALL_EXPR_ARG (exp, 0);
> +  rtx op0 = expand_normal (arg0);
> +
> +  enum insn_code c;
> +
> +  switch (fcode)
> +{
> +  case AARCH64_BUILTIN_RSQRT_DF:
> +c = CODE_FOR_rsqrt_df2; break;

Leading blocks of 8 spaces should be TAB's, likewise the rest of the patch.

> +/* Return builtin for reciprocal square root. */
> +tree
> +aarch64_builtin_rsqrt (unsigned int fn, bool md_fn)

Blank line between function comments and function, likewise through
the rest of the patch please.

> diff --git a/gcc/config/aarch64/aarch64-opts.h 
> b/gcc/config/aarch64/aarch64-opts.h
> index bf6bb7b..f8e79cb 100644
> --- a/gcc/config/aarch64/aarch64-opts.h
> +++ b/gcc/config/aarch64/aarch64-opts.h
> @@ -73,4 +73,11 @@ enum aarch64_code_model {
>AARCH64_CMODEL_LARGE
>  };
>
> +/* Each core can have -mrecip enabled or disabled by default. */
> +enum aarch64_mrecip {

{ on new line please.

> +  AARCH64_MRECIP_OFF = 0,
> +  AARCH64_MRECIP_ON,
> +  AARCH64_MRECIP_DEFAULT,

Trailing , will give a "comma at end of enumerator list" warning every
time this file is included,  drop the comma please.

> +/* Function to decide when to use
> + * reciprocal square root builtins. */

Drop the leading * on follow on comment lines, here and elsewhere in
the patch please.

> +/* Select reciprocal square root initial estimate
> + * insn depending on machine mode. */
> +rsqrte_type get_rsqrte_type (enum machine_mode mode)

New line between type and function name please, likewise for other
instance in this patch.

> +  for (int i = 0; i < iterations; ++i)
> +{
> +  rtx x1 = gen_reg_rtx (mode);
> +  rtx x2 = gen_reg_rtx (mode);
> +  rtx x3 = gen_reg_rtx (mode);
> +  emit_set_insn (x2, gen_rtx_MULT (mode, x0, x0));
> +
> +  emit_insn ((*get_rsqrts_type (mode)) (x3, xsrc, x2));
> +
> +  emit_set_insn (x1, gen_rtx_MULT (mode, x0, x3));
> +  x0 = x1;
> +}
> +
> +  emit_move_insn (dst, x0);
> +  return;

Superflous return, drop it please.

> +#undef TARGET_BUILTIN_RECIPROCAL
> +#define TARGET_BUILTIN_RECIPROCAL aarch64_builtin_reciprocal
> +

The rest of these TARGET_* defines are in alphabetical order, please
insert the new one in order.

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> +
> +mlow-precision-recip-sqrt
> +Common Var(flag_mrecip_low_precision_sqrt) Optimization
> +Run fewer approximation steps to reduce latency and precision.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi

> +@item -mlow-precision-recip-sqrt
> +@item -mno-low-precision-recip-sqrt
> +@opindex -mlow-precision-recip-sqrt
> +@opindex -mn

Re: [RFC,PATCH] DWARF support for AIX

2015-09-21 Thread Richard Biener
On Sun, Sep 20, 2015 at 4:25 PM, David Edelsohn  wrote:
> Appended is an initial implementation of DWARF support for AIX.  The
> patch emits the correct sections and works with the existing DWARF
> support for AIX XCOFF in GDB ... somewhat.
>
> First, AIX only supports a subset of DWARF sections.  AIX does not
> support DWARF debug frame sections, although DWARF unwind frame
> sections are produced and consumed internally by GCC.  I initially
> tried bracketing dwarf2out.c:output_call_frame_info() with #ifdef, but
> then changed to setting targetm.debug_unwind_info() to UI_NONE.  This
> change prevents the debug frame info, but the debug location section
> newly appeared, which AIX also does not support.  So I must bracket
> output_location_lists with #ifdef AIX.  Setting and checking for
> something like a NULL section name specifically for AIX seems more
> complicated and cumbersome.
>
> Second, recent releases of XLC support DWARF debug info.  GDB
> understands the DWARF information from XLC.  GDB xcoffread.c performs
> the mostly normal
>
>   if (dwarf2_has_info (objfile, &dwarf2_xcoff_names))
> dwarf2_build_psymtabs (objfile);
>
>   dwarf2_build_frame_info (objfile);
>
> so I do not see anything unusual in what it expects to parse.
> HOWEVER, XLC DWARF sections begin with the DWARF version number, not
> the length, e.g.,
>
>.dwsect 0x0002  # section name .dwline
> .dwline:
> #   0x0400  # section address
> #   .long   0x0049  # section length
> .short  0x0002  # dwarf version
>
> .dwsect 0x0001  # section name .dwinfo
> .dwinfo:
> #   0x0500  # section address
> #   .long   0x013e  # section length
> .short  0x0002  # dwarf version
>
> And that is exactly the way the XLC disassembly appears, comments and
> everything.  The section length is commented out.
>
> If I manually comment out the section length information in the GCC
> DWARF debugging information, trivial tests of GDB on AIX function
> correctly, e.g., it shows the correct file, function and line numbers.
>
> GDB accepts and understands XLC DWARF debug information WITHOUT
> section length but does not understand GCC DWARF debug information
> WITH section length.  DWARF for other ELF targets includes the section
> length and does appear to hinder GDB.
>
> How can I adjust the DWARF produced by GCC or the DWARF consumed by
> GDB to address the section length information?

You probably need to add a target hook to do that.  Note this isn't
"section length"
but the size of the CU header.  At least on all other targets I know of:

.section.debug_info,"",@progbits
.Ldebug_info0:
.long   0x64# Length of Compilation Unit Info
.value  0x4 # DWARF version number

and the format of that header is specified by the DWARF standard.  So I wonder
why XLC chose to do sth non-conforming here (again :/)

Richard.

> Thanks, David
>
>* dwarf2out.c (dwarf2out_finish): Don't output location
> lists on AIX.
> * config/rs6000/rs6000.c (rs6000_xcoff_debug_unwind_info):
> New.
> (rs6000_xcoff_asm_named_section): Emit .dwsect pseudo-op
> for SECTION_DEBUG.
> (rs6000_xcoff_declare_function_name): Emit different
> .function pseudo-op when DWARF2_DEBUG.
> * config/rs6000/xcoff.h (TARGET_DEBUG_UNWIND_INFO):
> Redefine.
> * config/rs6000/aix71.h (DWARF2_DEBUGGING_INFO): Define.
> (PREFERRED_DEBUGGING_TYPE): Define.
> (DEBUG_INFO_SECTION): Define.
> (DEBUG_ABBREV_SECTION): Define.
> (DEBUG_ARANGES_SECTION): Define.
> (DEBUG_LINE_SECTION): Define.
> (DEBUG_PUBNAMES_SECTION): Define.
> (DEBUG_PUBTYPES_SECTION): Define.
> (DEBUG_STR_SECTION): Define.
> (DEBUG_RANGES_SECTION): Define.
>
> Index: dwarf2out.c
> ===
> --- dwarf2out.c (revision 227937)
> +++ dwarf2out.c (working copy)
> @@ -25486,6 +25486,7 @@
>output_abbrev_section ();
>  }
>
> +#ifndef TARGET_AIX_VERSION
>/* Output location list section if necessary.  */
>if (have_location_lists)
>  {
> @@ -25494,6 +25495,7 @@
>ASM_OUTPUT_LABEL (asm_out_file, loc_section_label);
>output_location_lists (comp_unit_die ());
>  }
> +#endif
>
>output_pubtables ();
>
> Index: config/rs6000/rs6000.c
> ===
> --- config/rs6000/rs6000.c  (revision 227937)
> +++ config/rs6000/rs6000.c  (working copy)
> @@ -30684,6 +30684,12 @@
>  #endif
>
>  #if TARGET_XCOFF
> +static enum unwind_info_type
> +rs6000_xcoff_debug_unwind_info (void)
> +{
> +  return UI_NONE;
> +}
> +
>  static void
>  rs6000_xcoff_asm_output_anchor (rtx symbol)
>  {
> @@ -30805,6 +30811,11 @@
>int smclass

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-21 Thread Alan Lawrence
[Resending in plain text] This makes sense to me now, although I find
your comment slightly confusing:

[] in that
+;; the meaning of HI and LO is always taken with a little-endian view of
+;; the vector

You mean vec_unpacks_{hi,lo} (which seems to go against the
*architectural* bit after this), or hi/lo in cases other than
vec_unpack (=> not "always"), or something else?

maybe s/always/usually/ or s/always/otherwise/ ?

Cheers, Alan

On 10 September 2015 at 15:02, James Greenhalgh
 wrote:
>
> On Wed, Sep 09, 2015 at 10:28:28AM +0100, Christophe Lyon wrote:
>> On 9 September 2015 at 10:31, James Greenhalgh  
>> wrote:
>> >
>> > Hi,
>> >
>> > This patch clears up some remaining confusion in the vector lane orderings
>> > for the two intrinsics mentioned in the title.
>> >
>> > Bootstrapped on aarch64-none-linux-gnu and regression tested for
>> > aarch64_be-none-elf with no issues.
>> >
>>
>> Does this actually fix an existing testcase?
>
> Yes, of course, sorry - that was a useless introduction to the patch!
>
> First, I've updated the patch with a testcase, which fails for me on
> aarch64_be-none-elf but not aarch64-*-*.
>
> The issue is that the RTL folding routines will happily fold through
> a vec_concat or a vec_select, which we have given the wrong operands
> to when in BYTES_BIG_ENDIAN mode.  The fix is similar to that which we
> have elsewhere in aarch64-simd.md, which is to split out the big
> and little endian forms of the patterns which need vec_concat, and
> to build a vec_par_cnst_*_half mask for the patterns which need
> vec_select. This keeps us in the GCC-view of lane ordering.
>
> There is test coverage that these patterns do the right thing for the
> vectorizer (I know, because I initially typoed s/le/be and saw tests
> gcc.dg/vect fall over), and the new testcase adds coverage for the
> expansion path through intrinsics.
>
> I've rebased on top of Alan's patch, which goes halfway to fixing the
> issue, but which didn't fix the float_truncate patterns, which had an
> incorrect vec_concat. That simplifies the patch considerably.
>
> Rechecked on aarch64_be-none-elf and aarch64-none-linux-gnu with no
> issues.
>
> OK?
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2015-09-09  James Greenhalgh  
>
> * config/aarch64/aarch64-simd.md
>
> (aarch64_float_truncate_hi_v4sf): Rewrite as an expand.
> (aarch64_float_truncate_hi_v4sf_le): New.
> (aarch64_float_truncate_hi_v4sf_be): Likewise.
>
> gcc/testsuite/
>
> 2015-09-09  James Greenhalgh  
>
> * gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c: New.
>


Re: [PATCH] Fix Cygwin bootstrap failing to find win32 libraries

2015-09-21 Thread Kai Tietz
2015-09-21 0:35 GMT+02:00 JonY <10wa...@gmail.com>:
> On 9/21/2015 02:05, Kai Tietz wrote:
>> 2015-09-16 13:42 GMT+02:00 JonY <10wa...@gmail.com>:
>>> libgcc is failing to find kerne32 etc during the 2nd stage when
>>> bootstraping, explicitly add w32api directory to search path.
>>>
>>> Patch OK?
>>>
>>> diff --git a/gcc/config/i386/cygwin.h b/gcc/config/i386/cygwin.h
>>> index 2a2a0bf..fd3bc0a 100644
>>> --- a/gcc/config/i386/cygwin.h
>>> +++ b/gcc/config/i386/cygwin.h
>>> @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
>>>
>>>  #undef STARTFILE_SPEC
>>>  #define STARTFILE_SPEC "\
>>> +  -L%R/usr/lib/w32api \
>>>%{!shared: %{!mdll: crt0%O%s \
>>>%{pg:gcrt0%O%s}}}\
>>>%{shared:crtbeginS.o%s;:crtbegin.o%s} \
>>>
>>>
>>
>> Hello JonY,
>>
>> patch is ok with proper ChangeLog.  For sample of format see existing
>> ChangeLog files in gcc-subdirectory.
>>
>> Thanks,
>> Kai
>>
>
> 2015-09-16  Jonathan Yong  <10wa...@gmail.com>
>
> * config/i386/cygwin.h (STARTFILE_SPEC): Explicitly search
>   sysroot/usr/lib/32api for additional win32 libraries,
>   fixes failing Cygwin bootstrapping.
>
> Message OK?
>

Committed for you at rev. 227962.

Kai


Re: [PATCH tree-inline] do not say "called from here" with UNKNOWN_LOCATION

2015-09-21 Thread Manuel López-Ibáñez
On 21 September 2015 at 10:18, Richard Biener
 wrote:
> input_location is set from the call stmt:
>
>   /* FIXME: instantiate_decl isn't called by inlinable_function_p.  */
>   saved_location = input_location;
>   input_location = gimple_location (stmt);
>
> it would be nice to get rid of that.

I could replace all uses of input_location in this function by
gimple_location(stmt) as I noted in the comments. Would that be ok if
it works? I'm not sure I can prove that input_location is not used
behind the scenes for some other purpose (all the more reason to kill
input_location once and for all). Friends, don't let friends use
input_location in new code!

> If the call is artificially generated it might have no location so I
> think the assert
> is bogus.  Also in the case of UNKNOWN_LOCATION it would be nice to at

I was sure of that, but I was also surprised that not a single
testcase was triggering it. I can re-submit without the assert.

> least note the function we are failing to inline to (thus, use
> DECL_SOURCE_LOCATION
> of cfun->decl).  So better add a diag_location and compute that upfront to 
> avoid
> repeating the check.

   error ("inlining failed in call to always_inline %q+F: %s", fn,
  cgraph_inline_failed_string (reason));

The call is using '+F', thus the location is set to some location
related to F, depending on which *_printer function is active at that
moment. cp_printer uses location_of, and default_tree_printer uses
DECL_SOURCE_LOCATION. Is the default_tree_printer always used at this
point? If yes, I completely agree we should use an explicit
DECL_SOURCE_LOCATION. The meaning of '+' is not only opaque but it
breaks #pragma GCC diagnostic.

> Did you investigate the glibc case on whether caller or callee are artificial?

No, I could not get glibc to compile. Unfortunately, the machine of
the compile farm that I use to work on GCC is too old. Installing
locally all up-to-date dependencies would require too much time from
me that is better spent on other things. Nevertheless, printing "cc1:
called from here" is always bogus. Not printing it is better. If we
could print something smarter, that would be perfect, but I don't know
how to do that, so "better" is good enough for me. Would you agree?

Cheers,

Manuel.


Re: [PATCH] [ARM] Replacing variable swaps that use a temporary variable with a call to std::swap in gcc/config/arm/arm.c

2015-09-21 Thread Kyrill Tkachov

Hi Bilyan,

On 21/09/15 08:41, Bilyan Borisov wrote:

Replacing variable swaps that use a temporary variable with a call to 
std::swap. Tested against arm-none-eabi target including a variant with neon 
enabled.

2015-XX-XX  Bilyan Borisov  

* config/arm/arm.c (thumb_output_move_mem_multiple): Replaced
operands[4] operands[5] swap with std::swap, removed tmp variable.
(arm_evpc_neon_vzip): Replaced in0/in1 and
out0/out1 swaps with std::swap, removed x variable.
(arm_evpc_neon_vtrn): Replaced in0/int1 and
out0/out1 swaos with std::swap, removed x variable.
(arm_expand_vec_perm_const_1): Replaced
d->op0/d->op1 swap with std::swap, removed x variable.
(arm_evpc_neon_vuzp): Replaced in0/in1 and
out0/out1 swaps with std::swap, removed x variable.



This is ok for trunk.
I've committed this for you with r227963.

Thanks,
Kyrill



Re: [PATCH tree-inline] do not say "called from here" with UNKNOWN_LOCATION

2015-09-21 Thread Richard Biener
On Mon, Sep 21, 2015 at 11:59 AM, Manuel López-Ibáñez
 wrote:
> On 21 September 2015 at 10:18, Richard Biener
>  wrote:
>> input_location is set from the call stmt:
>>
>>   /* FIXME: instantiate_decl isn't called by inlinable_function_p.  */
>>   saved_location = input_location;
>>   input_location = gimple_location (stmt);
>>
>> it would be nice to get rid of that.
>
> I could replace all uses of input_location in this function by
> gimple_location(stmt) as I noted in the comments. Would that be ok if
> it works? I'm not sure I can prove that input_location is not used
> behind the scenes for some other purpose (all the more reason to kill
> input_location once and for all). Friends, don't let friends use
> input_location in new code!

Yeah...  not sure how to check but to look for any changes in
generated cc1/cc1plus
debug info.  You could also try making it invalid (-1?) and hope
libcpp would eventually
blow up if that is used.

>> If the call is artificially generated it might have no location so I
>> think the assert
>> is bogus.  Also in the case of UNKNOWN_LOCATION it would be nice to at
>
> I was sure of that, but I was also surprised that not a single
> testcase was triggering it. I can re-submit without the assert.

please

>> least note the function we are failing to inline to (thus, use
>> DECL_SOURCE_LOCATION
>> of cfun->decl).  So better add a diag_location and compute that upfront to 
>> avoid
>> repeating the check.
>
>error ("inlining failed in call to always_inline %q+F: %s", fn,
>   cgraph_inline_failed_string (reason));
>
> The call is using '+F', thus the location is set to some location
> related to F, depending on which *_printer function is active at that
> moment. cp_printer uses location_of, and default_tree_printer uses
> DECL_SOURCE_LOCATION. Is the default_tree_printer always used at this
> point? If yes, I completely agree we should use an explicit
> DECL_SOURCE_LOCATION. The meaning of '+' is not only opaque but it
> breaks #pragma GCC diagnostic.

But it prints the location of the function we failed to inline.  I
want to retain
at least an approximation to the location of the call, which is the location
of the function we inline _to_.

>
>> Did you investigate the glibc case on whether caller or callee are 
>> artificial?
>
> No, I could not get glibc to compile. Unfortunately, the machine of
> the compile farm that I use to work on GCC is too old. Installing
> locally all up-to-date dependencies would require too much time from
> me that is better spent on other things. Nevertheless, printing "cc1:
> called from here" is always bogus. Not printing it is better. If we
> could print something smarter, that would be perfect, but I don't know
> how to do that, so "better" is good enough for me. Would you agree?

Well, it's easy to do better, see above.

Richard.

> Cheers,
>
> Manuel.


[PATCH, libgfortran] Fix FIND_FILE decls and use.

2015-09-21 Thread Kirill Yukhin
Hello,
When libgfortran is configured w/ HAVE_WORKING_STAT undefined
*and* current system is not MinGW - FIND_FILE_[DECL|ARGS} still
trying to use Windows's handles (id).

Proposed patch discriminates MinGW case under HAVE_WORKING_STAT
switched off.

Bootstrapped (on Linux).

libgfortran/
* io/unix.c (FIND_FILE0_DECL): Don't use `id' when
HAVE_WORKING_STAT is not defined and system is not
MinGW.
(FIND_FILE0_ARGS): Ditto.

Is it ok for trunk?

Is it ok for gcc-5-branch? (if bootstrap passed)

--
Thanks, K

$ git show c1d01f74051f65e75b0de73b1b43c76b9b2e1b02
commit c1d01f74051f65e75b0de73b1b43c76b9b2e1b02
Author: Kirill Yukhin 
Date:   Fri Sep 18 19:00:52 2015 +0300

Fix libgfortran/io/unix.c to allow not stable STAT for non MinGW.

diff --git a/libgfortran/io/unix.c b/libgfortran/io/unix.c
index b86bd67..772725a 100644
--- a/libgfortran/io/unix.c
+++ b/libgfortran/io/unix.c
@@ -1551,8 +1551,13 @@ compare_file_filename (gfc_unit *u, const char *name, 
int len)
 # define FIND_FILE0_DECL struct stat *st
 # define FIND_FILE0_ARGS st
 #else
-# define FIND_FILE0_DECL uint64_t id, const char *path
-# define FIND_FILE0_ARGS id, path
+# if defined(__MINGW32__)
+#  define FIND_FILE0_DECL uint64_t id, const char *path
+#  define FIND_FILE0_ARGS id, path
+# else
+#  define FIND_FILE0_DECL const char *path
+#  define FIND_FILE0_ARGS path
+# endif
 #endif

 /* find_file0()-- Recursive work function for find_file() */
  



Re: [PATCH tree-inline] do not say "called from here" with UNKNOWN_LOCATION

2015-09-21 Thread Manuel López-Ibáñez
On 21 September 2015 at 12:29, Richard Biener
 wrote:
>>> least note the function we are failing to inline to (thus, use
>>> DECL_SOURCE_LOCATION
>>> of cfun->decl).  So better add a diag_location and compute that upfront to 
>>> avoid
>>> repeating the check.
>>
>>error ("inlining failed in call to always_inline %q+F: %s", fn,
>>   cgraph_inline_failed_string (reason));
>>
>> The call is using '+F', thus the location is set to some location
>> related to F, depending on which *_printer function is active at that
>> moment. cp_printer uses location_of, and default_tree_printer uses
>> DECL_SOURCE_LOCATION. Is the default_tree_printer always used at this
>> point? If yes, I completely agree we should use an explicit
>> DECL_SOURCE_LOCATION. The meaning of '+' is not only opaque but it
>> breaks #pragma GCC diagnostic.
>
> But it prints the location of the function we failed to inline.  I
> want to retain
> at least an approximation to the location of the call, which is the location
> of the function we inline _to_.

I think I misunderstood you. Do you mean something like?

if (gimple_location (stmt) != UNKNOWN_LOCATION)
inform (gimple_location (stmt), "called from here");
else
inform (DECL_SOURCE_LOCATION (cfun->decl), "called from this function");


Cheers,

Manuel.


[PATCH, i386, AVX-512] Fix splitter for `not-xor' logic.

2015-09-21 Thread Kirill Yukhin
Hello Uroš,

This simple patch fixes mode (to iterator) in
splitter for `not-xor' logic.

Bootstrapped & reg-tested on trunk.

Is it ok for trunk?

Is it ok for gcc-5-branch? (if tested accordingly)

(I have no idea how to reference to define_split in
ChangeLog entry)

gcc/
* gcc/config/i386/i386.md (define_split not/xor SWI1248x): Use
iterator instead of fixed modes..

--
Thanks, K

commit fc8c797375b4fc8fc155070ccfcca52497991c48
Author: Kirill Yukhin 
Date:   Fri Sep 18 16:10:06 2015 +0300

AVX-512. Fix kxnor insn splits.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 7808705..f7230f0 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -8565,11 +8565,11 @@
(clobber (reg:CC FLAGS_REG))]
   "TARGET_AVX512F && reload_completed"
[(parallel [(set (match_dup 0)
-   (xor:HI (match_dup 0)
-   (match_dup 1)))
+   (xor:SWI1248x (match_dup 0)
+ (match_dup 1)))
   (clobber (reg:CC FLAGS_REG))])
 (set (match_dup 0)
-(not:HI (match_dup 0)))])
+(not:SWI1248x (match_dup 0)))])

 ;;There are kortrest[bdq] but no intrinsics for them.
 ;;We probably don't need to implement them.


Re: [PATCH, libgfortran] Fix FIND_FILE decls and use.

2015-09-21 Thread FX
Dear Kirill,

> When libgfortran is configured w/ HAVE_WORKING_STAT undefined
> *and* current system is not MinGW - FIND_FILE_[DECL|ARGS} still
> trying to use Windows's handles (id).

Well, if HAVE_WORKING_STAT is not defined, then it means some other mechanism 
has to be used. If your target is not mingw32 and stat() is not reliable, we’ll 
need to provide an alternate way of handling things. Your patch would default 
to the unusable stat().

What’s your target?

FX

Re: [PATCH tree-inline] do not say "called from here" with UNKNOWN_LOCATION

2015-09-21 Thread Richard Biener
On Mon, Sep 21, 2015 at 12:46 PM, Manuel López-Ibáñez
 wrote:
> On 21 September 2015 at 12:29, Richard Biener
>  wrote:
 least note the function we are failing to inline to (thus, use
 DECL_SOURCE_LOCATION
 of cfun->decl).  So better add a diag_location and compute that upfront to 
 avoid
 repeating the check.
>>>
>>>error ("inlining failed in call to always_inline %q+F: %s", fn,
>>>   cgraph_inline_failed_string (reason));
>>>
>>> The call is using '+F', thus the location is set to some location
>>> related to F, depending on which *_printer function is active at that
>>> moment. cp_printer uses location_of, and default_tree_printer uses
>>> DECL_SOURCE_LOCATION. Is the default_tree_printer always used at this
>>> point? If yes, I completely agree we should use an explicit
>>> DECL_SOURCE_LOCATION. The meaning of '+' is not only opaque but it
>>> breaks #pragma GCC diagnostic.
>>
>> But it prints the location of the function we failed to inline.  I
>> want to retain
>> at least an approximation to the location of the call, which is the location
>> of the function we inline _to_.
>
> I think I misunderstood you. Do you mean something like?
>
> if (gimple_location (stmt) != UNKNOWN_LOCATION)
> inform (gimple_location (stmt), "called from here");
> else
> inform (DECL_SOURCE_LOCATION (cfun->decl), "called from this 
> function");

Yes.  (now that location may also be UNKNOWN in which case I don't
have a good fallback idea and we can drop the note)

Richard.

>
> Cheers,
>
> Manuel.


Re: [AArch64][PATCH 1/5] Use atomic instructions for swap and fetch-update operations.

2015-09-21 Thread Matthew Wahab

On 18/09/15 08:58, James Greenhalgh wrote:

On Thu, Sep 17, 2015 at 05:37:55PM +0100, Matthew Wahab wrote:



diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 65d2cc9..0e71002 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -27,6 +27,7 @@
  UNSPECV_ATOMIC_CMPSW  ; Represent an atomic compare swap.
  UNSPECV_ATOMIC_EXCHG  ; Represent an atomic exchange.
  UNSPECV_ATOMIC_CAS; Represent an atomic CAS.
+UNSPECV_ATOMIC_SWP ; Represent an atomic SWP.
  UNSPECV_ATOMIC_OP ; Represent an atomic operation.
  ])

@@ -122,19 +123,19 @@
  )

  (define_insn_and_split "aarch64_compare_and_swap_lse"
-  [(set (reg:CC CC_REGNUM) ;; bool out
+  [(set (reg:CC CC_REGNUM)
  (unspec_volatile:CC [(const_int 0)] UNSPECV_ATOMIC_CMPSW))
-   (set (match_operand:GPI 0 "register_operand" "=&r") ;; val out
-(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q"))   ;; memory
+   (set (match_operand:GPI 0 "register_operand" "=&r")
+(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q"))
 (set (match_dup 1)
  (unspec_volatile:GPI
-  [(match_operand:GPI 2 "aarch64_plus_operand" "rI")   ;; expect
-   (match_operand:GPI 3 "register_operand" "r");; desired
-   (match_operand:SI 4 "const_int_operand")  ;; is_weak
-   (match_operand:SI 5 "const_int_operand")  ;; mod_s
-   (match_operand:SI 6 "const_int_operand")] ;; mod_f
+  [(match_operand:GPI 2 "aarch64_plus_operand" "rI")
+   (match_operand:GPI 3 "register_operand" "r")
+   (match_operand:SI 4 "const_int_operand")
+   (match_operand:SI 5 "const_int_operand")
+   (match_operand:SI 6 "const_int_operand")]


I'm not sure I understand the change here, those comments still look helpful
enough for understanding the pattern, what have a I missed?


That was part of an attempt to clean up some code. It's unnecessary and I've dropped 
the change.


Attached is the updated patch with some other changes:
- Simplified the atomic_exchange expander in line with reviews for
  other patches in the series.
- Removed the CC clobber from aarch64_atomic_exchange_lse, it was
  over-cautious.
- Added a missing entry to the change log (noting a whitespace fix).

Ok for trunk?
Matthew

gcc/
2015-09-21  Matthew Wahab  

* config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
Declare.
* config/aarch64/aarch64.c (aarch64_emit_atomic_swap): New.
(aarch64_gen_atomic_ldop): New.
(aarch64_split_atomic_op): Fix whitespace and add a comment.
* config/aarch64/atomics.md (UNSPECV_ATOMIC_SWP): New.
(aarch64_compare_and_swap_lse): Fix some whitespace.
(atomic_exchange): Replace with an expander.
(aarch64_atomic_exchange): New.
(aarch64_atomic_exchange_lse): New.
(aarch64_atomic_): Fix some whitespace.
(aarch64_atomic_swp): New.


gcc/testsuite/
2015-09-21  Matthew Wahab  

* gcc.target/aarch64/atomic-inst-ops.inc: (TEST_MODEL): New.
(TEST_ONE): New.
* gcc.target/aarch64/atomic-inst-swap.c: New.


>From 31226dce8d36be98ca95d9165d4147a3bf84d180 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Fri, 7 Aug 2015 17:18:37 +0100
Subject: [PATCH 1/5] Add atomic SWP instruction

Change-Id: I87bf48526cb11e65edd15691f5eab20446e418c9
---
 gcc/config/aarch64/aarch64-protos.h|  1 +
 gcc/config/aarch64/aarch64.c   | 46 +-
 gcc/config/aarch64/atomics.md  | 71 --
 .../gcc.target/aarch64/atomic-inst-ops.inc | 13 
 gcc/testsuite/gcc.target/aarch64/atomic-inst-swp.c | 44 ++
 5 files changed, 170 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-inst-swp.c

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index ff19851..eba4c76 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -378,6 +378,7 @@ rtx aarch64_load_tp (rtx);
 void aarch64_expand_compare_and_swap (rtx op[]);
 void aarch64_split_compare_and_swap (rtx op[]);
 void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
+void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx);
 void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_gen_adjusted_ldpstp (rtx *, bool, enum machine_mode, RTX_CODE);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4d2126b..dc05c6e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11185,11 +11185,54 @@ aarch64_split_compare_and_swap (rtx operands[])
 aarch64_emit_post_barrier (model);
 }
 
+/* Emit an atomic swap.  */
+
+static void
+aarch64_emit_atomic_swap (machine_mod

Re: [PR middle-end/60832] Do not convert widest_int to tree just for printing it.

2015-09-21 Thread Manuel López-Ibáñez
On 21 September 2015 at 09:24, Richard Biener  wrote:
> On Sun, 20 Sep 2015, Manuel López-Ibáñez wrote:
>
>> In do_warn_aggressive_loop_optimizations, we convert to a tree just to print 
>> a
>> widest_int. Apart from overly complicated, this results in printing '3u'
>> instead of just '3'.
>>
>> Unfortunately, adding a printf-like conversion specifier would require making
>> pretty-print.c link with wide-int.cc, which will include a lot of new
>> dependencies into several other programs (gcov-tool for example). It would be
>> possible to add the conversion specifier to every FE pretty-printer, but this
>> still would require updating c-format.c, which is far from trivial. A simpler
>> approach is to convert to a string rather than to a tree.
>
> I'd add the pretty printer overloads to wide-int-print.[ch], but yes
> it would require a change in c-format.c.

That is not enough, some conversion specifier parser needs to handle
the new codes. The one in pretty-print.c is used all over the place
and adding a dependency on wide-int.cc will bring gmp and other stuff
that is currently  not used (nor linked with) various helper programs.
If wide-int is only printed in the middle-end, perhaps it is
sufficient to add it to default_tree_printer. Otherwise, it needs to
be added to every FE. This is not difficult just tedious. The
difficult part is the c-format.c changes (I wish someone would fix
https://gcc.gnu.org/PR47781 so we can use a simpler syntax to define
our custom specifiers; in addition to the positive synergies it will
bring with other GNU projects that use custom specifiers and wish to
use -Wformat).

>> OK?
>
> Ok.  Let's remember this case and if more of these pop up try working
> on a "proper" solution.

Sure, I added my WIP patch to the PR, but I don't plan to keep working on it.

Cheers,

Manuel.


Re: [PATCH, libgfortran] Fix FIND_FILE decls and use.

2015-09-21 Thread Kirill Yukhin
On 21 Sep 12:53, FX wrote:
> Dear Kirill,
> 
> > When libgfortran is configured w/ HAVE_WORKING_STAT undefined
> > *and* current system is not MinGW - FIND_FILE_[DECL|ARGS} still
> > trying to use Windows's handles (id).
> 
> Well, if HAVE_WORKING_STAT is not defined, then it means some other mechanism 
> has to be used. If your target is not mingw32 and stat() is not reliable, 
> we’ll need to provide an alternate way of handling things. Your patch would 
> default to the unusable stat().
That's right, but maybe at least allow bootstrap to pass?

> What’s your target?
I am trying to pass bootstrap w/ couple of insns siwtched on, which are not 
available on host.
While configureing libgfortran test
  checking whether the target stat is reliable...
returns no, since test program fails to execute (due to illegal insn exception).
So, we have non MinGW system w/o stat.

I am not going to use resulting libgfortran, only extra sanity check for my 
changes to i386 target.

--
Thanks, K

> 
> FX



Re: [AArch64][PATCH 2/5] Make BIC, other logical instructions, available. (was: Add BIC instruction.)

2015-09-21 Thread Matthew Wahab

On 18/09/15 09:05, James Greenhalgh wrote:

On Thu, Sep 17, 2015 at 05:40:48PM +0100, Matthew Wahab wrote:

Hello,

ARMv8.1 adds atomic swap and atomic load-operate instructions with
optional memory ordering specifiers. This patch adds an expander to
generate a BIC instruction that can be explicitly called when
implementing the atomic__fetch pattern to calculate the value to
be returned by the operation.



Why not make the "*_one_cmpl_3" pattern
named (remove the leading *) and call that in your atomic__fetch
patterns as:

   and_one_cmpl_3

I'd rather that than to add a pettern that simply expands to the same
thing.


I overlooked that pattern when I was trying to find the bic emitter. I've attached an 
updated patch.


Tested as part of the series for aarch64-none-linux-gnu with native bootstrap 
and
make check. Also tested for aarch64-none-elf with cross-compiled
check-gcc on an ARMv8.1 emulator with +lse enabled by default.

Ok for trunk?
Matthew

2015-09-21  Matthew Wahab  

* config/aarch64/aarch64.md
(_one_cmpl_3): Make a named
pattern.

>From 0e2ae8739d70e4d1c14fa848f67847b1ecf94f71 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Mon, 17 Aug 2015 17:48:27 +0100
Subject: [PATCH 2/5] Make BIC, other logical instructions, available for use.

Change-Id: Ibef049bfa1bfe5e168feada3dc358f28383e6410
---
 gcc/config/aarch64/aarch64.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 88ba72e..72384ce 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3392,7 +3392,7 @@
   [(set_attr "type" "logics_reg")]
 )
 
-(define_insn "*_one_cmpl_3"
+(define_insn "_one_cmpl_3"
   [(set (match_operand:GPI 0 "register_operand" "=r")
 	(LOGICAL:GPI (not:GPI
 		  (SHIFT:GPI
-- 
2.1.4



Re: [AArch64][PATCH 3/5] Add atomic load-operate instructions.

2015-09-21 Thread Matthew Wahab

On 18/09/15 09:39, James Greenhalgh wrote:

On Thu, Sep 17, 2015 at 05:42:35PM +0100, Matthew Wahab wrote:

---
  gcc/config/aarch64/atomics.md | 41 +
  1 file changed, 41 insertions(+)

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 0e71002..b7b6fb5 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -29,8 +29,25 @@
  UNSPECV_ATOMIC_CAS; Represent an atomic CAS.
  UNSPECV_ATOMIC_SWP; Represent an atomic SWP.
  UNSPECV_ATOMIC_OP ; Represent an atomic operation.
+UNSPECV_ATOMIC_LDOP; Represent an atomic 
load-operation
+UNSPECV_ATOMIC_LDOP_OR ; Represent an atomic load-or
+UNSPECV_ATOMIC_LDOP_BIC; Represent an atomic load-bic
+UNSPECV_ATOMIC_LDOP_XOR; Represent an atomic load-xor
+UNSPECV_ATOMIC_LDOP_PLUS   ; Represent an atomic load-add
  ])

+;; Iterators for load-operate instructions.
+
+(define_int_iterator ATOMIC_LDOP
+ [UNSPECV_ATOMIC_LDOP_OR UNSPECV_ATOMIC_LDOP_BIC
+  UNSPECV_ATOMIC_LDOP_XOR UNSPECV_ATOMIC_LDOP_PLUS])
+
+(define_int_attr atomic_ldop
+ [(UNSPECV_ATOMIC_LDOP_OR "set") (UNSPECV_ATOMIC_LDOP_BIC "clr")
+  (UNSPECV_ATOMIC_LDOP_XOR "eor") (UNSPECV_ATOMIC_LDOP_PLUS "add")])


There is precedent (atomic_optab, atomic_op_operand, const_atomic, etc.) for
these living in config/aarch64/iterators.md so they should be moved there.
Presumably the difficulty with that is to do with the position of the
"unspecv" define_c_enum? I'd argue that is in the wrong place too...

If you want to leave this to a cleanup patch in stage 3 that is fine.

This patch is OK for trunk.



I'd prefer to keep the clean-up separate from this series. I'll commit the patch as 
it is and the deal with the iterator move later.


Thanks,
Matthew


[PATCH] Fix -Wmisleading-indentation false-positive

2015-09-21 Thread Patrick Palka
This patch fixes the last remaining false-positive
-Wmisleading-indentation warning that is emitted against the sqlite
sources.

The problem is the following kind of code snippet:

for (i = 0;
 i < 10;
 i++
);
foo (i);

which is an "interesting" coding style but it is not misleading, so the
heurisitic shouldn't trip over it.  I adjusted the heuristic (which only
applies when the body is a semicolon) to be more strict in one sense (we
no longer warn when the guard column lines up with the column of the
"next" statement like in the above snippet) and yet more relaxed in
another sense (we now warn when the guard column is greater than the
column of the "next" statement, see test case changes).

Tested by compiling Linux, sqlite and vim code bases with
-Wmisleading-indentation.  OK if bootstrap + regtest succeeds?

gcc/c-family/ChangeLog:

* c-indentation.c (should_warn_for_misleading_indentation):
Float out and consolidate the calls to get_visual_column that is
passed guard_exploc as an argument.  Compare
next_stmt_vis_column with guard_line_first_nws instead of with
body_line_first_nws.

gcc/testsuite/ChangeLog:

* c-c++-common/Wmisleading-indentation.c: Augment test.
---
 gcc/c-family/c-indentation.c   | 23 +++-
 .../c-c++-common/Wmisleading-indentation.c | 25 ++
 2 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c
index dd35223..5316316 100644
--- a/gcc/c-family/c-indentation.c
+++ b/gcc/c-family/c-indentation.c
@@ -341,6 +341,8 @@ should_warn_for_misleading_indentation (const 
token_indent_info &guard_tinfo,
   unsigned int next_stmt_vis_column;
   unsigned int body_vis_column;
   unsigned int body_line_first_nws;
+  unsigned int guard_vis_column;
+  unsigned int guard_line_first_nws;
   /* If we can't determine it, don't issue a warning.  This is sometimes
 the case for input files containing #line directives, and these
 are often for autogenerated sources (e.g. from .md files), where
@@ -351,6 +353,11 @@ should_warn_for_misleading_indentation (const 
token_indent_info &guard_tinfo,
  &body_vis_column,
  &body_line_first_nws))
return false;
+  if (!get_visual_column (guard_exploc,
+ &guard_vis_column,
+ &guard_line_first_nws))
+   return false;
+
   if ((body_type != CPP_SEMICOLON
   && next_stmt_vis_column == body_vis_column)
  /* As a special case handle the case where the body is a semicolon
@@ -365,7 +372,7 @@ should_warn_for_misleading_indentation (const 
token_indent_info &guard_tinfo,
  || (body_type == CPP_SEMICOLON
  && body_exploc.line > guard_exploc.line
  && body_line_first_nws != body_vis_column
- && next_stmt_vis_column == body_line_first_nws))
+ && next_stmt_vis_column > guard_line_first_nws))
{
   /* Don't warn if they are aligned on the same column
 as the guard itself (suggesting autogenerated code that doesn't
@@ -395,13 +402,6 @@ should_warn_for_misleading_indentation (const 
token_indent_info &guard_tinfo,
 indentation is misleading).  Using the column of the first
 non-whitespace character on the guard line makes that
 happen.  */
- unsigned int guard_vis_column;
- unsigned int guard_line_first_nws;
- if (!get_visual_column (guard_exploc,
- &guard_vis_column,
- &guard_line_first_nws))
-   return false;
-
  if (guard_line_first_nws == body_vis_column)
return false;
 
@@ -462,13 +462,6 @@ should_warn_for_misleading_indentation (const 
token_indent_info &guard_tinfo,
  {
if (body_exploc.line == guard_exploc.line)
  {
-   unsigned int guard_vis_column;
-   unsigned int guard_line_first_nws;
-   if (!get_visual_column (guard_exploc,
-   &guard_vis_column,
-   &guard_line_first_nws))
- return false;
-
if (next_stmt_vis_column > guard_line_first_nws
|| (next_tok_type == CPP_OPEN_BRACE
&& next_stmt_vis_column == guard_vis_column))
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c 
b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
index 0d6d8d2..f61c182 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
@@ -732,6 +732,13 @@ fn_37 (void)
   foo (0);
 }
 
+  if (flagB) /* { dg-message "3: ...this 'if' clause" } */
+/* blah */;
+   { /* { dg-warning "st

Re: [AArch64][PATCH 1/5] Use atomic instructions for swap and fetch-update operations.

2015-09-21 Thread James Greenhalgh
On Mon, Sep 21, 2015 at 12:06:40PM +0100, Matthew Wahab wrote:
> On 18/09/15 08:58, James Greenhalgh wrote:
> > On Thu, Sep 17, 2015 at 05:37:55PM +0100, Matthew Wahab wrote:
> 
> >> diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
> >> index 65d2cc9..0e71002 100644
> >> --- a/gcc/config/aarch64/atomics.md
> >> +++ b/gcc/config/aarch64/atomics.md
> >> @@ -27,6 +27,7 @@
> >>   UNSPECV_ATOMIC_CMPSW ; Represent an atomic compare swap.
> >>   UNSPECV_ATOMIC_EXCHG ; Represent an atomic exchange.
> >>   UNSPECV_ATOMIC_CAS   ; Represent an atomic CAS.
> >> +UNSPECV_ATOMIC_SWP; Represent an atomic SWP.
> >>   UNSPECV_ATOMIC_OP; Represent an atomic operation.
> >>   ])
> >>
> >> @@ -122,19 +123,19 @@
> >>   )
> >>
> >>   (define_insn_and_split "aarch64_compare_and_swap_lse"
> >> -  [(set (reg:CC CC_REGNUM);; bool 
> >> out
> >> +  [(set (reg:CC CC_REGNUM)
> >>   (unspec_volatile:CC [(const_int 0)] UNSPECV_ATOMIC_CMPSW))
> >> -   (set (match_operand:GPI 0 "register_operand" "=&r");; val 
> >> out
> >> -(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q"))   ;; memory
> >> +   (set (match_operand:GPI 0 "register_operand" "=&r")
> >> +(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q"))
> >>  (set (match_dup 1)
> >>   (unspec_volatile:GPI
> >> -  [(match_operand:GPI 2 "aarch64_plus_operand" "rI")  ;; expect
> >> -   (match_operand:GPI 3 "register_operand" "r")   ;; 
> >> desired
> >> -   (match_operand:SI 4 "const_int_operand")   ;; 
> >> is_weak
> >> -   (match_operand:SI 5 "const_int_operand")   ;; mod_s
> >> -   (match_operand:SI 6 "const_int_operand")]  ;; mod_f
> >> +  [(match_operand:GPI 2 "aarch64_plus_operand" "rI")
> >> +   (match_operand:GPI 3 "register_operand" "r")
> >> +   (match_operand:SI 4 "const_int_operand")
> >> +   (match_operand:SI 5 "const_int_operand")
> >> +   (match_operand:SI 6 "const_int_operand")]
> >
> > I'm not sure I understand the change here, those comments still look helpful
> > enough for understanding the pattern, what have a I missed?
> 
> That was part of an attempt to clean up some code. It's unnecessary and I've 
> dropped 
> the change.
> 
> Attached is the updated patch with some other changes:
> - Simplified the atomic_exchange expander in line with reviews for
>other patches in the series.
> - Removed the CC clobber from aarch64_atomic_exchange_lse, it was
>over-cautious.
> - Added a missing entry to the change log (noting a whitespace fix).
> 
> Ok for trunk?
> Matthew

OK.

Thanks,
James

> gcc/
> 2015-09-21  Matthew Wahab  
> 
>   * config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
>   Declare.
>   * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): New.
>   (aarch64_gen_atomic_ldop): New.
>   (aarch64_split_atomic_op): Fix whitespace and add a comment.
>   * config/aarch64/atomics.md (UNSPECV_ATOMIC_SWP): New.
>   (aarch64_compare_and_swap_lse): Fix some whitespace.
>   (atomic_exchange): Replace with an expander.
>   (aarch64_atomic_exchange): New.
>   (aarch64_atomic_exchange_lse): New.
>   (aarch64_atomic_): Fix some whitespace.
>   (aarch64_atomic_swp): New.
> 
> 
> gcc/testsuite/
> 2015-09-21  Matthew Wahab  
> 
>   * gcc.target/aarch64/atomic-inst-ops.inc: (TEST_MODEL): New.
>   (TEST_ONE): New.
>  * gcc.target/aarch64/atomic-inst-swap.c: New.
> 
> 




Re: [AArch64][PATCH 2/5] Make BIC, other logical instructions, available. (was: Add BIC instruction.)

2015-09-21 Thread James Greenhalgh
On Mon, Sep 21, 2015 at 12:12:21PM +0100, Matthew Wahab wrote:
> On 18/09/15 09:05, James Greenhalgh wrote:
> > On Thu, Sep 17, 2015 at 05:40:48PM +0100, Matthew Wahab wrote:
> >> Hello,
> >>
> >> ARMv8.1 adds atomic swap and atomic load-operate instructions with
> >> optional memory ordering specifiers. This patch adds an expander to
> >> generate a BIC instruction that can be explicitly called when
> >> implementing the atomic__fetch pattern to calculate the value to
> >> be returned by the operation.
> >>
> >
> > Why not make the "*_one_cmpl_3" pattern
> > named (remove the leading *) and call that in your atomic__fetch
> > patterns as:
> >
> >and_one_cmpl_3
> >
> > I'd rather that than to add a pettern that simply expands to the same
> > thing.
> 
> I overlooked that pattern when I was trying to find the bic emitter. I've 
> attached an 
> updated patch.
> 
> Tested as part of the series for aarch64-none-linux-gnu with native bootstrap 
> and
> make check. Also tested for aarch64-none-elf with cross-compiled
> check-gcc on an ARMv8.1 emulator with +lse enabled by default.
> 
> Ok for trunk?
> Matthew

OK.

Thanks,
James

> 
> 2015-09-21  Matthew Wahab  
> 
>   * config/aarch64/aarch64.md
>   (_one_cmpl_3): Make a named
>   pattern.
> 

> From 0e2ae8739d70e4d1c14fa848f67847b1ecf94f71 Mon Sep 17 00:00:00 2001
> From: Matthew Wahab 
> Date: Mon, 17 Aug 2015 17:48:27 +0100
> Subject: [PATCH 2/5] Make BIC, other logical instructions, available for use.
> 
> Change-Id: Ibef049bfa1bfe5e168feada3dc358f28383e6410
> ---
>  gcc/config/aarch64/aarch64.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 88ba72e..72384ce 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -3392,7 +3392,7 @@
>[(set_attr "type" "logics_reg")]
>  )
>  
> -(define_insn "*_one_cmpl_3"
> +(define_insn "_one_cmpl_3"
>[(set (match_operand:GPI 0 "register_operand" "=r")
>   (LOGICAL:GPI (not:GPI
> (SHIFT:GPI
> -- 
> 2.1.4
> 



Re: [AArch64][PATCH 4/5] Use atomic load-operate instructions for fetch-update patterns.

2015-09-21 Thread Matthew Wahab

On 18/09/15 09:55, James Greenhalgh wrote:

On Thu, Sep 17, 2015 at 05:47:43PM +0100, Matthew Wahab wrote:

Hello,

ARMv8.1 adds atomic swap and atomic load-operate instructions with
optional memory ordering specifiers. This patch uses the ARMv8.1 atomic
load-operate instructions to implement the atomic_fetch_
patterns. This patch also updates the implementation of the atomic_
patterns, which are treated as versions of the atomic_fetch_ which
discard the loaded data.


[..]


Ok for trunk?


Some comments in line below.



diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index eba4c76..76ebd6f 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -378,6 +378,8 @@ rtx aarch64_load_tp (rtx);
  void aarch64_expand_compare_and_swap (rtx op[]);
  void aarch64_split_compare_and_swap (rtx op[]);
  void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
+
+bool aarch64_atomic_ldop_supported_p (enum rtx_code);
  void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx);
  void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index dc05c6e..33f9ef3 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11064,6 +11064,33 @@ aarch64_expand_compare_and_swap (rtx operands[])
emit_insn (gen_rtx_SET (bval, x));
  }

+/* Test whether the target supports using a atomic load-operate instruction.
+   CODE is the operation and AFTER is TRUE if the data in memory after the
+   operation should be returned and FALSE if the data before the operation
+   should be returned.  Returns FALSE if the operation isn't supported by the
+   architecture.
+  */


Stray newline, leave the */ on the line before.


Fixed this.


+
+bool
+aarch64_atomic_ldop_supported_p (enum rtx_code code)
+{
+  if (!TARGET_LSE)
+return false;
+
+  switch (code)
+{
+case SET:
+case AND:
+case IOR:
+case XOR:
+case MINUS:
+case PLUS:
+  return true;
+default:
+  return false;
+}
+}
+
  /* Emit a barrier, that is appropriate for memory model MODEL, at the end of a
 sequence implementing an atomic operation.  */

@@ -11206,26 +11233,169 @@ aarch64_emit_atomic_swap (machine_mode mode, rtx 
dst, rtx value,
emit_insn (gen (dst, mem, value, model));
  }

-/* Emit an atomic operation where the architecture supports it.  */
+/* Operations supported by aarch64_emit_atomic_load_op.  */
+
+enum aarch64_atomic_load_op_code
+{
+  AARCH64_LDOP_PLUS,   /* A + B  */
+  AARCH64_LDOP_XOR,/* A ^ B  */
+  AARCH64_LDOP_OR, /* A | B  */
+  AARCH64_LDOP_BIC /* A & ~B  */
+};


I have a small preference to calling these the same name as the
instructions they will generate, so AARCH64_LDOP_ADD, AARCH64_LDOP_EOR,
AARCH64_LDOP_SET and AARCH64_LDOP_CLR, but I'm happy fo you to leave it
this way if you prefer.



I prefer to keep them related to the operation being implemented rather than how it 
is implemented so I've left them as they are.




-(define_insn_and_split "atomic_"
+(define_expand "atomic_"
+ [(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "")
+   (unspec_volatile:ALLI
+[(atomic_op:ALLI (match_dup 0)
+  (match_operand:ALLI 1 "" ""))
+ (match_operand:SI 2 "const_int_operand")]
+UNSPECV_ATOMIC_OP))
+  (clobber (reg:CC CC_REGNUM))]


This is not needed for the LSE forms of these instructions and may result
in less optimal code genmeration. On the other hand, that will only be in
a corner case and this is only a define_expand. Because of that, it would
be clearer to a reader if you dropped the detailed description of this
in RTL (which is never used) and rewrote it using just the uses of the
operands, as so:


+(define_expand "atomic_"
+ [(match_operand:ALLI 0 "aarch64_sync_memory_operand" "")
+  (match_operand:ALLI 1 "" "")
+  (match_operand:SI 2 "const_int_operand")]




Switched the new expanders in this and the other patches to the simpler form.




+(define_insn_and_split "aarch64_atomic_"
+ [(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "+Q")
+   (unspec_volatile:ALLI
+[(atomic_op:ALLI (match_dup 0)
+  (match_operand:ALLI 1 "" "r"))
+ (match_operand:SI 2 "const_int_operand")]
+UNSPECV_ATOMIC_OP))
+  (clobber (reg:CC CC_REGNUM))
+  (clobber (match_scratch:ALLI 3 "=&r"))
+  (clobber (match_scratch:SI 4 "=&r"))]
+  ""


TARGET_LSE ?


It's not needed here because this pattern is always available.


+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+  {
+aarch64_split_atomic_op (, NULL, operands[3], operands[0],
+operands[1], operands[2], operands[4]);
+DONE;
+  }
+)
+
+(define_insn_and_split "aarch64_atomic__lse"
[(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "+Q")
  (unspec_volatile:ALLI
[(atomic_op:ALLI (match_dup 0)
(match_operand:ALLI 1 "" "r"))
-   (match_operand:SI 2 "c

[PATCH] Add new hooks ASM_OUTPUT_START_FUNCTION_HEADER ...

2015-09-21 Thread Dominik Vogt
This patch adds to new backend hooks
ASM_OUTPUT_START_FUNCTION_HEADER and
ASM_OUTPUT_END_FUNCTION_FOOTER that may be defined to emit
assembly code at the very start or end of a function.  This
functionality is needed by the patch that ports the "target"
attribute and pragma to S/390.  (I'll post that patch shortly, and
in a different thread.)

Note that a function may need to switch to a different section in
the .s file (hot/cold section or whatever).  The additional code
is (arguably) emitted *before* switching the section.  That seems
to be more correct to me that emitting it after the switch, but
it's not a big deal to change that.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* defaults.h (ASM_OUTPUT_START_FUNCTION_HEADER): New hook function.
(ASM_OUTPUT_START_FUNCTION_FOOTER): Ditto.
* varasm.c (assemble_start_function): Call new hook at start of
function.
(assemble_end_function): Call new hook at end of function.
* doc/tm.texi.in: Document new hook functions.
>From 61c2546ba0aa0e9b2743f8d64427143aac5af2b0 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 29 Jul 2015 16:14:23 +0100
Subject: [PATCH] Add new hooks ASM_OUTPUT_START_FUNCTION_HEADER ...

and ASM_OUTPUT_END_FUNCTION_FOOTER.  These are used by the implementation of
__attribute__ ((target(...))) on S390.
---
 gcc/defaults.h | 10 ++
 gcc/doc/tm.texi| 14 ++
 gcc/doc/tm.texi.in | 14 ++
 gcc/varasm.c   |  2 ++
 4 files changed, 40 insertions(+)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index d4d3a56..aa9c849 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -135,6 +135,16 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_DEFERRED_OUTPUT_DEFS(DECL,TARGET) false
 #endif
 
+/* Output additional assemler code at the very start of the function.  */
+#ifndef ASM_OUTPUT_START_FUNCTION_HEADER
+#define ASM_OUTPUT_START_FUNCTION_HEADER(FILE,DECL) do { } while (0)
+#endif
+
+/* Output additional assemler code at the very end of the function.  */
+#ifndef ASM_OUTPUT_END_FUNCTION_FOOTER
+#define ASM_OUTPUT_START_FUNCTION_FOOTER(FILE,DECL) do { } while (0)
+#endif
+
 /* This is how to output the definition of a user-level label named
NAME, such as the label on variable NAME.  */
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index d548d96..7ba00c3 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -7917,6 +7917,20 @@ If you define @code{TYPE_ASM_OP} and @code{TYPE_OPERAND_FMT}, a default
 definition of this macro is provided.
 @end defmac
 
+@defmac ASM_OUTPUT_START_FUNCTION_HEADER (@var{stream}, @var{decl})
+A C statement (sans semicolon) to output to the stdio stream
+@var{stream} additional assembly language code to be placed at the
+very start of the current function.  The default definition of this
+macro does nothing.
+@end defmac
+
+@defmac ASM_OUTPUT_END_FUNCTION_FOOTER (@var{stream}, @var{decl})
+A C statement (sans semicolon) to output to the stdio stream
+@var{stream} additional assembly language code at the very end of
+the current function.  The default definition of this macro does
+nothing.
+@end defmac
+
 @defmac ASM_DECLARE_FUNCTION_NAME (@var{stream}, @var{name}, @var{decl})
 A C statement (sans semicolon) to output to the stdio stream
 @var{stream} any text necessary for declaring the name @var{name} of a
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 9bef4a5..2b2b7e4 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -5546,6 +5546,20 @@ If you define @code{TYPE_ASM_OP} and @code{TYPE_OPERAND_FMT}, a default
 definition of this macro is provided.
 @end defmac
 
+@defmac ASM_OUTPUT_START_FUNCTION_HEADER (@var{stream}, @var{decl})
+A C statement (sans semicolon) to output to the stdio stream
+@var{stream} additional assembly language code to be placed at the
+very start of the current function.  The default definition of this
+macro does nothing.
+@end defmac
+
+@defmac ASM_OUTPUT_END_FUNCTION_FOOTER (@var{stream}, @var{decl})
+A C statement (sans semicolon) to output to the stdio stream
+@var{stream} additional assembly language code at the very end of
+the current function.  The default definition of this macro does
+nothing.
+@end defmac
+
 @defmac ASM_DECLARE_FUNCTION_NAME (@var{stream}, @var{name}, @var{decl})
 A C statement (sans semicolon) to output to the stdio stream
 @var{stream} any text necessary for declaring the name @var{name} of a
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 706e652..dcee8f0 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -1701,6 +1701,7 @@ assemble_start_function (tree decl, const char *fnname)
   char tmp_label[100];
   bool hot_label_written = false;
 
+  ASM_OUTPUT_START_FUNCTION_HEADER (asm_out_file, current_function_decl);
   if (flag_reorder_blocks_and_partition)
 {
   ASM_GENERATE_INTERNAL_LABEL (tmp_label, "LHOTB", const_labelno);
@@ -1864,6 +1865,7 @@ assemble_end_function (tree decl, const char *f

Re: [Patch/ccmp] Cost instruction sequences to choose better expand order

2015-09-21 Thread Bernd Schmidt

On 09/18/2015 05:21 PM, Jiong Wang wrote:


Current conditional compare (CCMP) support in GCC aim to optimize
short circuit for cascade comparision, given a simple conditional
compare candidate:

   if (a == 17 || a == 32)

[...]

The problem is current implementation always expand t0 first, then
t1. While the expand order need to consider the rtx costs, because "cmp"
and "ccmp" may have different restrictions that the expand order will
result in performance differences.

For example on AArch64, "ccmp" only accept immediate within -31 ~ 31
while "cmp" accept wider range, so if we expand "a == 32" in the second
step, then it will use "ccmp", and thus an extra "mov reg, 32"
instruction is generated because "32" is out of the range. While if we
expand "a == 32" first, then it's fine as "cmp" can encoding it.


I've played with this patch a bit with an aarch64 cross compiler. First 
of all - it doesn't seem to work, I get identical costs and the swapping 
doesn't happen. Did you forget to include a rtx_cost patch?


I was a little worried about whether this would be expensive for longer 
sequences of conditions, but it seems like it looks only at leafs where 
we have two comparisons, so that cost should be minimal. However, it's 
possible there's room for improvement in code generation. I was curious 
and looked at a slightly more complex testcase


int
foo (int a)
{
  if (a == 17 || a == 32 || a == 47 || a == 53 || a == 66 || a == 72)
return 1;
  else
return 2;
}

and this doesn't generate a sequence of ccmps as might have been 
expected; we only get pairs of comparisons merged with a bit_ior:


  D.2699 = a == 17;
  D.2700 = a == 32;
  D.2701 = D.2699 | D.2700;
  if (D.2701 != 0) goto ; else goto ;
  :
  D.2703 = a == 47;
  D.2704 = a == 53;
  D.2705 = D.2703 | D.2704;
  if (D.2705 != 0) goto ; else goto ;

and the ccmp expander doesn't see the entire thing. I found that a 
little surprising TBH.



Bernd


Re: [AArch64][PATCH 5/5] Use atomic load-operate instructions for update-fetch patterns.

2015-09-21 Thread Matthew Wahab

On 17/09/15 17:54, Matthew Wahab wrote:

ARMv8.1 adds atomic swap and atomic load-operate instructions with
optional memory ordering specifiers. This patch uses the ARMv8.1
load-operate instructions to implement the atomic__fetch patterns.

The approach is to use the atomic load-operate instruction to atomically
load the data and update memory and then to use the loaded data to
calculate the value that the instruction would have stored. The
calculation attempts to mirror the operation of the atomic instruction.
For example, atomic_and_fetch is implemented with an atomic
load-bic so the result is also calculated using a BIC instruction.


[...]


2015-09-17  Matthew Wahab  

 * config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
 Adjust declaration.
 * config/aarch64/aarch64.c (aarch64_emit_bic): New.
 (aarch64_gen_atomic_load_op): Adjust comment.  Add parameter
 out_result.  Update to support update-fetch operations.
 * config/aarch64/atomics.md (aarch64_atomic_exchange_lse):
 Adjust for change to aarch64_gen_atomic_ldop.
 (aarch64_atomic__lse): Likewise.
 (aarch64_atomic_fetch__lse): Likewise.
 (atomic__fetch): Change to an expander.
 (aarch64_atomic__fetch): New.
 (aarch64_atomic__fetch_lse): New.

gcc/testsuite
2015-09-17  Matthew Wahab  

 * gcc.target/aarch64/atomic-inst-ldadd.c: Add tests for
 update-fetch operations.
 * gcc.target/aarch64/atomic-inst-ldlogic.c: Likewise.



Attached an updated patch that takes into account the review comments and changes for 
the rest of the series.


The changes in this patch:
- Updated emit_bic for changes in the earlier patch.
- Simplified the patterns used in the new expanders.
- Dropped CC clobber from the _lse patterns.

Tested the series for aarch64-none-linux-gnu with native bootstrap and
make check. Also tested for aarch64-none-elf with cross-compiled
check-gcc on an ARMv8.1 emulator with +lse enabled by default.

Ok for trunk?
Matthew

2015-09-21  Matthew Wahab  

* config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
Adjust declaration.
* config/aarch64/aarch64.c (aarch64_emit_bic): New.
(aarch64_gen_atomic_ldop): Adjust comment.  Add parameter
out_result.  Update to support update-fetch operations.
* config/aarch64/atomics.md (aarch64_atomic_exchange_lse):
Adjust for change to aarch64_gen_atomic_ldop.
(aarch64_atomic__lse): Likewise.
(aarch64_atomic_fetch__lse): Likewise.
(atomic__fetch): Change to an expander.
(aarch64_atomic__fetch): New.
(aarch64_atomic__fetch_lse): New.

gcc/testsuite
2015-09-21  Matthew Wahab  

* gcc.target/aarch64/atomic-inst-ldadd.c: Add tests for
update-fetch operations.
* gcc.target/aarch64/atomic-inst-ldlogic.c: Likewise.


>From abd313723964e90b6e7d7785b646c657f6b072f9 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Mon, 17 Aug 2015 11:27:18 +0100
Subject: [PATCH 5/5] Use atomic instructions for update-fetch patterns.

Change-Id: I5eef48586fe904f0d2df8c581fb3c12a4a2d9c78
---
 gcc/config/aarch64/aarch64-protos.h|   2 +-
 gcc/config/aarch64/aarch64.c   |  72 +++--
 gcc/config/aarch64/atomics.md  |  55 +-
 .../gcc.target/aarch64/atomic-inst-ldadd.c |  53 ++---
 .../gcc.target/aarch64/atomic-inst-ldlogic.c   | 118 ++---
 5 files changed, 241 insertions(+), 59 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 76ebd6f..dd8ebcc 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -380,7 +380,7 @@ void aarch64_split_compare_and_swap (rtx op[]);
 void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_atomic_ldop_supported_p (enum rtx_code);
-void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx);
+void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_gen_adjusted_ldpstp (rtx *, bool, enum machine_mode, RTX_CODE);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 3a1b434..b6cdf7c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11211,6 +11211,25 @@ aarch64_split_compare_and_swap (rtx operands[])
 aarch64_emit_post_barrier (model);
 }
 
+/* Emit a BIC instruction.  */
+
+static void
+aarch64_emit_bic (machine_mode mode, rtx dst, rtx s1, rtx s2, int shift)
+{
+  rtx shift_rtx = GEN_INT (shift);
+  rtx (*gen) (rtx, rtx, rtx, rtx);
+
+  switch (mode)
+{
+case SImode: gen = gen_and_one_cmpl_lshrsi3; break;
+case DImode: gen = gen_and_one_cmpl_lshrdi3; break;
+default:
+  gcc_unreachable ();
+}
+
+  emit_insn (gen (dst, s2, shift_rtx, s1));
+}
+
 /* Emit an atomic swap.  */
 
 static void
@@ -11305,13 +11324,14 @@ aarch64_emit_a

Re: [patch] Only do shrink_to_fit() when exceptions enabled

2015-09-21 Thread Jonathan Wakely

Oops.

Bootstrapped with --disable-libstdcxx-dual-abi on x86_64-linux,
committed to trunk.

commit 9b9d9e3a5921f9a5225b466ce9d07b42b72f54dd
Author: Jonathan Wakely 
Date:   Mon Sep 21 11:58:41 2015 +0100

Fix bootstrap error introduced in r227870

	PR libstdc++/67647
	* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
	(basic_string::shrink_to_fit): Fix #endif placement.

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index b5e7e36..c1689af 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -3291,8 +3291,8 @@ _GLIBCXX_END_NAMESPACE_CXX11
 	  { reserve(0); }
 	catch(...)
 	  { }
-#endif
 	  }
+#endif
   }
 #endif
 


Re: [PATCH] shrink-wrap: Handle multiple predecessors of prologue

2015-09-21 Thread Bernd Schmidt

On 09/18/2015 05:36 PM, Segher Boessenkool wrote:

  {
+  while (pro != entry && !can_get_prologue (pro, prologue_clobbered))
+   {
+ gcc_assert (pro != entry);


Lose the assert, it's redundant with the loop condition.


+  basic_block new_bb = create_empty_bb (EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb);
+  BB_COPY_PARTITION (new_bb, pro);

[...]

+  *entry_edge = make_single_succ_edge (new_bb, pro, EDGE_FALLTHRU);
+  force_nonfallthru (*entry_edge);
+
free_dominance_info (CDI_DOMINATORS);
  }


Not sure I like creating a fallthru edge and then calling 
force_nonfallthru on it. Have a look at emit_to_new_bb_before in 
except.c which does something similar - it creates the new block before 
the one into which it wants to fall thru, and modifies a potential 
existing fallthru edge. I'd like that better; please change the patch if 
you also agree. Otherwise ok.



Bernd


Re: [RFC] Try vector as a new representation for vector masks

2015-09-21 Thread Ilya Enkovich
2015-09-18 19:50 GMT+03:00 Richard Henderson :
> On 09/18/2015 06:21 AM, Ilya Enkovich wrote:
 +machine_mode
 +default_get_mask_mode (unsigned nunits, unsigned vector_size)
 +{
 +  unsigned elem_size = vector_size / nunits;
 +  machine_mode elem_mode
 += smallest_mode_for_size (elem_size * BITS_PER_UNIT, MODE_INT);
>>>
>>> Why these arguments as opposed to passing elem_size?  It seems that every 
>>> hook
>>> is going to have to do this division...
>>
>> Every target would have nunits = vector_size / elem_size because
>> nunits is used to create a vector mode. Thus no difference.
>
> I meant passing nunits and elem_size, but not vector_size.  Thus no division
> required.  If the target does require the vector size, it could be obtained by
> multiplication, which is cheaper.  But in cases like this we'd not require
> either mult or div.

OK

>
 @@ -1885,7 +1885,9 @@ expand_MASK_LOAD (gcall *stmt)
create_output_operand (&ops[0], target, TYPE_MODE (type));
create_fixed_operand (&ops[1], mem);
create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
 -  expand_insn (optab_handler (maskload_optab, TYPE_MODE (type)), 3, ops);
 +  expand_insn (convert_optab_handler (maskload_optab, TYPE_MODE (type),
 +   TYPE_MODE (TREE_TYPE (maskt))),
 +3, ops);
>>>
>>> Why do we now need a conversion here?
>>
>> Mask mode was implicit for masked loads and stores. Now it becomes
>> explicit because we may load the same value using different masks.
>> E.g. for i386 we may load 256bit vector using both vector and scalar
>> masks.
>
> Ok, sure, the mask mode is needed, I get that.  But that doesn't answer the
> question regarding conversion.  Why would convert_optab_handler be needed to
> *change* the mode of the mask.  I assume that's not actually possible, with 
> the
> target hook already having chosen the proper mode for the mask.

There is no any conversion here, maskload_optab is a convert_optab
because it uses two modes, one for value and the other one for mask.

Ilya

>
>
> r~


Re: [PATCH] Fix -Wmisleading-indentation false-positive

2015-09-21 Thread Bernd Schmidt

On 09/21/2015 01:23 PM, Patrick Palka wrote:


* c-indentation.c (should_warn_for_misleading_indentation):
Float out and consolidate the calls to get_visual_column that is
passed guard_exploc as an argument.  Compare
next_stmt_vis_column with guard_line_first_nws instead of with
body_line_first_nws.


Ok.


Bernd


Re: [PATCH c/c++] use explicit locations for some warnings in c-pragma.c

2015-09-21 Thread Manuel López-Ibáñez
On 21 September 2015 at 06:52, Christophe Lyon
 wrote:
> It looks like I mis-applied your patch.
> I cleaned up and re-applied it, and I can confirm it fixes the problem.

Committed as obvious as r227967.

Thanks,

Manuel.


Re: Fwd: [PATCH] Enable libstdc++ numeric conversions on Cygwin

2015-09-21 Thread Jonathan Wakely

On 19/09/15 15:55 -0400, Jennifer Yao wrote:

Do you already have a copyright assignment for GCC?

If not, would you be willing to complete one?


No, but I would be willing to complete one, yes. Are you willing to
send me the appropriate forms, or should I contact someone else?


Great, I'll send you the forms, thanks.



Re: [PATCH, i386, AVX-512] Fix splitter for `not-xor' logic.

2015-09-21 Thread Uros Bizjak
On Mon, Sep 21, 2015 at 12:50 PM, Kirill Yukhin  wrote:
> Hello Uroš,
>
> This simple patch fixes mode (to iterator) in
> splitter for `not-xor' logic.
>
> Bootstrapped & reg-tested on trunk.
>
> Is it ok for trunk?
>
> Is it ok for gcc-5-branch? (if tested accordingly)

OK everywhere. Looks obvious, though.

> (I have no idea how to reference to define_split in
> ChangeLog entry)

Yes, you need a bit of creativity to describe define_splits, but looks OK to me.

>
> gcc/
> * gcc/config/i386/i386.md (define_split not/xor SWI1248x): Use
> iterator instead of fixed modes..

Thanks,
Uros.


Re: [libcpp/C PATCH] Handle lines encoded into several maps in linemap_position_for_loc_and_offset

2015-09-21 Thread Marek Polacek
On Sun, Aug 23, 2015 at 06:07:13PM +0200, Manuel López-Ibáñez wrote:
> linemap_position_for_loc_and_offset() tries to generate a location_t
> encoding a column offset from the current location, for example, point
> to a certain character inside a string. This is trivial to do when the
> new location "fits within" the map of the original location. However,
> it may happen that the (long) line corresponding to the original
> location is encoded in several maps, thus the new location should
> actually be encoded in a subsequent map from the original location.
> This patch detects this case and adjusts the map correspondingly.
> 
> (This shows that the line-map representation is quite wasteful in this
> case, because line-maps always start at column 0. That is, map[0]
> highest location may encode up to line 8 column 80, then
> map[1]->start_location starts encoding at line 8 column 0. Thus, there
> are two location_t values that point to the same source location.)
> 
> No new testcase since Marek found out this when tackling PR c/66415
> (https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00411.html). This
> change just fixes the column number.
> 
> Bootstrapped & regtested on x86-64-linux-gnu.
> 
> OK?

Ok, thanks.

Marek


[SH][committed] FIx PR 67657

2015-09-21 Thread Oleg Endo
Hi,

This fixes PR 67657 on trunk.  It seems that something after the
peephole2 pass is not happy to see things like:
 mov.l   @r2+,r2

which some of the SH peephole2 patterns create out of e.g.
mov.l   @r2+,r0
mov r0,r2

The post-inc is a bit pointless, so this patch catches such cases and
replaces the address in the mem with a non-post-inc address.

Tested on sh-elf with
make -k check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

Committed as r227969.
A backport to GCC 5 will follow.

Cheers,
Oleg

gcc/ChangeLog:
PR target/67657
* config/sh/sh.c (sh_remove_overlapping_post_inc,
sh_peephole_emit_move_insn): Add new functions.
* config/sh/sh-protos.h (sh_remove_overlapping_post_inc,
sh_peephole_emit_move_insn): Declere them.
* config/sh/sh.md: Use them in various peephole2 patterns.
Index: gcc/config/sh/sh-protos.h
===
--- gcc/config/sh/sh-protos.h	(revision 227958)
+++ gcc/config/sh/sh-protos.h	(working copy)
@@ -306,6 +306,8 @@
 extern bool sh_reg_dead_or_unused_after_insn (const rtx_insn* i, int regno);
 extern void sh_remove_reg_dead_or_unused_notes (rtx_insn* i, int regno);
 extern rtx_insn* sh_check_add_incdec_notes (rtx_insn* i);
+extern rtx sh_remove_overlapping_post_inc (rtx dst, rtx src);
+extern rtx_insn* sh_peephole_emit_move_insn (rtx dst, rtx src);
 
 extern bool sh_in_recog_treg_set_expr (void);
 extern bool sh_recog_treg_set_expr (rtx op, machine_mode mode);
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 227958)
+++ gcc/config/sh/sh.c	(working copy)
@@ -13810,6 +13810,34 @@
   return i;
 }
 
+/* Given a move insn destiation and a source, make sure that the move source
+   operand is not a post-inc mem load with the same address reg as the
+   destination.  Returns the modified source operand with the post-inc removed
+   if necessary.  */
+rtx
+sh_remove_overlapping_post_inc (rtx dst, rtx src)
+{
+  if (!MEM_P (src))
+return src;
+
+  rtx addr = XEXP (src, 0);
+
+  if (GET_CODE (addr) == POST_INC
+  && reg_overlap_mentioned_p (XEXP (addr, 0), dst))
+return replace_equiv_address (src, XEXP (addr, 0));
+
+  gcc_assert (GET_CODE (addr) != POST_MODIFY);
+  return src;
+}
+
+/* Emit a move insn that is safe to be used in peephole patterns.  */
+rtx_insn*
+sh_peephole_emit_move_insn (rtx dst, rtx src)
+{
+  return sh_check_add_incdec_notes (
+	emit_move_insn (dst, sh_remove_overlapping_post_inc (dst, src)));
+}
+
 /* Given an op rtx and an insn, try to find out whether the result of the
specified op consists only of logical operations on T bit stores.  */
 bool
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 227958)
+++ gcc/config/sh/sh.md	(working copy)
@@ -14681,7 +14681,7 @@
   [(const_int 0)]
 {
   emit_insn (gen_addsi3 (operands[1], operands[1], operands[2]));
-  sh_check_add_incdec_notes (emit_move_insn (operands[3], operands[1]));
+  sh_peephole_emit_move_insn (operands[3], operands[1]);
 })
 
 ;;	mov.l	@(r0,r9),r1
@@ -14694,7 +14694,7 @@
   "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
   [(const_int 0)]
 {
-  sh_check_add_incdec_notes (emit_move_insn (operands[2], operands[1]));
+  sh_peephole_emit_move_insn (operands[2], operands[1]);
 })
 
 (define_peephole2
@@ -14705,7 +14705,7 @@
   "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
   [(const_int 0)]
 {
-  sh_check_add_incdec_notes (emit_move_insn (operands[2], operands[1]));
+  sh_peephole_emit_move_insn (operands[2], operands[1]);
 })
 
 (define_peephole2
@@ -14717,7 +14717,7 @@
   [(const_int 0)]
 {
   sh_check_add_incdec_notes (emit_insn (gen_extendsi2 (operands[2],
-			 operands[1])));
+		   sh_remove_overlapping_post_inc (operands[2], operands[1];
 })
 
 ;;	mov.w	@(18,r1),r0 (r0 = HImode)
@@ -14747,8 +14747,9 @@
 
   // We don't know what the new set insn will be in detail.  Just make sure
   // that it still can be recognized and the constraints are satisfied.
-  rtx_insn* i = emit_insn (gen_rtx_SET (operands[2], operands[3]));
-		 
+  rtx_insn* i = emit_insn (gen_rtx_SET (operands[2],
+		sh_remove_overlapping_post_inc (operands[2], operands[3])));
+
   recog_data_d prev_recog_data = recog_data;
   bool i_invalid = insn_invalid_p (i, false); 
   recog_data = prev_recog_data;
@@ -14786,7 +14787,8 @@
 {
   // We don't know what the new set insn will be in detail.  Just make sure
   // that it still can be recognized and the constraints are satisfied.
-  rtx_insn* i = emit_insn (gen_rtx_SET (operands[2], operands[3]));
+  rtx_insn* i = emit_insn (gen_rtx_SET (operands[2],
+		sh_remove_overlapping_post_inc (operands[2], operands[3])));
 
   recog_data_d prev_recog_data = recog_data;
   bool i_invalid = insn_invalid_p (i, false); 


Re: [C++ PATCH] Complete the implementation of N4230, Nested namespace definition.

2015-09-21 Thread Jason Merrill

OK, thanks!

Jason


Re: [SH][committed] FIx PR 67657

2015-09-21 Thread Oleg Endo
On Mon, 2015-09-21 at 22:04 +0900, Oleg Endo wrote:
> Hi,
> 
> This fixes PR 67657 on trunk.  It seems that something after the
> peephole2 pass is not happy to see things like:
>  mov.l   @r2+,r2
> 
> which some of the SH peephole2 patterns create out of e.g.
> mov.l   @r2+,r0
> mov r0,r2
> 
> The post-inc is a bit pointless, so this patch catches such cases and
> replaces the address in the mem with a non-post-inc address.
> 
> Tested on sh-elf with
> make -k check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> 
> Committed as r227969.
> A backport to GCC 5 will follow.

This is the patch for GCC 5.  Basically same, just a small difference in
the gen_rtx_SET invocation.  Committed as r227970.

Cheers,
Oleg
Index: gcc/config/sh/sh-protos.h
===
--- gcc/config/sh/sh-protos.h	(revision 227969)
+++ gcc/config/sh/sh-protos.h	(working copy)
@@ -305,6 +305,8 @@
 extern bool sh_reg_dead_or_unused_after_insn (const rtx_insn* i, int regno);
 extern void sh_remove_reg_dead_or_unused_notes (rtx_insn* i, int regno);
 extern rtx_insn* sh_check_add_incdec_notes (rtx_insn* i);
+extern rtx sh_remove_overlapping_post_inc (rtx dst, rtx src);
+extern rtx_insn* sh_peephole_emit_move_insn (rtx dst, rtx src);
 
 extern bool sh_in_recog_treg_set_expr (void);
 extern bool sh_recog_treg_set_expr (rtx op, machine_mode mode);
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 227969)
+++ gcc/config/sh/sh.c	(working copy)
@@ -13787,6 +13787,34 @@
   return i;
 }
 
+/* Given a move insn destiation and a source, make sure that the move source
+   operand is not a post-inc mem load with the same address reg as the
+   destination.  Returns the modified source operand with the post-inc removed
+   if necessary.  */
+rtx
+sh_remove_overlapping_post_inc (rtx dst, rtx src)
+{
+  if (!MEM_P (src))
+return src;
+
+  rtx addr = XEXP (src, 0);
+
+  if (GET_CODE (addr) == POST_INC
+  && reg_overlap_mentioned_p (XEXP (addr, 0), dst))
+return replace_equiv_address (src, XEXP (addr, 0));
+
+  gcc_assert (GET_CODE (addr) != POST_MODIFY);
+  return src;
+}
+
+/* Emit a move insn that is safe to be used in peephole patterns.  */
+rtx_insn*
+sh_peephole_emit_move_insn (rtx dst, rtx src)
+{
+  return sh_check_add_incdec_notes (
+	emit_move_insn (dst, sh_remove_overlapping_post_inc (dst, src)));
+}
+
 /* Given an op rtx and an insn, try to find out whether the result of the
specified op consists only of logical operations on T bit stores.  */
 bool
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 227969)
+++ gcc/config/sh/sh.md	(working copy)
@@ -14644,7 +14644,7 @@
   [(const_int 0)]
 {
   emit_insn (gen_addsi3 (operands[1], operands[1], operands[2]));
-  sh_check_add_incdec_notes (emit_move_insn (operands[3], operands[1]));
+  sh_peephole_emit_move_insn (operands[3], operands[1]);
 })
 
 ;;	mov.l	@(r0,r9),r1
@@ -14657,7 +14657,7 @@
   "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
   [(const_int 0)]
 {
-  sh_check_add_incdec_notes (emit_move_insn (operands[2], operands[1]));
+  sh_peephole_emit_move_insn (operands[2], operands[1]);
 })
 
 (define_peephole2
@@ -14668,7 +14668,7 @@
   "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
   [(const_int 0)]
 {
-  sh_check_add_incdec_notes (emit_move_insn (operands[2], operands[1]));
+  sh_peephole_emit_move_insn (operands[2], operands[1]);
 })
 
 (define_peephole2
@@ -14680,7 +14680,7 @@
   [(const_int 0)]
 {
   sh_check_add_incdec_notes (emit_insn (gen_extendsi2 (operands[2],
-			 operands[1])));
+		   sh_remove_overlapping_post_inc (operands[2], operands[1];
 })
 
 ;;	mov.w	@(18,r1),r0 (r0 = HImode)
@@ -14710,8 +14710,9 @@
 
   // We don't know what the new set insn will be in detail.  Just make sure
   // that it still can be recognized and the constraints are satisfied.
-  rtx_insn* i = emit_insn (gen_rtx_SET (VOIDmode, operands[2], operands[3]));
-		 
+  rtx_insn* i = emit_insn (gen_rtx_SET (VOIDmode, operands[2],
+		sh_remove_overlapping_post_inc (operands[2], operands[3])));
+
   recog_data_d prev_recog_data = recog_data;
   bool i_invalid = insn_invalid_p (i, false); 
   recog_data = prev_recog_data;
@@ -14749,7 +14750,8 @@
 {
   // We don't know what the new set insn will be in detail.  Just make sure
   // that it still can be recognized and the constraints are satisfied.
-  rtx_insn* i = emit_insn (gen_rtx_SET (VOIDmode, operands[2], operands[3]));
+  rtx_insn* i = emit_insn (gen_rtx_SET (VOIDmode, operands[2],
+		sh_remove_overlapping_post_inc (operands[2], operands[3])));
 
   recog_data_d prev_recog_data = recog_data;
   bool i_invalid = insn_invalid_p (i, false); 


Re: [PATCH][AArch64] Use preferred aliases for CSNEG, CSINC, CSINV

2015-09-21 Thread Kyrill Tkachov

Hi Andrew,

On 12/09/15 02:15, Andrew Pinski wrote:

On Tue, Sep 1, 2015 at 6:08 PM, Kyrill Tkachov  wrote:

Hi all,

The ARMv8-A reference manual says:
"CNEG , , 
is equivalent to
CSNEG , , , invert()
and is the preferred disassembly when Rn == Rm && cond != '111x'."

That is, when the two input registers are the same we can use the shorter
CNEG mnemonic
with the inverse condition instead of the longer CSNEG instruction.
Similarly for the
CSINV and CSINC instructions, they have shorter CINV and CINC forms.
This patch adjusts the output templates to emit the preferred shorter
sequences when possible.

The new mnemonics are just aliases, they map down to the same instruction in
the end, so there
are no performance or behaviour implications. But it does make the assembly
a bit more readable
IMO, since:
"cnegw27, w9, le"
can be simply read as "if the condition is less or equal negate w9" instead
of the previous:
"csnegw27, w9, w9, gt" where you have to remember which of the input
registers is negated.


Bootstrapped and tested on aarch64-linux-gnu.
Ok for trunk?

I really think this kind of special casing is not correct and does not
belong in the compiler.  The main reason it complicates the back-end
more than the benefit of easier of reading the assembly code.


.
The complication is an extra if-else statement with
explicit strings on each arm i.e. there's no snprintf trickery.
I tend to read a lot of the generated assembly when performing assembly
comparisons while working on performance patches and I find that having
the cneg from with two register operands and the negate condition is quicker
to parse than the full csneg form where I need to remember that extra bit
of info that the condition there must be inverted to get the negation condition.

If you feel very strongly against this I can withdraw this patch, but I'd 
rather have it in.
FWIW, clang also emits the CNEG when it can AFAICS, though I admit that's not 
necessarily a strong
argument for this change.

Kyrill



Thanks,
Andrew Pinski


Thanks,
Kyrill

2015-09-01  Kyrylo Tkachov  

 * config/aarch64/aarch64.md (csinc3_insn): Use CINC
 mnemonic when possible.
 (*csinv3_insn): Use CINV mnemonic when possible.
 (csneg3_insn): USE CNEG mnemonic when possible.

2015-09-01  Kyrylo Tkachov  

 * gcc.target/aarch64/abs_1.c: Update scan-assembler checks
 to allow cneg.
 * gcc.target/aarch64/cond_op_imm_1.c: Likewise.  Likewise for cinv.
 * gcc.target/aarch64/mod_2.c: Likewise.




Re: [PATCH] New attribute to create target clones

2015-09-21 Thread Bernd Schmidt

On 08/27/2015 01:18 PM, Evgeny Stupachenko wrote:

Based on RFC:
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html

The patch implement an extension to Function Multiversioning that
allows to clone a function for multiple targets.
__attribute__((target_clones("avx","arch=slm","default")))
int foo ()
...

Will create 3 clones of foo(). One optimized with -mavx, one optimized
with -march=slm, and one with default optimizations.
And will create ifunc resolver that calls appropriate clone (same as
in Function Multiversioning).


The general question is - do we want this, given that it seems to 
introduce no functionality that can't be had with the existing 
multiversioning? You could always compile the same source file to 
multiple objects with different defines for the target optimization, or 
include a file containing the multiversioned function multiple times 
with changing #defines.


Some comments on the patch itself:

> diff --git a/gcc/testsuite/gcc.dg/mvc1.c b/gcc/testsuite/gcc.dg/mvc1.c
> new file mode 100644
> index 000..ab1e0f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/mvc1.c
> @@ -0,0 +1,27 @@
> +/* { dg-do run { target i?86-*-* x86_64-*-* } } */
> +/* { dg-options "-O2" } */

Target-specific tests should go into gcc.target/i386.


+  else if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (*node)))
+   {
+ warning (OPT_Wattributes,
+  "%qE attribute ignored as it conflict with target_clones",
+  name);
+ *no_add_attrs = true;
+   }


Bad grammar ("conflicts"), and apparently you're supposed to use %qs 
even for known attribute names. Look at the code immediately before this 
and copy it. Similar issues throughout.



+  for (e = node->callers; e ;e = (e == NULL) ? e_next : e->next_caller)


It looks like this could be simplified if you immediately assign
  e_next = e->next_caller
and then always use e_next.

Trying to test it, I get (compiling with -O2 on x86_64-linux):

mvc1.c: In function ‘main’:
mvc1.c:18:1: error: virtual definition of statement not up-to-date
 main ()
 ^
_2 = foo.ifunc ();
mvc1.c:18:1: internal compiler error: verify_ssa failed


Bernd



Re: [C++] Coding rule enforcement

2015-09-21 Thread Daniel Gutson
On Tue, Sep 15, 2015 at 3:31 PM, Jason Merrill  wrote:
> On 09/15/2015 01:20 PM, Manuel López-Ibáñez wrote:
>>
>> On 15/09/15 15:26, Richard Biener wrote:
>>>
>>> On Tue, Sep 15, 2015 at 3:02 PM, Nathan Sidwell  wrote:

 Jason,
 somme of our customers have 'interesting' C++ coding rules, they'd
 like to
 have the compiler enforced.  They want to disable:

 1) namespace definitions
 2) template declarations
 3) multiple inheritance
 4) virtual inheritance

 But they want to use the STL.  This patch implements 4 new flags,
 intended
 to be use in the -fno-FOO form.  They're only active outside of system
 header files.
>>
>>
>> If these are quite specific coding rules, wouldn't this be something
>> ideal for a plugin rather than implemented in the compiler proper?
>>
>> One can implement warnings with plugins (in python if desired!)
>>
>>
>> http://gcc-python-plugin.readthedocs.org/en/latest/basics.html#generating-custom-errors-and-warnings
>>
>>
>> Compiler plugins that implement specific coding rules are quite common,
>> alas, using clang not GCC:
>> https://wiki.documentfoundation.org/Development/Clang_plugins
>>
>> It would be more generally useful to extend GCC to fully support this
>> type of plugins.
>
>
> Good point.

FWIW, we could make this plugin in 2 weeks (w already have static
checkers as plugins for our customers). I understand Nathan that you
may have some deadlines, but if we could have the opportunity to
implement it, we could accomplish a clean isolation of a particular
business needs (despite I acknowledge that a warning about virtual
inheritance may be useful for a broader audience). OTOH, a plugin can
receive arguments, such as a configuration file which could point to
specific sources or hints about where to apply the warning, or a
suppression file, both things useful for large legacy code. IMVHO I
think this is a superior solution.
Please let me know if we could collaborate to get both a better gcc
and a better static checker.


>
> Jason
>
>



-- 

Daniel F. Gutson
Chief Engineering Officer, SPD

San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina

Phone:   +54 351 4217888 / +54 351 4218211
Skype:dgutson
LinkedIn: http://ar.linkedin.com/in/danielgutson


[SH][committed] Adjust pr64345-1.c testcase for SH2A

2015-09-21 Thread Oleg Endo
Hi,

There have been some changes to the generated bitfiddling code for SH2A
and the test case caught them.  Those changes are good, and the test
case can be adjusted accordingly.

Tested on sh-elf with
make -k check-gcc RUNTESTFLAGS="sh.exp --target_board=sh-sim\{-m2a/-mb}"

Committed as r227971.

Cheers,
Oleg

gcc/testsuite/ChangeLog:
PR target/64345
* gcc.target/sh/pr64345-1.c: Adjust expected insn counts for SH2A.
Index: gcc/testsuite/gcc.target/sh/pr64345-1.c
===
--- gcc/testsuite/gcc.target/sh/pr64345-1.c	(revision 227970)
+++ gcc/testsuite/gcc.target/sh/pr64345-1.c	(working copy)
@@ -10,10 +10,10 @@
 /* { dg-final { scan-assembler-times "shlr\t" 8 { target { ! sh2a } } } }  */
 
 /* { dg-final { scan-assembler-times "tst" 442 { target { sh2a } } } }  */
-/* { dg-final { scan-assembler-times "bld" 276 { target { sh2a } } } }  */
+/* { dg-final { scan-assembler-times "bld" 285 { target { sh2a } } } }  */
 /* { dg-final { scan-assembler-times "cmp/pz" 6 { target { sh2a } } } }  */
-/* { dg-final { scan-assembler-times "shll\t" 4 { target { sh2a } } } }  */
-/* { dg-final { scan-assembler-times "shlr\t" 8 { target { sh2a } } } }  */
+/* { dg-final { scan-assembler-times "shll\t" 3 { target { sh2a } } } }  */
+/* { dg-final { scan-assembler-times "shlr\t" 0 { target { sh2a } } } }  */
 
 /* { dg-final { scan-assembler-times "and\t#1" 32 } }  */
 


Re: [Patch/ccmp] Cost instruction sequences to choose better expand order

2015-09-21 Thread Jiong Wang

Bernd Schmidt writes:

> On 09/18/2015 05:21 PM, Jiong Wang wrote:
>>
>> Current conditional compare (CCMP) support in GCC aim to optimize
>> short circuit for cascade comparision, given a simple conditional
>> compare candidate:
>>
>>if (a == 17 || a == 32)
> [...]
>> The problem is current implementation always expand t0 first, then
>> t1. While the expand order need to consider the rtx costs, because "cmp"
>> and "ccmp" may have different restrictions that the expand order will
>> result in performance differences.
>>
>> For example on AArch64, "ccmp" only accept immediate within -31 ~ 31
>> while "cmp" accept wider range, so if we expand "a == 32" in the second
>> step, then it will use "ccmp", and thus an extra "mov reg, 32"
>> instruction is generated because "32" is out of the range. While if we
>> expand "a == 32" first, then it's fine as "cmp" can encoding it.
>
> I've played with this patch a bit with an aarch64 cross compiler. First 
> of all - it doesn't seem to work, I get identical costs and the swapping 
> doesn't happen. Did you forget to include a rtx_cost patch?

No. Please see NOTE part of the description. AArch64 doesn't cost ccmp
currently. It will be fixed by a seperate patch later. The testcase is
thus marked as XFAIL.

>
> I was a little worried about whether this would be expensive for longer 
> sequences of conditions, but it seems like it looks only at leafs where 
> we have two comparisons, so that cost should be minimal. However, it's 
> possible there's room for improvement in code generation. I was curious 
> and looked at a slightly more complex testcase
>
> int
> foo (int a)
> {
>if (a == 17 || a == 32 || a == 47 || a == 53 || a == 66 || a == 72)
>  return 1;
>else
>  return 2;
> }

Yes, this patch only cost the inner most call of the recursive call of
expand_ccmp_expr_1 which is simple and common situation.
>
> and this doesn't generate a sequence of ccmps as might have been 
> expected; we only get pairs of comparisons merged with a bit_ior:
>
>D.2699 = a == 17;
>D.2700 = a == 32;
>D.2701 = D.2699 | D.2700;
>if (D.2701 != 0) goto ; else goto ;
>:
>D.2703 = a == 47;
>D.2704 = a == 53;
>D.2705 = D.2703 | D.2704;
>if (D.2705 != 0) goto ; else goto ;
>
> and the ccmp expander doesn't see the entire thing. I found that a 
> little surprising TBH.

It's because fold_truth_andor has re-written the tree to avoid the use
short circuit given LOGICAL_OP_NON_SHORT_CIRCUIT returns true.

-- 
Regards,
Jiong



Re: [PATCH, ARM]: Fix static interworking call

2015-09-21 Thread Christophe Lyon
On 21 September 2015 at 12:55, Christian Bruel  wrote:
> Hi Christophe,
>
>> It seems you committed the 1st version of your patch.
>
>
> Not sure what version you are talking about. I committed what was posted
> (https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01260.html) at revision
> #227880.

I thought you had modified it, after Richard's comments.

>
>> However, it fails if one forces a armv5t target, because, as Richard
>> said -mfloat-abi=hard is not supported on Thumb1.
>
>
> sure, but the testcase was for thumb2. It's just that the thumb1 check in
> the dg-skip-if is indeed useless. I'll fix this.
>
>>
>> It tried to remove -mfloat-abi=hard, and I noticed that only 1 'blx'
>> is generated when using soft or softfp.
>>
>> I expect this to be fixed by the linker anyway, but it may mean there
>> is still something wrong.
>
>
> This is expected. For thumb1, such static calls interwork are still done by
> the linker.
> For consistency IMHO we could also resolve static interwork thumb1 calls in
> the compiler, but that would be another issue.
> This failure was only for thumb2 (this having -march=armv7-a in the
> dg-options). (See https://sourceware.org/bugzilla/show_bug.cgi?id=17505)
>
> Hope that clarifies,
>
Yes, thanks.

> thanks
>
> Christian


Re: [Patch/ccmp] Cost instruction sequences to choose better expand order

2015-09-21 Thread pinskia


> On Sep 21, 2015, at 4:39 AM, Bernd Schmidt  wrote:
> 
>> On 09/18/2015 05:21 PM, Jiong Wang wrote:
>> 
>> Current conditional compare (CCMP) support in GCC aim to optimize
>> short circuit for cascade comparision, given a simple conditional
>> compare candidate:
>> 
>>   if (a == 17 || a == 32)
> [...]
>> The problem is current implementation always expand t0 first, then
>> t1. While the expand order need to consider the rtx costs, because "cmp"
>> and "ccmp" may have different restrictions that the expand order will
>> result in performance differences.
>> 
>> For example on AArch64, "ccmp" only accept immediate within -31 ~ 31
>> while "cmp" accept wider range, so if we expand "a == 32" in the second
>> step, then it will use "ccmp", and thus an extra "mov reg, 32"
>> instruction is generated because "32" is out of the range. While if we
>> expand "a == 32" first, then it's fine as "cmp" can encoding it.
> 
> I've played with this patch a bit with an aarch64 cross compiler. First of 
> all - it doesn't seem to work, I get identical costs and the swapping doesn't 
> happen. Did you forget to include a rtx_cost patch?
> 
> I was a little worried about whether this would be expensive for longer 
> sequences of conditions, but it seems like it looks only at leafs where we 
> have two comparisons, so that cost should be minimal. However, it's possible 
> there's room for improvement in code generation. I was curious and looked at 
> a slightly more complex testcase
> 
> int
> foo (int a)
> {
>  if (a == 17 || a == 32 || a == 47 || a == 53 || a == 66 || a == 72)
>return 1;
>  else
>return 2;
> }
> 
> and this doesn't generate a sequence of ccmps as might have been expected; we 
> only get pairs of comparisons merged with a bit_ior:
> 
>  D.2699 = a == 17;
>  D.2700 = a == 32;
>  D.2701 = D.2699 | D.2700;
>  if (D.2701 != 0) goto ; else goto ;
>  :
>  D.2703 = a == 47;
>  D.2704 = a == 53;
>  D.2705 = D.2703 | D.2704;
>  if (D.2705 != 0) goto ; else goto ;
> 
> and the ccmp expander doesn't see the entire thing. I found that a little 
> surprising TBH.

This is a known issue with fold-const folding too early. Replace || with | and 
add some parentheses and you should get a string of ccmp's. 

Thanks,
Andrew


> 
> 
> Bernd


[PATCH] Fix PR67664

2015-09-21 Thread Richard Biener

The following avoids duplicate DW_AT_const_value attributesin the way
probably intended by the early-debug merge which added the get_AT
check in the first place.

Bootstrapped & tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-09-21  Richard Biener  

PR debug/67664
* dwarf2out.c (add_location_or_const_value_attribute): Remove
attribute parameter.  Early exit if either DW_AT_const_value
or DW_AT_location are present already.
(gen_variable_die): Adjust caller.
(dwarf2out_late_global_decl): Likewise.

* g++.dg/debug/dwarf2/pr67764.C: New testcase.

Index: gcc/dwarf2out.c
===
*** gcc/dwarf2out.c (revision 227899)
--- gcc/dwarf2out.c (working copy)
*** static void insert_int (HOST_WIDE_INT, u
*** 3254,3261 
  static void insert_wide_int (const wide_int &, unsigned char *, int);
  static void insert_float (const_rtx, unsigned char *);
  static rtx rtl_for_decl_location (tree);
! static bool add_location_or_const_value_attribute (dw_die_ref, tree, bool,
!  enum dwarf_attribute);
  static bool tree_add_const_value_attribute (dw_die_ref, tree);
  static bool tree_add_const_value_attribute_for_decl (dw_die_ref, tree);
  static void add_name_attribute (dw_die_ref, const char *);
--- 3254,3260 
  static void insert_wide_int (const wide_int &, unsigned char *, int);
  static void insert_float (const_rtx, unsigned char *);
  static rtx rtl_for_decl_location (tree);
! static bool add_location_or_const_value_attribute (dw_die_ref, tree, bool);
  static bool tree_add_const_value_attribute (dw_die_ref, tree);
  static bool tree_add_const_value_attribute_for_decl (dw_die_ref, tree);
  static void add_name_attribute (dw_die_ref, const char *);
*** fortran_common (tree decl, HOST_WIDE_INT
*** 16137,16144 
 since we will need to refer to them each time the function is inlined.  */
  
  static bool
! add_location_or_const_value_attribute (dw_die_ref die, tree decl, bool 
cache_p,
!  enum dwarf_attribute attr)
  {
rtx rtl;
dw_loc_list_ref list;
--- 16136,16142 
 since we will need to refer to them each time the function is inlined.  */
  
  static bool
! add_location_or_const_value_attribute (dw_die_ref die, tree decl, bool 
cache_p)
  {
rtx rtl;
dw_loc_list_ref list;
*** add_location_or_const_value_attribute (d
*** 16151,16157 
if (TREE_CODE (decl) == ERROR_MARK)
  return false;
  
!   if (get_AT (die, attr))
  return true;
  
gcc_assert (TREE_CODE (decl) == VAR_DECL || TREE_CODE (decl) == PARM_DECL
--- 16149,16156 
if (TREE_CODE (decl) == ERROR_MARK)
  return false;
  
!   if (get_AT (die, DW_AT_location)
!   || get_AT (die, DW_AT_const_value))
  return true;
  
gcc_assert (TREE_CODE (decl) == VAR_DECL || TREE_CODE (decl) == PARM_DECL
*** add_location_or_const_value_attribute (d
*** 16217,16223 
  }
if (list)
  {
!   add_AT_location_description (die, attr, list);
return true;
  }
/* None of that worked, so it must not really have a location;
--- 16216,16222 
  }
if (list)
  {
!   add_AT_location_description (die, DW_AT_location, list);
return true;
  }
/* None of that worked, so it must not really have a location;
*** gen_formal_parameter_die (tree node, tre
*** 18117,18123 
  equate_decl_number_to_die (node, parm_die);
if (! DECL_ABSTRACT_P (node_or_origin))
add_location_or_const_value_attribute (parm_die, node_or_origin,
!  node == NULL, DW_AT_location);
  
break;
  
--- 18116,18122 
  equate_decl_number_to_die (node, parm_die);
if (! DECL_ABSTRACT_P (node_or_origin))
add_location_or_const_value_attribute (parm_die, node_or_origin,
!  node == NULL);
  
break;
  
*** gen_variable_die (tree decl, tree origin
*** 19663,19669 
add_pubname (decl_or_origin, var_die);
else
add_location_or_const_value_attribute (var_die, decl_or_origin,
!  decl == NULL, DW_AT_location);
  }
else
  tree_add_const_value_attribute_for_decl (var_die, decl_or_origin);
--- 19662,19668 
add_pubname (decl_or_origin, var_die);
else
add_location_or_const_value_attribute (var_die, decl_or_origin,
!  decl == NULL);
  }
else
  tree_add_const_value_attribute_for_decl (var_die, decl_or_origin);
*** dwarf2out_late_global_decl (tree decl)
*** 21640,21647 
  {
dw_die_ref die = lookup_decl_die (decl);
if (die)
!   add_location_or_const_value_attribute (die, decl,

Re: [C++] Coding rule enforcement

2015-09-21 Thread Manuel López-Ibáñez
On 21 September 2015 at 15:46, Daniel Gutson
 wrote:
>
> FWIW, we could make this plugin in 2 weeks (w already have static
> checkers as plugins for our customers). I understand Nathan that you
> may have some deadlines, but if we could have the opportunity to
> implement it, we could accomplish a clean isolation of a particular
> business needs (despite I acknowledge that a warning about virtual
> inheritance may be useful for a broader audience). OTOH, a plugin can
> receive arguments, such as a configuration file which could point to
> specific sources or hints about where to apply the warning, or a
> suppression file, both things useful for large legacy code. IMVHO I
> think this is a superior solution.
> Please let me know if we could collaborate to get both a better gcc
> and a better static checker.

My opinion is that if people want to "donate" their plugins to the
FSF, they are willing to maintain them, and they are not overtly
complex or require any third-party software, they should be added to
the GCC repository and build/tested. They could serve as examples and
extra testing for the plugin framework, with the expectation that they
may get removed if they become unmaintained.

Cheers,

Manuel.


Re: [PATCH] shrink-wrap: Handle multiple predecessors of prologue

2015-09-21 Thread Segher Boessenkool
On Mon, Sep 21, 2015 at 01:56:28PM +0200, Bernd Schmidt wrote:
> >+  basic_block new_bb = create_empty_bb (EXIT_BLOCK_PTR_FOR_FN 
> >(cfun)->prev_bb);
> >+  BB_COPY_PARTITION (new_bb, pro);
> [...]
> >+  *entry_edge = make_single_succ_edge (new_bb, pro, EDGE_FALLTHRU);
> >+  force_nonfallthru (*entry_edge);
> >+
> >free_dominance_info (CDI_DOMINATORS);
> >  }
> 
> Not sure I like creating a fallthru edge and then calling 
> force_nonfallthru on it.

It's just to create the edge and a branch, and to set all flags and the
edge probability etc. correctly for it.  (You have to first make it
EDGE_FALLTHRU or force_nonfallthru will complain).  Is there a nicer
idiom, or a helper function?  Should there be?

> Have a look at emit_to_new_bb_before in 
> except.c which does something similar - it creates the new block before 
> the one into which it wants to fall thru, and modifies a potential 
> existing fallthru edge. I'd like that better; please change the patch if 
> you also agree. Otherwise ok.

We might not be able to redirect the jump from the block formerly the
predecessor of PRO, to jump over the new block (it was not necessarily a
fallthrough edge, could be EDGE_COMPLEX).  In most cases we could do it
of course.  I would prefer not to add special case code (that is not as
well tested).


Segher


[PATCH] Fix dwarf2out ICE

2015-09-21 Thread Richard Biener

When testing with -g g++.dg/ext/attr-alias-3.C ICEs because we mess
up the context of class-scope vars with aliases.  The following fixes
this.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-09-21  Richard Biener  

* passes.c (rest_of_decl_compilation): Do not call
dwarf2out_early_global_decl for aliases.

* g++.dg/ext/attr-alias-3.C: Add -g.

Index: gcc/passes.c
===
*** gcc/passes.c(revision 227899)
--- gcc/passes.c(working copy)
*** rest_of_decl_compilation (tree decl,
*** 327,333 
 and thus have incomplete early debug and late debug
 called from varpool node removal fails to handle it
 properly.  */
! || (TREE_CODE (decl) == VAR_DECL
  && TREE_STATIC (decl) && !DECL_EXTERNAL (decl)))
/* Avoid confusing the debug information machinery when there are
 errors.  */
--- 327,334 
 and thus have incomplete early debug and late debug
 called from varpool node removal fails to handle it
 properly.  */
! || (finalize
! && TREE_CODE (decl) == VAR_DECL
  && TREE_STATIC (decl) && !DECL_EXTERNAL (decl)))
/* Avoid confusing the debug information machinery when there are
 errors.  */
Index: gcc/testsuite/g++.dg/ext/attr-alias-3.C
===
*** gcc/testsuite/g++.dg/ext/attr-alias-3.C (revision 227899)
--- gcc/testsuite/g++.dg/ext/attr-alias-3.C (working copy)
***
*** 1,5 
--- 1,6 
  // PR c++/56134
  // { dg-require-alias "" }
+ // { dg-options "-g" }
  
  char a;
  class Q


Re: [AArch64][PATCH 4/5] Use atomic load-operate instructions for fetch-update patterns.

2015-09-21 Thread James Greenhalgh
On Mon, Sep 21, 2015 at 12:31:21PM +0100, Matthew Wahab wrote:
> I've also removed the CC clobber from the _lse patterns, it was overly 
> cautious.
> 
> Update patch attached, Ok for trunk?
> Matthew

OK.

Thanks,
James

> 
> gcc/
> 2015-09-21  Matthew Wahab  
> 
>   * config/aarch64/aarch64-protos.h
>   (aarch64_atomic_ldop_supported_p): Declare.
>   * config/aarch64/aarch64.c (aarch64_atomic_ldop_supported_p): New.
>   (enum aarch64_atomic_load_op_code): New.
>   (aarch64_emit_atomic_load_op): New.
>   (aarch64_gen_atomic_ldop): Update to support load-operate
>   patterns.
>   * config/aarch64/atomics.md (atomic_): Change
>   to an expander.
>   (aarch64_atomic_): New.
>   (aarch64_atomic__lse): New.
>   (atomic_fetch_): Change to an expander.
>   (aarch64_atomic_fetch_): New.
>   (aarch64_atomic_fetch__lse): New.
> 
> gcc/testsuite/
> 2015-09-21  Matthew Wahab  
> 
>   * gcc.target/aarch64/atomic-inst-ldadd.c: New.
>   * gcc.target/aarch64/atomic-inst-ldlogic.c: New.
> 

> From 5d1bc64d7e9509374076e4c4ff5a285d4b073f24 Mon Sep 17 00:00:00 2001
> From: Matthew Wahab 
> Date: Fri, 7 Aug 2015 17:10:42 +0100
> Subject: [PATCH 4/5] Use atomic instructions for fetch-update patterns.
> 
> Change-Id: I39759f02e61039067ccaabfd52039e4804eddf2f
> ---
>  gcc/config/aarch64/aarch64-protos.h|   2 +
>  gcc/config/aarch64/aarch64.c   | 175 
> -
>  gcc/config/aarch64/atomics.md  | 101 ++--
>  .../gcc.target/aarch64/atomic-inst-ldadd.c |  58 +++
>  .../gcc.target/aarch64/atomic-inst-ldlogic.c   | 109 +
>  5 files changed, 433 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-inst-ldadd.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-inst-ldlogic.c
> 
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index eba4c76..76ebd6f 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -378,6 +378,8 @@ rtx aarch64_load_tp (rtx);
>  void aarch64_expand_compare_and_swap (rtx op[]);
>  void aarch64_split_compare_and_swap (rtx op[]);
>  void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
> +
> +bool aarch64_atomic_ldop_supported_p (enum rtx_code);
>  void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx);
>  void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
>  
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index dc05c6e..3a1b434 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -11064,6 +11064,32 @@ aarch64_expand_compare_and_swap (rtx operands[])
>emit_insn (gen_rtx_SET (bval, x));
>  }
>  
> +/* Test whether the target supports using a atomic load-operate instruction.
> +   CODE is the operation and AFTER is TRUE if the data in memory after the
> +   operation should be returned and FALSE if the data before the operation
> +   should be returned.  Returns FALSE if the operation isn't supported by the
> +   architecture.  */
> +
> +bool
> +aarch64_atomic_ldop_supported_p (enum rtx_code code)
> +{
> +  if (!TARGET_LSE)
> +return false;
> +
> +  switch (code)
> +{
> +case SET:
> +case AND:
> +case IOR:
> +case XOR:
> +case MINUS:
> +case PLUS:
> +  return true;
> +default:
> +  return false;
> +}
> +}
> +
>  /* Emit a barrier, that is appropriate for memory model MODEL, at the end of 
> a
> sequence implementing an atomic operation.  */
>  
> @@ -11206,26 +11232,169 @@ aarch64_emit_atomic_swap (machine_mode mode, rtx 
> dst, rtx value,
>emit_insn (gen (dst, mem, value, model));
>  }
>  
> -/* Emit an atomic operation where the architecture supports it.  */
> +/* Operations supported by aarch64_emit_atomic_load_op.  */
> +
> +enum aarch64_atomic_load_op_code
> +{
> +  AARCH64_LDOP_PLUS, /* A + B  */
> +  AARCH64_LDOP_XOR,  /* A ^ B  */
> +  AARCH64_LDOP_OR,   /* A | B  */
> +  AARCH64_LDOP_BIC   /* A & ~B  */
> +};
> +
> +/* Emit an atomic load-operate.  */
> +
> +static void
> +aarch64_emit_atomic_load_op (enum aarch64_atomic_load_op_code code,
> +  machine_mode mode, rtx dst, rtx src,
> +  rtx mem, rtx model)
> +{
> +  typedef rtx (*aarch64_atomic_load_op_fn) (rtx, rtx, rtx, rtx);
> +  const aarch64_atomic_load_op_fn plus[] =
> +  {
> +gen_aarch64_atomic_loadaddqi,
> +gen_aarch64_atomic_loadaddhi,
> +gen_aarch64_atomic_loadaddsi,
> +gen_aarch64_atomic_loadadddi
> +  };
> +  const aarch64_atomic_load_op_fn eor[] =
> +  {
> +gen_aarch64_atomic_loadeorqi,
> +gen_aarch64_atomic_loadeorhi,
> +gen_aarch64_atomic_loadeorsi,
> +gen_aarch64_atomic_loadeordi
> +  };
> +  const aarch64_atomic_load_op_fn ior[] =
> +  {
> +gen_aarch64_atomic_loadsetqi,
> +gen_aarch64_atomic_loadse

Re: [AArch64][PATCH 5/5] Use atomic load-operate instructions for update-fetch patterns.

2015-09-21 Thread James Greenhalgh
On Mon, Sep 21, 2015 at 12:40:32PM +0100, Matthew Wahab wrote:
> On 17/09/15 17:54, Matthew Wahab wrote:
> > ARMv8.1 adds atomic swap and atomic load-operate instructions with
> > optional memory ordering specifiers. This patch uses the ARMv8.1
> > load-operate instructions to implement the atomic__fetch patterns.
> >
> > The approach is to use the atomic load-operate instruction to atomically
> > load the data and update memory and then to use the loaded data to
> > calculate the value that the instruction would have stored. The
> > calculation attempts to mirror the operation of the atomic instruction.
> > For example, atomic_and_fetch is implemented with an atomic
> > load-bic so the result is also calculated using a BIC instruction.
> >
> [...]
> >
> > 2015-09-17  Matthew Wahab  
> >
> >  * config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
> >  Adjust declaration.
> >  * config/aarch64/aarch64.c (aarch64_emit_bic): New.
> >  (aarch64_gen_atomic_load_op): Adjust comment.  Add parameter
> >  out_result.  Update to support update-fetch operations.
> >  * config/aarch64/atomics.md (aarch64_atomic_exchange_lse):
> >  Adjust for change to aarch64_gen_atomic_ldop.
> >  (aarch64_atomic__lse): Likewise.
> >  (aarch64_atomic_fetch__lse): Likewise.
> >  (atomic__fetch): Change to an expander.
> >  (aarch64_atomic__fetch): New.
> >  (aarch64_atomic__fetch_lse): New.
> >
> > gcc/testsuite
> > 2015-09-17  Matthew Wahab  
> >
> >  * gcc.target/aarch64/atomic-inst-ldadd.c: Add tests for
> >  update-fetch operations.
> >  * gcc.target/aarch64/atomic-inst-ldlogic.c: Likewise.
> >
> 
> Attached an updated patch that takes into account the review comments and 
> changes for 
> the rest of the series.
> 
> The changes in this patch:
> - Updated emit_bic for changes in the earlier patch.
> - Simplified the patterns used in the new expanders.
> - Dropped CC clobber from the _lse patterns.
> 
> Tested the series for aarch64-none-linux-gnu with native bootstrap and
> make check. Also tested for aarch64-none-elf with cross-compiled
> check-gcc on an ARMv8.1 emulator with +lse enabled by default.
> 
> Ok for trunk?

OK.

Thanks,
James

> Matthew
> 
> 2015-09-21  Matthew Wahab  
> 
>   * config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
>   Adjust declaration.
>   * config/aarch64/aarch64.c (aarch64_emit_bic): New.
>   (aarch64_gen_atomic_ldop): Adjust comment.  Add parameter
>   out_result.  Update to support update-fetch operations.
>   * config/aarch64/atomics.md (aarch64_atomic_exchange_lse):
>   Adjust for change to aarch64_gen_atomic_ldop.
>   (aarch64_atomic__lse): Likewise.
>   (aarch64_atomic_fetch__lse): Likewise.
>   (atomic__fetch): Change to an expander.
>   (aarch64_atomic__fetch): New.
>   (aarch64_atomic__fetch_lse): New.
> 
> gcc/testsuite
> 2015-09-21  Matthew Wahab  
> 
>   * gcc.target/aarch64/atomic-inst-ldadd.c: Add tests for
>   update-fetch operations.
>   * gcc.target/aarch64/atomic-inst-ldlogic.c: Likewise.
> 
> 



[PATCH, i386] Introduce switch for Skylake Server CPU.

2015-09-21 Thread Kirill Yukhin
Hello,
This patch introduces switches necessary for new Intel Server CPU
(code-named Skylake).

Bootstrapped & regtested.

Is it ok for trunk?

gcc/
* config.gcc: Support "skx".
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
PROCESSOR_SKX.
* config/i386/i386.c (m_SKX): Define.
(processor_target_table): Add "skx".
(PTA_SKX): Define.
(ix86_option_override_internal): Add "skx".
(fold_builtin_cpu): Handle "skx".
* config/i386/i386.h (TARGET_SKX): Define.
(processor_type): Add PROCESSOR_SKX.
* config/i386/i386.md (attr "cpu"): Add knl.
* config/i386/x86-tune.def: Add m_KNL.

gcc/testsuite/
* gcc.target/i386/funcspec-5.c: Test avx512vl, avx512bw,
avx512dq, avx512cd, avx512er, avx512pf and skx.

--
Thanks, K

commit ffaaf5ad0460cd5bd85f6fd4b360fb1171eb273e
Author: Kirill Yukhin 
Date:   Fri Sep 18 14:03:54 2015 +0300

AVX-512. Introduce SKX CPU.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 75807f5..18da001 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -595,7 +595,7 @@ x86_64_archs="amdfam10 athlon64 athlon64-sse3 barcelona 
bdver1 bdver2 \
 bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
 core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
 sandybridge ivybridge haswell broadwell bonnell silvermont knl x86-64 \
-native"
+native skx"
 
 # Additional x86 processors supported by --with-cpu=.  Each processor
 # MUST be separated by exactly one space.
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 35cecd0..09670bb 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -177,6 +177,10 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
   def_or_undef (parse_in, "__knl");
   def_or_undef (parse_in, "__knl__");
   break;
+case PROCESSOR_SKX:
+  def_or_undef (parse_in, "__skx");
+  def_or_undef (parse_in, "__skx__");
+  break;
 /* use PROCESSOR_max to not set/unset the arch macro.  */
 case PROCESSOR_max:
   break;
@@ -286,6 +290,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
 case PROCESSOR_KNL:
   def_or_undef (parse_in, "__tune_knl__");
   break;
+case PROCESSOR_SKX:
+  def_or_undef (parse_in, "__tune_skx__");
+  break;
 case PROCESSOR_IAMCU:
   def_or_undef (parse_in, "__tune_iamcu__");
   break;
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 00e7006..d34bb62 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2098,6 +2098,7 @@ const struct processor_costs *ix86_cost = &pentium_cost;
 #define m_BONNELL (1<

[PATCH, i386, AVX-512] Fix operands in mask unpack[si|di] patterns.

2015-09-21 Thread Kirill Yukhin
Hello,
This patch fixes operands in kunpck[hi|si|di] insn patterns.
Bootstrapped (regtesting in progress).

gcc/
* gcc/config/i386/i386.md (define_insn "kunpckhi"): Fix
operand in pattern.
(define_insn "kunpcksi"): Ditto. 
(define_insn "kunpckdi"): Ditto. 

Is it ok for trunk and gcc-5-branch if testing pass?

--
Thanks, K

commit 433bd69b29c62d0c65e5a0772d564b0d930ee8c5
Author: Kirill Yukhin 
Date:   Fri Sep 18 13:51:26 2015 +0300

AVX-512. Fix patterns for kunpck insns.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index f7230f0..2f8cdb2 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -8603,7 +8603,7 @@
   [(set (match_operand:HI 0 "register_operand" "=k")
(ior:HI
  (ashift:HI
-   (match_operand:HI 1 "register_operand" "k")
+   (zero_extend:HI (match_operand:QI 1 "register_operand" "k"))
(const_int 8))
  (zero_extend:HI (match_operand:QI 2 "register_operand" "k"]
   "TARGET_AVX512F"
@@ -8616,9 +8616,9 @@
   [(set (match_operand:SI 0 "register_operand" "=k")
(ior:SI
  (ashift:SI
-   (match_operand:SI 1 "register_operand" "k")
+   (zero_extend:SI (match_operand:HI 1 "register_operand" "k"))
(const_int 16))
- (zero_extend:SI (subreg:HI (match_operand:SI 2 "register_operand" 
"k") 0]
+ (zero_extend:SI (match_operand:HI 2 "register_operand" "k"]
   "TARGET_AVX512BW"
   "kunpckwd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "mode" "SI")])
@@ -8627,9 +8627,9 @@
   [(set (match_operand:DI 0 "register_operand" "=k")
(ior:DI
  (ashift:DI
-   (match_operand:DI 1 "register_operand" "k")
+   (zero_extend:DI (match_operand:SI 1 "register_operand" "k"))
(const_int 32))
- (zero_extend:DI (subreg:SI (match_operand:DI 2 "register_operand" 
"k") 0]
+ (zero_extend:DI (match_operand:SI 2 "register_operand" "k"]
   "TARGET_AVX512BW"
   "kunpckdq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "mode" "DI")])


Re: [PATCH, i386] Introduce switch for Skylake Server CPU.

2015-09-21 Thread Jakub Jelinek
On Mon, Sep 21, 2015 at 05:14:45PM +0300, Kirill Yukhin wrote:
> Hello,
> This patch introduces switches necessary for new Intel Server CPU
> (code-named Skylake).
> 
> Bootstrapped & regtested.
> 
> Is it ok for trunk?
> 
> gcc/
>   * config.gcc: Support "skx".
> * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> PROCESSOR_SKX.
> * config/i386/i386.c (m_SKX): Define.
> (processor_target_table): Add "skx".
> (PTA_SKX): Define.
>   (ix86_option_override_internal): Add "skx".
> (fold_builtin_cpu): Handle "skx".
> * config/i386/i386.h (TARGET_SKX): Define.
> (processor_type): Add PROCESSOR_SKX.
> * config/i386/i386.md (attr "cpu"): Add knl.
> * config/i386/x86-tune.def: Add m_KNL.
> 
> gcc/testsuite/
> * gcc.target/i386/funcspec-5.c: Test avx512vl, avx512bw,
>   avx512dq, avx512cd, avx512er, avx512pf and skx.

Is it a good idea to introduce further abbrevs like this?
I thought we went away from e.g. slm to silvermont, we didn't
introduce hsw but haswell, etc.
So, wouldn't it be better to add skylake-xeon instead of skx?

Jakub


Re: [PATCH] shrink-wrap: Handle multiple predecessors of prologue

2015-09-21 Thread Bernd Schmidt

We might not be able to redirect the jump from the block formerly the
predecessor of PRO, to jump over the new block (it was not necessarily a
fallthrough edge, could be EDGE_COMPLEX).  In most cases we could do it
of course.  I would prefer not to add special case code (that is not as
well tested).


If there was a jump already, there's no need to redirect it? Only a 
fallthru edge would be affected by a new block before its target, or am 
I missing something?



Bernd



Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-21 Thread James Greenhalgh

On Mon, Sep 21, 2015 at 10:44:32AM +0100, Alan Lawrence wrote:
> [Resending in plain text] This makes sense to me now, although I find
> your comment slightly confusing:
>
> [] in that
> +;; the meaning of HI and LO is always taken with a little-endian view of
> +;; the vector
>
> You mean vec_unpacks_{hi,lo} (which seems to go against the
> *architectural* bit after this), or hi/lo in cases other than
> vec_unpack (=> not "always"), or something else?
>
> maybe s/always/usually/ or s/always/otherwise/ ?
>

What I was aiming for is a description that our implementation of these
standard pattern names looks wrong, because "hi" always extracts the
architectural high lanes, in other big-endian patterns we make the
adjustment that higher numbered lanes map to the low architectural lanes.

I've tried to reword the comment to make it clearer, but I'm assuming some
familiarity with our overall big-endian vector model.

I've also updated the testcase to skip it if we are targetting AArch32,
which does not provide these intrinsics.

OK?

Thanks,
James

---
gcc/

2015-09-21  James Greenhalgh  

* config/aarch64/aarch64-simd.md

(aarch64_float_truncate_hi_v4sf): Rewrite as an expand.
(aarch64_float_truncate_hi_v4sf_le): New.
(aarch64_float_truncate_hi_v4sf_be): Likewise.

gcc/testsuite/

2015-09-21  James Greenhalgh  

* gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c: New.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index dbe5259..5ab2f2b 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1703,6 +1703,15 @@
   [(set_attr "type" "neon_fp_cvt_widen_s")]
 )
 
+;; ??? Note that the vectorizer usage of the vec_unpacks_[lo/hi] patterns
+;; is inconsistent with vector ordering elsewhere in the compiler, in that
+;; the meaning of HI and LO changes depending on the target endianness.
+;; While elsewhere we map the higher numbered elements of a vector to
+;; the lower architectural lanes of the vector, for these patterns we want
+;; to always treat "hi" as referring to the higher architectural lanes.
+;; Consequently, while the patterns below look inconsistent with our
+;; other big-endian patterns their behaviour is as required.
+
 (define_expand "vec_unpacks_lo_"
   [(match_operand: 0 "register_operand" "")
(match_operand:VQ_HSF 1 "register_operand" "")]
@@ -1757,17 +1766,42 @@
   [(set_attr "type" "neon_fp_cvt_narrow_d_q")]
 )
 
-(define_insn "aarch64_float_truncate_hi_"
+(define_insn "aarch64_float_truncate_hi__le"
   [(set (match_operand: 0 "register_operand" "=w")
 (vec_concat:
   (match_operand:VDF 1 "register_operand" "0")
   (float_truncate:VDF
 	(match_operand: 2 "register_operand" "w"]
-  "TARGET_SIMD"
+  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
   "fcvtn2\\t%0., %2"
   [(set_attr "type" "neon_fp_cvt_narrow_d_q")]
 )
 
+(define_insn "aarch64_float_truncate_hi__be"
+  [(set (match_operand: 0 "register_operand" "=w")
+(vec_concat:
+  (float_truncate:VDF
+	(match_operand: 2 "register_operand" "w"))
+  (match_operand:VDF 1 "register_operand" "0")))]
+  "TARGET_SIMD && BYTES_BIG_ENDIAN"
+  "fcvtn2\\t%0., %2"
+  [(set_attr "type" "neon_fp_cvt_narrow_d_q")]
+)
+
+(define_expand "aarch64_float_truncate_hi_"
+  [(match_operand: 0 "register_operand" "=w")
+   (match_operand:VDF 1 "register_operand" "0")
+   (match_operand: 2 "register_operand" "w")]
+  "TARGET_SIMD"
+{
+  rtx (*gen) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN
+			 ? gen_aarch64_float_truncate_hi__be
+			 : gen_aarch64_float_truncate_hi__le;
+  emit_insn (gen (operands[0], operands[1], operands[2]));
+  DONE;
+}
+)
+
 (define_expand "vec_pack_trunc_v2df"
   [(set (match_operand:V4SF 0 "register_operand")
   (vec_concat:V4SF
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c
new file mode 100644
index 000..4691da3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c
@@ -0,0 +1,99 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include "arm_neon.h"
+#include 
+
+void abort (void);
+
+void
+foo (void)
+{
+  /* Test vcvt_high_f32_f64.  */
+  float32x2_t arg1;
+  float64x2_t arg2;
+  float32x4_t result;
+  arg1 = vcreate_f32 (UINT64_C (0x3f0db5793f6e1892));
+  arg2 = vcombine_f64 (vcreate_f64 (UINT64_C (0x3fe8e49d23fb575d)),
+		   vcreate_f64 (UINT64_C (0x3fd921291b3df73e)));
+  //  Expect: "result" = 3ec909483f4724e93f0db5793f6e1892
+  result = vcvt_high_f32_f64 (arg1, arg2);
+  float32_t got;
+  float32_t exp;
+
+  /* Lane 0.  */
+  got = vgetq_lane_f32 (result, 0);
+  exp = ((float32_t) 0.9300624132156372);
+  if (exp / got) < ((float32_t) 0.999))
+	 || ((exp / got) > ((float32_t) 1.001)))
+ && (((exp - got) < ((float32_t) -1.0e-4))
+	 || ((exp - got) > ((float32_t) 1.0e-4)
+abort ();
+
+  /* Lane 1.  */
+  got = vgetq_lane_f32 (result, 1);
+  exp = ((fl

Re: [PATCH, i386] Introduce switch for Skylake Server CPU.

2015-09-21 Thread H.J. Lu
On Mon, Sep 21, 2015 at 7:14 AM, Kirill Yukhin  wrote:
> Hello,
> This patch introduces switches necessary for new Intel Server CPU
> (code-named Skylake).
>
> Bootstrapped & regtested.
>
> Is it ok for trunk?
>
> gcc/
> * config.gcc: Support "skx".
> * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> PROCESSOR_SKX.
> * config/i386/i386.c (m_SKX): Define.
> (processor_target_table): Add "skx".
> (PTA_SKX): Define.
> (ix86_option_override_internal): Add "skx".
> (fold_builtin_cpu): Handle "skx".
> * config/i386/i386.h (TARGET_SKX): Define.
> (processor_type): Add PROCESSOR_SKX.
> * config/i386/i386.md (attr "cpu"): Add knl.
> * config/i386/x86-tune.def: Add m_KNL.
>
> gcc/testsuite/
> * gcc.target/i386/funcspec-5.c: Test avx512vl, avx512bw,
> avx512dq, avx512cd, avx512er, avx512pf and skx.
>

You also need to update libgcc/config/i386/cpuinfo.c and
gcc.target/i386/builtin_target.c.

H.J.


[commit, spu] Re: [BUILDROBOT] spu: left shift of negative value

2015-09-21 Thread Ulrich Weigand
Jan-Benedict Glaw wrote:

> I just noticed that (for config_list.mk builds), current GCC errors
> out at spu.c, see eg. build
> http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=3D469639 :
> 
> g++ -fno-PIE -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE   -fno-excep=
> tions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrit=
> e-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -peda=
> ntic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -f=
> no-common  -DHAVE_CONFIG_H -I. -I. -I../../../gcc/gcc -I../../../gcc/gcc/. =
> -I../../../gcc/gcc/../include -I../../../gcc/gcc/../libcpp/include -I/opt/c=
> farm/mpc/include  -I../../../gcc/gcc/../libdecnumber -I../../../gcc/gcc/../=
> libdecnumber/dpd -I../libdecnumber -I../../../gcc/gcc/../libbacktrace   -o =
> spu.o -MT spu.o -MMD -MP -MF ./.deps/spu.TPo ../../../gcc/gcc/config/spu/sp=
> u.c
> ../../../gcc/gcc/config/spu/spu.c: In function =E2=80=98void spu_expand_ins=
> v(rtx_def**)=E2=80=99:
> ../../../gcc/gcc/config/spu/spu.c:530:46: error: left shift of negative val=
> ue [-Werror=3Dshift-negative-value]
>maskbits =3D (-1ll << (32 - width - start));
>   ^
> ../../../gcc/gcc/config/spu/spu.c:536:46: error: left shift of negative val=
> ue [-Werror=3Dshift-negative-value]
>maskbits =3D (-1ll << (64 - width - start));
>   ^
> cc1plus: all warnings being treated as errors
> Makefile:2092: recipe for target 'spu.o' failed
> make[2]: *** [spu.o] Error 1
> make[2]: Leaving directory '/home/jbglaw/build-configlist_mk/spu-elf/build-=
> gcc/mk/spu-elf/gcc'

I've now checked in the following fix.

Thanks,
Ulrich


ChangeLog:

* config/spu/spu.c (spu_expand_insv): Avoid undefined behavior.

Index: gcc/config/spu/spu.c
===
*** gcc/config/spu/spu.c(revision 227968)
--- gcc/config/spu/spu.c(working copy)
*** spu_expand_insv (rtx ops[])
*** 472,478 
  {
HOST_WIDE_INT width = INTVAL (ops[1]);
HOST_WIDE_INT start = INTVAL (ops[2]);
!   HOST_WIDE_INT maskbits;
machine_mode dst_mode;
rtx dst = ops[0], src = ops[3];
int dst_size;
--- 472,478 
  {
HOST_WIDE_INT width = INTVAL (ops[1]);
HOST_WIDE_INT start = INTVAL (ops[2]);
!   unsigned HOST_WIDE_INT maskbits;
machine_mode dst_mode;
rtx dst = ops[0], src = ops[3];
int dst_size;
*** spu_expand_insv (rtx ops[])
*** 527,541 
switch (dst_size)
  {
  case 32:
!   maskbits = (-1ll << (32 - width - start));
if (start)
!   maskbits += (1ll << (32 - start));
emit_move_insn (mask, GEN_INT (maskbits));
break;
  case 64:
!   maskbits = (-1ll << (64 - width - start));
if (start)
!   maskbits += (1ll << (64 - start));
emit_move_insn (mask, GEN_INT (maskbits));
break;
  case 128:
--- 527,541 
switch (dst_size)
  {
  case 32:
!   maskbits = (~(unsigned HOST_WIDE_INT)0 << (32 - width - start));
if (start)
!   maskbits += ((unsigned HOST_WIDE_INT)1 << (32 - start));
emit_move_insn (mask, GEN_INT (maskbits));
break;
  case 64:
!   maskbits = (~(unsigned HOST_WIDE_INT)0 << (64 - width - start));
if (start)
!   maskbits += ((unsigned HOST_WIDE_INT)1 << (64 - start));
emit_move_insn (mask, GEN_INT (maskbits));
break;
  case 128:


-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-21 Thread Alan Lawrence

On 21/09/15 15:38, James Greenhalgh wrote:


On Mon, Sep 21, 2015 at 10:44:32AM +0100, Alan Lawrence wrote:

[Resending in plain text] This makes sense to me now, although I find
your comment slightly confusing:

[] in that
+;; the meaning of HI and LO is always taken with a little-endian view of
+;; the vector

You mean vec_unpacks_{hi,lo} (which seems to go against the
*architectural* bit after this), or hi/lo in cases other than
vec_unpack (=> not "always"), or something else?

maybe s/always/usually/ or s/always/otherwise/ ?



What I was aiming for is a description that our implementation of these
standard pattern names looks wrong, because "hi" always extracts the
architectural high lanes, in other big-endian patterns we make the
adjustment that higher numbered lanes map to the low architectural lanes.

I've tried to reword the comment to make it clearer, but I'm assuming some
familiarity with our overall big-endian vector model.


Looks good to me, thanks for the clarification.

Cheers, Alan



Re: [PATCH, i386, AVX-512] Fix operands in mask unpack[si|di] patterns.

2015-09-21 Thread Uros Bizjak
On Mon, Sep 21, 2015 at 4:27 PM, Kirill Yukhin  wrote:
> Hello,
> This patch fixes operands in kunpck[hi|si|di] insn patterns.
> Bootstrapped (regtesting in progress).
>
> gcc/
> * gcc/config/i386/i386.md (define_insn "kunpckhi"): Fix
> operand in pattern.
> (define_insn "kunpcksi"): Ditto.
> (define_insn "kunpckdi"): Ditto.
>
> Is it ok for trunk and gcc-5-branch if testing pass?

You know the insn better than I, but the patterns look OK to me.

Please note that this change will remove (unneeded due to inherent
zero-extending?) zero-extends from the instruction stream, as combine
pass will merge zext into the pattern itself.

Thanks,
Uros.

> --
> Thanks, K
>
> commit 433bd69b29c62d0c65e5a0772d564b0d930ee8c5
> Author: Kirill Yukhin 
> Date:   Fri Sep 18 13:51:26 2015 +0300
>
> AVX-512. Fix patterns for kunpck insns.
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index f7230f0..2f8cdb2 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -8603,7 +8603,7 @@
>[(set (match_operand:HI 0 "register_operand" "=k")
> (ior:HI
>   (ashift:HI
> -   (match_operand:HI 1 "register_operand" "k")
> +   (zero_extend:HI (match_operand:QI 1 "register_operand" "k"))
> (const_int 8))
>   (zero_extend:HI (match_operand:QI 2 "register_operand" "k"]
>"TARGET_AVX512F"
> @@ -8616,9 +8616,9 @@
>[(set (match_operand:SI 0 "register_operand" "=k")
> (ior:SI
>   (ashift:SI
> -   (match_operand:SI 1 "register_operand" "k")
> +   (zero_extend:SI (match_operand:HI 1 "register_operand" "k"))
> (const_int 16))
> - (zero_extend:SI (subreg:HI (match_operand:SI 2 "register_operand" 
> "k") 0]
> + (zero_extend:SI (match_operand:HI 2 "register_operand" "k"]
>"TARGET_AVX512BW"
>"kunpckwd\t{%2, %1, %0|%0, %1, %2}"
>[(set_attr "mode" "SI")])
> @@ -8627,9 +8627,9 @@
>[(set (match_operand:DI 0 "register_operand" "=k")
> (ior:DI
>   (ashift:DI
> -   (match_operand:DI 1 "register_operand" "k")
> +   (zero_extend:DI (match_operand:SI 1 "register_operand" "k"))
> (const_int 32))
> - (zero_extend:DI (subreg:SI (match_operand:DI 2 "register_operand" 
> "k") 0]
> + (zero_extend:DI (match_operand:SI 2 "register_operand" "k"]
>"TARGET_AVX512BW"
>"kunpckdq\t{%2, %1, %0|%0, %1, %2}"
>[(set_attr "mode" "DI")])


Re: [PATCH, i386] Introduce switch for Skylake Server CPU.

2015-09-21 Thread Uros Bizjak
On Mon, Sep 21, 2015 at 4:27 PM, Jakub Jelinek  wrote:
> On Mon, Sep 21, 2015 at 05:14:45PM +0300, Kirill Yukhin wrote:
>> Hello,
>> This patch introduces switches necessary for new Intel Server CPU
>> (code-named Skylake).
>>
>> Bootstrapped & regtested.
>>
>> Is it ok for trunk?
>>
>> gcc/
>>   * config.gcc: Support "skx".
>> * config/i386/i386-c.c (ix86_target_macros_internal): Handle
>> PROCESSOR_SKX.
>> * config/i386/i386.c (m_SKX): Define.
>> (processor_target_table): Add "skx".
>> (PTA_SKX): Define.
>>   (ix86_option_override_internal): Add "skx".
>> (fold_builtin_cpu): Handle "skx".
>> * config/i386/i386.h (TARGET_SKX): Define.
>> (processor_type): Add PROCESSOR_SKX.
>> * config/i386/i386.md (attr "cpu"): Add knl.
>> * config/i386/x86-tune.def: Add m_KNL.
>>
>> gcc/testsuite/
>> * gcc.target/i386/funcspec-5.c: Test avx512vl, avx512bw,
>>   avx512dq, avx512cd, avx512er, avx512pf and skx.
>
> Is it a good idea to introduce further abbrevs like this?
> I thought we went away from e.g. slm to silvermont, we didn't
> introduce hsw but haswell, etc.
> So, wouldn't it be better to add skylake-xeon instead of skx?

Those "short names" were invented due to some
restrictions/recommendations on the usage of internal project names.
These restrictions were lifted years ago, so we are free to add more
descriptive names. As Jakub suggested, "skylake-xeon" is IMO much
better.

Uros.


Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2015-09-21 Thread Thomas Schwinge
Hi!

On Sat, 11 Oct 2014 18:49:00 +0400, Ilya Verbin  wrote:
> This is the last common infrastructure patch in the series.

> 1. -foffload==
>By default, GCC will build offload images for all offload targets specified
> in configure, with non-target-specific options passed to host compiler.
> This option is used to control offload targets and options for them.
> 
> It can be used in a few ways:
> * -foffload=disable
>   Tells GCC to disable offload support.
>   OpenMP target regions will be run in host fallback mode.
> * -foffload=
>   Tells GCC to build offload images for .
>   They will be built with non-target-specific options passed to host compiler.
> * -foffload=
>   Tells GCC to build offload images for all targets specified in configure. 
>   They will be built with non-target-specific options passed to host compiler
>   plus .
> * -foffload==
>   Tells GCC to build offload images for .
>   They will be built with non-target-specific options passed to host compiler
>   plus .
> 
> Options specified by -foffload are appended to the end of option set, so in 
> case
> of option conflicts they have more priority.

(, "--foffload* undocumented", has recently
been filed.)

(In the following, "intelmic" is short for
"x86_64-intelmicemul-linux-gnu", and "nvptx" is short for "nvptx-none".)

What is the syntax to use for building both intelmic and nvptx offloading
code?  I understand we allow for separate -foffload=intelmic
-foffload=nvptx options.  Do we also intend to allow
-foffload=intelmic,nvptx or -foffload=intelmic:nvptx?

And then, we allow for specifying offloading compiler options with
-foffload=intelmic=[...] and -foffload=nvptx=[...]; do we also intend to
allow -foffload=intelmic,nvptx=[...] (do the options apply to nvptx only,
or to both intelmic and nvptx?), and/or
-foffload=intelmic=[...],nvptx=[...], and/or
-foffload=intelmic:nvptx=[...] (which "looks a bit like" the options
ought to apply to nvptx only -- or to both intelmic and nvptx?), and/or
-foffload=intelmic=[...]:nvptx=[...]?

Given the code in gcc/configure.ac:

   939  for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
   940tgt=`echo $tgt | sed 's/=.*//'`
   941if test x"$offload_targets" = x; then
   942  offload_targets=$tgt
   943else
   944  offload_targets="$offload_targets:$tgt"
   945fi
   946  done
   947  AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
   948[Define to hold the list of target names suitable for 
offloading.])

..., we constuct OFFLOAD_TARGETS with the offload targets delimited by
colons, but when parsing -foffload options:

> --- a/gcc/gcc.c
> +++ b/gcc/gcc.c

> +/* Parse -foffload option argument.  */
> +
> +static void
> +handle_foffload_option (const char *arg)
> +{
> +  [...]

..., we look for comma delimiters in the configure-time OFFLOAD_TARGETS:

  3612/* Check that GCC is configured to support the offload 
target.  */
  3613c = OFFLOAD_TARGETS;
  3614while (c)
  3615  {
  3616n = strchr (c, ',');
  3617if (n == NULL)
  3618  n = strchr (c, '\0');
  3619  
  3620if (next - cur == n - c && strncmp (target, c, n - c) == 
0)
  3621  break;
  3622  
  3623c = *n ? n + 1 : NULL;
  3624  }
  3625  
  3626if (!c)
  3627  fatal_error (input_location,
  3628   "GCC is not configured to support %s as 
offload target",
  3629   target);

So, this code will not do the right thing when configured with
--enable-offload-targets=intelmic,nvptx (thus,
OFFLOAD_TARGETS=intelmic:nvptx): using -foffload=nvptx will then result
in "xgcc: fatal error: GCC is not configured to support nvptx as offload
target".

If I'm understanding the following code correctly, this supports the idea
that the intention has been for -foffload=[targets]=[options] to separate
the targets by commas, and separate the options by spaces -- is that
correct?

> --- a/gcc/lto-wrapper.c
> +++ b/gcc/lto-wrapper.c

> +/* Extract options for TARGET offload compiler from OPTIONS and append
> +   them to ARGV_OBSTACK.  */
> +
> +static void
> +append_offload_options (obstack *argv_obstack, const char *target,
> + struct cl_decoded_option *options,
> + unsigned int options_count)
> +{
> +  [...]

   571  /* Extract options for TARGET offload compiler from OPTIONS and 
append
   572 them to ARGV_OBSTACK.  */
   573  
   574  static void
   575  append_offload_options (obstack *argv_obstack, const char *target,
   576  struct cl_decoded_option *options,
   577  unsigned int options_count)
   578  {
   579for (unsigned i = 0; i < options_count; i++)
   580  {
   581 

[PATCH] DWARF support for AIX v3

2015-09-21 Thread David Edelsohn
Appended is a patch to support DWARF on AIX and works with GDB and AIX
assembler.  The AIX assembler prepends the CU length to the section
and does not want the compiler to provide it, so I need to disable
those parts of dwarf2out.c.  I know it's ugly, but, on the upside,
this eventually allows GCC to deprecate Stabs, which I think is worth
the pain.

I discovered that -g3 emits the debug_macros section, which AIX also
does not like, so I disable that section as well.  As I mentioned in
v2, I moved the calculation of aranges_length to its one use in the
output function instead of calculating it early to avoid an unused
function argument.

I also switched to XCOFF_DEBUGGING_INFO as the macro to test in
dwarf2out.c.  I think a target hook or testing that DWARF section
names have not been overridden to NULL is overkill. Suggestions
welcome, but any solution that disables arbitrary portions of
dwarf2out.c is going to be ugly. I doubt that any other system,
especially ones based on ELF file format, would ever do this.

With a few tweaks to the testsuite, this passes GCC dwarf2 tests.

Thanks, David

* dwarf2out.c (size_of_pubnames): Don't define on AIX.
(size_of_aranges): Don't define on AIX.
(output_compilation_unit_header): Don't output length on AIX.
(output_pubnames): Don't output length on AIX.
(output_aranges): Delete argument. Compute length locally. Don't
output length on AIX.
(output_line_info): Don't output length on AIX.
(dwarf2out_finish): Don't output location lists on AIX.
Don't compute aranges_length. Don't output macros section.
* config/rs6000/rs6000.c (rs6000_xcoff_debug_unwind_info):
New.
(rs6000_xcoff_asm_named_section): Emit .dwsect pseudo-op
for SECTION_DEBUG.
(rs6000_xcoff_declare_function_name): Emit different
.function pseudo-op when DWARF2_DEBUG.
* config/rs6000/xcoff.h (TARGET_DEBUG_UNWIND_INFO):
Redefine.
* config/rs6000/aix71.h (DWARF2_DEBUGGING_INFO): Define.
(PREFERRED_DEBUGGING_TYPE): Define.
(DEBUG_INFO_SECTION): Define.
(DEBUG_ABBREV_SECTION): Define.
(DEBUG_ARANGES_SECTION): Define.
(DEBUG_LINE_SECTION): Define.
(DEBUG_PUBNAMES_SECTION): Define.
(DEBUG_PUBTYPES_SECTION): Define.
(DEBUG_STR_SECTION): Define.
(DEBUG_RANGES_SECTION): Define.

Index: dwarf2out.c
===
--- dwarf2out.c (revision 227959)
+++ dwarf2out.c (working copy)
@@ -3185,8 +3185,10 @@ static void calc_base_type_die_sizes (void);
 static void mark_dies (dw_die_ref);
 static void unmark_dies (dw_die_ref);
 static void unmark_all_dies (dw_die_ref);
+#ifndef XCOFF_DEBUGGING_INFO
 static unsigned long size_of_pubnames (vec *);
 static unsigned long size_of_aranges (void);
+#endif
 static enum dwarf_form value_format (dw_attr_node *);
 static void output_value_format (dw_attr_node *);
 static void output_abbrev_section (void);
@@ -3202,7 +3204,7 @@ static void add_enumerator_pubname (const char *,
 static void add_pubname_string (const char *, dw_die_ref);
 static void add_pubtype (tree, dw_die_ref);
 static void output_pubnames (vec *);
-static void output_aranges (unsigned long);
+static void output_aranges (void);
 static unsigned int add_ranges_num (int);
 static unsigned int add_ranges (const_tree);
 static void add_ranges_by_labels (dw_die_ref, const char *, const char *,
@@ -8387,6 +8389,7 @@ include_pubname_in_output (vec *names)
 {
   unsigned i;
-  unsigned long pubnames_length = size_of_pubnames (names);
   pubname_entry *pub;

+#ifndef XCOFF_DEBUGGING_INFO
+  unsigned long pubnames_length = size_of_pubnames (names);
+
   if (DWARF_INITIAL_LENGTH_SIZE - DWARF_OFFSET_SIZE == 4)
 dw2_asm_output_data (4, 0x,
   "Initial length escape value indicating 64-bit DWARF extension");
   dw2_asm_output_data (DWARF_OFFSET_SIZE, pubnames_length, "Pub Info Length");
+#endif

   /* Version number for pubnames/pubtypes is independent of dwarf version.  */
   dw2_asm_output_data (2, 2, "DWARF Version");
@@ -9706,15 +9715,19 @@ output_pubtables (void)
text section generated for this compilation unit.  */

 static void
-output_aranges (unsigned long aranges_length)
+output_aranges (void)
 {
   unsigned i;

-  if (DWARF_INITIAL_LENGTH_SIZE - DWARF_OFFSET_SIZE == 4)
+#ifndef XCOFF_DEBUGGING_INFO
+  unsigned long aranges_length = size_of_aranges ();
+
+if (DWARF_INITIAL_LENGTH_SIZE - DWARF_OFFSET_SIZE == 4)
 dw2_asm_output_data (4, 0x,
   "Initial length escape value indicating 64-bit DWARF extension");
   dw2_asm_output_data (DWARF_OFFSET_SIZE, aranges_length,
   "Length of Address Ranges Info");
+#endif
   /* Version number for aranges is still 2, even up to DWARF5.  */
   dw2_asm_output_data (2, 2, "DWARF Version");
   if (dwarf_split_debug_info)
@@ -10398,11 +10411,13 @@ output_line_info (bool pro

Re: Openacc launch API

2015-09-21 Thread Nathan Sidwell

Jakub?

https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01287.html


On 09/17/15 10:40, Nathan Sidwell wrote:

Updated patch addressing your points.  Some  further comments though ...


+  while (GOMP_LAUNCH_PACK (GOMP_LAUNCH_END, 0, 0)
+ != (tag = va_arg (ap, unsigned)))


That's a somewhat non-idiomatic way to write this, with the constant first and
not obviously a constant. I'd initialize a variable with the constant before the
loop.


I went with
   while ((tag = va_arg (...)) != 0) ...

and killed GOMP_LAUNCH_END throughout, using explicit '0'.


+  assert (!GOMP_LAUNCH_DEVICE (tag));


Uh, that seems unfriendly, and not exactly forwards compatible. Can that fail a
bit more gracefully? (Alternatively, implement the device_type stuff now so that
we don't have TODOs in the code and don't have to worry about compatibility
issues.)


Added call to gomp_fatal, indicating libgomp is out of date. Also added a
default to the switch following with the same effect.  The trouble  with
implementing handling of device_type here now, is difficulty in testing its
correctness.  If it were  buggy we'd be in a worse position than not having it.


+GOACC_2.0,1 {
+  global:
+GOACC_parallel_keyed;
+} GOACC_2.0;


Did you mean to use a comma?


I misunderstood your comment as 'did you mean to use a comma where you used
something else', not 'is that comma a typo?'  well spotted!


+else if (!tagging)


Oh... so tagging controls two different methods for constructing argument lists,
one for GOACC_parallel and the other for whatever OMP uses? That's a bit
unfortunate, I'll need to think about it for a bit or defer to Jakub.


My earlier description was lacking.  The memory arguments have already been
pushed before that switch.  This is just dealing with async & wait args.  I
found it easier to modify the existing code path and have a tagging flag, rather
than duplicate it.

nathan




Re: [C++] Coding rule enforcement

2015-09-21 Thread Jason Merrill

On 09/21/2015 10:01 AM, Manuel López-Ibáñez wrote:

On 21 September 2015 at 15:46, Daniel Gutson
 wrote:


FWIW, we could make this plugin in 2 weeks (w already have static
checkers as plugins for our customers). I understand Nathan that you
may have some deadlines, but if we could have the opportunity to
implement it, we could accomplish a clean isolation of a particular
business needs (despite I acknowledge that a warning about virtual
inheritance may be useful for a broader audience). OTOH, a plugin can
receive arguments, such as a configuration file which could point to
specific sources or hints about where to apply the warning, or a
suppression file, both things useful for large legacy code. IMVHO I
think this is a superior solution.
Please let me know if we could collaborate to get both a better gcc
and a better static checker.


My opinion is that if people want to "donate" their plugins to the
FSF, they are willing to maintain them, and they are not overtly
complex or require any third-party software, they should be added to
the GCC repository and build/tested. They could serve as examples and
extra testing for the plugin framework, with the expectation that they
may get removed if they become unmaintained.


Absolutely.

Jason




New post-LTO OpenACC pass

2015-09-21 Thread Nathan Sidwell

Jakub,
this patch adds a new transforming pass, which executes after the LTO readback 
pass, and hence knows whether it is targeting the host or (a) device.


The contents of the pass will be built out -- it does much more on the gomp4 
pass.  This instance simply scans and replaces the acc_on_device builtin. 
Expanding early will allow such things as constant propagation and dead code 
removal earlier.


We still need the traditional expansion at RTL time too, because this function 
is used when building the library, in case the user does something crazy, like 
calling via a pointer.


The scanning code is written such that the replaced code is also scanned.  This 
will occur for the later transforms that might expand to internal builtins which 
themselves could be optimized.


The 'get_oacc_fn_attrib' was also present in the launch API patch.  Although 
used just internally to omp-low.c in this patch, it ends up being more widely used.


ok for trunk?

nathan
2015-09-21  Nathan Sidwell  
	Cesar Philippidis  

	* omp-low.h (get_oacc_fn_attrib): Declare.
	* omp-low.c (get_oacc_fn_attrib): New.
	(oacc_xform_on_device): New.
	(execute_oacc_transform): New pass.
	(pass_data_oacc_transform): New.
	(pass_oacc_transform): New.
	(make_pass_oacc_transform): New.
	* tree-pass.h (make_pass_oacc_transform): Declare.
	* passes.def: Add pass_oacc_transform.

Index: omp-low.c
===
--- omp-low.c	(revision 227968)
+++ omp-low.c	(working copy)
@@ -8860,6 +8860,16 @@ expand_omp_atomic (struct omp_region *re
   expand_omp_atomic_mutex (load_bb, store_bb, addr, loaded_val, stored_val);
 }
 
+#define OACC_FN_ATTRIB "oacc function"
+
+/* Retrieve the oacc function attrib and return it.  Non-oacc
+   functions will return NULL.  */
+
+tree
+get_oacc_fn_attrib (tree fn)
+{
+  return lookup_attribute (OACC_FN_ATTRIB, DECL_ATTRIBUTES (fn));
+}
 
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
@@ -13909,4 +13919,131 @@ omp_finish_file (void)
 }
 }
 
+/* Transform an acc_on_device call.  OpenACC 2.0a requires this folded at
+   compile time for constant operands.  We always fold it.  In an
+   offloaded function we're never 'none'.  */
+
+static void
+oacc_xform_on_device (gimple *call)
+{
+  tree arg = gimple_call_arg (call, 0);
+  unsigned val = GOMP_DEVICE_HOST;
+	  
+#ifdef ACCEL_COMPILER
+  val = GOMP_DEVICE_NOT_HOST;
+#endif
+  tree result = build2 (EQ_EXPR, boolean_type_node, arg,
+			build_int_cst (integer_type_node, val));
+#ifdef ACCEL_COMPILER
+  {
+tree dev  = build2 (EQ_EXPR, boolean_type_node, arg,
+			build_int_cst (integer_type_node,
+   ACCEL_COMPILER_acc_device));
+result = build2 (TRUTH_OR_EXPR, boolean_type_node, result, dev);
+  }
+#endif
+  result = fold_convert (integer_type_node, result);
+  tree lhs = gimple_call_lhs (call);
+  gimple_seq seq = NULL;
+
+  push_gimplify_context (true);
+  gimplify_assign (lhs, result, &seq);
+  pop_gimplify_context (NULL);
+
+  gimple_stmt_iterator gsi = gsi_for_stmt (call);
+  gsi_replace_with_seq (&gsi, seq, false);
+}
+
+/* Main entry point for oacc transformations which run on the device
+   compiler after LTO, so we know what the target device is at this
+   point (including the host fallback).  */
+
+static unsigned int
+execute_oacc_transform ()
+{
+  tree attrs = get_oacc_fn_attrib (current_function_decl);
+  
+  if (!attrs)
+/* Not an offloaded function.  */
+return 0;
+
+  basic_block bb;
+  FOR_ALL_BB_FN (bb, cfun)
+for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	 !gsi_end_p (gsi); gsi_next (&gsi))
+  {
+	gimple *stmt = gsi_stmt (gsi);
+	bool rescan = false;
+
+	if (!is_gimple_call (stmt))
+	  continue;
+
+	/* Rewind to allow rescan.  */
+	gsi_prev (&gsi);
+
+	gcall *call = as_a  (stmt);
+	
+	if (gimple_call_builtin_p (call, BUILT_IN_ACC_ON_DEVICE))
+	  /* acc_on_device must be evaluated at compile time for
+	 constant arguments.  */
+	  {
+	oacc_xform_on_device (call);
+	rescan = true;
+	  }
+
+	if (gsi_end_p (gsi))
+	  /* We rewound past the beginning of the BB.  */
+	  gsi = gsi_start_bb (bb);
+
+	if (!rescan)
+	  /* Undo the rewind, so we don't get stuck infinitely.  */
+	  gsi_next (&gsi);
+  }
+
+  return 0;
+}
+
+namespace {
+
+const pass_data pass_data_oacc_transform =
+{
+  GIMPLE_PASS, /* type */
+  "fold_oacc_transform", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  PROP_cfg, /* properties_required */
+  0 /* Possibly PROP_gimple_eomp.  */, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */
+};
+
+class pass_oacc_transform : public gimple_opt_pass
+{
+public:
+  pass_oacc_transform (gcc::context *ctxt)
+: gimple_opt_pass (pass_data_oacc_transform, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual unsigned int execute (function *)
+{
+  bool gate = (flag_openacc != 0 && !seen_e

Re: [C++] Coding rule enforcement

2015-09-21 Thread Daniel Gutson
On Mon, Sep 21, 2015 at 1:23 PM, Jason Merrill  wrote:
> On 09/21/2015 10:01 AM, Manuel López-Ibáñez wrote:
>>
>> On 21 September 2015 at 15:46, Daniel Gutson
>>  wrote:
>>>
>>>
>>> FWIW, we could make this plugin in 2 weeks (w already have static
>>> checkers as plugins for our customers). I understand Nathan that you
>>> may have some deadlines, but if we could have the opportunity to
>>> implement it, we could accomplish a clean isolation of a particular
>>> business needs (despite I acknowledge that a warning about virtual
>>> inheritance may be useful for a broader audience). OTOH, a plugin can
>>> receive arguments, such as a configuration file which could point to
>>> specific sources or hints about where to apply the warning, or a
>>> suppression file, both things useful for large legacy code. IMVHO I
>>> think this is a superior solution.
>>> Please let me know if we could collaborate to get both a better gcc
>>> and a better static checker.
>>
>>
>> My opinion is that if people want to "donate" their plugins to the
>> FSF, they are willing to maintain them, and they are not overtly
>> complex or require any third-party software, they should be added to
>> the GCC repository and build/tested. They could serve as examples and
>> extra testing for the plugin framework, with the expectation that they
>> may get removed if they become unmaintained.
>
>
> Absolutely.

Three of those plugins are already available in bitbucket. Maybe we
should move them to the official plugins page.
The fourth plugin is under development but I asked for volunteers to
test it in ths C++' std-proposals forum,
since it enhances the semantic of constexpr in order to allocate
memory during constexpr construction and
therefore having associative containers in ROM, so this is a special
case where the main intent is a proof of
concept for a C++ proposal.

>
> Jason
>
>



-- 

Daniel F. Gutson
Chief Engineering Officer, SPD

San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina

Phone:   +54 351 4217888 / +54 351 4218211
Skype:dgutson
LinkedIn: http://ar.linkedin.com/in/danielgutson


Re: [C++] Coding rule enforcement

2015-09-21 Thread Nathan Sidwell

On 09/21/15 12:23, Jason Merrill wrote:

On 09/21/2015 10:01 AM, Manuel López-Ibáñez wrote:

On 21 September 2015 at 15:46, Daniel Gutson
 wrote:


FWIW, we could make this plugin in 2 weeks (w already have static
checkers as plugins for our customers). I understand Nathan that you
may have some deadlines, but if we could have the opportunity to
implement it, we could accomplish a clean isolation of a particular
business needs (despite I acknowledge that a warning about virtual
inheritance may be useful for a broader audience). OTOH, a plugin can
receive arguments, such as a configuration file which could point to
specific sources or hints about where to apply the warning, or a
suppression file, both things useful for large legacy code. IMVHO I
think this is a superior solution.
Please let me know if we could collaborate to get both a better gcc
and a better static checker.


My opinion is that if people want to "donate" their plugins to the
FSF, they are willing to maintain them, and they are not overtly
complex or require any third-party software, they should be added to
the GCC repository and build/tested. They could serve as examples and
extra testing for the plugin framework, with the expectation that they
may get removed if they become unmaintained.


Absolutely.


I've no objection to going the plugin way, but I'm extremely unlikely to be able 
to devote time to doing that in the foreseeable future.


nathan


Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2015-09-21 Thread Ilya Verbin
2015-09-21 18:15 GMT+03:00 Thomas Schwinge :
> (, "--foffload* undocumented", has recently
> been filed.)
>
> (In the following, "intelmic" is short for
> "x86_64-intelmicemul-linux-gnu", and "nvptx" is short for "nvptx-none".)
>
> What is the syntax to use for building both intelmic and nvptx offloading
> code?  I understand we allow for separate -foffload=intelmic
> -foffload=nvptx options.  Do we also intend to allow
> -foffload=intelmic,nvptx or -foffload=intelmic:nvptx?
>
> And then, we allow for specifying offloading compiler options with
> -foffload=intelmic=[...] and -foffload=nvptx=[...]; do we also intend to
> allow -foffload=intelmic,nvptx=[...] (do the options apply to nvptx only,
> or to both intelmic and nvptx?), and/or
> -foffload=intelmic=[...],nvptx=[...], and/or
> -foffload=intelmic:nvptx=[...] (which "looks a bit like" the options
> ought to apply to nvptx only -- or to both intelmic and nvptx?), and/or
> -foffload=intelmic=[...]:nvptx=[...]?

The plan was:

1. -foffload=intelmic,nvptx=[...]  <- apply options to both intelmic,nvptx.
   Just like -foffload=[...] applies to both targets (if configured so).
2. -foffload=intelmic=[...],nvptx=[...]  <- is not allowed.
3. To apply different options to different targets, one should pass:
   -foffload=intelmic=[...] -foffload=nvptx=[...].

>   3612/* Check that GCC is configured to support the offload 
> target.  */
>   3613c = OFFLOAD_TARGETS;
>   3614while (c)
>   3615  {
>   3616n = strchr (c, ',');
>   3617if (n == NULL)
>   3618  n = strchr (c, '\0');
>   3619
>   3620if (next - cur == n - c && strncmp (target, c, n - c) 
> == 0)
>   3621  break;
>   3622
>   3623c = *n ? n + 1 : NULL;
>   3624  }
>   3625
>   3626if (!c)
>   3627  fatal_error (input_location,
>   3628   "GCC is not configured to support %s as 
> offload target",
>   3629   target);
>
> So, this code will not do the right thing when configured with
> --enable-offload-targets=intelmic,nvptx (thus,
> OFFLOAD_TARGETS=intelmic:nvptx): using -foffload=nvptx will then result
> in "xgcc: fatal error: GCC is not configured to support nvptx as offload
> target".
>
> If I'm understanding the following code correctly, this supports the idea
> that the intention has been for -foffload=[targets]=[options] to separate
> the targets by commas, and separate the options by spaces -- is that
> correct?

Yes, targets are separated by commas, options are the whole string after the
equal sign, spaces inside are allowed.

  -- Ilya


Re: [PATCH, i386] Introduce switch for Skylake Server CPU.

2015-09-21 Thread Kirill Yukhin
Hi Uroš, Jakub,
eOn 21 Sep 16:27, Jakub Jelinek wrote:
> On Mon, Sep 21, 2015 at 05:14:45PM +0300, Kirill Yukhin wrote:
> > Hello,
> > This patch introduces switches necessary for new Intel Server CPU
> > (code-named Skylake).
> > 
> > Bootstrapped & regtested.
> Is it a good idea to introduce further abbrevs like this?
> I thought we went away from e.g. slm to silvermont, we didn't
> introduce hsw but haswell, etc.
> So, wouldn't it be better to add skylake-xeon instead of skx?
Intel Xeon E3 does not support AVX-512*.
So, I'd suggest skylake-avx512.

gcc/
* config.gcc: Support "skylake-avx512".
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
PROCESSOR_SKYLAKE_AVX512.
* config/i386/i386.c (m_SKYLAKE_AVX512): Define.
(processor_target_table): Add "skylake-avx512".
(PTA_SKYLAKE_AVX512): Define.
(ix86_option_override_internal): Add "skylake_avx512".
(fold_builtin_cpu): Handle "skylake_avx512", add F_AVX512VL
F_AVX512BW, F_AVX512DQ, F_AVX512ER, F_AVX512PF, F_AVX512CD.
* config/i386/i386.h (TARGET_SKYLAKE_AVX512): Define.
(processor_type): Add PROCESSOR_SKYLAKE_AVX512.

libgcc/
* libgcc/config/i386/cpuinfo.c (enum processor_features): Add
FEATURE_AVX512VL, FEATURE_AVX512BW, FEATURE_AVX512DQ,
FEATURE_AVX512CD, FEATURE_AVX512ER, FEATURE_AVX512PF.
(get_available_features): Habdle new features.

gcc/testsuite/
* gcc.target/i386/funcspec-5.c: Test avx512vl, avx512bw,
avx512dq, avx512cd, avx512er, avx512pf and skylake-avx512.
* gcc.target/i386/builtin_target.c: Test  avx512vl, avx512bw,
avx512dq, avx512cd, avx512er and avx512pf.

Patch in the bottom. Is it ok?
> 
>   Jakub

--
Thanks, K

commit 69d2e45dbc6b944845e3ec91127e62f85ab51490
Author: Kirill Yukhin 
Date:   Fri Sep 18 14:03:54 2015 +0300

AVX-512. Introduce SKX CPU.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 75807f5..efe4806 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -595,7 +595,7 @@ x86_64_archs="amdfam10 athlon64 athlon64-sse3 barcelona 
bdver1 bdver2 \
 bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
 core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
 sandybridge ivybridge haswell broadwell bonnell silvermont knl x86-64 \
-native"
+native skylake-avx512"
 
 # Additional x86 processors supported by --with-cpu=.  Each processor
 # MUST be separated by exactly one space.
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 35cecd0..5e583ae 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -177,6 +177,10 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
   def_or_undef (parse_in, "__knl");
   def_or_undef (parse_in, "__knl__");
   break;
+case PROCESSOR_SKYLAKE_AVX512:
+  def_or_undef (parse_in, "__skylake_avx512");
+  def_or_undef (parse_in, "__skylake_avx512__");
+  break;
 /* use PROCESSOR_max to not set/unset the arch macro.  */
 case PROCESSOR_max:
   break;
@@ -286,6 +290,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
 case PROCESSOR_KNL:
   def_or_undef (parse_in, "__tune_knl__");
   break;
+case PROCESSOR_SKYLAKE_AVX512:
+  def_or_undef (parse_in, "__tune_skylake_avx512__");
+  break;
 case PROCESSOR_IAMCU:
   def_or_undef (parse_in, "__tune_iamcu__");
   break;
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 00e7006..41065be 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2098,6 +2098,7 @@ const struct processor_costs *ix86_cost = &pentium_cost;
 #define m_BONNELL (1<

Re: [PATCH, i386] Introduce switch for Skylake Server CPU.

2015-09-21 Thread Kirill Yukhin
Hello HJ,
On 21 Sep 07:40, H.J. Lu wrote:
> On Mon, Sep 21, 2015 at 7:14 AM, Kirill Yukhin  
> wrote:
> > Hello,
> > This patch introduces switches necessary for new Intel Server CPU
> > (code-named Skylake).
> >
> > Bootstrapped & regtested.
> >
> > Is it ok for trunk?
> > gcc/testsuite/
> > * gcc.target/i386/funcspec-5.c: Test avx512vl, avx512bw,
> > avx512dq, avx512cd, avx512er, avx512pf and skx.
> >
> 
> You also need to update libgcc/config/i386/cpuinfo.c and
> gcc.target/i386/builtin_target.c.
As far as I kbow, Intel is not yet disclosed family/model for Skylake Server.
I'll extend cpuinfo.c when it'll happen.
Thanks, I'll fix the test.

--
Thanks, K
> 
> H.J.


Re: [PATCH] PR28901 -Wunused-variable ignores unused const initialised variables

2015-09-21 Thread Steve Ellcey
On Fri, 2015-09-18 at 20:29 -0600, Martin Sebor wrote:

> On 09/15/2015 11:20 AM, Steve Ellcey wrote:

> > I guess it is not the 'const' I think should be handled special but the
> > 'static'.  Having unused static variables (const or not) declared in a
> > header file but unused seems reasonable since the header file may be
> > included in multiple .c files each of which uses a subset of the static
> > variables.
> 
> I tend to agree. I suppose diagnosing unused non-const static
> definitions might be helpful but I can't think of a good reason
> to diagnose unused initialized static consts in C. Especially
> since they're not diagnosed in C++.
> 
> Would diagnosing them in source files while avoiding the warning
> for static const definitions in headers be an acceptable compromise?
> 
> Martin

That seems like a reasonable compromise to me.

Steve Ellcey
sell...@imgtec.com



[PATCH] Fix undefined behavior in h8300 backend

2015-09-21 Thread Jeff Law


This was found when building all configurations using the trunk 
compiler.  A run-of-the-mill problem with left shifting -1.


To make sure I didn't muck anything up badly, I went back to Kazu's 
introduction of this pattern (2003), took his pseudo-code and turned it 
into a trivial h8300 specific test.


I also verified newlib built with the new compiler.  newlib never 
triggers this pattern though.


I briefly pondered adding tests for the many other patterns Kazu added 
in 2003 as well as generalizing andsi3_ashift_n_lower, but I just 
couldn't justify that much time on the H8 port :-)


Installed on the trunk.

Jeff


Fwd: [PATCH] Fix undefined behavior in h8300 backend

2015-09-21 Thread Jeff Law


With the patch this time...

 Forwarded Message 
Subject: [PATCH] Fix undefined behavior in h8300 backend
Date: Mon, 21 Sep 2015 11:10:04 -0600
From: Jeff Law 
To: gcc-patches@gcc.gnu.org


This was found when building all configurations using the trunk
compiler.  A run-of-the-mill problem with left shifting -1.

To make sure I didn't muck anything up badly, I went back to Kazu's
introduction of this pattern (2003), took his pseudo-code and turned it
into a trivial h8300 specific test.

I also verified newlib built with the new compiler.  newlib never
triggers this pattern though.

I briefly pondered adding tests for the many other patterns Kazu added
in 2003 as well as generalizing andsi3_ashift_n_lower, but I just
couldn't justify that much time on the H8 port :-)

Installed on the trunk.

Jeff


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c25888a..b6527de 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2015-09-21  Jeff Law  
+
+   * config/h8300/h8300.md (andsi3_ashift_n_lower): Avoid undefined
+   behavior. 
+
 2015-09-21  Ulrich Weigand  
 
* config/spu/spu.c (spu_expand_insv): Avoid undefined behavior.
diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index 4079b30..52213ac 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -3914,7 +3914,8 @@
(clobber (match_scratch:QI 4 "=X,&r"))]
   "(TARGET_H8300H || TARGET_H8300S)
 && INTVAL (operands[2]) <= 15
-&& INTVAL (operands[3]) == ((-1 << INTVAL (operands[2])) & 0x)"
+&& UINTVAL (operands[3]) == ((HOST_WIDE_INT_M1U << INTVAL (operands[2]))
+& 0x)"
   "#"
   "&& reload_completed"
   [(parallel [(set (match_dup 5)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index c65469f..7afd523 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2015-09-21  Jeff Law  
+
+   * gcc.target/h8300/andsi3_ashift_n_lower.c: New test.
+
 2015-09-21  Ville Voutilainen  
 
Complete the implementation of N4230, Nested namespace definition.
diff --git a/gcc/testsuite/gcc.target/h8300/andsi3_ashift_n_lower.c 
b/gcc/testsuite/gcc.target/h8300/andsi3_ashift_n_lower.c
new file mode 100644
index 000..8cacc52
--- /dev/null
+++ b/gcc/testsuite/gcc.target/h8300/andsi3_ashift_n_lower.c
@@ -0,0 +1,7 @@
+/* { dg-do compile }  */
+/* { dg-options "-mh -O2 -fomit-frame-pointer" }  */
+/* { dg-final { scan-assembler-times "extu" 1 } }  */
+
+unsigned long foo(unsigned long a)
+  { return (a << 4) & 0x; }
+


Re: [C/C++ PATCH] RFC: Implement -Wduplicated-cond (PR c/64249) (take

2015-09-21 Thread Marek Polacek
On Fri, Sep 18, 2015 at 08:16:52PM +0200, Manuel López-Ibáñez wrote:
> On 18/09/15 18:45, Martin Sebor wrote:
> >but it makes me wonder how common this pattern is in portable
> >code and whether adding workarounds for it is the right solution
> >(or if it might prompt people to disable the warning, which would
> >be a shame).
> 
> Perhaps if we are going to warn, we could look for sizeof() and virtual
> locations in the operands, and skip the warning. It would be nice to find a
> heuristic that allows warning in most cases but skip those that appear often
> in common code.

Yeah, but as I mentioned in the previous mail, we don't have the ability to
do that.  For sizeof perhaps we could use SIZEOF_EXPR, but since integer
constants don't have a location we can't use from_macro_expansion_at for them.

I think the warning works right in most cases already, just in some weird
cases it fires.  Perhaps that's acceptable, but I know people aren't keen on
adding kludges to their code to suppress warnings...

So like I said, maybe -Wextra for now.

> Another alternative is to have a more heuristic version of operand_equal(),
> if two operands are equal because of "compilation-dependent" code (macro
> expansion, sizeof, etc), then they are not considered equal. that is
> operand_equal_2(sizeof(long), sizeof(long)) returns true, but
> operand_equal_2(8, sizeof(long)) returns false.
> 
> I have no idea whether it is possible to implement operand_equal_2 in a sane 
> way.

I'm not sanguine about that at all :/.

Marek


Re: [C/C++ PATCH] RFC: Implement -Wduplicated-cond (PR c/64249) (take

2015-09-21 Thread Marek Polacek
On Fri, Sep 18, 2015 at 10:45:33AM -0600, Martin Sebor wrote:
> >Done in the below.  This version actually bootstraps, because I've added
> >-Wno-duplicated-cond for insn-dfatab.o and insn-latencytab.o (don't know
> >how to fix these) + I've tweaked a condition in genemit.c.  The problem
> >here is that we have
> >
> >   if (INTVAL (x) == 0)
> > printf ("const0_rtx");
> >   else if (INTVAL (x) == 1)
> > printf ("const1_rtx");
> >   else if (INTVAL (x) == -1)
> > printf ("constm1_rtx");
> >   // ...
> >   else if (INTVAL (x) == STORE_FLAG_VALUE)
> > printf ("const_true_rtx");
> >
> >and STORE_FLAG_VALUE happens to be 1, so we have two same conditions.
> >STORE_FLAG_VALUE is 1 or -1, but according to the documentation it can
> >also be some other number so we should keep this if statement.  I've
> >avoided the warning by adding STORE_FLAG_VALUE > 1 check.
> 
> Binutils and GLIBC also fail to build due to similar problems (in
> addition to several errors triggered by the new -Wunused-const-variable
> warning).
 
Thanks for doing this.  I've been meaning to compile a kernel with this
new warning on but I've never gotten to do it.

> The one in GLIBC is trivial to fix by guarding the code with
> #if N != 1:
> 
> In file included from ../sysdeps/x86_64/ldbl2mpn.c:1:0:
> ../sysdeps/x86_64/../i386/ldbl2mpn.c: In function
> ‘__mpn_extract_long_double’:
> ../sysdeps/x86_64/../i386/ldbl2mpn.c:89:24: error: duplicated ‘if’ condition
> [-Werror=duplicated-cond]
> else if (res_ptr[0] != 0)
> ^
> ../sysdeps/x86_64/../i386/ldbl2mpn.c:74:23: note: previously used here
> if (res_ptr[N - 1] != 0)
>^
> 
> The one in Binutils is pretty easy to fix too:
> 
> In function ‘stab_int_type’:
> /home/msebor/scm/fsf/binutils-gdb/binutils/wrstabs.c:665:18: error:
> duplicated if’ condition [-Werror=duplicated-cond]
> else if (size == 8)
>   ^
> /home/msebor/scm/fsf/binutils-gdb/binutils/wrstabs.c:663:18: note:
> previously used here
> else if (size == sizeof (long))
>   ^
> 
> but it makes me wonder how common this pattern is in portable
> code and whether adding workarounds for it is the right solution
> (or if it might prompt people to disable the warning, which would
> be a shame).

Yeah :(.  I hate how such obscure stuff (we have *one* occurence in the
whole GCC codebase...) can paralyze the warning as such.

I'm sorry to have to say that I don't quite know what to do, because
these "false positives" aren't easily fixable.  Because for
"m == sizeof (int)" the FE simply sees "m == 4"; ditto for e.g.
#define M 4 and "m == M".

Perhaps move the warning into -Wextra until we have the capacity to
differentiate between those?

> As an aside, I would have expected the change you implemented
> in GCC to get around this to trigger some other warning (such
> as -Wtautological-compare) e.g., if (a > 1 && a == -1), but it
> doesn't seem to, either in GCC or Clang.

This sort of stuff isn't really trivial to do in the front end,
I'm afraid.

> >How does this look like now?
> 
> If no one else is concerned about the above it looks good to
> me. I was hoping to see the warning emitted for conditional
> expressions as well but that can be considered an enhancement.

I think this can surely be done later on.

I realized that current patch has a minor deficiency: it will start
a chain even in case the first condition has a side-effect thus the
chain should be invalid.  I'll fix this problem soon.
 
> FWIW, while testing the patch I noticed the following bug: 67629.
> It seems like the same logic was we discussed in this context is
> needed there as well.

That sounds more like a bug in tree-cfg than in the FE ;-).

Thanks,

Marek


Re: [C++] Coding rule enforcement

2015-09-21 Thread Jason Merrill

On 09/21/2015 12:39 PM, Nathan Sidwell wrote:

On 09/21/15 12:23, Jason Merrill wrote:

On 09/21/2015 10:01 AM, Manuel López-Ibáñez wrote:

On 21 September 2015 at 15:46, Daniel Gutson
 wrote:


FWIW, we could make this plugin in 2 weeks (w already have static
checkers as plugins for our customers). I understand Nathan that you
may have some deadlines, but if we could have the opportunity to
implement it, we could accomplish a clean isolation of a particular
business needs (despite I acknowledge that a warning about virtual
inheritance may be useful for a broader audience). OTOH, a plugin can
receive arguments, such as a configuration file which could point to
specific sources or hints about where to apply the warning, or a
suppression file, both things useful for large legacy code. IMVHO I
think this is a superior solution.
Please let me know if we could collaborate to get both a better gcc
and a better static checker.


My opinion is that if people want to "donate" their plugins to the
FSF, they are willing to maintain them, and they are not overtly
complex or require any third-party software, they should be added to
the GCC repository and build/tested. They could serve as examples and
extra testing for the plugin framework, with the expectation that they
may get removed if they become unmaintained.


Absolutely.


I've no objection to going the plugin way, but I'm extremely unlikely to
be able to devote time to doing that in the foreseeable future.


I like the idea of the plugin, but your patch is very clean, so I think 
let's go ahead and put it in.


Jason



Re: [RFC] Try vector as a new representation for vector masks

2015-09-21 Thread Richard Henderson
On 09/21/2015 05:08 AM, Ilya Enkovich wrote:
> There is no any conversion here, maskload_optab is a convert_optab
> because it uses two modes, one for value and the other one for mask.

Ah, I see.  In which case I think we ought to come up with a different name.
C.f. get_vcond_icode.


r~


Re: [PATCH, i386] Introduce switch for Skylake Server CPU.

2015-09-21 Thread Uros Bizjak
On Mon, Sep 21, 2015 at 6:57 PM, Kirill Yukhin  wrote:
> Hi Uroš, Jakub,
> eOn 21 Sep 16:27, Jakub Jelinek wrote:
>> On Mon, Sep 21, 2015 at 05:14:45PM +0300, Kirill Yukhin wrote:
>> > Hello,
>> > This patch introduces switches necessary for new Intel Server CPU
>> > (code-named Skylake).
>> >
>> > Bootstrapped & regtested.
>> Is it a good idea to introduce further abbrevs like this?
>> I thought we went away from e.g. slm to silvermont, we didn't
>> introduce hsw but haswell, etc.
>> So, wouldn't it be better to add skylake-xeon instead of skx?
> Intel Xeon E3 does not support AVX-512*.
> So, I'd suggest skylake-avx512.

Even better, and fits existing practice, e.g. corei7-avx.

> gcc/
> * config.gcc: Support "skylake-avx512".
> * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> PROCESSOR_SKYLAKE_AVX512.
> * config/i386/i386.c (m_SKYLAKE_AVX512): Define.
> (processor_target_table): Add "skylake-avx512".
> (PTA_SKYLAKE_AVX512): Define.
> (ix86_option_override_internal): Add "skylake_avx512".
> (fold_builtin_cpu): Handle "skylake_avx512", add F_AVX512VL
> F_AVX512BW, F_AVX512DQ, F_AVX512ER, F_AVX512PF, F_AVX512CD.
> * config/i386/i386.h (TARGET_SKYLAKE_AVX512): Define.
> (processor_type): Add PROCESSOR_SKYLAKE_AVX512.
>
> libgcc/
> * libgcc/config/i386/cpuinfo.c (enum processor_features): Add
> FEATURE_AVX512VL, FEATURE_AVX512BW, FEATURE_AVX512DQ,
> FEATURE_AVX512CD, FEATURE_AVX512ER, FEATURE_AVX512PF.
> (get_available_features): Habdle new features.
>
> gcc/testsuite/
> * gcc.target/i386/funcspec-5.c: Test avx512vl, avx512bw,
> avx512dq, avx512cd, avx512er, avx512pf and skylake-avx512.
> * gcc.target/i386/builtin_target.c: Test  avx512vl, avx512bw,
> avx512dq, avx512cd, avx512er and avx512pf.
>
> Patch in the bottom. Is it ok?

Comments inline.

> commit 69d2e45dbc6b944845e3ec91127e62f85ab51490
> Author: Kirill Yukhin 
> Date:   Fri Sep 18 14:03:54 2015 +0300
>
> AVX-512. Introduce SKX CPU.
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 75807f5..efe4806 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -595,7 +595,7 @@ x86_64_archs="amdfam10 athlon64 athlon64-sse3 barcelona 
> bdver1 bdver2 \
>  bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
>  core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
>  sandybridge ivybridge haswell broadwell bonnell silvermont knl x86-64 \
> -native"
> +native skylake-avx512"

Please leave x86-64 and native at the last two places. They are generic entries.

>  # Additional x86 processors supported by --with-cpu=.  Each processor
>  # MUST be separated by exactly one space.
> diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
> index 35cecd0..5e583ae 100644
> --- a/gcc/config/i386/i386-c.c
> +++ b/gcc/config/i386/i386-c.c
> @@ -177,6 +177,10 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
>def_or_undef (parse_in, "__knl");
>def_or_undef (parse_in, "__knl__");
>break;
> +case PROCESSOR_SKYLAKE_AVX512:
> +  def_or_undef (parse_in, "__skylake_avx512");
> +  def_or_undef (parse_in, "__skylake_avx512__");
> +  break;
>  /* use PROCESSOR_max to not set/unset the arch macro.  */
>  case PROCESSOR_max:
>break;
> @@ -286,6 +290,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
>  case PROCESSOR_KNL:
>def_or_undef (parse_in, "__tune_knl__");
>break;
> +case PROCESSOR_SKYLAKE_AVX512:
> +  def_or_undef (parse_in, "__tune_skylake_avx512__");
> +  break;
>  case PROCESSOR_IAMCU:
>def_or_undef (parse_in, "__tune_iamcu__");
>break;
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 00e7006..41065be 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2098,6 +2098,7 @@ const struct processor_costs *ix86_cost = &pentium_cost;
>  #define m_BONNELL (1<  #define m_SILVERMONT (1<  #define m_KNL (1< +#define m_SKYLAKE_AVX512 (1<  #define m_INTEL (1<
>  #define m_GEODE (1< @@ -2567,6 +2568,7 @@ static const struct ptt 
> processor_target_table[PROCESSOR_max] =
>{"bonnell", &atom_cost, 16, 15, 16, 7, 16},
>{"silvermont", &slm_cost, 16, 15, 16, 7, 16},
>{"knl", &slm_cost, 16, 15, 16, 7, 16},
> +  {"skylake-avx512", &core_cost, 16, 10, 16, 10, 16},
>{"intel", &intel_cost, 16, 15, 16, 7, 16},
>{"geode", &geode_cost, 0, 0, 0, 0, 0},
>{"k6", &k6_cost, 32, 7, 32, 7, 32},
> @@ -3286,6 +3288,9 @@ ix86_option_override_internal (bool main_args_p,
>(PTA_HASWELL | PTA_ADX | PTA_PRFCHW | PTA_RDSEED)
>  #define PTA_SKYLAKE \
>(PTA_BROADWELL | PTA_CLFLUSHOPT | PTA_XSAVEC | PTA_XSAVES)
> +#define PTA_SKYLAKE_AVX512 \
> +  (PTA_SKYLAKE | PTA_AVX512F | PTA_AVX512CD | PTA_AVX512VL \
> +   | PTA_AVX512BW | PTA_AVX512DQ)
>  #define PTA_KNL \
>(PTA_BROADWELL | PTA_AVX512PF | PTA_AVX512ER

[PATCH] PR fortran/67615 -- check for scalar expression

2015-09-21 Thread Steve Kargl
Patch is self-explanatory.  Built and regression tested on
x86_64-*-freebsd.  OK to commit?

2015-09-21  Steven G. Kargl  

PR fortran/67615
* resolve.c (gfc_resolve_code): Check for scalar expression in 
arithmetic-if.


2015-09-21  Steven G. Kargl  

PR fortran/67615
* gfortran.dg/pr67615.f90: new test.

-- 
Steve
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(revision 227976)
+++ gcc/fortran/resolve.c	(working copy)
@@ -10377,15 +10381,18 @@ gfc_resolve_code (gfc_code *code, gfc_na
 	  }
 
 	case EXEC_ARITHMETIC_IF:
-	  if (t
-	  && code->expr1->ts.type != BT_INTEGER
-	  && code->expr1->ts.type != BT_REAL)
-	gfc_error ("Arithmetic IF statement at %L requires a numeric "
-		   "expression", &code->expr1->where);
+	  {
+	gfc_expr *e = code->expr1;
 
-	  resolve_branch (code->label1, code);
-	  resolve_branch (code->label2, code);
-	  resolve_branch (code->label3, code);
+	if (t && (e->rank > 0
+		  || !(e->ts.type == BT_REAL || e->ts.type == BT_INTEGER)))
+	  gfc_error ("Arithmetic IF statement at %L requires a scalar "
+			 "REAL or INTEGER expression", &code->expr1->where);
+
+	resolve_branch (code->label1, code);
+	resolve_branch (code->label2, code);
+	resolve_branch (code->label3, code);
+	  }
 	  break;
 
 	case EXEC_IF:
Index: gcc/testsuite/gfortran.dg/pr67615.f90
===
--- gcc/testsuite/gfortran.dg/pr67615.f90	(revision 0)
+++ gcc/testsuite/gfortran.dg/pr67615.f90	(working copy)
@@ -0,0 +1,33 @@
+! { dg-do compile }
+! { dg-options "-std=legacy" }
+! PR fortran/67615
+!
+program foo
+
+   implicit none
+
+   integer i(2), j
+   real x
+   complex z
+ 
+   j = 2
+   if (j) 10, 20, 30
+
+   x = -1
+   if (x) 10, 20, 30
+
+   z = (1,2)
+   if (z) 10, 20, 30   ! { dg-error "Arithmetic IF statement" }
+
+   i = [1, 2]
+   if (i) 10, 20, 30   ! { dg-error "Arithmetic IF statement" }
+
+   if ( [1] ) 10, 20, 30   ! { dg-error "Arithmetic IF statement" }
+   if ( [1, -1] ) 10, 20, 30   ! { dg-error "Arithmetic IF statement" }
+   if ( [real :: 1, -1] ) 10, 20, 30   ! { dg-error "Arithmetic IF statement" }
+
+10 stop
+20 stop
+30 stop
+
+end program foo


Re: [PATCH] PR fortran/67615 -- check for scalar expression

2015-09-21 Thread FX
> 2015-09-21  Steven G. Kargl  
> 
>   PR fortran/67615
>   * resolve.c (gfc_resolve_code): Check for scalar expression in 
>   arithmetic-if.
> 
> 
> 2015-09-21  Steven G. Kargl  
> 
>   PR fortran/67615
>   * gfortran.dg/pr67615.f90: new test.

OK


Re: [PATCH] PR fortran/67615 -- check for scalar expression

2015-09-21 Thread Steve Kargl
On Mon, Sep 21, 2015 at 07:52:48PM +0200, FX wrote:
> > 2015-09-21  Steven G. Kargl  
> > 
> > PR fortran/67615
> > * resolve.c (gfc_resolve_code): Check for scalar expression in 
> > arithmetic-if.
> > 
> > 
> > 2015-09-21  Steven G. Kargl  
> > 
> > PR fortran/67615
> > * gfortran.dg/pr67615.f90: new test.
> 
> OK

Thanks.

-- 
Steve


Re: (patch,rfc) s/gimple/gimple */

2015-09-21 Thread Richard Sandiford
Richard Biener  writes:
> On Fri, Sep 18, 2015 at 3:32 PM, Trevor Saunders  
> wrote:
>> On Wed, Sep 16, 2015 at 03:11:14PM -0400, David Malcolm wrote:
>>> On Wed, 2015-09-16 at 09:16 -0400, Trevor Saunders wrote:
>>> > Hi,
>>> >
>>> > I gave changing from gimple to gimple * a shot last week.  It turned out
>>> > to be not too hard.  As you might expect the patch is huge so its
>>> > attached compressed.
>>> >
>>> > patch was bootstrapped + regtested on x86_64-linux-gnu, and run through
>>> > config-list.mk.  However I needed to update it some for changes made
>>> > while testing.  Do people want to make this change now?  If so I'll try
>>> > and commit the patch over the weekend when less is changing.
>>>
>>>
>>> FWIW there are some big changes in gcc/tree-vect-slp.c:vectorizable_load
>>> that looks like unrelated whitespace changes, e.g. the following (and
>>> there are some followup hunks).  Did something change underneath, or was
>>> there a stray whitespace cleanup here?  (I skimmed through the patch,
>>> and this was the only file I spotted where something looked wrong)
>>
>> yeah, it was a stray whitespace cleanup, but I reverted it.
>>
>> Given the few but only positive comments I've seen I'm planning to
>> commit this over the weekend.
>
> Thanks a lot!
>
> If you are still in a refactoring mood then I have sth else here.  When
> streamlining the gimple accessors I noticed the glaring const-correctness
> issue in
>
> /* Return a pointer to the LHS of assignment statement GS.  */
>
> static inline tree *
> gimple_assign_lhs_ptr (const gassign *gs)
> {
>   return const_cast (&gs->op[0]);
> }
>
> and was thinking to either "fix" it by removing the 'const' or by
> merging gimple_assign_lhs and gimple_assign_lhs_ptr into
>
> static inline const tree&
> gimple_assign_lhs (const gassign *);
>
> static inline tree&
> gimple_assign_lhs (gassign *);

AIUI const_tree (like const_rtx) only protects the top-level tree.
This is something I always hoped to change for rtl one day, but fixing
all the fallout would be an incredibly dull task...

I suppose protecting the top level is still better than nothing though.

Thanks,
Richard



Re: New post-LTO OpenACC pass

2015-09-21 Thread Cesar Philippidis
On 09/21/2015 09:30 AM, Nathan Sidwell wrote:

> +const pass_data pass_data_oacc_transform =
> +{
> +  GIMPLE_PASS, /* type */
> +  "fold_oacc_transform", /* name */

Want to rename the tree dump file to oacc_xforms like I'm did in the
attached patch? Regardless, I think we need to document this flag in
invoke.texi.

> +  OPTGROUP_NONE, /* optinfo_flags */
> +  TV_NONE, /* tv_id */
> +  PROP_cfg, /* properties_required */
> +  0 /* Possibly PROP_gimple_eomp.  */, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */
> +};

Cesar
2015-09-21  Cesar Philippidis  

	gcc/
	* doc/invoke.texi: Document -fdump-tree-oacc_xforms.
	* omp-low.c (pass_data_oacc_transform): Rename the tree dump for
	oacc_transform as oacc_xforms.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 92f82d7..7406941 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7158,6 +7158,11 @@ is made by appending @file{.slp} to the source file name.
 Dump each function after Value Range Propagation (VRP).  The file name
 is made by appending @file{.vrp} to the source file name.
 
+@item oacc_xforms
+@opindex fdump-tree-oacc_xforms
+Dump each function after applying target-specific OpenACC transformations.
+The file name is made by appending @file{.oacc_xforms} to the source file name.
+
 @item all
 @opindex fdump-tree-all
 Enable all the available tree dumps with the flags provided in this option.
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index e3dc160..f31e6cd 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -15086,7 +15086,7 @@ namespace {
 const pass_data pass_data_oacc_transform =
 {
   GIMPLE_PASS, /* type */
-  "fold_oacc_transform", /* name */
+  "oacc_xforms", /* name */
   OPTGROUP_NONE, /* optinfo_flags */
   TV_NONE, /* tv_id */
   PROP_cfg, /* properties_required */


Re: New post-LTO OpenACC pass

2015-09-21 Thread Nathan Sidwell

On 09/21/15 16:30, Cesar Philippidis wrote:

On 09/21/2015 09:30 AM, Nathan Sidwell wrote:


+const pass_data pass_data_oacc_transform =
+{
+  GIMPLE_PASS, /* type */
+  "fold_oacc_transform", /* name */


Want to rename the tree dump file to oacc_xforms like I'm did in the
attached patch? Regardless, I think we need to document this flag in
invoke.texi.


Thanks for noticing the missing doc.  I'm not attached to any particular name. 
'fold_oacc_transform' is rather generic, and a bit of  a mouthful.  Perhaps 
'oacclower', 'oaccdevlower' or something (I  see there's 'lateomplower' for 
guidance)


nathan


[PATCH tree-inline v2] do not say "called from here" with UNKNOWN_LOCATION

2015-09-21 Thread Manuel López-Ibáñez
In https://sourceware.org/ml/libc-alpha/2014-12/msg00300.html, we give a
"called from here" note without actually having a location, which looks
strange. I haven't been able to generate such a testcase. If this happens,
try to at least point to the current function being called. If that location is
also unknown, skip the extra note.

gcc/ChangeLog:

2015-09-21  Manuel López-Ibáñez  

* tree-inline.c (expand_call_inline): Use inform for extra note.
Do not give a note with UNKNOWN_LOCATION.
Replace input_location with gimple_location (stmt).
Use true instead of TRUE.

gcc/testsuite/ChangeLog:

2015-09-21  Manuel López-Ibáñez  

* gcc.target/i386/inline_error.c (int bar): Use dg-message for note.
* gcc.target/i386/pr57756.c (static __inline int caller): Likewise.
* gcc.target/i386/pr59789.c (f1): Likewise.
* gcc.target/i386/intrinsics_5.c (__m128i foo): Likewise.
* gcc.target/i386/intrinsics_6.c: Likewise.
* gcc.dg/winline-5.c (int t): Likewise.
* gcc.dg/winline-9.c (t): Likewise.
* gcc.dg/always_inline2.c (q): Likewise.
* gcc.dg/winline-2.c (inline int t): Likewise.
* gcc.dg/winline-6.c: Likewise.
* gcc.dg/winline-10.c (void g): Likewise.
* gcc.dg/pr49243.c (void parse): Likewise.
* gcc.dg/always_inline3.c (q2): Likewise.
* gcc.dg/winline-3.c: Likewise.
* gcc.dg/winline-7.c (inline void *t): Likewise.
Index: gcc/testsuite/gcc.target/i386/inline_error.c
===
--- gcc/testsuite/gcc.target/i386/inline_error.c(revision 227965)
+++ gcc/testsuite/gcc.target/i386/inline_error.c(working copy)
@@ -7,7 +7,7 @@ foo () /* { dg-error "inlining failed in
   return 0;
 }
 
 int bar()
 {
-  return foo (); /* { dg-error "called from here" } */
+  return foo (); /* { dg-message "called from here" } */
 }
Index: gcc/testsuite/gcc.target/i386/pr57756.c
===
--- gcc/testsuite/gcc.target/i386/pr57756.c (revision 227965)
+++ gcc/testsuite/gcc.target/i386/pr57756.c (working copy)
@@ -9,11 +9,11 @@ __inline int callee () /* { dg-error "in
 }
 
 __attribute__((target("sse")))
 static __inline int caller ()
 {
-  return callee(); /* { dg-error "called from here" }  */
+  return callee(); /* { dg-message "called from here" }  */
 }
 
 int main ()
 {
   return caller();
Index: gcc/testsuite/gcc.target/i386/pr59789.c
===
--- gcc/testsuite/gcc.target/i386/pr59789.c (revision 227965)
+++ gcc/testsuite/gcc.target/i386/pr59789.c (working copy)
@@ -16,7 +16,7 @@ _mm_set_epi32 (int __q3, int __q2, int _
 
 
 __m128i
 f1(void)
 { /* { dg-message "warning: SSE vector return without SSE enabled changes the 
ABI" } */
-  return _mm_set_epi32 (0, 0, 0, 0); /* { dg-error "called from here" } */
+  return _mm_set_epi32 (0, 0, 0, 0); /* { dg-message "called from here" } */
 }
Index: gcc/testsuite/gcc.target/i386/intrinsics_5.c
===
--- gcc/testsuite/gcc.target/i386/intrinsics_5.c(revision 227965)
+++ gcc/testsuite/gcc.target/i386/intrinsics_5.c(working copy)
@@ -8,9 +8,9 @@
 
 #include 
 
 __m128i foo(__m128i *V)
 {
-return _mm_stream_load_si128(V); /* { dg-error "called from here" } */
+return _mm_stream_load_si128(V); /* { dg-message "called from here" } */
 }
 
 /* { dg-prune-output ".*inlining failed.*" }  */
Index: gcc/testsuite/gcc.target/i386/intrinsics_6.c
===
--- gcc/testsuite/gcc.target/i386/intrinsics_6.c(revision 227965)
+++ gcc/testsuite/gcc.target/i386/intrinsics_6.c(working copy)
@@ -8,9 +8,9 @@
 
 #include 
 
 __m128i foo(__m128i *V)
 {
-return _mm_stream_load_si128(V); /* { dg-error "called from here" } */
+return _mm_stream_load_si128(V); /* { dg-message "called from here" } */
 }
 
 /* { dg-prune-output ".*inlining failed.*" }  */
Index: gcc/testsuite/gcc.dg/winline-5.c
===
--- gcc/testsuite/gcc.dg/winline-5.c(revision 227965)
+++ gcc/testsuite/gcc.dg/winline-5.c(working copy)
@@ -15,7 +15,7 @@ inline int q(void) /* { dg-warning "inli
big();
big();
 }
 int t (void)
 {
-   return q (); /* { dg-warning "called from here" } */
+   return q (); /* { dg-message "called from here" } */
 }
Index: gcc/testsuite/gcc.dg/winline-9.c
===
--- gcc/testsuite/gcc.dg/winline-9.c(revision 227965)
+++ gcc/testsuite/gcc.dg/winline-9.c(working copy)
@@ -20,7 +20,7 @@ int
 t()
 {
   if (a)
 aa();
   if (b)
-bb();  /* { dg-warning "called from here" "" } */
+bb();  /* { dg-message "called from here" } */
 }
Index: gcc/testsuite/gcc.dg/always_inline2.c
=

[patch] Pack global state of forwprop to one structure

2015-09-21 Thread Martin Jambor
Hi,

the following patch almost removes the global state of pass_forwprop.
Only almost, because the pass calls fold_const with its own valueize
function which uses lattices, and changing the prototype of valueize
(by adding a void *data parameter) is something I am not going to
undertake without knowing it would be accepted, especially since there
might be people who would prefer to pass the data via some C++ism,
rather than a void pointer.  It is also probably more work than I have
time for this week.

The patch nevertheless puts global state in one structure and all but
this one function (and the execute method, into which the variable
should be eventually moved) access it through a pointer parameter.
Along the way I streamlined how cfg_changed is communicated from three
functions, now they set the flag in the state structure instead of
returning magic number 2 which the caller then tests for only to set
the flag in that case.

My motivation is a bug in HSA run-time and/or finalizer, which does
not allow me to run the following pattern of code:

  pointer = &local_decl; /* address of first private decl is 0 */
  pointer->field = tmp;  /* this oopses on my RC kernel */

Because I'm in the process of changing the HSA omp lowering/expansion
in a way that routinely produces such statements, I need to run some
sort of simple forward propagation even at -O0 (on functions that are
to be expanded only to HSAIL and not on host).  To do that, making
forward_propagate_addr_expr public is very useful, but it has to know
when it is called from outside the pass so that it does not try to
update the inner state.

The forward propagation is a hack and I actually I hope I will be able
to remove it before proposing merging to trunk.  But this patch seems
useful on its own, so I would like to commit it.

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2015-09-21  Martin Jambor  

* tree-ssa-forwprop.c (struct fwp_state): New type.
(pass_state): New variable.
(cfg_changed): Moved to fwp_state.
(to_purge): Likewise.
(lattice): Likewise.
(fwprop_set_lattice_val): Use state received in a new parameter.
(fwprop_invalidate_lattice): Likewise.
(remove_prop_source_from_use): Likewise.
(forward_propagate_into_comparison): Likewise. Set state->cfg_changed
instead of returning two.
(forward_propagate_into_gimple_cond): Likewise.
(tidy_after_forward_propagate_addr): Use state received in a new
parameter.
(forward_propagate_addr_expr_1): Likewise.
(forward_propagate_addr_expr): Likewise.
(simplify_gimple_switch_label_vec): Likewise.
(simplify_gimple_switch): Likewise.
(simplify_builtin_call): Likewise.
(simplify_permutation): Likewise.  Set state->cfg_changed instead of
returning two.
(fwprop_ssa_val): Use pass_state.
(namespace): Likewise.  Removed checks for return value of two.
---
 gcc/tree-ssa-forwprop.c | 282 
 1 file changed, 142 insertions(+), 140 deletions(-)

diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index ccfde5f..343c1de 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -188,44 +188,55 @@ along with GCC; see the file COPYING3.  If not see
 
This will (of course) be extended as other needs arise.  */
 
-static bool forward_propagate_addr_expr (tree, tree, bool);
+/* Structure holding state of the forward propagation pass.  */
 
-/* Set to true if we delete dead edges during the optimization.  */
-static bool cfg_changed;
+struct fwp_state
+{
+  /* Set to true if we delete dead edges during the optimization.  */
+  bool cfg_changed;
 
-static tree rhs_to_tree (tree type, gimple stmt);
+  /* Bitmap of basic blocks containing dead outgoing EH edges.  */
+  bitmap to_purge;
 
-static bitmap to_purge;
+  /* Const-and-copy lattice.  */
+  vec lattice;
+};
 
-/* Const-and-copy lattice.  */
-static vec lattice;
+/* Global state of the  */
+static struct fwp_state pass_state;
+
+static bool forward_propagate_addr_expr (fwp_state *, tree, tree, bool);
+static tree rhs_to_tree (tree type, gimple stmt);
+
+/* Set the lattice entry for NAME to VAL.  STATE points to global state
+   information of the pass.  */
 
-/* Set the lattice entry for NAME to VAL.  */
 static void
-fwprop_set_lattice_val (tree name, tree val)
+fwprop_set_lattice_val (fwp_state *state, tree name, tree val)
 {
   if (TREE_CODE (name) == SSA_NAME)
 {
-  if (SSA_NAME_VERSION (name) >= lattice.length ())
+  if (SSA_NAME_VERSION (name) >= state->lattice.length ())
{
- lattice.reserve (num_ssa_names - lattice.length ());
- lattice.quick_grow_cleared (num_ssa_names);
+ state->lattice.reserve (num_ssa_names - state->lattice.length ());
+ state->lattice.quick_grow_cleared (num_ssa_names);
}
-  lattice[SSA_NAME_VERSION (n

[PATCH,committed] Fix typos in comment.

2015-09-21 Thread Steve Kargl
I committed the following patch, which spells "defined" correctly.


2015-09-21  Steven G. Kargl  

* resolve.c (nonscalar_typebound_assign): Fix typos in comment.
 
Index: resolve.c
===
--- resolve.c   (revision 227984)
+++ resolve.c   (revision 227985)
@@ -9844,8 +9844,8 @@ nonscalar_typebound_assign (gfc_symbol *
"An intrinsic assignment where the variable is of derived type is
performed as if each component of the variable were assigned from the
corresponding component of expr using pointer assignment (7.2.2) for
-   each pointer component, defined assignment for each nonpointer
-   nonallocatable component of a type that has a type-bound defined
+   each pointer component, defined assignment for each nonpointer
+   nonallocatable component of a type that has a type-bound defined
assignment consistent with the component, intrinsic assignment for
each other nonpointer nonallocatable component, ..."
 
-- 
Steve


[patch committed FT32] Fixes for hardware startup

2015-09-21 Thread James Bowman
The attached patch contains several fixes for the default hardware
startup.

[libgcc]

2015-09-21  James Bowman  

* config/ft32/crti-hw.S: Use __PMSIZE to allow configurable
memory layout. Deal correctly with BSS region larger than 32K.
Handle a watchdog reset like a power-on reset. Clean up unused
code.

Index: libgcc/config/ft32/crti-hw.S
===
--- libgcc/config/ft32/crti-hw.S(revision 227841)
+++ libgcc/config/ft32/crti-hw.S(working copy)
@@ -1,11 +1,8 @@
-.equ SYS_REGMSC0CFG_B3  , 0x1001b
-.equ SYS_REGIRQCTL_B3   , 0x100e3
-
 .global _start
 _start:
 # START Interrupt Vector Table [[
-jmp 0x3fffc # RESET Vector
-jmp _watchdog_isr   # WATCHDOG Vector # TODO: Change me to reset 
the chip proper
+jmp __PMSIZE-4
+jmp watchdog_init
 jmp interrupt_0
 jmp interrupt_1
 jmp interrupt_2
@@ -38,64 +35,66 @@ _start:
 jmp interrupt_29
 jmp interrupt_30
 jmp interrupt_31
-jmp 0x3fff8
+jmp __PMSIZE-8
 # ]] END Interrupt Vector Table
 
 codestart:
 jmpinit
-
 
 .global _exithook
 _exithook:   # Debugger uses '_exithook' at 0x90 to catch program 
exit
 return
 
+watchdog_init:
+ldk$r0,1
 init:
+ldk $sp,__RAMSIZE
 # Disable all interrupts
-ldk $r0,0x80
-sta.b   0x100e3,$r0
-
-# Reset all peripherals
-# lda.l   $r0, 0x10018
-# bins.l  $r0, $r0, 0x23F  # Set bit 31
-# sta.l   0x10018, $r0
+ldk $r4,0x80
+sta.b   0x100e3,$r4
 
 # Initialize DATA by copying from program memory
-ldk.l   $r0,__data_load_start
+ldk.l   $r4,__data_load_start
 ldk.l   $r1,__data_load_end
 ldk.l   $r2,0   # Will use __data after binutils patch
 
 jmp .dscopy
 .dsloop:
-# Copy PM[$r0] to RAM $r2
-lpmi.l  $r3,$r0,0
+# Copy PM[$r4] to RAM $r2
+lpmi.l  $r3,$r4,0
 sti.l   $r2,0,$r3
-add.l   $r0,$r0,4
+add.l   $r4,$r4,4
 add.l   $r2,$r2,4
 .dscopy:
-cmp.l   $r0,$r1
+cmp.l   $r4,$r1
 jmpclt,.dsloop
 
 # Zero BSS
-ldk.l   $r0,_bss_start
+ldk.l   $r4,_bss_start
 ldk.l   $r2,_end
-sub.l   $r2,$r2,$r0
+sub.l   $r2,$r2,$r4
 ldk.l   $r1,0
-memset.l  $r0,$r1,$r2
+ldk$r3,32764
+1:
+cmp$r2,$r3
+jmpc   lt,2f
+memset $r4,$r1,$r3
+add$r4,$r4,$r3
+sub$r2,$r2,$r3
+jmp1b
+2:
+memset $r4,$r1,$r2
 
 sub.l   $sp,$sp,24  # Space for the caller argument frame
 callmain
 
-.equ EXITEXIT, 0x1fffc
+.equ EXITEXIT,0x1fffc
 
 .global _exit
 _exit:
 sta.l   EXITEXIT,$r0# simulator end of test
 jmp _exithook
 
-_watchdog_isr:
-ldk.l   $sp, 0x80   # Reset the stack pointer so it doesn't grow 
to a huge size
-jmp 0
-
 # Macro to construct the interrupt stub code.
 # it just saves r0, loads r0 with the int vector
 # and branches to interrupt_common.


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised)

2015-09-21 Thread Michael Meissner
A heads up. I just found some places in the IEEE 128-bit floating point code
where it doesn't handle conversions to/from __ibm128.  Nor does it generate the
same names for -mabi=ieeelongdouble.  I will submit a revised patch when it is
ready.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] Fix PR66952

2015-09-21 Thread Christophe Lyon
On 23 July 2015 at 13:17, Richard Biener  wrote:
> On Thu, 23 Jul 2015, Kyrill Tkachov wrote:
>
>>
>> On 23/07/15 10:02, Andreas Schwab wrote:
>> > Richard Biener  writes:
>> >
>> > > Index: gcc/testsuite/gcc.dg/torture/pr66952.c
>> > > ===
>> > > --- gcc/testsuite/gcc.dg/torture/pr66952.c(revision 0)
>> > > +++ gcc/testsuite/gcc.dg/torture/pr66952.c(working copy)
>> > > @@ -0,0 +1,28 @@
>> > > +/* { dg-do run } */
>> > > +
>> > > +int a = 128, b;
>> > > +
>> > > +static int
>> > > +fn1 (char p1, int p2)
>> > > +{
>> > > +  return p1 < 0 || p1 > 1 >> p2 ? 0 : p1 << 1;
>> > This is broken, p1 can never be < 0.
>>
>> Just mark the chars as signed chars?
>> The testcase passes for me with that change.
>
> Ah - I always forget to double-check such testcases with both
> -fsinged-char and -funsigned-char...
>
> Will fix.
>

Hi Richard,

I looks like you forgot to backport r226104 too (which forces use of
signed chars).

Christophe.

> Richard.


Re: [patch committed SH] Fix PR target/67573

2015-09-21 Thread Kaz Kojima
> I've committed the attached patch to fix PR67573.  LRA reveals
> that scratch registers in *call_*pcrel should be defined as
> early clobbered.  The patch is tested on sh4-unkown-linux-gnu.

I backported the fix for PR target/67573 to 5-branch.  The trunk
patch applies without problem.  Tested on sh4-unknown-linux-gnu.
Committed as revision 227953.

Regards,
kaz
--
2015-09-21  Kaz Kojima  

Backport from mainline
2015-09-16  Kaz Kojima  

PR target/67573
* config/sh/sh.md (call_pcrel): Add early clobber to scratch
operand.
(call_value_pcrel, sibcall_pcrel, sibcall_value_pcrel): Likewise.


Re: [PATCH tree-inline] do not say "called from here" with UNKNOWN_LOCATION

2015-09-21 Thread Trevor Saunders
On Mon, Sep 21, 2015 at 12:29:36PM +0200, Richard Biener wrote:
> On Mon, Sep 21, 2015 at 11:59 AM, Manuel López-Ibáñez
>  wrote:
> > On 21 September 2015 at 10:18, Richard Biener
> >  wrote:
> >> input_location is set from the call stmt:
> >>
> >>   /* FIXME: instantiate_decl isn't called by inlinable_function_p.  */
> >>   saved_location = input_location;
> >>   input_location = gimple_location (stmt);
> >>
> >> it would be nice to get rid of that.
> >
> > I could replace all uses of input_location in this function by
> > gimple_location(stmt) as I noted in the comments. Would that be ok if
> > it works? I'm not sure I can prove that input_location is not used
> > behind the scenes for some other purpose (all the more reason to kill
> > input_location once and for all). Friends, don't let friends use
> > input_location in new code!
> 
> Yeah...  not sure how to check but to look for any changes in
> generated cc1/cc1plus
> debug info.  You could also try making it invalid (-1?) and hope
> libcpp would eventually
> blow up if that is used.

I've been kind of meaning to add some raii classes like
auto_assert_no_input_location that assert if you use input_location
while its on the stack.  It shouldn't be that hard to do, so it'd be
cool if someone beat me to getting it done :)

Trev


Re: [PATCH tree-inline] do not say "called from here" with UNKNOWN_LOCATION

2015-09-21 Thread Manuel López-Ibáñez
On 21 September 2015 at 12:29, Richard Biener
 wrote:
> On Mon, Sep 21, 2015 at 11:59 AM, Manuel López-Ibáñez
>  wrote:
>> On 21 September 2015 at 10:18, Richard Biener
>>  wrote:
>>> input_location is set from the call stmt:
>>>
>>>   /* FIXME: instantiate_decl isn't called by inlinable_function_p.  */
>>>   saved_location = input_location;
>>>   input_location = gimple_location (stmt);
>>>
>>> it would be nice to get rid of that.
>>
>> I could replace all uses of input_location in this function by
>> gimple_location(stmt) as I noted in the comments. Would that be ok if
>> it works? I'm not sure I can prove that input_location is not used
>> behind the scenes for some other purpose (all the more reason to kill
>> input_location once and for all). Friends, don't let friends use
>> input_location in new code!
>
> Yeah...  not sure how to check but to look for any changes in
> generated cc1/cc1plus
> debug info.  You could also try making it invalid (-1?) and hope
> libcpp would eventually
> blow up if that is used.
>

It does blow up with:

/home/manuel/test2/227965M/build/gcc/gnat1 -gnatwa -quiet -nostdinc
-dumpbase s-mudido.adb -auxbase-strip s-mudido.o -O2 -Wextra -Wall
-fpic -g -gnatpg -mtune=generic -march=x86-64 -gnatO s-mudido.o
s-mudido.adb -o /tmp/ccNQpzNF.s

at:

B =>SET_EXPR_LOCATION (mod, EXPR_LOC_OR_LOC (val, input_location));

#0  internal_get_tmp_var (val=0x75dfdc40, pre_p=0x7fffdaf0,
post_p=, is_formal=) at
/home/manuel/test2/src/gcc/gimplify.c:540
#1  0x00c00efd in gimplify_expr
(expr_p=expr_p@entry=0x7fffdaf8, pre_p=pre_p@entry=0x7fffdaf0,
post_p=0x7fffd9a0, post_p@entry=0x0, gimple_test_f=, fallback=fallback@entry=1) at
/home/manuel/test2/src/gcc/gimplify.c:9040
#2  0x00c19b67 in gimple_regimplify_operands
(stmt=0x76074be0, gsi_p=gsi_p@entry=0x7fffdbb0) at
/home/manuel/test2/src/gcc/gimplify-me.c:252
#3  0x00e8cbe3 in copy_bb (id=id@entry=0x7fffde40,
bb=bb@entry=0x760eb548,
frequency_scale=frequency_scale@entry=1,
count_scale=count_scale@entry=1) at
/home/manuel/test2/src/gcc/tree-inline.c:1798
#4  0x00e8e039 in copy_cfg_body (new_entry=0x0,
exit_block_map=0x75dff340, entry_block_map=0x760d4c30,
frequency_scale=1, count=, id=0x7fffde40) at
/home/manuel/test2/src/gcc/tree-inline.c:2716
#5  copy_body (id=0x7fffde40, count=,
frequency_scale=1, entry_block_map=0x760d4c30,
exit_block_map=0x75dff340, new_entry=0x0) at
/home/manuel/test2/src/gcc/tree-inline.c:2955
#6  0x00e94f71 in expand_call_inline (id=0x7fffde40,
stmt=, bb=) at
/home/manuel/test2/src/gcc/tree-inline.c:4693
#7  gimple_expand_calls_inline (id=0x7fffde40, bb=)
at /home/manuel/test2/src/gcc/tree-inline.c:4833
#8  optimize_inline_calls (fn=) at
/home/manuel/test2/src/gcc/tree-inline.c:4973
#9  0x014c503c in inline_transform (node=0x7644ccf0) at
/home/manuel/test2/src/gcc/ipa-inline-transform.c:545
#10 0x00d54bac in execute_one_ipa_transform_pass
(ipa_pass=0x2656340, node=0x7644ccf0) at
/home/manuel/test2/src/gcc/passes.c:2197
#11 execute_all_ipa_transforms () at /home/manuel/test2/src/gcc/passes.c:2238
#12 0x00a99bc8 in cgraph_node::expand
(this=this@entry=0x7644ccf0) at
/home/manuel/test2/src/gcc/cgraphunit.c:1976
#13 0x00a9b44e in expand_all_functions () at
/home/manuel/test2/src/gcc/cgraphunit.c:2119
#14 symbol_table::compile (this=this@entry=0x7642b0a8) at
/home/manuel/test2/src/gcc/cgraphunit.c:2472
#15 0x00a9da63 in symbol_table::compile (this=0x7642b0a8)
at /home/manuel/test2/src/gcc/cgraphunit.c:2536
#16 symbol_table::finalize_compilation_unit (this=0x7642b0a8) at
/home/manuel/test2/src/gcc/cgraphunit.c:2562
#17 0x00e17d90 in compile_file () at
/home/manuel/test2/src/gcc/toplev.c:508
#18 0x0069e8a4 in do_compile () at
/home/manuel/test2/src/gcc/toplev.c:1973
#19 toplev::main (this=this@entry=0x7fffe0a0, argc=argc@entry=21,
argv=argv@entry=0x7fffe198) at
/home/manuel/test2/src/gcc/toplev.c:2080
#20 0x006a0bd7 in main (argc=21, argv=0x7fffe198) at
/home/manuel/test2/src/gcc/main.c:39

For some extra reason val does not have a location:

(gdb) p debug_tree(val)
 
sizes-gimplified asm_written nonaliased-component BLK size
 unit size 
align 8 symtab -166627696 alias set 32 canonical type
0x75fca2a0 domain  context

pointer_to_this 
reference_to_this  chain >
asm_written public unsigned DI
size 
unit size 
align 64 symtab -166626736 alias set 41 canonical type 0x75fca540>

arg 0 
nothrow
arg 0 
visited var def_stmt
R.94_159 = .builtin_alloca_with_align (iftmp.93_10, 8);

version 159
ptr-info 0x75d7d5a0>
arg 1 
s-mudido.adb:156:24>>

Arguably, gimple_location (stmt) is probably better than anything that
may be initially at input_location.

Then, we cannot remov

Re: [PATCH] Add new hooks ASM_OUTPUT_START_FUNCTION_HEADER ...

2015-09-21 Thread Trevor Saunders
On Mon, Sep 21, 2015 at 12:31:58PM +0100, Dominik Vogt wrote:
> This patch adds to new backend hooks
> ASM_OUTPUT_START_FUNCTION_HEADER and
> ASM_OUTPUT_END_FUNCTION_FOOTER that may be defined to emit
> assembly code at the very start or end of a function.  This
> functionality is needed by the patch that ports the "target"
> attribute and pragma to S/390.  (I'll post that patch shortly, and
> in a different thread.)

obligatory question, why are we adding new target macros instead of
using hooks?  Especially in a place like this where the caller is very
unlikely to be at all hot.

thanks!

Trev



Re: (patch,rfc) s/gimple/gimple */

2015-09-21 Thread Trevor Saunders
On Mon, Sep 21, 2015 at 08:32:36PM +0100, Richard Sandiford wrote:
> Richard Biener  writes:
> > On Fri, Sep 18, 2015 at 3:32 PM, Trevor Saunders  
> > wrote:
> >> On Wed, Sep 16, 2015 at 03:11:14PM -0400, David Malcolm wrote:
> >>> On Wed, 2015-09-16 at 09:16 -0400, Trevor Saunders wrote:
> >>> > Hi,
> >>> >
> >>> > I gave changing from gimple to gimple * a shot last week.  It turned out
> >>> > to be not too hard.  As you might expect the patch is huge so its
> >>> > attached compressed.
> >>> >
> >>> > patch was bootstrapped + regtested on x86_64-linux-gnu, and run through
> >>> > config-list.mk.  However I needed to update it some for changes made
> >>> > while testing.  Do people want to make this change now?  If so I'll try
> >>> > and commit the patch over the weekend when less is changing.
> >>>
> >>>
> >>> FWIW there are some big changes in gcc/tree-vect-slp.c:vectorizable_load
> >>> that looks like unrelated whitespace changes, e.g. the following (and
> >>> there are some followup hunks).  Did something change underneath, or was
> >>> there a stray whitespace cleanup here?  (I skimmed through the patch,
> >>> and this was the only file I spotted where something looked wrong)
> >>
> >> yeah, it was a stray whitespace cleanup, but I reverted it.
> >>
> >> Given the few but only positive comments I've seen I'm planning to
> >> commit this over the weekend.
> >
> > Thanks a lot!
> >
> > If you are still in a refactoring mood then I have sth else here.  When
> > streamlining the gimple accessors I noticed the glaring const-correctness
> > issue in
> >
> > /* Return a pointer to the LHS of assignment statement GS.  */
> >
> > static inline tree *
> > gimple_assign_lhs_ptr (const gassign *gs)
> > {
> >   return const_cast (&gs->op[0]);
> > }
> >
> > and was thinking to either "fix" it by removing the 'const' or by
> > merging gimple_assign_lhs and gimple_assign_lhs_ptr into
> >
> > static inline const tree&
> > gimple_assign_lhs (const gassign *);
> >
> > static inline tree&
> > gimple_assign_lhs (gassign *);
> 
> AIUI const_tree (like const_rtx) only protects the top-level tree.
> This is something I always hoped to change for rtl one day, but fixing
> all the fallout would be an incredibly dull task...
> 
> I suppose protecting the top level is still better than nothing though.

yeah, it seems like having things at least be standard const correct
here would be nice.  That said some of this looks interesting, it looks
like the use def data structures keep pointers to pointers to trees in
the middle of gimple objects.  That seems kind of odd, I guess the idea
is that that way sometimes the use def data structure will automatically
stay up to date?

Trev

> 
> Thanks,
> Richard
> 


  1   2   >