RE: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Thomas Preud'homme
> From: Marc Glisse [mailto:marc.gli...@inria.fr]
> 
> Uh? It does fold a+1-a for me. What it doesn't do is look through the
> definition of b in b-a. Richard+GSoC will supposedly soon provide a
> function that does that.

Oh right, it's a bit more complex here since the array index is converted
to an offset first. So the operation is more like:  ((a+1)*cst) - (a*cst).
Any chance this might be handled at some point? Note that this might
not be very frequent so it's not very important for this patch.

Thanks for the comment.

Best regards,

Thomas
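
To make the expression above concrete, here is a minimal C sketch. It assumes the index has already been converted to the offset type (size_t here), so the arithmetic is modular and ((a+1)*cst) - (a*cst) equals cst for every a. The helper names demo_elt_offset and demo_adjacent_distance are made up for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of element IDX in an array whose elements are ELT_SIZE
   bytes wide.  Hypothetical helper, not part of GCC.  */
size_t
demo_elt_offset (size_t idx, size_t elt_size)
{
  return idx * elt_size;
}

/* ((a+1)*cst) - (a*cst), written the way two adjacent array accesses
   are lowered to offsets.  size_t arithmetic wraps modulo 2^N, so the
   result is ELT_SIZE for every value of A.  */
size_t
demo_adjacent_distance (size_t a, size_t elt_size)
{
  assert (elt_size != 0);
  return demo_elt_offset (a + 1, elt_size) - demo_elt_offset (a, elt_size);
}
```

A folder that recognises this pattern could replace the whole subtraction by the constant element size.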




RFA: RL78: Fix handling of (SUBREG (SYMBOL_REF))

2014-04-02 Thread Nick Clifton
Hi DJ,

  The patch below is to fix a snafu I made whilst fixing some problems
  with the RL78 port a while ago.  GCC was generating
  (SUBREG (SYMBOL_REF) ) which made no sense to me, so I had the
  movqi expander just fail when it encountered them.  Now that I have
  a better idea of why they are created - installing symbolic values into
  bitfields or packed structure fields - I have found that it is
  necessary to support them.  Failure is not an option as GCC will just
  silently omit generating any code at all.

  Tested with an rl78-elf toolchain without any regressions.  OK to
  apply ?

Cheers
  Nick

gcc/ChangeLog
2014-04-01  Nick Clifton  

* config/rl78/rl78-expand.md (movqi): Handle (SUBREG (SYMBOL_REF))
properly.

Index: gcc/config/rl78/rl78-expand.md
===
--- gcc/config/rl78/rl78-expand.md  (revision 209009)
+++ gcc/config/rl78/rl78-expand.md  (working copy)
@@ -30,18 +30,23 @@
 if (rl78_far_p (operands[0]) && rl78_far_p (operands[1]))
   operands[1] = copy_to_mode_reg (QImode, operands[1]);
 
-/* FIXME: Not sure how GCC can generate (SUBREG (SYMBOL_REF)),
-   but it does.  Since this makes no sense, reject it here.  */
+/* GCC can generate (SUBREG (SYMBOL_REF)) when it has to store a symbol
+   into a bitfield, or a packed ordinary field.  We can handle this
+   provided that the destination is a register.  If not, then load the
+   source into a register first.  */
 if (GET_CODE (operands[1]) == SUBREG
-&& GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF)
-  FAIL;
+&& GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF
+   && ! REG_P (operands[0]))
+   operands[1] = copy_to_mode_reg (QImode, operands[1]);
+
 /* Similarly for (SUBREG (CONST (PLUS (SYMBOL_REF.
cf. g++.dg/abi/packed.C.  */
 if (GET_CODE (operands[1]) == SUBREG
&& GET_CODE (XEXP (operands[1], 0)) == CONST
 && GET_CODE (XEXP (XEXP (operands[1], 0), 0)) == PLUS
-&& GET_CODE (XEXP (XEXP (XEXP (operands[1], 0), 0), 0)) == SYMBOL_REF)
-  FAIL;
+&& GET_CODE (XEXP (XEXP (XEXP (operands[1], 0), 0), 0)) == SYMBOL_REF
+   && ! REG_P (operands[0]))
+   operands[1] = copy_to_mode_reg (QImode, operands[1]);
 
 if (CONST_INT_P (operands[1]) && ! IN_RANGE (INTVAL (operands[1]), (-1 << 8) + 1, (1 << 8) - 1))
   FAIL;
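
For readers wondering what kind of source leads to such a store, the following is a sketch only, based on the new comment in the patch ("store a symbol into a bitfield, or a packed ordinary field"). Whether this exact input produces (SUBREG (SYMBOL_REF)) on RL78 is an assumption; all names (demo_buffer, demo_packed, demo_store_symbol, demo_roundtrip) are hypothetical.

```c
#include <assert.h>

/* Hypothetical object; only its address matters.  */
char demo_buffer[4];

/* The packed layout places PTR at byte offset 1, so it is not
   naturally aligned and may have to be written piece by piece.  */
struct __attribute__ ((packed)) demo_packed
{
  char tag;
  char *ptr;
};

/* Store a symbolic address into the misaligned field.  */
void
demo_store_symbol (struct demo_packed *p)
{
  p->tag = 1;
  p->ptr = demo_buffer;
}

/* Return 1 if the stored address reads back intact.  */
int
demo_roundtrip (void)
{
  struct demo_packed r;

  demo_store_symbol (&r);
  assert (sizeof (struct demo_packed) == 1 + sizeof (char *));
  return r.tag == 1 && r.ptr == demo_buffer;
}
```

If the expander simply FAILs on such a store, no code is generated for it, which matches the silent omission described above.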


Re: [gomp4] Add tables generation

2014-04-02 Thread Thomas Schwinge
Hi!

On Thu, 20 Mar 2014 17:50:13 +0100, Bernd Schmidt wrote:
> This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
> adds functionality to build function/variable tables that will allow 
> libgomp to look up offload target code based on the address of the 
> corresponding host function. There are two alternatives, one based on 
> named sections, and one based on a target hook when named sections are 
> unavailable (as on ptx).
> 
> Committed on gomp-4_0-branch.

I see regressions in the libgomp testsuite for configurations where
offloading is not enabled:

spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ 
[...]/source/libgomp/testsuite/libgomp.c/for-3.c 
-B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ 
-B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs 
-I[...]/build/x86_64-unknown-linux-gnu/./libgomp 
-I[...]/source/libgomp/testsuite/.. -fmessage-length=0 
-fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 
-fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o 
./for-3.exe
/tmp/ccGnT0ei.o: In function `main':
for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
collect2: error: ld returned 1 exit status

I suppose that's because even if...

> --- gcc/configure.ac  (revision 208715)
> +++ gcc/configure.ac  (working copy)
> @@ -887,6 +887,10 @@ AC_SUBST(enable_accelerator)
>  offload_targets=`echo $offload_targets | sed -e 's#,#:#'`
>  AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
>   [Define to hold the list of target names suitable for offloading.])
> +if test x$offload_targets != x; then
> +  AC_DEFINE(ENABLE_OFFLOADING, 1,
> +[Define this to enable support for offloading.])
> +fi

... offloading is not enabled, this...

> --- gcc/omp-low.c (revision 208706)
> +++ gcc/omp-low.c (working copy)
> @@ -8671,19 +8672,22 @@ expand_omp_target (struct omp_region *re
>  }
>  
>gimple g;
> -  /* FIXME: This will be address of
> - extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
> - symbol, as soon as the linker plugin is able to create it for us.  */
> -  tree openmp_target = build_zero_cst (ptr_type_node);
> +  tree openmp_target
> += build_decl (UNKNOWN_LOCATION, VAR_DECL,
> +   get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
> +  TREE_PUBLIC (openmp_target) = 1;
> +  DECL_EXTERNAL (openmp_target) = 1;
>if (kind == GF_OMP_TARGET_KIND_REGION)
>  {
>tree fnaddr = build_fold_addr_expr (child_fn);
> -  g = gimple_build_call (builtin_decl_explicit (start_ix), 7,
> -  device, fnaddr, openmp_target, t1, t2, t3, t4);
> +  g = gimple_build_call (builtin_decl_explicit (start_ix), 7, device,
> +  fnaddr, build_fold_addr_expr (openmp_target),
> +  t1, t2, t3, t4);
>  }
>else
> -g = gimple_build_call (builtin_decl_explicit (start_ix), 6,
> -device, openmp_target, t1, t2, t3, t4);
> +g = gimple_build_call (builtin_decl_explicit (start_ix), 6, device,
> +build_fold_addr_expr (openmp_target),
> +t1, t2, t3, t4);

... will now cause a reference to __OPENMP_TARGET__, but...

> --- libgcc/crtstuff.c (revision 208706)
> +++ libgcc/crtstuff.c (working copy)
> @@ -311,6 +311,15 @@ register_tm_clones (void)
>  }
>  #endif /* USE_TM_CLONE_REGISTRY */
>  
> +#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
> +void *_omp_func_table[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +   section (".offload_func_table_section"))) = { };
> +void *_omp_var_table[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +   section (".offload_var_table_section"))) = { };
> +#endif
> +
>  #if defined(INIT_SECTION_ASM_OP) || defined(INIT_ARRAY_SECTION_ASM_OP)
>  
>  #ifdef OBJECT_FORMAT_ELF
> @@ -752,6 +761,23 @@ __do_global_ctors (void)
>  #error "What are you doing with crtstuff.c, then?"
>  #endif
>  
> +#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
> +void *_omp_funcs_end[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +   section (".offload_func_table_section"))) = { };
> +void *_omp_vars_end[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +   section (".offload_var_table_section"))) = { };
> +extern void *_omp_func_table[];
> +extern void *_omp_var_table[];
> +void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
> +{
> +  &_omp_func_table, &_omp_funcs_end,
> +  &_omp_var_table, &_omp_vars_end
> +};
> +#endif

... __OPENMP_TARGET__ is not being defined here for the
!ENABLE_OFFLOADING case.  In
,
Jakub had suggested this to be a weak symbol, so we'd get NULL in this
case, which would be what's needed here, I think?
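
A minimal sketch of the weak-symbol idea, assuming an ELF target where an undefined weak reference resolves to address zero. The symbol demo_offload_table is a hypothetical stand-in for __OPENMP_TARGET__ and is deliberately never defined; the function names are likewise made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __OPENMP_TARGET__.  Declared weak and not
   defined anywhere, so the link still succeeds.  */
extern void *demo_offload_table[] __attribute__ ((weak));

/* Nonzero only if some object file actually provides the table.  */
int
demo_have_offload_table (void)
{
  return demo_offload_table != NULL;
}

/* Callers must check for the table before looking inside it.  */
void *
demo_first_entry (void)
{
  assert (demo_have_offload_table ());
  return demo_offload_table[0];
}
```

With no definition linked in, demo_have_offload_table returns 0 rather than the link failing with an undefined reference.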


Also, I'd suggest to rename __OPENMP_TARGE

Re: [PATCH] Guard special installs in install-driver

2014-04-02 Thread Richard Biener
On Tue, 1 Apr 2014, Mike Stump wrote:

> On Mar 31, 2014, at 4:50 AM, Richard Biener  wrote:
> > -$(INSTALL_PROGRAM) xgcc$(exeext) 
> > $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext)
> > !   -rm -f 
> > $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-$(version)$(exeext)
> > !   -( cd $(DESTDIR)$(bindir) && \
> > !  $(LN) $(GCC_INSTALL_NAME)$(exeext) 
> > $(target_noncanonical)-gcc-$(version)$(exeext) )
> > !   -if [ ! -f gcc-cross$(exeext) ] ; then \
> >   rm -f $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-tmp$(exeext); \
> >   ( cd $(DESTDIR)$(bindir) && \
> > $(LN) $(GCC_INSTALL_NAME)$(exeext) 
> > $(target_noncanonical)-gcc-tmp$(exeext) && \
> > --- 3205,3217 
> >  install-driver: installdirs xgcc$(exeext)
> > -rm -f $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext)
> > -$(INSTALL_PROGRAM) xgcc$(exeext) 
> > $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext)
> > !   -if [ "$(GCC_INSTALL_NAME)" != "$(target_noncanonical)-gcc-$(version)" 
> > ]; then \
> > ! -rm -f 
> > $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-$(version)$(exeext) \
> > ! -( cd $(DESTDIR)$(bindir) && \
> > !$(LN) $(GCC_INSTALL_NAME)$(exeext) 
> > $(target_noncanonical)-gcc-$(version)$(exeext) ) \
> > !   fi
> 
> Certainly safer for release like this, but, gotta wonder if we can avoid 
> the ignoring of errors with the added check…

No idea ;)  For my case I ended up without an installed driver as
the rm of course succeeded but the rest not ...

> I’d have to work out why 
> they did that in the first place and run a build and play a bit to be as 
> sure as I’d like to be… but, a cross and a native build I think should 
> test it adequately.

Work out why we install _two_ additional variants!  (or rather why we
install any additional variants to GCC_INSTALL_NAME at all ...).

Anyway, I have now committed the patch.  We can always follow up with
cleanups to this area later, possibly in stage1.

Richard.

Re: Fix various x86 tests for --with-arch=bdver3 --with-cpu=bdver3

2014-04-02 Thread Uros Bizjak
On Wed, Apr 2, 2014 at 12:27 AM, Joseph S. Myers wrote:

> There are other failures this patch does not resolve in a
> --with-arch=bdver3 --with-cpu=bdver3 configuration.  Some of these are
> AVX tests whose failures are not resolved by adding -mno-prefer-avx128
> (and so this patch does not add -mno-prefer-avx128 to those tests);
> others may be cases where -mtune=generic is appropriate but I haven't
> identified the specific tuning parameter that shows code generation
> differences depending on tuning are correct and so a -mtune= option
> should be used.
>
> FAIL: gcc.target/i386/avx2-vpand-1.c scan-assembler vpand[ 
> \\t]+[^\n]*%ymm[0-9]
> FAIL: gcc.target/i386/avx2-vpand-3.c scan-assembler-times vpand[ 
> \\t]+[^\n]*%ymm[0-9] 1
> FAIL: gcc.target/i386/avx2-vpandn-1.c scan-assembler vpandn[ 
> \\t]+[^\n]*%ymm[0-9]
> FAIL: gcc.target/i386/avx2-vpor-1.c scan-assembler vpor[ \\t]+[^\n]*%ymm[0-9]
> FAIL: gcc.target/i386/avx2-vpxor-1.c scan-assembler vpxor[ 
> \\t]+[^\n]*%ymm[0-9]
> FAIL: gcc.target/i386/avx256-unaligned-load-2.c scan-assembler 
> (sse2_loaddqu|vmovdqu[^\n\r]*movv16qi_internal)
> FAIL: gcc.target/i386/avx256-unaligned-load-2.c scan-assembler vinsert.128
> FAIL: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vmovdqa64[ 
> \\t]+%zmm 2
> FAIL: gcc.target/i386/avx512f-vmovdqu32-1.c scan-assembler-times 
> vmovdqu[36][24][ \\t]+[^\n]*\\)[^\n]*%zmm[0-9][^{] 1
> FAIL: gcc.target/i386/avx512f-vmovupd-1.c scan-assembler-times vmovupd[ 
> \\t]+[^\n]*\\)[^\n]*%zmm[0-9][^{] 1
> FAIL: gcc.target/i386/avx512f-vpandd-1.c scan-assembler-times vpandd[ 
> \\t]+[^\n]*%zmm[0-9][^{] 4
> FAIL: gcc.target/i386/avx512f-vpandd-1.c scan-assembler-times vpandd[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
> FAIL: gcc.target/i386/avx512f-vpandd-1.c scan-assembler-times vpandd[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
> FAIL: gcc.target/i386/avx512f-vpandnd-1.c scan-assembler-times vpandnd[ 
> \\t]+[^\n]*%zmm[0-9][^{] 4
> FAIL: gcc.target/i386/avx512f-vpandnd-1.c scan-assembler-times vpandnd[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
> FAIL: gcc.target/i386/avx512f-vpandnd-1.c scan-assembler-times vpandnd[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
> FAIL: gcc.target/i386/avx512f-vpandnq-1.c scan-assembler-times vpandnq[ 
> \\t]+[^\n]*%zmm[0-9][^{] 3
> FAIL: gcc.target/i386/avx512f-vpandnq-1.c scan-assembler-times vpandnq[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
> FAIL: gcc.target/i386/avx512f-vpandnq-1.c scan-assembler-times vpandnq[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
> FAIL: gcc.target/i386/avx512f-vpandq-1.c scan-assembler-times vpandq[ 
> \\t]+[^\n]*%zmm[0-9][^{] 3
> FAIL: gcc.target/i386/avx512f-vpandq-1.c scan-assembler-times vpandq[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
> FAIL: gcc.target/i386/avx512f-vpandq-1.c scan-assembler-times vpandq[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
> FAIL: gcc.target/i386/avx512f-vpord-1.c scan-assembler-times vpord[ 
> \\t]+[^\n]*%zmm[0-9][^{] 4
> FAIL: gcc.target/i386/avx512f-vpord-1.c scan-assembler-times vpord[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
> FAIL: gcc.target/i386/avx512f-vpord-1.c scan-assembler-times vpord[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
> FAIL: gcc.target/i386/avx512f-vporq-1.c scan-assembler-times vporq[ 
> \\t]+[^\n]*%zmm[0-9][^{] 3
> FAIL: gcc.target/i386/avx512f-vporq-1.c scan-assembler-times vporq[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
> FAIL: gcc.target/i386/avx512f-vporq-1.c scan-assembler-times vporq[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
> FAIL: gcc.target/i386/avx512f-vpxord-1.c scan-assembler-times vpxord[ 
> \\t]+[^\n]*%zmm[0-9][^{] 4
> FAIL: gcc.target/i386/avx512f-vpxord-1.c scan-assembler-times vpxord[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
> FAIL: gcc.target/i386/avx512f-vpxord-1.c scan-assembler-times vpxord[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
> FAIL: gcc.target/i386/avx512f-vpxorq-1.c scan-assembler-times vpxorq[ 
> \\t]+[^\n]*%zmm[0-9][^{] 3
> FAIL: gcc.target/i386/avx512f-vpxorq-1.c scan-assembler-times vpxorq[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
> FAIL: gcc.target/i386/avx512f-vpxorq-1.c scan-assembler-times vpxorq[ 
> \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
> FAIL: gcc.target/i386/pr49002-1.c scan-assembler vmovapd[\t ]*[^,]*,[\t ]*%xmm
> FAIL: gcc.target/i386/pr53712.c scan-assembler-times movdqu 1
> FAIL: gcc.target/i386/pr53907.c scan-assembler movdqa
> FAIL: gcc.target/i386/pr59539-1.c scan-assembler-times vmovdqu 1
> FAIL: gcc.target/i386/pr59539-2.c scan-assembler-times vmovdqu 1

These are due to the TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL tuning flag.
Currently, this flag applies to all vector sizes (128, 256 and 512
bits), but I guess it is effective only for 128-bit vectors. Can you
please review the usage of this flag in i386/sse.md?

Thanks,
Uros.
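
As an illustration of the kind of source behind the vpand failures, here is a sketch using GCC's generic vector extension. It is not the actual testcase; the names demo_v4di, demo_and256 and demo_and_lane0 are hypothetical. The assumption is that, with AVX2 enabled and bdver3 tuning, the 256-bit AND may be emitted in a packed-single form (for example vandps) instead of the vpand the scan-assembler pattern expects.

```c
#include <assert.h>

/* 256-bit vector of four 64-bit integers (GCC vector extension).  */
typedef long long demo_v4di __attribute__ ((vector_size (32)));

/* Integer AND on a 256-bit vector.  Operands are passed by pointer to
   avoid ABI warnings when AVX is not enabled.  */
void
demo_and256 (const demo_v4di *a, const demo_v4di *b, demo_v4di *out)
{
  *out = *a & *b;
}

/* Broadcast X and Y, AND the vectors, and return lane 0.  */
long long
demo_and_lane0 (long long x, long long y)
{
  demo_v4di a = { x, x, x, x };
  demo_v4di b = { y, y, y, y };
  demo_v4di r;

  demo_and256 (&a, &b, &r);
  assert (r[0] == r[3]);
  return r[0];
}
```

The result is the same either way; only the chosen instruction mnemonic differs, which is what the scan-assembler checks observe.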


Re: [4.8, PATCH 9/26] Backport Power8 and LE support: ABI call support

2014-04-02 Thread Richard Biener
On Wed, 19 Mar 2014, Bill Schmidt wrote:

> Hi,
> 
> This patch (diff-abi-calls) backports fixes to common code to support
> the new ELFv2 ABI.  Copying Richard and Jakub for these bits.

Ok.

Thanks,
Richard.

> Thanks,
> Bill
> 
> 
> 2014-03-29  Bill Schmidt  
> 
>   Backport from mainline r204798:
> 
>   2013-11-14  Ulrich Weigand  
>   Alan Modra  
> 
>   * function.c (assign_parms): Use all.reg_parm_stack_space instead
>   of re-evaluating REG_PARM_STACK_SPACE target macro.
>   (locate_and_pad_parm): New parameter REG_PARM_STACK_SPACE.  Use it
>   instead of evaluating target macro REG_PARM_STACK_SPACE every time.
>   (assign_parm_find_entry_rtl): Update call.
>   * calls.c (initialize_argument_information): Update call.
>   (emit_library_call_value_1): Likewise.
>   * expr.h (locate_and_pad_parm): Update prototype.
> 
>   Backport from mainline r204797:
> 
>   2013-11-14  Ulrich Weigand  
> 
>   * calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
>   arguments.
> 
>   Backport from mainline r197003:
> 
>   2013-03-23  Eric Botcazou  
> 
>   * calls.c (expand_call): Add missing guard to code handling return
>   of non-BLKmode structures in MSB.
>   * function.c (expand_function_end): Likewise.
> 
> 
> Index: gcc-4_8-branch/gcc/calls.c
> ===
> --- gcc-4_8-branch.orig/gcc/calls.c   2013-12-28 17:41:32.056627059 +0100
> +++ gcc-4_8-branch/gcc/calls.c   2013-12-28 17:50:43.356356135 +0100
> @@ -983,6 +983,7 @@ store_unaligned_arguments_into_pseudos (
>  
>for (i = 0; i < num_actuals; i++)
>  if (args[i].reg != 0 && ! args[i].pass_on_stack
> + && GET_CODE (args[i].reg) != PARALLEL
>   && args[i].mode == BLKmode
>   && MEM_P (args[i].value)
>   && (MEM_ALIGN (args[i].value)
> @@ -1327,6 +1328,7 @@ initialize_argument_information (int num
>  #else
>args[i].reg != 0,
>  #endif
> +  reg_parm_stack_space,
>args[i].pass_on_stack ? 0 : args[i].partial,
>fndecl, args_size, &args[i].locate);
>  #ifdef BLOCK_REG_PADDING
> @@ -3171,7 +3173,9 @@ expand_call (tree exp, rtx target, int i
>group load/store machinery below.  */
>if (!structure_value_addr
> && !pcc_struct_value
> +   && TYPE_MODE (rettype) != VOIDmode
> && TYPE_MODE (rettype) != BLKmode
> +   && REG_P (valreg)
> && targetm.calls.return_in_msb (rettype))
>   {
> if (shift_return_value (TYPE_MODE (rettype), false, valreg))
> @@ -3734,7 +3738,8 @@ emit_library_call_value_1 (int retval, r
>  #else
>  argvec[count].reg != 0,
>  #endif
> -0, NULL_TREE, &args_size, &argvec[count].locate);
> +reg_parm_stack_space, 0,
> +NULL_TREE, &args_size, &argvec[count].locate);
>  
>if (argvec[count].reg == 0 || argvec[count].partial != 0
> || reg_parm_stack_space > 0)
> @@ -3821,7 +3826,7 @@ emit_library_call_value_1 (int retval, r
>  #else
>  argvec[count].reg != 0,
>  #endif
> -argvec[count].partial,
> +reg_parm_stack_space, argvec[count].partial,
>  NULL_TREE, &args_size, &argvec[count].locate);
> args_size.constant += argvec[count].locate.size.constant;
> gcc_assert (!argvec[count].locate.size.var);
> Index: gcc-4_8-branch/gcc/function.c
> ===
> --- gcc-4_8-branch.orig/gcc/function.c 2013-12-28 17:41:32.056627059 +0100
> +++ gcc-4_8-branch/gcc/function.c 2013-12-28 17:50:43.362356165 +0100
> @@ -2507,6 +2507,7 @@ assign_parm_find_entry_rtl (struct assig
>  }
>  
>locate_and_pad_parm (data->promoted_mode, data->passed_type, in_regs,
> +all->reg_parm_stack_space,
>  entry_parm ? data->partial : 0, current_function_decl,
>  &all->stack_args_size, &data->locate);
>  
> @@ -3485,11 +3486,7 @@ assign_parms (tree fndecl)
>/* Adjust function incoming argument size for alignment and
>   minimum length.  */
>  
> -#ifdef REG_PARM_STACK_SPACE
> -  crtl->args.size = MAX (crtl->args.size,
> - REG_PARM_STACK_SPACE (fndecl));
> -#endif
> -
> +  crtl->args.size = MAX (crtl->args.size, all.reg_parm_stack_space);
>crtl->args.size = CEIL_ROUND (crtl->args.size,
>  PARM_BOUNDARY / BITS_PER_UNIT);
>  
> @@ -3693,6 +3690,9 @@ gimplify_parameters (void)
> IN_REGS is nonzero if the argument will be passed in registers.  It will
> never be set if REG_PARM_STACK_SPACE is not defined.
>  
> +   REG_PARM_STACK_SPACE is the number of bytes of stack space reserved
> +   f
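
The assign_parms hunk above replaces the conditional use of the REG_PARM_STACK_SPACE macro by a precomputed value. The arithmetic it performs can be sketched in plain C as follows. This is an illustration only: the function names are hypothetical, the rounding helper mirrors what CEIL_ROUND is understood to do for power-of-two alignments, and the boundary is given in bytes rather than as PARM_BOUNDARY / BITS_PER_UNIT.

```c
#include <assert.h>

/* Round VALUE up to a multiple of ALIGN, which must be a power of two.  */
unsigned
demo_ceil_round (unsigned value, unsigned align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

/* Incoming argument size: at least the reserved register-parameter
   area, then rounded up to the parameter boundary.  */
unsigned
demo_incoming_args_size (unsigned args_size,
                         unsigned reg_parm_stack_space,
                         unsigned boundary_bytes)
{
  unsigned size = args_size > reg_parm_stack_space
                  ? args_size : reg_parm_stack_space;

  return demo_ceil_round (size, boundary_bytes);
}
```

For example, 20 bytes of arguments with a 64-byte reserved area and an 8-byte boundary give 64, while 70 bytes with a 16-byte boundary give 80.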

Re: [4.8, PATCH 15/26] Backport Power8 and LE support: PR54537

2014-04-02 Thread Richard Biener
On Wed, 19 Mar 2014, Bill Schmidt wrote:

> Hi,
> 
> This patch (diff-pr54537) backports a fix for PR54537 which is unrelated
> but necessary.  Copying Richard and Jakub for the common code.

Ok.

Thanks,
Richard.

> Thanks,
> Bill
> 
> 
> [libstdc++-v3]
> 
> 2014-03-29  Bill Schmidt  
> 
> Backport from mainline
>   2013-08-01  Fabien Chêne  
> 
>   PR c++/54537
>   * include/tr1/cmath: Remove pow(double,double) overload, remove a
>   duplicated comment about DR 550. Add a comment to explain the issue.
>   * testsuite/tr1/8_c_compatibility/cmath/pow_cmath.cc: New.
> 
> [gcc/cp]
> 
> 2014-03-29  Bill Schmidt  
> 
>   Back port from mainline
>   2013-08-01  Fabien Chêne  
> 
>   PR c++/54537
>   * cp-tree.h: Check OVL_USED with OVERLOAD_CHECK.
>   * name-lookup.c (do_nonmember_using_decl): Make sure we have an
>   OVERLOAD before calling OVL_USED. Call diagnose_name_conflict
>   instead of issuing an error without mentioning the conflicting
>   declaration.
> 
> [gcc/testsuite]
> 
> 2014-03-29  Bill Schmidt  
> 
>   Back port from mainline
>   2013-08-01  Fabien Chêne  
>   Peter Bergner  
> 
>   PR c++/54537
>   * g++.dg/overload/using3.C: New.
>   * g++.dg/overload/using2.C: Adjust.
>   * g++.dg/lookup/using9.C: Likewise.
> 
> 
> Index: gcc-4_8-test/gcc/cp/cp-tree.h
> ===
> --- gcc-4_8-test.orig/gcc/cp/cp-tree.h
> +++ gcc-4_8-test/gcc/cp/cp-tree.h
> @@ -331,7 +331,7 @@ typedef struct ptrmem_cst * ptrmem_cst_t
>  /* If set, this was imported in a using declaration.
> This is not to confuse with being used somewhere, which
> is not important for this node.  */
> -#define OVL_USED(NODE)   TREE_USED (NODE)
> +#define OVL_USED(NODE)   TREE_USED (OVERLOAD_CHECK (NODE))
>  /* If set, this OVERLOAD was created for argument-dependent lookup
> and can be freed afterward.  */
>  #define OVL_ARG_DEPENDENT(NODE) TREE_LANG_FLAG_0 (OVERLOAD_CHECK (NODE))
> Index: gcc-4_8-test/gcc/cp/name-lookup.c
> ===
> --- gcc-4_8-test.orig/gcc/cp/name-lookup.c
> +++ gcc-4_8-test/gcc/cp/name-lookup.c
> @@ -2286,8 +2286,7 @@ push_overloaded_decl_1 (tree decl, int f
> && compparms (TYPE_ARG_TYPES (TREE_TYPE (fn)),
>   TYPE_ARG_TYPES (TREE_TYPE (decl)))
> && ! decls_match (fn, decl))
> - error ("%q#D conflicts with previous using declaration %q#D",
> -decl, fn);
> + diagnose_name_conflict (decl, fn);
>  
> dup = duplicate_decls (decl, fn, is_friend);
> /* If DECL was a redeclaration of FN -- even an invalid
> @@ -2519,7 +2518,7 @@ do_nonmember_using_decl (tree scope, tre
> if (new_fn == old_fn)
>   /* The function already exists in the current namespace.  */
>   break;
> -   else if (OVL_USED (tmp1))
> +   else if (TREE_CODE (tmp1) == OVERLOAD && OVL_USED (tmp1))
>   continue; /* this is a using decl */
> else if (compparms (TYPE_ARG_TYPES (TREE_TYPE (new_fn)),
> TYPE_ARG_TYPES (TREE_TYPE (old_fn
> @@ -2534,7 +2533,7 @@ do_nonmember_using_decl (tree scope, tre
>   break;
> else
>   {
> -   error ("%qD is already declared in this scope", name);
> +   diagnose_name_conflict (new_fn, old_fn);
> break;
>   }
>   }
> Index: gcc-4_8-test/gcc/testsuite/g++.dg/lookup/using9.C
> ===
> --- gcc-4_8-test.orig/gcc/testsuite/g++.dg/lookup/using9.C
> +++ gcc-4_8-test/gcc/testsuite/g++.dg/lookup/using9.C
> @@ -21,11 +21,11 @@ void h()
>f('h');
>f(1); // { dg-error "ambiguous" }
>// { dg-message "candidate" "candidate note" { target *-*-* } 22 }
> -  void f(int);  // { dg-error "previous using declaration" }
> +  void f(int);  // { dg-error "previous declaration" }
>  }
>  
>  void m()
>  {
>void f(int);
> -  using B::f;   // { dg-error "already declared" }
> +  using B::f;   // { dg-error "previous declaration" }
>  }
> Index: gcc-4_8-test/gcc/testsuite/g++.dg/overload/using2.C
> ===
> --- gcc-4_8-test.orig/gcc/testsuite/g++.dg/overload/using2.C
> +++ gcc-4_8-test/gcc/testsuite/g++.dg/overload/using2.C
> @@ -45,7 +45,7 @@ using std::C1;
>extern "C" void exit (int) throw ();
>    extern "C" void *malloc (__SIZE_TYPE__) throw () __attribute__((malloc));
>  
> -  void abort (void) throw ();
> +  void abort (void) throw (); // { dg-message "previous" }
>void _exit (int) throw (); // { dg-error "conflicts" "conflic

Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 2:54 AM, Thomas Preud'homme wrote:
> I took the lack of answer for this patch as an indication that the patch is
> too big. This is the first patch in a series of three. Its purpose is to
> create some new effective targets for architectures having byte swap
> instructions and make use of them in the existing byte swap tests. One
> effective target is created for each size (16, 32 and 64) as not all
> architectures support byte swap of all sizes.

Sorry, I simply queued it in my review queue for stage1 ... it's definitely
something that was high on my wish-list (including also using
general vector shuffles if available to support even more patterns).

Still on the queue, stay tuned ;)

Richard.

> Here is the gcc/testsuite/ChangeLog entry:
>
> 2014-04-01  Thomas Preud'homme  
>
> * lib/target-supports.exp: New effective targets for architectures
> capable of performing byte swap.
> * gcc.dg/optimize-bswapdi-1.c: Convert to new bswap target.
> * gcc.dg/optimize-bswapdi-2.c: Likewise.
> * gcc.dg/optimize-bswapsi-1.c: Likewise.
>
> The patch is attached to this email. Is this ok for stage1?
>
> Best regards,
>
> Thomas
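
For context, these are the two source patterns the PR title refers to: an endian-independent load assembled from individual bytes, and a hand-written byte swap. The sketch below is generic C rather than the contents of the optimize-bswap*.c tests; the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Endian-independent load: the result does not depend on host byte
   order.  An optimizer may turn this into a single load, plus a byte
   swap on little-endian hosts.  */
uint32_t
demo_load_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Hand-written 32-bit byte swap, a candidate for one bswap instruction
   where the target has one.  */
uint32_t
demo_bswap32 (uint32_t x)
{
  return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8)
         | ((x & 0x00ff0000u) >> 8) | ((x & 0xff000000u) >> 24);
}

/* Swapping twice must give back the original value.  */
void
demo_check_involution (uint32_t x)
{
  assert (demo_bswap32 (demo_bswap32 (x)) == x);
}
```

Whether the 32-bit form is actually recognised depends on the target providing a byte swap of that size, which is what the per-size effective targets are meant to express.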


Re: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 9:04 AM, Thomas Preud'homme wrote:
>> From: Marc Glisse [mailto:marc.gli...@inria.fr]
>>
>> Uh? It does fold a+1-a for me. What it doesn't do is look through the
>> definition of b in b-a. Richard+GSoC will supposedly soon provide a
>> function that does that.
>
> Oh right, it's a bit more complex here since the array index is converted
> to an offset first. So the operation is more like:  ((a+1)*cst) - (a*cst).
> Any chance this might be handled at some point? Note that this might
> not be very frequent so it's not very important for this patch.

"More like" isn't enough to answer this - do you have a testcase?  (usually
these end up in undefined-overflow and/or conversion-to-sizetype issues)

Richard.
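
One way a conversion can get in the way of the fold, as a sketch under stated assumptions: the index is a 32-bit unsigned value that is widened to a 64-bit offset type only after the increment. The function name is hypothetical and this is not a testcase from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* ((a+1)*cst) - (a*cst), where the index arithmetic is done in 32 bits
   and the multiplication in 64 bits.  */
uint64_t
demo_widened_distance (uint32_t a, uint64_t cst)
{
  uint32_t next = a + 1;	/* wraps to 0 when a == UINT32_MAX */

  return (uint64_t) next * cst - (uint64_t) a * cst;
}

/* The fold to CST is only valid when the narrow increment cannot wrap.  */
void
demo_check_no_wrap (uint32_t a, uint64_t cst)
{
  assert (a == UINT32_MAX || demo_widened_distance (a, cst) == cst);
}
```

For a == UINT32_MAX the result is not cst, so a folder has to either prove that the increment does not wrap or rely on undefined overflow in the signed case.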

> Thanks for the comment.
>
> Best regards,
>
> Thomas
>
>


RE: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Thomas Preud'homme
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> 
> Sorry, I simply queued it in my review queue for stage1 ... it's definitely
> something that was high on my wish-list (including also using
> general vector shuffles if available to support even more patterns).

Oh great. Anyway, having it split in 3 parts will ease the review for you.

Thanks.

Thomas





Re: [gomp4] Add tables generation

2014-04-02 Thread Thomas Schwinge
Hi!

On Thu, 20 Mar 2014 17:50:13 +0100, Bernd Schmidt wrote:
> This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
> adds functionality to build function/variable tables that will allow 
> libgomp to look up offload target code based on the address of the 
> corresponding host function. There are two alternatives, one based on 
> named sections, and one based on a target hook when named sections are 
> unavailable (as on ptx).
> 
> Committed on gomp-4_0-branch.

> --- gcc/omp-low.c (revision 208706)
> +++ gcc/omp-low.c (working copy)
> @@ -8671,19 +8672,22 @@ expand_omp_target (struct omp_region *re
>  }
>  
>gimple g;
> -  /* FIXME: This will be address of
> - extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
> - symbol, as soon as the linker plugin is able to create it for us.  */
> -  tree openmp_target = build_zero_cst (ptr_type_node);
> +  tree openmp_target
> += build_decl (UNKNOWN_LOCATION, VAR_DECL,
> +   get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
> +  TREE_PUBLIC (openmp_target) = 1;
> +  DECL_EXTERNAL (openmp_target) = 1;
>if (kind == GF_OMP_TARGET_KIND_REGION)
>  {
>tree fnaddr = build_fold_addr_expr (child_fn);
> -  g = gimple_build_call (builtin_decl_explicit (start_ix), 7,
> -  device, fnaddr, openmp_target, t1, t2, t3, t4);
> +  g = gimple_build_call (builtin_decl_explicit (start_ix), 7, device,
> +  fnaddr, build_fold_addr_expr (openmp_target),
> +  t1, t2, t3, t4);
>  }
>else
> -g = gimple_build_call (builtin_decl_explicit (start_ix), 6,
> -device, openmp_target, t1, t2, t3, t4);
> +g = gimple_build_call (builtin_decl_explicit (start_ix), 6, device,
> +build_fold_addr_expr (openmp_target),
> +t1, t2, t3, t4);

Committed in r209013:

commit 1f54e08135bd8be59438977b4edbc102e7cef2d7
Author: tschwinge 
Date:   Wed Apr 2 08:28:54 2014 +

Handle __OPENMP_TARGET__ symbol for OpenACC offloading functions, too.

gcc/
* omp-low.c (expand_oacc_offload): Handle __OPENMP_TARGET__
symbol.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@209013 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  5 +
 gcc/omp-low.c  | 14 --
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 1d35b58..8983632 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-04-02  Thomas Schwinge  
+
+   * omp-low.c (expand_oacc_offload): Handle __OPENMP_TARGET__
+   symbol.
+
 2014-03-20  Thomas Schwinge  
 
* gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_OACC_LOOP.
diff --git gcc/omp-low.c gcc/omp-low.c
index a7b93bc..01eda9d 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -5138,13 +5138,15 @@ expand_oacc_offload (struct omp_region *region)
 }
 
   gimple g;
-  /* FIXME: This will be address of
- extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
- symbol, as soon as the linker plugin is able to create it for us.  */
-  tree openmp_target = build_zero_cst (ptr_type_node);
+  tree openmp_target
+= build_decl (UNKNOWN_LOCATION, VAR_DECL,
+ get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
+  TREE_PUBLIC (openmp_target) = 1;
+  DECL_EXTERNAL (openmp_target) = 1;
   tree fnaddr = build_fold_addr_expr (child_fn);
-  g = gimple_build_call (builtin_decl_explicit (start_ix),
-10, device, fnaddr, openmp_target, t1, t2, t3, t4,
+  g = gimple_build_call (builtin_decl_explicit (start_ix), 10, device,
+fnaddr, build_fold_addr_expr (openmp_target),
+t1, t2, t3, t4,
 t_num_gangs, t_num_workers, t_vector_length);
   gimple_set_location (g, gimple_location (entry_stmt));
   gsi_insert_before (&gsi, g, GSI_SAME_STMT);


> +/* Create new symbol containing (address, size) pairs for omp-marked
> +   functions and global variables.  */
> +void
> +omp_finish_file (void)
> +{
> +  struct cgraph_node *node;
> +  struct varpool_node *vnode;
> +  const char *funcs_section_name = ".offload_func_table_section";
> +  const char *vars_section_name = ".offload_var_table_section";
> +  vec *v_funcs, *v_vars;
> +
> +  vec_alloc (v_vars, 0);
> +  vec_alloc (v_funcs, 0);
> +
> +  [...]
> +  unsigned num_vars = vec_safe_length (v_vars);
> +  unsigned num_funcs = vec_safe_length (v_funcs);
> +  [...]
> +  if (targetm_common.have_named_sections)
> +{
> +  [...]
> +   }
> +  else
> +{
> +  for (unsigned i = 0; i < num_funcs; i++)
> + {
> +   tree it = (*v_funcs)[i];
> +   targetm.record_offload_symbol (it);
> + }  
> +  for (unsigned i = 0; i < num_funcs; i++)
> + {
> +   tree it = (*v_vars)[i];
> +   targetm.record_offloa
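
The description quoted at the top says the tables let libgomp look up offload target code from the address of the corresponding host function. A much simplified sketch of such a lookup follows. It assumes a plain array of host function addresses; the real tables hold (address, size) pairs in dedicated sections, and every name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_host_fn) (void);

static int demo_calls;

/* Stand-ins for host versions of offloaded functions.  */
void demo_kernel_a (void) { demo_calls += 1; }
void demo_kernel_b (void) { demo_calls += 2; }
/* A function that has no entry in the table.  */
void demo_not_offloaded (void) { demo_calls += 4; }

/* Stand-in for the function table emitted by the compiler.  */
static demo_host_fn const demo_func_table[] = { demo_kernel_a, demo_kernel_b };

#define DEMO_NUM_FUNCS (sizeof demo_func_table / sizeof demo_func_table[0])

/* Return the table index of FN, or -1 if FN has no entry.  */
ptrdiff_t
demo_index_of (demo_host_fn fn)
{
  for (size_t i = 0; i < DEMO_NUM_FUNCS; i++)
    if (demo_func_table[i] == fn)
      return (ptrdiff_t) i;
  return -1;
}

/* A runtime would use the index to pick the target-side code; here the
   host function is simply called.  */
void
demo_launch (demo_host_fn fn)
{
  assert (demo_index_of (fn) >= 0);
  fn ();
}
```

The index found on the host side is what would identify the matching entry in the table built for the offload target, provided both tables list the functions in the same order.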

Re: [gomp4] Add tables generation

2014-04-02 Thread Thomas Schwinge
Hi!

On Wed, 02 Apr 2014 09:34:29 +0200, I wrote:
> On Thu, 20 Mar 2014 17:50:13 +0100, Bernd Schmidt wrote:
> > This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
> > adds functionality to build function/variable tables that will allow 
> > libgomp to look up offload target code based on the address of the 
> > corresponding host function. There are two alternatives, one based on 
> > named sections, and one based on a target hook when named sections are 
> > unavailable (as on ptx).
> > 
> > Committed on gomp-4_0-branch.
> 
> I see regressions in the libgomp testsuite for configurations where
> offloading is not enabled:
> 
> spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ 
> [...]/source/libgomp/testsuite/libgomp.c/for-3.c 
> -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ 
> -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs 
> -I[...]/build/x86_64-unknown-linux-gnu/./libgomp 
> -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 
> -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 
> -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o 
> ./for-3.exe
> /tmp/ccGnT0ei.o: In function `main':
> for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
> collect2: error: ld returned 1 exit status
> 
> I suppose that's because [...]

Workaround committed in r209015:

commit 6a015f81a5fafe32cf45656e3de121f4088dbf41
Author: tschwinge 
Date:   Wed Apr 2 08:29:17 2014 +

Work around __OPENMP_TARGET__ not being defined for !ENABLE_OFFLOADING.

libgcc/
* crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
NULL.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@209015 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgcc/ChangeLog.gomp | 10 ++
 libgcc/crtstuff.c |  2 ++
 2 files changed, 12 insertions(+)

diff --git libgcc/ChangeLog.gomp libgcc/ChangeLog.gomp
new file mode 100644
index 000..7d08efa
--- /dev/null
+++ libgcc/ChangeLog.gomp
@@ -0,0 +1,10 @@
+2014-04-02  Thomas Schwinge  
+
+   * crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
+   NULL.
+
+Copyright (C) 2014 Free Software Foundation, Inc.
+
+Copying and distribution of this file, with or without modification,
+are permitted in any medium without royalty provided the copyright
+notice and this notice are preserved.
diff --git libgcc/crtstuff.c libgcc/crtstuff.c
index cda0bae..79af7f0 100644
--- libgcc/crtstuff.c
+++ libgcc/crtstuff.c
@@ -775,6 +775,8 @@ void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ 
("protected"))) =
   &_omp_func_table, &_omp_funcs_end,
   &_omp_var_table, &_omp_vars_end
 };
+#else
+void **__OPENMP_TARGET__ __attribute__ ((__visibility__ ("protected"))) = NULL;
 #endif
 


> Also, I'd suggest to rename __OPENMP_TARGET__ (and similar ones) to
> __GNU_OFFLOAD__ (or similar).  As we're using this offloading stuff for
> both OpenACC and OpenMP target, it makes sense to me to use a generic
> name; we still have the chance to do so now while this stuff is not yet
> in trunk.


Grüße,
 Thomas




Re: [PATCH][LTO] Rework -flto-partition=, add =one case

2014-04-02 Thread Richard Biener
On Tue, 1 Apr 2014, Jan Hubicka wrote:

> > 
> > This reworks the option to use the Enum support we have now and
> > adds a =one case (to eventually get rid of one LTO operation mode,
> > =none ...).  I was tempted to support -flto-partition=
> > and get rid of --param lto-partitions (thereby also supporting =1),
> 
> Yep, I prefer to have one switch to choose the algorithm and another to set
> its parameter, as you do now. At the moment partitioning is quite a non-issue
> since the only important IPA passes work on the whole thing, but that may change and
> we may want to play with different partitionings.
> (I have plans for that for incremental compilation and other things)

Well, partitioning is important to get a parallel build.

> > but that param specifies the maximum number of partitions and
> > still uses the balanced algorithm, thus the result would be
> > confusing (and of little use I suppose, as opposed to =1 which should
> > give you the same answer as =none).
> 
> =none still seems somewhat useful - for setups where you do multiple parallel
> compilations it will be faster than WHOPR and it helps developing IPA passes
> since you do not need to worry about WHOPR complexities at start.

True, but as it ends up eating more memory your multiple parallel
compilations may in the end be slower if they run into swap ;)

And you can do simple IPA passes just where IPA-PTA sits now - at LTRANS
level.

> But with the code to bring in function bodies on demand, this is less important.
> I believe that with the pass manager being a bit more flexible, the code paths can be
> almost completely shared. I have a few patches on this and on the pass queue reorg for
> next stage1, so I will try to push them out.

Yeah, it would be nice to make the flow of compilation somewhat more
obvious than it is now ...

Richard.


[PATCH] Remove stale declaration

2014-04-02 Thread Marek Polacek
I noticed that we declare this function, but its definition was
removed in 2009 by P. Bonzini, thus the decl serves no purpose.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2014-04-02  Marek Polacek  

* c-common.h (c_expand_expr): Remove declaration.

diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
index 1099b10..24959d8 100644
--- gcc/c-family/c-common.h
+++ gcc/c-family/c-common.h
@@ -928,8 +928,6 @@ extern bool vector_targets_convertible_p (const_tree t1, 
const_tree t2);
 extern bool vector_types_convertible_p (const_tree t1, const_tree t2, bool 
emit_lax_note);
 extern tree c_build_vec_perm_expr (location_t, tree, tree, tree, bool = true);
 
-extern rtx c_expand_expr (tree, rtx, enum machine_mode, int, rtx *);
-
 extern void init_c_lex (void);
 
 extern void c_cpp_builtins (cpp_reader *);

Marek


Re: [PATCH] Remove stale declaration

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 12:36 PM, Marek Polacek  wrote:
> I noticed that we declare this function, but its definition was
> removed in 2009 by P. Bonzini, thus the decl serves no purpose.
>
> Regtested/bootstrapped on x86_64-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2014-04-02  Marek Polacek  
>
> * c-common.h (c_expand_expr): Remove declaration.
>
> diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
> index 1099b10..24959d8 100644
> --- gcc/c-family/c-common.h
> +++ gcc/c-family/c-common.h
> @@ -928,8 +928,6 @@ extern bool vector_targets_convertible_p (const_tree t1, 
> const_tree t2);
>  extern bool vector_types_convertible_p (const_tree t1, const_tree t2, bool 
> emit_lax_note);
>  extern tree c_build_vec_perm_expr (location_t, tree, tree, tree, bool = 
> true);
>
> -extern rtx c_expand_expr (tree, rtx, enum machine_mode, int, rtx *);
> -
>  extern void init_c_lex (void);
>
>  extern void c_cpp_builtins (cpp_reader *);
>
> Marek


Re: [committed, libjava] XFAIL sourcelocation (PR libgcj/55637) backported to 4.8.3

2014-04-02 Thread Rainer Orth
domi...@lps.ens.fr (Dominique Dhumieres) writes:

> r...@cebitec.uni-bielefeld.de (Rainer Orth) wrote:
>> Sure, patch preapproved.
>
> Committed as r208983:
>
> 2014-04-01  Dominique d'Humieres 
> Rainer Orth  
>
> PR libgcj/55637
> * testsuite/libjava.lang/sourcelocation.xfail: New file.

Btw, the customary format for such a ChangeLog entry is

2014-04-01  Dominique d'Humieres 

Backport from mainline
2014-02-20  Rainer Orth  

PR libgcj/55637
* testsuite/libjava.lang/sourcelocation.xfail: New file.

This way, you can easily see when the original went in.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, ARM] Enable tail call optimization for long call

2014-04-02 Thread Jiong Wang


On 25/03/14 15:44, Richard Earnshaw wrote:

On 24/03/14 11:26, Jiong Wang wrote:

This patch enables tail call optimization for long call on arm.

Previously the check in arm_function_ok_for_sibcall was too strict, and the
sibcall/sibcall_value expanders lacked support for long calls, so tail-call
opportunities for long calls were lost.

OK for next stage 1?


I think this is OK for EABI targets (since we can rely on the linker
generating the right form of interworking veneer), but I'm less certain
about other systems (do we still support COFF?).

I think I'd prefer the patch to factor in TARGET_AAPCS_BASED and to
assume that if that is true then arbitrary tail-calls are safe.


Hi Richard,

 IMHO, since this is a tail call optimization, we just need to make
sure that the register which holds the address is caller-saved; then it will be OK.


 I have updated the change log to fix the "aarch64" typo.  The patch itself is
unchanged, but I enclose it in this reply for completeness.


 So, is it ok for next stage-1?

 Thanks.

--
Jiong


gcc/
   * config/arm/predicates.md (call_insn_operand): Add long_call check.
   * config/arm/arm.md (sibcall, sibcall_value): Force the address to reg
   for long_call.
   * config/arm/arm.c (arm_function_ok_for_sibcall): Remove long_call
   restriction.

gcc/testsuite
   * gcc.target/arm/tail-long-call.c: New test.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d5f9ff3..8dcdfa8 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6087,11 +6087,6 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
   if (TARGET_VXWORKS_RTP && flag_pic && !targetm.binds_local_p (decl))
 return false;
 
-  /* Cannot tail-call to long calls, since these are out of range of
- a branch instruction.  */
-  if (decl && arm_is_long_call_p (decl))
-return false;
-
   /* If we are interworking and the function is not declared static
  then we can't tail-call it unless we know that it exists in this
  compilation unit (since it might be a Thumb routine).  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 2ddda02..fe285f0 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -9444,8 +9444,10 @@
   "TARGET_32BIT"
   "
   {
-if (!REG_P (XEXP (operands[0], 0))
-   && (GET_CODE (XEXP (operands[0], 0)) != SYMBOL_REF))
+if ((!REG_P (XEXP (operands[0], 0))
+	 && GET_CODE (XEXP (operands[0], 0)) != SYMBOL_REF)
+	|| (GET_CODE (XEXP (operands[0], 0)) == SYMBOL_REF
+	&& arm_is_long_call_p (SYMBOL_REF_DECL (XEXP (operands[0], 0)
  XEXP (operands[0], 0) = force_reg (SImode, XEXP (operands[0], 0));
 
 if (operands[2] == NULL_RTX)
@@ -9462,8 +9464,10 @@
   "TARGET_32BIT"
   "
   {
-if (!REG_P (XEXP (operands[1], 0)) &&
-   (GET_CODE (XEXP (operands[1],0)) != SYMBOL_REF))
+if ((!REG_P (XEXP (operands[1], 0))
+	 && GET_CODE (XEXP (operands[1], 0)) != SYMBOL_REF)
+	|| (GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF
+	&& arm_is_long_call_p (SYMBOL_REF_DECL (XEXP (operands[1], 0)
  XEXP (operands[1], 0) = force_reg (SImode, XEXP (operands[1], 0));
 
 if (operands[3] == NULL_RTX)
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index ce5c9a8..3673343 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -673,5 +673,6 @@
(match_code "reg" "0")))
 
 (define_predicate "call_insn_operand"
-  (ior (match_code "symbol_ref")
+  (ior (and (match_code "symbol_ref")
+	(match_test "!arm_is_long_call_p (SYMBOL_REF_DECL (op))"))
(match_operand 0 "s_register_operand")))
diff --git a/gcc/testsuite/gcc.target/arm/tail-long-call.c b/gcc/testsuite/gcc.target/arm/tail-long-call.c
new file mode 100644
index 000..9b27468
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/tail-long-call.c
@@ -0,0 +1,12 @@
+/* { dg-skip-if "need at least armv5te" { *-*-* } { "-march=armv[234]*" "-mthumb" } { "" } } */
+/* { dg-options "-O2 -march=armv5te -marm" } */
+/* { dg-final { scan-assembler "bx" } } */
+/* { dg-final { scan-assembler-not "blx" } } */
+
+int lcal (int) __attribute__ ((long_call));
+
+int
+dec (int a)
+{
+  return lcal (a);
+}

Re: [PATCH][AARCH64] Support tail indirect function call

2014-04-02 Thread Jiong Wang

^Ping...

Regards,
Jiong

On 18/03/14 14:13, Jiong Wang wrote:

Currently, indirect function calls prevent tail-call optimization on AArch64.

This patch adapts the fix for PR arm/19599 to AArch64.

Is it ok for next stage 1?

Thanks.

-- Jiong

gcc/

  * config/aarch64/predicates.md (aarch64_call_insn_operand): New
  predicate.
  * config/aarch64/constraints.md ("Ucs", "Usf"): New constraints.
  * config/aarch64/aarch64.md (*sibcall_insn, *sibcall_value_insn):
  Adjust for tailcalling through registers.
  * config/aarch64/aarch64.h (enum reg_class): New caller save
  register class.
  (REG_CLASS_NAMES): Likewise.
  (REG_CLASS_CONTENTS): Likewise.
  * config/aarch64/aarch64.c (aarch64_function_ok_for_sibcall): Allow
  tailcalling without decls.

gcc/testsuite

  * gcc.target/aarch64/tail-indirect-call.c: New test.


--
Jiong




[PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Markus Trippelsdorf
It is a common mistake to enable both -flto and -fprofile-generate when
building projects. This is not a good idea, because memory use will
skyrocket due to instrumentation. So just warn the user.

OK for next stage1?

2014-04-02  Markus Trippelsdorf  

* common.opt (fprofile-generate): Add flag.
* opts.c (finish_options): Add new warning.
(common_handle_option): Set flag.

diff --git a/gcc/common.opt b/gcc/common.opt
index 62c72f0d2fbf..61e9adfa0df5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
 Enable correction of flow inconsistent profile data input
 
 fprofile-generate
-Common
+Common Var(flag_profile_generate)
 Enable common options for generating profile info for profile feedback 
directed optimizations
 
 fprofile-generate=
diff --git a/gcc/opts.c b/gcc/opts.c
index fdc903f9271a..b62a0d626d94 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct 
gcc_options *opts_set,
error_at (loc, "only one -flto-partition value can be specified");
 }
 
+  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
+warning_at (loc, 0, "Enabling both -fprofile-generate and -flto is a bad 
idea.");
+
   /* We initialize opts->x_flag_split_stack to -1 so that targets can set a
  default value if they choose based on other options.  */
   if (opts->x_flag_split_stack == -1)
@@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,
 
 case OPT_fprofile_generate_:
   opts->x_profile_data_prefix = xstrdup (arg);
+  opts->x_flag_profile_generate = true;
   value = true;
   /* No break here - do -fprofile-generate processing. */
 case OPT_fprofile_generate:
-- 
Markus


Re: [PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Marek Polacek
On Wed, Apr 02, 2014 at 01:50:31PM +0200, Markus Trippelsdorf wrote:
> +  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
> +warning_at (loc, 0, "Enabling both -fprofile-generate and -flto is a bad 
> idea.");

s/Enabling/enabling/ + no dot at the end.

Marek


[PATCHv2][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Markus Trippelsdorf
It is a common mistake to enable both -flto and -fprofile-generate when
building projects. This is not a good idea, because memory use will
skyrocket due to instrumentation. So just warn the user.

OK for next stage1?

2014-04-02  Markus Trippelsdorf  

* common.opt (fprofile-generate): Add flag.
* opts.c (finish_options): Add new warning.
(common_handle_option): Set flag.

diff --git a/gcc/common.opt b/gcc/common.opt
index 62c72f0d2fbf..61e9adfa0df5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
 Enable correction of flow inconsistent profile data input
 
 fprofile-generate
-Common
+Common Var(flag_profile_generate)
 Enable common options for generating profile info for profile feedback 
directed optimizations
 
 fprofile-generate=
diff --git a/gcc/opts.c b/gcc/opts.c
index fdc903f9271a..581d2e948483 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct 
gcc_options *opts_set,
error_at (loc, "only one -flto-partition value can be specified");
 }
 
+  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
+warning_at (loc, 0, "enabling both -fprofile-generate and -flto is a bad 
idea");
+
   /* We initialize opts->x_flag_split_stack to -1 so that targets can set a
  default value if they choose based on other options.  */
   if (opts->x_flag_split_stack == -1)
@@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,
 
 case OPT_fprofile_generate_:
   opts->x_profile_data_prefix = xstrdup (arg);
+  opts->x_flag_profile_generate = true;
   value = true;
   /* No break here - do -fprofile-generate processing. */
 case OPT_fprofile_generate:
-- 
Markus


Re: [PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 1:50 PM, Markus Trippelsdorf
 wrote:
> It is a common mistake to enable both -flto and -fprofile-generate when
> building projects. This is not a good idea, because memory use will
> skyrocket due to instrumentation. So just warn the user.
>
> OK for next stage1?

I'd rather see if we can fix the underlying issue.  For example as we
are now instrumenting as IPA pass we can allocate a single
counter array (if the number of global vars is the issue).  Basically
split analysis and instrumentation into two phases for that.

Or even better, do profile instrumentation as "real" IPA pass.

Richard.

> 2014-04-02  Markus Trippelsdorf  
>
> * common.opt (fprofile-generate): Add flag.
> * opts.c (finish_options): Add new warning.
> (common_handle_option): Set flag.
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 62c72f0d2fbf..61e9adfa0df5 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
>  Enable correction of flow inconsistent profile data input
>
>  fprofile-generate
> -Common
> +Common Var(flag_profile_generate)
>  Enable common options for generating profile info for profile feedback 
> directed optimizations
>
>  fprofile-generate=
> diff --git a/gcc/opts.c b/gcc/opts.c
> index fdc903f9271a..b62a0d626d94 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct 
> gcc_options *opts_set,
> error_at (loc, "only one -flto-partition value can be specified");
>  }
>
> +  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
> +warning_at (loc, 0, "Enabling both -fprofile-generate and -flto is a bad 
> idea.");
> +
>/* We initialize opts->x_flag_split_stack to -1 so that targets can set a
>   default value if they choose based on other options.  */
>if (opts->x_flag_split_stack == -1)
> @@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,
>
>  case OPT_fprofile_generate_:
>opts->x_profile_data_prefix = xstrdup (arg);
> +  opts->x_flag_profile_generate = true;
>value = true;
>/* No break here - do -fprofile-generate processing. */
>  case OPT_fprofile_generate:
> --
> Markus


Re: [PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 2:07 PM, Richard Biener
 wrote:
> On Wed, Apr 2, 2014 at 1:50 PM, Markus Trippelsdorf
>  wrote:
>> It is a common mistake to enable both -flto and -fprofile-generate when
>> building projects. This is not a good idea, because memory use will
>> skyrocket due to instrumentation. So just warn the user.
>>
>> OK for next stage1?
>
> I'd rather see if we can fix the underlying issue.  For example as we
> are now instrumenting as IPA pass we can allocate a single
> counter array (if the number of global vars is the issue).  Basically
> split analysis and instrumentation into two phases for that.
>
> Or even better, do profile instrumentation as "real" IPA pass.

So isn't -coverage also facing the same issue?  That is, does
-fprofile-arcs alone already cause it, or only one of the value profiling pieces?

Richard.

> Richard.
>
>> 2014-04-02  Markus Trippelsdorf  
>>
>> * common.opt (fprofile-generate): Add flag.
>> * opts.c (finish_options): Add new warning.
>> (common_handle_option): Set flag.
>>
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 62c72f0d2fbf..61e9adfa0df5 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
>>  Enable correction of flow inconsistent profile data input
>>
>>  fprofile-generate
>> -Common
>> +Common Var(flag_profile_generate)
>>  Enable common options for generating profile info for profile feedback 
>> directed optimizations
>>
>>  fprofile-generate=
>> diff --git a/gcc/opts.c b/gcc/opts.c
>> index fdc903f9271a..b62a0d626d94 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct 
>> gcc_options *opts_set,
>> error_at (loc, "only one -flto-partition value can be specified");
>>  }
>>
>> +  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
>> +warning_at (loc, 0, "Enabling both -fprofile-generate and -flto is a 
>> bad idea.");
>> +
>>/* We initialize opts->x_flag_split_stack to -1 so that targets can set a
>>   default value if they choose based on other options.  */
>>if (opts->x_flag_split_stack == -1)
>> @@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,
>>
>>  case OPT_fprofile_generate_:
>>opts->x_profile_data_prefix = xstrdup (arg);
>> +  opts->x_flag_profile_generate = true;
>>value = true;
>>/* No break here - do -fprofile-generate processing. */
>>  case OPT_fprofile_generate:
>> --
>> Markus


Re: [PATCHv2][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Rainer Orth
Markus Trippelsdorf  writes:

> diff --git a/gcc/opts.c b/gcc/opts.c
> index fdc903f9271a..581d2e948483 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct 
> gcc_options *opts_set,
>   error_at (loc, "only one -flto-partition value can be specified");
>  }
>  
> +  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
> +warning_at (loc, 0, "enabling both -fprofile-generate and -flto is a bad 
> idea");

This warning is not very helpful in this form.  Rather say something
like `causes excessive memory consumption' if this is the problem.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Simple enhancements to dumping in ipa.c and ipa-cp.c

2014-04-02 Thread Martin Jambor
Hi,

recently I've been looking into a number of bugs involving
symtab_remove_unreachable_nodes in one way or another and I have
always started by applying the hunk below.  I did this because
distinguishing different symbol nodes only according to their names is
just so inconvenient, especially when compiling C++.  The risk is
minimal and therefore I'd like to propose it to trunk even at this
late stage, although I can of course wait until the next stage1.
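
To illustrate why the extra number helps, here is a minimal sketch. It assumes a hypothetical mock_node struct standing in for the real symbol table node; only the " %s/%i" format string is taken from the hunk below. Two nodes that share a name become distinguishable once the order is printed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a symbol table node; the real GCC node
   has many more fields and uses node->name () as a member function.  */
struct mock_node
{
  const char *name;
  int order;
};

/* Format a node the way the patched dump does: name, slash, order.
   Returns the value of snprintf.  */
static int
format_node (char *buf, size_t len, const struct mock_node *node)
{
  assert (buf != NULL && node != NULL);
  return snprintf (buf, len, " %s/%i", node->name, node->order);
}
```

With this, two clones both named "foo" but with orders 12 and 57 are dumped as " foo/12" and " foo/57" rather than as two identical " foo" entries.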

The other hunk is something that I think is also useful when looking
into all failures of ipcp_verify_propagated_values like e.g. PR 60727.

I included the patch in a recent bootstrap and testing and it of
course passes.  OK for trunk now?  Or later?

Thanks,

Martin


2014-04-01  Martin Jambor  

* ipa-cp.c (ipcp_verify_propagated_values): Also dump symtab and
mention gcc_unreachable before failing.
* ipa.c (symtab_remove_unreachable_nodes): Also print order of
removed symbols.

Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -884,8 +884,9 @@ ipcp_verify_propagated_values (void)
{
  if (dump_file)
{
+ dump_symtab (dump_file);
  fprintf (dump_file, "\nIPA lattices after constant "
-  "propagation:\n");
+  "propagation, before gcc_unreachable:\n");
  print_all_lattices (dump_file, true, false);
}
 
Index: src/gcc/ipa.c
===
--- src.orig/gcc/ipa.c
+++ src/gcc/ipa.c
@@ -469,7 +469,7 @@ symtab_remove_unreachable_nodes (bool be
   if (!node->aux)
{
  if (file)
-   fprintf (file, " %s", node->name ());
+   fprintf (file, " %s/%i", node->name (), node->order);
  cgraph_remove_node (node);
  changed = true;
}
@@ -483,7 +483,7 @@ symtab_remove_unreachable_nodes (bool be
  if (node->definition)
{
  if (file)
-   fprintf (file, " %s", node->name ());
+   fprintf (file, " %s/%i", node->name (), node->order);
  node->body_removed = true;
  node->analyzed = false;
  node->definition = false;
@@ -531,7 +531,7 @@ symtab_remove_unreachable_nodes (bool be
  && (!flag_ltrans || !DECL_EXTERNAL (vnode->decl)))
{
  if (file)
-   fprintf (file, " %s", vnode->name ());
+   fprintf (file, " %s/%i", vnode->name (), vnode->order);
  varpool_remove_node (vnode);
  changed = true;
}


[PATCH] Disable IPA-SRA for always_inline functions

2014-04-02 Thread Martin Jambor
Hi,

when dealing with a PR yesterday I have noticed that IPA-SRA was
modifying an always_inline function which is useless work since the
function must then be inlined anyway.  Thus I'd like to propose the
following simple change disabling it in such cases.

Included in a bootstrap and testing on x86_64-linux.  OK for trunk now
or in the next stage1?

Thanks,

Martin


2014-04-01  Martin Jambor  

* tree-sra.c (ipa_sra_preliminary_function_checks): Skip
always_inline functions.

Index: src/gcc/tree-sra.c
===
--- src.orig/gcc/tree-sra.c
+++ src/gcc/tree-sra.c
@@ -4960,6 +4960,15 @@ ipa_sra_preliminary_function_checks (str
   if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
 return false;
 
+  if (lookup_attribute ("always_inline",
+   DECL_ATTRIBUTES (node->decl)) != NULL)
+{
+  if (dump_file)
+   fprintf (dump_file, "Allways inline function will be inlined "
+"anyway. \n");
+  return false;
+}
+
   return true;
 }
 


Re: [PATCH] Simple enhancements to dumping in ipa.c and ipa-cp.c

2014-04-02 Thread Richard Biener
On Wed, 2 Apr 2014, Martin Jambor wrote:

> Hi,
> 
> recently I've been looking into a number of bugs involving
> symtab_remove_unreachable_nodes in one way or another and I have
> always started by applying the hunk below.  I did this because
> distinguishing different symbol nodes only according to their names is
> just so inconvenient, especially when compiling C++.  The risk is
> minimal and therefore I'd like to propose it to trunk even at this
> late stage, although I can of course wait until the next stage1.
> 
> The other hunk is something that I think is also useful when looking
> into all failures of ipcp_verify_propagated_values like e.g. PR 60727.
> 
> I included the patch in a recent bootstrap and testing and it of
> course passes.  OK for trunk now?  Or later?

I'll leave the actual changes for review by Honza, it's fine at this
stage if he thinks the changes make sense and are consistent.

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> 2014-04-01  Martin Jambor  
> 
>   * ipa-cp.c (ipcp_verify_propagated_values): Also dump symtab and
>   mention gcc_unreachable before failing.
>   * ipa.c (symtab_remove_unreachable_nodes): Also print order of
>   removed symbols.
> 
> Index: src/gcc/ipa-cp.c
> ===
> --- src.orig/gcc/ipa-cp.c
> +++ src/gcc/ipa-cp.c
> @@ -884,8 +884,9 @@ ipcp_verify_propagated_values (void)
>   {
> if (dump_file)
>   {
> +   dump_symtab (dump_file);
> fprintf (dump_file, "\nIPA lattices after constant "
> -"propagation:\n");
> +"propagation, before gcc_unreachable:\n");
> print_all_lattices (dump_file, true, false);
>   }
>  
> Index: src/gcc/ipa.c
> ===
> --- src.orig/gcc/ipa.c
> +++ src/gcc/ipa.c
> @@ -469,7 +469,7 @@ symtab_remove_unreachable_nodes (bool be
>if (!node->aux)
>   {
> if (file)
> - fprintf (file, " %s", node->name ());
> + fprintf (file, " %s/%i", node->name (), node->order);
> cgraph_remove_node (node);
> changed = true;
>   }
> @@ -483,7 +483,7 @@ symtab_remove_unreachable_nodes (bool be
> if (node->definition)
>   {
> if (file)
> - fprintf (file, " %s", node->name ());
> + fprintf (file, " %s/%i", node->name (), node->order);
> node->body_removed = true;
> node->analyzed = false;
> node->definition = false;
> @@ -531,7 +531,7 @@ symtab_remove_unreachable_nodes (bool be
> && (!flag_ltrans || !DECL_EXTERNAL (vnode->decl)))
>   {
> if (file)
> - fprintf (file, " %s", vnode->name ());
> + fprintf (file, " %s/%i", vnode->name (), vnode->order);
> varpool_remove_node (vnode);
> changed = true;
>   }
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer


Re: [PATCH] Disable IPA-SRA for always_inline functions

2014-04-02 Thread Richard Biener
On Wed, 2 Apr 2014, Martin Jambor wrote:

> Hi,
> 
> when dealing with a PR yesterday I have noticed that IPA-SRA was
> modifying an always_inline function which is useless work since the
> function must then be inlined anyway.  Thus I'd like to propose the
> following simple change disabling it in such cases.
> 
> Included in a bootstrap and testing on x86_64-linux.  OK for trunk now
> or in the next stage1?

Ok for next stage1, but please short-cut the lookup_attribute
with a DECL_DISREGARD_INLINE_LIMITS () check.  Maybe even
abstract this away into a predicate on the cgraph node.
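
A minimal sketch of what such a predicate could look like, using mock types rather than GCC's tree and cgraph structures. The names mock_decl, mock_lookup_attribute and mock_always_inline_p are hypothetical. The sketch also assumes that the cheap flag is always set on always_inline functions, so testing it first lets most functions skip the attribute list walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a function declaration.  */
struct mock_decl
{
  /* Cheap flag, standing in for DECL_DISREGARD_INLINE_LIMITS.  */
  bool disregard_inline_limits;
  /* NULL-terminated list of attribute names, or NULL for none.  */
  const char *const *attributes;
};

/* Linear search through the attribute list, standing in for
   lookup_attribute.  */
static bool
mock_lookup_attribute (const char *name, const char *const *attrs)
{
  if (attrs == NULL)
    return false;
  for (; *attrs != NULL; attrs++)
    if (strcmp (*attrs, name) == 0)
      return true;
  return false;
}

/* Predicate: test the cheap flag first and only walk the attribute
   list when the flag is set.  */
static bool
mock_always_inline_p (const struct mock_decl *decl)
{
  assert (decl != NULL);
  return decl->disregard_inline_limits
	 && mock_lookup_attribute ("always_inline", decl->attributes);
}
```

A caller such as the IPA-SRA preliminary check would then only need one call to the predicate instead of an open-coded attribute lookup.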

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> 2014-04-01  Martin Jambor  
> 
>   * tree-sra.c (ipa_sra_preliminary_function_checks): Skip
>   always_inline functions.
> 
> Index: src/gcc/tree-sra.c
> ===
> --- src.orig/gcc/tree-sra.c
> +++ src/gcc/tree-sra.c
> @@ -4960,6 +4960,15 @@ ipa_sra_preliminary_function_checks (str
>if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
>  return false;
>  
> +  if (lookup_attribute ("always_inline",
> + DECL_ATTRIBUTES (node->decl)) != NULL)
> +{
> +  if (dump_file)
> + fprintf (dump_file, "Allways inline function will be inlined "
> +  "anyway. \n");
> +  return false;
> +}
> +
>return true;
>  }
>  
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer


Re: [PATCH][ARM] Handle simple SImode PLUS and MINUS operations in rtx costs

2014-04-02 Thread Kyrill Tkachov
Pinging this for stage1, otherwise I'll forget about it and it'll fall through 
the cracks...


http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01276.html

Thanks,
Kyrill

On 24/03/14 17:21, Kyrill Tkachov wrote:

Hi all,

I noticed that we don't handle simple reg-to-reg arithmetic operations in the
arm rtx cost functions. We should be adding the cost of alu.arith to the costs
of the operands. This patch does that. Since we don't have any cost tables yet
that have a non-zero value for that field it shouldn't affect code-gen for any
current cores.
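
The arithmetic described above can be sketched as follows. This is only an illustration with a hypothetical mock_cost_table; it is not the code of arm_new_rtx_costs, and the operand costs are passed in as plain integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-core extra cost table; only the
   arithmetic ALU field matters here.  */
struct mock_cost_table
{
  int alu_arith;
};

/* Cost of a simple reg-to-reg PLUS or MINUS: the arithmetic cost
   added to the costs of both operands.  */
static int
plus_minus_cost (const struct mock_cost_table *extra,
		 int op0_cost, int op1_cost)
{
  assert (extra != NULL);
  return extra->alu_arith + op0_cost + op1_cost;
}
```

With alu_arith equal to zero the result is just the sum of the operand costs, which matches the statement that existing cost tables should see no change.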

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for next stage1?

Thanks,
Kyrill

2014-03-24  Kyrylo Tkachov  

  * config/arm/arm.c (arm_new_rtx_costs): Handle reg-to-reg PLUS
  and MINUS RTXs.





[PATCH] [ARM] [RFC] Fix longstanding push_minipool_fix ICE (PR49423, lp1296601)

2014-04-02 Thread Charles Baylis
Hi

This patch fixes the push_minipool_fix ICE, which occurs when the ARM
backend encounters a zero/sign extending load from a constant pool.

I don't have a current test case for trunk; lp1296601 has a test case
which affects the linaro-4.8 branch. As far as I know, there has been
no fix for this on trunk.

The approach taken in this patch is to extend each pattern where this
can occur, so that it triggers a define_split to synthesise a
constant move instead. Some, but not all, extend patterns previously
had pool_range attributes added to work around this problem;
this patch removes those, and also fixes the remaining patterns. Some
patterns have slightly more complex workarounds, which I have not yet
analysed, but it seems worth posting the patch at this stage to get
feedback on the general approach.

Tested on arm-unknown-linux-gnueabihf (qemu), bootstrap in progress.

If this looks good, I'll clean it up for a more detailed review.

Thanks
Charles


0001-initial-attempt-at-fixing-push_minipool_fix-ICE.patch
Description: application/download


Re: [Patch, AArch64] Fix shuffle for big-endian.

2014-04-02 Thread Tejas Belagod

Richard Henderson wrote:

On 02/21/2014 08:30 AM, Tejas Belagod wrote:

+  /* If two vectors, we end up with a wierd mixed-endian mode on NEON.  */
+  if (BYTES_BIG_ENDIAN)
+   {
+ if (!d->one_vector_p && d->perm[i] & nunits)
+   {
+ /* Extract the offset.  */
+ elt = d->perm[i] & (nunits - 1);
+ /* Reverse the top half.  */
+ elt = nunits - 1 - elt;
+ /* Offset it by the bottom half.  */
+ elt += nunits;
+   }
+ else
+   elt = nunits - 1 - d->perm[i];
+   }


Isn't this just

  elt = d->perm[i] ^ (nunits - 1);

all the time?  I.e. invert the index within the word,
but leave the word index (nunits) unchanged.
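
The equivalence can be checked exhaustively for small sizes. The sketch below assumes nunits is a power of two and that indices lie in [0, 2 * nunits) for two vectors and [0, nunits) for one. The function names are hypothetical; the branchy version mirrors the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Branchy remapping, following the logic of the quoted hunk.
   NUNITS must be a power of two.  */
static unsigned
remap_branchy (unsigned perm, unsigned nunits, bool one_vector_p)
{
  assert (nunits != 0 && (nunits & (nunits - 1)) == 0);
  if (!one_vector_p && (perm & nunits))
    {
      unsigned elt = perm & (nunits - 1);  /* Extract the offset.  */
      elt = nunits - 1 - elt;              /* Reverse the top half.  */
      return elt + nunits;                 /* Offset it by the bottom half.  */
    }
  return nunits - 1 - perm;
}

/* Suggested replacement: flip the index within the vector and leave
   the vector-select bit untouched.  */
static unsigned
remap_xor (unsigned perm, unsigned nunits)
{
  return perm ^ (nunits - 1);
}

/* Return true if both remappings agree for every valid index.  */
static bool
remaps_agree (unsigned nunits)
{
  for (unsigned p = 0; p < 2 * nunits; p++)
    if (remap_branchy (p, nunits, false) != remap_xor (p, nunits))
      return false;
  for (unsigned p = 0; p < nunits; p++)
    if (remap_branchy (p, nunits, true) != remap_xor (p, nunits))
      return false;
  return true;
}
```

The reason it works: nunits - 1 is an all-ones mask, so for any x no larger than the mask, x ^ mask equals mask - x, and the bit selecting the second vector is outside the mask.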



Here is a revised patch. OK for stage-1?

Thanks
Tejas.

2014-04-02  Tejas Belagod  

gcc/

* config/aarch64/aarch64.c (aarch64_evpc_tbl): Reverse order of elements
for big-endian.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e839539..d30b79c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8129,7 +8129,15 @@ aarch64_evpc_tbl (struct expand_vec_perm_d *d)
 return false;
 
   for (i = 0; i < nelt; ++i)
-rperm[i] = GEN_INT (d->perm[i]);
+{
+  int nunits = GET_MODE_NUNITS (vmode);
+
+  /* If big-endian and two vectors we end up with a wierd mixed-endian
+mode on NEON.  Reverse the index within each word but not the word
+itself.  */
+  rperm[i] = GEN_INT (BYTES_BIG_ENDIAN ? d->perm[i] ^ (nunits - 1)
+  : d->perm[i]);
+}
   sel = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, rperm));
   sel = force_reg (vmode, sel);
 

Re: [PATCH] aarch64 support for libitm

2014-04-02 Thread Richard Henderson
On 04/01/2014 03:41 PM, Andrew Pinski wrote:
> On Tue, Apr 1, 2014 at 3:24 PM, Richard Henderson  wrote:
>> Comments?  If approved, should this go in for 4.9, or wait for stage1?
>> Certainly it's self-contained...
> 
> On Cavium's thunder processor the cache line size is going to be
> bigger than 64 bytes, what is your solution to improve performance on
> targets like Thunder?

We can expand the number reasonably.  The only thing it controls is layout of
some of the internal data structures to attempt to put different locks on
different lines.
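
To make the "different locks on different lines" point concrete, here is a rough sketch of the idea. The struct and helper below are hypothetical and are not libitm's actual data structures; only the HW_CACHELINE_SIZE macro name comes from the target headers, and the value 128 is the one floated in this thread.

```c
#include <stddef.h>

/* Assumed value for illustration only.  */
#define HW_CACHELINE_SIZE 128

/* Hypothetical lock wrapper, not libitm's real layout.  The alignment
   forces each array element onto its own cache line, so two locks
   never share a line.  Requires C11 for _Alignas.  */
struct padded_lock
{
  _Alignas (HW_CACHELINE_SIZE) unsigned long word;
};

static struct padded_lock locks[4];

/* Byte distance between two adjacent locks in the array.  */
static size_t
lock_stride (void)
{
  return (size_t) ((char *) &locks[1] - (char *) &locks[0]);
}
```

Picking a value that is too large only wastes some padding; picking one that is too small lets neighbouring locks share a line, which is the performance concern raised above.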

Is 128 big enough for Thunder?  Honestly, I may well not even have it right for
the processor we have in house.  I didn't bother trying to track down docs to
find out.

> Also I think the default page size for most Linux distros is going to
> be 64k on aarch64 including Redhat Linux so it makes sense not to
> define FIXED_PAGE_SIZE.

Heh.  It turns out these page size defines aren't used any more at all.  During
one of the rewrites we must have deleted the bits that used them.  I'll get rid of
all of them so as to be less confusing.

> I will implement the ILP32 version of this patch once it goes in;
> a few changes are needed in gtm_jmpbuf because long and pointers are
> 32-bit while the assembly always stores 64 bits.

I can minimize those changes now by using unsigned long long...


r~



Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Joseph S. Myers
On Wed, 2 Apr 2014, Thomas Preud'homme wrote:

> +   if { [is-effective-target bswap]
> +&& ![istarget x86_64-*-*] } {

That x86_64-*-* test is wrong.  x86_64-*-* and i?86-*-* should always be 
handled the same (if you then want to distinguish 32-bit and 64-bit 
multilibs, you check the appropriate effective-target there, depending on 
whether the condition is one on the ABI or which register size is being 
used, which affects how x32 should be counted).


-- 
Joseph S. Myers
jos...@codesourcery.com


[4.8, PATCH 27/26] Backport Power8 and LE support: Fixes for AIX test failures

2014-04-02 Thread Bill Schmidt
Hi,

This patch (diff-aix) adds to the 4.8 PowerPC backport patch series with
a few backported fixes from trunk that repair test failures on AIX.

Thanks,
Bill


[gcc]

2014-04-02  Bill Schmidt  

Backport from mainline r205308
2013-11-23  David Edelsohn  

* config/rs6000/rs6000.c (IN_NAMED_SECTION): New macro.
(rs6000_xcoff_select_section): Place decls with stricter alignment
into named sections.
(rs6000_xcoff_unique_section): Allow unique sections for
uninitialized data with strict alignment.

[gcc/testsuite]

2014-04-02  Bill Schmidt  

Backport from mainline
2013-04-05  David Edelsohn  

* gcc.target/powerpc/sd-vsx.c: Skip on AIX.
* gcc.target/powerpc/sd-pwr6.c: Same.


Index: gcc-4_8-test2/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test2.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test2/gcc/config/rs6000/rs6000.c
@@ -29165,10 +29165,23 @@ rs6000_xcoff_asm_named_section (const ch
   name, suffix[smclass], flags & SECTION_ENTSIZE);
 }
 
+#define IN_NAMED_SECTION(DECL) \
+  ((TREE_CODE (DECL) == FUNCTION_DECL || TREE_CODE (DECL) == VAR_DECL) \
+   && DECL_SECTION_NAME (DECL) != NULL_TREE)
+
 static section *
 rs6000_xcoff_select_section (tree decl, int reloc,
-unsigned HOST_WIDE_INT align ATTRIBUTE_UNUSED)
+unsigned HOST_WIDE_INT align)
 {
+  /* Place variables with alignment stricter than BIGGEST_ALIGNMENT into
+ named section.  */
+  if (align > BIGGEST_ALIGNMENT)
+{
+  resolve_unique_section (decl, reloc, true);
+  if (IN_NAMED_SECTION (decl))
+   return get_named_section (decl, NULL, reloc);
+}
+
   if (decl_readonly_section (decl, reloc))
 {
   if (TREE_PUBLIC (decl))
@@ -29206,10 +29219,12 @@ rs6000_xcoff_unique_section (tree decl,
 {
   const char *name;
 
-  /* Use select_section for private and uninitialized data.  */
+  /* Use select_section for private data and uninitialized data with
+ alignment <= BIGGEST_ALIGNMENT.  */
   if (!TREE_PUBLIC (decl)
   || DECL_COMMON (decl)
-  || DECL_INITIAL (decl) == NULL_TREE
+  || (DECL_INITIAL (decl) == NULL_TREE
+ && DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
   || DECL_INITIAL (decl) == error_mark_node
   || (flag_zero_initialized_in_bss
  && initializer_zerop (DECL_INITIAL (decl
Index: gcc-4_8-test2/gcc/testsuite/gcc.target/powerpc/sd-pwr6.c
===
--- gcc-4_8-test2.orig/gcc/testsuite/gcc.target/powerpc/sd-pwr6.c
+++ gcc-4_8-test2/gcc/testsuite/gcc.target/powerpc/sd-pwr6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* powerpc-ibm-aix* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O2 -mcpu=power6 -mhard-dfp" } */
 /* { dg-final { scan-assembler-not   "lfiwzx"   } } */
Index: gcc-4_8-test2/gcc/testsuite/gcc.target/powerpc/sd-vsx.c
===
--- gcc-4_8-test2.orig/gcc/testsuite/gcc.target/powerpc/sd-vsx.c
+++ gcc-4_8-test2/gcc/testsuite/gcc.target/powerpc/sd-vsx.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* powerpc-ibm-aix* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O2 -mcpu=power7 -mhard-dfp" } */
 /* { dg-final { scan-assembler-times "lfiwzx" 2 } } */




[Patch C++] PR57958 RFC

2014-04-02 Thread Dinar Temirbulatov
Hi,
The following change fixes GIMPLE generation for lambda functions. In the
patch I assumed that constructing a COMPOUND_EXPR for the return value
of an "auto"-typed function that resolves to a CLASS_TYPE_P type is wrong.
Tested on x86_64-pc-linux-gnu against trunk with no new regressions.
Thanks, Dinar.


fix1.patch
Description: Binary data


Re: [PATCH] aarch64 suuport for libitm

2014-04-02 Thread pinskia


> On Apr 2, 2014, at 7:37 AM, Richard Henderson  wrote:
> 
>> On 04/01/2014 03:41 PM, Andrew Pinski wrote:
>>> On Tue, Apr 1, 2014 at 3:24 PM, Richard Henderson  wrote:
>>> Comments?  If approved, should this go in for 4.9, or wait for stage1?
>>> Certainly it's self-contained...
>> 
>> On Cavium's thunder processor the cache line size is going to be
>> bigger than 64 bytes, what is your solution to improve performance on
>> targets like Thunder?
> 
> We can expand the number reasonably.  The only thing it controls is layout of
> some of the internal data structures to attempt to put different locks on
> different lines.
> 
> Is 128 big enough for Thunder?  Honestly, I may well not even have it right for
> the processor we have in house.  I didn't bother trying to track down docs to
> find out.

Yes 128 should be enough. 

Thanks,
Andrew

> 
>> Also I think the default page size for most Linux distros is going to
>> be 64k on aarch64 including Redhat Linux so it makes sense not to
>> define FIXED_PAGE_SIZE.
> 
> Heh.  It turns out these page size defines aren't used any more at all.  During
> one of the rewrites we must have deleted the bits that used them.  I'll get rid of
> all of them so as to be less confusing.
> 
>> I will implement the ILP32 version of this patch once it goes in;
>> a few changes are needed in gtm_jmpbuf because long and pointers are
>> 32-bit while the assembly always stores 64 bits.
> 
> I can minimize those changes now by using unsigned long long...
> 
> 
> r~
> 


Re: RFA: Fix PR rtl-optimization/60651

2014-04-02 Thread Joern Rennecke
On 28 March 2014 10:20, Eric Botcazou  wrote:
>> However, the first call is for blocks with incoming abnormal edges.
>> If these are empty, the change as I wrote it yesterday is fine, but not
>> when they are non-empty; in that case, we should indeed insert before the
>> first instruction in that block.
>
> OK, so the issue is specific to empty basic blocks and boils down to inserting
> instructions in a FIFO manner into them.

Actually, the issue also applies to abnormal edges where lcm did leave a set -
but these are rare, and my last patch should handle these properly in any event,
by no longer using the NOTE_INSN_BASIC_BLOCK itself unless the block is
empty.
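
To illustrate why the insertion point matters for an empty block: repeatedly emitting after the same note reverses the order of the emitted sets, while emitting after the current end of the block keeps them first-in, first-out. This is only a toy sketch with made-up types; nothing here is GCC's rtx API.

```c
#include <stddef.h>

/* Toy instruction chain; hypothetical and far simpler than GCC's
   insn chain.  */
struct toy_insn
{
  int id;
  struct toy_insn *next;
};

/* Link NEW_INSN into the chain directly after AFTER.  */
static void
toy_emit_after (struct toy_insn *new_insn, struct toy_insn *after)
{
  new_insn->next = after->next;
  after->next = new_insn;
}

/* Return the last insn of the chain starting at HEAD.  */
static struct toy_insn *
toy_last (struct toy_insn *head)
{
  while (head->next)
    head = head->next;
  return head;
}

/* Emit ids 1, 2, 3 into an empty block whose only insn is the note
   (id 0).  If AT_END is zero, always insert right after the note;
   otherwise append after the current last insn.  The resulting order
   of the three emitted ids is written to OUT.  */
static void
toy_emit_three (int at_end, int out[3])
{
  struct toy_insn note = { 0, NULL };
  struct toy_insn sets[3] = { { 1, NULL }, { 2, NULL }, { 3, NULL } };

  for (int i = 0; i < 3; i++)
    toy_emit_after (&sets[i], at_end ? toy_last (&note) : &note);

  int k = 0;
  for (struct toy_insn *p = note.next; p; p = p->next)
    out[k++] = p->id;
}
```

With at_end set to zero the emitted order is reversed; with at_end nonzero it matches the emission order, which is the behaviour the ordering guarantee in the patch relies on.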

>> This can be achieved by finding an insert-before position using NEXT_INSN
>> on the basic block head; this amounts to the very same insertion place
>> as inserting after the basic block head.  Also, we will continue to set no
>> location, and use the same bb, because both add_insn_before and
>> add_insn_after (in contradiction to its block comment) will infer the basic
>> block from the insn given (in the case of add_insn_before, I assume
>> that the basic block doesn't start with a BARRIER - that would be invalid -
>> and that the insn it starts with has a valid BLOCK_FOR_INSN setting the
>> same way the basic block head has).
>
> This looks reasonable, but I think that we need more commentary because it's
> not straightforward to understand, so I would:
>
>   1. explicitly state that we enforce an order on the entities in addition to
> the order on priority, both in the code (for example create a 4th paragraph in
> the comment at the top of the file, before "More details ...") and in the doc
> as you already did, but "ordering" the two orders for the sake of clarity:
> first the order on priority then, for the same priority, the order on the
> entities.

Actually, all the patch provides is a partial order, just as I stated.
Providing the strict order you describe would require adding another
loop nesting to the entity/basic block/seginfo loop, and it wouldn't
really be useful for targets.
To order by entity first, then by priority, could be useful for some targets,
so that they can express a dependency chain of mode switching events
to be computed in a single lcm pass without inflating the mode count
(which determines how often we have to invoke the lcm machinery).
However, that would require having separate buckets for each entity for
each  insert_insn_on_edge point.

For epiphany,  EPIPHANY_MSW_ENTITY_FPU_OMNIBUS (for -O0) and
EPIPHANY_MSW_ENTITY_ROUND_KNOWN (used when optimizing)
depend on EPIPHANY_MSW_ENTITY_AND,  EPIPHANY_MSW_ENTITY_OR and
EPIPHANY_MSW_ENTITY_CONFIG.
The latter three only have two modes, and the former two use the
enum attr_fp_mode values, the first of which is FP_MODE_ROUND_UNKNOWN.
That value does not actually appear as a needed mode for these entities, hence
the partial order is sufficient.

EPIPHANY_MSW_ENTITY_FPU_OMNIBUS also depends on EPIPHANY_MSW_ENTITY_OR.

>   2. add a line in the head comment of new_seginfo saying that INSN may not be
> a NOTE_BASIC_BLOCK, unless BB is empty.
>
>   3. add a comment above the trick in optimize_mode_switching saying that it
> is both required to implement the FIFO insertion and valid because we know
> that the basic block was initially empty.

Done.

> It's not clear to me whether this is a regression or not, so you'll also need
> to run it by the RMs.

I don't think it's a regression.

2014-04-02  Joern Rennecke

gcc:
PR rtl-optimization/60651
* mode-switching.c (optimize_mode_switching): Make sure to emit
sets of a lower numbered entity before sets of a higher numbered
entity to a mode of the same or lower priority.
(new_seginfo): Document and enforce requirement that
NOTE_INSN_BASIC_BLOCK only appears for empty blocks.
* doc/tm.texi.in: Document ordering constraint for emitted mode sets.
* doc/tm.texi: Regenerate.
gcc/testsuite:
PR rtl-optimization/60651
* gcc.target/epiphany/mode-switch.c: New test.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f7024a7..b8ca17e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9778,6 +9778,8 @@ for @var{entity}.  For any fixed @var{entity}, @code{mode_priority_to_mode}
 Generate one or more insns to set @var{entity} to @var{mode}.
 @var{hard_reg_live} is the set of hard registers live at the point where
 the insn(s) are to be inserted.
+Sets of a lower numbered entity will be emitted before sets of a higher
+numbered entity to a mode of the same or lower priority.
 @end defmac
 
 @node Target Attributes
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 6dcbde4..d793d26 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7447,6 +7447,8 @@ for @var{entity}.  For any fixed @var{entity}, @code{mode_priority_to_mode}
 Generate one or more insns to set @var{entity} to @var{mode}.
 @var{hard_reg_live} is the set of hard registers live at

Re: [PATCH] Simple enhancements to dumping in ipa.c and ipa-cp.c

2014-04-02 Thread Jan Hubicka
> On Wed, 2 Apr 2014, Martin Jambor wrote:
> 
> > Hi,
> > 
> > recently I've been looking into a number of bugs involving
> > symtab_remove_unreachable_nodes in one way or another and I have
> > always started by applying the hunk below.  I did this because
> > distinguishing different symbol nodes only according to their names is
> > just so inconvenient, especially when compiling C++.  The risk is
> > minimal and therefore I'd like to propose it to trunk even at this
> > late stage, although I can of course wait until the next stage1.
> > 
> > The other hunk is something that I think is also useful when looking
> > into all failures of ipcp_verify_propagated_values like e.g. PR 60727.
> > 
> > I included the patch in a recent bootstrap and testing and it of
> > course passes.  OK for trunk now?  Or later?
> 
> I'll leave the actual changes for review by Honza, it's fine at this
> stage if he thinks the changes make sense and are consistent.

It seems fine to me...
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > 
> > Martin
> > 
> > 
> > 2014-04-01  Martin Jambor  
> > 
> > * ipa-cp.c (ipcp_verify_propagated_values): Also dump symtab and
> > mention gcc_unreachable before failing.
> > * ipa.c (symtab_remove_unreachable_nodes): Also print order of
> > removed symbols.
> > 
> > Index: src/gcc/ipa-cp.c
> > ===
> > --- src.orig/gcc/ipa-cp.c
> > +++ src/gcc/ipa-cp.c
> > @@ -884,8 +884,9 @@ ipcp_verify_propagated_values (void)
> > {
> >   if (dump_file)
> > {
> > + dump_symtab (dump_file);
> >   fprintf (dump_file, "\nIPA lattices after constant "
> > -  "propagation:\n");
> > +  "propagation, before gcc_unreachable:\n");

This means before symtab_remove_unreachable_nodes?

Honza
> >   print_all_lattices (dump_file, true, false);
> > }
> >  
> > Index: src/gcc/ipa.c
> > ===
> > --- src.orig/gcc/ipa.c
> > +++ src/gcc/ipa.c
> > @@ -469,7 +469,7 @@ symtab_remove_unreachable_nodes (bool be
> >if (!node->aux)
> > {
> >   if (file)
> > -   fprintf (file, " %s", node->name ());
> > +   fprintf (file, " %s/%i", node->name (), node->order);
> >   cgraph_remove_node (node);
> >   changed = true;
> > }
> > @@ -483,7 +483,7 @@ symtab_remove_unreachable_nodes (bool be
> >   if (node->definition)
> > {
> >   if (file)
> > -   fprintf (file, " %s", node->name ());
> > +   fprintf (file, " %s/%i", node->name (), node->order);
> >   node->body_removed = true;
> >   node->analyzed = false;
> >   node->definition = false;
> > @@ -531,7 +531,7 @@ symtab_remove_unreachable_nodes (bool be
> >   && (!flag_ltrans || !DECL_EXTERNAL (vnode->decl)))
> > {
> >   if (file)
> > -   fprintf (file, " %s", vnode->name ());
> > +   fprintf (file, " %s/%i", vnode->name (), vnode->order);
> >   varpool_remove_node (vnode);
> >   changed = true;
> > }
> > 
> > 
> 
> -- 
> Richard Biener 
> SUSE / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer


Re: [PATCH] Simple enhancements to dumping in ipa.c and ipa-cp.c

2014-04-02 Thread Martin Jambor
Hi,

On Wed, Apr 02, 2014 at 06:08:27PM +0200, Jan Hubicka wrote:
> > On Wed, 2 Apr 2014, Martin Jambor wrote:
> > 
> > > Hi,
> > > 
> > > recently I've been looking into a number of bugs involving
> > > symtab_remove_unreachable_nodes in one way or another and I have
> > > always started by applying the hunk below.  I did this because
> > > distinguishing different symbol nodes only according to their names is
> > > just so inconvenient, especially when compiling C++.  The risk is
> > > minimal and therefore I'd like to propose it to trunk even at this
> > > late stage, although I can of course wait until the next stage1.
> > > 
> > > The other hunk is something that I think is also useful when looking
> > > into all failures of ipcp_verify_propagated_values like e.g. PR 60727.
> > > 
> > > I included the patch in a recent bootstrap and testing and it of
> > > course passes.  OK for trunk now?  Or later?
> > 
> > I'll leave the actual changes for review by Honza, it's fine at this
> > stage if he thinks the changes make sense and are consistent.
> 
> It seems fine to me...

Thanks, I will commit it shortly then.

> > 
> > Thanks,
> > Richard.
> > 
> > > Thanks,
> > > 
> > > Martin
> > > 
> > > 
> > > 2014-04-01  Martin Jambor  
> > > 
> > >   * ipa-cp.c (ipcp_verify_propagated_values): Also dump symtab and
> > >   mention gcc_unreachable before failing.
> > >   * ipa.c (symtab_remove_unreachable_nodes): Also print order of
> > >   removed symbols.
> > > 
> > > Index: src/gcc/ipa-cp.c
> > > ===
> > > --- src.orig/gcc/ipa-cp.c
> > > +++ src/gcc/ipa-cp.c
> > > @@ -884,8 +884,9 @@ ipcp_verify_propagated_values (void)
> > >   {
> > > if (dump_file)
> > >   {
> > > +   dump_symtab (dump_file);
> > > fprintf (dump_file, "\nIPA lattices after constant "
> > > -"propagation:\n");
> > > +"propagation, before gcc_unreachable:\n");
> 
> This means before symtab_remove_unreachable_nodes?

No, there is literally a call to gcc_unreachable just below this
dumping.  I added this to grep for it easily when I have a number of
dumps lying around because there is the same string in normal dumps
too.
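
As for the ipa.c hunk, here is a small sketch of why printing the order next to the name helps: two nodes that share a name become distinguishable in the dump. The struct and helper are hypothetical stand-ins; only the "%s/%i" format string is taken from the patch.

```c
#include <stdio.h>

/* Hypothetical stand-in for a symbol table node; only the two fields
   used by the dump are modelled.  */
struct toy_node
{
  const char *name;
  int order;
};

/* Format NODE as "name/order" into BUF, using the same format string
   as the patched fprintf calls.  */
static void
toy_dump_name (char *buf, size_t len, const struct toy_node *node)
{
  snprintf (buf, len, "%s/%i", node->name, node->order);
}
```

Two nodes both named "foo" but with different order values then print differently, which is what makes grepping a dump for one specific node practical.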

Thanks,

Martin


Re: RFA: Fix PR rtl-optimization/60651

2014-04-02 Thread Joern Rennecke
Hmm, the sanity check in new_seginfo caused a bootstrap failure
building libjava on x86.
There was a block with CODE_LABEL as basic block head, otherwise empty.


Skip some gcc.target/i386 tests for conflicting -march= options

2014-04-02 Thread Joseph S. Myers
If you test an x86_64 toolchain with -march=bdver3 in the multilib
options, as noted in
 various test
failures arise from tests whose own -march= in dg-options is
overridden.  This patch adds dg-skip-if to those tests to skip them
for conflicting -march= options, as has been done before for other
tests (obviously, if the option ordering is changed in future in
DejaGnu, such skips may become obsolete or could be conditioned on
DejaGnu version).  (No doubt other -march= options would show up
further tests needing such changes.)

Tested x86_64-linux-gnu.  OK to commit?

2014-04-02  Joseph Myers  

* gcc.target/i386/funcspec-2.c, gcc.target/i386/funcspec-3.c,
gcc.target/i386/funcspec-9.c, gcc.target/i386/isa-1.c,
gcc.target/i386/memcpy-strategy-1.c,
gcc.target/i386/memcpy-strategy-2.c,
gcc.target/i386/memcpy-vector_loop-1.c,
gcc.target/i386/memcpy-vector_loop-2.c,
gcc.target/i386/memset-vector_loop-1.c,
gcc.target/i386/memset-vector_loop-2.c,
gcc.target/i386/sse2-init-v2di-2.c, gcc.target/i386/ssetype-1.c,
gcc.target/i386/ssetype-2.c, gcc.target/i386/ssetype-5.c: Skip for
-march= options different from those in dg-options.

Index: gcc/testsuite/gcc.target/i386/memcpy-vector_loop-2.c
===
--- gcc/testsuite/gcc.target/i386/memcpy-vector_loop-2.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/memcpy-vector_loop-2.c   (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=atom" } } */
 /* { dg-options "-O2 -march=atom -minline-all-stringops -mstringop-strategy=vector_loop" } */
 /* { dg-final { scan-assembler-times "movdqa" 4} } */
 
Index: gcc/testsuite/gcc.target/i386/ssetype-1.c
===
--- gcc/testsuite/gcc.target/i386/ssetype-1.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/ssetype-1.c   (working copy)
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* This test checks for absolute memory operands.  */
 /* { dg-require-effective-target nonpic } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=k8" } } */
 /* { dg-options "-O2 -msse2 -march=k8" } */
 /* { dg-final { scan-assembler "andpd\[^\\n\]*magic" } } */
 /* { dg-final { scan-assembler "andnpd\[^\\n\]*magic" } } */
Index: gcc/testsuite/gcc.target/i386/ssetype-5.c
===
--- gcc/testsuite/gcc.target/i386/ssetype-5.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/ssetype-5.c   (working copy)
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* This test checks for absolute memory operands.  */
 /* { dg-require-effective-target nonpic } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=k8" } } */
 /* { dg-options "-O2 -msse2 -march=k8" } */
 /* { dg-final { scan-assembler "pand\[^\\n\]*magic" } } */
 /* { dg-final { scan-assembler "pandn\[^\\n\]*magic" } } */
Index: gcc/testsuite/gcc.target/i386/memset-vector_loop-2.c
===
--- gcc/testsuite/gcc.target/i386/memset-vector_loop-2.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/memset-vector_loop-2.c   (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=atom" } } */
 /* { dg-options "-O2 -march=atom -minline-all-stringops -mstringop-strategy=vector_loop" } */
 /* { dg-final { scan-assembler-times "movdqa" 4} } */
 
Index: gcc/testsuite/gcc.target/i386/ssetype-2.c
===
--- gcc/testsuite/gcc.target/i386/ssetype-2.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/ssetype-2.c   (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=k8" } } */
 /* { dg-options "-O2 -msse2 -march=k8" } */
 /* { dg-final { scan-assembler "andpd" } } */
 /* { dg-final { scan-assembler "andnpd" } } */
Index: gcc/testsuite/gcc.target/i386/funcspec-9.c
===
--- gcc/testsuite/gcc.target/i386/funcspec-9.c  (revision 209023)
+++ gcc/testsuite/gcc.target/i386/funcspec-9.c  (working copy)
@@ -1,5 +1,6 @@
 /* Test whether using target specific options, we can generate FMA4 code.  */
 /* { dg-do compile } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=k8" } } */
 /* { dg-options "-O2 -march=k8 -mfpmath=sse -msse2" } */
 
 extern void exit (int);
Index: gcc/testsuite/gcc.target/i386/funcspec-2.c
===
--- gcc/testsuite/gcc.target/i386/funcspec-2.c  (revision 209023)
+++ gcc/testsuite/gcc.target/i386/funcspec-2.c  (working copy)
@@ -1,5 +1,6 @@
 /* Test w

Re: [PATCH] [ARM] [RFC] Fix longstanding push_minipool_fix ICE (PR49423, lp1296601)

2014-04-02 Thread Charles Baylis
On 2 April 2014 14:29, Charles Baylis  wrote:
> Tested on arm-unknown-linux-gnueabihf (qemu), bootstrap in progress.

bootstrapped successfully on a Chromebook arm-unknown-linux-gnueabihf.


Re: RFA: Fix PR rtl-optimization/60651

2014-04-02 Thread Joern Rennecke
On 2 April 2014 17:34, Joern Rennecke  wrote:
> Hmm, the sanity check in new_seginfo caused a bootstrap failure
> building libjava on x86.
> There was a block with CODE_LABEL as basic block head, otherwise empty.

I've added the testcase - and a bit more detail on this issue - in the PR.

I've attached an updated patch, which skips past the CODE_LABEL.
And this one bootstraps on i686-pc-linux-gnu.

2014-04-02  Joern Rennecke

gcc:
PR rtl-optimization/60651
* mode-switching.c (optimize_mode_switching): Make sure to emit
sets of a lower numbered entity before sets of a higher numbered
entity to a mode of the same or lower priority.
When creating a seginfo for a basic block that starts with a code
label, move the insertion point past the code label.
(new_seginfo): Document and enforce requirement that
NOTE_INSN_BASIC_BLOCK only appears for empty blocks.
* doc/tm.texi.in: Document ordering constraint for emitted mode sets.
* doc/tm.texi: Regenerate.
gcc/testsuite:
PR rtl-optimization/60651
* gcc.target/epiphany/mode-switch.c: New test.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f7024a7..b8ca17e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9778,6 +9778,8 @@ for @var{entity}.  For any fixed @var{entity}, @code{mode_priority_to_mode}
 Generate one or more insns to set @var{entity} to @var{mode}.
 @var{hard_reg_live} is the set of hard registers live at the point where
 the insn(s) are to be inserted.
+Sets of a lower numbered entity will be emitted before sets of a higher
+numbered entity to a mode of the same or lower priority.
 @end defmac
 
 @node Target Attributes
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 6dcbde4..d793d26 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7447,6 +7447,8 @@ for @var{entity}.  For any fixed @var{entity}, @code{mode_priority_to_mode}
 Generate one or more insns to set @var{entity} to @var{mode}.
 @var{hard_reg_live} is the set of hard registers live at the point where
 the insn(s) are to be inserted.
+Sets of a lower numbered entity will be emitted before sets of a higher
+numbered entity to a mode of the same or lower priority.
 @end defmac
 
 @node Target Attributes
diff --git a/gcc/mode-switching.c b/gcc/mode-switching.c
index 88543b2..088156c 100644
--- a/gcc/mode-switching.c
+++ b/gcc/mode-switching.c
@@ -96,12 +96,18 @@ static void make_preds_opaque (basic_block, int);
 
 
 /* This function will allocate a new BBINFO structure, initialized
-   with the MODE, INSN, and basic block BB parameters.  */
+   with the MODE, INSN, and basic block BB parameters.
+   INSN may not be a NOTE_INSN_BASIC_BLOCK, unless it is an empty
+   basic block; that allows us later to insert instructions in a FIFO-like
+   manner.  */
 
 static struct seginfo *
 new_seginfo (int mode, rtx insn, int bb, HARD_REG_SET regs_live)
 {
   struct seginfo *ptr;
+
+  gcc_assert (!NOTE_INSN_BASIC_BLOCK_P (insn)
+ || insn == BB_END (NOTE_BASIC_BLOCK (insn)));
   ptr = XNEW (struct seginfo);
   ptr->mode = mode;
   ptr->insn_ptr = insn;
@@ -534,7 +540,13 @@ optimize_mode_switching (void)
break;
if (e)
  {
-   ptr = new_seginfo (no_mode, BB_HEAD (bb), bb->index, live_now);
+   rtx ins_pos = BB_HEAD (bb);
+   if (LABEL_P (ins_pos))
+ ins_pos = NEXT_INSN (ins_pos);
+   gcc_assert (NOTE_INSN_BASIC_BLOCK_P (ins_pos));
+   if (ins_pos != BB_END (bb))
+ ins_pos = NEXT_INSN (ins_pos);
+   ptr = new_seginfo (no_mode, ins_pos, bb->index, live_now);
add_seginfo (info + bb->index, ptr);
bitmap_clear_bit (transp[bb->index], j);
  }
@@ -733,7 +745,15 @@ optimize_mode_switching (void)
{
  emitted = true;
  if (NOTE_INSN_BASIC_BLOCK_P (ptr->insn_ptr))
-   emit_insn_after (mode_set, ptr->insn_ptr);
+   /* We need to emit the insns in a FIFO-like manner,
+  i.e. the first to be emitted at our insertion
+  point ends up first in the instruction stream.
+  Because we made sure that NOTE_INSN_BASIC_BLOCK is
+  only used for initially empty basic blocks, we
+  can achieve this by appending at the end of
+  the block.  */
+   emit_insn_after
+ (mode_set, BB_END (NOTE_BASIC_BLOCK (ptr->insn_ptr)));
  else
emit_insn_before (mode_set, ptr->insn_ptr);
}
--- /dev/null   2014-03-19 18:18:19.244212660 +
+++ b/gcc/testsuite/gcc.target/epiphany/mode-switch.c   2014-03-25 13:31:41.186140611 +
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options

[PATCH, libitm] Remove unused PAGE_SIZE macros

2014-04-02 Thread Richard Henderson
As recently pointed out in a thread porting libitm to aarch64, the PAGE_SIZE
and FIXED_PAGE_SIZE macros are unused.  Indeed, not all of the ports actually
defined them at all.

Removed, lest they cause further confusion.


r~
* config/alpha/target.h (PAGE_SIZE, FIXED_PAGE_SIZE): Remove.
* config/arm/target.h, config/sh/target.h: Likewise.
* config/sparc/target.h, config/x86/target.h: Likewise.


diff --git a/libitm/config/alpha/target.h b/libitm/config/alpha/target.h
index 5e23c53..e33f1e1 100644
--- a/libitm/config/alpha/target.h
+++ b/libitm/config/alpha/target.h
@@ -32,10 +32,6 @@ typedef struct gtm_jmpbuf
   unsigned long f[8];
 } gtm_jmpbuf;
 
-/* Alpha generally uses a fixed page size of 8K.  */
-#define PAGE_SIZE  8192
-#define FIXED_PAGE_SIZE 1
-
 /* The size of one line in hardware caches (in bytes). */
 #define HW_CACHELINE_SIZE 64
 
diff --git a/libitm/config/arm/target.h b/libitm/config/arm/target.h
index 6a1458e..a909e14 100644
--- a/libitm/config/arm/target.h
+++ b/libitm/config/arm/target.h
@@ -33,10 +33,6 @@ typedef struct gtm_jmpbuf
   unsigned long pc;
 } gtm_jmpbuf;
 
-/* ARM generally uses a fixed page size of 4K.  */
-#define PAGE_SIZE  4096
-#define FIXED_PAGE_SIZE 1
-
 /* ??? The size of one line in hardware caches (in bytes). */
 #define HW_CACHELINE_SIZE 64
 
diff --git a/libitm/config/sh/target.h b/libitm/config/sh/target.h
index 6f6ae5f..fbc804c 100644
--- a/libitm/config/sh/target.h
+++ b/libitm/config/sh/target.h
@@ -35,10 +35,6 @@ typedef struct gtm_jmpbuf
 #endif
 } gtm_jmpbuf;
 
-/* SH generally uses a fixed page size of 4K.  */
-#define PAGE_SIZE  4096
-#define FIXED_PAGE_SIZE 1
-
 /* ??? The size of one line in hardware caches (in bytes). */
 #define HW_CACHELINE_SIZE 32
 
diff --git a/libitm/config/sparc/target.h b/libitm/config/sparc/target.h
index b127fa4..309dac1 100644
--- a/libitm/config/sparc/target.h
+++ b/libitm/config/sparc/target.h
@@ -29,10 +29,6 @@ typedef struct gtm_jmpbuf
   unsigned long pc;
 } gtm_jmpbuf;
 
-/* UltraSPARC processors generally use a fixed page size of 8K.  */
-#define PAGE_SIZE  8192
-#define FIXED_PAGE_SIZE 1
-
 /* The size of one line in hardware caches (in bytes).  We use the primary
cache line size documented for the UltraSPARC T1/T2.  */
 #define HW_CACHELINE_SIZE 16
diff --git a/libitm/config/x86/target.h b/libitm/config/x86/target.h
index 392db48..78a58e7 100644
--- a/libitm/config/x86/target.h
+++ b/libitm/config/x86/target.h
@@ -52,10 +52,6 @@ typedef struct gtm_jmpbuf
 /* x86 doesn't require strict alignment for the basic types.  */
 #define STRICT_ALIGNMENT 0
 
-/* x86 uses a fixed page size of 4K.  */
-#define PAGE_SIZE   4096
-#define FIXED_PAGE_SIZE 1
-
 /* The size of one line in hardware caches (in bytes). */
 #define HW_CACHELINE_SIZE 64
 


[commit, spu] Fix regression (ICE) in g++.dg/torture/pr57499.C

2014-04-02 Thread Ulrich Weigand
Hello,

this fixes the following testsuite regression on spu-elf:
FAIL: g++.dg/torture/pr57499.C  -O1  (internal compiler error)

which was caused by a code path in pad_bb that would simply crash
if the very last active insn in a function happened to be a
"blockage".

Tested on spu-elf, committed to mainline.

Bye,
Ulrich


ChangeLog:

* config/spu/spu.c (pad_bb): Do not crash when the last
insn is CODE_FOR_blockage.

Index: gcc/config/spu/spu.c
===
*** gcc/config/spu/spu.c   (revision 208964)
--- gcc/config/spu/spu.c   (working copy)
*** pad_bb(void)
*** 2064,2070 
}
  hbr_insn = insn;
}
!   if (INSN_CODE (insn) == CODE_FOR_blockage)
{
  if (GET_MODE (insn) == TImode)
PUT_MODE (next_insn, TImode);
--- 2064,2070 
}
  hbr_insn = insn;
}
!   if (INSN_CODE (insn) == CODE_FOR_blockage && next_insn)
{
  if (GET_MODE (insn) == TImode)
PUT_MODE (next_insn, TImode);
-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



[commit, spu] Fix regression (ICE) in gcc.dg/pr48335-2.c

2014-04-02 Thread Ulrich Weigand
Hello,

this fixes the following regressions on spu-elf:
FAIL: gcc.dg/pr48335-2.c (internal compiler error)
FAIL: gcc.dg/pr48335-3.c (internal compiler error)

which are caused by common code calling the insv pattern with a
combination of bitoffset/bitsize that lies partially outside the
underlying target mode, causing an assertion failure in
spu_expand_insv.

The original reason for the bad offset is that the test case
actually has undefined behavior due to storing partially outside
a struct via a misaligned pointer.

Still, the compiler should not ICE, so I've fixed this similarly
to what was done on s390 by just rejecting this in the insv
expander and falling back to common code.
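
The condition itself is simple enough to sketch in isolation. The helper below is hypothetical (the patch performs the test inline in the expander on INTVAL (operands[1]), INTVAL (operands[2]) and the bit size of the mode of operands[0]); it only shows the arithmetic under those assumptions.

```c
/* Return nonzero if a bit-field of WIDTH bits starting at bit START
   lies entirely inside a container of MODE_BITS bits.  Hypothetical
   helper for illustration; when it returns zero, the expander in the
   patch would FAIL and let common code handle the store.  */
static int
field_fits_mode (long width, long start, long mode_bits)
{
  return width + start <= mode_bits;
}
```

For example, an 8-bit field at bit 24 fits a 32-bit container, while a 16-bit field at the same position does not and would take the fallback path.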

Tested on spu-elf, committed to mainline.

Bye,
Ulrich

ChangeLog:

* config/spu/spu.md ("insv"): Fail if bitoffset+bitsize
lies outside the target mode.

Index: gcc/config/spu/spu.md
===
*** gcc/config/spu/spu.md   (revision 208964)
--- gcc/config/spu/spu.md   (working copy)
***
*** 2851,2857 
  (match_operand:SI 2 "const_int_operand" ""))
(match_operand 3 "nonmemory_operand" ""))]
""
!   { spu_expand_insv(operands); DONE; })
  
  ;; Simplify a number of patterns that get generated by extv, extzv,
  ;; insv, and loads.
--- 2851,2863 
  (match_operand:SI 2 "const_int_operand" ""))
(match_operand 3 "nonmemory_operand" ""))]
""
!   {
! if (INTVAL (operands[1]) + INTVAL (operands[2])
! > GET_MODE_BITSIZE (GET_MODE (operands[0])))
!   FAIL;
! spu_expand_insv(operands);
! DONE;
!   })
  
  ;; Simplify a number of patterns that get generated by extv, extzv,
  ;; insv, and loads.
-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [patch] Fix texinfo warnings for doc/gcc.texi [was: Re: doc bugs]

2014-04-02 Thread Tobias Burnus

*PING*

Tobias Burnus wrote:

H.J. Lu wrote:
On Fri, Mar 28, 2014 at 12:41 PM, Mike Stump  
wrote:

Since we are nearing release, I thought I'd mention I see:
../../gcc/gcc/doc/invoke.texi:1114: warning: node next `Overall 
Options' in menu `C Dialect Options' and in sectioning `Invoking 
G++' differ

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59055


I think one reason that there are (and were) so many warnings is
that texinfo only recently gained support for diagnosing these issues.
(Or maybe not that recently, but distributions were slow in adopting
newer texinfo versions.)


Attached is a warning-removal patch.
OK for the trunk?

Regarding invoke.texi: It had (nearly) the same @menu twice, once
under @chapter, where it belongs, and once under a @section, where it
doesn't.


Tobias




RFA: PATCH to add -fno-gnu-unique for c++/60731

2014-04-02 Thread Jason Merrill
Use of STB_GNU_UNIQUE to avoid problems with variable symbols shared 
between two RTLD_LOCAL plugins and a common library dependency causes 
problems with libraries that depend on dlclose/dlopen to reinitialize 
state.  This patch adds a -fno-gnu-unique flag that such libraries can use.


Tested x86_64-pc-linux-gnu.  OK for trunk?
commit e9f123743831274cff1c135cf65bb222507bab32
Author: Jason Merrill 
Date:   Wed Apr 2 15:10:32 2014 -0400

	PR c++/60731
	* common.opt (-fno-gnu-unique): Add.
	* config/elfos.h (USE_GNU_UNIQUE_OBJECT): Check it.

diff --git a/gcc/common.opt b/gcc/common.opt
index 62c72f0..2259f29 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1267,6 +1267,10 @@ fgnu-tm
 Common Report Var(flag_tm)
 Enable support for GNU transactional memory
 
+fgnu-unique
+Common Report Var(flag_gnu_unique) Init(1)
+Use STB_GNU_UNIQUE if supported by the assembler
+
 floop-flatten
 Common Ignore
 Does nothing. Preserved for backward compatibility.
diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h
index 1fce701..c1d5553 100644
--- a/gcc/config/elfos.h
+++ b/gcc/config/elfos.h
@@ -287,7 +287,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 /* Write the extra assembler code needed to declare an object properly.  */
 
 #ifdef HAVE_GAS_GNU_UNIQUE_OBJECT
-#define USE_GNU_UNIQUE_OBJECT 1
+#define USE_GNU_UNIQUE_OBJECT flag_gnu_unique
 #else
 #define USE_GNU_UNIQUE_OBJECT 0
 #endif
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index eca4e8f..2e78b8b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1070,6 +1070,7 @@ See S/390 and zSeries Options.
 -ffixed-@var{reg}  -fexceptions @gol
 -fnon-call-exceptions  -fdelete-dead-exceptions  -funwind-tables @gol
 -fasynchronous-unwind-tables @gol
+-fno-gnu-unique @gol
 -finhibit-size-directive  -finstrument-functions @gol
 -finstrument-functions-exclude-function-list=@var{sym},@var{sym},@dots{} @gol
 -finstrument-functions-exclude-file-list=@var{file},@var{file},@dots{} @gol
@@ -22015,6 +22016,20 @@ Generate unwind table in DWARF 2 format, if supported by target machine.  The
 table is exact at each instruction boundary, so it can be used for stack
 unwinding from asynchronous events (such as debugger or garbage collector).
 
+@item -fno-gnu-unique
+@opindex fno-gnu-unique
+On systems with recent GNU assembler and C library, the C++ compiler
+uses the @code{STB_GNU_UNIQUE} binding to make sure that definitions
+of template static data members and static local variables in inline
+functions are unique even in the presence of @code{RTLD_LOCAL}; this
+is necessary to avoid problems with a library used by two different
+@code{RTLD_LOCAL} plugins depending on a definition in one of them and
+therefore disagreeing with the other one about the binding of the
+symbol.  But this causes @code{dlclose} to be ignored for affected
+DSOs; if your program relies on reinitialization of a DSO via
+@code{dlclose} and @code{dlopen}, you can use
+@option{-fno-gnu-unique}.
+
 @item -fpcc-struct-return
 @opindex fpcc-struct-return
 Return ``short'' @code{struct} and @code{union} values in memory like


Re: Skip some gcc.target/i386 tests for conflicting -march= options

2014-04-02 Thread Uros Bizjak
On Wed, Apr 2, 2014 at 6:36 PM, Joseph S. Myers  wrote:

> If you test an x86_64 toolchain with -march=bdver3 in the multilib
> options, as noted in
>  various test
> failures arise from tests whose own -march= in dg-options is
> overridden.  This patch adds dg-skip-if to those tests to skip them
> for conflicting -march= options, as has been done before for other
> tests (obviously, if the option ordering is changed in future in
> DejaGnu, such skips may become obsolete or could be conditioned on
> DejaGnu version).  (No doubt other -march= options would show up
> further tests needing such changes.)
>
> Tested x86_64-linux-gnu.  OK to commit?
>
> 2014-04-02  Joseph Myers  
>
> * gcc.target/i386/funcspec-2.c, gcc.target/i386/funcspec-3.c,
> gcc.target/i386/funcspec-9.c, gcc.target/i386/isa-1.c,
> gcc.target/i386/memcpy-strategy-1.c,
> gcc.target/i386/memcpy-strategy-2.c,
> gcc.target/i386/memcpy-vector_loop-1.c,
> gcc.target/i386/memcpy-vector_loop-2.c,
> gcc.target/i386/memset-vector_loop-1.c,
> gcc.target/i386/memset-vector_loop-2.c,
> gcc.target/i386/sse2-init-v2di-2.c, gcc.target/i386/ssetype-1.c,
> gcc.target/i386/ssetype-2.c, gcc.target/i386/ssetype-5.c: Skip for
> -march= options different from those in dg-options.

OK.

Thanks,
Uros.


Use -mno-prefer-avx128 in two more tests

2014-04-02 Thread Joseph S. Myers
Two of the tests I noted in
 did not get
fixed for --with-arch=bdver3 --with-cpu=bdver3 by adding
-mno-prefer-avx128 in fact also show failures for --with-arch=btver2
--with-tune=btver2, and in that case *are* fixed by adding
-mno-prefer-avx128.  Thus, while in those cases there may still be
other tuning issues as noted in
 (btver2
doesn't enable the flag in question) I think it *is* correct to use
-mno-prefer-avx128 for these two tests, and this patch adds it.

Tested x86_64-linux-gnu.  OK to commit?

2014-04-02  Joseph Myers  

* gcc.target/i386/avx2-vpand-3.c,
gcc.target/i386/avx256-unaligned-load-2.c: Use -mno-prefer-avx128.

Index: gcc/testsuite/gcc.target/i386/avx2-vpand-3.c
===
--- gcc/testsuite/gcc.target/i386/avx2-vpand-3.c(revision 209023)
+++ gcc/testsuite/gcc.target/i386/avx2-vpand-3.c(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
 /* { dg-require-effective-target avx2 } */
 
 
Index: gcc/testsuite/gcc.target/i386/avx256-unaligned-load-2.c
===
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-load-2.c (revision 209023)
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-load-2.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-load" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-load -mno-prefer-avx128" } */
 
 void
 avx_test (char **cp, char **ep)

-- 
Joseph S. Myers
jos...@codesourcery.com


[Committed] S/390: Fix obvious bug in s390_expand_insv

2014-04-02 Thread Andreas Krebbel
Committed to mainline and 4.8.

Bye,

-Andreas-

2014-04-02  Andreas Krebbel  

* config/s390/s390.c (s390_expand_insv): Use GET_MODE_BITSIZE.
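
As a rough sketch of why the one-line change matters: the old test
compared a bit count against a size in bytes, so it rejected almost
every insertion.  The helper names here are made up for illustration
and BITS_PER_UNIT is assumed to be 8; this is not the s390.c code.

```c
#include <stdbool.h>

#define BITS_PER_UNIT 8

/* Buggy form: the left side is in bits, the right side in bytes.  */
bool
rejects_with_byte_size (int bitsize, int bitpos, int mode_bytes)
{
  return bitsize + bitpos > mode_bytes;
}

/* Fixed form: both sides are in bits.  */
bool
rejects_with_bit_size (int bitsize, int bitpos, int mode_bytes)
{
  return bitsize + bitpos > mode_bytes * BITS_PER_UNIT;
}
```

For a 4-byte mode, an 8-bit field at position 0 is rejected by the
first form but accepted by the second.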

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index bdb577c..aac8de8 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -4613,7 +4613,7 @@ s390_expand_insv (rtx dest, rtx op1, rtx op2, rtx src)
   int smode_bsize, mode_bsize;
   rtx op, clobber;
 
-  if (bitsize + bitpos > GET_MODE_SIZE (mode))
+  if (bitsize + bitpos > GET_MODE_BITSIZE (mode))
 return false;
 
   /* Generate INSERT IMMEDIATE (IILL et al).  */



Re: Use -mno-prefer-avx128 in two more tests

2014-04-02 Thread Uros Bizjak
On Wed, Apr 2, 2014 at 10:09 PM, Joseph S. Myers
 wrote:

> Two of the tests I noted in
>  did not get
> fixed for --with-arch=bdver3 --with-cpu=bdver3 by adding
> -mno-prefer-avx128 in fact also show failures for --with-arch=btver2
> --with-tune=btver2, and in that case *are* fixed by adding
> -mno-prefer-avx128.  Thus, while in those cases there may still be
> other tuning issues as noted in
>  (btver2
> doesn't enable the flag in question) I think it *is* correct to use
> -mno-prefer-avx128 for these two tests, and this patch adds it.
>
> Tested x86_64-linux-gnu.  OK to commit?
>
> 2014-04-02  Joseph Myers  
>
> * gcc.target/i386/avx2-vpand-3.c,
> gcc.target/i386/avx256-unaligned-load-2.c: Use -mno-prefer-avx128.

OK.

Thanks,
Uros.


Re: RFA: RL78: Fix handling of (SUBREG (SYMBOL_REF))

2014-04-02 Thread DJ Delorie

This is OK.  Thanks!


Re: [C++ patch] for C++/52369

2014-04-02 Thread Fabien Chêne
2014-03-31 23:48 GMT+02:00 Jason Merrill :
[...]
>> if (permerror (input_location,
>>"default argument given for parameter "
>>"%d of %q#D", i, newdecl))
>>   permerror (DECL_SOURCE_LOCATION (olddecl),
>>  "previous specification in %q#D here",
>>  olddecl);
>>
>> should the second permerror be a note instead ?
>
>
> Yes.

OK to commit the attached patch ?
Tested x86_64 linux, though this piece of code does not seem to be
covered by the testsuite.
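
The pattern under discussion can be sketched outside of GCC as
follows.  The functions are hypothetical stand-ins, not the real
permerror and inform; the only assumption carried over is that the
error routine returns true when it actually printed something.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an error routine that may be suppressed.
   Returns true only if a diagnostic was actually emitted.  */
bool
emit_permerror (bool suppressed, const char *msg)
{
  if (suppressed)
    return false;
  fprintf (stderr, "error: %s\n", msg);
  return true;
}

/* Hypothetical stand-in for a follow-up note.  */
void
emit_note (const char *msg)
{
  fprintf (stderr, "note: %s\n", msg);
}

/* Emit the note only when the error it belongs to was printed.
   Returns the number of diagnostics written.  */
int
report_duplicate_default_arg (bool suppressed)
{
  int printed = 0;
  if (emit_permerror (suppressed, "default argument given for parameter"))
    {
      printed++;
      emit_note ("previous specification here");
      printed++;
    }
  return printed;
}
```

This keeps the note from appearing on its own when the error has been
silenced.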

2014-04-02  Fabien Chêne  

* cp/decl.c (duplicate_decls): Check for the return of
permerror before emitting a note.

-- 
Fabien
Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c	(révision 208997)
+++ gcc/cp/decl.c	(copie de travail)
@@ -1737,9 +1737,9 @@ duplicate_decls (tree newdecl, tree oldd
 			if (permerror (input_location,
    "default argument given for parameter "
    "%d of %q#D", i, newdecl))
-			  permerror (DECL_SOURCE_LOCATION (olddecl),
- "previous specification in %q#D here",
- olddecl);
+			  inform (DECL_SOURCE_LOCATION (olddecl),
+  "previous specification in %q#D here",
+  olddecl);
 		  }
 		else
 		  {


[BUILD] Ping for Jakub's --with-build-config=bootstrap-asan / bootstrap-ubsan patches

2014-04-02 Thread Tobias Burnus
I would like to ping the following two patches of Jakub. As he wrote in 
PR60667:


The http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01370.html fix is still 
waiting for review, you need that for both 
--with-build-config=bootstrap-ubsan and --with-build-config=bootstrap-asan.


For --with-build-config=bootstrap-asan also the
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01433.html patch is needed,
plus --with-build-config=bootstrap-asan will only work with
--disable-werror for now (fix for that expected only in stage1).



Tobias


[Patch, fortran] PR60717 - Wrong code with recursive procedure with unlimited polymorphic dummy argument

2014-04-02 Thread Paul Richard Thomas
Dear All,

This fix, of itself, is quite obvious.  The offset was being set to
zero for array segments, rather than that required for unity valued
lvalues.

I think that the fix could be used to clean up:

trans-expr.c(gfc_trans_alloc_subarray_assign)
trans-expr.c(gfc_trans_pointer_assign)
trans-expr.c(fncall_realloc_result)
trans-array.c(trans_associate_var)

each of which contains calculation of the offset. However, I do not
think that this is the stage to fix things that are not broken!

I propose to keep the PR open as a reminder to look into this.

Bootstrapped and regtested on X86_64/FC17 - OK for trunk and backporting to 4.8?

Paul

 2014-04-12  Paul Thomas  

PR fortran/58771
* trans.h : Add 'use_offset' bitfield to gfc_se.
* trans-array.c (gfc_conv_expr_descriptor) : Use 'use_offset'
as a trigger to unconditionally recalculate the offset.
* trans-expr.c (gfc_conv_intrinsic_to_class) : Use it.
(gfc_conv_procedure_call) : Ditto.

2014-04-02  Paul Thomas  

PR fortran/58771
* gfortran.dg/unlimited_polymorphic_17.f90 : New test
Index: gcc/fortran/trans-array.c
===
*** gcc/fortran/trans-array.c   (revision 208997)
--- gcc/fortran/trans-array.c   (working copy)
*** gfc_conv_expr_descriptor (gfc_se *se, gf
*** 6807,6813 
  
/* Set offset for assignments to pointer only to zero if it is not
   the full array.  */
!   if (se->direct_byref
  && info->ref && info->ref->u.ar.type != AR_FULL)
base = gfc_index_zero_node;
else if (GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
--- 6807,6813 
  
/* Set offset for assignments to pointer only to zero if it is not
   the full array.  */
!   if ((se->direct_byref || se->use_offset)
  && info->ref && info->ref->u.ar.type != AR_FULL)
base = gfc_index_zero_node;
else if (GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
*** gfc_conv_expr_descriptor (gfc_se *se, gf
*** 6899,6905 
  base = fold_build2_loc (input_location, MINUS_EXPR,
  TREE_TYPE (base), base, stride);
}
! else if (GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
{
  tmp = gfc_conv_array_lbound (desc, n);
  tmp = fold_build2_loc (input_location, MINUS_EXPR,
--- 6899,6905 
  base = fold_build2_loc (input_location, MINUS_EXPR,
  TREE_TYPE (base), base, stride);
}
! else if (GFC_ARRAY_TYPE_P (TREE_TYPE (desc)) || se->use_offset)
{
  tmp = gfc_conv_array_lbound (desc, n);
  tmp = fold_build2_loc (input_location, MINUS_EXPR,
*** gfc_conv_expr_descriptor (gfc_se *se, gf
*** 6935,6942 
gfc_get_dataptr_offset (&loop.pre, parm, desc, offset,
subref_array_target, expr);
  
!   if ((se->direct_byref || GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
  && !se->data_not_needed)
{
  /* Set the offset.  */
  gfc_conv_descriptor_offset_set (&loop.pre, parm, base);
--- 6935,6943 
gfc_get_dataptr_offset (&loop.pre, parm, desc, offset,
subref_array_target, expr);
  
!   if (((se->direct_byref || GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
   && !se->data_not_needed)
+ || (se->use_offset && base != NULL_TREE))
{
  /* Set the offset.  */
  gfc_conv_descriptor_offset_set (&loop.pre, parm, base);
Index: gcc/fortran/trans-expr.c
===
*** gcc/fortran/trans-expr.c(revision 208997)
--- gcc/fortran/trans-expr.c(working copy)
*** gfc_conv_intrinsic_to_class (gfc_se *par
*** 593,598 
--- 593,599 
else
{
  parmse->ss = ss;
+ parmse->use_offset = 1;
  gfc_conv_expr_descriptor (parmse, e);
  gfc_add_modify (&parmse->pre, ctree, parmse->expr);
}
*** gfc_conv_procedure_call (gfc_se * se, gf
*** 4378,4383 
--- 4379,4385 
|| CLASS_DATA (fsym)->attr.codimension))
{
  /* Pass a class array.  */
+ parmse.use_offset = 1;
  gfc_conv_expr_descriptor (&parmse, e);
  
  /* If an ALLOCATABLE dummy argument has INTENT(OUT) and is
Index: gcc/fortran/trans.h
===
*** gcc/fortran/trans.h (revision 208997)
--- gcc/fortran/trans.h (working copy)
*** typedef struct gfc_se
*** 87,92 
--- 87,96 
   args alias.  */
unsigned force_tmp:1;
  
+   / * Unconditionally calculate offset for array segments in
+   gfc_conv_expr_descriptor.  */
+   unsigned use_offset:1;
+ 
unsigned want_coarray:1;
  
/* Scalarization parameters.  */
Index: gcc/testsuite/gfortran.dg/unlimited_p

Re: [C++ patch] for C++/52369

2014-04-02 Thread Jason Merrill

On 04/02/2014 04:21 PM, Fabien Chêne wrote:

 * cp/decl.c (duplicate_decls): Check for the return of
 permerror before emitting a note.


You don't need "cp/" within cp/ChangeLog.  OK with that change.

Jason



one more patch to fix PR60650

2014-04-02 Thread Vladimir Makarov

  The following patch fixes the PR for a new set of options.

The details of the problem can be found on

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60650

  The patch affects a sensitive part of LRA.  Therefore I bootstrapped
and tested it on x86-64, aarch64, arm, s390, and ppc64.  The results
look ok.


  x86/x86-64 SPEC2000 testing shows no visible effect on performance 
and code size.


  Committed as rev. 209038.

2014-04-02  Vladimir Makarov  

PR rtl-optimization/60650
* lra-constraints.c (process_alt_operands): Decrease reject for
earlyclobber matching.

2014-04-02  Vladimir Makarov  

PR rtl-optimization/60650
* gcc.target/arm/pr60650-2.c: New.
Index: lra-constraints.c
===
--- lra-constraints.c   (revision 208989)
+++ lra-constraints.c   (working copy)
@@ -1747,12 +1747,27 @@ process_alt_operands (int only_alternati
  [GET_MODE (*curr_id->operand_loc[m])]);
  }
 
-   /* We prefer no matching alternatives because
-  it gives more freedom in RA.  */
-   if (operand_reg[nop] == NULL_RTX
-   || (find_regno_note (curr_insn, REG_DEAD,
-REGNO (operand_reg[nop]))
-== NULL_RTX))
+   /* Prefer matching earlyclobber alternative as
+  it results in less hard regs required for
+  the insn than a non-matching earlyclobber
+  alternative.  */
+   if (curr_static_id->operand[m].early_clobber)
+ {
+   if (lra_dump_file != NULL)
+ fprintf
+   (lra_dump_file,
+"%d Matching earlyclobber alt:"
+" reject--\n",
+nop);
+   reject--;
+ }
+   /* Otherwise we prefer no matching
+  alternatives because it gives more freedom
+  in RA.  */
+   else if (operand_reg[nop] == NULL_RTX
+|| (find_regno_note (curr_insn, REG_DEAD,
+ REGNO (operand_reg[nop]))
+== NULL_RTX))
  {
if (lra_dump_file != NULL)
  fprintf
@@ -2143,7 +2158,7 @@ process_alt_operands (int only_alternati
}
  /* If the operand is dying, has a matching constraint,
 and satisfies constraints of the matched operand
-which failed to satisfy the own constraints, probably
+which failed to satisfy the own constraints, most probably
 the reload for this operand will be gone.  */
  if (this_alternative_matches >= 0
  && !curr_alt_win[this_alternative_matches]
Index: testsuite/gcc.target/arm/pr60650-2.c
===
--- testsuite/gcc.target/arm/pr60650-2.c(revision 0)
+++ testsuite/gcc.target/arm/pr60650-2.c(working copy)
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -march=armv7-a" } */
+
+int a, h, j;
+long long d, e, i;
+int f;
+fn1 (void *p1, int p2)
+{
+switch (p2)
+case 8:
+{
+register b = *(long long *) p1, c asm ("r2");
+asm ("%0": "=r" (a), "=r" (c):"r" (b), "r" (0));
+*(long long *) p1 = c;
+}
+}
+
+fn2 ()
+{
+int k;
+k = f;
+while (1)
+{
+fn1 (&i, sizeof i);
+e = d + k;
+switch (d)
+case 0:
+(
+{
+register l asm ("r4");
+register m asm ("r0");
+asm ("  .err  .endif\n\t": "=r" (h), "=r" (j):"r" (m),
+"r"
+(l));;
+});
+}
+}


[PATCH, committed] Fix PR60733

2014-04-02 Thread Bill Schmidt
PR60733 identifies a case where straight-line strength reduction
produces code that doesn't satisfy SSA verification.  For a PHI
candidate, the insertion of an initializer for a stride calculation
along an incoming arc was specified to be at the point of the feeding
definition of the PHI along that arc.  This is wrong and can place the
initializer far earlier than its operands are guaranteed to be
available.  In this case, the initializer was placed earlier in the
block than the definition of one of its operands.

In fact, the initializer is only needed at the end of the feeding block
for the PHI argument, and its operands are guaranteed to be available at
that point.  This patch changes the placement of the initializer to this
location for PHI candidates.  The nearest common dominator algorithm may
still place the initializer at an earlier point, but only if it is safe
to do so.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
regressions; committed.

Thanks,
Bill


[gcc]

2014-04-02  Bill Schmidt  

PR tree-optimization/60733
* gimple-ssa-strength-reduction.c (ncd_with_phi): Change required
insertion point for PHI candidates to be the end of the feeding
block for the PHI argument.

[gcc/testsuite]

2014-04-02  Bill Schmidt  

PR tree-optimization/60733
* gcc.dg/torture/pr60733.c:  New test.


Index: gcc/testsuite/gcc.dg/torture/pr60733.c
===
--- gcc/testsuite/gcc.dg/torture/pr60733.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr60733.c  (revision 0)
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+
+int a, d, e, f, g, h, i, j, k;
+unsigned short b;
+
+short
+fn1 (int p1, int p2)
+{
+  return p1 * p2;
+}
+
+int
+main ()
+{
+  for (; a; a--)
+{
+  int l = 0;
+  if (f >= 0)
+   {
+ for (; h;)
+   e = 0;
+ for (; l != -6; l--)
+   {
+ j = fn1 (b--, d);
+ for (g = 0; g; g = 1)
+   ;
+ k = e ? 2 : 0;
+   }
+ i = 0;
+ for (;;)
+   ;
+   }
+}
+  d = 0;
+  return 0;
+}
Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 209023)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -3001,10 +3001,10 @@ ncd_with_phi (slsr_cand_t c, double_int incr, gimp
{
  slsr_cand_t arg_cand = base_cand_from_table (arg);
  double_int diff = arg_cand->index - basis->index;
+ basic_block pred = gimple_phi_arg_edge (phi, i)->src;
 
  if ((incr == diff) || (!address_arithmetic_p && incr == -diff))
-   ncd = ncd_for_two_cands (ncd, gimple_bb (arg_cand->cand_stmt),
-*where, arg_cand, where);
+   ncd = ncd_for_two_cands (ncd, pred, *where, NULL, where);
}
}
 }




Re: [PATCH] Disable IPA-SRA for always_inline functions

2014-04-02 Thread Jan Hubicka
> Hi,
> 
> when dealing with a PR yesterday I have noticed that IPA-SRA was
> modifying an always_inline function which is useless work since the
> function must then be inlined anyway.  Thus I'd like to propose the
> following simple change disabling it in such cases.
> 
> Included in a bootstrap and testing on x86_64-linux.  OK for trunk now
> or in the next stage1?

Actually are the attributes copied to the clone?
The patch looks OK to me, even at this stage.

Honza
> 
> Thanks,
> 
> Martin
> 
> 
> 2014-04-01  Martin Jambor  
> 
>   * tree-sra.c (ipa_sra_preliminary_function_checks): Skip
>   always_inline functions.
> 
> Index: src/gcc/tree-sra.c
> ===
> --- src.orig/gcc/tree-sra.c
> +++ src/gcc/tree-sra.c
> @@ -4960,6 +4960,15 @@ ipa_sra_preliminary_function_checks (str
>if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
>  return false;
>  
> +  if (lookup_attribute ("always_inline",
> + DECL_ATTRIBUTES (node->decl)) != NULL)
> +{
> +  if (dump_file)
> + fprintf (dump_file, "Allways inline function will be inlined "
> +  "anyway. \n");
> +  return false;
> +}
> +
>return true;
>  }
>  


Re: [PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Jan Hubicka
> On Wed, Apr 2, 2014 at 2:07 PM, Richard Biener
>  wrote:
> > On Wed, Apr 2, 2014 at 1:50 PM, Markus Trippelsdorf
> >  wrote:
> >> It is a common mistake to enable both -flto and -fprofile-generate when
> >> building projects. This is not a good idea, because memory use will
> >> skyrocket due to instrumentation. So just warn the user.
> >>
> >> OK for next stage1?
> >
> > I'd rather see if we can fix the underlying issue.  For example as we
> > are now instrumenting as IPA pass we can allocate a single
> > counter array (if the number of global vars is the issue).  Basically
> > split analysis and instrumentation into two phases for that.
> >
> > Or even better, do profile instrumentation as "real" IPA pass.
> 
> Thus, isn't -coverage also facing the same issue?  Thus, is it
> really -fprofile-arcs already or only one of the value profiling pieces?

Yep, -fprofile-arcs will cause similar issues.
Implementing instrumentation as a real IPA pass is on my TODO list, but pretty
low, since it is quite some work: we need to stream the CFG into summaries and
make the instrumentation code independent of function bodies, which needs quite
some reorg (at the moment we have no way to load the CFG alone).

Note that -fprofile-generate -flto gives you slightly more precise profiles
than -fprofile-generate alone; this is because of COMDAT functions from static
libraries that may be lost in the first case.

Honza
> 
> Richard.
> 
> > Richard.
> >
> >> 2014-04-02  Markus Trippelsdorf  
> >>
> >> * common.opt (fprofile-generate): Add flag.
> >> * opts.c (finish_options): Add new warning.
> >> (common_handle_option): Set flag.
> >>
> >> diff --git a/gcc/common.opt b/gcc/common.opt
> >> index 62c72f0d2fbf..61e9adfa0df5 100644
> >> --- a/gcc/common.opt
> >> +++ b/gcc/common.opt
> >> @@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
> >>  Enable correction of flow inconsistent profile data input
> >>
> >>  fprofile-generate
> >> -Common
> >> +Common Var(flag_profile_generate)
> >>  Enable common options for generating profile info for profile feedback directed optimizations
> >>
> >>  fprofile-generate=
> >> diff --git a/gcc/opts.c b/gcc/opts.c
> >> index fdc903f9271a..b62a0d626d94 100644
> >> --- a/gcc/opts.c
> >> +++ b/gcc/opts.c
> >> @@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
> >> error_at (loc, "only one -flto-partition value can be specified");
> >>  }
> >>
> >> +  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
> >> +warning_at (loc, 0, "Enabling both -fprofile-generate and -flto is a bad idea.");
> >> +
> >>    /* We initialize opts->x_flag_split_stack to -1 so that targets can set a
> >>   default value if they choose based on other options.  */
> >>if (opts->x_flag_split_stack == -1)
> >> @@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,
> >>
> >>  case OPT_fprofile_generate_:
> >>opts->x_profile_data_prefix = xstrdup (arg);
> >> +  opts->x_flag_profile_generate = true;
> >>value = true;
> >>/* No break here - do -fprofile-generate processing. */
> >>  case OPT_fprofile_generate:
> >> --
> >> Markus


merged trunk into gimple-front-end

2014-04-02 Thread Trevor Saunders
Hi,

I just merged trunk r209020 into the gimple-front-end branch, please
tell me if you see anything busted ;)

I successfully bootstrapped the merge including building the gimple front
end and its few tests passed.

Trev





[Patch, moxie] Zero- and sign-extend values properly

2014-04-02 Thread Anthony Green

This patch does three related things for the moxie port...

1. Changes char to be unsigned by default
2. Changes WCHAR_TYPE from long int to unsigned int
3. Zero- and sign-extends values properly, sometimes using the new
sign-extension instructions.
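
To make the difference between the two kinds of extension concrete,
here is a small sketch in plain C.  It is not moxie-specific and the
function names are invented; it only shows what value an 8-bit
quantity takes when widened to 32 bits each way.

```c
#include <stdint.h>

/* Zero-extension: the upper 24 bits of the result are cleared.  */
int32_t
zero_extend_qi (uint8_t byte)
{
  return (int32_t) byte;
}

/* Sign-extension: the upper 24 bits copy bit 7 of the input.
   Written arithmetically so that it does not rely on how an
   out-of-range value converts to a signed type.  */
int32_t
sign_extend_qi (uint8_t byte)
{
  return (int32_t) (byte ^ 0x80) - 0x80;
}
```

With char unsigned by default, a plain char holding 0xFF widens the
first way (255) rather than the second (-1).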

I am committing this change even at this late stage of the GCC release
process because it only touches the moxie target directory.

AG

2014-04-02  Anthony Green  

* config/moxie/moxie.md (zero_extendqisi2, zero_extendhisi2)
(extendqisi2, extendhisi2): Define.
* config/moxie/moxie.h (DEFAULT_SIGNED_CHAR): Change to 0.
(WCHAR_TYPE): Change to unsigned int.


Index: gcc/config/moxie/moxie.h
===
--- gcc/config/moxie/moxie.h(revision 209042)
+++ gcc/config/moxie/moxie.h(working copy)
@@ -59,7 +59,7 @@
 #define DOUBLE_TYPE_SIZE 64
 #define LONG_DOUBLE_TYPE_SIZE 64
 
-#define DEFAULT_SIGNED_CHAR 1
+#define DEFAULT_SIGNED_CHAR 0
 
 #undef  SIZE_TYPE
 #define SIZE_TYPE "unsigned int"
@@ -68,7 +68,7 @@
 #define PTRDIFF_TYPE "int"
 
 #undef  WCHAR_TYPE
-#define WCHAR_TYPE "long int"
+#define WCHAR_TYPE "unsigned int"
 
 #undef  WCHAR_TYPE_SIZE
 #define WCHAR_TYPE_SIZE BITS_PER_WORD
Index: gcc/config/moxie/moxie.md
===
--- gcc/config/moxie/moxie.md   (revision 209042)
+++ gcc/config/moxie/moxie.md   (working copy)
@@ -239,6 +239,56 @@
ldo.l  %0, %1"
   [(set_attr "length"  "2,2,6,2,6,2,6,6,6")])
 
+(define_insn_and_split "zero_extendqisi2"
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r")
+   (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "0,W,A,B")))]
+  ""
+  "@
+   ;
+   ld.b   %0, %1
+   lda.b  %0, %1
+   ldo.b  %0, %1
+  "reload_completed"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 0) (zero_extend:SI (match_dup 2)))]
+{
+  operands[2] = gen_lowpart (QImode, operands[0]);
+}
+  [(set_attr "length" "0,2,6,6")])
+
+(define_insn_and_split "zero_extendhisi2"
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r")
+   (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "0,W,A,B")))]
+  ""
+  "@
+   ;
+   ld.s   %0, %1
+   lda.s  %0, %1
+   ldo.s  %0, %1
+  "reload_completed"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 0) (zero_extend:SI (match_dup 2)))]
+{
+  operands[2] = gen_lowpart (HImode, operands[0]);
+}
+  [(set_attr "length" "0,2,6,6")])
+
+(define_insn "extendqisi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (sign_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r")))]
+  ""
+  "@
+   sex.b  %0, %1"
+  [(set_attr "length" "2")])
+
+(define_insn "extendhisi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r")))]
+  ""
+  "@
+   sex.s  %0, %1"
+  [(set_attr "length" "2")])
+
 (define_expand "movqi"
   [(set (match_operand:QI 0 "general_operand" "")
(match_operand:QI 1 "general_operand" ""))]


Fix ipa-devirt ICE

2014-04-02 Thread Jan Hubicka
Hi,
this patch fixes an ICE on type-inconsistent code.  The ICE happens because of
a gcc_unreachable I forgot in the code during development.  I added a way to
mark calls as inconsistent, which is useful for redirecting them to UNREACHABLE.
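
The marking scheme can be pictured with the following sketch, which
uses a reserved token value as the "inconsistent" flag.  Everything
here is hypothetical (names, signature, the pass-through in the normal
path); it only mirrors the idea of returning an empty but complete
target list for marked calls.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Reserved token value marking a call whose types are inconsistent.  */
#define INCONSISTENT_TOKEN INT_MAX

/* Returns how many targets a call may have.  For marked calls the
   list is empty and complete, so the call can later be redirected to
   unreachable.  The real lookup is outside this sketch; the normal
   path just passes CANDIDATES through and reports the list as
   incomplete.  */
size_t
count_possible_targets (int otr_token, size_t candidates, bool *completep)
{
  if (otr_token == INCONSISTENT_TOKEN)
    {
      if (completep)
        *completep = true;
      return 0;
    }
  if (completep)
    *completep = false;
  return candidates;
}
```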

Bootstrapped/regtested x86_64-linux, committed.

Honza

* testsuite/g++.dg/torture/pr60659.C: New testcase.
* ipa-devirt.c (get_polymorphic_call_info): Do not ICE on type
inconsistent code and instead mark the context inconsistent.
(possible_polymorphic_call_targets): For inconsistent contexts
return empty complete list.
Index: testsuite/g++.dg/torture/pr60659.C
===
--- testsuite/g++.dg/torture/pr60659.C  (revision 0)
+++ testsuite/g++.dg/torture/pr60659.C  (revision 0)
@@ -0,0 +1,58 @@
+// { dg-do compile }
+template  void __distance (_InputIterator);
+template 
+void distance (_InputIterator, _InputIterator p2)
+{
+  __distance (p2);
+}
+
+namespace boost
+{
+template  struct A
+{
+  typedef typename Iterator::difference_type type;
+};
+template  typename T::const_iterator end (T &);
+template  typename T::const_iterator begin (T &);
+template  struct D : A
+{
+};
+template  typename D::type distance (const T &p1)
+{
+  distance (boost::begin (p1), boost::end (p1));
+  return 0;
+}
+template  class B
+{
+public:
+  typedef B type;
+  typedef IteratorT const_iterator;
+};
+}
+
+typedef int storage_t[];
+struct F;
+template  class> struct G
+{
+  G (const G &p1) { p1.m_fn1 ().m_fn1 (0); }
+  const F &m_fn1 () const
+  {
+const void *a;
+a = &data_m;
+return *static_cast(a);
+  }
+  storage_t *data_m;
+};
+
+struct F
+{
+  virtual F *m_fn1 (void *) const;
+};
+template  struct H;
+struct C : G
+{
+  typedef int difference_type;
+};
+boost::B AllTransVideos ();
+int b = boost::distance (AllTransVideos ());
+
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 208915)
+++ ipa-devirt.c(working copy)
@@ -1214,7 +1214,13 @@ get_polymorphic_call_info (tree fndecl,
 not part of outer type.  */
  if (!contains_type_p (TREE_TYPE (base),
context->offset + offset2, *otr_type))
-   return base_pointer;
+   {
+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent
+code sequences; we arrange the calls to be builtin_unreachable
+later.  */
+ *otr_token = INT_MAX;
+ return base_pointer;
+   }
  get_polymorphic_call_info_for_decl (context, base,
  context->offset + offset2);
  return NULL;
@@ -1288,8 +1294,10 @@ get_polymorphic_call_info (tree fndecl,
  if (!contains_type_p (context->outer_type, context->offset,
*otr_type))
{ 
- context->outer_type = NULL;
- gcc_unreachable ();
+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent
+code sequences; we arrange the calls to be builtin_unreachable
+later.  */
+ *otr_token = INT_MAX;
  return base_pointer;
}
  context->maybe_derived_type = false;
@@ -1389,6 +1397,9 @@ devirt_variable_node_removal_hook (varpo
temporarily change to one of base types.  INCLUDE_DERIVER_TYPES make
us to walk the inheritance graph for all derivations.
 
+   OTR_TOKEN == INT_MAX is used to mark calls that are provably
+   undefined and should be redirected to unreachable.
+
If COMPLETEP is non-NULL, store true if the list is complete. 
CACHE_TOKEN (if non-NULL) will get stored to an unique ID of entry
in the target cache.  If user needs to visit every target list
@@ -1422,6 +1433,7 @@ possible_polymorphic_call_targets (tree
   bool complete;
   bool can_refer;
 
+  /* If ODR is not initialized, return empty incomplete list.  */
   if (!odr_hash.is_created ())
 {
   if (completep)
@@ -1431,11 +1443,28 @@ possible_polymorphic_call_targets (tree
   return nodes;
 }
 
+  /* If we hit type inconsistency, just return empty list of targets.  */
+  if (otr_token == INT_MAX)
+{
+  if (completep)
+   *completep = true;
+  if (nonconstruction_targetsp)
+   *nonconstruction_targetsp = 0;
+  return nodes;
+}
+
   type = get_odr_type (otr_type, true);
 
   /* Lookup the outer class type we want to walk.  */
-  if (context.outer_type)
-get_class_context (&context, otr_type);
+  if (context.outer_type
+  && !get_class_context (&context, otr_type))
+{
+  if (completep)
+   *completep = false;
+  if (nonconstruction_targetsp)
+   *nonconstruction_targetsp = 0;
+  return nodes;
+}
 
   /* We canonicalize 

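The INT_MAX marker used in the ipa-devirt.c hunks above can be illustrated with a small standalone sketch. This is not GCC code: `struct target_list`, `analyze_call` and `possible_targets` are hypothetical, heavily simplified stand-ins for the real data structures and for get_polymorphic_call_info / possible_polymorphic_call_targets. It only shows the idea of flagging an inconsistent call with a sentinel token and later answering with an empty, complete target list.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-in; not a GCC type.  */
struct target_list
{
  size_t count;   /* Number of possible call targets found.  */
  bool complete;  /* True if the list is known to be exhaustive.  */
};

/* Analysis step (assumed shape): on a type inconsistency, mark the call
   by overwriting its token with the INT_MAX sentinel instead of
   aborting.  */
static void
analyze_call (bool types_consistent, int *otr_token)
{
  assert (otr_token != NULL);
  if (!types_consistent)
    *otr_token = INT_MAX;
}

/* Lookup step (assumed shape): a sentinel token yields an empty list
   that is marked complete, so a caller may conclude that the call has
   no valid target at all.  */
static struct target_list
possible_targets (int otr_token, size_t known_targets)
{
  struct target_list result = { 0, true };
  if (otr_token == INT_MAX)
    return result;
  result.count = known_targets;
  return result;
}
```

An empty list that is also complete is what lets a later pass redirect the call to builtin_unreachable, as the comment in the patch describes.
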
RE: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Thomas Preud'homme
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> 
> "More like" isn't enough to answer this - do you have a testcase?  (usually
> these end up in undefined-overflow and/or conversion-to-sizetype issues)

I do. See attachment. This testcase needs to be compiled with patch 2/3
applied. As you can see from the patch, data[a] and data[a+1] will be
converted to offsets by multiplying the index by the element size. Then
later, when analyzing the ORing, a subtraction of these two indices will
be done. So you have two fold_build calls and not one. I can't reproduce
it with a simple expression such as (a+1)*1 - a*1, so maybe it being done
in two parts is the reason; you know better.
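
To make the arithmetic concrete, here is a small standalone sketch. It is not the attached missed_folding.c; the names `elt_offset`, `offset_delta` and `load_pair` are made up for illustration, and it only mirrors the two-step computation described above at the source level rather than what fold_build does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of element IDX in an array whose
   elements are ELT_SIZE bytes wide, i.e. idx * cst.  */
static size_t
elt_offset (size_t idx, size_t elt_size)
{
  return idx * elt_size;
}

/* Difference between the offsets of data[a+1] and data[a].  Each offset
   is built separately and the two are then subtracted, so the expression
   is ((a+1)*cst) - (a*cst), which should simplify to cst.  In unsigned
   arithmetic the identity also holds modulo 2^N.  */
static size_t
offset_delta (size_t a, size_t elt_size)
{
  size_t hi = elt_offset (a + 1, elt_size);
  size_t lo = elt_offset (a, elt_size);
  return hi - lo;
}

/* The kind of source pattern involved: two adjacent elements ORed
   together.  */
static uint32_t
load_pair (const uint16_t *data, size_t a)
{
  assert (data != NULL);
  return (uint32_t) data[a] | ((uint32_t) data[a + 1] << 16);
}
```

The point is that offset_delta does not depend on a, so ideally it would fold to the element size.
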

Best regards,

Thomas

missed_folding.c
Description: Binary data


RE: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Thomas Preud'homme
> From: Joseph Myers [mailto:jos...@codesourcery.com]
> 
> > +   if { [is-effective-target bswap]
> > +&& ![istarget x86_64-*-*] } {
> 
> That x86_64-*-* test is wrong.  x86_64-*-* and i?86-*-* should always be
> handled the same (if you then want to distinguish 32-bit and 64-bit
> multilibs, you check the appropriate effective-target there, depending on
> whether the condition is one on the ABI or which register size is being
> used, which affects how x32 should be counted).

Indeed, it's a mistake. i?86 should be in there too. Please find attached an
updated patch.

Best regards,

Thomas 


gcc32rm-84.3.1.part1.diff
Description: Binary data


Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Rainer Orth
"Thomas Preud'homme"  writes:

>> From: Joseph Myers [mailto:jos...@codesourcery.com]
>> 
>> > +   if { [is-effective-target bswap]
>> > +&& ![istarget x86_64-*-*] } {
>> 
>> That x86_64-*-* test is wrong.  x86_64-*-* and i?86-*-* should always be
>> handled the same (if you then want to distinguish 32-bit and 64-bit
>> multilibs, you check the appropriate effective-target there, depending on
>> whether the condition is one on the ABI or which register size is being
>> used, which affects how x32 should be counted).
>
> Indeed, it's a mistake. i?86 should be in there too. Please find attached an
> updated patch.

> diff --git a/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c b/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c
> index 7d557f3..a9c3443 100644
> --- a/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c
> +++ b/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c
> @@ -1,6 +1,6 @@
> -/* { dg-do compile { target arm*-*-* alpha*-*-* ia64*-*-* x86_64-*-* s390x-*-* powerpc*-*-* rs6000-*-* } } */
> +/* { dg-do compile { target *-*-* } } */

Just omit the { target *-*-* } completely; the same applies in a few
more places in the patch.
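
For illustration, a test header without any target selector could look roughly like the sketch below. This is not the actual content of optimize-bswapdi-1.c: the function `swap64` and the option string are assumptions, and any dg-require-effective-target line the real test needs is deliberately left out here.

```c
/* { dg-do compile } */
/* { dg-options "-O2" } */

#include <stdint.h>

/* Hand-written 64-bit byte swap: each byte is masked out and shifted to
   its mirrored position.  This is the kind of pattern a bswap
   optimization is meant to recognize.  */
uint64_t
swap64 (uint64_t in)
{
  return ((in & 0x00000000000000ffULL) << 56)
	 | ((in & 0x000000000000ff00ULL) << 40)
	 | ((in & 0x0000000000ff0000ULL) << 24)
	 | ((in & 0x00000000ff000000ULL) << 8)
	 | ((in & 0x000000ff00000000ULL) >> 8)
	 | ((in & 0x0000ff0000000000ULL) >> 24)
	 | ((in & 0x00ff000000000000ULL) >> 40)
	 | ((in & 0xff00000000000000ULL) >> 56);
}
```

With no selector, dg-do compile runs on every target, which is the same as spelling out { target *-*-* }.
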

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University