Re: [PATCH] Replace Yoda conditions in gcc/

2017-12-20 Thread Eric Botcazou
> If some port maintainer used 1 <= x && x <= 24 style and doesn't like
> x >= 1 && x <= 24 for some reason, there is always IN_RANGE macro and
> IN_RANGE (x, 1, 24) can be used instead (though, such a change requires
> double checking the type of x, it shouldn't be wider than HOST_WIDE_INT).

Yes, the visium changes are rather counter-productive, especially in 
visium_legitimize_address & visium_legitimize_reload_address where the style 
is clearly inconsistent now.

As far as I'm concerned, reading (a < x && x < b) is twice as fast as reading 
(x > a && x < b).  And IN_RANGE is too ambiguous wrt the bounds.

-- 
Eric Botcazou


Re: [SFN] Bootstrap broken

2017-12-20 Thread Jakub Jelinek
On Wed, Dec 20, 2017 at 04:57:43AM -0200, Alexandre Oliva wrote:
> for  gcc/ChangeLog
> 
>   PR bootstrap/83396
>   * cfgexpand.c (label_rtx_for_bb): Revert SFN changes that
>   allowed debug stmts before labels.
>   (expand_gimple_basic_block): Likewise.
>   * gimple-iterator.c (gimple_find_edge_insert_loc): Likewise.
>   * gimple-iterator.h (gsi_after_labels): Likewise.
>   * tree-cfgcleanup (remove_forwarder_block): Likewise, but
>   rename reused variable, and simplify using gsi_move_before.
>   * tree-ssa-tail-merge.c (find_duplicate): Likewise.
>   * tree-cfg.c (make_edges, cleanup_dead_labels): Likewise.
>   (gimple_can_merge_blocks_p, verify_gimple_in_cfg): Likewise.
>   (gimple_verify_flow_info, gimple_block_label): Likewise.
>   (make_blocks): Move debug markers after adjacent labels.
>   * cfgrtl.c (skip_insns_after_block): Revert SFN changes that
>   allowed debug insns outside blocks.
>   * df-scan.c (df_insn_delete): Likewise.
>   * lra-constraints.c (update_ebb_live_info): Likewise.
>   * var-tracking.c (get_first_insn, vt_emit_notes): Likewise.
>   (vt_initialize, delete_vta_debug_insns): Likewise.
>   (reemit_marker_as_note): Drop BB parm.  Adjust callers.

Ok, thanks.

Jakub


Re: [PATCH] Replace Yoda conditions in gcc/

2017-12-20 Thread Jakub Jelinek
On Wed, Dec 20, 2017 at 09:29:22AM +0100, Eric Botcazou wrote:
> > If some port maintainer used 1 <= x && x <= 24 style and doesn't like
> > x >= 1 && x <= 24 for some reason, there is always IN_RANGE macro and
> > IN_RANGE (x, 1, 24) can be used instead (though, such a change requires
> > double checking the type of x, it shouldn't be wider than HOST_WIDE_INT).
> 
> Yes, the visium changes are rather counter-productive, especially in 
> visium_legitimize_address & visium_legitimize_reload_address where the style 
> is clearly inconsistent now.
> 
> As far as I'm concerned, reading (a < x && x < b) is twice as fast as reading 
> (x > a && x < b).  And IN_RANGE is too ambiguous wrt the bounds.

That depends on the reader, and as we have multiple readers, it is better to
be consistent.  And, I find no ambiguity on IN_RANGE, it is
inclusive for both values and used heavily through the compiler, so using it
is a welcome cleanup, e.g. some ports that use it most often:
67 config/i386/
59 config/frv/
53 config/rs6000/
52 config/powerpcspe/
46 config/mips/
44 config/avr/
28 config/m68k/
23 config/rl78/
23 config/arm/
23 config/aarch64/
15 config/rx/
15 config/mn10300/
14 config/m32c/
14 config/h8300/
11 config/cris/
10 config/fr30/

Jakub


Re: [PATCH][arm] PR target/82975: Guard against reg_renumber being NULL in arm.h

2017-12-20 Thread Kyrill Tkachov


On 19/12/17 21:56, Jakub Jelinek wrote:

On Tue, Dec 19, 2017 at 04:58:37PM +, Kyrill Tkachov wrote:

2017-12-19  Kyrylo Tkachov  

 PR target/82975
 * gcc.dg/pr82975.c: New test.

The testcase FAILs on x86_64-linux/i686-linux and probably on all
non-arm/aarch64 targets.

Fixed thusly, committed to trunk as obvious.


Thank you for fixing this. I had committed an older version of the 
testcase accidentally :(

My proper testcase did have the target guard.

Kyrill


2017-12-19  Jakub Jelinek  

PR target/82975
* gcc.dg/pr82975.c: Only add -mtune=cortex-a57 on arm*/aarch64*
targets.

--- gcc/testsuite/gcc.dg/pr82975.c.jj   2017-12-19 18:07:33.0 +0100
+++ gcc/testsuite/gcc.dg/pr82975.c  2017-12-19 22:53:36.383901451 +0100
@@ -1,6 +1,7 @@
  /* PR target/82975.  */
  /* { dg-do compile } */
-/* { dg-options "-mtune=cortex-a57 -fno-sched-pressure -O2" } */
+/* { dg-options "-fno-sched-pressure -O2" } */
+/* { dg-additional-options "-mtune=cortex-a57" { target arm*-*-* aarch64*-*-* 
} } */
  
  typedef __SIZE_TYPE__ size_t;
  


Jakub




Re: [PATCH, PR83492] Fix selection of aarch64 big-endian shift parameters based on __AARCH64EB__

2017-12-20 Thread Kyrill Tkachov


On 20/12/17 00:23, Jeff Law wrote:

On 12/19/2017 11:23 AM, Jakub Jelinek wrote:

On Tue, Dec 19, 2017 at 07:01:44PM +0100, Michael Weiser wrote:

Hi Kyrill,

On Tue, Dec 19, 2017 at 05:35:03PM +, Kyrill Tkachov wrote:


below patch fixes PR 83492.

I agree with your analysis of the bug and your patch will fix the
problem when compiling with GCC. However, I believe the more
appropriate macro to check here is __ARM_BIG_ENDIAN as that
is the macro defined by the ACLE specification [1].
__AARCH64EB__ is indeed defined by aarch64 GCC when compiling
for big-endian but I don't think it's standardised and may in theory
not be defined by other host compilers the user may use to compile GCC.

Since the macro is used in various places around the code, I'd suggest
fixing this bug and getting back in line with the rest of the codebase
by applying my patch as-is. The move towards a more appropriate macro to
base the checks on can then be done in a separate step using
search'n'replace.

$ grep -r __AARCH64EB__ . | cut -d: -f1 | sort -u | grep -v \.svn
./gcc/config/aarch64/aarch64-c.c
./gcc/config/aarch64/arm_neon.h
./gcc/testsuite/lib/target-supports.exp
./libcpp/ChangeLog
./libcpp/lex.c
./libffi/src/aarch64/ffi.c
./libffi/src/aarch64/sysv.S
./libgcc/config/aarch64/linux-unwind.h
./libgcc/config/aarch64/sfp-machine.h
./libstdc++-v3/config/cpu/aarch64/opt/ext/opt_random.h

That doesn't tell anything on what should be used in libcpp.  Unlike the
rest, libcpp is a library that needs to be built on the host, with the
host compiler (which doesn't have to be GCC).  The only case that is in
the same category is opt_random.h, because it is an installed header and
so again, can be used by GCC or other compilers.
GCC isn't going to stop defining __AARCH64EB__ nor __ARM_BIG_ENDIAN,
it is purely about other compilers.

I think using ACLE specified macro is the way to go here.  At least for
libcpp.  opt_random seems like a good one to fix, particularly for
situations where folks may be using LLVM with GCC's libstdc++.

A patch to use __ARM_BIG_ENDIAN in those two spots is approved.


I'll take care of the opt_random occurrence.
Michael, would you like to respin your patch to libcpp to use 
__ARM_BIG_ENDIAN?


Thanks,
Kyrill


jeff




Re: [PATCH][arm] PR target/82975: Guard against reg_renumber being NULL in arm.h

2017-12-20 Thread Kyrill Tkachov


On 20/12/17 09:22, Kyrill Tkachov wrote:


On 19/12/17 21:56, Jakub Jelinek wrote:
> On Tue, Dec 19, 2017 at 04:58:37PM +, Kyrill Tkachov wrote:
>> 2017-12-19  Kyrylo Tkachov 
>>
>>  PR target/82975
>>  * gcc.dg/pr82975.c: New test.
> The testcase FAILs on x86_64-linux/i686-linux and probably on all
> non-arm/aarch64 targets.
>
> Fixed thusly, committed to trunk as obvious.

Thank you for fixing this. I had committed an older version of the
testcase accidentally :(
My proper testcase did have the target guard.



In fact, here is the patch I meant to commit.
Committed the missing comment change to arm.h.

Kyrill


Kyrill

> 2017-12-19  Jakub Jelinek  
>
>PR target/82975
>* gcc.dg/pr82975.c: Only add -mtune=cortex-a57 on arm*/aarch64*
>targets.
>
> --- gcc/testsuite/gcc.dg/pr82975.c.jj 2017-12-19 18:07:33.0 
+0100
> +++ gcc/testsuite/gcc.dg/pr82975.c2017-12-19 22:53:36.383901451 
+0100

> @@ -1,6 +1,7 @@
>   /* PR target/82975.  */
>   /* { dg-do compile } */
> -/* { dg-options "-mtune=cortex-a57 -fno-sched-pressure -O2" } */
> +/* { dg-options "-fno-sched-pressure -O2" } */
> +/* { dg-additional-options "-mtune=cortex-a57" { target arm*-*-* 
aarch64*-*-* } } */

>
>   typedef __SIZE_TYPE__ size_t;
>
>
>Jakub



commit d0469b43a85de253444f36702090219521e65f54
Author: Kyrylo Tkachov 
Date:   Mon Dec 18 12:06:55 2017 +

[arm] PR target/82975: Guard against reg_renumber being NULL in arm.h

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index ac51412..c46fd60 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1615,12 +1615,10 @@ enum arm_auto_incmodes
 
 /* These assume that REGNO is a hard or pseudo reg number.
They give nonzero only if REGNO is a hard reg of the suitable class
-   or a pseudo reg currently allocated to a suitable hard reg.
-   Since they use reg_renumber, they are safe only once reg_renumber
-   has been allocated, which happens in reginfo.c during register
-   allocation.  */
+   or a pseudo reg currently allocated to a suitable hard reg.  */
 #define TEST_REGNO(R, TEST, VALUE) \
-  ((R TEST VALUE) || ((unsigned) reg_renumber[R] TEST VALUE))
+  ((R TEST VALUE)	\
+|| (reg_renumber && ((unsigned) reg_renumber[R] TEST VALUE)))
 
 /* Don't allow the pc to be used.  */
 #define ARM_REGNO_OK_FOR_BASE_P(REGNO)			\
diff --git a/gcc/testsuite/gcc.dg/pr82975.c b/gcc/testsuite/gcc.dg/pr82975.c
new file mode 100644
index 000..f4c15fa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr82975.c
@@ -0,0 +1,20 @@
+/* PR target/82975.  */
+/* { dg-do compile } */
+/* { dg-options "-fno-sched-pressure -O2" } */
+/* { dg-options "-mtune=cortex-a57 -fno-sched-pressure -O2" { target arm*-*-* } } */
+
+typedef __SIZE_TYPE__ size_t;
+
+struct S1
+{
+  char pad1;
+  char val;
+  short pad2;
+};
+
+extern char t[256];
+
+void foo (struct S1 a, size_t i)
+{
+  t[i] = a.val;
+}


Re: [PATCH] Replace Yoda conditions in gcc/

2017-12-20 Thread Eric Botcazou
> That depends on the reader, and as we have multiple readers, it is better to
> be consistent.  And, I find no ambiguity on IN_RANGE, it is
> inclusive for both values and used heavily through the compiler, so using it
> is a welcome cleanup, e.g. some ports that use it most often:

OK, let's go for IN_RANGE then, but it's Yoda style on steroids!

-- 
Eric Botcazou


[PATCH][aarch64][libstdc++] Use __ARM_BIG_ENDIAN instead of __AARCH64EB__ in opt_random.h

2017-12-20 Thread Kyrill Tkachov

Hi all,

As has been spotted at 
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01289.html

we check the __AARCH64EB__ macro for aarch64 big-endian
detection in config/cpu/aarch64/opt/ext/opt_random.h.
That works just fine with GCC but the standardised ACLE[1] macro
for that purpose is __ARM_BIG_ENDIAN so there is a possibility
that non-GCC compilers that include this header are not aware
of this predefine.

So this patch changes the use of __AARCH64EB__ to
the more portable __ARM_BIG_ENDIAN.

Tested on aarch64-none-elf and aarch64_be-none-elf.

Preapproved by Jeff at 
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01326.html


Committed to trunk.

Thanks,
Kyrill

[1] https://static.docs.arm.com/ihi0053/c/IHI0053C_acle_2_0.pdf

2017-12-20  Kyrylo Tkachov  

* config/cpu/aarch64/opt/ext/opt_random.h (__VEXT): Check
__ARM_BIG_ENDIAN instead of __AARCH64EB__.
commit acf0abfe3d166df053ca81840fbe29f4eac72b38
Author: Kyrylo Tkachov 
Date:   Tue Dec 19 18:45:01 2017 +

[aarch64][libstdc++] Use __ARM_BIG_ENDIAN instead of __AARCH64EB__ in opt_random.h

diff --git a/libstdc++-v3/config/cpu/aarch64/opt/ext/opt_random.h b/libstdc++-v3/config/cpu/aarch64/opt/ext/opt_random.h
index 330050f..7c82480 100644
--- a/libstdc++-v3/config/cpu/aarch64/opt/ext/opt_random.h
+++ b/libstdc++-v3/config/cpu/aarch64/opt/ext/opt_random.h
@@ -34,7 +34,7 @@
 
 #ifdef __ARM_NEON
 
-#ifdef __AARCH64EB__
+#ifdef __ARM_BIG_ENDIAN
 # define __VEXT(_A,_B,_C) __builtin_shuffle (_A, _B, (__Uint8x16_t) \
 {16-_C, 17-_C, 18-_C, 19-_C, 20-_C, 21-_C, 22-_C, 23-_C, \
  24-_C, 25-_C, 26-_C, 27-_C, 28-_C, 29-_C, 30-_C, 31-_C})


Re: [PR83370][AARCH64]Use tighter register constraints for sibcall patterns.

2017-12-20 Thread Renlin Li

Ping ~

On 11/12/17 15:27, Renlin Li wrote:

Hi all,

In aarch64 backend, ip0/ip1 register will be used in the prologue/epilogue as
temporary register.

When the compiler is performing sibcall optimization. It has the chance to use
ip0/ip1 register for indirect function call to hold the address. However, those 
two register might
be clobbered by the epilogue code which makes the last sibcall instruction
invalid.

The following is an extreme example:
When built with -O2 -ffixed-x0 -ffixed-x1 -ffixed-x2 -ffixed-x3 -ffixed-x4 
-ffixed-x5 -ffixed-x6 -ffixed-x7
-ffixed-x8 -ffixed-x9 -ffixed-x10 -ffixed-x11 -ffixed-x12 -ffixed-x13 
-ffixed-x14 -ffixed-x15 -ffixed-x17 -ffixed-x18

void (*f)();
int xx;

void tailcall (int i)

{
    int arr[5000];
    xx = arr[i];
    f();
}


tailcall:
 mov    x16, 20016
 sub    sp, sp, x16
 adrp    x16, .LANCHOR0
 stp    x19, x30, [sp]
 add    x19, sp, 16
 ldr    s0, [x19, w0, sxtw 2]
 ldp    x19, x30, [sp]
 str    s0, [x16, #:lo12:.LANCHOR0]
 mov    x16, 20016
 add    sp, sp, x16
 br    x16   // oops


As we can see, x16 is used in the indirect sibcall instruction. It is used as
a temporary in the epilogue code as well. The register allocation is invalid.

With the change, the register allocator is only allowed to use r0-r15, r18 for
indirect sibcall instruction.

For this particular case above, the compiler will ICE as there is not register
could be used for this sibcall instruction.
And I think it is better to fail instead of wrong code-generation.

test.c:10:1: error: unable to generate reloads for:
  }
  ^
(call_insn/j 16 12 17 2 (parallel [
     (call (mem:DI (reg/f:DI 84 [ f ]) [0 *f.0_2 S8 A8])
     (const_int 0 [0]))
     (return)
     ]) "test.c":9 42 {*sibcall_insn}
  (expr_list:REG_DEAD (reg/f:DI 84 [ f ])
     (expr_list:REG_CALL_DECL (nil)
     (nil)))
     (expr_list (clobber (reg:DI 17 x17))
     (expr_list (clobber (reg:DI 16 x16))
     (nil

aarch64-none-elf test without regressions. Okay to commit?
The same issue affects gcc-6, gcc-7 as well. Backport are needed for those 
branches.

Regards,
Renlin

gcc/ChangeLog:

2017-12-11  Renlin Li  

     PR target/83370
 * config/aarch64/aarch64.c (aarch64_class_max_nregs): Handle
 TAILCALL_ADDR_REGS.
 (aarch64_register_move_cost): Likewise.
 * config/aarch64/aarch64.h (reg_class): Rename CALLER_SAVE_REGS to 
TAILCALL_ADDR_REGS.
 * config/aarch64/constraints.md (Ucs): Update register constraint.


[PATCH][arm][doc] Document accepted -march=armv8.3-a extension options

2017-12-20 Thread Kyrill Tkachov

Hi all,

I noticed that we helpfully list the extensions that are accepted
by the -march options on arm but we were missing the information
for 'armv8.3-a'.

This patchlet corrects that.
Built the documentation and it looked ok.

Committing to trunk.

Thanks,
Kyrill

2017-12-20  Kyrylo Tkachov  

* doc/invoke.texi (ARM Options): Document accepted extension options
for -march=armv8.3-a.
commit 68a56ead5df086cf11aea7651e76336b504a883d
Author: Kyrylo Tkachov 
Date:   Mon Dec 18 19:07:22 2017 +

[arm][doc] Document accepted Armv8.3-A extension options

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2049c27..9390882 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15852,6 +15852,7 @@ Disable the floating-point, Advanced SIMD and cryptographic instructions.
 @end table
 
 @item armv8.2-a
+@itemx armv8.3-a
 @table @samp
 @item +fp16
 The half-precision floating-point data processing instructions.


[arm] PR target/83105: Minor change of default CPU for arm-linux-gnueabi

2017-12-20 Thread Richard Earnshaw (lists)
When GCC for ARM/linux is configured with --with-float=hard, or
--with-float=softfp the compiler will now die when trying to build the
support libraries because the baseline architecture is too old to
support VFP (older versions of GCC just emitted the VFP instructions
anyway, even though they wouldn't run on that version of the
architecture; but we're now more prickly about it).

This patch fixed the problem by raising the default architecture
(actually the default CPU) to ARMv5te (ARM10e) when we need to generate
HW floating-point code.

PR target/83105
* config.gcc (arm*-*-linux*): When configured with --with-float=hard
or --with-float=softfp, set the default CPU to arm10e.[arm] PR
target/83105: Minor change of default CPU for arm-linux-gnueabi

When GCC for ARM/linux is configured with --with-float=hard, or
--with-float=softfp the compiler will now die when trying to build the
support libraries because the baseline architecture is too old to
support VFP (older versions of GCC just emitted the VFP instructions
anyway, even though they wouldn't run on that version of the
architecture; but we're now more prickly about it).

This patch fixed the problem by raising the default architecture
(actually the default CPU) to ARMv5te (ARM10e) when we need to generate
HW floating-point code.

PR target/83105
* config.gcc (arm*-*-linux*): When configured with --with-float=hard
or --with-float=softfp, set the default CPU to arm10e.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index e208d00bd5bf582f8179d045ae27f452c57eb623..c904f794d77bd30151ca82a076af471b2d211b71 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1132,7 +1132,12 @@ arm*-*-linux-*)			# ARM GNU/Linux with ELF
 	esac
 	tmake_file="${tmake_file} arm/t-arm arm/t-arm-elf arm/t-bpabi arm/t-linux-eabi"
 	tm_file="$tm_file arm/bpabi.h arm/linux-eabi.h arm/aout.h vxworks-dummy.h arm/arm.h"
-	target_cpu_cname="arm10tdmi"
+	# Generation of floating-point instructions requires at least ARMv5te.
+	if [ "$with_float" = "hard" -o "$with_float" = "softfp" ] ; then
+	target_cpu_cname="arm10e"
+	else
+	target_cpu_cname="arm10tdmi"
+	fi
 	# Define multilib configuration for arm-linux-androideabi.
 	case ${target} in
 	*-androideabi)


Re: [PATCH] Fix PR83452

2017-12-20 Thread Rainer Orth
Hi Richard,

> The following enables a hpux specific workaround for the WEAK UNDEF
> symbols used by LTO debuginfo extraction.  It makes sure to use
> a special name (gnu_lto_v1) for those.
>
> It also adjusts removed local symbols to not use UNDEFs (similar
> to solaris ld for globals hpux doesn't like undefs it cannot resolve even
> if they are unused).  Instead it uses DEFS with NULL name.
>
> I also fixed some readelf complaints about stale sh_info/link in
> removed sections.  (maybe that fixes the solaris complaints? ... just
> having some christmas wishes ;))

unfortunately, Santa doesn't grant all our wishes ;-)

> LTO Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> I've tested GNU ld and gold for a simple testcase exercising multiple
> WEAK UNDEFs with the same name and local defs, they seem happy
> (but they also retain all those local defs now in the partial link ... :/)
>
> I'll probably see to implement "real" section removal in stage3 to
> fix the Solaris complaints but I hope to not need reloc section
> rewriting (aka really pruning stuff from the symtab).

I've tested all of unmodified mainline, trunk + current pr81968 patch,
and trunk + this one last night on i386-pc-solaris2.11 and
sparc-sun-solaris2.11.  There's no difference between results with and
without this patch, while the pr81968 (which I've been keeping in my
tree for quite some time) does help quite a bit, but several errors
remain.

Rainer
 
-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [047/nnn] poly_int: argument sizes

2017-12-20 Thread Richard Sandiford
Jeff Law  writes:
> On 10/23/2017 11:20 AM, Richard Sandiford wrote:
>> This patch changes various bits of state related to argument sizes so
>> that they have type poly_int64 rather than HOST_WIDE_INT.  This includes:
>> 
>> - incoming_args::pops_args and incoming_args::size
>> - rtl_data::outgoing_args_size
>> - pending_stack_adjust
>> - stack_pointer_delta
>> - stack_usage::pushed_stack_size
>> - args_size::constant
>> 
>> It also changes TARGET_RETURN_POPS_ARGS so that the size of the
>> arguments passed in and the size returned by the hook are both
>> poly_int64s.
>> 
>> 
>> 2017-10-23  Richard Sandiford  
>>  Alan Hayward  
>>  David Sherwood  
>> 
>> gcc/
>>  * target.def (return_pops_args): Treat both the input and output
>>  sizes as poly_int64s rather than HOST_WIDE_INTS.
>>  * targhooks.h (default_return_pops_args): Update accordingly.
>>  * targhooks.c (default_return_pops_args): Likewise.
>>  * doc/tm.texi: Regenerate.
>>  * emit-rtl.h (incoming_args): Change pops_args, size and
>>  outgoing_args_size from int to poly_int64_pod.
>>  * function.h (expr_status): Change x_pending_stack_adjust and
>>  x_stack_pointer_delta from int to poly_int64.
>>  (args_size::constant): Change from HOST_WIDE_INT to poly_int64.
>>  (ARGS_SIZE_RTX): Update accordingly.
>>  * calls.c (highest_outgoing_arg_in_use): Change from int to
>>  unsigned int.
>>  (stack_usage_watermark, stored_args_watermark): New variables.
>>  (stack_region_maybe_used_p, mark_stack_region_used): New functions.
>>  (emit_call_1): Change the stack_size and rounded_stack_size
>>  parameters from HOST_WIDE_INT to poly_int64.  Track n_popped
>>  as a poly_int64.
>>  (save_fixed_argument_area): Check stack_usage_watermark.
>>  (initialize_argument_information): Change old_pending_adj from
>>  a HOST_WIDE_INT * to a poly_int64_pod *.
>>  (compute_argument_block_size): Return the size as a poly_int64
>>  rather than an int.
>>  (finalize_must_preallocate): Track polynomial argument sizes.
>>  (compute_argument_addresses): Likewise.
>>  (internal_arg_pointer_based_exp): Track polynomial offsets.
>>  (mem_overlaps_already_clobbered_arg_p): Rename to...
>>  (mem_might_overlap_already_clobbered_arg_p): ...this and take the
>>  size as a poly_uint64 rather than an unsigned HOST_WIDE_INT.
>>  Check stored_args_used_watermark.
>>  (load_register_parameters): Update accordingly.
>>  (check_sibcall_argument_overlap_1): Likewise.
>>  (combine_pending_stack_adjustment_and_call): Take the unadjusted
>>  args size as a poly_int64 rather than an int.  Return a bool
>>  indicating whether the optimization was possible and return
>>  the new adjustment by reference.
>>  (check_sibcall_argument_overlap): Track polynomail argument sizes.
>>  Update stored_args_watermark.
>>  (can_implement_as_sibling_call_p): Handle polynomial argument sizes.
>>  (expand_call): Likewise.  Maintain stack_usage_watermark and
>>  stored_args_watermark.  Update calls to
>>  combine_pending_stack_adjustment_and_call.
>>  (emit_library_call_value_1): Handle polynomial argument sizes.
>>  Call stack_region_maybe_used_p and mark_stack_region_used.
>>  Maintain stack_usage_watermark.
>>  (store_one_arg): Likewise.  Update call to
>>  mem_overlaps_already_clobbered_arg_p.
>>  * config/arm/arm.c (arm_output_function_prologue): Add a cast to
>>  HOST_WIDE_INT.
>>  * config/avr/avr.c (avr_outgoing_args_size): Likewise.
>>  * config/microblaze/microblaze.c (microblaze_function_prologue):
>>  Likewise.
>>  * config/cr16/cr16.c (cr16_return_pops_args): Update for new
>>  TARGET_RETURN_POPS_ARGS interface.
>>  (cr16_compute_frame, cr16_initial_elimination_offset): Add casts
>>  to HOST_WIDE_INT.
>>  * config/ft32/ft32.c (ft32_compute_frame): Likewise.
>>  * config/i386/i386.c (ix86_return_pops_args): Update for new
>>  TARGET_RETURN_POPS_ARGS interface.
>>  (ix86_expand_split_stack_prologue): Add a cast to HOST_WIDE_INT.
>>  * config/moxie/moxie.c (moxie_compute_frame): Likewise.
>>  * config/m68k/m68k.c (m68k_return_pops_args): Update for new
>>  TARGET_RETURN_POPS_ARGS interface.
>>  * config/vax/vax.c (vax_return_pops_args): Likewise.
>>  * config/pa/pa.h (STACK_POINTER_OFFSET): Add a cast to poly_int64.
>>  (EXIT_IGNORE_STACK): Update reference to crtl->outgoing_args_size.
>>  * config/arm/arm.h (CALLER_INTERWORKING_SLOT_SIZE): Likewise.
>>  * config/powerpcspe/aix.h (STACK_DYNAMIC_OFFSET): Likewise.
>>  * config/powerpcspe/darwin.h (STACK_DYNAMIC_OFFSET): Likewise.
>>  * config/powerpcspe/powerpcspe.h (STACK_DYNAMIC_OFFSET): Likewise.
>>  * config/rs6000/aix.h (STACK_DYNAMIC_OFFSET): Likewise.
>>  * config/rs6000/darwin.h (STACK_DYNAMIC_OFFSET): Likewise.
>>  * config/rs6000/rs6000.h (STAC

Re: [039/nnn] poly_int: pass_store_merging::execute

2017-12-20 Thread Richard Sandiford
Jeff Law  writes:
> On 10/23/2017 11:17 AM, Richard Sandiford wrote:
>> This patch makes pass_store_merging::execute track polynomial sizes
>> and offsets.
>> 
>> 
>> 2017-10-23  Richard Sandiford  
>>  Alan Hayward  
>>  David Sherwood  
>> 
>> gcc/
>>  * gimple-ssa-store-merging.c (pass_store_merging::execute): Track
>>  polynomial sizes and offsets.
> OK.  THough I wouldn't be surprised if this needs revamping after
> Jakub's work in this space.

Yeah, it needed quite a big revamp in the end.  Same idea, just in
different places.  And...

> It wasn't clear why you moved some of the code which computes invalid vs
> where we test invalid, but I don't see any problem with that movement of
> code.

...this part fortunately went away with the new code structure.

Here's what I applied after retesting.  It now fits in the series after
[036/nnn].

Thanks,
Richard


2017-12-20  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* poly-int-types.h (round_down_to_byte_boundary): New macro.
(round_up_to_byte_boundary): Likewise.
* expr.h (get_bit_range): Add temporary shim.
* gimple-ssa-store-merging.c (store_operand_info): Change the
bitsize, bitpos, bitregion_start and bitregion_end fields from
unsigned HOST_WIDE_INT to poly_uint64.
(merged_store_group): Likewise load_align_base.
(compatible_load_p, compatible_load_p): Update accordingly.
(imm_store_chain_info::coalesce_immediate_stores): Likewise.
(split_group, imm_store_chain_info::output_merged_store): Likewise.
(mem_valid_for_store_merging): Return the bitsize, bitpos,
bitregion_start and bitregion_end as poly_uint64s rather than
unsigned HOST_WIDE_INTs.  Track polynomial offsets internally.
(handled_load): Take the bitsize, bitpos,
bitregion_start and bitregion_end as poly_uint64s rather than
unsigned HOST_WIDE_INTs.
(pass_store_merging::process_store): Update call to
mem_valid_for_store_merging.

Index: gcc/poly-int-types.h
===
--- gcc/poly-int-types.h2017-12-20 09:36:17.817094740 +
+++ gcc/poly-int-types.h2017-12-20 09:37:07.445382807 +
@@ -60,6 +60,18 @@ #define bits_to_bytes_round_up(X) force_
of bytes in size.  */
 #define num_trailing_bits(X) force_get_misalignment (X, BITS_PER_UNIT)
 
+/* Round bit quantity X down to the nearest byte boundary.
+
+   This is safe because non-constant mode sizes must be a whole number
+   of bytes in size.  */
+#define round_down_to_byte_boundary(X) force_align_down (X, BITS_PER_UNIT)
+
+/* Round bit quantity X up the nearest byte boundary.
+
+   This is safe because non-constant mode sizes must be a whole number
+   of bytes in size.  */
+#define round_up_to_byte_boundary(X) force_align_up (X, BITS_PER_UNIT)
+
 /* Return the size of an element in a vector of size SIZE, given that
the vector has NELTS elements.  The return value is in the same units
as SIZE (either bits or bytes).
Index: gcc/expr.h
===
--- gcc/expr.h  2017-12-20 09:36:17.817094740 +
+++ gcc/expr.h  2017-12-20 09:37:07.444382841 +
@@ -243,6 +243,15 @@ extern bool emit_push_insn (rtx, machine
 extern void get_bit_range (unsigned HOST_WIDE_INT *, unsigned HOST_WIDE_INT *,
   tree, HOST_WIDE_INT *, tree *);
 
+/* Temporary.  */
+inline void
+get_bit_range (poly_uint64_pod *bitstart, poly_uint64_pod *bitend, tree exp,
+  poly_int64_pod *bitpos, tree *offset)
+{
+  get_bit_range (&bitstart->coeffs[0], &bitend->coeffs[0], exp,
+&bitpos->coeffs[0], offset);
+}
+
 /* Expand an assignment that stores the value of FROM into TO.  */
 extern void expand_assignment (tree, tree, bool);
 
Index: gcc/gimple-ssa-store-merging.c
===
--- gcc/gimple-ssa-store-merging.c  2017-12-20 09:36:17.817094740 +
+++ gcc/gimple-ssa-store-merging.c  2017-12-20 09:37:07.445382807 +
@@ -1321,10 +1321,10 @@ struct store_operand_info
 {
   tree val;
   tree base_addr;
-  unsigned HOST_WIDE_INT bitsize;
-  unsigned HOST_WIDE_INT bitpos;
-  unsigned HOST_WIDE_INT bitregion_start;
-  unsigned HOST_WIDE_INT bitregion_end;
+  poly_uint64 bitsize;
+  poly_uint64 bitpos;
+  poly_uint64 bitregion_start;
+  poly_uint64 bitregion_end;
   gimple *stmt;
   bool bit_not_p;
   store_operand_info ();
@@ -1414,7 +1414,7 @@ struct merged_store_group
   /* The size of the allocated memory for val and mask.  */
   unsigned HOST_WIDE_INT buf_size;
   unsigned HOST_WIDE_INT align_base;
-  unsigned HOST_WIDE_INT load_align_base[2];
+  poly_uint64 load_align_base[2];
 
   unsigned int align;
   unsigned int load_align[2];
@@ -2198,8 +2198,8 @@ compatible_load_p (merged_store_group *m
 {
   store_immediate_info *infof =

[committed] Fix multiple_p for two non-poly_ints

2017-12-20 Thread Richard Sandiford
Fix a stupid inversion.  This function is very rarely used and was
mostly to help split patches up, which is why it didn't get picked
up during initial testing.  Sorry for the mistake.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
Applied as obvious.

Richard


2017-12-20  Richard Sandiford  

gcc/
* poly-int.h (multiple_p): Fix handling of two non-poly_ints.

gcc/testsuite/
* gcc.dg/plugin/poly-int-tests.h (test_nonpoly_multiple_p): New
function.
(test_nonpoly_type): Call it.

Index: gcc/poly-int.h
===
--- gcc/poly-int.h  2017-12-20 08:55:30.768462429 +
+++ gcc/poly-int.h  2017-12-20 08:55:55.911492315 +
@@ -2027,7 +2027,7 @@ constant_multiple_p (const poly_int_pod<
 inline typename if_nonpoly2::type
 multiple_p (Ca a, Cb b)
 {
-  return a % b != 0;
+  return a % b == 0;
 }
 
 /* Return true if A is a (polynomial) multiple of B.  */
Index: gcc/testsuite/gcc.dg/plugin/poly-int-tests.h
===
--- gcc/testsuite/gcc.dg/plugin/poly-int-tests.h2017-12-14 
00:05:54.955021396 +
+++ gcc/testsuite/gcc.dg/plugin/poly-int-tests.h2017-12-20 
08:55:55.912492276 +
@@ -4505,6 +4505,19 @@ test_uhwi ()
   wi::uhwi (210, 16)));
 }
 
+/* Test multiple_p for non-polynomial T.  */
+
+template
+static void
+test_nonpoly_multiple_p ()
+{
+  ASSERT_TRUE (multiple_p (T (6), T (2)));
+  ASSERT_TRUE (multiple_p (T (6), T (3)));
+  ASSERT_FALSE (multiple_p (T (6), T (4)));
+  ASSERT_FALSE (multiple_p (T (7), T (4)));
+  ASSERT_TRUE (multiple_p (T (8), T (4)));
+}
+
 /* Test known_size_p for non-polynomial T.  */
 
 template
@@ -4523,6 +4536,7 @@ test_nonpoly_known_size_p ()
 static void
 test_nonpoly_type ()
 {
+  test_nonpoly_multiple_p ();
   test_nonpoly_known_size_p ();
 }
 


[middle-end, obvious] Remove dead error_mark_node check

2017-12-20 Thread Paolo Carlini

Hi,

today noticed this nit too: when ret_expr == error_mark_node 
gimplify_return_expr immediatey returns GS_ERROR.


Thanks, Paolo.

//


2017-12-20  Paolo Carlini  

* gimplify.c (gimplify_return_expr): Remove dead error_mark_node check.
Index: gimplify.c
===
--- gimplify.c  (revision 255855)
+++ gimplify.c  (working copy)
@@ -1499,10 +1499,9 @@ gimplify_return_expr (tree stmt, gimple_
   if (ret_expr == error_mark_node)
 return GS_ERROR;
 
   if (!ret_expr
-  || TREE_CODE (ret_expr) == RESULT_DECL
-  || ret_expr == error_mark_node)
+  || TREE_CODE (ret_expr) == RESULT_DECL)
 {
   maybe_add_early_return_predict_stmt (pre_p);
   greturn *ret = gimple_build_return (ret_expr);
   gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));


Re: [05/13] Remove vec_perm_const optab

2017-12-20 Thread Richard Sandiford
Richard Biener  writes:
> On Sun, Dec 10, 2017 at 12:16 AM, Richard Sandiford
>  wrote:
>> One of the changes needed for variable-length VEC_PERM_EXPRs -- and for
>> long fixed-length VEC_PERM_EXPRs -- is the ability to use constant
>> selectors that wouldn't fit in the vectors being permuted.  E.g. a
>> permute on two V256QIs can't be done using a V256QI selector.
>>
>> At the moment constant permutes use two interfaces:
>> targetm.vectorizer.vec_perm_const_ok for testing whether a permute is
>> valid and the vec_perm_const optab for actually emitting the permute.
>> The former gets passed a vec<> selector and the latter an rtx selector.
>> Most ports share a lot of code between the hook and the optab, with a
>> wrapper function for each interface.
>>
>> We could try to keep that interface and require ports to define wider
>> vector modes that could be attached to the CONST_VECTOR (e.g. V256HI or
>> V256SI in the example above).  But building a CONST_VECTOR rtx seems a bit
>> pointless here, since the expand code only creates the CONST_VECTOR in
>> order to call the optab, and the first thing the target does is take
>> the CONST_VECTOR apart again.
>>
>> The easiest approach therefore seemed to be to remove the optab and
>> reuse the target hook to emit the code.  One potential drawback is that
>> it's no longer possible to use match_operand predicates to force
>> operands into the required form, but in practice all targets want
>> register operands anyway.
>>
>> The patch also changes vec_perm_indices into a class that provides
>> some simple routines for handling permutations.  A later patch will
>> flesh this out and get rid of auto_vec_perm_indices, but I didn't
>> want to do all that in this patch and make it more complicated than
>> it already is.
>>
>>
>> 2017-12-09  Richard Sandiford  
>>
>> gcc/
>> * Makefile.in (OBJS): Add vec-perm-indices.o.
>> * vec-perm-indices.h: New file.
>> * vec-perm-indices.c: Likewise.
>> * target.h (vec_perm_indices): Replace with a forward class
>> declaration.
>> (auto_vec_perm_indices): Move to vec-perm-indices.h.
>> * optabs.h: Include vec-perm-indices.h.
>> (expand_vec_perm): Delete.
>> (selector_fits_mode_p, expand_vec_perm_var): Declare.
>> (expand_vec_perm_const): Declare.
>> * target.def (vec_perm_const_ok): Replace with...
>> (vec_perm_const): ...this new hook.
>> * doc/tm.texi.in (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Replace 
>> with...
>> (TARGET_VECTORIZE_VEC_PERM_CONST): ...this new hook.
>> * doc/tm.texi: Regenerate.
>> * optabs.def (vec_perm_const): Delete.
>> * doc/md.texi (vec_perm_const): Likewise.
>> (vec_perm): Refer to TARGET_VECTORIZE_VEC_PERM_CONST.
>> * expr.c (expand_expr_real_2): Use expand_vec_perm_const rather than
>> expand_vec_perm for constant permutation vectors.  Assert that
>> the mode of variable permutation vectors is the integer equivalent
>> of the mode that is being permuted.
>> * optabs-query.h (selector_fits_mode_p): Declare.
>> * optabs-query.c: Include vec-perm-indices.h.
>> (can_vec_perm_const_p): Check whether 
>> targetm.vectorize.vec_perm_const
>> is defined, instead of checking whether the vec_perm_const_optab
>> exists.  Use targetm.vectorize.vec_perm_const instead of
>> targetm.vectorize.vec_perm_const_ok.  Check whether the indices
>> fit in the vector mode before using a variable permute.
>> * optabs.c (shift_amt_for_vec_perm_mask): Take a mode and a
>> vec_perm_indices instead of an rtx.
>> (expand_vec_perm): Replace with...
>> (expand_vec_perm_const): ...this new function.  Take the selector
>> as a vec_perm_indices rather than an rtx.  Also take the mode of
>> the selector.  Update call to shift_amt_for_vec_perm_mask.
>> Use targetm.vectorize.vec_perm_const instead of vec_perm_const_optab.
>> Use vec_perm_indices::new_expanded_vector to expand the original
>> selector into bytes.  Check whether the indices fit in the vector
>> mode before using a variable permute.
>> (expand_vec_perm_var): Make global.
>> (expand_mult_highpart): Use expand_vec_perm_const.
>> * fold-const.c: Includes vec-perm-indices.h.
>> * tree-ssa-forwprop.c: Likewise.
>> * tree-vect-data-refs.c: Likewise.
>> * tree-vect-generic.c: Likewise.
>> * tree-vect-loop.c: Likewise.
>> * tree-vect-slp.c: Likewise.
>> * tree-vect-stmts.c: Likewise.
>> * config/aarch64/aarch64-protos.h (aarch64_expand_vec_perm_const):
>> Delete.
>> * config/aarch64/aarch64-simd.md (vec_perm_const): Delete.
>> * config/aarch64/aarch64.c (aarch64_expand_vec_perm_const)
>> (aarch64_vectorize_vec_perm_const_ok): Fuse into...
>> (aarch64_vectorize_vec_pe

Re: [07/13] Make vec_perm_indices use new vector encoding

2017-12-20 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Dec 12, 2017 at 4:46 PM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Sun, Dec 10, 2017 at 12:20 AM, Richard Sandiford
>>>  wrote:
 This patch changes vec_perm_indices from a plain vec<> to a class
 that stores a canonicalised permutation, using the same encoding
 as for VECTOR_CSTs.  This means that vec_perm_indices now carries
 information about the number of vectors being permuted (currently
 always 1 or 2) and the number of elements in each input vector.
>>>
>>> Before I dive into  the C++ details can you explain why it needs this
>>> info and how it encodes it for variable-length vectors?  To interleave
>>> two vectors you need sth like { 0, N, 1, N+1, ... }, I'm not sure we
>>> can directly encode N here, can we?  extract even/odd should just
>>> work as { 0, 2, 4, 6, ...} without knowledge of whether we permute
>>> one or two vectors (the one vector case just has two times the same
>>> vector) or how many elements each of the vectors (or the result) has.
>>
>> One of the later patches switches the element types to HOST_WIDE_INT,
>> so that we can represent all ssizetypes.  Then there's a poly_int
>> patch (not yet posted) to make that poly_int64, so that we can
>> represent the N even for variable-length vectors.
>>
>> The class needs to know the number of elements because that affects
>> the canonical representation.  E.g. extract even on fixed-length
>> vectors with both inputs the same should be { 0, 2, 4, ..., 0, 2, 4 ... },
>> which we can't encode as a simple series.  Interleave low with both
>> inputs the same should be { 0, 0, 1, 1, ... } for both fixed-length and
>> variable-length vectors.
>
> Huh?  extract even is { 0, 2, 4, 6, 8 ... } indexes in the selection vector
> are referencing concat'ed input vectors.  So yes, for two same vectors
> that's effectively { 0, 2, 4, ..., 0, 2, 4, ... } but I don't see why
> that should
> be the canonical form?

Current practice is to use the single-input form where possible,
if both inputs are the same (see e.g. the VEC_PERM_EXPR handling
in fold-const.c).  It means that things like:

_1 = VEC_PERM_EXPR ;
_2 = VEC_PERM_EXPR ;
_3 = VEC_PERM_EXPR ;

get folded to the same sequence, and so can be CSEd.

We could instead convert the single-input form to use the two-input
selector, but that would be harder.  The advantage of treating the
single-input form as canonical is that it works even for irregular
permutes.

Thanks,
Richard

>> Also, operator[] is supposed to return an in-range selector even if
>> the selector element is only implicitly encoded.  So we need to know
>> the number of input elements there.
>>
>> Separating the number of input elements into the number of inputs
>> and the number of elements per input isn't really necessary, but made
>> it easier to provide routines for testing whether all selected
>> elements come from a particular input, and for rotating the selector
>> by a whole number of inputs.
>>
>> Thanks,
>> Richard


[PATCH] Fix PR83491

2017-12-20 Thread Wilco Dijkstra
This patch fixes PR83491 - if an SSA expression contains 2 identical float
constants, the division reciprocal optimization will ICE.  Fix this by 
explicitly
checking for SSA_NAME in the tree code before walking the uses.  Also fix 
several
coding style issues pointed out by Jakub and make comments more readable.

Bootstrap OK, OK for trunk?

ChangeLog:
2017-12-20  Wilco Dijkstra  

gcc/
PR tree-optimization/83491
* tree-ssa-math-opts.c (execute_cse_reciprocals_1): Check for SSA_NAME
before walking uses.  Improve coding style and comments.

gcc/testsuite/
PR tree-optimization/83491
* gcc.dg/pr83491.c: Add new test.
--

diff --git a/gcc/testsuite/gcc.dg/pr83491.c b/gcc/testsuite/gcc.dg/pr83491.c
new file mode 100644
index 
..f23cc19c72f57ca8d34f05f28fee75fc2c13f33a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr83491.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -funsafe-math-optimizations" } */
+
+float a;
+float b;
+void bar ()
+{
+  a = __builtin_nanf ("");
+  b = __builtin_powf (a, 2.5F);
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 
8db12f5e1cd5d227bd6c3780420e93cd3fe7b435..4e8f5e728d0c8ef5ac564d8c3c999d8a6ff15e7e
 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -544,29 +544,32 @@ execute_cse_reciprocals_1 (gimple_stmt_iterator *def_gsi, 
tree def)
   int square_recip_count = 0;
   int sqrt_recip_count = 0;
 
-  gcc_assert (FLOAT_TYPE_P (TREE_TYPE (def)) && is_gimple_reg (def));
+  gcc_assert (FLOAT_TYPE_P (TREE_TYPE (def)) && is_gimple_reg (def)
+ && TREE_CODE (def) == SSA_NAME);
   threshold = targetm.min_divisions_for_recip_mul (TYPE_MODE (TREE_TYPE 
(def)));
 
-  /* If this is a square (x * x), we should check whether there are any
- enough divisions by x on it's own to warrant waiting for that pass.  */
-  if (TREE_CODE (def) == SSA_NAME)
+  /* If DEF is a square (x * x), count the number of divisions by x.
+ If there are more divisions by x than by (DEF * DEF), prefer to optimize
+ the reciprocal of x instead of DEF.  This improves cases like:
+   def = x * x
+   t0 = a / def
+   t1 = b / def
+   t2 = c / x
+ Reciprocal optimization of x results in 1 division rather than 2 or 3.  */
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+
+  if (is_gimple_assign (def_stmt)
+  && gimple_assign_rhs_code (def_stmt) == MULT_EXPR
+  && TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME
+  && gimple_assign_rhs1 (def_stmt) == gimple_assign_rhs2 (def_stmt))
 {
-  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  tree op0 = gimple_assign_rhs1 (def_stmt);
 
-  if (is_gimple_assign (def_stmt)
- && gimple_assign_rhs_code (def_stmt) == MULT_EXPR
- && gimple_assign_rhs1 (def_stmt) == gimple_assign_rhs2 (def_stmt))
+  FOR_EACH_IMM_USE_FAST (use_p, use_iter, op0)
{
- /* This statement is a square of something.  We should take this
-in to account, as it may be more profitable to not extract
-the reciprocal here.  */
- tree op0 = gimple_assign_rhs1 (def_stmt);
- FOR_EACH_IMM_USE_FAST (use_p, use_iter, op0)
-   {
- gimple *use_stmt = USE_STMT (use_p);
- if (is_division_by (use_stmt, op0))
-   sqrt_recip_count ++;
-   }
+ gimple *use_stmt = USE_STMT (use_p);
+ if (is_division_by (use_stmt, op0))
+   sqrt_recip_count++;
}
 }
 
@@ -587,26 +590,23 @@ execute_cse_reciprocals_1 (gimple_stmt_iterator *def_gsi, 
tree def)
  gimple *square_use_stmt = USE_STMT (square_use_p);
  if (is_division_by (square_use_stmt, square_def))
{
- /* Halve the relative importance as this is called twice
-for each division by a square.  */
+ /* This is executed twice for each division by a square.  */
  register_division_in (gimple_bb (square_use_stmt), 1);
- square_recip_count ++;
+ square_recip_count++;
}
}
}
 }
 
-  /* Square reciprocals will have been counted twice.  */
+  /* Square reciprocals were counted twice above.  */
   square_recip_count /= 2;
 
+  /* If it is more profitable to optimize 1 / x, don't optimize 1 / (x * x).  
*/
   if (sqrt_recip_count > square_recip_count)
-/* It will be more profitable to extract a 1 / x expression later,
-   so it is not worth attempting to extract 1 / (x * x) now.  */
 return;
 
   /* Do the expensive part only if we can hope to optimize something.  */
-  if (count + square_recip_count >= threshold
-  && count >= 1)
+  if (count + square_recip_count >= threshold && count >= 1)
 {
   gimple *use_stmt;
   for (occ = occ_head; occ; occ = occ->next)
@@ -623,8 +623,7 @@ execute_cse_reciprocals_1 (gimple_stmt_iterator *

Re: [PATCH] Fix PR83491

2017-12-20 Thread Jakub Jelinek
On Wed, Dec 20, 2017 at 01:56:27PM +, Wilco Dijkstra wrote:
> This patch fixes PR83491 - if an SSA expression contains 2 identical float
> constants, the division reciprocal optimization will ICE.  Fix this by 
> explicitly
> checking for SSA_NAME in the tree code before walking the uses.  Also fix 
> several
> coding style issues pointed out by Jakub and make comments more readable.
> 
> Bootstrap OK, OK for trunk?
> 
> ChangeLog:
> 2017-12-20  Wilco Dijkstra  
> 
> gcc/
>   PR tree-optimization/83491
>   * tree-ssa-math-opts.c (execute_cse_reciprocals_1): Check for SSA_NAME
>   before walking uses.  Improve coding style and comments.
> 
> gcc/testsuite/
>   PR tree-optimization/83491
>   * gcc.dg/pr83491.c: Add new test.

Ok, thanks.

Jakub


Re: [C++] Add support for #pragma GCC unroll v4

2017-12-20 Thread Eric Botcazou
> This needs some C++ tests, particularly with templates and range-for.  I
> suspect that using the pragma in a template will ICE.

No, that wasn't the case, but the combination template/range-for was indeed 
not working; fixed by adding a 5th parameter to the RANGE_FOR_STMT node.

Tested on x86_64-suse-linux, OK for the mainline?


2017-12-20  Mike Stump  
Eric Botcazou  

cp/
* constexpr.c (cxx_eval_constant_expression) : Remove
assertion on 2nd operand.
(potential_constant_expression_1): Likewise.
* cp-tree.def (RANGE_FOR_STMT): Take a 5th operand.
* cp-tree.h (RANGE_FOR_UNROLL): New macro.
(cp_convert_range_for): Adjust prototype.
(finish_while_stmt_cond): Likewise.
(finish_do_stmt): Likewise.
(finish_for_cond): Likewise.
* init.c (build_vec_init): Adjut call to finish_for_cond.
* parser.c (cp_parser_statement): Adjust call to
cp_parser_iteration_statement.
(cp_parser_for): Add unroll parameter and pass it in calls to
cp_parser_range_for and cp_parser_c_for.
(cp_parser_c_for): Add unroll parameter and pass it in call to
finish_for_cond.
(cp_parser_range_for): Add unroll parameter, set in on RANGE_FOR_STMT
and pass it in call to cp_convert_range_for.
(cp_convert_range_for): Add unroll parameter and pass it in call to
finish_for_cond.
(cp_parser_iteration_statement): Add unroll parameter and pass it in
calls to finish_while_stmt_cond, finish_do_stmt and cp_parser_for.
(cp_parser_pragma_ivdep): New static function.
(cp_parser_pragma_unroll): Likewise.
(cp_parser_pragma) : Add support for pragma Unroll.
: New case.
* pt.c (tsubst_expr) : Adjust call to finish_for_cond.
: Pass unrolling factor to cp_convert_range_for.
: Adjust call to finish_while_stmt_cond.
: Adjust call to finish_do_stmt.
* semantics.c (finish_while_stmt_cond): Add unroll parameter and
build ANNOTATE_EXPR if present.
(finish_do_stmt): Likewise.
(finish_for_cond): Likewise.
(begin_range_for_stmt): Build RANGE_FOR_STMT with 5th operand.

testsuite/
* g++.dg/ext/unroll-1.C: New test.
* g++.dg/ext/unroll-2.C: Likewise.
* g++.dg/ext/unroll-3.C: Likewise.

-- 
Eric BotcazouIndex: cp/constexpr.c
===
--- cp/constexpr.c	(revision 255850)
+++ cp/constexpr.c	(working copy)
@@ -4689,7 +4689,6 @@ cxx_eval_constant_expression (const cons
   return t;
 
 case ANNOTATE_EXPR:
-  gcc_assert (tree_to_uhwi (TREE_OPERAND (t, 1)) == annot_expr_ivdep_kind);
   r = cxx_eval_constant_expression (ctx, TREE_OPERAND (t, 0),
 	lval,
 	non_constant_p, overflow_p,
@@ -5940,7 +5939,6 @@ potential_constant_expression_1 (tree t,
   }
 
 case ANNOTATE_EXPR:
-  gcc_assert (tree_to_uhwi (TREE_OPERAND (t, 1)) == annot_expr_ivdep_kind);
   return RECUR (TREE_OPERAND (t, 0), rval);
 
 default:
Index: cp/cp-tree.def
===
--- cp/cp-tree.def	(revision 255850)
+++ cp/cp-tree.def	(working copy)
@@ -302,8 +302,8 @@ DEFTREECODE (FOR_STMT, "for_stmt", tcc_s
 
 /* Used to represent a range-based `for' statement. The operands are
RANGE_FOR_DECL, RANGE_FOR_EXPR, RANGE_FOR_BODY, and RANGE_FOR_SCOPE,
-   respectively.  Only used in templates.  */
-DEFTREECODE (RANGE_FOR_STMT, "range_for_stmt", tcc_statement, 4)
+   RANGE_FOR_UNROLL respectively.  Only used in templates.  */
+DEFTREECODE (RANGE_FOR_STMT, "range_for_stmt", tcc_statement, 5)
 
 /* Used to represent a 'while' statement. The operands are WHILE_COND
and WHILE_BODY, respectively.  */
Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 255850)
+++ cp/cp-tree.h	(working copy)
@@ -4844,6 +4844,7 @@ more_aggr_init_expr_args_p (const aggr_i
 #define RANGE_FOR_EXPR(NODE)	TREE_OPERAND (RANGE_FOR_STMT_CHECK (NODE), 1)
 #define RANGE_FOR_BODY(NODE)	TREE_OPERAND (RANGE_FOR_STMT_CHECK (NODE), 2)
 #define RANGE_FOR_SCOPE(NODE)	TREE_OPERAND (RANGE_FOR_STMT_CHECK (NODE), 3)
+#define RANGE_FOR_UNROLL(NODE)	TREE_OPERAND (RANGE_FOR_STMT_CHECK (NODE), 4)
 #define RANGE_FOR_IVDEP(NODE)	TREE_LANG_FLAG_6 (RANGE_FOR_STMT_CHECK (NODE))
 
 #define SWITCH_STMT_COND(NODE)	TREE_OPERAND (SWITCH_STMT_CHECK (NODE), 0)
@@ -6433,7 +6434,8 @@ extern tree implicitly_declare_fn
 extern bool maybe_clone_body			(tree);
 
 /* In parser.c */
-extern tree cp_convert_range_for (tree, tree, tree, tree, unsigned int, bool);
+extern tree cp_convert_range_for (tree, tree, tree, tree, unsigned int, bool,
+  unsigned short);
 extern bool parsing_nsdmi (void);
 extern bool parsing_default_capturing_generic_lambda_in_template (void);
 extern void inject_this_parameter (tree, cp_cv_quals);
@@ -6718,16 +6720,16 @@ extern void beg

Re: [RFC] Add means to split dump file into several files -- Use in lra

2017-12-20 Thread Tom de Vries

On 12/14/2017 06:00 PM, Vladimir Makarov wrote:
What is inconvenient is that sometimes I need to see RTL after each 
subpass. Currently I do it by calling debug_rtx_range in gdb after the 
subpass in question.




I've added dumping of the RTL after each subpass.

So this approach would be useful at least for me in some cases. But I 
think it should be not a default approach


I've added an option flra-split-dump, off by default.

The dump files look like:
...
$ ls -1 reduc-1.c.278r.reload*
reduc-1.c.278r.reload
reduc-1.c.278r.reload.001.remove_scratches
reduc-1.c.278r.reload.002.lra_constraints
reduc-1.c.278r.reload.003.lra_inheritance
reduc-1.c.278r.reload.004.lra_create_live_ranges
reduc-1.c.278r.reload.005.lra_assign
reduc-1.c.278r.reload.006.lra_undo_inheritance
reduc-1.c.278r.reload.007.lra_constraints
reduc-1.c.278r.reload.008.lra_finishing
reduc-1.c.278r.reload.009.remove_scratches
reduc-1.c.278r.reload.010.lra_constraints
reduc-1.c.278r.reload.011.lra_inheritance
reduc-1.c.278r.reload.012.lra_create_live_ranges
reduc-1.c.278r.reload.013.lra_assign
reduc-1.c.278r.reload.014.lra_undo_inheritance
reduc-1.c.278r.reload.015.lra_constraints
reduc-1.c.278r.reload.016.lra_finishing
reduc-1.c.278r.reload.017.remove_scratches
reduc-1.c.278r.reload.018.lra_constraints
reduc-1.c.278r.reload.019.lra_inheritance
reduc-1.c.278r.reload.020.lra_create_live_ranges
reduc-1.c.278r.reload.021.lra_assign
reduc-1.c.278r.reload.022.lra_undo_inheritance
reduc-1.c.278r.reload.023.lra_constraints
reduc-1.c.278r.reload.024.lra_finishing
...

The main dump file "reduc-1.c.278r.reload" contains all the dumping that 
is done by the pass manager, not by the lra pass itself.


Bootstrapped and reg-tested on x86_64.

Any further comments?

Thanks,
- Tom
Add flra-split-dump

2017-12-20  Tom de Vries  

	* dumpfile.c (DUMP_FILE_INFO): Add inits for bump and bump_name fields.
	(get_dump_file_name): Handle bump and bump_name fields.
	(dump_start): Handle bump field.
	(gcc::dump_manager::dump_bump, dump_bump)
	(gcc::dump_manager::reset_dump_bump, reset_dump_bump): New function.
	* dumpfile.h (struct dump_file_info): Add bump, save_bump and bump_name
	fields.
	(dump_bump, reset_dump_bump): Declare.
	* lra-assigns.c (lra_assign): Use lra_dump_bump.
	* lra-coalesce.c (lra_coalesce): Same.
	* lra-constraints.c (lra_constraints, lra_inheritance)
	(lra_undo_inheritance): Same.
	* lra-int.h (lra_dump_bump): Declare.
	* lra-lives.c (lra_create_live_ranges): Use lra_dump_bump.
	* lra-remat.c (lra_remat): Same.
	* lra-spills.c (lra_spill): Same.
	* lra.c (lra_dump_bump, lra_dump_bump_start, lra_reset_dump_bump): New
	function.
	(remove_scratches): Use lra_dump_bump.
	(lra): Use lra_dump_bump_start, lra_dump_bump and lra_reset_dump_bump.
	* common.opt (flra-split-dump): New switch.
	* ira.c (do_reload): Handle flag_lra_split_dump.

---
 gcc/common.opt|  3 ++
 gcc/dumpfile.c| 85 ++-
 gcc/dumpfile.h| 13 +++-
 gcc/ira.c |  2 +-
 gcc/lra-assigns.c |  1 +
 gcc/lra-coalesce.c|  1 +
 gcc/lra-constraints.c |  3 ++
 gcc/lra-int.h |  1 +
 gcc/lra-lives.c   |  1 +
 gcc/lra-remat.c   |  1 +
 gcc/lra-spills.c  |  1 +
 gcc/lra.c | 47 
 12 files changed, 149 insertions(+), 10 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index d80ae5b..9da73e4 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1798,6 +1798,9 @@ flra-remat
 Common Report Var(flag_lra_remat) Optimization
 Do CFG-sensitive rematerialization in LRA.
 
+flra-split-dump
+Common Report Var(flag_lra_split_dump)
+
 flto
 Common
 Enable link-time optimization.
diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 4a59d60..d23f84f 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -51,7 +51,7 @@ dump_flags_t dump_flags;
 
 #define DUMP_FILE_INFO(suffix, swtch, dkind, num) \
   {suffix, swtch, NULL, NULL, NULL, NULL, NULL, dkind, 0, 0, 0, 0, 0, num, \
-   false, false}
+   0, 0, NULL, false, false}
 
 /* Table of tree dump switches. This must be consistent with the
TREE_DUMP_INDEX enumeration in dumpfile.h.  */
@@ -287,7 +287,14 @@ char *
 gcc::dump_manager::
 get_dump_file_name (struct dump_file_info *dfi) const
 {
-  char dump_id[10];
+  char dump_id[1   /* '.' */
+	   + 7 /* '%03d' */
+	   + 1 /* '[tir]' */
+	   + 1 /* '\0' */];
+
+  char bump_id[1   /* '.' */
+	   + 7 /* '%03d' */
+	   + 1 /* '\0' */];
 
   gcc_assert (dfi);
 
@@ -305,11 +312,24 @@ get_dump_file_name (struct dump_file_info *dfi) const
   /* (null), LANG, TREE, RTL, IPA.  */
   char suffix = " ltri"[dfi->dkind];
   
+  if (dfi->bump > 0)
+	{
+	  if (snprintf (bump_id, sizeof (bump_id), ".%03d", dfi->bump) < 0)
+	bump_id[0] = '\0';
+	}
+
   if (snprintf (dump_id, sizeof (dump_id), ".%03d%c", dfi->num, suffix) < 0)
 	dump_id[0] = '\0';
 }
 
-  return concat (dump_base_name, dump_id, dfi->suffix, NULL);
+  if (dfi->bump_n

[PATCH] Fix gcov-dump tool for GCDA files (PR gcov-profile/83509).

2017-12-20 Thread Martin Liška
Hi.

It's quite simple bug, where we read more than allowed in case of GCDA files.
Tested on couple of source files.

Ready for trunk?
Thanks,
Martin

gcc/ChangeLog:

2017-12-20  Martin Liska  

PR gcov-profile/83509
* gcov-dump.c (dump_gcov_file): Do not read info about
support_unexecuted_blocks for gcda files.
---
 gcc/gcov-dump.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)


diff --git a/gcc/gcov-dump.c b/gcc/gcov-dump.c
index e5e649cb18f..7dc28bbf216 100644
--- a/gcc/gcov-dump.c
+++ b/gcc/gcov-dump.c
@@ -170,6 +170,7 @@ dump_gcov_file (const char *filename)
 {
   unsigned tags[4];
   unsigned depth = 0;
+  bool is_data_type;
 
   if (!gcov_open (filename, 1))
 {
@@ -181,14 +182,13 @@ dump_gcov_file (const char *filename)
   {
 unsigned magic = gcov_read_unsigned ();
 unsigned version;
-const char *type = NULL;
 int endianness = 0;
 char m[4], v[4];
 
 if ((endianness = gcov_magic (magic, GCOV_DATA_MAGIC)))
-  type = "data";
+  is_data_type = true;
 else if ((endianness = gcov_magic (magic, GCOV_NOTE_MAGIC)))
-  type = "note";
+  is_data_type = false;
 else
   {
 	printf ("%s:not a gcov file\n", filename);
@@ -199,7 +199,8 @@ dump_gcov_file (const char *filename)
 GCOV_UNSIGNED2STRING (v, version);
 GCOV_UNSIGNED2STRING (m, magic);
 
-printf ("%s:%s:magic `%.4s':version `%.4s'%s\n", filename, type,
+printf ("%s:%s:magic `%.4s':version `%.4s'%s\n", filename,
+	is_data_type ? "data" : "note",
  	m, v, endianness < 0 ? " (swapped endianness)" : "");
 if (version != GCOV_VERSION)
   {
@@ -217,10 +218,13 @@ dump_gcov_file (const char *filename)
 printf ("%s:stamp %lu\n", filename, (unsigned long)stamp);
   }
 
-  /* Support for unexecuted basic blocks.  */
-  unsigned support_unexecuted_blocks = gcov_read_unsigned ();
-  if (!support_unexecuted_blocks)
-printf ("%s: has_unexecuted_block is not supported\n", filename);
+  if (!is_data_type)
+{
+  /* Support for unexecuted basic blocks.  */
+  unsigned support_unexecuted_blocks = gcov_read_unsigned ();
+  if (!support_unexecuted_blocks)
+	printf ("%s: has_unexecuted_block is not supported\n", filename);
+}
 
   while (1)
 {



Re: [PATCH] Append PWD to path when using -fprofile-generate=/some/path.

2017-12-20 Thread Martin Liška
PING^1

On 10/27/2017 03:17 PM, Martin Liška wrote:
> Hello.
> 
> It's improvement that I consider still useful even though we're not going to 
> use
> it for profiled bootstrap.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready for trunk?
> Thanks,
> Martin



Re: [PATCH PR81740]Enforce dependence check for outer loop vectorization

2017-12-20 Thread Bin.Cheng
On Tue, Dec 19, 2017 at 12:56 PM, Richard Biener
 wrote:
> On Tue, Dec 19, 2017 at 12:58 PM, Bin.Cheng  wrote:
>> On Mon, Dec 18, 2017 at 2:35 PM, Michael Matz  wrote:
>>> Hi,
>>>
>>> On Mon, 18 Dec 2017, Richard Biener wrote:
>>>
 where *unroll is similar to *max_vf I think.  dist_v[0] is the innermost 
 loop.
>>>
>>> [0] is always outermost loop.
>>>
 The vectorizer does way more complicated things and only looks at the
 distance with respect to the outer loop as far as I can see which can be
 negative.

 Not sure if fusion and vectorizer "interleaving" makes a difference
 here. I think the idea was that when interleaving stmt-by-stmt then
 forward dependences would be preserved and thus we don't need to check
 the inner loop dependences.  speaking with "forward vs. backward"
 dependences again, not distances...

 This also means that unroll-and-jam could be enhanced to "interleave"
 stmts and thus cover more cases?

 Again, I hope Micha can have a look here...
>>>
>>> Haven't yet looked at the patch, but some comments anyway:
>>>
>>> fusion and interleaving interact in the following way in outer loop
>>> vectorization, conceptually:
>>> * (1) the outer loop is unrolled
>>> * (2) the inner loops are fused
>>> * (3) the (now single) inner body is rescheduled/shuffled/interleaved.
>>>
>> Thanks Michael for explaining issue clearer, this is what I meant.  As
>> for PR60276, I think it's actually the other side of the problem,
>> which only relates to dependence validity of interleaving.
>
> Interleaving validity is what is checked by the current code, I don't
> see any checking for validity of (2).  Now, the current code only
> looks at the outer loop distances to verify interleaving validity.
>
> I think we need to verify whether fusion is valid, preferably clearly
> separated from the current code checking interleaving validity.
>
> I'm not 100% convinced the interleaving validity check is correct for
> the outer loop vectorization case.
>
> I think it helps to reduce the dependence checking code to what we do
> for unroll-and-jam:
>
> Index: gcc/tree-vect-data-refs.c
> ===
> --- gcc/tree-vect-data-refs.c   (revision 255777)
> +++ gcc/tree-vect-data-refs.c   (working copy)
> @@ -378,7 +378,26 @@ vect_analyze_data_ref_dependence (struct
> dump_printf_loc (MSG_NOTE, vect_location,
>   "dependence distance  = %d.\n", dist);
>
> -  if (dist == 0)
> +  if (dist < 0)
> +   gcc_unreachable ();
> +
> +  else if (dist >= *max_vf)
> +   {
> + /* Dependence distance does not create dependence, as far as
> +vectorization is concerned, in this case.  */
> + if (dump_enabled_p ())
> +   dump_printf_loc (MSG_NOTE, vect_location,
> +"dependence distance >= VF.\n");
> + continue;
> +   }
> +
> +  else if (DDR_NB_LOOPS (ddr) == 2
> +  && (lambda_vector_lexico_pos (dist_v + 1, DDR_NB_LOOPS (ddr) - 
> 1)
> +  || (lambda_vector_zerop (dist_v + 1, DDR_NB_LOOPS (ddr) - 
> 1)
> +  && dist > 0)))
> +   continue;
> +
> +  else if (dist == 0)
> {
>   if (dump_enabled_p ())
> {
> @@ -427,26 +446,7 @@ vect_analyze_data_ref_dependence (struct
>   continue;
> }
>
> -  if (dist > 0 && DDR_REVERSED_P (ddr))
> -   {
> - /* If DDR_REVERSED_P the order of the data-refs in DDR was
> -reversed (to make distance vector positive), and the actual
> -distance is negative.  */
> - if (dump_enabled_p ())
> -   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> -"dependence distance negative.\n");
> - /* Record a negative dependence distance to later limit the
> -amount of stmt copying / unrolling we can perform.
> -Only need to handle read-after-write dependence.  */
> - if (DR_IS_READ (drb)
> - && (STMT_VINFO_MIN_NEG_DIST (stmtinfo_b) == 0
> - || STMT_VINFO_MIN_NEG_DIST (stmtinfo_b) > (unsigned)dist))
> -   STMT_VINFO_MIN_NEG_DIST (stmtinfo_b) = dist;
> - continue;
> -   }
Simply removing DDR_REVERSED_P doesn't work, as below failures indicating.
> -
> -  if (abs (dist) >= 2
> - && abs (dist) < *max_vf)
> +  if (dist >= 2)
> {
>   /* The dependence distance requires reduction of the maximal
>  vectorization factor.  */
> @@ -455,15 +455,6 @@ vect_analyze_data_ref_dependence (struct
> dump_printf_loc (MSG_NOTE, vect_location,
>  "adjusting maximal vectorization factor to %i\n",
>  *max_vf);
> -   }
> -
> -  if (abs (dist) >= *max_vf)
> -   {
> - /* Dependence distance does not create depen

Re: [PATCH, PR83492] Fix selection of aarch64 big-endian shift parameters based on __AARCH64EB__

2017-12-20 Thread Michael Weiser
Hi Kyrill,

On Wed, Dec 20, 2017 at 09:25:07AM +, Kyrill Tkachov wrote:

> > A patch to use __ARM_BIG_ENDIAN in those two spots is approved.
> I'll take care of the opt_random occurrence.
> Michael, would you like to respin your patch to libcpp to use
> __ARM_BIG_ENDIAN?

Happy to. Created yesterday against trunk, no changes with today's trunk.
Tested successfully by applying to 7.2.0, bootstrapping on
aarch64_be-unknown-linux-gnu and compiling various OS packages with the
resulting compiler.

2017-12-20  Michael Weiser  

PR preprocessor/83492
* lex.c (search_line_fast) [__ARM_NEON && __ARM_64BIT_STATE]:
Fix selection of big-endian shift parameters by using
__ARM_BIG_ENDIAN.

Index: libcpp/lex.c
===
--- libcpp/lex.c(revision 255828)
+++ libcpp/lex.c(working copy)
@@ -772,7 +772,7 @@
   const uint8x16_t repl_qm = vdupq_n_u8 ('?');
   const uint8x16_t xmask = (uint8x16_t) vdupq_n_u64 (0x8040201008040201ULL);
 
-#ifdef __AARCH64EB
+#ifdef __ARM_BIG_ENDIAN
   const int16x8_t shift = {8, 8, 8, 8, 0, 0, 0, 0};
 #else
   const int16x8_t shift = {0, 0, 0, 0, 8, 8, 8, 8};
-- 
Michael


Re: [PATCH, PR83492] Fix selection of aarch64 big-endian shift parameters based on __AARCH64EB__

2017-12-20 Thread Kyrill Tkachov


On 20/12/17 14:58, Michael Weiser wrote:

Hi Kyrill,

On Wed, Dec 20, 2017 at 09:25:07AM +, Kyrill Tkachov wrote:


A patch to use __ARM_BIG_ENDIAN in those two spots is approved.

I'll take care of the opt_random occurrence.
Michael, would you like to respin your patch to libcpp to use
__ARM_BIG_ENDIAN?

Happy to. Created yesterday against trunk, no changes with today's trunk.
Tested successfully by applying to 7.2.0, bootstrapping on
aarch64_be-unknown-linux-gnu and compiling various OS packages with the
resulting compiler.


Thanks! I think that's the first report I've seen of an 
aarch64_be-unknown-linux-gnu bootstrap :)

Since Jeff approved such a change already I've applied it
for you to trunk as r255896.

I'll give it a day or two for any auto-testers to give it a
try before backporting to the GCC 7 branch.

Thanks,
Kyrill


2017-12-20  Michael Weiser  

PR preprocessor/83492
* lex.c (search_line_fast) [__ARM_NEON && __ARM_64BIT_STATE]:
Fix selection of big-endian shift parameters by using
__ARM_BIG_ENDIAN.

Index: libcpp/lex.c
===
--- libcpp/lex.c(revision 255828)
+++ libcpp/lex.c(working copy)
@@ -772,7 +772,7 @@
const uint8x16_t repl_qm = vdupq_n_u8 ('?');
const uint8x16_t xmask = (uint8x16_t) vdupq_n_u64 (0x8040201008040201ULL);
  
-#ifdef __AARCH64EB

+#ifdef __ARM_BIG_ENDIAN
const int16x8_t shift = {8, 8, 8, 8, 0, 0, 0, 0};
  #else
const int16x8_t shift = {0, 0, 0, 0, 8, 8, 8, 8};




[C++ Patch] PR 81055 ("[6/7/8 Regression] ICE with invalid initializer for array new")

2017-12-20 Thread Paolo Carlini

Hi,

in this error recovery regression, after a sensible error produced by 
unqualified_name_lookup_error we ICE much later when 
gimplify_modify_expr encounters a corresponding error_mark_node as 
second argument of a MODIFY_EXPR. I believe we have a very general error 
recovery weakness with errors like unqualified_name_lookup_error and 
functions like cp_parser_initializer_list returning a vec: certainly we 
don't want to give up the parsing too early but then we have to cope 
with error_mark_nodes filtering down and reappearing much later in the 
compilation. The present bug is a rather clear example, but I have seen 
many others in the past: a couple of times I even tried doing something 
about it, but I have yet to figure out something worth sending to the 
mailing list. Anyway, here I'm wondering if at this stage it would make 
sense to handle the error_mark_node in gimplify_modify_expr - I believe 
we do have a couple other cases of such late handling in the gimplifier. 
Tested x86_64-linux.


Thanks, Paolo.

//

2017-12-20  Paolo Carlini  

PR c++/81055
* gimplify.c (gimplify_modify_expr): When the second operand of
the expression is error_mark_node immediately return GS_ERROR.

/testsuite
2017-12-20  Paolo Carlini  

PR c++/81055
* g++.dg/cpp0x/new2.C: New.
Index: gimplify.c
===
--- gimplify.c  (revision 255894)
+++ gimplify.c  (working copy)
@@ -5554,6 +5553,9 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pr
   gcc_assert (TREE_CODE (*expr_p) == MODIFY_EXPR
  || TREE_CODE (*expr_p) == INIT_EXPR);
 
+  if (*from_p == error_mark_node)
+return GS_ERROR;
+
   /* Trying to simplify a clobber using normal logic doesn't work,
  so handle it here.  */
   if (TREE_CLOBBER_P (*from_p))
Index: testsuite/g++.dg/cpp0x/new2.C
===
--- testsuite/g++.dg/cpp0x/new2.C   (nonexistent)
+++ testsuite/g++.dg/cpp0x/new2.C   (working copy)
@@ -0,0 +1,4 @@
+// PR c++/81055
+// { dg-do compile { target c++11 } }
+
+int** p = new int*[1]{q};  // { dg-error "not declared" }


Re: [PATCH] Fix gcov-dump tool for GCDA files (PR gcov-profile/83509).

2017-12-20 Thread Nathan Sidwell

On 12/20/2017 09:43 AM, Martin Liška wrote:

Hi.

It's quite simple bug, where we read more than allowed in case of GCDA files.
Tested on couple of source files.


ok


--
Nathan Sidwell


Re: [PATCH] Fix gcov-dump tool for GCDA files (PR gcov-profile/83509).

2017-12-20 Thread Jeff Law
On 12/20/2017 07:43 AM, Martin Liška wrote:
> Hi.
> 
> It's quite simple bug, where we read more than allowed in case of GCDA files.
> Tested on couple of source files.
> 
> Ready for trunk?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2017-12-20  Martin Liska  
> 
>   PR gcov-profile/83509
>   * gcov-dump.c (dump_gcov_file): Do not read info about
>   support_unexecuted_blocks for gcda files.
OK.
jeff


Re: [middle-end, obvious] Remove dead error_mark_node check

2017-12-20 Thread Jeff Law
On 12/20/2017 06:36 AM, Paolo Carlini wrote:
> Hi,
> 
> today noticed this nit too: when ret_expr == error_mark_node
> gimplify_return_expr immediatey returns GS_ERROR.
> 
> Thanks, Paolo.
> 
> //
> 
> 
> 
> CL
> 
> 
> 2017-12-20  Paolo Carlini  
> 
>   * gimplify.c (gimplify_return_expr): Remove dead error_mark_node check.
OK.
jeff


Re: [PATCH] Fix up one mishandled plural in diagnostics in gimple-ssa-sprintf.c

2017-12-20 Thread David Malcolm
On Tue, 2017-12-19 at 18:58 +0100, Jakub Jelinek wrote:
> On Tue, Dec 19, 2017 at 10:49:07AM -0700, Martin Sebor wrote:
> > Can the math be moved into inform_n (and warning_n) itself?
> 
> No.  I'm against having dozens of inform_n and warning_n etc.
> overloads, it is already bad enough we have so many diagnostics
> entry points now that putting breakpoints on them is so hard.

FWIW, I added a "break-on-diagnostic" command to our .gdbinit script,
in case that helps:

  https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01382.html

> Those functions just have int, and we can have
> perhaps some other function (plural_count or some better name)
> that is inline and overloaded on the various types and can be used
> if the count isn't int.
> 
> As for gimple-ssa-sprintf.c, there are several other cases that
> want a plural form, but I haven't changed them because
> format_warning_at_substring doesn't have _n suffixed variant.
> 
>   Jakub


Re: [PATCH] diagnose attribute conflicts on conversion operators (PR 83394)

2017-12-20 Thread Martin Sebor

On 12/19/2017 10:25 AM, Jason Merrill wrote:

On 12/18/2017 05:25 PM, Martin Sebor wrote:

On 12/13/2017 02:38 PM, Jason Merrill wrote:

On Wed, Dec 13, 2017 at 12:54 PM, Martin Sebor  wrote:

The attached update also fixes both instances of the ICE
reported in bug 83322 and supersedes Jakub's patch for that
bug (https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00765.html).
This passes bootstrap on x86_64 with no new regressions (there
are an increasing number of failures on trunk at the moment
but, AFAICS, none caused by this patch).

Jason, I'm still trying to come up with a test case for templates
that would illustrate the issue you're concerned about.  If you
have one that would be great (preferably one showing a regression).


I looked at the case I was concerned about, and found that it isn't an
issue because in that case we call duplicate_decls before applying
attributes.

But it looks like we'll still get this testcase wrong, because the
code assumes that if the old decl is a single _DECL, it must match.

[[gnu::noinline]] void f() { }
[[gnu::always_inline]] void f(int) { }  // OK, not the same function

I think the answer is to use Nathan's new iterators unconditionally,
probably lkp_iterator.


Thanks for the test case.  You're right that this problem still
exists.  I thought a complete fix for it would be simple enough
to include in this patch but after running into issues with
assumptions about how inline/noinline conflicts are resolved
I think it's best to fix the ICE alone in this patch and deal
with the pre-existing bug above in a follow up.  Apparently
(according to comment #6 on pr83322) the ICE is causing some
anxiety about the timely availability of a fix, so I'd like
to avoid spending more time on it than is necessary.

Attached is an updated patch.  It handles the overloads above
correctly but it doesn't fix the latent problem and so they
are still diagnosed, same as in GCC 7.


I still share Jakub's uneasiness with this approach; looking up a
redeclaration works rather differently from a normal lookup, and we
already have code for handling that.  I still think that handling this
stuff by extending diagnose_mismatched_attributes is a better way to go.

That said, it's good to fix the ICE as a stopgap.


I committed the patch without the block below in r255844.

I've been continuing to try to find a test case that shows
the problem you and Jakub are concerned about.  I've created
a bunch of passing test cases that I think would be useful to
add to the test suite.  I've also found a few pre-existing bugs
but (AFAICT) nothing that conclusively implicates the new code.

I've opened the following for these problems.  The first one
is the one you pointed out.  The GCC 8 regression (pr83394)
is triggered by my patch.  I haven't debugged it yet but I
wonder if it's due to the same root cause as 83502).

83498 - bogus -Wattributes for always_inline and noinline on
distinct overloads
83504 - incorrect attribute const interpretation on function
overloads
83502 - [6/7/8 Regression] bogus -Wattributes for always_inline
and noinline on function template specialization
83394 - [8 Regression] always_inline vs. noinline no longer
diagnosed​

Martin




+  if (TREE_CODE (last_decl) == TREE_LIST)
+{
+  /* The list contains a mix of symbols with the same name
+ (e.g., functions and data members defined in different
+ base classes).  */
+  do
+{
+  if (decls_match (decl, TREE_VALUE (last_decl)))
+return TREE_VALUE (last_decl);
+
+  last_decl = TREE_CHAIN (last_decl);
+}
+  while (last_decl);
+}


We shouldn't need to handle TREE_LIST at all, as getting that result
should indicate that there isn't any declaration in the scope we care
about; decls from base classes will never match.

OK with this block removed.

Jason




Re: [v2 of PATCH 13/14] c-format.c: handle location wrappers

2017-12-20 Thread David Malcolm
On Tue, 2017-12-19 at 14:55 -0500, Jason Merrill wrote:
> On 12/17/2017 11:29 AM, David Malcolm wrote:
> > On Mon, 2017-12-11 at 18:45 -0500, Jason Merrill wrote:
> > > On 11/10/2017 04:45 PM, David Malcolm wrote:
> > > > +  STRIP_ANY_LOCATION_WRAPPER (format_tree);
> > > > +
> > > >   if (VAR_P (format_tree))
> > > > {
> > > >   /* Pull out a constant value if the front end
> > > > didn't.  */
> > > 
> > > It seems like we want fold_for_warn here instead of the special
> > > variable
> > > handling.  That probably makes sense for the other places you
> > > change in
> > > this patch, too.
> > 
> > Here's an updated version of the patch which uses fold_for_warn,
> > rather than STRIP_ANY_LOCATION_WRAPPER.
> > 
> > In one place it was necessary to add a STRIP_NOPS, since the
> > fold_for_warn can add a cast around a:
> >ADDR_EXPR  (
> >  STRING_CST)
> > turning it into a:
> >NOP_EXPR (
> >  ADDR_EXPR  (
> >STRING_CST))
> 
> Hmm, that seems like a bug.  fold_for_warn shouldn't change the type
> of 
> the result.

In a similar vein, is the result of decl_constant_value (decl) required
to be the same type as that of the decl?

What's happening for this testcase (g++.dg/warn/Wformat-1.C) is that we
have a VAR_DECL with a DECL_INITIAL, but the types of the two don't
quite match up.

decl_constant_value returns an unshare_expr clone of the DECL_INITIAL,
and this is what's returned from fold_for_warn.

Am I right in thinking

(a) that the bug here is that a DECL_INITIAL ought to have the same
type as its decl, and so there's a missing cast where that gets
set up, or

(b) should decl_constant_value handle such cases, and introduce a cast
if it uncovers them?

Thanks
Dave



> > +  format_tree = fold_for_warn (format_tree);
> > +
> > if (VAR_P (format_tree))
> >   {
> > /* Pull out a constant value if the front end didn't.  */
> 
> I was suggesting dropping the if (VAR_P... block, since pulling out
> the 
> constant value should now be covered by fold_for_warn.
> 
> > -format_tree = TREE_OPERAND (format_tree, 0);
> > +{
> > +  format_tree = fold_for_warn (TREE_OPERAND (format_tree, 0));
> > +}
> 
> Why the added braces?

(I messed up; will be fixed in the next version)


Re: [PATCH] Append PWD to path when using -fprofile-generate=/some/path.

2017-12-20 Thread Martin Sebor

On 10/27/2017 07:17 AM, Martin Liška wrote:

Hello.

It's improvement that I consider still useful even though we're not
going to use
it for profiled bootstrap.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.


For what it's worth, although it looks correct as is, I think
the code in the patch would be clearer and less prone to off
by-1 errors, while at the same time equally as efficient, if
it were written in terms of strcpy and strcat:

  char *b = XNEWVEC (char,
 2 + strlen (pwd)
 + strlen (profile_data_prefix));

  strcpy (b, profile_data_prefix);
  strcat (b, "/");
  strcat (b, pwd);

GCC has all the smarts to turn this code into the elaborated
form using memcpy.  Plus, it's getting better at checking
string functions for invalid accesses, but it's not quite
as good at checking raw memory functions, or at checking
combinations of those and direct accesses to array elements.
We might as well eat our own dog food and put all this
machinery to good use within GCC itself.

Martin


Re: [RFC] Add means to split dump file into several files -- Use in lra

2017-12-20 Thread Sandra Loosemore

On 12/20/2017 07:42 AM, Tom de Vries wrote:

On 12/14/2017 06:00 PM, Vladimir Makarov wrote:
What is inconvenient is that sometimes I need to see RTL after each 
subpass. Currently I do it by calling debug_rtx_range in gdb after the 
subpass in question.




I've added dumping of the RTL after each subpass.

So this approach would be useful at least for me in some cases. But I 
think it should be not a default approach


I've added an option flra-split-dump, off by default.

The dump files look like:
...
$ ls -1 reduc-1.c.278r.reload*
reduc-1.c.278r.reload
reduc-1.c.278r.reload.001.remove_scratches
reduc-1.c.278r.reload.002.lra_constraints
reduc-1.c.278r.reload.003.lra_inheritance
reduc-1.c.278r.reload.004.lra_create_live_ranges
reduc-1.c.278r.reload.005.lra_assign
reduc-1.c.278r.reload.006.lra_undo_inheritance
reduc-1.c.278r.reload.007.lra_constraints
reduc-1.c.278r.reload.008.lra_finishing
reduc-1.c.278r.reload.009.remove_scratches
reduc-1.c.278r.reload.010.lra_constraints
reduc-1.c.278r.reload.011.lra_inheritance
reduc-1.c.278r.reload.012.lra_create_live_ranges
reduc-1.c.278r.reload.013.lra_assign
reduc-1.c.278r.reload.014.lra_undo_inheritance
reduc-1.c.278r.reload.015.lra_constraints
reduc-1.c.278r.reload.016.lra_finishing
reduc-1.c.278r.reload.017.remove_scratches
reduc-1.c.278r.reload.018.lra_constraints
reduc-1.c.278r.reload.019.lra_inheritance
reduc-1.c.278r.reload.020.lra_create_live_ranges
reduc-1.c.278r.reload.021.lra_assign
reduc-1.c.278r.reload.022.lra_undo_inheritance
reduc-1.c.278r.reload.023.lra_constraints
reduc-1.c.278r.reload.024.lra_finishing
...

The main dump file "reduc-1.c.278r.reload" contains all the dumping that 
is done by the pass manager, not by the lra pass itself.


Bootstrapped and reg-tested on x86_64.

Any further comments?


The new option needs a help string in the .opt file and documentation in 
invoke.texi, no?


-Sandra


Re: [PATCH] Append PWD to path when using -fprofile-generate=/some/path.

2017-12-20 Thread Jakub Jelinek
On Wed, Dec 20, 2017 at 10:35:44AM -0700, Martin Sebor wrote:
> On 10/27/2017 07:17 AM, Martin Liška wrote:
> > Hello.
> > 
> > It's improvement that I consider still useful even though we're not
> > going to use
> > it for profiled bootstrap.
> > 
> > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> For what it's worth, although it looks correct as is, I think
> the code in the patch would be clearer and less prone to off
> by-1 errors, while at the same time equally as efficient, if
> it were written in terms of strcpy and strcat:
> 
>   char *b = XNEWVEC (char,
>  2 + strlen (pwd)
>  + strlen (profile_data_prefix));
> 
>   strcpy (b, profile_data_prefix);
>   strcat (b, "/");
>   strcat (b, pwd);

I disagree.  One thing is that GCC attempts to optimize even bad code;
in many cases just can't as soon as there is some call in between etc.,
the other is that we just shouldn't give bad examples.
When you know the sizes, we shouldn't be lazy and rely on compiler to figure
out what we should have written.  Though on this case we have a nice
function for all of that,
  char *b = concat (profile_data_prefix, "/", pwd, NULL);

Another thing is that the "/" in there is wrong, so
  const char dir_separator_str[] = { DIR_SEPARATOR, '\0' };
  char *b = concat (profile_data_prefix, dir_separator_str, pwd, NULL);
needs to be used instead.
Does profile_data_prefix have any dir separators stripped from the end?
Is pwd guaranteed to be relative in this case?

Jakub


Re: [PATCH] Replace Yoda conditions in gcc/

2017-12-20 Thread Martin Sebor

On 12/20/2017 01:29 AM, Eric Botcazou wrote:

If some port maintainer used 1 <= x && x <= 24 style and doesn't like
x >= 1 && x <= 24 for some reason, there is always IN_RANGE macro and
IN_RANGE (x, 1, 24) can be used instead (though, such a change requires
double checking the type of x, it shouldn't be wider than HOST_WIDE_INT).


Yes, the visium changes are rather counter-productive, especially in
visium_legitimize_address & visium_legitimize_reload_address where the style
is clearly inconsistent now.

As far as I'm concerned, reading (a < x && x < b) is twice as fast as reading
(x > a && x < b).  And IN_RANGE is too ambiguous wrt the bounds.


FWIW, I agree.  We should all be comfortable reading code either
way.  If it doesn't matter in a < b where a is a macro or enum or
some constant then it shouldn't matter if it's 0 < b with a literal.

But I've lost that battle several times in the past.

Martin



Re: [PATCH] Append PWD to path when using -fprofile-generate=/some/path.

2017-12-20 Thread Martin Sebor

On 12/20/2017 10:45 AM, Jakub Jelinek wrote:

On Wed, Dec 20, 2017 at 10:35:44AM -0700, Martin Sebor wrote:

On 10/27/2017 07:17 AM, Martin Liška wrote:

Hello.

It's improvement that I consider still useful even though we're not
going to use
it for profiled bootstrap.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.


For what it's worth, although it looks correct as is, I think
the code in the patch would be clearer and less prone to off
by-1 errors, while at the same time equally as efficient, if
it were written in terms of strcpy and strcat:

  char *b = XNEWVEC (char,
 2 + strlen (pwd)
 + strlen (profile_data_prefix));

  strcpy (b, profile_data_prefix);
  strcat (b, "/");
  strcat (b, pwd);


I disagree.


Of course you do.


 One thing is that GCC attempts to optimize even bad code;
in many cases just can't as soon as there is some call in between etc.,
the other is that we just shouldn't give bad examples.
When you know the sizes, we shouldn't be lazy and rely on compiler to figure
out what we should have written.


It's my turn to disagree now.  We most certainly should be lazy
and rely on our tools and high-level abstractions to do the work
for us.  That's what they're for.


Though on this case we have a nice
function for all of that,
  char *b = concat (profile_data_prefix, "/", pwd, NULL);


Even better (unless this is being lazy ;-)

Martin


[v3 of 05/14] C++: handle locations wrappers when calling warn_for_memset

2017-12-20 Thread David Malcolm
On Tue, 2017-12-19 at 15:02 -0500, Jason Merrill wrote:
> On 12/16/2017 03:01 PM, Martin Sebor wrote:
> > On 12/16/2017 06:12 AM, David Malcolm wrote:
> > > On Mon, 2017-12-11 at 18:36 -0500, Jason Merrill wrote:
> > > > On 11/10/2017 04:45 PM, David Malcolm wrote:
> > > > > We need to strip away location wrappers in tree.c predicates
> > > > > like
> > > > > integer_zerop, otherwise they fail when they're called on
> > > > > wrapped INTEGER_CST; an example can be seen for
> > > > >c-c++-common/Wmemset-transposed-args1.c
> > > > > in g++.sum, where the warn_for_memset fails to detect integer
> > > > > zero
> > > > > if the location wrappers aren't stripped.
> > > > 
> > > > These shouldn't be needed; callers should have folded away
> > > > location
> > > > wrappers.  I would hope for STRIP_ANY_LOCATION_WRAPPER to be
> > > > almost
> > > > never needed.
> > > > 
> > > > warn_for_memset may be missing some calls to fold_for_warn.
> > > 
> > > It is, thanks.
> > > 
> > > Here's a revised fix for e.g. Wmemset-transposed-args1.c,
> > > replacing
> > > "[PATCH 05/14] tree.c: strip location wrappers from integer_zerop
> > > etc"
> > > and
> > > "[PATCH 10/14] warn_for_memset: handle location wrappers"
> > > 
> > > This version of the patch simply calls fold_for_warn on each of
> > > the
> > > arguments before calling warn_for_memset.  This ensures that
> > > warn_for_memset detects integer zero.  It also adds a
> > > literal_integer_zerop to deal with location wrapper nodes when
> > > building "literal_mask" for the call, since this must be called
> > > *before* the fold_for_warn calls.
> > > 
> > > Successfully bootstrapped®rtested on x86_64-pc-linux-gnu, as
> > > part of the kit.
> > > 
> > > Is this OK for trunk, once the rest of the kit is approved?
> > > 
> > > gcc/cp/ChangeLog:
> > > * parser.c (literal_integer_zerop): New function.
> > > (cp_parser_postfix_expression): When calling warn_for_memset,
> > > replace integer_zerop calls with literal_integer_zerop, and
> > > call fold_for_warn on the arguments.
> > > ---
> > >  gcc/cp/parser.c | 18 --
> > >  1 file changed, 16 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> > > index d15769a..7bca460 100644
> > > --- a/gcc/cp/parser.c
> > > +++ b/gcc/cp/parser.c
> > > @@ -6621,6 +6621,16 @@ cp_parser_compound_literal_p (cp_parser
> > > *parser)
> > >return compound_literal_p;
> > >  }
> > > 
> > > +/* Return 1 if EXPR is the integer constant zero or a complex
> > > constant
> > > +   of zero, without any folding, but ignoring location
> > > wrappers.  */
> > > +
> > > +static int
> > > +literal_integer_zerop (const_tree expr)
> > > +{
> > > +  STRIP_ANY_LOCATION_WRAPPER (expr);
> > > +  return integer_zerop (expr);
> > > +}
> > > +
> > >  /* Parse a postfix-expression.
> > > 
> > > postfix-expression:
> > > @@ -7168,8 +7178,12 @@ cp_parser_postfix_expression (cp_parser 
> > > *parser, bool address_p, bool cast_p,
> > >  tree arg0 = (*args)[0];
> > >  tree arg1 = (*args)[1];
> > >  tree arg2 = (*args)[2];
> > > -int literal_mask = ((!!integer_zerop (arg1) << 1)
> > > -| (!!integer_zerop (arg2) << 2));
> > > +/* Handle any location wrapper nodes.  */
> > > +arg0 = fold_for_warn (arg0);
> > > +int literal_mask = ((!!literal_integer_zerop (arg1) <<
> > > 1)
> > > +| (!!literal_integer_zerop (arg2) << 2));
> > 
> > The double negation jumped out at me.  The integer_zerop() function
> > looks like a predicate that hasn't yet been converted to return
> > bool.
> > It seems that new predicates that are implemented on top of it
> > could
> > return bool and their callers avoid this conversion.  (At some
> > point
> > in the future it would also be nice to convert the existing
> > predicates to return bool).
> 
> Agreed.  And since integer_zerop in fact returns boolean values,
> there 
> seems to be no need for the double negative in the first place.
> 
> So, please make the new function return bool and remove the !!s.
> 
> And I think the fold_for_warn call should be in warn_for_memset, and
> it 
> should be called for arg2 as well instead of specifically handling 
> CONST_DECL Here.

Thanks.

Here's an updated version of the patch which I believe covers all
of the above.

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu, as
part of the kit.

OK for trunk once the rest of the kit is approved?

gcc/c-family/ChangeLog:
* c-warn.c (warn_for_memset): Call fold_for_warn on the arguments.

gcc/cp/ChangeLog:
* parser.c (literal_integer_zerop): New function.
(cp_parser_postfix_expression): When calling warn_for_memset,
replace integer_zerop calls with literal_integer_zerop,
eliminating the double logical negation cast to bool.
Eliminate the special-casing for CONST_DECL in favor of the
fold_for_warn within warn_for_memset.
---
 gcc/c-f

[PATCH] Fix ICE with -ftree-parallelize-loops=2 -fno-ipa-pure-const (PR ipa/83506)

2017-12-20 Thread Jakub Jelinek
Hi!

Prathamesh's r254140 moved the
  if (!flag_wpa)
ipa_free_fn_summary ();
from the ipa-inline pass to the following ipa-pure-const pass.
That doesn't work well though, because ipa-inline pass has unconditional
gate, so is done always when real IPA passes are performed, but
ipa-pure-const is gated on some flag_* options, so if those options
are disabled, we keep fn summary computed until end of compilation, and
e.g. new function insertion hooks then attempt to update the fnsummary
even when with RTL hooks registered.

Calling ipa_free_fn_summary () in ipa-inline pass if the gate of
the following pass will not be run looks too ugly to me, and while there
is another unconditional ipa pass, moving it there doesn't look too clean to
me either.

We already have a separate pass ipa-free-fn-summary, just use it so far only
during small IPA passes.  I think the cleanest is to add another ipa pass
that will just free it instead of sticking it into unrelated passes which
just happen to be unconditional ATM.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-12-20  Jakub Jelinek  

PR ipa/83506
* ipa-fnsummary.c (pass_data_ipa_free_fn_summary): Use 0 for
todo_flags_finish.
(pass_ipa_free_fn_summary): Add small_p private data member,
initialize to false in the ctor.
(pass_ipa_free_fn_summary::clone,
pass_ipa_free_fn_summary::set_pass_param,
pass_ipa_free_fn_summary::gate): New methods.
(pass_ipa_free_fn_summary::execute): Return TODO_remove_functions
| TODO_dump_symtab if small_p.
* passes.def: Add true parm for the existing pass_ipa_free_fn_summary
entry and add another instance of the pass with false parm after
ipa-pure-const.
* ipa-pure-const.c (pass_ipa_pure_const): Don't call
ipa_free_fn_summary here.

* gcc.dg/pr83506.c: New test.
* gcc.dg/ipa/ctor-empty-1.c: Use -fdump-ipa-free-fnsummary1 instead
of -fdump-ipa-free-fnsummary and scan in free-fnsummary1 instead of
free-fnsummary dump.

--- gcc/ipa-fnsummary.c.jj  2017-12-20 11:01:13.0 +0100
+++ gcc/ipa-fnsummary.c 2017-12-20 11:43:40.462288141 +0100
@@ -3538,26 +3538,36 @@ const pass_data pass_data_ipa_free_fn_su
   0, /* properties_provided */
   0, /* properties_destroyed */
   0, /* todo_flags_start */
-  /* Early optimizations may make function unreachable.  We can not
- remove unreachable functions as part of the ealry opts pass because
- TODOs are run before subpasses.  Do it here.  */
-  ( TODO_remove_functions | TODO_dump_symtab ), /* todo_flags_finish */
+  0, /* todo_flags_finish */
 };
 
 class pass_ipa_free_fn_summary : public simple_ipa_opt_pass
 {
 public:
   pass_ipa_free_fn_summary (gcc::context *ctxt)
-: simple_ipa_opt_pass (pass_data_ipa_free_fn_summary, ctxt)
+: simple_ipa_opt_pass (pass_data_ipa_free_fn_summary, ctxt),
+  small_p (false)
   {}
 
   /* opt_pass methods: */
+  opt_pass *clone () { return new pass_ipa_free_fn_summary (m_ctxt); }
+  void set_pass_param (unsigned int n, bool param)
+{
+  gcc_assert (n == 0);
+  small_p = param;
+}
+  virtual bool gate (function *) { return small_p || !flag_wpa; }
   virtual unsigned int execute (function *)
 {
   ipa_free_fn_summary ();
-  return 0;
+  /* Early optimizations may make function unreachable.  We can not
+remove unreachable functions as part of the early opts pass because
+TODOs are run before subpasses.  Do it here.  */
+  return small_p ? TODO_remove_functions | TODO_dump_symtab : 0;
 }
 
+private:
+  bool small_p;
 }; // class pass_ipa_free_fn_summary
 
 } // anon namespace
--- gcc/passes.def.jj   2017-12-18 14:57:20.0 +0100
+++ gcc/passes.def  2017-12-20 11:31:32.402117369 +0100
@@ -144,7 +144,7 @@ along with GCC; see the file COPYING3.
   PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile)
   NEXT_PASS (pass_feedback_split_functions);
   POP_INSERT_PASSES ()
-  NEXT_PASS (pass_ipa_free_fn_summary);
+  NEXT_PASS (pass_ipa_free_fn_summary, true /* small_p */);
   NEXT_PASS (pass_ipa_increase_alignment);
   NEXT_PASS (pass_ipa_tm);
   NEXT_PASS (pass_ipa_lower_emutls);
@@ -161,6 +161,7 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_ipa_fn_summary);
   NEXT_PASS (pass_ipa_inline);
   NEXT_PASS (pass_ipa_pure_const);
+  NEXT_PASS (pass_ipa_free_fn_summary, false /* small_p */);
   NEXT_PASS (pass_ipa_reference);
   /* This pass needs to be scheduled after any IP code duplication.   */
   NEXT_PASS (pass_ipa_single_use);
--- gcc/ipa-pure-const.c.jj 2017-12-04 22:29:11.0 +0100
+++ gcc/ipa-pure-const.c2017-12-20 11:33:11.905956408 +0100
@@ -2013,10 +2013,6 @@ execute (function *)
 if (has_function_state (node))
   free (get_function_state (node));
   funct_state_vec.release ();
-
-  /* In WPA we use inline summaries for partitioning process.  */
-  if (!flag_wpa)
-  

[v3 of PATCH 15/14] Use fold_for_warn in get_atomic_generic_size

2017-12-20 Thread David Malcolm
On Tue, 2017-12-19 at 23:22 -0500, Jason Merrill wrote:
> On Tue, Dec 19, 2017 at 7:53 PM, David Malcolm 
> wrote:
> > On Tue, 2017-12-19 at 15:35 -0500, Jason Merrill wrote:
> > > On Sat, Dec 16, 2017 at 8:12 PM, David Malcolm  > > om>
> > > wrote:
> > > > I rebased the v2 patchkit; here's an extra patch to fix an
> > > > issue
> > > > with it uncovered by a recently-added testcase (in r254990).
> > > > 
> > > > With the patch kit, but without this patch, g++'s
> > > >   c-c++-common/pr83059.c
> > > > fails to emit the "invalid memory model argument 6" warning.
> > > > 
> > > > Successfully bootstrapped®rtested on x86_64-pc-linux-gnu, as
> > > > part of the kit.
> > > > 
> > > > Is this OK for trunk, assuming the rest of the kit is approved?
> > > > 
> > > > gcc/c-family/ChangeLog:
> > > > * c-common.c (get_atomic_generic_size): Call
> > > > fold_for_warn
> > > > on the
> > > > params before checking for INTEGER_CST.
> > > > ---
> > > >  gcc/c-family/c-common.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> > > > index 3438b87..ab03b7d 100644
> > > > --- a/gcc/c-family/c-common.c
> > > > +++ b/gcc/c-family/c-common.c
> > > > @@ -6720,7 +6720,7 @@ get_atomic_generic_size (location_t loc,
> > > > tree
> > > > function,
> > > >/* Check memory model parameters for validity.  */
> > > >for (x = n_param - n_model ; x < n_param; x++)
> > > >  {
> > > > -  tree p = (*params)[x];
> > > > +  tree p = fold_for_warn ((*params)[x]);
> > > >if (TREE_CODE (p) == INTEGER_CST)
> > > >  {
> > > >   /* memmodel_base masks the low 16 bits, thus ignore
> > > > any
> > > > bits above
> > > 
> > > Let's check the error case before we call fold_for_warn.
> > > 
> > > Jason
> > 
> > Do you mean like this?  (bootstrapped; regrtest in progress)
> 
> Ah, no, sorry I wasn't clear.  I meant to reorder the if/else there,
> so we check for INTEGER_TYPE first and potentially give an error,
> then
> fold, and then potentially warn.
> 
> Jason

Aha - thanks!

Here's an updated version of the patch.

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu, as
part of the kit.

OK for trunk once the rest of the kit is approved?

gcc/c-family/ChangeLog:
* c-common.c (get_atomic_generic_size): Perform the test for
integral type before the range test for any integer constant,
fixing indentation of braces.  Call fold_for_warn before
testing for an INTEGER_CST.
---
 gcc/c-family/c-common.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 3438b87..7cc749b 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -6721,8 +6721,15 @@ get_atomic_generic_size (location_t loc, tree function,
   for (x = n_param - n_model ; x < n_param; x++)
 {
   tree p = (*params)[x];
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (p)))
+   {
+ error_at (loc, "non-integer memory model argument %d of %qE", x + 1,
+   function);
+ return 0;
+   }
+  p = fold_for_warn (p);
   if (TREE_CODE (p) == INTEGER_CST)
-{
+   {
  /* memmodel_base masks the low 16 bits, thus ignore any bits above
 it by using TREE_INT_CST_LOW instead of tree_to_*hwi.  Those high
 bits will be checked later during expansion in target specific
@@ -6732,14 +6739,7 @@ get_atomic_generic_size (location_t loc, tree function,
"invalid memory model argument %d of %qE", x + 1,
function);
}
-  else
-   if (!INTEGRAL_TYPE_P (TREE_TYPE (p)))
- {
-   error_at (loc, "non-integer memory model argument %d of %qE", x + 1,
-  function);
-   return 0;
- }
-  }
+}
 
   return size_0;
 }
-- 
1.8.5.3



[PATCH] Fix some x86 OPTION_MASK_ISA* issues (PR target/83488)

2017-12-20 Thread Jakub Jelinek
Hi!

As you know, we ran out of ix86_isa_flags bitmask bits some time ago.
The testcase from the PR's #c0 (which I'm not adding into testsuite, because
it will be useless any time *.opt is modified with any of the
OPTION_MASK_ISA* bits) ICEs because we mix in i386-builtins.def
OPTION_MASK_ISA_* constants from the two bitmasks together,
e.g.
OPTION_MASK_ISA_SHSTK | OPTION_MASK_ISA_64BIT
and we have just a single bitmask there, so depending on the BDESC_
chunk it must be either solely ix86_isa_flags or solely ix86_isa_flags2,
otherwise OPTION_MASK_ISA_64BIT:
#define OPTION_MASK_ISA_64BIT (HOST_WIDE_INT_1U << 1)
stands for
#define OPTION_MASK_ISA_AVX5124VNNIW (HOST_WIDE_INT_1U << 1)
instead.
I've grepped around quite a bit and determined the only problematic two
OPTION_MASK_ISA_* options right now are OPTION_MASK_ISA_SHSTK due
to being ored with OPTION_MASK_ISA_64BIT and
OPTION_MASK_ISA_AVX512VBMI2 due to being ored with OPTION_MASK_ISA_AVX512VL.

So, this patch moves those 2 OPTION_MASK_ISA* flags into ix86_isa_flags
and to free some room there, moves 4 other OPTION_MASK_ISA* flags that
are never ored in any way and have very few dependencies into
ix86_isa_flags2.

In addition to this, it fixes a bug where for -mavx512vbmi2
we properly enable -mavx512f, but for -mavx512vbmi2 -mno-avx512f
we keep -mavx512vbmi2 enabled while -mavx512f disabled, so things ICE badly.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

As I said in the PR, one way to perhaps improve incrementally the status
quo could be to encode explicitly whether it is a mask option for
ix86_isa_flags or ix86_isa_flags2 in the OPTION_MASK_ISA* name, e.g. by
using OPTION_MASK_ISA2_ for the latter.  Thoughts on that?

And, while we now have 4 bits left, if we keep adding new ISA bits at the
current rate, while there are still a few ISA flags that could be moved,
perhaps at some point it might be easier to move everything AVX512 related
to ix86_isa_flags and keep the rest in ix86_isa_flags.  For that we'd need
some magic shadow option for OPTION_MASK_ISA_64BIT and keep that
synchronized.

2017-12-20  Jakub Jelinek  

PR target/83488
* config/i386/i386.c (ix86_target_string): Move -mavx512vbmi2 and
-mshstk entries from isa_opts2 to isa_opts and -mhle, -mmovbe,
-mclzero and -mmwaitx entries from isa_opts to isa_opts2.
(ix86_option_override_internal): Adjust for
OPTION_MASK_ISA_{HLE,MOVBE,CLZERO,MWAITX} moving to ix86_isa_flags2
and OPTION_MASK_ISA_SHSTK moving to ix86_isa_flags.
(BDESC_VERIFYS): Remove SPECIAL_ARGS2 related checks.
(ix86_init_mmx_sse_builtins): Remove bdesc_special_args2 handling.
Use def_builtin2 instead of def_builtin for OPTION_MASK_ISA_MWAITX
and OPTION_MASK_ISA_CLZERO builtins.  Use def_builtin instead of
def_builtin2 for CET builtins.
(ix86_expand_builtin): Remove bdesc_special_args2 handling.  Fix
up formatting in IX86_BUILTIN_RDPID code.
* config/i386/i386-builtin.def: Move VBMI2 builtins from SPECIAL_ARGS2
section to SPECIAL_ARGS and from ARGS2 section to ARGS.
* config/i386/i386.opt (mavx512vbmi2, mshstk): Move from
ix86_isa_flags2 to ix86_isa_flags.
(mhle, mmovbe, mclzero, mmwaitx): Move from ix86_isa_flags to
ix86_isa_flags2.
* config/i386/i386-c.c (ix86_target_macros_internal): Check for
OPTION_MASK_ISA_{CLZERO,MWAITX} in isa_flag2 instead of isa_flag.
Check for OPTION_MASK_ISA_{SHSTK,AVX512VBMI2} in isa_flag instead
of isa_flag2.
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VBMI2_SET):
Or in OPTION_MASK_ISA_AVX512F_SET.
(OPTION_MASK_ISA_AVX512F_UNSET): Or in
OPTION_MASK_ISA_AVX512VBMI2_UNSET.
(ix86_handle_option): Adjust for
OPTION_MASK_ISA_{SHSTK,AVX512VBMI2}_*SET being in ix86_isa_flags
and OPTION_MASK_ISA_{MOVBE,MWAITX,CLZERO}_*SET in ix86_isa_flags2.

* gcc.target/i386/pr83488.c: New test.

--- gcc/config/i386/i386.c.jj   2017-12-20 10:10:10.0 +0100
+++ gcc/config/i386/i386.c  2017-12-20 13:15:46.738245880 +0100
@@ -2753,21 +2753,24 @@ ix86_target_string (HOST_WIDE_INT isa, H
   {
 { "-mcx16",OPTION_MASK_ISA_CX16 },
 { "-mmpx", OPTION_MASK_ISA_MPX },
-{ "-mavx512vbmi2", OPTION_MASK_ISA_AVX512VBMI2 },
-{ "-mavx512vnni", OPTION_MASK_ISA_AVX512VNNI },
+{ "-mavx512vnni",  OPTION_MASK_ISA_AVX512VNNI },
 { "-mvaes",OPTION_MASK_ISA_VAES },
 { "-mrdpid",   OPTION_MASK_ISA_RDPID },
 { "-msgx", OPTION_MASK_ISA_SGX },
 { "-mavx5124vnniw", OPTION_MASK_ISA_AVX5124VNNIW },
 { "-mavx5124fmaps", OPTION_MASK_ISA_AVX5124FMAPS },
 { "-mavx512vpopcntdq", OPTION_MASK_ISA_AVX512VPOPCNTDQ },
-{ "-mibt", OPTION_MASK_ISA_IBT },
-{ "-mshstk",   OPTION_MASK_ISA_SHSTK }
+{ "-mibt", OPTION_MASK_ISA_IBT },
+{ "-mhle", OP

Re: [PATCH] Fix ICE with -ftree-parallelize-loops=2 -fno-ipa-pure-const (PR ipa/83506)

2017-12-20 Thread Richard Biener
On December 20, 2017 8:24:04 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>Prathamesh's r254140 moved the
>  if (!flag_wpa)
>ipa_free_fn_summary ();
>from the ipa-inline pass to the following ipa-pure-const pass.
>That doesn't work well though, because ipa-inline pass has
>unconditional
>gate, so is done always when real IPA passes are performed, but
>ipa-pure-const is gated on some flag_* options, so if those options
>are disabled, we keep fn summary computed until end of compilation, and
>e.g. new function insertion hooks then attempt to update the fnsummary
>even when with RTL hooks registered.
>
>Calling ipa_free_fn_summary () in ipa-inline pass if the gate of
>the following pass will not be run looks too ugly to me, and while
>there
>is another unconditional ipa pass, moving it there doesn't look too
>clean to
>me either.
>
>We already have a separate pass ipa-free-fn-summary, just use it so far
>only
>during small IPA passes.  I think the cleanest is to add another ipa
>pass
>that will just free it instead of sticking it into unrelated passes
>which
>just happen to be unconditional ATM.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2017-12-20  Jakub Jelinek  
>
>   PR ipa/83506
>   * ipa-fnsummary.c (pass_data_ipa_free_fn_summary): Use 0 for
>   todo_flags_finish.
>   (pass_ipa_free_fn_summary): Add small_p private data member,
>   initialize to false in the ctor.
>   (pass_ipa_free_fn_summary::clone,
>   pass_ipa_free_fn_summary::set_pass_param,
>   pass_ipa_free_fn_summary::gate): New methods.
>   (pass_ipa_free_fn_summary::execute): Return TODO_remove_functions
>   | TODO_dump_symtab if small_p.
>   * passes.def: Add true parm for the existing pass_ipa_free_fn_summary
>   entry and add another instance of the pass with false parm after
>   ipa-pure-const.
>   * ipa-pure-const.c (pass_ipa_pure_const): Don't call
>   ipa_free_fn_summary here.
>
>   * gcc.dg/pr83506.c: New test.
>   * gcc.dg/ipa/ctor-empty-1.c: Use -fdump-ipa-free-fnsummary1 instead
>   of -fdump-ipa-free-fnsummary and scan in free-fnsummary1 instead of
>   free-fnsummary dump.
>
>--- gcc/ipa-fnsummary.c.jj 2017-12-20 11:01:13.0 +0100
>+++ gcc/ipa-fnsummary.c2017-12-20 11:43:40.462288141 +0100
>@@ -3538,26 +3538,36 @@ const pass_data pass_data_ipa_free_fn_su
>   0, /* properties_provided */
>   0, /* properties_destroyed */
>   0, /* todo_flags_start */
>-  /* Early optimizations may make function unreachable.  We can not
>- remove unreachable functions as part of the ealry opts pass
>because
>- TODOs are run before subpasses.  Do it here.  */
>-  ( TODO_remove_functions | TODO_dump_symtab ), /* todo_flags_finish
>*/
>+  0, /* todo_flags_finish */
> };
> 
> class pass_ipa_free_fn_summary : public simple_ipa_opt_pass
> {
> public:
>   pass_ipa_free_fn_summary (gcc::context *ctxt)
>-: simple_ipa_opt_pass (pass_data_ipa_free_fn_summary, ctxt)
>+: simple_ipa_opt_pass (pass_data_ipa_free_fn_summary, ctxt),
>+  small_p (false)
>   {}
> 
>   /* opt_pass methods: */
>+  opt_pass *clone () { return new pass_ipa_free_fn_summary (m_ctxt); }
>+  void set_pass_param (unsigned int n, bool param)
>+{
>+  gcc_assert (n == 0);
>+  small_p = param;
>+}
>+  virtual bool gate (function *) { return small_p || !flag_wpa; }
>   virtual unsigned int execute (function *)
> {
>   ipa_free_fn_summary ();
>-  return 0;
>+  /* Early optimizations may make function unreachable.  We can
>not
>+   remove unreachable functions as part of the early opts pass because
>+   TODOs are run before subpasses.  Do it here.  */
>+  return small_p ? TODO_remove_functions | TODO_dump_symtab : 0;
> }
> 
>+private:
>+  bool small_p;
> }; // class pass_ipa_free_fn_summary
> 
> } // anon namespace
>--- gcc/passes.def.jj  2017-12-18 14:57:20.0 +0100
>+++ gcc/passes.def 2017-12-20 11:31:32.402117369 +0100
>@@ -144,7 +144,7 @@ along with GCC; see the file COPYING3.
>   PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile)
>   NEXT_PASS (pass_feedback_split_functions);
>   POP_INSERT_PASSES ()
>-  NEXT_PASS (pass_ipa_free_fn_summary);
>+  NEXT_PASS (pass_ipa_free_fn_summary, true /* small_p */);
>   NEXT_PASS (pass_ipa_increase_alignment);
>   NEXT_PASS (pass_ipa_tm);
>   NEXT_PASS (pass_ipa_lower_emutls);
>@@ -161,6 +161,7 @@ along with GCC; see the file COPYING3.
>   NEXT_PASS (pass_ipa_fn_summary);
>   NEXT_PASS (pass_ipa_inline);
>   NEXT_PASS (pass_ipa_pure_const);
>+  NEXT_PASS (pass_ipa_free_fn_summary, false /* small_p */);
>   NEXT_PASS (pass_ipa_reference);
> /* This pass needs to be scheduled after any IP code duplication.   */
>   NEXT_PASS (pass_ipa_single_use);
>--- gcc/ipa-pure-const.c.jj2017-12-04 22:29:11.0 +0100
>+++ gcc/ipa-pure-const.c   2017-12-20 11:33:11.905956408 +0100
>@@ -2013,10 +2013,6 @@ execute (function *)
> if (has

Re: [PATCH] Fix ICE with -ftree-parallelize-loops=2 -fno-ipa-pure-const (PR ipa/83506)

2017-12-20 Thread David Malcolm
On Wed, 2017-12-20 at 20:24 +0100, Jakub Jelinek wrote:
> Hi!
> 
> Prathamesh's r254140 moved the
>   if (!flag_wpa)
> ipa_free_fn_summary ();
> from the ipa-inline pass to the following ipa-pure-const pass.
> That doesn't work well though, because ipa-inline pass has
> unconditional
> gate, so is done always when real IPA passes are performed, but
> ipa-pure-const is gated on some flag_* options, so if those options
> are disabled, we keep fn summary computed until end of compilation,
> and
> e.g. new function insertion hooks then attempt to update the
> fnsummary
> even when with RTL hooks registered.
> 
> Calling ipa_free_fn_summary () in ipa-inline pass if the gate of
> the following pass will not be run looks too ugly to me, and while
> there
> is another unconditional ipa pass, moving it there doesn't look too
> clean to
> me either.
> 
> We already have a separate pass ipa-free-fn-summary, just use it so
> far only
> during small IPA passes.  I think the cleanest is to add another ipa
> pass
> that will just free it instead of sticking it into unrelated passes
> which
> just happen to be unconditional ATM.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Out of interest, did you test this with "jit" enabled?  I mention it
because we ran into issues with ipa fn summary cleanup with jit before,
which required r254458 to fix (PR jit/82826).

Thanks
Dave

> 2017-12-20  Jakub Jelinek  
> 
>   PR ipa/83506
>   * ipa-fnsummary.c (pass_data_ipa_free_fn_summary): Use 0 for
>   todo_flags_finish.
>   (pass_ipa_free_fn_summary): Add small_p private data member,
>   initialize to false in the ctor.
>   (pass_ipa_free_fn_summary::clone,
>   pass_ipa_free_fn_summary::set_pass_param,
>   pass_ipa_free_fn_summary::gate): New methods.
>   (pass_ipa_free_fn_summary::execute): Return
> TODO_remove_functions
>   | TODO_dump_symtab if small_p.
>   * passes.def: Add true parm for the existing
> pass_ipa_free_fn_summary
>   entry and add another instance of the pass with false parm
> after
>   ipa-pure-const.
>   * ipa-pure-const.c (pass_ipa_pure_const): Don't call
>   ipa_free_fn_summary here.
> 
>   * gcc.dg/pr83506.c: New test.
>   * gcc.dg/ipa/ctor-empty-1.c: Use -fdump-ipa-free-fnsummary1
> instead
>   of -fdump-ipa-free-fnsummary and scan in free-fnsummary1
> instead of
>   free-fnsummary dump.
> 
> --- gcc/ipa-fnsummary.c.jj2017-12-20 11:01:13.0 +0100
> +++ gcc/ipa-fnsummary.c   2017-12-20 11:43:40.462288141 +0100
> @@ -3538,26 +3538,36 @@ const pass_data pass_data_ipa_free_fn_su
>0, /* properties_provided */
>0, /* properties_destroyed */
>0, /* todo_flags_start */
> -  /* Early optimizations may make function unreachable.  We can not
> - remove unreachable functions as part of the ealry opts pass
> because
> - TODOs are run before subpasses.  Do it here.  */
> -  ( TODO_remove_functions | TODO_dump_symtab ), /* todo_flags_finish
> */
> +  0, /* todo_flags_finish */
>  };
>  
>  class pass_ipa_free_fn_summary : public simple_ipa_opt_pass
>  {
>  public:
>pass_ipa_free_fn_summary (gcc::context *ctxt)
> -: simple_ipa_opt_pass (pass_data_ipa_free_fn_summary, ctxt)
> +: simple_ipa_opt_pass (pass_data_ipa_free_fn_summary, ctxt),
> +  small_p (false)
>{}
>  
>/* opt_pass methods: */
> +  opt_pass *clone () { return new pass_ipa_free_fn_summary (m_ctxt);
> }
> +  void set_pass_param (unsigned int n, bool param)
> +{
> +  gcc_assert (n == 0);
> +  small_p = param;
> +}
> +  virtual bool gate (function *) { return small_p || !flag_wpa; }
>virtual unsigned int execute (function *)
>  {
>ipa_free_fn_summary ();
> -  return 0;
> +  /* Early optimizations may make function unreachable.  We can
> not
> +  remove unreachable functions as part of the early opts pass
> because
> +  TODOs are run before subpasses.  Do it here.  */
> +  return small_p ? TODO_remove_functions | TODO_dump_symtab : 0;
>  }
>  
> +private:
> +  bool small_p;
>  }; // class pass_ipa_free_fn_summary
>  
>  } // anon namespace
> --- gcc/passes.def.jj 2017-12-18 14:57:20.0 +0100
> +++ gcc/passes.def2017-12-20 11:31:32.402117369 +0100
> @@ -144,7 +144,7 @@ along with GCC; see the file COPYING3.
>PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile)
>NEXT_PASS (pass_feedback_split_functions);
>POP_INSERT_PASSES ()
> -  NEXT_PASS (pass_ipa_free_fn_summary);
> +  NEXT_PASS (pass_ipa_free_fn_summary, true /* small_p */);
>NEXT_PASS (pass_ipa_increase_alignment);
>NEXT_PASS (pass_ipa_tm);
>NEXT_PASS (pass_ipa_lower_emutls);
> @@ -161,6 +161,7 @@ along with GCC; see the file COPYING3.
>NEXT_PASS (pass_ipa_fn_summary);
>NEXT_PASS (pass_ipa_inline);
>NEXT_PASS (pass_ipa_pure_const);
> +  NEXT_PASS (pass_ipa_free_fn_summary, false /* small_p */);
>NEXT_PASS (pass_ipa_reference);
>/* This pass n

Re: [PATCH] Fix ICE with -ftree-parallelize-loops=2 -fno-ipa-pure-const (PR ipa/83506)

2017-12-20 Thread Jakub Jelinek
On Wed, Dec 20, 2017 at 02:38:13PM -0500, David Malcolm wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Out of interest, did you test this with "jit" enabled?  I mention it
> because we ran into issues with ipa fn summary cleanup with jit before,
> which required r254458 to fix (PR jit/82826).

No, I'm testing jit only when building distro rpms, not in normal bootstraps
(because it requires --enable-host-shared).

Jakub


[PATCH] Add selftest for "fold_for_warn (error_mark_node)"

2017-12-20 Thread David Malcolm
Maybe this is overkill, but this patch adds a selftest that:

   fold_for_warn (error_mark_node)

does the right thing.

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu.

Manually tested with "make s-selftest-c++" (since we don't run
the selftests for cc1plus by default).

OK for trunk?

gcc/c-family/ChangeLog:
* c-common.c: Include "selftest.h".
(selftest::test_fold_for_warn): New function.
(selftest::c_common_c_tests): New function.
(selftest::c_family_tests): Call it.
---
 gcc/c-family/c-common.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 7cc749b..80f7495 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "substring-locations.h"
 #include "spellcheck.h"
+#include "selftest.h"
 
 cpp_reader *parse_in;  /* Declared in c-pragma.h.  */
 
@@ -8173,11 +8174,28 @@ maybe_suggest_missing_token_insertion (rich_location 
*richloc,
 
 namespace selftest {
 
+/* Verify that fold_for_warn on error_mark_node is safe.  */
+
+static void
+test_fold_for_warn ()
+{
+  ASSERT_EQ (error_mark_node, fold_for_warn (error_mark_node));
+}
+
+/* Run all of the selftests within this file.  */
+
+static void
+c_common_c_tests ()
+{
+  test_fold_for_warn ();
+}
+
 /* Run all of the tests within c-family.  */
 
 void
 c_family_tests (void)
 {
+  c_common_c_tests ();
   c_format_c_tests ();
   c_pretty_print_c_tests ();
   c_spellcheck_cc_tests ();
-- 
1.8.5.3



Re: [PATCH] Add NON_RECURSIVE attribute for procedures

2017-12-20 Thread Thomas Koenig

Hi Janne,

I think you need a few more test cases, like (compile with -fcheck=all
and -std=f2018):

module foo
contains
  subroutine f(n)
call g(n-1)
  end subroutine f
  subroutine g(n)
if (n<0) return
call f(n-1)
  end subroutine g
end module foo
program main
  use foo
  call f(10)
end program main

Also, there should be a run-time check that the above fails
with -std=f2018, -fcheck=all and the NON_RECURSIVE attribute.

However, I am not sure we should make f2018 the default standard as yet,
especially since some things have not yet been ironed out, but I am
open to discussion on that.

Any more thoughts on this one?

Regards

Thomas


[v2 of PATCH 08/14] cp/tree.c: strip location wrappers in lvalue_kind

2017-12-20 Thread David Malcolm
On Mon, 2017-12-11 at 18:39 -0500, Jason Merrill wrote:
> On 11/10/2017 04:45 PM, David Malcolm wrote:
> > Without this, then lvalue_p returns false for decls, and hence
> > e.g. uses of them for references fail.
> > 
> > Stripping location wrappers in lvalue_kind restores the correct
> > behavior of lvalue_p etc.
> > 
> > gcc/cp/ChangeLog:
> > * tree.c (lvalue_kind): Strip any location wrapper.
> 
> Rather, lvalue_kind should learn to handle VIEW_CONVERT_EXPR.
> 
> Jason

Thanks.

This patch does so, using:

case NON_LVALUE_EXPR:
case VIEW_CONVERT_EXPR:
  if (location_wrapper_p (ref))
return lvalue_kind (TREE_OPERAND (ref, 0));

As well as the VIEW_CONVERT_EXPR, lvalue_kind needs to handle
NON_LVALUE_EXPR, otherwise a location-wrapped string literal
hits this clause in the "default" case:

  if (CLASS_TYPE_P (TREE_TYPE (ref))
  || TREE_CODE (TREE_TYPE (ref)) == ARRAY_TYPE)
return clk_class;

when it should have hit this one (after removing the
location wrapper):

case STRING_CST:
case COMPOUND_LITERAL_EXPR:
  return clk_ordinary;

The patch adds selftest coverage for this case (and others).

Seen in e.g. libstdc++.sum's 21_strings/basic_string/cons/char/8.cc, which
would otherwise fail with this bogus error:

8.cc: In instantiation of 'std::size_t construct(Args&& ...)
  [with Args = {const char [9], int}; std::size_t = long unsigned int]':
8.cc:63:   required from here
8.cc:38: error: invalid static_cast from type 'const char [9]' to type 'const 
char [9]'
   TestBaseObjCtor as_base_obj( static_cast(args)... );
^~~

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu, as
part of the kit.
Also, manually tested with "make s-selftest-c++" (since we don't run
the selftests for cc1plus by default).

OK for trunk once the rest of the kit is approved?

gcc/cp/ChangeLog:
* cp-lang.c (selftest::run_cp_tests): Call
selftest::cp_tree_c_tests.
* cp-tree.h (selftest::cp_tree_c_tests): New decl.
* tree.c: Include "selftest.h".
(lvalue_kind): Handle location wrapper nodes.
(selftest::test_lvalue_kind): New function.
(selftest::cp_tree_c_tests): New function.
---
 gcc/cp/cp-lang.c |  1 +
 gcc/cp/cp-tree.h |  8 +++
 gcc/cp/tree.c| 67 
 3 files changed, 76 insertions(+)

diff --git a/gcc/cp/cp-lang.c b/gcc/cp/cp-lang.c
index 805319a..9bb4ce7 100644
--- a/gcc/cp/cp-lang.c
+++ b/gcc/cp/cp-lang.c
@@ -247,6 +247,7 @@ run_cp_tests (void)
   c_family_tests ();
 
   /* Additional C++-specific tests.  */
+  cp_tree_c_tests ();
 }
 
 } // namespace selftest
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 9879e16..6d4a2b8a 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7451,6 +7451,14 @@ named_decl_hash::equal (const value_type existing, 
compare_type candidate)
   return candidate == name;
 }
 
+#if CHECKING_P
+namespace selftest {
+  /* Declarations for specific families of tests within cp,
+ by source file, in alphabetical order.  */
+  extern void cp_tree_c_tests (void);
+} // namespace selftest
+#endif /* #if CHECKING_P */
+
 /* -- end of C++ */
 
 #endif /* ! GCC_CP_TREE_H */
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 0ae2eff..ad8884c 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "attribs.h"
 #include "flags.h"
+#include "selftest.h"
 
 static tree bot_manip (tree *, int *, void *);
 static tree bot_replace (tree *, int *, void *);
@@ -240,6 +241,12 @@ lvalue_kind (const_tree ref)
 case NON_DEPENDENT_EXPR:
   return lvalue_kind (TREE_OPERAND (ref, 0));
 
+case NON_LVALUE_EXPR:
+case VIEW_CONVERT_EXPR:
+  if (location_wrapper_p (ref))
+   return lvalue_kind (TREE_OPERAND (ref, 0));
+  /* Fallthrough.  */
+
 default:
   if (!TREE_TYPE (ref))
return clk_none;
@@ -5339,4 +5346,64 @@ lang_check_failed (const char* file, int line, const 
char* function)
 }
 #endif /* ENABLE_TREE_CHECKING */
 
+#if CHECKING_P
+
+namespace selftest {
+
+/* Verify that lvalue_kind () works, for various expressions,
+   and that location wrappers don't affect the results.  */
+
+static void
+test_lvalue_kind ()
+{
+  location_t loc = BUILTINS_LOCATION;
+
+  /* Verify constants and parameters, without and with
+ location wrappers.  */
+  tree int_cst = build_int_cst (integer_type_node, 42);
+  ASSERT_EQ (clk_none, lvalue_kind (integer_zero_node));
+
+  tree wrapped_int_cst = maybe_wrap_with_location (int_cst, loc);
+  ASSERT_TRUE (location_wrapper_p (wrapped_int_cst));
+  ASSERT_EQ (clk_none, lvalue_kind (integer_zero_node));
+
+  tree string_lit = build_string (3, "foo");
+  TREE_TYPE (string_lit) = char_array_type_node;
+  string_lit = fix_string_type (string_lit);
+  ASSERT_EQ (clk_ordinary, lvalue_kind (string_lit));
+
+  tree wrapped_string_li

[PATCH] Fix vector handling in simplify-rtx.c (PR rtl-optimization/82973)

2017-12-20 Thread Jakub Jelinek
Hi!

In rtl.texi we say:
@findex const_vector
@item (const_vector:@var{m} [@var{x0} @var{x1} @dots{}])
Represents a vector constant.  The square brackets stand for the vector
containing the constant elements.  @var{x0}, @var{x1} and so on are
the @code{const_int}, @code{const_wide_int}, @code{const_double} or
@code{const_fixed} elements.
and it is a very reasonable requirement. 
simplify_const_{unary,binary}_operation can violate that though, because
the recursion can return also other expressions, like in this case
a (mult (const_double) (const_double)) that can't be simplified because
of -frounding-math.  We need to punt on those.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-12-20  Jakub Jelinek  

PR rtl-optimization/82973
* simplify-rtx.c (simplify_const_unary_operation): Don't optimize into
CONST_VECTOR if some element isn't simplified into CONST_{,WIDE_}INT,
CONST_DOUBLE or CONST_FIXED.
(simplify_const_binary_operation): Likewise.  Use CONST_FIXED_P macro
instead of GET_CODE == CONST_FIXED.
(simplify_subreg): Use CONST_FIXED_P macro instead of
GET_CODE == CONST_FIXED.

* gfortran.dg/pr82973.f90: New test.

--- gcc/simplify-rtx.c.jj   2017-12-19 18:09:05.0 +0100
+++ gcc/simplify-rtx.c  2017-12-20 18:29:55.190089315 +0100
@@ -1776,7 +1776,12 @@ simplify_const_unary_operation (enum rtx
  rtx x = simplify_unary_operation (code, GET_MODE_INNER (mode),
CONST_VECTOR_ELT (op, i),
GET_MODE_INNER (opmode));
- if (!x)
+ /* Only CONST_INT, CONST_WIDE_INT, CONST_DOUBLE or CONST_FIXED
+is valid inside of CONST_VECTOR.  */
+ if (!x
+ || !(CONST_SCALAR_INT_P (x)
+  || CONST_DOUBLE_AS_FLOAT_P (x)
+  || CONST_FIXED_P (x)))
return 0;
  RTVEC_ELT (v, i) = x;
}
@@ -4006,7 +4011,12 @@ simplify_const_binary_operation (enum rt
  rtx x = simplify_binary_operation (code, GET_MODE_INNER (mode),
 CONST_VECTOR_ELT (op0, i),
 CONST_VECTOR_ELT (op1, i));
- if (!x)
+ /* Only CONST_INT, CONST_WIDE_INT, CONST_DOUBLE or CONST_FIXED
+is valid inside of CONST_VECTOR.  */
+ if (!x
+ || !(CONST_SCALAR_INT_P (x)
+  || CONST_DOUBLE_AS_FLOAT_P (x)
+  || CONST_FIXED_P (x)))
return 0;
  RTVEC_ELT (v, i) = x;
}
@@ -4017,11 +4027,11 @@ simplify_const_binary_operation (enum rt
   if (VECTOR_MODE_P (mode)
   && code == VEC_CONCAT
   && (CONST_SCALAR_INT_P (op0)
- || GET_CODE (op0) == CONST_FIXED
+ || CONST_FIXED_P (op0)
  || CONST_DOUBLE_AS_FLOAT_P (op0))
   && (CONST_SCALAR_INT_P (op1)
  || CONST_DOUBLE_AS_FLOAT_P (op1)
- || GET_CODE (op1) == CONST_FIXED))
+ || CONST_FIXED_P (op1)))
 {
   unsigned n_elts = GET_MODE_NUNITS (mode);
   rtvec v = rtvec_alloc (n_elts);
@@ -6193,7 +6203,7 @@ simplify_subreg (machine_mode outermode,
 
   if (CONST_SCALAR_INT_P (op)
   || CONST_DOUBLE_AS_FLOAT_P (op)
-  || GET_CODE (op) == CONST_FIXED
+  || CONST_FIXED_P (op)
   || GET_CODE (op) == CONST_VECTOR)
 {
   /* simplify_immed_subreg deconstructs OP into bytes and constructs
--- gcc/testsuite/gfortran.dg/pr82973.f90.jj2017-12-20 18:53:14.392540473 
+0100
+++ gcc/testsuite/gfortran.dg/pr82973.f90   2017-12-20 18:52:47.0 
+0100
@@ -0,0 +1,31 @@
+! PR rtl-optimization/82973
+! { dg-do compile }
+! { dg-options "-Ofast -frounding-math" }
+
+program pr82973
+  integer, parameter :: n=16
+  real, dimension(n) :: ar, br, modulo_result, floor_result
+  integer, dimension(n) :: ai, bi , imodulo_result, ifloor_result
+  ai(1:4) = 5
+  ai(5:8) = -5
+  ai(9:12) = 1
+  ai(13:16) = -1
+  bi(1:4) = (/ 3,-3, 1, -1/)
+  bi(5:8) = bi(1:4)
+  bi(9:12) = bi(1:4)
+  bi(13:16) = bi(1:4)
+  ar = ai
+  br = bi
+  modulo_result = modulo(ar,br)
+  imodulo_result = modulo(ai,bi)
+  floor_result = ar-floor(ar/br)*br
+  ifloor_result = nint(real(ai-floor(real(ai)/real(bi))*bi))
+  do i=1,n
+if (modulo_result(i) /= floor_result(i)) then
+  call abort()
+end if
+if (imodulo_result(i) /= ifloor_result(i)) then
+  call abort ()
+end if
+  end do
+end program pr82973

Jakub


[openacc, committed] Simplify fold_internal_goacc_dim

2017-12-20 Thread Tom de Vries

Hi,

this patch simplifies fold_internal_goacc_dim.

Bootstrapped and reg-tested on x86_64.

Committed as trivial.

Thanks,
- Tom
Simplify fold_internal_goacc_dim

2017-12-20  Tom de Vries  

	* gimple-fold.c (fold_internal_goacc_dim): Simplify.

---
 gcc/gimple-fold.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 87ce3d8..fafd833 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -3728,15 +3728,23 @@ fold_internal_goacc_dim (const gimple *call)
 {
   int axis = oacc_get_ifn_dim_arg (call);
   int size = oacc_get_fn_dim_size (current_function_decl, axis);
-  bool is_pos = gimple_call_internal_fn (call) == IFN_GOACC_DIM_POS;
   tree result = NULL_TREE;
+  tree type = TREE_TYPE (gimple_call_lhs (call));
 
-  /* If the size is 1, or we only want the size and it is not dynamic,
- we know the answer.  */
-  if (size == 1 || (!is_pos && size))
+  switch (gimple_call_internal_fn (call))
 {
-  tree type = TREE_TYPE (gimple_call_lhs (call));
-  result = build_int_cst (type, size - is_pos);
+case IFN_GOACC_DIM_POS:
+  /* If the size is 1, we know the answer.  */
+  if (size == 1)
+	result = build_int_cst (type, 0);
+  break;
+case IFN_GOACC_DIM_SIZE:
+  /* If the size is not dynamic, we know the answer.  */
+  if (size)
+	result = build_int_cst (type, size);
+  break;
+default:
+  break;
 }
 
   return result;


Re: [PATCH,PTX] Add support for CUDA 9

2017-12-20 Thread Cesar Philippidis
On 12/19/2017 04:39 PM, Tom de Vries wrote:
> On 12/20/2017 12:25 AM, Cesar Philippidis wrote:
>> In CUDA 9, Nvidia removed support for treating the labels of functions
>> as generic address spaces as part of their PTX 6.0 changes. More
>> specifically,
>> :
>>
>>
>>    Support for taking address of labels, using labels in initializers
>>    which was unimplemented has been removed from the spec.
>>
>> Despite targeting PTX 3.0, the ptxas assembler shipped with CUDA 9 no
>> longer support that legacy functionality. Consequently, this prevented
>> newlib from building. This patch fixes that problem by not using a
>> generic address space when initializing variables using a label address.
>>
> 
> What is the effect for pre-9.0 cudas?

No change in the libgomp execution test suite as an accelerator.

>> Is this OK for trunk?
>>
> 
> How did you test this?

Just libgomp as an accelerator for now. I'm trying to get standalone
nvptx working right now. Which testsuites do you usually run? I only see
the test results for check-gcc-c in your nightly build bot.

By the way, do you know what caused the recent nvptx breakage in trunk?

Cesar



Re: [PATCH,PTX] Add support for CUDA 9

2017-12-20 Thread Tom de Vries

On 12/20/2017 11:59 PM, Cesar Philippidis wrote:

On 12/19/2017 04:39 PM, Tom de Vries wrote:

On 12/20/2017 12:25 AM, Cesar Philippidis wrote:

In CUDA 9, Nvidia removed support for treating the labels of functions
as generic address spaces as part of their PTX 6.0 changes. More
specifically,
:


    Support for taking address of labels, using labels in initializers
    which was unimplemented has been removed from the spec.

Despite targeting PTX 3.0, the ptxas assembler shipped with CUDA 9 no
longer support that legacy functionality. Consequently, this prevented
newlib from building. This patch fixes that problem by not using a
generic address space when initializing variables using a label address.



What is the effect for pre-9.0 cudas?


No change in the libgomp execution test suite as an accelerator.


Is this OK for trunk?



How did you test this?


Just libgomp as an accelerator for now. I'm trying to get standalone
nvptx working right now.


Indeed, you should test that as well, both with 9.0 and pre 9.0 cuda.


Which testsuites do you usually run? I only see
the test results for check-gcc-c in your nightly build bot.



The gcc test suite should be enough.

Thanks,
- Tom


Re: [PATCH] Fix PR83491

2017-12-20 Thread Jeff Law
On 12/20/2017 06:59 AM, Jakub Jelinek wrote:
> On Wed, Dec 20, 2017 at 01:56:27PM +, Wilco Dijkstra wrote:
>> This patch fixes PR83491 - if an SSA expression contains 2 identical float
>> constants, the division reciprocal optimization will ICE.  Fix this by 
>> explicitly
>> checking for SSA_NAME in the tree code before walking the uses.  Also fix 
>> several
>> coding style issues pointed out by Jakub and make comments more readable.
>>
>> Bootstrap OK, OK for trunk?
>>
>> ChangeLog:
>> 2017-12-20  Wilco Dijkstra  
>>
>> gcc/
>>  PR tree-optimization/83491
>>  * tree-ssa-math-opts.c (execute_cse_reciprocals_1): Check for SSA_NAME
>>  before walking uses.  Improve coding style and comments.
>>
>> gcc/testsuite/
>>  PR tree-optimization/83491
>>  * gcc.dg/pr83491.c: Add new test.
> 
> Ok, thanks.
And I went ahead and committted.

jeff


Re: [PATCH] Fix -fcompare-debug due to DEBUG_BEGIN_STMTs (PR debug/83419)

2017-12-20 Thread Alexandre Oliva
On Dec 15, 2017, Richard Biener  wrote:

> On Thu, 14 Dec 2017, Jakub Jelinek wrote:
>> Hi!
>> 
>> The following testcase FAILs -fcompare-debug, because one COND_EXPR
>> branch from the FE during gimplifications is just >
>> which doesn't have TREE_SIDE_EFFECTS, but for -gstatement-frontiers it
>> is a STATEMENT_LIST which contains # DEBUG BEGIN_STMT and that > >.

> Ugh...  the issue is that this difference might have many other
> -fcompare-debug issues, like when folding things?

I fixed a number of -fcompare-debug issues related with
TREE_SIDE_EFFECTS in STATEMENT_LISTs during the development of SFN.
It's not too hard, really: most surviving statement lists end up with
TREE_SIDE_EFFECTS set.

This case, that I had not caught, was one in which the non-debug case
actually discards the STATEMENT_LIST (that had TREE_SIDE_EFFECTS set),
and just uses a naked expr (that does NOT have TREE_SIDE_EFFECTS).  It
wasn't too hard to detect and fix this case, but of course there might
be others still lurking: that's why we have -fcompare-debug, and every
now and then some new case pops up.  Some are new regressions, but
others were just latent issues that hadn't been noticed.

> Why is it a STATEMENT_LIST rather than a COMPOUND_EXPR?

Since we introduce the begin stmt marker in the existing statement list
long before we even start parsing the corresponding statement, using a
stand-alone statement made sense to me.  I did not even consider using
COMPOUND_EXPRs.

I suspect, however, that this could cause more complications, for it
would hide any specific forms that might be searched for within the
COMPOUND_EXPRs.  Do you not think so?  I guess it wouldn't be too hard
to adjust the same spot I'm touching in this patch so as to turn begin
stmt markers + stmt into COMPOUND_EXPRs.

> like adding a TREE_SIDE_EFFECTS_VALID flag on STATEMENT_LIST,
> keeping TREE_SIDE_EFFECTS up-to-date when easily possible and when doing
> the expensive thing cache the result.

I experimented a bit with that yesterday.  Surprisingly, just deferring
the computation of TREE_SIDE_EFFECTS for STATEMENT_LISTs caused trouble:
pop_stmt_list expects STATEMENT_LISTs to have TREE_SIDE_EFFECTs set
unless they're empty lists.  Suspecting there might be other such issues
lurking, I decided to go back to the strategy I used for earlier
occurrences of such bugs, namely to get the behavior to match as closely
as possible what we'd get if debug stmts weren't there.  It didn't take
me long to get a fix this way.

> Is it really necessary to introduce this IL difference so early and in
> such an intrusive way?

The most reasonable way to introduce markers at statement boundaries is
to have the parser do so.  I don't see why this should be such a big
deal, though; every time I introduce such IL debug-only changes, it took
me significant effort to approach the state in which nearly all of the
code behaves in the same way regardless of the debug-only stuff.  That
things work reasonably flawlessly now is not suggestive that it was
easy, or not too intrusive, but rather that the work to make it seamless
despite the intrusiveness was done, at first, and then over time.
That's the reason for -fcompare-debug and the various bootstrap options
that stress it further.

> Can't we avoid adding # DEBUG BEGIN_STMT when there's not already
> a STATEMENT_LIST around for example?

We could, but that would drop most of the begin_stmt markers, or none.
A STATEMENT_LIST is always there, ready to get stmts, and after parsing
a statement we often extract it from the statement list containing it,
and throw the list away.

IMHO the way these markers are being introduced is just fine, it will
just take us (me?) a bit more work to shake out the few remaining bugs,
just like it did when VTA was introduced.  A lot of work was done before
the patch was submitted to this end, but a number of problems only
showed up later, on other platforms or under different combinations of
optimization flags.  There's no reason to expect it to be different this
time.



The patch below fixes the PR83419 bug.  I've bootstrapped it on x86_64-,
i686-, ppc64-, ppc64le-, and aarch64-linux-gnu, with -fcompare-debug in
all of stage3.  sparc64-linux-gnu ran into -fcompare-debug failures
early in stage3, the same I ran into before, and that I'll investigate
next.

Ok to install?


[SFN] propagate single-nondebug-stmt's side effects to enclosing list

Statements without side effects, preceded by debug begin stmt markers,
would become a statement list with side effects, although the stmt on
its own would be extracted from the list and remain not having side
effects.  This causes debug info and possibly codegen differences.
This patch fixes it, identifying the situation in which the stmt would
have been extracted from the stmt list, and propagating the side
effects flag from the stmt to the list.

for  gcc/ChangeLog

PR debug/83419
* c-family/c-semantics.c (pop_stmt_list): Propagate side
effect

[PATCH] avoid using %lli et al.

2017-12-20 Thread Martin Sebor

The attached patch replaces use of %lli and %llu with
HOST_WIDE_INT_PRINT_DEC and HOST_WIDE_INT_PRINT_UNSIGNED
for portability, and also replaces wide_int::to_shwi() with
offset_int::from().

Although I couldn't come up with a test where the latter change
above would matter, by using obscenely large offset values I did
manage to trigger an assertion in the -Wrestrict pass in place
to try to verify the sanity of the offsets/sizes after they have
been converted from offset_int to HOST_WIDE_INT for printing
(and potentially mangled in the process).  I got rid of
the assertion since it doesn't serve a useful purpose.

Martin
gcc/ChangeLog:

	* gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): Use
	offset_int::from instead of wide_int::to_shwi.
	(maybe_diag_overlap): Remove assertion.
	Use HOST_WIDE_INT_PRINT_DEC instead of %lli.
	* gimple-ssa-sprintf.c (format_directive): Same.
	(parse_directive): Same.
	(sprintf_dom_walker::compute_format_length): Same.
	(try_substitute_return_value): Same.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wrestrict-3.c: New test.

Index: /ssd/src/gcc/svn/gcc/testsuite/gcc.dg/Wrestrict-3.c
===
--- /ssd/src/gcc/svn/gcc/testsuite/gcc.dg/Wrestrict-3.c	(nonexistent)
+++ /ssd/src/gcc/svn/gcc/testsuite/gcc.dg/Wrestrict-3.c	(working copy)
@@ -0,0 +1,17 @@
+/* Test to verify that the call below with the out-of-bounds offset
+   doesn't trigger an internal assertion and is diagnosed.
+   { dg-do compile }
+   { dg-options "-O2 -Wrestrict" } */
+
+#define DIFF_MAX   __PTRDIFF_MAX__
+
+void test_no_ice (int *d, __PTRDIFF_TYPE__ i, __SIZE_TYPE__ n)
+{
+  if (i < DIFF_MAX / sizeof *d - 1 || DIFF_MAX / sizeof *d + 2 < i)
+i = DIFF_MAX / sizeof *d - 1;
+
+  if (n < DIFF_MAX)
+n = DIFF_MAX / sizeof *d;
+
+  __builtin_strncpy ((char*)(d + i), (char*)d, n);   /* { dg-warning "\\\[-Wrestrict]" } */
+}
Index: gcc/gimple-ssa-warn-restrict.c
===
--- gcc/gimple-ssa-warn-restrict.c	(revision 255901)
+++ gcc/gimple-ssa-warn-restrict.c	(working copy)
@@ -276,13 +276,13 @@ builtin_memref::builtin_memref (tree expr, tree si
 		  value_range_type rng = get_range_info (offset, &min, &max);
 		  if (rng == VR_RANGE)
 		{
-		  offrange[0] = min.to_shwi ();
-		  offrange[1] = max.to_shwi ();
+		  offrange[0] = offset_int::from (min, SIGNED);
+		  offrange[1] = offset_int::from (max, SIGNED);
 		}
 		  else if (rng == VR_ANTI_RANGE)
 		{
-		  offrange[0] = (max + 1).to_shwi ();
-		  offrange[1] = (min - 1).to_shwi ();
+		  offrange[0] = offset_int::from (max + 1, SIGNED);
+		  offrange[1] = offset_int::from (min - 1, SIGNED);
 		}
 		  else
 		{
@@ -1228,24 +1228,30 @@ maybe_diag_overlap (location_t loc, gcall *call, b
 
   if (dstref.offrange[0] == dstref.offrange[1]
   || dstref.offrange[1] > HOST_WIDE_INT_MAX)
-sprintf (offstr[0], "%lli", (long long) dstref.offrange[0].to_shwi ());
+sprintf (offstr[0], HOST_WIDE_INT_PRINT_DEC,
+	 (long long) dstref.offrange[0].to_shwi ());
   else
-sprintf (offstr[0], "[%lli, %lli]",
+sprintf (offstr[0],
+	 "[" HOST_WIDE_INT_PRINT_DEC ", " HOST_WIDE_INT_PRINT_DEC "]",
 	 (long long) dstref.offrange[0].to_shwi (),
 	 (long long) dstref.offrange[1].to_shwi ());
 
   if (srcref.offrange[0] == srcref.offrange[1]
   || srcref.offrange[1] > HOST_WIDE_INT_MAX)
-sprintf (offstr[1], "%lli", (long long) srcref.offrange[0].to_shwi ());
+sprintf (offstr[1],
+	 HOST_WIDE_INT_PRINT_DEC,
+	 (long long) srcref.offrange[0].to_shwi ());
   else
-sprintf (offstr[1], "[%lli, %lli]",
+sprintf (offstr[1],
+	 "[" HOST_WIDE_INT_PRINT_DEC ", " HOST_WIDE_INT_PRINT_DEC "]",
 	 (long long) srcref.offrange[0].to_shwi (),
 	 (long long) srcref.offrange[1].to_shwi ());
 
   if (ovloff[0] == ovloff[1] || !ovloff[1])
-sprintf (offstr[2], "%lli", (long long) ovloff[0]);
+sprintf (offstr[2], HOST_WIDE_INT_PRINT_DEC, (long long) ovloff[0]);
   else
-sprintf (offstr[2], "[%lli, %lli]",
+sprintf (offstr[2],
+	 "[" HOST_WIDE_INT_PRINT_DEC ", " HOST_WIDE_INT_PRINT_DEC "]",
 	 (long long) ovloff[0], (long long) ovloff[1]);
 
   const offset_int maxobjsize = tree_to_shwi (max_object_size ());
@@ -1361,9 +1367,6 @@ maybe_diag_overlap (location_t loc, gcall *call, b
 }
 
   /* Issue "may overlap" diagnostics below.  */
-  gcc_assert (ovlsiz[0] == 0
-	  && ovlsiz[1] > 0
-	  && ovlsiz[1] <= maxobjsize.to_shwi ());
 
   /* Use more concise wording when one of the offsets is unbounded
  to avoid confusing the user with large and mostly meaningless
Index: gcc/gimple-ssa-sprintf.c
===
--- gcc/gimple-ssa-sprintf.c	(revision 255901)
+++ gcc/gimple-ssa-sprintf.c	(working copy)
@@ -2993,8 +2993,12 @@ format_directive (const sprintf_dom_walker::call_i
 
   if (dump_f

Re: [PATCH] avoid using %lli et al.

2017-12-20 Thread Richard Sandiford
Martin Sebor  writes:
> The attached patch replaces use of %lli and %llu with
> HOST_WIDE_INT_PRINT_DEC and HOST_WIDE_INT_PRINT_UNSIGNED
> for portability, and also replaces wide_int::to_shwi() with
> offset_int::from().

Thanks for doing this.  I think you need to remove the (long long)
casts too though, so that the format stays in sync with the types.

Richard


Re: [PATCH] Fix vector handling in simplify-rtx.c (PR rtl-optimization/82973)

2017-12-20 Thread Richard Sandiford
Jakub Jelinek  writes:
> Hi!
>
> In rtl.texi we say:
> @findex const_vector
> @item (const_vector:@var{m} [@var{x0} @var{x1} @dots{}])
> Represents a vector constant.  The square brackets stand for the vector
> containing the constant elements.  @var{x0}, @var{x1} and so on are
> the @code{const_int}, @code{const_wide_int}, @code{const_double} or
> @code{const_fixed} elements.
> and it is a very reasonable requirement. 
> simplify_const_{unary,binary}_operation can violate that though, because
> the recursion can return also other expressions, like in this case
> a (mult (const_double) (const_double)) that can't be simplified because
> of -frounding-math.  We need to punt on those.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

How about renaming valid_for_const_vec_duplicate_p to something more
generic and using it here too?  The requirements should be the same.

Thanks,
Richard


Re: [PATCH] avoid using %lli et al.

2017-12-20 Thread Jakub Jelinek
On Wed, Dec 20, 2017 at 05:03:23PM -0700, Martin Sebor wrote:
> @@ -1228,24 +1228,30 @@ maybe_diag_overlap (location_t loc, gcall *call, b
>  
>if (dstref.offrange[0] == dstref.offrange[1]
>|| dstref.offrange[1] > HOST_WIDE_INT_MAX)
> -sprintf (offstr[0], "%lli", (long long) dstref.offrange[0].to_shwi ());
> +sprintf (offstr[0], HOST_WIDE_INT_PRINT_DEC,
> +  (long long) dstref.offrange[0].to_shwi ());

to_shwi () returns a HOST_WIDE_INT, there is no point in casting that to
(long long), and in case long long is different from HOST_WIDE_INT it might
result in warnings or even UB.  Just drop those casts everywhere where
the operand already is SHWI or UHWI.

Jakub


Re: [PATCH] Fix vector handling in simplify-rtx.c (PR rtl-optimization/82973)

2017-12-20 Thread Jakub Jelinek
On Thu, Dec 21, 2017 at 12:12:14AM +, Richard Sandiford wrote:
> Jakub Jelinek  writes:
> > Hi!
> >
> > In rtl.texi we say:
> > @findex const_vector
> > @item (const_vector:@var{m} [@var{x0} @var{x1} @dots{}])
> > Represents a vector constant.  The square brackets stand for the vector
> > containing the constant elements.  @var{x0}, @var{x1} and so on are
> > the @code{const_int}, @code{const_wide_int}, @code{const_double} or
> > @code{const_fixed} elements.
> > and it is a very reasonable requirement. 
> > simplify_const_{unary,binary}_operation can violate that though, because
> > the recursion can return also other expressions, like in this case
> > a (mult (const_double) (const_double)) that can't be simplified because
> > of -frounding-math.  We need to punt on those.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> How about renaming valid_for_const_vec_duplicate_p to something more
> generic and using it here too?  The requirements should be the same.

As it generates CONST_VECTOR, renaming it to valid_for_const_vector_p
and using it also in simplify-rtx.c looks reasonable to me.

Jakub


Re: [PATCH] avoid using %lli et al.

2017-12-20 Thread Martin Sebor

On 12/20/2017 05:07 PM, Richard Sandiford wrote:

Martin Sebor  writes:

The attached patch replaces use of %lli and %llu with
HOST_WIDE_INT_PRINT_DEC and HOST_WIDE_INT_PRINT_UNSIGNED
for portability, and also replaces wide_int::to_shwi() with
offset_int::from().


Thanks for doing this.  I think you need to remove the (long long)
casts too though, so that the format stays in sync with the types.


Doh!  Thanks!  Had I waited for stage 2 bootstrap I suspect I would
have found out when it broke due to -Wformat complaining that
HOST_WIDE_INT_PRINT_DEC (%ld on my system) not matching the long
long argument.

Attached is the patch with the casts removed (still bootstrapping).

Martin
gcc/ChangeLog:

	* gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): Use
	offset_int::from instead of wide_int::to_shwi.
	(maybe_diag_overlap): Remove assertion.
	Use HOST_WIDE_INT_PRINT_DEC instead of %lli.
	* gimple-ssa-sprintf.c (format_directive): Same.
	(parse_directive): Same.
	(sprintf_dom_walker::compute_format_length): Same.
	(try_substitute_return_value): Same.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wrestrict-3.c: New test.

Index: gcc/testsuite/gcc.dg/Wrestrict-3.c
===
--- gcc/testsuite/gcc.dg/Wrestrict-3.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wrestrict-3.c	(working copy)
@@ -0,0 +1,17 @@
+/* Test to verify that the call below with the out-of-bounds offset
+   doesn't trigger an internal assertion and is diagnosed.
+   { dg-do compile }
+   { dg-options "-O2 -Wrestrict" } */
+
+#define DIFF_MAX   __PTRDIFF_MAX__
+
+void test_no_ice (int *d, __PTRDIFF_TYPE__ i, __SIZE_TYPE__ n)
+{
+  if (i < DIFF_MAX / sizeof *d - 1 || DIFF_MAX / sizeof *d + 2 < i)
+i = DIFF_MAX / sizeof *d - 1;
+
+  if (n < DIFF_MAX)
+n = DIFF_MAX / sizeof *d;
+
+  __builtin_strncpy ((char*)(d + i), (char*)d, n);   /* { dg-warning "\\\[-Wrestrict]" } */
+}
Index: gcc/gimple-ssa-warn-restrict.c
===
--- gcc/gimple-ssa-warn-restrict.c	(revision 255901)
+++ gcc/gimple-ssa-warn-restrict.c	(working copy)
@@ -276,13 +276,13 @@ builtin_memref::builtin_memref (tree expr, tree si
 		  value_range_type rng = get_range_info (offset, &min, &max);
 		  if (rng == VR_RANGE)
 		{
-		  offrange[0] = min.to_shwi ();
-		  offrange[1] = max.to_shwi ();
+		  offrange[0] = offset_int::from (min, SIGNED);
+		  offrange[1] = offset_int::from (max, SIGNED);
 		}
 		  else if (rng == VR_ANTI_RANGE)
 		{
-		  offrange[0] = (max + 1).to_shwi ();
-		  offrange[1] = (min - 1).to_shwi ();
+		  offrange[0] = offset_int::from (max + 1, SIGNED);
+		  offrange[1] = offset_int::from (min - 1, SIGNED);
 		}
 		  else
 		{
@@ -1228,25 +1228,31 @@ maybe_diag_overlap (location_t loc, gcall *call, b
 
   if (dstref.offrange[0] == dstref.offrange[1]
   || dstref.offrange[1] > HOST_WIDE_INT_MAX)
-sprintf (offstr[0], "%lli", (long long) dstref.offrange[0].to_shwi ());
+sprintf (offstr[0], HOST_WIDE_INT_PRINT_DEC,
+	 dstref.offrange[0].to_shwi ());
   else
-sprintf (offstr[0], "[%lli, %lli]",
-	 (long long) dstref.offrange[0].to_shwi (),
-	 (long long) dstref.offrange[1].to_shwi ());
+sprintf (offstr[0],
+	 "[" HOST_WIDE_INT_PRINT_DEC ", " HOST_WIDE_INT_PRINT_DEC "]",
+	 dstref.offrange[0].to_shwi (),
+	 dstref.offrange[1].to_shwi ());
 
   if (srcref.offrange[0] == srcref.offrange[1]
   || srcref.offrange[1] > HOST_WIDE_INT_MAX)
-sprintf (offstr[1], "%lli", (long long) srcref.offrange[0].to_shwi ());
+sprintf (offstr[1],
+	 HOST_WIDE_INT_PRINT_DEC,
+	 srcref.offrange[0].to_shwi ());
   else
-sprintf (offstr[1], "[%lli, %lli]",
-	 (long long) srcref.offrange[0].to_shwi (),
-	 (long long) srcref.offrange[1].to_shwi ());
+sprintf (offstr[1],
+	 "[" HOST_WIDE_INT_PRINT_DEC ", " HOST_WIDE_INT_PRINT_DEC "]",
+	 srcref.offrange[0].to_shwi (),
+	 srcref.offrange[1].to_shwi ());
 
   if (ovloff[0] == ovloff[1] || !ovloff[1])
-sprintf (offstr[2], "%lli", (long long) ovloff[0]);
+sprintf (offstr[2], HOST_WIDE_INT_PRINT_DEC, ovloff[0]);
   else
-sprintf (offstr[2], "[%lli, %lli]",
-	 (long long) ovloff[0], (long long) ovloff[1]);
+sprintf (offstr[2],
+	 "[" HOST_WIDE_INT_PRINT_DEC ", " HOST_WIDE_INT_PRINT_DEC "]",
+	 ovloff[0], ovloff[1]);
 
   const offset_int maxobjsize = tree_to_shwi (max_object_size ());
   bool must_overlap = ovlsiz[0] > 0;
@@ -1361,9 +1367,6 @@ maybe_diag_overlap (location_t loc, gcall *call, b
 }
 
   /* Issue "may overlap" diagnostics below.  */
-  gcc_assert (ovlsiz[0] == 0
-	  && ovlsiz[1] > 0
-	  && ovlsiz[1] <= maxobjsize.to_shwi ());
 
   /* Use more concise wording when one of the offsets is unbounded
  to avoid confusing the user with large and mostly meaningless
Index: gcc/gimple-ssa-sprintf.c
=

Re: [PATCH] Fix -fcompare-debug due to DEBUG_BEGIN_STMTs (PR debug/83419)

2017-12-20 Thread Alexandre Oliva
On Dec 20, 2017, Alexandre Oliva  wrote:

> things work reasonably flawlessly now is not suggestive that it was
> easy, or not too intrusive, but rather that the work to make it seamless
> despite the intrusiveness was done, at first, and then over time.
> That's the reason for -fcompare-debug and the various bootstrap options
> that stress it further.

> sparc64-linux-gnu ran into -fcompare-debug failures early in stage3,
> the same I ran into before, and that I'll investigate next.

The problems I observed on sparc64 were all similar, caused by the
delayed branch infrastructure.  Here's a patch that fixes it.  Testing
on sparc64-linux-gnu underway, but I'm posting it right away because
it's so obvious.  And yet, I ask: ok to install?

This fixes stage3 -fcompare-debug builds of at least the builds of
libbacktrace/dwarf.o, zlib/deflate.o, and libiberty/cplus-dem.o.


[-fcompare-debug] retain insn locations when turning dbr seq into return

A number of -fcompare-debug errors on sparc arise as we split a dbr
SEQUENCE back into separate insns to turn the branch into a return.
If we just take the location from the PREV_INSN, it might be a debug
insn without INSN_LOCATION, or an insn with an unrelated location.
But that's silly: each of the SEQUENCEd insns is still an insn with
its own INSN_LOCATION, so use that instead, even though some may have
been adjusted while constructing the SEQUENCE.

for  gcc/ChangeLog

* reorg.c (make_return_insns): Reemit each insn with its own
location.
---
 gcc/reorg.c |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/reorg.c b/gcc/reorg.c
index 96778addce52..02d8adc61808 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -3612,9 +3612,14 @@ make_return_insns (rtx_insn *first)
 
  delete_related_insns (insn);
  for (i = 1; i < XVECLEN (pat, 0); i++)
-   prev = emit_insn_after (PATTERN (XVECEXP (pat, 0, i)), prev);
+   {
+ rtx_insn *in_seq_insn = as_a (XVECEXP (pat, 0, i));
+ prev = emit_insn_after_setloc (PATTERN (in_seq_insn), prev,
+INSN_LOCATION (in_seq_insn));
+   }
 
- insn = emit_jump_insn_after (PATTERN (jump_insn), prev);
+ insn = emit_jump_insn_after_setloc (PATTERN (jump_insn), prev,
+ INSN_LOCATION (jump_insn));
  emit_barrier_after (insn);
 
  if (slots)


-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PATCH] Fix -fcompare-debug due to DEBUG_BEGIN_STMTs (PR debug/83419)

2017-12-20 Thread Jeff Law
On 12/20/2017 06:37 PM, Alexandre Oliva wrote:
> On Dec 20, 2017, Alexandre Oliva  wrote:
> 
>> things work reasonably flawlessly now is not suggestive that it was
>> easy, or not too intrusive, but rather that the work to make it seamless
>> despite the intrusiveness was done, at first, and then over time.
>> That's the reason for -fcompare-debug and the various bootstrap options
>> that stress it further.
> 
>> sparc64-linux-gnu ran into -fcompare-debug failures early in stage3,
>> the same I ran into before, and that I'll investigate next.
> 
> The problems I observed on sparc64 were all similar, caused by the
> delayed branch infrastructure.  Here's a patch that fixes it.  Testing
> on sparc64-linux-gnu underway, but I'm posting it right away because
> it's so obvious.  And yet, I ask: ok to install?
> 
> This fixes stage3 -fcompare-debug builds of at least the builds of
> libbacktrace/dwarf.o, zlib/deflate.o, and libiberty/cplus-dem.o.
> 
> 
> [-fcompare-debug] retain insn locations when turning dbr seq into return
> 
> A number of -fcompare-debug errors on sparc arise as we split a dbr
> SEQUENCE back into separate insns to turn the branch into a return.
> If we just take the location from the PREV_INSN, it might be a debug
> insn without INSN_LOCATION, or an insn with an unrelated location.
> But that's silly: each of the SEQUENCEd insns is still an insn with
> its own INSN_LOCATION, so use that instead, even though some may have
> been adjusted while constructing the SEQUENCE.
> 
> for  gcc/ChangeLog
> 
>   * reorg.c (make_return_insns): Reemit each insn with its own
>   location.
OK.  FWIW I wouldn't be surprised if there's other places in reorg that
are going to need similar fixes.

FOr example, look at the code after these two comments:

 /* Delete the RETURN and just execute the delay list insns.

 We do this by deleting the INSN containing the SEQUENCE, then
 re-emitting the insns separately, and then deleting the RETURN.
 This allows the count of the jump target to be properly
 decremented.

 Note that we need to change the INSN_UID of the re-emitted
insns
 since it is used to hash the insns for
mark_target_live_regs and
 the re-emitted insns will no longer be wrapped up in a
SEQUENCE.

 Clear the from target bit, since these insns are no longer
 in delay slots.  */


 /* All this insn does is execute its delay list and jump to the
 following insn.  So delete the jump and just execute the delay
 list insns.

 We do this by deleting the INSN containing the SEQUENCE, then
 re-emitting the insns separately, and then deleting the jump.
 This allows the count of the jump target to be properly
 decremented.

 Note that we need to change the INSN_UID of the re-emitted
insns
 since it is used to hash the insns for
mark_target_live_regs and
 the re-emitted insns will no longer be wrapped up in a
SEQUENCE.

 Clear the from target bit, since these insns are no longer
 in delay slots.  */

> ---


Re: [PATCH] Fix -fcompare-debug due to DEBUG_BEGIN_STMTs (PR debug/83419)

2017-12-20 Thread Jeff Law
On 12/20/2017 05:02 PM, Alexandre Oliva wrote:
> On Dec 15, 2017, Richard Biener  wrote:
> 
>> On Thu, 14 Dec 2017, Jakub Jelinek wrote:
>>> Hi!
>>>
>>> The following testcase FAILs -fcompare-debug, because one COND_EXPR
>>> branch from the FE during gimplifications is just >
>>> which doesn't have TREE_SIDE_EFFECTS, but for -gstatement-frontiers it
>>> is a STATEMENT_LIST which contains # DEBUG BEGIN_STMT and that >> >.
> 
>> Ugh...  the issue is that this difference might have many other
>> -fcompare-debug issues, like when folding things?
> 
> I fixed a number of -fcompare-debug issues related with
> TREE_SIDE_EFFECTS in STATEMENT_LISTs during the development of SFN.
> It's not too hard, really: most surviving statement lists end up with
> TREE_SIDE_EFFECTS set.
> 
> This case, that I had not caught, was one in which the non-debug case
> actually discards the STATEMENT_LIST (that had TREE_SIDE_EFFECTS set),
> and just uses a naked expr (that does NOT have TREE_SIDE_EFFECTS).  It
> wasn't too hard to detect and fix this case, but of course there might
> be others still lurking: that's why we have -fcompare-debug, and every
> now and then some new case pops up.  Some are new regressions, but
> others were just latent issues that hadn't been noticed.
> 
>> Why is it a STATEMENT_LIST rather than a COMPOUND_EXPR?
> 
> Since we introduce the begin stmt marker in the existing statement list
> long before we even start parsing the corresponding statement, using a
> stand-alone statement made sense to me.  I did not even consider using
> COMPOUND_EXPRs.
> 
> I suspect, however, that this could cause more complications, for it
> would hide any specific forms that might be searched for within the
> COMPOUND_EXPRs.  Do you not think so?  I guess it wouldn't be too hard
> to adjust the same spot I'm touching in this patch so as to turn begin
> stmt markers + stmt into COMPOUND_EXPRs.
> 
>> like adding a TREE_SIDE_EFFECTS_VALID flag on STATEMENT_LIST,
>> keeping TREE_SIDE_EFFECTS up-to-date when easily possible and when doing
>> the expensive thing cache the result.
> 
> I experimented a bit with that yesterday.  Surprisingly, just deferring
> the computation of TREE_SIDE_EFFECTS for STATEMENT_LISTs caused trouble:
> pop_stmt_list expects STATEMENT_LISTs to have TREE_SIDE_EFFECTs set
> unless they're empty lists.  Suspecting there might be other such issues
> lurking, I decided to go back to the strategy I used for earlier
> occurrences of such bugs, namely to get the behavior to match as closely
> as possible what we'd get if debug stmts weren't there.  It didn't take
> me long to get a fix this way.
> 
>> Is it really necessary to introduce this IL difference so early and in
>> such an intrusive way?
> 
> The most reasonable way to introduce markers at statement boundaries is
> to have the parser do so.  I don't see why this should be such a big
> deal, though; every time I introduce such IL debug-only changes, it took
> me significant effort to approach the state in which nearly all of the
> code behaves in the same way regardless of the debug-only stuff.  That
> things work reasonably flawlessly now is not suggestive that it was
> easy, or not too intrusive, but rather that the work to make it seamless
> despite the intrusiveness was done, at first, and then over time.
> That's the reason for -fcompare-debug and the various bootstrap options
> that stress it further.
> 
>> Can't we avoid adding # DEBUG BEGIN_STMT when there's not already
>> a STATEMENT_LIST around for example?
> 
> We could, but that would drop most of the begin_stmt markers, or none.
> A STATEMENT_LIST is always there, ready to get stmts, and after parsing
> a statement we often extract it from the statement list containing it,
> and throw the list away.
> 
> IMHO the way these markers are being introduced is just fine, it will
> just take us (me?) a bit more work to shake out the few remaining bugs,
> just like it did when VTA was introduced.  A lot of work was done before
> the patch was submitted to this end, but a number of problems only
> showed up later, on other platforms or under different combinations of
> optimization flags.  There's no reason to expect it to be different this
> time.
> 
> 
> 
> The patch below fixes the PR83419 bug.  I've bootstrapped it on x86_64-,
> i686-, ppc64-, ppc64le-, and aarch64-linux-gnu, with -fcompare-debug in
> all of stage3.  sparc64-linux-gnu ran into -fcompare-debug failures
> early in stage3, the same I ran into before, and that I'll investigate
> next.
> 
> Ok to install?
> 
> 
> [SFN] propagate single-nondebug-stmt's side effects to enclosing list
> 
> Statements without side effects, preceded by debug begin stmt markers,
> would become a statement list with side effects, although the stmt on
> its own would be extracted from the list and remain not having side
> effects.  This causes debug info and possibly codegen differences.
> This patch fixes it, identifying the situation in which the stmt woul

Re: [PATCH] Add selftest for "fold_for_warn (error_mark_node)"

2017-12-20 Thread Jeff Law
On 12/20/2017 12:45 PM, David Malcolm wrote:
> Maybe this is overkill, but this patch adds a selftest that:
> 
>fold_for_warn (error_mark_node)
> 
> does the right thing.
> 
> Successfully bootstrapped®rtested on x86_64-pc-linux-gnu.
> 
> Manually tested with "make s-selftest-c++" (since we don't run
> the selftests for cc1plus by default).
> 
> OK for trunk?
> 
> gcc/c-family/ChangeLog:
>   * c-common.c: Include "selftest.h".
>   (selftest::test_fold_for_warn): New function.
>   (selftest::c_common_c_tests): New function.
>   (selftest::c_family_tests): Call it.
OK.
jeff


Re: [PATCH] Fix PR83418

2017-12-20 Thread Jeff Law
On 12/15/2017 09:30 AM, Richard Biener wrote:
> On December 15, 2017 5:27:14 PM GMT+01:00, Jeff Law  wrote:
>> On 12/15/2017 01:10 AM, Richard Biener wrote:
>>> On Thu, 14 Dec 2017, Richard Biener wrote:
>>>
 On December 14, 2017 4:43:42 PM GMT+01:00, Jeff Law 
>> wrote:
> On 12/14/2017 01:54 AM, Richard Biener wrote:
>>
>> IVOPTs (at least) leaves unfolded stmts in the IL and VRP
> overzealously
>> asserts they cannot happen.
>>
>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>>
>> Richard.
>>
>> 2017-12-14  Richard Biener  
>>
>>  PR tree-optimization/83418
>>  * vr-values.c
> (vr_values::extract_range_for_var_from_comparison_expr):
>>  Instead of asserting we don't get unfolded comparisons deal with
>>  them.
>>
>>  * gcc.dg/torture/pr83418.c: New testcase.
> I think this also potentially affects dumping.  I've seen the
>> dumper
> crash trying to access a INTEGER_CST where we expected to find an
> SSA_NAME while iterating over a statement's operands.
>
> I haven't submitted the workaround because I hadn't tracked down
>> the
> root cause to verify something deeper isn't wrong.

 Yes, I've seen this as well, see my comment in the PR. The issue is
>> that DOM calls VRP analyze (and dump) routines with not up to date
>> operands during optimize_stmt. 
>>>
>>> I had the following in my tree to allow dumping.
>>>
>>> Richard.
>>>
>>> Index: gcc/tree-ssa-dom.c
>>> ===
>>> --- gcc/tree-ssa-dom.c  (revision 255640)
>>> +++ gcc/tree-ssa-dom.c  (working copy)
>>> @@ -2017,6 +2017,7 @@ dom_opt_dom_walker::optimize_stmt (basic
>>>  undefined behavior that get diagnosed if they're
>> left in 
>>> the
>>>  IL because we've attached range information to new
>>>  SSA_NAMES.  */
>>> + update_stmt_if_modified (stmt);
>>>   edge taken_edge = NULL;
>>>   evrp_range_analyzer.vrp_visit_cond_stmt (as_a 
>>
>>> (stmt),
>>>&taken_edge);
>>>
>> I think this implies something earlier changed a statement without
>> updating it.
> 
> Dom itself does this and delays updating on purpose as an optimization. That 
> doesn't work quite well when dispatching into different code. 
So I went ahead with a bootstrap and regression test with this patch.
If (of course) worked fine.  I'm installing it on the trunk.

jeff



Re: [v2 of PATCH 08/14] cp/tree.c: strip location wrappers in lvalue_kind

2017-12-20 Thread Jason Merrill
On Wed, Dec 20, 2017 at 5:14 PM, David Malcolm  wrote:
> On Mon, 2017-12-11 at 18:39 -0500, Jason Merrill wrote:
>> On 11/10/2017 04:45 PM, David Malcolm wrote:
>> > Without this, then lvalue_p returns false for decls, and hence
>> > e.g. uses of them for references fail.
>> >
>> > Stripping location wrappers in lvalue_kind restores the correct
>> > behavior of lvalue_p etc.
>> >
>> > gcc/cp/ChangeLog:
>> > * tree.c (lvalue_kind): Strip any location wrapper.
>>
>> Rather, lvalue_kind should learn to handle VIEW_CONVERT_EXPR.

> This patch does so, using:
>
> case NON_LVALUE_EXPR:
> case VIEW_CONVERT_EXPR:
>   if (location_wrapper_p (ref))
> return lvalue_kind (TREE_OPERAND (ref, 0));
>
> As well as the VIEW_CONVERT_EXPR, lvalue_kind needs to handle
> NON_LVALUE_EXPR, otherwise a location-wrapped string literal
> hits this clause in the "default" case:
>
>   if (CLASS_TYPE_P (TREE_TYPE (ref))
>   || TREE_CODE (TREE_TYPE (ref)) == ARRAY_TYPE)
> return clk_class;
>
> when it should have hit this one (after removing the
> location wrapper):
>
> case STRING_CST:
> case COMPOUND_LITERAL_EXPR:
>   return clk_ordinary;

Ah, the issue is that string literals should use VIEW_CONVERT_EXPR
rather than NON_LVALUE_EXPR, since they are lvalues.  With that
change, we shouldn't need to handle NON_LVALUE_EXPR specifically.

Jason


Re: [v3 of PATCH 15/14] Use fold_for_warn in get_atomic_generic_size

2017-12-20 Thread Jason Merrill
On Wed, Dec 20, 2017 at 2:36 PM, David Malcolm  wrote:
> On Tue, 2017-12-19 at 23:22 -0500, Jason Merrill wrote:
>> On Tue, Dec 19, 2017 at 7:53 PM, David Malcolm 
>> wrote:
>> > On Tue, 2017-12-19 at 15:35 -0500, Jason Merrill wrote:
>> > > On Sat, Dec 16, 2017 at 8:12 PM, David Malcolm > > > om>
>> > > wrote:
>> > > > I rebased the v2 patchkit; here's an extra patch to fix an
>> > > > issue
>> > > > with it uncovered by a recently-added testcase (in r254990).
>> > > >
>> > > > With the patch kit, but without this patch, g++'s
>> > > >   c-c++-common/pr83059.c
>> > > > fails to emit the "invalid memory model argument 6" warning.
>> > > >
>> > > > Successfully bootstrapped®rtested on x86_64-pc-linux-gnu, as
>> > > > part of the kit.
>> > > >
>> > > > Is this OK for trunk, assuming the rest of the kit is approved?
>> > > >
>> > > > gcc/c-family/ChangeLog:
>> > > > * c-common.c (get_atomic_generic_size): Call
>> > > > fold_for_warn
>> > > > on the
>> > > > params before checking for INTEGER_CST.
>> > > > ---
>> > > >  gcc/c-family/c-common.c | 2 +-
>> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > > >
>> > > > diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
>> > > > index 3438b87..ab03b7d 100644
>> > > > --- a/gcc/c-family/c-common.c
>> > > > +++ b/gcc/c-family/c-common.c
>> > > > @@ -6720,7 +6720,7 @@ get_atomic_generic_size (location_t loc,
>> > > > tree
>> > > > function,
>> > > >/* Check memory model parameters for validity.  */
>> > > >for (x = n_param - n_model ; x < n_param; x++)
>> > > >  {
>> > > > -  tree p = (*params)[x];
>> > > > +  tree p = fold_for_warn ((*params)[x]);
>> > > >if (TREE_CODE (p) == INTEGER_CST)
>> > > >  {
>> > > >   /* memmodel_base masks the low 16 bits, thus ignore
>> > > > any
>> > > > bits above
>> > >
>> > > Let's check the error case before we call fold_for_warn.
>> > >
>> > > Jason
>> >
>> > Do you mean like this?  (bootstrapped; regrtest in progress)
>>
>> Ah, no, sorry I wasn't clear.  I meant to reorder the if/else there,
>> so we check for INTEGER_TYPE first and potentially give an error,
>> then
>> fold, and then potentially warn.
>>
>> Jason
>
> Aha - thanks!
>
> Here's an updated version of the patch.
>
> Successfully bootstrapped®rtested on x86_64-pc-linux-gnu, as
> part of the kit.
>
> OK for trunk once the rest of the kit is approved?

OK.

Jason


Re: [v3 of 05/14] C++: handle locations wrappers when calling warn_for_memset

2017-12-20 Thread Jason Merrill
On Wed, Dec 20, 2017 at 2:26 PM, David Malcolm  wrote:
> On Tue, 2017-12-19 at 15:02 -0500, Jason Merrill wrote:
>> On 12/16/2017 03:01 PM, Martin Sebor wrote:
>> > On 12/16/2017 06:12 AM, David Malcolm wrote:
>> > > On Mon, 2017-12-11 at 18:36 -0500, Jason Merrill wrote:
>> > > > On 11/10/2017 04:45 PM, David Malcolm wrote:
>> > > > > We need to strip away location wrappers in tree.c predicates
>> > > > > like
>> > > > > integer_zerop, otherwise they fail when they're called on
>> > > > > wrapped INTEGER_CST; an example can be seen for
>> > > > >c-c++-common/Wmemset-transposed-args1.c
>> > > > > in g++.sum, where the warn_for_memset fails to detect integer
>> > > > > zero
>> > > > > if the location wrappers aren't stripped.
>> > > >
>> > > > These shouldn't be needed; callers should have folded away
>> > > > location
>> > > > wrappers.  I would hope for STRIP_ANY_LOCATION_WRAPPER to be
>> > > > almost
>> > > > never needed.
>> > > >
>> > > > warn_for_memset may be missing some calls to fold_for_warn.
>> > >
>> > > It is, thanks.
>> > >
>> > > Here's a revised fix for e.g. Wmemset-transposed-args1.c,
>> > > replacing
>> > > "[PATCH 05/14] tree.c: strip location wrappers from integer_zerop
>> > > etc"
>> > > and
>> > > "[PATCH 10/14] warn_for_memset: handle location wrappers"
>> > >
>> > > This version of the patch simply calls fold_for_warn on each of
>> > > the
>> > > arguments before calling warn_for_memset.  This ensures that
>> > > warn_for_memset detects integer zero.  It also adds a
>> > > literal_integer_zerop to deal with location wrapper nodes when
>> > > building "literal_mask" for the call, since this must be called
>> > > *before* the fold_for_warn calls.
>> > >
>> > > Successfully bootstrapped®rtested on x86_64-pc-linux-gnu, as
>> > > part of the kit.
>> > >
>> > > Is this OK for trunk, once the rest of the kit is approved?
>> > >
>> > > gcc/cp/ChangeLog:
>> > > * parser.c (literal_integer_zerop): New function.
>> > > (cp_parser_postfix_expression): When calling warn_for_memset,
>> > > replace integer_zerop calls with literal_integer_zerop, and
>> > > call fold_for_warn on the arguments.
>> > > ---
>> > >  gcc/cp/parser.c | 18 --
>> > >  1 file changed, 16 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
>> > > index d15769a..7bca460 100644
>> > > --- a/gcc/cp/parser.c
>> > > +++ b/gcc/cp/parser.c
>> > > @@ -6621,6 +6621,16 @@ cp_parser_compound_literal_p (cp_parser
>> > > *parser)
>> > >return compound_literal_p;
>> > >  }
>> > >
>> > > +/* Return 1 if EXPR is the integer constant zero or a complex
>> > > constant
>> > > +   of zero, without any folding, but ignoring location
>> > > wrappers.  */
>> > > +
>> > > +static int
>> > > +literal_integer_zerop (const_tree expr)
>> > > +{
>> > > +  STRIP_ANY_LOCATION_WRAPPER (expr);
>> > > +  return integer_zerop (expr);
>> > > +}
>> > > +
>> > >  /* Parse a postfix-expression.
>> > >
>> > > postfix-expression:
>> > > @@ -7168,8 +7178,12 @@ cp_parser_postfix_expression (cp_parser
>> > > *parser, bool address_p, bool cast_p,
>> > >  tree arg0 = (*args)[0];
>> > >  tree arg1 = (*args)[1];
>> > >  tree arg2 = (*args)[2];
>> > > -int literal_mask = ((!!integer_zerop (arg1) << 1)
>> > > -| (!!integer_zerop (arg2) << 2));
>> > > +/* Handle any location wrapper nodes.  */
>> > > +arg0 = fold_for_warn (arg0);
>> > > +int literal_mask = ((!!literal_integer_zerop (arg1) <<
>> > > 1)
>> > > +| (!!literal_integer_zerop (arg2) << 2));
>> >
>> > The double negation jumped out at me.  The integer_zerop() function
>> > looks like a predicate that hasn't yet been converted to return
>> > bool.
>> > It seems that new predicates that are implemented on top of it
>> > could
>> > return bool and their callers avoid this conversion.  (At some
>> > point
>> > in the future it would also be nice to convert the existing
>> > predicates to return bool).
>>
>> Agreed.  And since integer_zerop in fact returns boolean values,
>> there
>> seems to be no need for the double negative in the first place.
>>
>> So, please make the new function return bool and remove the !!s.
>>
>> And I think the fold_for_warn call should be in warn_for_memset, and
>> it
>> should be called for arg2 as well instead of specifically handling
>> CONST_DECL Here.
>
> Thanks.
>
> Here's an updated version of the patch which I believe covers all
> of the above.
>
> Successfully bootstrapped®rtested on x86_64-pc-linux-gnu, as
> part of the kit.
>
> OK for trunk once the rest of the kit is approved?

OK.

Jason


Re: [PATCH] RTEMS: Add GCC Runtime Library Exception

2017-12-20 Thread Jeff Law
On 08/22/2017 11:15 PM, Sebastian Huber wrote:
> Hello Jeff,
> 
> On 03/08/17 07:11, Sebastian Huber wrote:
>> On 02/08/17 21:30, Jeff Law wrote:
>>
>>> On 07/24/2017 12:03 AM, Sebastian Huber wrote:
 gcc/

     PR libgcc/61152
     * aarch64/rtems.h: Add GCC Runtime Library Exception. Format
     changes.
     * arm/rtems.h: Likewise.
     * bfin/rtems.h: Likewise.
     * i386/rtemself.h: Likewise.
     * lm32/rtems.h: Likewise.
     * m32c/rtems.h: Likewise.
     * m68k/rtemself.h: Likewise.
     * microblaze/rtems.h: Likewise.
     * mips/rtems.h: Likewise.
     * moxie/rtems.h: Likewise.
     * nios2/rtems.h: Likewise.
     * powerpcspe/rtems.h: Likewise.
     * rs6000/rtems.h: Likewise.
     * rtems.h: Likewise.
     * sh/rtems.h: Likewise.
     * sh/rtemself.h: Likewise.
     * sparc/rtemself.h: Likewise.
>>> This seems horribly wrong.  Did anyone ack this change?  I'm fully
>>> supportive of target maintainers taking care of their areas, but
>>> licensing stuff probably should get explicitly ack'd.
>>>
>>> I just reviewed all the rtems config files and I don't see anything in
>>> any of them that deserves a runtime exception with the possible
>>> exception of rs6000/rtems.h.
>>>
>>> Seriously.  Redefining the CPP builtins?  LINK_SPEC?  #undefs? Those
>>> are not things we should be granting an exception for.
>>>
>>> The one that looks marginal to me would be rs6000/rtems.h and its
>>> definition of CRT_CALL_STATIC_FUNCTION.
>>
>> I asked on the mailing list:
>>
>> https://gcc.gnu.org/ml/gcc/2017-07/msg00171.html
>>
>> Jakub Jelinek said that for header files included by libgcc it is
>> important whether they have the runtime exception or not:
>>
>> https://gcc.gnu.org/ml/gcc/2017-07/msg00176.html
>>
>> There is also this PR61152 from 2014
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61152
>>
>> which adds the runtime exception to a couple of gcc/ subdirectory
>> files (including some RTEMS files). So, I had no explicit acknowledge,
>> but my impression was that I did simply fix some left over files.
>>
>> If you don't add the runtime exception to files included by libgcc,
>> then the user of GCC must check that these files contain no content
>> that deserves a copyright. Is this really user friendly?
>>
> 
> What should I do now? It would be nice to have a general guideline on
> how to deal with header files used by libgcc.
You have to look at each header and determine how the contents are used
and whether or not those uses are of a nature that requires an exception.

So, yes it's important to get the exception in there when it's needed.
But that doesn't mean we just drop the exception into every file that's
included by something in libgcc.  We are stewards of the GCC sources for
the FSF and we have to be very careful when it comes to changing the
licensing on any code.

WRT 61152.   That doesn't mean splatting the runtime exception into
those files is/was the correct thing to do.  Again, one has to look at
it on a file by file basis.  I have not looked at those changes in
detail to know if they really need the license exception or not.

The FSF has been very clear through the decades that linking against
libgcc should not drag the user's code under the terms of the GPL.  So
all users have to do is extend a degree of trust.  When files have
actually needed a license change we've taken care of it and always will.


I'm not going to demand you revert the change, but let's please get an
explicit ack before changing the licensing terms in the future.

jeff


Re: [v2 of PATCH 13/14] c-format.c: handle location wrappers

2017-12-20 Thread Jason Merrill
On Wed, Dec 20, 2017 at 12:33 PM, David Malcolm  wrote:
> On Tue, 2017-12-19 at 14:55 -0500, Jason Merrill wrote:
>> On 12/17/2017 11:29 AM, David Malcolm wrote:
>> > On Mon, 2017-12-11 at 18:45 -0500, Jason Merrill wrote:
>> > > On 11/10/2017 04:45 PM, David Malcolm wrote:
>> > > > +  STRIP_ANY_LOCATION_WRAPPER (format_tree);
>> > > > +
>> > > >   if (VAR_P (format_tree))
>> > > > {
>> > > >   /* Pull out a constant value if the front end
>> > > > didn't.  */
>> > >
>> > > It seems like we want fold_for_warn here instead of the special
>> > > variable
>> > > handling.  That probably makes sense for the other places you
>> > > change in
>> > > this patch, too.
>> >
>> > Here's an updated version of the patch which uses fold_for_warn,
>> > rather than STRIP_ANY_LOCATION_WRAPPER.
>> >
>> > In one place it was necessary to add a STRIP_NOPS, since the
>> > fold_for_warn can add a cast around a:
>> >ADDR_EXPR  (
>> >  STRING_CST)
>> > turning it into a:
>> >NOP_EXPR (
>> >  ADDR_EXPR  (
>> >STRING_CST))
>>
>> Hmm, that seems like a bug.  fold_for_warn shouldn't change the type of
>> the result.
>
> In a similar vein, is the result of decl_constant_value (decl) required
> to be the same type as that of the decl?
>
> What's happening for this testcase (g++.dg/warn/Wformat-1.C) is that we
> have a VAR_DECL with a DECL_INITIAL, but the types of the two don't
> quite match up.
>
> decl_constant_value returns an unshare_expr clone of the DECL_INITIAL,
> and this is what's returned from fold_for_warn.
>
> Am I right in thinking
>
> (a) that the bug here is that a DECL_INITIAL ought to have the same
> type as its decl, and so there's a missing cast where that gets
> set up, or

This seems right.

> (b) should decl_constant_value handle such cases, and introduce a cast
> if it uncovers them?

decl_constant_value should probably assert that the types match closely enough.

Jason


Re: [C++] Add support for #pragma GCC unroll v4

2017-12-20 Thread Jason Merrill
On Wed, Dec 20, 2017 at 9:20 AM, Eric Botcazou  wrote:
>> This needs some C++ tests, particularly with templates and range-for.  I
>> suspect that using the pragma in a template will ICE.
>
> No, that wasn't the case, but the combination template/range-for was indeed
> not working; fixed by adding a 5th parameter to the RANGE_FOR_STMT node.

OK.

Jason


Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2017-12-20 Thread Jeff Law
On 12/11/2017 07:54 PM, Alexandre Oliva wrote:
> On Nov 10, 2017, Alexandre Oliva  wrote:
> 
>> Output DW_AT_entry_pc based on markers.
> 
> Here's an updated version of the patch.
> 
> [IEPM] Introduce inline entry point markers
> 
> Output DW_AT_entry_pc based on markers.
> 
> Introduce DW_AT_GNU_entry_view as a DWARF extension.
> 
> If views are enabled are we're not in strict compliance mode, output
> DW_AT_GNU_entry_view if it might be nonzero.
> 
> This patch depends on SFN and LVU patchsets, and on the IEPM patch that
> introduces the inline_entry debug hook.
> 
> for  include/ChangeLog
> 
>   * dwarf2.def (DW_AT_GNU_entry_view): New.
> 
> for  gcc/ChangeLog
> 
>   * cfgexpand.c (expand_gimple_basic_block): Handle inline entry
>   markers.
>   * dwarf2out.c (dwarf2_debug_hooks): Enable inline_entry hook.
>   (BLOCK_INLINE_ENTRY_LABEL): New.
>   (dwarf2out_var_location): Disregard inline entry markers.
>   (inline_entry_data): New struct.
>   (inline_entry_data_hasher): New hashtable type.
>   (inline_entry_data_hasher::hash): New.
>   (inline_entry_data_hasher::equal): New.
>   (inline_entry_data_table): New variable.
>   (add_high_low_attributes): Add DW_AT_entry_pc and
>   DW_AT_GNU_entry_view attributes if a pending entry is found
>   in inline_entry_data_table.  Add old entry_pc attribute only
>   if debug nonbinding markers are disabled.
>   (gen_inlined_subroutine_die): Set BLOCK_DIE if nonbinding
>   markers are enabled.
>   (block_within_block_p, dwarf2out_inline_entry): New.
>   (dwarf2out_finish): Check that no entries remained in
>   inline_entry_data_table.
>   * final.c (reemit_insn_block_notes): Handle inline entry notes.
>   (final_scan_insn, notice_source_line): Likewise.
>   (rest_of_clean_state): Skip inline entry markers.
>   * gimple-pretty-print.c (dump_gimple_debug): Handle inline entry
>   markers.
>   * gimple.c (gimple_build_debug_inline_entry): New.
>   * gimple.h (enum gimple_debug_subcode): Add
>   GIMPLE_DEBUG_INLINE_ENTRY.
>   (gimple_build_debug_inline_entry): Declare.
>   (gimple_debug_inline_entry_p): New.
>   (gimple_debug_nonbind_marker_p): Adjust.
>   * insn-notes.def (INLINE_ENTRY): New.
>   * print-rtl.c (rtx_writer::print_rtx_operand_code_0): Handle
>   inline entry marker notes.
>   (print_insn): Likewise.
>   * rtl.h (NOTE_MARKER_P): Add INLINE_ENTRY support.
>   (INSN_DEBUG_MARKER_KIND): Likewise.
>   (GEN_RTX_DEBUG_MARKER_INLINE_ENTRY_PAT): New.
>   * tree-inline.c (expand_call_inline): Build and insert
>   debug_inline_entry stmt.
>   * tree-ssa-live.c (remove_unused_scope_block_p): Preserve
>   inline entry blocks early, if nonbind markers are enabled.
>   (dump_scope_block): Dump fragment info.
>   * var-tracking.c (reemit_marker_as_note): Handle inline entry note.
>   * doc/gimple.texi (gimple_debug_inline_entry_p): New.
>   (gimple_build_debug_inline_entry): New.
>   * doc/invoke.texi (gstatement-frontiers, gno-statement-frontiers):
>   Enable/disable inline entry points too.
>   * doc/rtl.texi (NOTE_INSN_INLINE_ENTRY): New.
>   (DEBUG_INSN): Describe inline entry markers.



> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index a88b4cdf6e25..df2f6d92c6fc 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -23533,6 +23579,42 @@ add_high_low_attributes (tree stmt, dw_die_ref die)
>  {
>char label[MAX_ARTIFICIAL_LABEL_BYTES];
>  
> +  if (inline_entry_data **iedp
> +  = !inline_entry_data_table ? NULL
> +  : inline_entry_data_table->find_slot_with_hash (stmt,
> +   htab_hash_pointer (stmt),
> +   NO_INSERT))
> +{
> +  inline_entry_data *ied = *iedp;
> +  gcc_assert (MAY_HAVE_DEBUG_MARKER_INSNS);
> +  gcc_assert (inlined_function_outer_scope_p (stmt));
> +  ASM_GENERATE_INTERNAL_LABEL (label, ied->label_pfx, ied->label_num);
> +  add_AT_lbl_id (die, DW_AT_entry_pc, label);
> +
> +  if (debug_variable_location_views && !ZERO_VIEW_P (ied->view))
> + {
> +   if (!output_asm_line_debug_info ())
> + add_AT_unsigned (die, DW_AT_GNU_entry_view, ied->view);
> +   else
> + {
> +   ASM_GENERATE_INTERNAL_LABEL (label, "LVU", ied->view);
> +   /* FIXME: this will resolve to a small number.  Could we
> +  possibly emit smaller data?  Ideally we'd emit a
> +  uleb128, but that would make the size of DIEs
> +  impossible for the compiler to compute, since it's
> +  the assembler that computes the value of the view
> +  label in this case.  Ideally, we'd have a single form
> +  encompassing both the address and the view, and
> +  indirecting them through a table might make things
> +  easier, 

Re: [PATCH] Fix vector handling in simplify-rtx.c (PR rtl-optimization/82973)

2017-12-20 Thread Jeff Law
On 12/20/2017 03:18 PM, Jakub Jelinek wrote:
> Hi!
> 
> In rtl.texi we say:
> @findex const_vector
> @item (const_vector:@var{m} [@var{x0} @var{x1} @dots{}])
> Represents a vector constant.  The square brackets stand for the vector
> containing the constant elements.  @var{x0}, @var{x1} and so on are
> the @code{const_int}, @code{const_wide_int}, @code{const_double} or
> @code{const_fixed} elements.
> and it is a very reasonable requirement. 
> simplify_const_{unary,binary}_operation can violate that though, because
> the recursion can return also other expressions, like in this case
> a (mult (const_double) (const_double)) that can't be simplified because
> of -frounding-math.  We need to punt on those.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2017-12-20  Jakub Jelinek  
> 
>   PR rtl-optimization/82973
>   * simplify-rtx.c (simplify_const_unary_operation): Don't optimize into
>   CONST_VECTOR if some element isn't simplified into CONST_{,WIDE_}INT,
>   CONST_DOUBLE or CONST_FIXED.
>   (simplify_const_binary_operation): Likewise.  Use CONST_FIXED_P macro
>   instead of GET_CODE == CONST_FIXED.
>   (simplify_subreg): Use CONST_FIXED_P macro instead of
>   GET_CODE == CONST_FIXED.
> 
>   * gfortran.dg/pr82973.f90: New test.
OK.
jeff


Re: [AArch64] Tweak aarch64_classify_address interface

2017-12-20 Thread Richard Sandiford
James Greenhalgh  writes:
> On Mon, Oct 23, 2017 at 06:58:29PM +0100, Richard Sandiford wrote:
>> Ping.
>
> Makes sense. OK.
>
> Reviewed-By: James Greenhalgh 

Thanks.  I needed to update it slightly after recent changes to
config/aarch64, but the differences from the first one were obvious.
here's what I applied.

(And I just realised I forgot the Reviewed-By:, sorry about that.)

Richard


2017-12-21  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* config/aarch64/aarch64-protos.h (aarch64_addr_query_type): New enum.
(aarch64_legitimate_address_p): Use it instead of an rtx code,
as an optional final parameter.
* config/aarch64/aarch64.c (aarch64_classify_address): Likewise.
(aarch64_legitimate_address_p): Likewise.
(aarch64_print_address_internal): Take an aarch64_addr_query_type
instead of an rtx code.
(aarch64_address_valid_for_prefetch_p): Update calls accordingly.
(aarch64_legitimate_address_hook_p): Likewise.
(aarch64_print_ldpstp_address): Likewise.
(aarch64_print_operand_address): Likewise.
(aarch64_address_cost): Likewise.
* config/aarch64/constraints.md (Uml, Umq, Ump, Utq): Likewise.
* config/aarch64/predicates.md (aarch64_mem_pair_operand): Likewise.
(aarch64_mem_pair_lanes_operand): Likewise.

Index: gcc/config/aarch64/aarch64-protos.h
===
--- gcc/config/aarch64/aarch64-protos.h 2017-12-15 13:09:53.224711621 +
+++ gcc/config/aarch64/aarch64-protos.h 2017-12-20 23:03:10.548917491 +
@@ -111,6 +111,19 @@ enum aarch64_symbol_type
   SYMBOL_FORCE_TO_MEM
 };
 
+/* Classifies the type of an address query.
+
+   ADDR_QUERY_M
+  Query what is valid for an "m" constraint and a memory_operand
+  (the rules are the same for both).
+
+   ADDR_QUERY_LDP_STP
+  Query what is valid for a load/store pair.  */
+enum aarch64_addr_query_type {
+  ADDR_QUERY_M,
+  ADDR_QUERY_LDP_STP
+};
+
 /* A set of tuning parameters contains references to size and time
cost models and vectors for address cost calculations, register
move costs and memory move costs.  */
@@ -440,7 +453,8 @@ bool aarch64_float_const_representable_p
 
 #if defined (RTX_CODE)
 
-bool aarch64_legitimate_address_p (machine_mode, rtx, RTX_CODE, bool);
+bool aarch64_legitimate_address_p (machine_mode, rtx, bool,
+  aarch64_addr_query_type = ADDR_QUERY_M);
 machine_mode aarch64_select_cc_mode (RTX_CODE, rtx, rtx);
 rtx aarch64_gen_compare_reg (RTX_CODE, rtx, rtx);
 rtx aarch64_load_tp (rtx);
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c2017-12-16 14:10:35.780523658 +
+++ gcc/config/aarch64/aarch64.c2017-12-20 23:03:10.549917461 +
@@ -4437,21 +4437,21 @@ virt_or_elim_regno_p (unsigned regno)
  || regno == ARG_POINTER_REGNUM);
 }
 
-/* Return true if X is a valid address for machine mode MODE.  If it is,
-   fill in INFO appropriately.  STRICT_P is true if REG_OK_STRICT is in
-   effect.  OUTER_CODE is PARALLEL for a load/store pair.  */
+/* Return true if X is a valid address of type TYPE for machine mode MODE.
+   If it is, fill in INFO appropriately.  STRICT_P is true if
+   REG_OK_STRICT is in effect.  */
 
 static bool
 aarch64_classify_address (struct aarch64_address_info *info,
- rtx x, machine_mode mode,
- RTX_CODE outer_code, bool strict_p)
+ rtx x, machine_mode mode, bool strict_p,
+ aarch64_addr_query_type type = ADDR_QUERY_M)
 {
   enum rtx_code code = GET_CODE (x);
   rtx op0, op1;
 
   /* On BE, we use load/store pair for all large int mode load/stores.
  TI/TFmode may also use a load/store pair.  */
-  bool load_store_pair_p = (outer_code == PARALLEL
+  bool load_store_pair_p = (type == ADDR_QUERY_LDP_STP
|| mode == TImode
|| mode == TFmode
|| (BYTES_BIG_ENDIAN
@@ -4683,7 +4683,7 @@ aarch64_address_valid_for_prefetch_p (rt
   struct aarch64_address_info addr;
 
   /* PRFM accepts the same addresses as DImode...  */
-  bool res = aarch64_classify_address (&addr, x, DImode, MEM, strict_p);
+  bool res = aarch64_classify_address (&addr, x, DImode, strict_p);
   if (!res)
 return false;
 
@@ -4719,19 +4719,18 @@ aarch64_legitimate_address_hook_p (machi
 {
   struct aarch64_address_info addr;
 
-  return aarch64_classify_address (&addr, x, mode, MEM, strict_p);
+  return aarch64_classify_address (&addr, x, mode, strict_p);
 }
 
-/* Return TRUE if X is a legitimate address for accessing memory in
-   mode MODE.  OUTER_CODE will be PARALLEL if this is a load/store
-   pair operation.  */
+/* Return TRUE if X is a legitimate address of type TYPE for accessing
+   memor

Re: [PATCH] Fix some x86 OPTION_MASK_ISA* issues (PR target/83488)

2017-12-20 Thread Uros Bizjak
On Wed, Dec 20, 2017 at 8:37 PM, Jakub Jelinek  wrote:
> Hi!
>
> As you know, we ran out of ix86_isa_flags bitmask bits some time ago.
> The testcase from the PR's #c0 (which I'm not adding into testsuite, because
> it will be useless any time *.opt is modified with any of the
> OPTION_MASK_ISA* bits) ICEs because we mix in i386-builtins.def
> OPTION_MASK_ISA_* constants from the two bitmasks together,
> e.g.
> OPTION_MASK_ISA_SHSTK | OPTION_MASK_ISA_64BIT
> and we have just a single bitmask there, so depending on the BDESC_
> chunk it must be either solely ix86_isa_flags or solely ix86_isa_flags2,
> otherwise OPTION_MASK_ISA_64BIT:
> #define OPTION_MASK_ISA_64BIT (HOST_WIDE_INT_1U << 1)
> stands for
> #define OPTION_MASK_ISA_AVX5124VNNIW (HOST_WIDE_INT_1U << 1)
> instead.
> I've grepped around quite a bit and determined the only problematic two
> OPTION_MASK_ISA_* options right now are OPTION_MASK_ISA_SHSTK due
> to being ored with OPTION_MASK_ISA_64BIT and
> OPTION_MASK_ISA_AVX512VBMI2 due to being ored with OPTION_MASK_ISA_AVX512VL.
>
> So, this patch moves those 2 OPTION_MASK_ISA* flags into ix86_isa_flags
> and to free some room there, moves 4 other OPTION_MASK_ISA* flags that
> are never ored in any way and have very few dependencies into
> ix86_isa_flags2.
>
> In addition to this, it fixes a bug where for -mavx512vbmi2
> we properly enable -mavx512f, but for -mavx512vbmi2 -mno-avx512f
> we keep -mavx512vbmi2 enabled while -mavx512f disabled, so things ICE badly.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> As I said in the PR, one way to perhaps improve incrementally the status
> quo could be to encode explicitly whether it is a mask option for
> ix86_isa_flags or ix86_isa_flags2 in the OPTION_MASK_ISA* name, e.g. by
> using OPTION_MASK_ISA2_ for the latter.  Thoughts on that?
>
> And, while we now have 4 bits left, if we keep adding new ISA bits at the
> current rate, while there are still a few ISA flags that could be moved,
> perhaps at some point it might be easier to move everything AVX512 related
> to ix86_isa_flags and keep the rest in ix86_isa_flags.  For that we'd need
> some magic shadow option for OPTION_MASK_ISA_64BIT and keep that
> synchronized.
>
> 2017-12-20  Jakub Jelinek  
>
> PR target/83488
> * config/i386/i386.c (ix86_target_string): Move -mavx512vbmi2 and
> -mshstk entries from isa_opts2 to isa_opts and -mhle, -mmovbe,
> -mclzero and -mmwaitx entries from isa_opts to isa_opts2.
> (ix86_option_override_internal): Adjust for
> OPTION_MASK_ISA_{HLE,MOVBE,CLZERO,MWAITX} moving to ix86_isa_flags2
> and OPTION_MASK_ISA_SHSTK moving to ix86_isa_flags.
> (BDESC_VERIFYS): Remove SPECIAL_ARGS2 related checks.
> (ix86_init_mmx_sse_builtins): Remove bdesc_special_args2 handling.
> Use def_builtin2 instead of def_builtin for OPTION_MASK_ISA_MWAITX
> and OPTION_MASK_ISA_CLZERO builtins.  Use def_builtin instead of
> def_builtin2 for CET builtins.
> (ix86_expand_builtin): Remove bdesc_special_args2 handling.  Fix
> up formatting in IX86_BUILTIN_RDPID code.
> * config/i386/i386-builtin.def: Move VBMI2 builtins from SPECIAL_ARGS2
> section to SPECIAL_ARGS and from ARGS2 section to ARGS.
> * config/i386/i386.opt (mavx512vbmi2, mshstk): Move from
> ix86_isa_flags2 to ix86_isa_flags.
> (mhle, mmovbe, mclzero, mmwaitx): Move from ix86_isa_flags to
> ix86_isa_flags2.
> * config/i386/i386-c.c (ix86_target_macros_internal): Check for
> OPTION_MASK_ISA_{CLZERO,MWAITX} in isa_flag2 instead of isa_flag.
> Check for OPTION_MASK_ISA_{SHSTK,AVX512VBMI2} in isa_flag instead
> of isa_flag2.
> * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VBMI2_SET):
> Or in OPTION_MASK_ISA_AVX512F_SET.
> (OPTION_MASK_ISA_AVX512F_UNSET): Or in
> OPTION_MASK_ISA_AVX512VBMI2_UNSET.
> (ix86_handle_option): Adjust for
> OPTION_MASK_ISA_{SHSTK,AVX512VBMI2}_*SET being in ix86_isa_flags
> and OPTION_MASK_ISA_{MOVBE,MWAITX,CLZERO}_*SET in ix86_isa_flags2.
>
> * gcc.target/i386/pr83488.c: New test.

LGTM.

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2017-12-20 10:10:10.0 +0100
> +++ gcc/config/i386/i386.c  2017-12-20 13:15:46.738245880 +0100
> @@ -2753,21 +2753,24 @@ ix86_target_string (HOST_WIDE_INT isa, H
>{
>  { "-mcx16",OPTION_MASK_ISA_CX16 },
>  { "-mmpx", OPTION_MASK_ISA_MPX },
> -{ "-mavx512vbmi2", OPTION_MASK_ISA_AVX512VBMI2 },
> -{ "-mavx512vnni", OPTION_MASK_ISA_AVX512VNNI },
> +{ "-mavx512vnni",  OPTION_MASK_ISA_AVX512VNNI },
>  { "-mvaes",OPTION_MASK_ISA_VAES },
>  { "-mrdpid",   OPTION_MASK_ISA_RDPID },
>  { "-msgx", OPTION_MASK_ISA_SGX },
>  { "-mavx5124vnniw", OPTION_MASK_ISA_AVX5124VNNIW },
>  { "-mavx5124fmaps", O

PR83501

2017-12-20 Thread Prathamesh Kulkarni
Hi Jakub,
Based on your suggestions in PR83501, I have updated the patch to
check for integer_zerop for 2nd operand of mem_ref.

With the patch, Warray-bounds started warning for the following test
in Warray-bounds-3.c in line 362 and thus I removed xfail on it:
TM (a5, "0123", ma.a5 + i, ma.a5);

Does it look OK ?
Validation in progress.

Thanks,
Prathamesh
2017-12-21  Prathamesh Kulkarni  

* tree-ssa-strlen.c (get_string_cst): New.
(handle_char_store): Call get_string_cst.

testsuite/
* gcc.dg/tree-ssa/pr83501-1.c: New test.
* c-c++-common/Warray-bounds-3.c (test_strcpy_bounds_memarray_range):
Remove xfail and adjust test for a5.
diff --git a/gcc/testsuite/c-c++-common/Warray-bounds-3.c 
b/gcc/testsuite/c-c++-common/Warray-bounds-3.c
index 73103f43d01..62667344c0a 100644
--- a/gcc/testsuite/c-c++-common/Warray-bounds-3.c
+++ b/gcc/testsuite/c-c++-common/Warray-bounds-3.c
@@ -359,7 +359,7 @@ void test_strcpy_bounds_memarray_range (void)
   TM (a5, "0",ma.a5 + i, ma.a5);
   TM (a5, "01",   ma.a5 + i, ma.a5);
   TM (a5, "012",  ma.a5 + i, ma.a5);
-  TM (a5, "0123", ma.a5 + i, ma.a5); /* { dg-warning "offset 10 from the 
object at .ma. is out of the bounds of referenced subobject .\(MA::\)?a5. with 
type .char\\\[5]. at offset 4" "strcpy" { xfail *-*-* } } */
+  TM (a5, "0123", ma.a5 + i, ma.a5); /* { dg-warning "offset 10 from the 
object at .ma. is out of the bounds of referenced subobject .\(MA::\)?a5. with 
type .char ?\\\[5]' at offset 4" } */
 
   TM (a11, "0",   ma.a5, ma.a11);
   TM (a11, "01",  ma.a5, ma.a11);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c
new file mode 100644
index 000..d8d3bf6039a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+char a[4];
+
+void f (void)
+{
+  __builtin_strcpy (a, "abc");
+
+  if (__builtin_strlen (a) != 3)
+__builtin_abort ();
+}
+
+/* { dg-final { scan-tree-dump-not "__builtin_strlen" "strlen" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 8971c7df4f3..f1d4f6ef059 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -2769,6 +2769,21 @@ handle_pointer_plus (gimple_stmt_iterator *gsi)
 }
 }
 
+/* Check if RHS is string_cst possibly wrapped by mem_ref.  */
+static tree
+get_string_cst (tree rhs)
+{
+  if (TREE_CODE (rhs) == MEM_REF
+  && integer_zerop (TREE_OPERAND (rhs, 1)))
+{
+  rhs = TREE_OPERAND (rhs, 0);
+  if (TREE_CODE (rhs) == ADDR_EXPR)
+   rhs = TREE_OPERAND (rhs, 0);
+}
+
+  return (TREE_CODE (rhs) == STRING_CST) ? rhs : NULL_TREE;
+}
+
 /* Handle a single character store.  */
 
 static bool
@@ -2924,11 +2939,11 @@ handle_char_store (gimple_stmt_iterator *gsi)
}
 }
   else if (idx == 0
-  && TREE_CODE (gimple_assign_rhs1 (stmt)) == STRING_CST
+  && (rhs = get_string_cst (gimple_assign_rhs1 (stmt)))
   && ssaname == NULL_TREE
   && TREE_CODE (TREE_TYPE (lhs)) == ARRAY_TYPE)
 {
-  size_t l = strlen (TREE_STRING_POINTER (gimple_assign_rhs1 (stmt)));
+  size_t l = strlen (TREE_STRING_POINTER (rhs));
   HOST_WIDE_INT a = int_size_in_bytes (TREE_TYPE (lhs));
   if (a > 0 && (unsigned HOST_WIDE_INT) a > l)
{