[PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-05 Thread Thomas Preudhomme
In case of high register pressure in PIC mode, the address of the stack
protector's guard can be spilled on ARM targets as shown in PR85434,
thus allowing an attacker to control what the canary would be compared
against. ARM currently lacks stack_protect_set and stack_protect_test insn
patterns, but defining them would not help: the address is still expanded
in the regular way and the patterns only deal with the copy and test of the
guard with the canary.
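
To make the failure mode concrete, here is a toy model in C, not the code
from the PR. A struct stands in for the stack frame and every name in it
(frame, guard_addr, attack_goes_undetected, ...) is hypothetical. It only
assumes that an overflow can reach both the canary and the spilled copy of
the guard's address.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a struct stands in for a stack frame.  Every name below is
   hypothetical and chosen for this illustration only.  */
struct frame {
  char buf[8];                 /* local buffer an attacker can overflow */
  uintptr_t canary;            /* copy of the guard written by the prologue */
  const uintptr_t *guard_addr; /* spilled copy of the guard's address */
};

static const uintptr_t real_guard = 0x5eedc0deu;
static const uintptr_t attacker_value = 0x41414141u;

static void prologue(struct frame *f)
{
  f->canary = real_guard;
  f->guard_addr = &real_guard;   /* address spilled next to the canary */
}

/* Stands in for an overflow of buf that reaches both fields.  */
static void simulate_overflow(struct frame *f)
{
  f->canary = attacker_value;
  f->guard_addr = &attacker_value;
}

/* Epilogue check that reloads the guard's address from the frame.  */
static int check_via_spilled_address(const struct frame *f)
{
  return f->canary == *f->guard_addr;
}

/* Epilogue check that recomputes the guard's address.  */
static int check_via_fresh_address(const struct frame *f)
{
  return f->canary == real_guard;
}

/* Returns 1 if the corrupted frame still passes the chosen check.  */
static int attack_goes_undetected(int reload_spilled_address)
{
  struct frame f;
  prologue(&f);
  assert(check_via_spilled_address(&f) && check_via_fresh_address(&f));
  simulate_overflow(&f);
  return reload_spilled_address ? check_via_spilled_address(&f)
                                : check_via_fresh_address(&f);
}
```

In this model the check that reloads the address from the frame accepts the
corrupted canary, while the check that recomputes the address rejects it.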

This problem does not occur for x86 targets because the PIC access and
the test can be done in the same instruction. AArch64 is exempt too
because its PIC access insn patterns are movs of an UNSPEC, which prevents
the second access in the epilogue from being CSEd in the cse_local pass
with the first access in the prologue.

The approach followed here is to create new "combined" set and test
standard pattern names that take the unexpanded guard and do the set or
test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
to hide the individual instructions being generated from the compiler and
to split the pattern into generic load, compare and branch instructions
after register allocation, therefore avoiding any spilling. This is
implemented here for the ARM targets. For targets not implementing these new
standard pattern names, the existing stack_protect_set and
stack_protect_test pattern names are used.
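
The fallback order described above can be sketched as a small standalone C
model. This is not the cfgexpand.c code: pattern_fn, target_patterns,
expand_guard_set and dispatch_for are hypothetical stand-ins for the
optional target patterns and the expander, and the final plain-move case is
an assumption about what happens when a target provides neither pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an optional target pattern generator:
   returns nonzero when the pattern could be emitted.  */
typedef int (*pattern_fn)(void);

struct target_patterns {
  pattern_fn combined_set;   /* NULL when the target lacks the pattern */
  pattern_fn separate_set;   /* NULL when the target lacks the pattern */
};

enum emitted { EMIT_COMBINED, EMIT_SEPARATE, EMIT_PLAIN_MOVE };

static int pattern_matches(void) { return 1; }

/* Try the combined pattern first, then the existing separate pattern,
   and only then fall back to an ordinary move.  */
static enum emitted expand_guard_set(const struct target_patterns *t)
{
  assert(t != NULL);
  if (t->combined_set != NULL && t->combined_set())
    return EMIT_COMBINED;
  if (t->separate_set != NULL && t->separate_set())
    return EMIT_SEPARATE;
  return EMIT_PLAIN_MOVE;
}

/* Convenience wrapper: build a target description and dispatch.  */
static enum emitted dispatch_for(int has_combined, int has_separate)
{
  struct target_patterns t = {
    has_combined ? pattern_matches : NULL,
    has_separate ? pattern_matches : NULL
  };
  return expand_guard_set(&t);
}
```

A target that defines only the existing pattern keeps its current behaviour
in this model, which is the point of trying the new name first rather than
replacing the old one.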

To be able to split the PIC access after register allocation, the PIC
helper functions (require_pic_register and legitimize_pic_address) had to
be augmented to force a new PIC register load and to control which
register it loads into. This is because sharing the PIC register
between prologue and epilogue could again lead to spilling due to CSE,
which an attacker could use to control what the canary gets compared
against.
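
As a rough illustration of the two new parameters, the following toy C
model contrasts a shared PIC base with a forced fresh load. Only the
parameter names pic_reg and compute_now come from the ChangeLog below;
function_state, require_pic_base, DEFAULT_PIC_REG and the register numbers
are hypothetical, and the real require_pic_register works on RTL rather
than on integers.

```c
#include <assert.h>

/* Hypothetical register numbers for the model.  */
enum { DEFAULT_PIC_REG = 9, NO_REG = -1 };

struct function_state {
  int have_shared_base;  /* has the shared PIC base been loaded yet? */
  int shared_reg;        /* register holding the shared PIC base */
  int loads_emitted;     /* number of PIC base loads emitted so far */
};

/* Returns the register holding the PIC base.  With compute_now set, a
   fresh load into pic_reg is emitted and the shared base is neither used
   nor updated.  Otherwise the shared base is loaded once and reused.  */
static int require_pic_base(struct function_state *fn, int pic_reg,
                            int compute_now)
{
  assert(fn != 0);
  if (compute_now) {
    assert(pic_reg != NO_REG);
    fn->loads_emitted++;
    return pic_reg;
  }
  if (!fn->have_shared_base) {
    fn->shared_reg = DEFAULT_PIC_REG;
    fn->have_shared_base = 1;
    fn->loads_emitted++;
  }
  return fn->shared_reg;
}

/* Two shared requests followed by one forced request; returns the total
   number of loads and reports the register used by the forced one.  */
static int demo_loads(int *forced_reg_out)
{
  struct function_state fn = { 0, NO_REG, 0 };
  int a = require_pic_base(&fn, NO_REG, 0);
  int b = require_pic_base(&fn, NO_REG, 0);
  *forced_reg_out = require_pic_base(&fn, 3, 1);
  assert(a == b && a == DEFAULT_PIC_REG);
  return fn.loads_emitted;
}
```

The forced request costs one extra load in this model, but its result never
depends on a value computed earlier in the function.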

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to new require_pic_register prototype.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(stack_protect_combined_set): New insn_and_split pattern.
(stack_protect_set): New insn pattern.
(stack_protect_combined_test): New insn_and_split pattern.
(stack_protect_test): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* gcc.target/arm/pr85434.c: New test.

Testing: Bootstrapped on ARM in both Arm and Thumb-2 mode as well as on
AArch64. The testsuite shows no regression on these 3 variants, either
with default flags or with -fstack-protector-all.

Is this ok for trunk? If yes, would this be acceptable as a backport to
GCC 6, 7 and 8 provided that no regression is found?

Best regards,

Thomas
From d917d48c2005e46154383589f203d06f3c6167e0 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Tue, 8 May 2018 15:47:05 +0100
Subject: [PATCH] PR85434: Prevent spilling of stack protector guard's address
 on ARM


Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-10 Thread Thomas Preudhomme
Adding Jeff and Eric since the patch adds an RTL target hook.

Best regards,

Thomas

On Thu, 5 Jul 2018 at 15:48, Thomas Preudhomme
 wrote:
>
> In case of high register pressure in PIC mode, address of the stack
> protector's guard can be spilled on ARM targets as shown in PR85434,
> thus allowing an attacker to control what the canary would be compared
> against. ARM does lack stack_protect_set and stack_protect_test insn
> patterns, defining them does not help as the address is expanded
> regularly and the patterns only deal with the copy and test of the
> guard with the canary.
>
> This problem does not occur for x86 targets because the PIC access and
> the test can be done in the same instruction. Aarch64 is exempt too
> because PIC access insn pattern are mov of UNSPEC which prevents it from
> the second access in the epilogue being CSEd in cse_local pass with the
> first access in the prologue.
>
> The approach followed here is to create new "combined" set and test
> standard pattern names that take the unexpanded guard and do the set or
> test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> to hide the individual instructions being generated to the compiler and
> split the pattern into generic load, compare and branch instruction
> after register allocator, therefore avoiding any spilling. This is here
> implemented for the ARM targets. For targets not implementing these new
> standard pattern names, the existing stack_protect_set and
> stack_protect_test pattern names are used.
>
> To be able to split PIC access after register allocation, the functions
> had to be augmented to force a new PIC register load and to control
> which register it loads into. This is because sharing the PIC register
> between prologue and epilogue could lead to spilling due to CSE again
> which an attacker could use to control what the canary gets compared
> against.
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-07-05  Thomas Preud'homme  
>
> PR target/85434
> * target-insns.def (stack_protect_combined_set): Define new standard
> pattern name.
> (stack_protect_combined_test): Likewise.
> * cfgexpand.c (stack_protect_prologue): Try new
> stack_protect_combined_set pattern first.
> * function.c (stack_protect_epilogue): Try new
> stack_protect_combined_test pattern first.
> * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> parameters to control which register to use as PIC register and force
> reloading PIC register respectively.
> (legitimize_pic_address): Expose above new parameters in prototype and
> adapt recursive calls accordingly.
> (arm_legitimize_address): Adapt to new legitimize_pic_address
> prototype.
> (thumb_legitimize_address): Likewise.
> (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> change.
> * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> prototype change.
> (stack_protect_combined_set): New insn_and_split pattern.
> (stack_protect_set): New insn pattern.
> (stack_protect_combined_test): New insn_and_split pattern.
> (stack_protect_test): New insn pattern.
> * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> (UNSPEC_SP_TEST): Likewise.
> * doc/md.texi (stack_protect_combined_set): Document new standard
> pattern name.
> (stack_protect_set): Clarify that the operand for guard's address is
> legal.
> (stack_protect_combined_test): Document new standard pattern name.
> (stack_protect_test): Clarify that the operand for guard's address is
> legal.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-07-05  Thomas Preud'homme  
>
> PR target/85434
> * gcc.target/arm/pr85434.c: New test.
>
> Testing: Bootstrapped on ARM in both Arm and Thumb-2 mode as well as on
> Aarch64. Testsuite shows no regression on these 3 variants either both
> with default flags and with -fstack-protector-all.
>
> Is this ok for trunk? If yes, would this be acceptable as a backport to
> GCC 6, 7 and 8 provided that no regression is found?
>
> Best regards,
>
> Thomas

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-17 Thread Thomas Preudhomme
Fixed in attached patch. ChangeLog entries are unchanged:

*** gcc/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to new require_pic_register prototype.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(stack_protect_combined_set): New insn_and_split pattern.
(stack_protect_set): New insn pattern.
(stack_protect_combined_test): New insn_and_split pattern.
(stack_protect_test): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* gcc.target/arm/pr85434.c: New test.

Best regards,

Thomas
On Mon, 16 Jul 2018 at 22:46, Jeff Law  wrote:
>
> On 07/05/2018 08:48 AM, Thomas Preudhomme wrote:
> > In case of high register pressure in PIC mode, address of the stack
> > protector's guard can be spilled on ARM targets as shown in PR85434,
> > thus allowing an attacker to control what the canary would be compared
> > against. ARM does lack stack_protect_set and stack_protect_test insn
> > patterns, defining them does not help as the address is expanded
> > regularly and the patterns only deal with the copy and test of the
> > guard with the canary.
> >
> > This problem does not occur for x86 targets because the PIC access and
> > the test can be done in the same instruction. Aarch64 is exempt too
> > because PIC access insn pattern are mov of UNSPEC which prevents it from
> > the second access in the epilogue being CSEd in cse_local pass with the
> > first access in the prologue.
> >
> > The approach followed here is to create new "combined" set and test
> > standard pattern names that take the unexpanded guard and do the set or
> > test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> > to hide the individual instructions being generated to the compiler and
> > split the pattern into generic load, compare and branch instruction
> > after register allocator, therefore avoiding any spilling. This is here
> > implemented for the ARM targets. For targets not implementing these new
> > standard pattern names, the existing stack_protect_set and
> > stack_protect_test pattern names are used.
> >
> > To be able to split PIC access after register allocation, the functions
> > had to be augmented to force a new PIC register load and to control
> > which register it loads into. This is because sharing the PIC register
> > between prologue and epilogue could lead to spilling due to CSE again
> > which an attacker could use to control what the canary gets compared
> > against.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-07-05  Thomas Preud'homme  
> >
> > PR target/85434
> > * target-insns.def (stack_protect_combined_set): Define new standard
> > pattern name.
> > (stack_protect_combined_test): Likewise.
> > * cfgexpand.c (stack_protect_prologue): Try new
> > stack_protect_combined_set pattern first.
> > * function.c (stack_protect_epilogue): Try new
> > stack_protect_combined_test pattern first.
> > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > parameters to control which register to use as PIC register and force
> > reloading PIC register respectively.
> > (legitimize_pic_address): Expose above new parameters in prototype and
> &

Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-18 Thread Thomas Preudhomme
Hi Martin,

Why is this needed when -mfpu does not seem to need it for instance?
Regarding the patch:

> -print "Name(processor_type) Type(enum processor_type)"
> -print "Known ARM CPUs (for use with the -mcpu= and -mtune= options):\n"
> +print "Name(processor_type) Type(enum processor_type) ForceHelp"
> +print "Known ARM CPUs (for use with the -mtune= options):\n"

Why change the text beyond adding ForceHelp?

> +@item ForceHelp
> +This property is optional.  If present, enum values is printed
> +in @option{--help} output.
> +

are printed

Thanks,

Thomas
On Wed, 18 Jul 2018 at 16:50, Martin Liška  wrote:
>
> Hi.
>
> This introduces new ForceHelp option flag that helps to
> print valid option enum values that are not directly
> used as a type of an option.
>
> May I please ask ARM folks to test the patch?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2018-07-18  Martin Liska  
>
> PR driver/83193
> * config/arm/arm-tables.opt: Add ForceHelp flag for
> processor_type and arch_name enum types.
> * config/arm/parsecpu.awk: Likewise.
> * doc/options.texi: Document new flag ForceHelp.
> * opt-read.awk: Parse ForceHelp and set it in construction.
> * optc-gen.awk: Likewise.
> * opts.c (print_filtered_help): Handle force_help option.
> * opts.h (struct cl_enum): New field force_help.
> ---
>  gcc/config/arm/arm-tables.opt | 6 +++---
>  gcc/config/arm/parsecpu.awk   | 6 +++---
>  gcc/doc/options.texi  | 4 
>  gcc/opt-read.awk  | 3 +++
>  gcc/optc-gen.awk  | 3 ++-
>  gcc/opts.c| 3 ++-
>  gcc/opts.h| 3 +++
>  7 files changed, 20 insertions(+), 8 deletions(-)
>
>


Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-19 Thread Thomas Preudhomme
[Dropping Jeff Law from the list since he already commented on the
middle end parts]

Hi Kyrill,

On Thu, 19 Jul 2018 at 12:02, Kyrill Tkachov
 wrote:
>
> Hi Thomas,
>
> On 17/07/18 12:02, Thomas Preudhomme wrote:
> > Fixed in attached patch. ChangeLog entries are unchanged:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-07-05  Thomas Preud'homme 
> >
> > PR target/85434
> > * target-insns.def (stack_protect_combined_set): Define new standard
> > pattern name.
> > (stack_protect_combined_test): Likewise.
> > * cfgexpand.c (stack_protect_prologue): Try new
> > stack_protect_combined_set pattern first.
> > * function.c (stack_protect_epilogue): Try new
> > stack_protect_combined_test pattern first.
> > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > parameters to control which register to use as PIC register and force
> > reloading PIC register respectively.
> > (legitimize_pic_address): Expose above new parameters in prototype and
> > adapt recursive calls accordingly.
> > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > prototype.
> > (thumb_legitimize_address): Likewise.
> > (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > change.
> > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> > prototype change.
> > (stack_protect_combined_set): New insn_and_split pattern.
> > (stack_protect_set): New insn pattern.
> > (stack_protect_combined_test): New insn_and_split pattern.
> > (stack_protect_test): New insn pattern.
> > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> > (UNSPEC_SP_TEST): Likewise.
> > * doc/md.texi (stack_protect_combined_set): Document new standard
> > pattern name.
> > (stack_protect_set): Clarify that the operand for guard's address is
> > legal.
> > (stack_protect_combined_test): Document new standard pattern name.
> > (stack_protect_test): Clarify that the operand for guard's address is
> > legal.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-07-05  Thomas Preud'homme 
> >
> > PR target/85434
> > * gcc.target/arm/pr85434.c: New test.
> >
>
> Sorry for the delay. Some comments inline.
>
> Kyrill
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index d6e3c382085..d1a893ac56e 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -6105,8 +6105,18 @@ stack_protect_prologue (void)
>   {
> tree guard_decl = targetm.stack_protect_guard ();
> rtx x, y;
> +  struct expand_operand ops[2];
>
> x = expand_normal (crtl->stack_protect_guard);
> +  create_fixed_operand (&ops[0], x);
> +  create_fixed_operand (&ops[1], DECL_RTL (guard_decl));
> +  /* Allow the target to compute address of Y and copy it to X without
> + leaking Y into a register.  This combined address + copy pattern allows
> + the target to prevent spilling of any intermediate results by splitting
> + it after register allocator.  */
> +  if (maybe_expand_insn (targetm.code_for_stack_protect_combined_set, 2, ops))
> +return;
> +
> if (guard_decl)
>   y = expand_normal (guard_decl);
> else
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 8537262ce64..100844e659c 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -67,7 +67,7 @@ extern int const_ok_for_dimode_op (HOST_WIDE_INT, enum rtx_code);
>   extern int arm_split_constant (RTX_CODE, machine_mode, rtx,
>HOST_WIDE_INT, rtx, rtx, int);
>   extern int legitimate_pic_operand_p (rtx);
> -extern rtx legitimize_pic_address (rtx, machine_mode, rtx);
> +extern rtx legitimize_pic_address (rtx, machine_mode, rtx, rtx, bool);
>   extern rtx legitimize_tls_address (rtx, rtx);
>   extern bool arm_legitimate_address_p (machine_mode, rtx, bool);
>   extern int arm_legitimate_address_outer_p (machine_mode, rtx, RTX_CODE, int);
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index ec3abbcba9f..f4a970580c2 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -7369,20 +7369,26 @@ legitimate_pic_operand_p (rtx x)
>   }
>
>   /* Record that the current function needs a PIC register.  Initialize
> -   cfun->machine->pic_reg if we have not already done so.  */
> +   cfun->machine->pic_reg if we have not already done

Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-25 Thread Thomas Preudhomme
Hi Tamar,

On Mon, 23 Jul 2018 at 17:56, Tamar Christina  wrote:
>
> Hi All,
>
> My previous patch changed arm_can_change_mode_class to allow subregs of
> 64bit registers on arm big-endian.  However it seems that we can't do this
> because the data in 64 bit VFP registers are stored in little-endian order,
> even on big-endian.
>
> Allowing this change had a knock on effect that caused GCC's no-op detection
> to think that loading from the first lane on arm big-endian is a no-op.  This
> is because we can't describe the weird ordering we have on D registers on
> big-endian.
>
> The original issue comes from the fact that the code does
>
> ... foo (... bar)
> {
>   return bar;
> }
>
> The expansion of the return statement causes GCC to try to return the value in
> a register.  GCC will try to emit the move then, from MEM to REG (due to the
> SSA temporary).  It checks for a mov optab for this which isn't available and
> then tries to do the move in bits using emit_move_multi_word.
>
> emit_move_multi_word will split the move into sub parts, but then needs to get
> the sub parts and does this using subregs, but it's told it can't do subregs!
>
> The compiler is now stuck in an infinite loop.
>
> The way this is worked around in the back-end is that we have move patterns in
> neon.md that usually just force the register instead of checking with the
> back-end. This prevents emit_move_multi_word from being needed.  However the
> pattern for V4HF and V8HF were guarded by TARGET_NEON && TARGET_FP16.
>
> I don't believe the TARGET_FP16 guard to be needed, because the pattern
> doesn't actually generate code and requires another pattern for that, and a
> reg to reg move should always be possible anyway. So allowing the force to
> register here is safe and it allows the compiler to generate a correct error
> instead of ICEing in an infinite loop.

How about a subreg to subreg move? Doesn't that expand to more insns
(subreg to reg and reg to subreg)? Couldn't you improve the logic to
check that there is actually a mode change, so that when there isn't
(as when moving from one subreg to another) it just expands to a single
move?

Best regards,

Thomas

>
> This patch ensures gcc.target/arm/big-endian-subreg.c is fixed without
> introducing any regressions while fixing
>
> gcc.dg/vect/vect-nop-move.c execution test
> g++.dg/torture/vshuf-v2si.C   -O3 -g  execution test
> g++.dg/torture/vshuf-v4si.C   -O3 -g  execution test
> g++.dg/torture/vshuf-v8hi.C   -O3 -g  execution test
>
> Regtested on armeb-none-eabi and no regressions.
> Bootstrapped on arm-none-linux-gnueabihf and no issues.
>
>
> Ok for trunk?
>
> Thanks,
> Tamar
>
> gcc/
> 2018-07-23  Tamar Christina  
>
> PR target/84711
> * config/arm/arm.c (arm_can_change_mode_class): Disallow subreg.
> * config/arm/neon.md (movv4hf, movv8hf): Refactored to..
> (mov): ..this and enable unconditionally.
>
> --


Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-25 Thread Thomas Preudhomme
Hi Kyrill,

Using memory_operand worked. The issues I encountered when using it in
earlier versions of the patch must have been due to the missing test
on address_operand in the preparation statements, which I added later.
Please find an updated patch attached. The ChangeLog entry is as
follows:

*** gcc/ChangeLog ***

2018-07-05  Thomas Preud'homme  

* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.  Insert in the stream of insns if
possible.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to new require_pic_register prototype.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(stack_protect_combined_set): New insn_and_split pattern.
(stack_protect_set): New insn pattern.
(stack_protect_combined_test): New insn_and_split pattern.
(stack_protect_test): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme  

* gcc.target/arm/pr85434.c: New test.

Bootstrapped again for Arm and Thumb-2 and regtested with and without
-fstack-protector-all without any regression.

Best regards,

Thomas
On Thu, 19 Jul 2018 at 17:34, Thomas Preudhomme
 wrote:
>
> [Dropping Jeff Law from the list since he already commented on the
> middle end parts]
>
> Hi Kyrill,
>
> On Thu, 19 Jul 2018 at 12:02, Kyrill Tkachov
>  wrote:
> >
> > Hi Thomas,
> >
> > On 17/07/18 12:02, Thomas Preudhomme wrote:
> > > Fixed in attached patch. ChangeLog entries are unchanged:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-07-05  Thomas Preud'homme 
> > >
> > > PR target/85434
> > > * target-insns.def (stack_protect_combined_set): Define new standard
> > > pattern name.
> > > (stack_protect_combined_test): Likewise.
> > > * cfgexpand.c (stack_protect_prologue): Try new
> > > stack_protect_combined_set pattern first.
> > > * function.c (stack_protect_epilogue): Try new
> > > stack_protect_combined_test pattern first.
> > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > > parameters to control which register to use as PIC register and force
> > > reloading PIC register respectively.
> > > (legitimize_pic_address): Expose above new parameters in prototype and
> > > adapt recursive calls accordingly.
> > > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > > prototype.
> > > (thumb_legitimize_address): Likewise.
> > > (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> > > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > > change.
> > > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> > > prototype change.
> > > (stack_protect_combined_set): New insn_and_split pattern.
> > > (stack_protect_set): New insn pattern.
> > > (stack_protect_combined_test): New insn_and_split pattern.
> > > (stack_protect_test): New insn pattern.
> > > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> > > (UNSPEC_SP_TEST): Likewise.
> > > * doc/md.texi (stack_protect_combined_set): Document new standard
> > > pattern name.
> > > (stack_protect_set): Clarify that the operand for guard's address is
> > > legal.
> > > (stack_protect_combined_test): Document new standard pattern name.
> > > (stack_protect_test): Clarify that the operand

Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-26 Thread Thomas Preudhomme
Hi Tamar,

On Wed, 25 Jul 2018 at 16:28, Tamar Christina  wrote:
>
> Hi Thomas,
>
> Thanks for the review!
>
> > >
> > > I don't believe the TARGET_FP16 guard to be needed, because the
> > > pattern doesn't actually generate code and requires another pattern
> > > for that, and a reg to reg move should always be possible anyway. So
> > > allowing the force to register here is safe and it allows the compiler
> > > to generate a correct error instead of ICEing in an infinite loop.
> >
> > How about subreg to subreg move? Doesn't that expand to more insns
> > (subreg to reg and reg to subreg)? Couldn't you improve the logic to check
> > that there is actually a mode change so that if there isn't (like moving
> > from one subreg to another) just expand to a single move?
> >
>
> Yes, but that is not a new issue. My patch is simply removing the TARGET_FP16
> restrictions and merging two patterns that should be one using an iterator
> and nothing more.
>
> The redundant mov is already there and a different issue than the ICE I'm
> trying to fix.

It's there for movv4hf and movv8hf but your patch extends this problem
to movv2sf and movv4sf as well.

>
> None of the code inside the expander is needed at all, the code really only
> has an effect on subreg to subreg moves, as `force_reg` doesn't do anything
> when its argument is already a reg.
>
> The comment in the expander (which was already there) is wrong. The *reason*
> the ICE is fixed isn't because of the `force_reg`. It's because of the mere
> presence of the expander itself. The expander matches the standard mov$a
> optab and so this prevents emit_move_insn_1 from doing the move by subwords
> as it finds a pattern that's able to do the move.

Could you then fix the comment in your patch as well? I hadn't
understood the force_reg was not key here. You might want to update
the following sentence from your patch description if you are going to
include it in your commit message:

The way this is worked around in the back-end is that we have move patterns in
neon.md that usually just force the register instead of checking with the
back-end.

"The way this is worked around (..) that just force the register" is
what led me to believe the force_reg was important.

>
> The expander however always falls through and doesn’t stop RTL generation.
> You could remove all the code in there and have it properly match the
> *neon_mov instructions which will do the right thing later at code generation
> time and avoid the redundant moves.  My guess is the original `force_reg` was
> copied from the other patterns like `movti` and the existing `mov`. There it
> makes sense because the operands can be MEM or anything general_operand.
>
> However the redundant moves are a different problem than what I'm trying to
> solve here. So I think that's another patch which requires further testing.

I was just thinking of restricting when the force_reg happens but
if it can be removed completely I agree it should probably be done in
a separate patch.

Oh by the way, is there something that prevents those expanders from ever
being used with a memory operand? I ask because the GCC internals manual
contains the following passage about the mov standard pattern (bold marks
added by me):

"Second, these patterns are not used solely in the RTL generation pass. Even
the reload pass can generate move insns to copy values from stack slots into
temporary registers. When it does so, one of the operands is a hard register
and the other is an operand that can need to be reloaded into a register.
Therefore, when given such a pair of operands, the pattern must generate RTL
which needs no reloading and needs no temporary registers—no registers other
than the operands. For example, if you support the pattern with a
define_expand, then in such a case the define_expand *mustn’t call force_reg* or any
other such function which might generate new pseudo registers."
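
To illustrate the rule quoted above, here is a toy C model of an expander
that only copies its source into a new pseudo register when that is still
allowed. It is loosely modelled on the kind of guard GCC back ends place
around force_reg; operand, expander_state, prepare_source and
pseudos_created are hypothetical names, and the model says nothing about
what the neon.md expanders actually do.

```c
#include <assert.h>

enum operand_kind { OP_REG, OP_SUBREG, OP_MEM };

struct operand {
  enum operand_kind kind;
  int regno;               /* only meaningful for OP_REG in this model */
};

struct expander_state {
  int can_create_pseudo;   /* zero once register allocation has started */
  int next_pseudo;         /* number to give to the next new pseudo */
};

/* Copy the source into a fresh pseudo only when it is not already a plain
   register and new pseudos may still be created.  Otherwise return the
   operand untouched so that no temporary register is needed.  */
static struct operand prepare_source(struct expander_state *st,
                                     struct operand src)
{
  assert(st != 0);
  if (src.kind == OP_REG || !st->can_create_pseudo)
    return src;
  struct operand fresh = { OP_REG, st->next_pseudo++ };
  return fresh;
}

/* Returns how many pseudos one call to prepare_source created.  */
static int pseudos_created(int can_create_pseudo, enum operand_kind kind)
{
  struct expander_state st = { can_create_pseudo, 100 };
  struct operand src = { kind, 0 };
  (void) prepare_source(&st, src);
  return st.next_pseudo - 100;
}
```

In this model a memory source gets a new pseudo only while pseudos can
still be created. Once they cannot, the operand is passed through
unchanged, which is the behaviour the quoted passage asks for.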

Best regards,

Thomas

>
> Regards,
> Tamar
>
> > Best regards,
> >
> > Thomas
> >
> > >
> > > This patch ensures gcc.target/arm/big-endian-subreg.c is fixed without
> > > introducing any regressions while fixing
> > >
> > > gcc.dg/vect/vect-nop-move.c execution test
> > > g++.dg/torture/vshuf-v2si.C   -O3 -g  execution test
> > > g++.dg/torture/vshuf-v4si.C   -O3 -g  execution test
> > > g++.dg/torture/vshuf-v8hi.C   -O3 -g  execution test
> > >
> > > Regtested on armeb-none-eabi and no regressions.
> > > Bootstrapped on arm-none-linux-gnueabihf and no issues.
> > >
> > >
> > > Ok for trunk?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/
> > > 2018-07-23  Tamar Christina  
> > >
> > > PR target/84711
> > > * config/arm/arm.c (arm_can_change_mode_class): Disallow subreg.
> > > * config/arm/neon.md (movv4hf, movv8hf): Refactored to..
> > > (mov): ..this and enable unconditionally.
> > >
> > > --


Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-26 Thread Thomas Preudhomme
On Thu, 26 Jul 2018 at 12:01, Tamar Christina  wrote:
>
> Hi Thomas,
>
> > -Original Message-
> > From: Thomas Preudhomme 
> > Sent: Thursday, July 26, 2018 09:29
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; Ramana Radhakrishnan
> > ; Richard Earnshaw
> > ; ni...@redhat.com; Kyrylo Tkachov
> > 
> > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by
> > enabling the FP16 pattern unconditionally.
> >
> > Hi Tamar,
> >
> > On Wed, 25 Jul 2018 at 16:28, Tamar Christina 
> > wrote:
> > >
> > > Hi Thomas,
> > >
> > > Thanks for the review!
> > >
> > > > >
> > > > > I don't believe the TARGET_FP16 guard to be needed, because the
> > > > > pattern doesn't actually generate code and requires another
> > > > > pattern for that, and a reg to reg move should always be possible
> > > > > anyway. So allowing the force to register here is safe and it
> > > > > allows the compiler to generate a correct error instead of ICEing in 
> > > > > an
> > infinite loop.
> > > >
> > > > How about subreg to subreg move? Doesn't that expand to more insns
> > > > (subreg to reg and reg to subreg)? Couldn't you improve the logic to
> > > > check that there is actually a mode change so that if there isn't
> > > > (like moving from one subreg to another) just expand to a single move?
> > > >
> > >
> > > Yes, but that is not a new issue. My patch is simply removing the
> > > TARGET_FP16 restrictions and merging two patterns that should be one
> > using an iterator and nothing more.
> > >
> > > The redundant mov is already there and a different issue than the ICE I'm
> > trying to fix.
> >
> > It's there for movv4hf and movv8hf but your patch extends this problem to
> > movv2sf and movv4sf as well.
>
> I don't understand how it can. My patch just replaces one pattern for V4HF and
> one for V8HF with one pattern operating on VH.
>
> ;; Vector modes for 16-bit floating-point support.
> (define_mode_iterator VH [V8HF V4HF])
>
> My pattern has absolutely no effect on V2SF and V4SF or any of the other 
> modes.

My bad, I was looking at VF.

>
> >
> > >
> > > None of the code inside the expander is needed at all, the code really
> > > only has an effect on subreg to subreg moves, as `force_reg` doesn't do
> > anything when its argument is already a reg.
> > >
> > > The comment in the expander (which was already there) is wrong. The
> > > *reason* the ICE is fixed isn't because of the `force_reg`. It's
> > > because of the mere presence of the expander itself. The expander
> > > matches the standard mov$a optab and so this prevents
> > emit_move_insn_1 from doing the move by subwords as it finds a pattern
> > that's able to do the move.
> >
> > Could you then fix the comment in your patch as well? I hadn't understood
> > the force_reg was not key here. You might want to update the following
> > sentence from your patch description if you are going to include it in your
> > commit message:
>
> I'll update the comment in the patch. The cover letter won't be included in 
> the commit,
> But it does accurately reflect the current state of affairs. The patch will 
> do the force_reg,
> It's just not the reason it works.

Understood.

>
> >
> > The way this is worked around in the back-end is that we have move
> > patterns in neon.md that usually just force the register instead of checking
> > with the back-end.
> >
> > "The way this is worked around (..) that just force the register" is what 
> > led
> > me to believe the force_reg was important.
> >
> > >
> > > The expander however always falls through and doesn’t stop RTL
> > > generation. You could remove all the code in there and have it
> > > properly match the *neon_mov instructions which will do the right
> > > thing later at code generation time and avoid the redundant moves.  My
> > guess is the original `force_reg` was copied from the other patterns like
> > `movti` and the existing `mov`. There it makes sense because the
> > operands can be MEM or anything general_operand.
> > >
> > > However the redundant moves are a different problem than what I'm
> > > trying to solve here. So I think that's another patch which requires 
> &

Re: [PATCH, ARM] Fix gcc.c-torture/execute/loop-2b.c execution failure on cortex-m0

2016-03-03 Thread Thomas Preudhomme
On Friday 15 January 2016 12:45:04 Ramana Radhakrishnan wrote:
> On Wed, Dec 16, 2015 at 9:11 AM, Thomas Preud'homme
> 
>  wrote:
> > During reorg pass, thumb1_reorg () is tasked with rewriting mov rd, rn to
> > subs rd, rn, 0 to avoid a comparison against 0 instruction before doing a
> > conditional branch based on it. The actual avoiding of cmp is done in
> > cbranchsi4_insn instruction C output template. When the condition is met,
> > the source register (rn) is also propagated into the comparison in place
> > of the destination register (rd).
> > 
> > However, right now thumb1_reorg () only looks for a mov followed by a
> > cbranchsi but does not check whether the comparison in cbranchsi is
> > against the constant 0. This is not safe because a non-clobbering
> > instruction could exist between the mov and the comparison that modifies
> > the source register. This is what happens here with a post-increment of
> > the source register after the mov, which skips the &a[i] == &a[1]
> > comparison for iteration i == 1.
> > 
> > This patch fixes the issue by checking that the comparison is against
> > constant 0.
> > 
> > ChangeLog entry is as follows:
> > 
> > 
> > *** gcc/ChangeLog ***
> > 
> > 2015-12-07  Thomas Preud'homme  
> > 
> > * config/arm/arm.c (thumb1_reorg): Check that the comparison is
> > against the constant 0.
> 
> OK.
> 
> Ramana
> 
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index 42bf272..49c0a06 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -17195,7 +17195,7 @@ thumb1_reorg (void)
> > 
> >FOR_EACH_BB_FN (bb, cfun)
> >
> >  {
> >  
> >rtx dest, src;
> > 
> > -  rtx pat, op0, set = NULL;
> > +  rtx cmp, op0, op1, set = NULL;
> > 
> >rtx_insn *prev, *insn = BB_END (bb);
> >bool insn_clobbered = false;
> > 
> > @@ -17208,8 +17208,13 @@ thumb1_reorg (void)
> > 
> > continue;
> >
> >/* Get the register with which we are comparing.  */
> > 
> > -  pat = PATTERN (insn);
> > -  op0 = XEXP (XEXP (SET_SRC (pat), 0), 0);
> > +  cmp = XEXP (SET_SRC (PATTERN (insn)), 0);
> > +  op0 = XEXP (cmp, 0);
> > +  op1 = XEXP (cmp, 1);
> > +
> > +  /* Check that comparison is against ZERO.  */
> > +  if (!CONST_INT_P (op1) || INTVAL (op1) != 0)
> > +   continue;
> > 
> >/* Find the first flag setting insn before INSN in basic block BB. 
> >*/
> >gcc_assert (insn != BB_HEAD (bb));
> > 
> > @@ -17249,7 +17254,7 @@ thumb1_reorg (void)
> > 
> >   PATTERN (prev) = gen_rtx_SET (dest, src);
> >   INSN_CODE (prev) = -1;
> >   /* Set test register in INSN to dest.  */
> > 
> > - XEXP (XEXP (SET_SRC (pat), 0), 0) = copy_rtx (dest);
> > + XEXP (cmp, 0) = copy_rtx (dest);
> > 
> >   INSN_CODE (insn) = -1;
> > 
> > }
> >  
> >  }
> > 
> > Testsuite shows no regression when run for arm-none-eabi with
> > -mcpu=cortex-m0 -mthumb

The patch applies cleanly on gcc-5-branch and also shows no regression when run 
for arm-none-eabi with -mcpu=cortex-m0 -mthumb. Is it OK to backport?

Best regards,

Thomas


Re: [PATCH, ARM] Fix gcc.c-torture/execute/loop-2b.c execution failure on cortex-m0

2016-03-03 Thread Thomas Preudhomme
On Thursday 03 March 2016 09:44:31 Ramana Radhakrishnan wrote:
> On Thu, Mar 3, 2016 at 9:40 AM, Thomas Preudhomme
> 
>  wrote:
> > 
> > The patch applies cleanly on gcc-5-branch and also show no regression when
> > run for arm-none-eabi with -mcpu=cortex-m0 -mthumb. Is it ok to backport?
> This deserves a testcase.

The original patch did not have one because it fixes a failure of an 
existing testcase (loop-2b.c). However, that test passes on GCC 5 due to a 
difference in code generation. I'm currently trying to come up with a testcase 
and will get back to you.

Best regards,

Thomas


Re: [PATCH, ARM] Fix gcc.c-torture/execute/loop-2b.c execution failure on cortex-m0

2016-03-03 Thread Thomas Preudhomme
On Thursday 03 March 2016 15:32:27 Thomas Preudhomme wrote:
> On Thursday 03 March 2016 09:44:31 Ramana Radhakrishnan wrote:
> > On Thu, Mar 3, 2016 at 9:40 AM, Thomas Preudhomme
> > 
> >  wrote:
> > > 
> > > The patch applies cleanly on gcc-5-branch and also show no regression
> > > when
> > > run for arm-none-eabi with -mcpu=cortex-m0 -mthumb. Is it ok to
> > > backport?
> > 
> > This deserves a testcase.
> 
> The original patch did not have one because it fixes a failure of an
> existing testcase (loop-2b.c). However, that test passes on GCC 5 due to a
> difference in code generation. I'm currently trying to come up with a
> testcase and will get back to you.

Sadly I did not manage to come up with a testcase that works on GCC 5. One 
needs to reproduce a sequence of the form:

(set B A)
(insn clobbering A that is not a set, i.e. a store with post-increment)
(conditional branch between A and something else)

In that case, thumb1_reorg changes the set into (set B (minus A 0)), which is 
safe, but also replaces A by B in the conditional insn, which is unsafe in the 
above situation. The problem I am having is making GCC generate a move 
instruction, because it is always optimized away. Using a local register 
variable is not an option because the move should be between regular registers.
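
To illustrate why that replacement is unsafe, here is a small model of the
three-insn sequence in plain C. It is only a sketch and not a GCC testcase:
the two functions and their parameters are hypothetical, with ordinary
variables standing in for registers A and B.

```c
#include <stdbool.h>

/* Original sequence:
     (set B A)
     A is modified by a side effect (e.g. a post-increment)
     conditional branch comparing A with something else.  */
bool
branch_original (int a_init, int other)
{
  int a = a_init;
  int b = a;            /* (set B A) */
  a += 1;               /* side effect on A that is not a set */
  (void) b;
  return a == other;    /* the branch tests the updated A */
}

/* Sequence after the unsafe rewrite:
     (set B (minus A 0))
     A is modified by the same side effect
     conditional branch comparing B with something else.  */
bool
branch_rewritten (int a_init, int other)
{
  int a = a_init;
  int b = a - 0;        /* (set B (minus A 0)) */
  a += 1;               /* B still holds the old value of A */
  (void) a;
  return b == other;    /* the branch now tests the stale copy */
}
```

Starting from A = 0 and comparing against 1, the original sequence takes the
branch while the rewritten one does not, so the two disagree as soon as A
changes between the move and the branch.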

Any ideas on how to construct a testcase?

Best regards,

Thomas


[PATCH] Clarify source of tm.texi to copy for GFDL grant

2018-08-09 Thread Thomas Preudhomme
When tm.texi.in is updated in the source tree, the following message
gets displayed:

Verify that you have permission to grant a GFDL license for all
new text in tm.texi, then copy it to /gcc/doc/tm.texi.

Having been confused several times by that message, as have some colleagues,
as to which tm.texi to copy, I think it would be clearer to indicate the
absolute path of the source as well. This patch achieves that.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-08-09  Thomas Preud'homme  

* Makefile.in: Clarify which tm.texi to copy over to assert the
right to grant a GFDL license for all.

Testing: Built GCC with a change in tm.texi.in and copied by
copy/pasting the source and destination path from the resulting message.
Second build then succeeded.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index e7d818d174c..d8d2b885f6d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2504,7 +2504,7 @@ s-tm-texi: build/genhooks$(build_exeext) $(srcdir)/doc/tm.texi.in
 	else \
 	  echo >&2 ; \
 	  echo Verify that you have permission to grant a GFDL license for all >&2 ; \
-	  echo new text in tm.texi, then copy it to $(srcdir)/doc/tm.texi. >&2 ; \
+	  echo new text in $(objdir)/tm.texi, then copy it to $(srcdir)/doc/tm.texi. >&2 ; \
 	  false; \
 	fi
 
-- 
2.18.0



Re: [PATCH][GCC][AArch64] Limit movmem copies to TImode copies.

2018-08-13 Thread Thomas Preudhomme
Hi Tamar,

Thanks for your patch.

Just one comment about your ChangeLog entry for the testsuite change:
shouldn't it mention that it is a new testcase? The patch you attached
seems to create the file.

Best regards,

Thomas

On Mon, 13 Aug 2018 at 10:33, Tamar Christina 
wrote:

> Hi All,
>
> On AArch64 we have integer modes larger than TImode, and while we can
> generate
> moves for these they're not as efficient.
>
> So instead make sure we limit the maximum we can copy to TImode.  This
> means
> copying a 16 byte struct will issue 1 TImode copy, which will be done
> using a
> single STP as we expect but a CImode sized copy won't issue CImode
> operations.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu and no issues.
> Crosstested aarch64_be-none-elf and no issues.
>
> Ok for trunk?
>
> Thanks,
> Tamar
>
> gcc/
> 2018-08-13  Tamar Christina  
>
> * config/aarch64/aarch64.c (aarch64_expand_movmem): Set TImode max.
>
> gcc/testsuite/
> 2018-08-13  Tamar Christina  
>
> * gcc.target/aarch64/large_struct_copy_2.c: Add assembler scan.
>
> --
>
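
For reference, the kind of source the quoted description is about can be
sketched as two plain struct assignments, one of 16 bytes (TImode sized) and
one of 48 bytes (CImode sized). This is an illustration only, not the
testcase from the patch; the type and function names are made up, and which
instructions the compiler emits for each copy depends on the GCC version and
options, so nothing is asserted about the generated code here.

```c
/* A 16-byte aggregate, i.e. the size of one TImode copy.  */
struct block16 { unsigned char bytes[16]; };

/* A 48-byte aggregate, i.e. the size of one CImode value.  */
struct block48 { unsigned char bytes[48]; };

/* Whole-struct assignment of 16 bytes.  */
void
copy_block16 (struct block16 *dst, const struct block16 *src)
{
  *dst = *src;
}

/* Whole-struct assignment of 48 bytes; with the copy size capped at
   TImode this is expected to be split into smaller pieces.  */
void
copy_block48 (struct block48 *dst, const struct block48 *src)
{
  *dst = *src;
}
```

Compiling this for an AArch64 target with optimisation enabled and reading
the assembly is one way to check how each copy is expanded.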


Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-08-29 Thread Thomas Preudhomme
Resend hopefully without HTML this time.

On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme
 wrote:
>
> Hi,
>
> I've reworked the patch fixing PR85434 (spilling of stack protector guard's 
> address on ARM) to address the testsuite regression on powerpc and x86 as 
> well as glibc testsuite regression on ARM. Issues were due to unconditionally 
> attempting to generate the new patterns. The code now tests if there is a 
> pattern for them for the target before generating them. In the ARM side of 
> the patch, I've also added a more specific predicate for the new patterns. 
> The new patch is found below.
>
>
> In case of high register pressure in PIC mode, address of the stack
> protector's guard can be spilled on ARM targets as shown in PR85434,
> thus allowing an attacker to control what the canary would be compared
> against. ARM does lack stack_protect_set and stack_protect_test insn
> patterns, defining them does not help as the address is expanded
> regularly and the patterns only deal with the copy and test of the
> guard with the canary.
>
> This problem does not occur for x86 targets because the PIC access and
> the test can be done in the same instruction. Aarch64 is exempt too
> because PIC access insn pattern are mov of UNSPEC which prevents it from
> the second access in the epilogue being CSEd in cse_local pass with the
> first access in the prologue.
>
> The approach followed here is to create new "combined" set and test
> standard pattern names that take the unexpanded guard and do the set or
> test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> to hide the individual instructions being generated to the compiler and
> split the pattern into generic load, compare and branch instruction
> after register allocator, therefore avoiding any spilling. This is here
> implemented for the ARM targets. For targets not implementing these new
> standard pattern names, the existing stack_protect_set and
> stack_protect_test pattern names are used.
>
> To be able to split PIC access after register allocation, the functions
> had to be augmented to force a new PIC register load and to control
> which register it loads into. This is because sharing the PIC register
> between prologue and epilogue could lead to spilling due to CSE again
> which an attacker could use to control what the canary gets compared
> against.
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-08-09  Thomas Preud'homme  
>
> * target-insns.def (stack_protect_combined_set): Define new standard
> pattern name.
> (stack_protect_combined_test): Likewise.
> * cfgexpand.c (stack_protect_prologue): Try new
> stack_protect_combined_set pattern first.
> * function.c (stack_protect_epilogue): Try new
> stack_protect_combined_test pattern first.
> * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> parameters to control which register to use as PIC register and force
> reloading PIC register respectively.  Insert in the stream of insns if
> possible.
> (legitimize_pic_address): Expose above new parameters in prototype and
> adapt recursive calls accordingly.
> (arm_legitimize_address): Adapt to new legitimize_pic_address
> prototype.
> (thumb_legitimize_address): Likewise.
> (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> change.
> * config/arm/predicates.md (guard_operand): New predicate.
> * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> prototype change.
> (stack_protect_combined_set): New insn_and_split pattern.
> (stack_protect_set): New insn pattern.
> (stack_protect_combined_test): New insn_and_split pattern.
> (stack_protect_test): New insn pattern.
> * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> (UNSPEC_SP_TEST): Likewise.
> * doc/md.texi (stack_protect_combined_set): Document new standard
> pattern name.
> (stack_protect_set): Clarify that the operand for guard's address is
> legal.
> (stack_protect_combined_test): Document new standard pattern name.
> (stack_protect_test): Clarify that the operand for guard's address is
> legal.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-07-05  Thomas Preud'homme  
>
> * gcc.target/arm/pr85434.c: New test.
>
>
> Testing:
>
> native x86_64: bootstrap + testsuite -> no regression, can see failures with 
> previous version of patch but not with new version
> native powerpc64: bootstrap + testsuite -> no regression, can see failur

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-08-29 Thread Thomas Preudhomme
I forgot to mention another important change in the ARM backend:

The expanders were causing one indirection too many, which was what
caused the test failure in glibc. The new expander code skips the
creation of a move from the memory reference of the guard's address to
a register, since this is done in the insns themselves. I think that during
the initial implementation of the first version of the patch I had
issues with loading the address and used that move to load it. As
can be seen from the absence of regressions in the runtime stack
protector test in glibc, this is now working properly, which I also
confirmed by manual inspection of the code.
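
As a reminder of the hazard the whole series addresses (the guard's address
being reloaded from a stack slot an attacker can overwrite), here is a toy
model in plain C. It is a sketch only: every toy_* name is hypothetical, the
frame layout is invented, and it does not reflect how the ARM backend
actually lays out or checks the canary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy frame: the saved canary and, in the vulnerable layout, a spilled
   copy of the guard's address, both in attacker-writable stack memory.  */
struct toy_frame {
  const uintptr_t *spilled_guard_addr;
  uintptr_t canary;
};

/* Toy guard value living outside the frame.  */
static const uintptr_t toy_guard = 0x5AFEC0DEu;

/* Prologue: copy the guard into the frame and, in this model, also
   spill the guard's address next to it.  */
void
toy_prologue (struct toy_frame *f)
{
  f->spilled_guard_addr = &toy_guard;
  f->canary = toy_guard;
}

/* Vulnerable epilogue check: reloads the guard's address from the frame.  */
bool
toy_check_spilled (const struct toy_frame *f)
{
  return f->canary == *f->spilled_guard_addr;
}

/* Hardened epilogue check: recomputes the guard's address instead.  */
bool
toy_check_recomputed (const struct toy_frame *f)
{
  return f->canary == toy_guard;
}

/* Simulated overflow: the attacker overwrites both the canary and the
   spilled address so that they agree with each other.  */
void
toy_overflow (struct toy_frame *f, const uintptr_t *fake)
{
  f->canary = *fake;
  f->spilled_guard_addr = fake;
}
```

After toy_overflow, the check that trusts the spilled address still passes,
while the check that recomputes the address detects the corruption.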

I've attached the interdiff from previous version for reference.

Best regards,

Thomas

Re: [PATCH, GCC/LTO, ping3] Fix PR69866: LTO with def for weak alias in regular object file

2017-06-06 Thread Thomas Preudhomme

On 09/05/17 23:36, Jan Hubicka wrote:

Ping?

Sorry for late reply


My turn to apologize now.


Hi,

This patch fixes an assert failure when linking one LTOed object file
having a weak alias with a regular object file containing a strong
definition for that same symbol. The patch is twofold:

+ do not add an alias to a partition if it is external
+ do not declare (.globl) an alias if it is external


Adding external aliases to partitions is important to keep the information
that two symbols are the same.


So how about simply relaxing the assert then? Right now it trips for any 
external symbol, even external aliases.


How about the following:


ChangeLog entries are as follows:

*** gcc/lto/ChangeLog ***

2017-06-02  Thomas Preud'homme  

* lto/lto-partition.c (add_symbol_to_partition_1): Change assert to
allow external aliases to be added.

*** gcc/ChangeLog ***

2017-03-01  Thomas Preud'homme  

* cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Do not
declare external aliases.

*** gcc/testsuite/ChangeLog ***

2017-02-28  Thomas Preud'homme  

* gcc.dg/lto/pr69866_0.c: New test.
* gcc.dg/lto/pr69866_1.c: Likewise.

Bootstrapped with LTO on AArch64 and ARM; the testsuite does not show any 
regression on either of these architectures.


Is this ok for trunk?

Best regards,

Thomas



The second part makes sense to me.  What breaks when you drop the first
change?

Honza


ChangeLog entries are as follows:

*** gcc/lto/ChangeLog ***

2017-03-01  Thomas Preud'homme  

   PR lto/69866
   * lto/lto-partition.c (add_symbol_to_partition_1): Do not add external
   aliases to partition.

*** gcc/ChangeLog ***

2017-03-01  Thomas Preud'homme  

   PR lto/69866
   * cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Do not
   declare external aliases.

*** gcc/testsuite/ChangeLog ***

2017-02-28  Thomas Preud'homme  

   PR lto/69866
   * gcc.dg/lto/pr69866_0.c: New test.
   * gcc.dg/lto/pr69866_1.c: Likewise.


Testing: Testsuite shows no regression when targeting Cortex-M3 with an
arm-none-eabi GCC cross-compiler, neither does it show any regression with
native LTO-bootstrapped x86-64_linux-gnu and aarch64-linux-gnu compilers.

Is this ok for stage4?

Best regards,

Thomas

On 31/03/17 18:07, Richard Biener wrote:

On March 31, 2017 5:23:03 PM GMT+02:00, Jeff Law  wrote:

On 03/16/2017 08:05 AM, Thomas Preudhomme wrote:

Ping?

Is this ok for stage4?

Given the lack of response from Richi, I'd suggest deferring to stage1.


Honza needs to review this, I have too little knowledge here.

Richard.


jeff





diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index c82a88a599ca61b068dd9783d2a6158163809b37..580500ff922b8546d33119261a2455235edbf16d 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1972,7 +1972,7 @@ cgraph_node::assemble_thunks_and_aliases (void)
   FOR_EACH_ALIAS (this, ref)
 {
   cgraph_node *alias = dyn_cast <cgraph_node *> (ref->referring);
-  if (!alias->transparent_alias)
+  if (!alias->transparent_alias && !DECL_EXTERNAL (alias->decl))
{
  bool saved_written = TREE_ASM_WRITTEN (decl);

diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index e27d0d1690c1fcfb39e2fac03ce0f4154031fc7c..f44fd435ed075a27e373bdfdf0464eb06e1731ef 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -178,7 +178,8 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
   /* Add all aliases associated with the symbol.  */

   FOR_EACH_ALIAS (node, ref)
-if (!ref->referring->transparent_alias)
+if (!ref->referring->transparent_alias
+   && ref->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
   add_symbol_to_partition_1 (part, ref->referring);
 else
   {
@@ -189,7 +190,8 @@ add_symbol_to_partition_1 (ltrans_partition part, 
symtab_node *node)
  {
/* Nested transparent aliases are not permitted.  */
gcc_checking_assert (!ref2->referring->transparent_alias);
-   add_symbol_to_partition_1 (part, ref2->referring);
+   if (ref2->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
+ add_symbol_to_partition_1 (part, ref2->referring);
  }
   }

diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_0.c 
b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
new file mode 100644
index 
..f49ef8d4c1da7a21d1bfb5409d647bd18141595b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
@@ -0,0 +1,13 @@
+/* { dg-lto-do link } */
+
+int _umh(int i)
+{
+  return i+1;
+}
+
+int weaks(int i) __attribute__((weak, alias("_umh")));
+
+int main()
+{
+  return weaks(10);
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_1.c 
b/gcc/testsuite/gcc.dg/lto/pr69866_1.c
new file mode 100644
index 
..3a14f850eefaffbf659ce4642adef7900330f4ed
--- /dev

[PATCH, GCC/testsuite/ARM] Allow arm_arch_*_ok to test several macros

2017-06-07 Thread Thomas Preudhomme

Hi,

The general arm_arch_*_ok procedures check architecture availability by
substituting macros inside a defined preprocessor operator. This limits
them to checking the definition of a single macro and forces ARMv7VE to
be special cased.

This patch takes advantage of the fact that architecture macros, when
defined, are not null to allow expressing architecture availability by
a boolean operation of possibly several macros. It then takes advantage
of this to deal with ARMv7VE in the general case.  The patch also adds a
comment to make it clear that check_effective_target_arm_arch_FUNC_ok
does not work as intended for architecture extensions (eg. ARMv8.1-A)
due to lack of extension-specific macro similar to __ARM_ARCH_*__.
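
To illustrate the change from `#if !defined (DEF)` to `#if !(DEFS)`, here is a
minimal Python sketch of how a preprocessor `#if` treats identifiers. It is
only a model and not part of the patch: the helper name `cpp_if` and the two
macro sets are assumptions made for the example, and only the `&&`, `||` and
`!` operators are handled.

```python
import re


def cpp_if(expr, macros):
    """Model `#if expr`: an identifier that is not defined evaluates to 0."""
    # Replace every identifier by its value, or by 0 when it is undefined.
    substituted = re.sub(
        r"[A-Za-z_]\w*", lambda m: str(macros.get(m.group(0), 0)), expr
    )
    # Translate the C operators used here into their Python spelling.
    pythonic = (
        substituted.replace("&&", " and ").replace("||", " or ").replace("!", " not ")
    )
    return bool(eval(pythonic.strip()))


# Hypothetical macro sets, assuming each macro expands to 1 when defined.
ARMV7VE = {"__ARM_ARCH_7A__": 1, "__ARM_FEATURE_IDIV": 1}
ARMV7A = {"__ARM_ARCH_7A__": 1}

DEFS = "__ARM_ARCH_7A__ && __ARM_FEATURE_IDIV"
print(cpp_if(DEFS, ARMV7VE))  # both macros are non-null
print(cpp_if(DEFS, ARMV7A))   # __ARM_FEATURE_IDIV counts as 0
```

A macro defined with an empty or zero value would make `#if` behave
differently from `defined`, which is why the approach relies on the
architecture macros being non-null when defined.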

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Test for null definitions instead of them being undefined.  Add entry
for ARMv7VE.  Reindent entry for ARMv8-M Baseline.  Add comment warning
about using the effective target for architecture extension.
(check_effective_target_arm_arch_v7ve_ok): Remove.
(add_options_for_arm_arch_v7ve): Likewise.

Testing:
- gcc.target/arm/atomic_loaddi_10.c passes with the patch for armv7ve
  but is marked unsupported for armv7-a
- verified in the logs that -march=armv7ve is correctly added when
  running gcc.target/arm/ftest-armv7ve-arm.c

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ded6383cc1f9a1489cd83e1dace0c2fc48e252c3..e83ec757ae3c0dd7c3cad19cfd5d9577547d18a5 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3775,12 +3775,13 @@ proc check_effective_target_arm_fp16_hw { } {
 # can be selected and a routine to give the flags to select that architecture
 # Note: Extra flags may be added to disable options from newer compilers
 # (Thumb in particular - but others may be added in the future).
-# -march=armv7ve is special and is handled explicitly after this loop because
-# it needs more than one predefine check to identify.
+# Warning: Do not use check_effective_target_arm_arch_*_ok for architecture
+# extension (eg. ARMv8.1-A) since there is no macro defined for them.  See
+# how only __ARM_ARCH_8A__ is checked for ARMv8.1-A.
 # Usage: /* { dg-require-effective-target arm_arch_v5_ok } */
 #/* { dg-add-options arm_arch_v5 } */
 #	 /* { dg-require-effective-target arm_arch_v5_multilib } */
-foreach { armfunc armflag armdef } {
+foreach { armfunc armflag armdefs } {
 	v4 "-march=armv4 -marm" __ARM_ARCH_4__
 	v4t "-march=armv4t" __ARM_ARCH_4T__
 	v5 "-march=armv5 -marm" __ARM_ARCH_5__
@@ -3795,20 +3796,23 @@ foreach { armfunc armflag armdef } {
 	v7r "-march=armv7-r" __ARM_ARCH_7R__
 	v7m "-march=armv7-m -mthumb" __ARM_ARCH_7M__
 	v7em "-march=armv7e-m -mthumb" __ARM_ARCH_7EM__
+	v7ve "-march=armv7ve -marm"
+		"__ARM_ARCH_7A__ && __ARM_FEATURE_IDIV"
 	v8a "-march=armv8-a" __ARM_ARCH_8A__
 	v8_1a "-march=armv8.1a" __ARM_ARCH_8A__
 	v8_2a "-march=armv8.2a" __ARM_ARCH_8A__
-	v8m_base "-march=armv8-m.base -mthumb -mfloat-abi=soft" __ARM_ARCH_8M_BASE__
+	v8m_base "-march=armv8-m.base -mthumb -mfloat-abi=soft"
+		__ARM_ARCH_8M_BASE__
 	v8m_main "-march=armv8-m.main -mthumb" __ARM_ARCH_8M_MAIN__ } {
-eval [string map [list FUNC $armfunc FLAG $armflag DEF $armdef ] {
+eval [string map [list FUNC $armfunc FLAG $armflag DEFS $armdefs ] {
 	proc check_effective_target_arm_arch_FUNC_ok { } {
 	if { [ string match "*-marm*" "FLAG" ] &&
 		![check_effective_target_arm_arm_ok] } {
 		return 0
 	}
 	return [check_no_compiler_messages arm_arch_FUNC_ok assembly {
-		#if !defined (DEF)
-		#error !DEF
+		#if !(DEFS)
+		#error !(DEFS)
 		#endif
 	} "FLAG" ]
 	}
@@ -3829,26 +3833,6 @@ foreach { armfunc armflag armdef } {
 }]
 }
 
-# Same functions as above but for -march=armv7ve.  To uniquely identify
-# -march=armv7ve we need to check for __ARM_ARCH_7A__ as well as
-# __ARM_FEATURE_IDIV otherwise it aliases with armv7-a.
-
-proc check_effective_target_arm_arch_v7ve_ok { } {
-  if { [ string match "*-marm*" "-march=armv7ve" ] &&
-	![check_effective_target_arm_arm_ok] } {
-		return 0
-}
-  return [check_no_compiler_messages arm_arch_v7ve_ok assembly {
-  #if !defined (__ARM_ARCH_7A__) || !defined (__ARM_FEATURE_IDIV)
-  #error !armv7ve
-  #endif
-  } "-march=armv7ve" ]
-}
-
-proc add_options_for_arm_arch_v7ve { flags } {
-return "$flags -march=armv7ve"
-}
-
 # Return 1 if GCC was configured with --with-mode=
 proc check_effective_target_default_mode { } {
 


Re: [PATCH, GCC/LTO, ping] Fix PR69866: LTO with def for weak alias in regular object file

2017-06-12 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On 06/06/17 11:12, Thomas Preudhomme wrote:

On 09/05/17 23:36, Jan Hubicka wrote:

Ping?

Sorry for late reply


My turn to apologize now.


Hi,

This patch fixes an assert failure when linking one LTOed object file
having a weak alias with a regular object file containing a strong
definition for that same symbol. The patch is twofold:

+ do not add an alias to a partition if it is external
+ do not declare (.globl) an alias if it is external


Adding external aliases to partitions is important to keep the information
that two symbols are the same.


So how about simply relaxing the assert then? Right now it trips for any
external symbol, even external aliases.

How about the following:


ChangeLog entries are as follows:

*** gcc/lto/ChangeLog ***

2017-06-02  Thomas Preud'homme  

* lto/lto-partition.c (add_symbol_to_partition_1): Change assert to
allow external aliases to be added.

*** gcc/ChangeLog ***

2017-03-01  Thomas Preud'homme  

* cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Do not
declare external aliases.

*** gcc/testsuite/ChangeLog ***

2017-02-28  Thomas Preud'homme  

* gcc.dg/lto/pr69866_0.c: New test.
* gcc.dg/lto/pr69866_1.c: Likewise.

Bootstrapped with LTO on AArch64 and ARM; the testsuite on both of these
architectures does not show any regression.

Is this ok for trunk?

Best regards,

Thomas



The second part makes sense to me.  What breaks when you drop the first
change?

Honza


ChangeLog entries are as follows:

*** gcc/lto/ChangeLog ***

2017-03-01  Thomas Preud'homme  

   PR lto/69866
   * lto/lto-partition.c (add_symbol_to_partition_1): Do not add external
   aliases to partition.

*** gcc/ChangeLog ***

2017-03-01  Thomas Preud'homme  

   PR lto/69866
   * cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Do not
   declare external aliases.

*** gcc/testsuite/ChangeLog ***

2017-02-28  Thomas Preud'homme  

   PR lto/69866
   * gcc.dg/lto/pr69866_0.c: New test.
   * gcc.dg/lto/pr69866_1.c: Likewise.


Testing: Testsuite shows no regression when targeting Cortex-M3 with an
arm-none-eabi GCC cross-compiler, neither does it show any regression with
native LTO-bootstrapped x86_64-linux-gnu and aarch64-linux-gnu compilers.

Is this ok for stage4?

Best regards,

Thomas

On 31/03/17 18:07, Richard Biener wrote:

On March 31, 2017 5:23:03 PM GMT+02:00, Jeff Law  wrote:

On 03/16/2017 08:05 AM, Thomas Preudhomme wrote:

Ping?

Is this ok for stage4?

Given the lack of response from Richi, I'd suggest deferring to stage1.


Honza needs to review this, I have too little knowledge here.

Richard.


jeff





diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index
c82a88a599ca61b068dd9783d2a6158163809b37..580500ff922b8546d33119261a2455235edbf16d
100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1972,7 +1972,7 @@ cgraph_node::assemble_thunks_and_aliases (void)
   FOR_EACH_ALIAS (this, ref)
 {
   cgraph_node *alias = dyn_cast <cgraph_node *> (ref->referring);
-  if (!alias->transparent_alias)
+  if (!alias->transparent_alias && !DECL_EXTERNAL (alias->decl))
 {
   bool saved_written = TREE_ASM_WRITTEN (decl);

diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index
e27d0d1690c1fcfb39e2fac03ce0f4154031fc7c..f44fd435ed075a27e373bdfdf0464eb06e1731ef
100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -178,7 +178,8 @@ add_symbol_to_partition_1 (ltrans_partition part,
symtab_node *node)
   /* Add all aliases associated with the symbol.  */

   FOR_EACH_ALIAS (node, ref)
-if (!ref->referring->transparent_alias)
+if (!ref->referring->transparent_alias
+&& ref->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
   add_symbol_to_partition_1 (part, ref->referring);
 else
   {
@@ -189,7 +190,8 @@ add_symbol_to_partition_1 (ltrans_partition part,
symtab_node *node)
   {
 /* Nested transparent aliases are not permitted.  */
 gcc_checking_assert (!ref2->referring->transparent_alias);
-add_symbol_to_partition_1 (part, ref2->referring);
+if (ref2->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
+  add_symbol_to_partition_1 (part, ref2->referring);
   }
   }

diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_0.c
b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
new file mode 100644
index
..f49ef8d4c1da7a21d1bfb5409d647bd18141595b

--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
@@ -0,0 +1,13 @@
+/* { dg-lto-do link } */
+
+int _umh(int i)
+{
+  return i+1;
+}
+
+int weaks(int i) __attribute__((weak, alias("_umh")));
+
+int main()
+{
+  return weaks(10);
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_1.c
b/gcc/testsuite/gcc.dg/lto/pr69866_1.c
new file mode 100644
index
..3a14f850eefaffbf659ce4642adef7900330f4ed

--

[PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-13 Thread Thomas Preudhomme

Hi,

Conditions checked for ARM targets in vector-related effective targets
are inconsistent:

* sometimes arm*-*-* is checked
* sometimes Neon is checked
* sometimes arm_neon_ok and sometimes arm_neon is used for neon check
* sometimes check_effective_target_* is used, sometimes is-effective-target

This patch consolidates all of these checks into using is-effective-target
arm_neon and when little endian was checked, the check is kept.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_vect_int): Replace
current ARM check by ARM NEON's availability check.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_whole_vector_shift): Likewise.
(check_effective_target_vect_bswap): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_vect_float): Likewise.
(check_effective_target_vect_perm): Likewise.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi_pattern): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_shift): Likewise.
(check_effective_target_vect_extract_even_odd): Likewise.
(check_effective_target_vect_interleave): Likewise.
(check_effective_target_vect_multiple_sizes): Likewise.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_max_reduc): Likewise.

Testing: Testsuite shows no regression when targeting ARMv7-A with 
-mfpu=neon-fpv4 and -mfloat-abi=hard or when targeting Cortex-M3 with default 
FPU and float ABI (soft).


Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ded6383cc1f9a1489cd83e1dace0c2fc48e252c3..d7367999fc9df8cf7c654fbb03a059b309e062d6 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2916,7 +2916,7 @@ proc check_effective_target_vect_int { } {
 	 || [istarget alpha*-*-*]
 	 || [istarget ia64-*-*] 
 	 || [istarget aarch64*-*-*]
-	 || [check_effective_target_arm32]
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && ([et-is-effective-target mips_loongson]
 		 || [et-is-effective-target mips_msa])) } {
@@ -2944,8 +2944,7 @@ proc check_effective_target_vect_intfloat_cvt { } {
 if { [istarget i?86-*-*] || [istarget x86_64-*-*]
 	 || ([istarget powerpc*-*-*]
 		 && ![istarget powerpc-*-linux*paired*])
-	 || ([istarget arm*-*-*]
-		 && [check_effective_target_arm_neon_ok])
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && [et-is-effective-target mips_msa]) } {
 	set et_vect_intfloat_cvt_saved($et_index) 1
@@ -2987,8 +2986,7 @@ proc check_effective_target_vect_uintfloat_cvt { } {
 	 || ([istarget powerpc*-*-*]
 		 && ![istarget powerpc-*-linux*paired*])
 	 || [istarget aarch64*-*-*]
-	 || ([istarget arm*-*-*]
-		 && [check_effective_target_arm_neon_ok])
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && [et-is-effective-target mips_msa]) } {
 	set et_vect_uintfloat_cvt_saved($et_index) 1
@@ -3016,8 +3014,7 @@ proc check_effective_target_vect_floatint_cvt { } {
 if { [istarget i?86-*-*] || [istarget x86_64-*-*]
 	 || ([istarget powerpc*-*-*]
 		 && ![istarget powerpc-*-linux*paired*])
-	 || ([istarget arm*-*-*]
-		 && [check_effective_target_arm_neon_ok])
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && [et-is-effective-target mips_msa]) } {
 	set et_vect_floatint_cvt_saved($et_index) 1
@@ -3043,8 +3040,7 @@ proc check_effective_target_vect_floatuint_cvt { } {
 	set et_vect_floatuint_cvt_saved($et_index) 0
 if { ([istarget powerpc*-*-*]
 	  && ![istarget powerpc-*-linux*paired*])
-	|| ([istarget arm*-*-*]
-		&& [check_effective_target_arm_neon_ok])
+	|| [is-effective-target arm_neon]
 	|| ([istarget mips*-*-*]
 		&& [et-is-effective-target mips_msa]) } {
 	   set et_vect_floatuint_cvt_saved($et_index) 1
@@ -4903,7 +4899,7 @@ proc check_effective_target_vect_shift { } {
 	 || [istarget ia64-*

[PATCH, GCC/testsuite] Fix gen-vect-26.c requirements

2017-06-13 Thread Thomas Preudhomme

Hi,

gen-vect-26.c tests the vectorizer but only requires vect_cmdline_needed
effective target. It should also depend on vect_int to make sure a
vector unit is available on the target. This patch fixes that.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-05  Thomas Preud'homme  

* gcc.dg/tree-ssa/gen-vect-26.c: Also require vect_int effective
target.


Testing: Testcase is now skipped when targeting Cortex-M3.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c
index 8e5f1410612b075914000dcdc643b2838ee3dcd9..8edeb0bbfd31b3926382da27bfafa4f331066ba9 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c
@@ -1,4 +1,4 @@
-/* { dg-do run { target vect_cmdline_needed } } */
+/* { dg-do run { target { vect_cmdline_needed && vect_int } } } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -fvect-cost-model=dynamic" } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -fvect-cost-model=dynamic -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
 


Re: [PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-14 Thread Thomas Preudhomme



On 13/06/17 20:22, Christophe Lyon wrote:

Hi Thomas,

On 13 June 2017 at 11:08, Thomas Preudhomme
 wrote:

Hi,

Conditions checked for ARM targets in vector-related effective targets
are inconsistent:

* sometimes arm*-*-* is checked
* sometimes Neon is checked
* sometimes arm_neon_ok and sometimes arm_neon is used for neon check
* sometimes check_effective_target_* is used, sometimes is-effective-target

This patch consolidates all of these checks into using is-effective-target
arm_neon and when little endian was checked, the check is kept.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_vect_int): Replace
current ARM check by ARM NEON's availability check.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_whole_vector_shift): Likewise.
(check_effective_target_vect_bswap): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_vect_float): Likewise.
(check_effective_target_vect_perm): Likewise.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi_pattern): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_shift): Likewise.
(check_effective_target_vect_extract_even_odd): Likewise.
(check_effective_target_vect_interleave): Likewise.
(check_effective_target_vect_multiple_sizes): Likewise.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_max_reduc): Likewise.

Testing: Testsuite shows no regression when targeting ARMv7-A with
-mfpu=neon-fpv4 and -mfloat-abi=hard or when targeting Cortex-M3 with
default FPU and float ABI (soft).



That's strange, my testing detects a syntax error:

  Executed from: gcc.dg/vect/vect.exp
gcc.dg/vect/slp-9.c: error executing dg-final: unbalanced close paren


Indeed, I can see the missing parenthesis. I've checked again with the sum file 
and even with -v -v -v -v dg-cmp-results does not show any regression. 
compare_tests does though but is often more noisy (saying some tests having 
disappeared and appeared).
This sounds like dg-cmp-results needs to be improved here. I'll do that first 
then test a fixed version of the patch.


Many thanks for the testing!



See 
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/249142-consistent_neon_check/report-build-info.html
for a full picture.

Note that the cells with "BETTER" seem to be mostly several PASSes
becoming unsupported.

Thanks,


Best regards,

Thomas


Re: [PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-14 Thread Thomas Preudhomme



On 14/06/17 09:29, Christophe Lyon wrote:

On 14 June 2017 at 10:25, Thomas Preudhomme
 wrote:



On 13/06/17 20:22, Christophe Lyon wrote:


Hi Thomas,

On 13 June 2017 at 11:08, Thomas Preudhomme
 wrote:


Hi,

Conditions checked for ARM targets in vector-related effective targets
are inconsistent:

* sometimes arm*-*-* is checked
* sometimes Neon is checked
* sometimes arm_neon_ok and sometimes arm_neon is used for neon check
* sometimes check_effective_target_* is used, sometimes
is-effective-target

This patch consolidates all of these checks into using is-effective-target
arm_neon and when little endian was checked, the check is kept.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_vect_int):
Replace
current ARM check by ARM NEON's availability check.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_whole_vector_shift): Likewise.
(check_effective_target_vect_bswap): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_vect_float): Likewise.
(check_effective_target_vect_perm): Likewise.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern):
Likewise.
(check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi_pattern):
Likewise.
(check_effective_target_vect_widen_mult_hi_to_si_pattern):
Likewise.
(check_effective_target_vect_widen_shift): Likewise.
(check_effective_target_vect_extract_even_odd): Likewise.
(check_effective_target_vect_interleave): Likewise.
(check_effective_target_vect_multiple_sizes): Likewise.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_max_reduc): Likewise.

Testing: Testsuite shows no regression when targeting ARMv7-A with
-mfpu=neon-fpv4 and -mfloat-abi=hard or when targeting Cortex-M3 with
default FPU and float ABI (soft).



That's strange, my testing detects a syntax error:

  Executed from: gcc.dg/vect/vect.exp
gcc.dg/vect/slp-9.c: error executing dg-final: unbalanced close paren



Indeed, I can see the missing parenthesis. I've checked again with the sum
file and even with -v -v -v -v dg-cmp-results does not show any regression.
compare_tests does though but is often more noisy (saying some tests having
disappeared and appeared).
This sounds like dg-cmp-results needs to be improved here. I'll do that
first then test a fixed version of the patch.



I did patch compare_tests a while ago such that it catches ERROR message from
dejagnu (r240288)


So dg-cmp-results assumes there is only one tool tested in the .sum file (it 
throws away everything before "^Running" and everything after "^[[:space:]]+===", 
which it assumes is the summary). Gosh, I've used it countless times in that way...


Will provide a patch to make it work also in that setup.

Best regards,

Thomas


[PATCH, contrib] Support multi-tool sum files in dg-cmp-results.sh

2017-06-14 Thread Thomas Preudhomme

Hi,

The dg-cmp-results.sh contrib script is written to work with sum files for
a single tool only. It throws away the header including the first ===
line and everything starting from the following ===, assuming that what
remains is the test results. This does not work well for sum files with
results for multiple tools.

This patch changes the logic to instead keep everything between the
"Running target" line and the beginning of the Summary line. The other
existing filters ensure only FAIL, PASS, etc. lines are kept after that.
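
The range filter can be sketched in Python as follows. This is only a model
of the sed expression `/^Running target /,/^[[:space:]]+===.*Summary ===/!d`
from the patch; the function name and the sample sum file contents are made
up for illustration.

```python
import re

START = re.compile(r"^Running target ")
END = re.compile(r"^\s+===.*Summary ===")


def keep_result_sections(lines):
    """Keep each span from a "Running target" line to the next Summary line."""
    kept = []
    inside = False
    for line in lines:
        if not inside and START.search(line):
            inside = True  # a new tool section starts here
        if inside:
            kept.append(line)
            if END.search(line):
                inside = False  # the Summary header closes the section
    return kept


# Made-up sum file holding results for two tools.
SAMPLE = [
    "Test Run By someone",
    "\t\t=== gcc tests ===",
    "Running target unix",
    "PASS: gcc.dg/a.c (test for excess errors)",
    "\t\t=== gcc Summary ===",
    "# of expected passes\t\t1",
    "\t\t=== g++ tests ===",
    "Running target unix",
    "FAIL: g++.dg/b.C (test for excess errors)",
    "\t\t=== g++ Summary ===",
    "# of unexpected failures\t1",
]

for line in keep_result_sections(SAMPLE):
    print(line)
```

Going by the description above, the previous logic dropped everything from
the second === line onwards, so the g++ results in this sample would never
have been compared.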

ChangeLog entry is as follows:

*** contrib/ChangeLog ***

2017-06-14  Thomas Preud'homme  

* dg-cmp-results.sh: Keep test result lines rather than throwing
header and summary to support sum files with multiple tools.

Tested successfully on sum file with single tool with similar results
and on sum file with multiple tools now showing a regression with patch
proposed in https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00875.html

Is this ok for trunk?

Best regards,

Thomas


Re: [PATCH, GCC/testsuite/ARM] Allow arm_arch_*_ok to test several macros

2017-06-14 Thread Thomas Preudhomme

I've heard adding the patch usually helps getting it reviewed so here it is. :-)

Best regards,

Thomas

On 07/06/17 16:42, Thomas Preudhomme wrote:

Hi,

The general arm_arch_*_ok procedures check architecture availability by
substituting macros inside a defined preprocessor operator. This limits
them to checking the definition of a single macro and forces ARMv7VE to
be special cased.

This patch takes advantage of the fact that architecture macros, when
defined, are not null to allow expressing architecture availability by
a boolean operation of possibly several macros. It then takes advantage
of this to deal with ARMv7VE in the general case.  The patch also adds a
comment to make it clear that check_effective_target_arm_arch_FUNC_ok
does not work as intended for architecture extensions (eg. ARMv8.1-A)
due to lack of extension-specific macro similar to __ARM_ARCH_*__.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Test for null definitions instead of them being undefined.  Add entry
for ARMv7VE.  Reindent entry for ARMv8-M Baseline.  Add comment warning
about using the effective target for architecture extension.
(check_effective_target_arm_arch_v7ve_ok): Remove.
(add_options_for_arm_arch_v7ve): Likewise.

Testing:
- gcc.target/arm/atomic_loaddi_10.c passes with the patch for armv7ve
  but is marked unsupported for armv7-a
- verified in the logs that -march=armv7ve is correctly added when
  running gcc.target/arm/ftest-armv7ve-arm.c

Is this ok for trunk?

Best regards,

Thomas
diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
index d291769547dcd2a02ecf6f80d60d6be7802af4fd..d875b4bd8bca16c1f381355612ef34f6879c5674 100755
--- a/contrib/dg-cmp-results.sh
+++ b/contrib/dg-cmp-results.sh
@@ -91,8 +91,7 @@ sed $E -e '/^[[:space:]]+===/,$d' $NFILE
 
 # Create a temporary file from the old file's interesting section.
 sed $E -e "1,/$header/d" \
-  -e '/^[[:space:]]+===/,$d' \
-  -e '/^[A-Z]+:/!d' \
+  -e '/^Running target /,/^[[:space:]]+===.*Summary ===/!d' \
   -e '/^(WARNING|ERROR):/d' \
   -e 's/\r$//' \
   -e 's/^/O:/' \
@@ -102,8 +101,7 @@ sed $E -e "1,/$header/d" \
 
 # Create a temporary file from the new file's interesting section.
 sed $E -e "1,/$header/d" \
-  -e '/^[[:space:]]+===/,$d' \
-  -e '/^[A-Z]+:/!d' \
+  -e '/^Running target /,/^[[:space:]]+===.*Summary ===/!d' \
   -e '/^(WARNING|ERROR):/d' \
   -e 's/\r$//' \
   -e 's/^/N:/' \


[PATCH, GCC/testsuite/ARM] Make gcc.target/arm/its.c more robust

2017-06-14 Thread Thomas Preudhomme

Hi,

Testcase gcc.target/arm/its.c was added as part of a patch [1] to limit
IT blocks to 2 instructions maximum. However, the patch was only tested
indirectly by *aiming* to check that the assembly output does not
contain a single IT block with all conditional code in it. This was
actually implemented by expecting exactly 2 IT blocks.

[1] https://gcc.gnu.org/ml/gcc-patches/2014-01/msg00764.html

This does not work as proved by the regression following code changes
brought by r248863: some of the instructions are conditionally executed
using a branch and thus there is only one IT block. This patch changes
the logic to check that no IT block covers more than 2 instructions, ie.
that IT is followed by at most one non space letter.

This patch also restricts the testcase to Thumb-only devices since the
patch the testcase was contributed with only concerned ARMv7-M targets.
Since tuning for ARMv7E-M targets is even more restrictive (only one
instruction per IT block), restricting the testcase to all Thumb-only
devices is sufficient.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-09  Thomas Preud'homme  

* gcc.target/arm/its.c: Check that no IT block has more than 2
instructions in it rather than the number of IT blocks being 2.
Transfer scan directive arm_thumb2 restriction to the whole
testcase and restrict further to Thumb-only targets.


Testing: Test is correctly skipped when targeting Thumb mode of Cortex-A15
and Cortex-M0 and PASS for Cortex-M7. Note that it FAILs for Cortex-M3
and Cortex-M4 and manual inspection does reveal that an IT block is
generated with more than 2 instructions in it.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/its.c b/gcc/testsuite/gcc.target/arm/its.c
index 5425f1e920592c911771d93a4620448b06d51394..4e07871b57886e210391db1a72d1bc5b465a49d0 100644
--- a/gcc/testsuite/gcc.target/arm/its.c
+++ b/gcc/testsuite/gcc.target/arm/its.c
@@ -1,4 +1,6 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_cortex_m } */
+/* { dg-require-effective-target arm_thumb2 } */
 /* { dg-options "-O2" }  */
 int test (int a, int b)
 {
@@ -17,4 +19,6 @@ int test (int a, int b)
 r -= 3;
   return r;
 }
-/* { dg-final { scan-assembler-times "\tit" 2 { target arm_thumb2 } } } */
+/* Ensure there is no IT block with more than 2 instructions, ie. we only allow
+   IT, ITT and ITE.  */
+/* { dg-final { scan-assembler-not "\\sit\[^\\s\]{2,}\\s" } } */


Re: [PATCH, GCC/testsuite/ARM] Make gcc.target/arm/its.c more robust

2017-06-15 Thread Thomas Preudhomme



On 14/06/17 18:03, Richard Earnshaw (lists) wrote:

On 14/06/17 17:49, Thomas Preudhomme wrote:

Hi,

Testcase gcc.target/arm/its.c was added as part of a patch [1] to limit
IT blocks to 2 instructions maximum. However, the patch was only tested
indirectly by *aiming* to check that the assembly output does not
contain a single IT block with all conditional code in it. This was
actually implemented by expecting exactly 2 IT blocks.

[1] https://gcc.gnu.org/ml/gcc-patches/2014-01/msg00764.html

This does not work as proved by the regression following code changes
brought by r248863: some of the instructions are conditionally executed
using a branch and thus there is only one IT block. This patch changes
the logic to check that no IT block covers more than 2 instructions, ie.
that IT is followed by at most one non space letter.

This patch also restricts the testcase to Thumb-only devices since the
patch the testcase was contributed with only concerned ARMv7-M targets.
Since tuning for ARMv7E-M targets is even more restrictive (only one
instruction per IT block), restricting the testcase to all Thumb-only
devices is sufficient.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-09  Thomas Preud'homme  

* gcc.target/arm/its.c: Check that no IT block has more than 2
instructions in it rather than the number of IT blocks being 2.
Transfer scan directive arm_thumb2 restriction to the whole
testcase and restrict further to Thumb-only targets.


Testing: Test is correctly skipped when targeting Thumb mode of Cortex-A15
and Cortex-M0 and PASS for Cortex-M7. Note that it FAILs for Cortex-M3
and Cortex-M4 and manual inspection does reveal that an IT block is
generated with more than 2 instructions in it.

Is this ok for trunk?

Best regards,

Thomas

make_its_test_more_robust.patch


diff --git a/gcc/testsuite/gcc.target/arm/its.c b/gcc/testsuite/gcc.target/arm/its.c
index 5425f1e920592c911771d93a4620448b06d51394..4e07871b57886e210391db1a72d1bc5b465a49d0 100644
--- a/gcc/testsuite/gcc.target/arm/its.c
+++ b/gcc/testsuite/gcc.target/arm/its.c
@@ -1,4 +1,6 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_cortex_m } */
+/* { dg-require-effective-target arm_thumb2 } */
 /* { dg-options "-O2" }  */
 int test (int a, int b)
 {
@@ -17,4 +19,6 @@ int test (int a, int b)
 r -= 3;
   return r;
 }
-/* { dg-final { scan-assembler-times "\tit" 2 { target arm_thumb2 } } } */
+/* Ensure there is no IT block with more than 2 instructions, ie. we only allow
+   IT, ITT and ITE.  */
+/* { dg-final { scan-assembler-not "\\sit\[^\\s\]{2,}\\s" } } */



Wouldn't

{dg-final { scan-assembler-not "it[te][te]" } }

be easier to understand?


Indeed, or rather "\\sit\[te\]\[te\]" once properly escaped. "\\sit\[te\]{2}" 
also works and is even simpler so this is what this updated version uses.
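
To make the difference between the patterns discussed above concrete, here is
a small sketch in Python. It is not part of the patch: DejaGnu uses Tcl regular
expressions, which should behave the same as Python's re module for patterns
this simple, and the assembly snippets and helper name are made up for
illustration.

```python
import re

# Patterns from the thread, with the Tcl/DejaGnu escaping removed.
ORIGINAL = re.compile(r"\sit[^\s]{2,}\s")   # first version of the patch
UNANCHORED = re.compile(r"it[te][te]")      # suggestion as literally written
UPDATED = re.compile(r"\sit[te]{2}")        # updated version of the patch


def pattern_matches(pattern, asm):
    """Return True if pattern matches anywhere in asm, i.e. if a
    scan-assembler-not check using it would FAIL."""
    return pattern.search(asm) is not None


# Hypothetical assembly snippets, keyed by the IT mnemonic they contain.
SAMPLES = {
    "it": "\tit\tgt\n\taddgt\tr0, r0, #1\n",
    "itt": "\titt\tgt\n\taddgt\tr0, r0, #1\n\tsubgt\tr1, r1, #3\n",
    "ite": "\tite\tgt\n\taddgt\tr0, r0, #1\n\tsuble\tr1, r1, #3\n",
    "itte": "\titte\tgt\n\taddgt\tr0, r0, #1\n\tsubgt\tr1, r1, #3\n"
            "\tmovle\tr0, #0\n",
}

for name, asm in SAMPLES.items():
    print(name, pattern_matches(ORIGINAL, asm), pattern_matches(UPDATED, asm))

# Without the leading \s, the pattern can also match inside an unrelated
# token such as a (made-up) symbol name.
print(pattern_matches(UNANCHORED, "\tbl\twritten_flag\n"))
print(pattern_matches(UPDATED, "\tbl\twritten_flag\n"))
```

Only the "itte" sample is matched by the original and updated patterns, while
the unanchored pattern also fires on the symbol name, which is why the leading
\s is worth keeping.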


Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/its.c b/gcc/testsuite/gcc.target/arm/its.c
index 5425f1e920592c911771d93a4620448b06d51394..f81a0df51cdb5fc26208c0a99e5c1cfb2ee4ed04 100644
--- a/gcc/testsuite/gcc.target/arm/its.c
+++ b/gcc/testsuite/gcc.target/arm/its.c
@@ -1,4 +1,6 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_cortex_m } */
+/* { dg-require-effective-target arm_thumb2 } */
 /* { dg-options "-O2" }  */
 int test (int a, int b)
 {
@@ -17,4 +19,6 @@ int test (int a, int b)
 r -= 3;
   return r;
 }
-/* { dg-final { scan-assembler-times "\tit" 2 { target arm_thumb2 } } } */
+/* Ensure there is no IT block with more than 2 instructions, ie. we only allow
+   IT, ITT and ITE.  */
+/* { dg-final { scan-assembler-not "\\sit\[te\]{2}" } } */


Re: [PATCH, contrib] Support multi-tool sum files in dg-cmp-results.sh

2017-06-15 Thread Thomas Preudhomme

Forgetting the patch: check!
Sending it later as a reply to the wrong message: check!

Hopefully I won't check a second time any of those.

Best regards,

Thomas

On 14/06/17 13:30, Thomas Preudhomme wrote:

Hi,

The dg-cmp-results.sh contrib script is written to work with sum files for
a single tool only. It throws away the header up to and including the
first === line and everything starting from the following === line,
assuming what remains is the test results. This does not work well for
sum files with results for multiple tools.

This patch changes the logic to instead keep everything between the
"Running target" line and the beginning of the Summary line. The other
existing filter mechanisms will ensure only FAIL, PASS, etc. lines are
kept after that.
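
The intended filtering can be sketched as follows. This is a Python
approximation of the sed range logic, not the script itself; the sample sum
file content and the function name are invented for illustration.

```python
import re

START = re.compile(r"^Running target ")          # start of a tool's results
END = re.compile(r"^\s+===.*Summary ===")        # start of a tool's summary
RESULT = re.compile(r"^[A-Z]+:")                 # PASS:, FAIL:, XFAIL:, ...


def result_lines(sum_text):
    """Keep the result lines found between each "Running target" line and
    the following Summary line, for every tool in the sum file."""
    kept = []
    in_range = False
    for line in sum_text.splitlines():
        if not in_range and START.search(line):
            in_range = True           # results for one tool start here
        elif in_range and END.search(line):
            in_range = False          # summary reached, stop keeping lines
        elif in_range and RESULT.search(line):
            kept.append(line)
    return kept


# Made-up sum file holding results for two tools.
SAMPLE = "\n".join([
    "Test run by someone on some date",
    "",
    "\t\t=== gcc tests ===",
    "",
    "Running target unix",
    "PASS: gcc.dg/a.c (test for excess errors)",
    "FAIL: gcc.dg/b.c execution test",
    "",
    "\t\t=== gcc Summary ===",
    "",
    "# of expected passes\t\t1",
    "# of unexpected failures\t1",
    "",
    "\t\t=== g++ tests ===",
    "",
    "Running target unix",
    "PASS: g++.dg/c.C (test for excess errors)",
    "",
    "\t\t=== g++ Summary ===",
    "",
    "# of expected passes\t\t1",
])

for line in result_lines(SAMPLE):
    print(line)
```

Because the range restarts at every "Running target" line, results for the
second tool are kept too, which is what the single "delete from the second ===
onwards" rule could not do.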

ChangeLog entry is as follows:

*** contrib/ChangeLog ***

2017-06-14  Thomas Preud'homme  

* dg-cmp-results.sh: Keep test result lines rather than throwing
header and summary to support sum files with multiple tools.

Tested successfully on a sum file with a single tool, with similar results,
and on a sum file with multiple tools, which now shows a regression with the
patch proposed in https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00875.html

Is this ok for trunk?

Best regards,

Thomas
diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
index d291769547dcd2a02ecf6f80d60d6be7802af4fd..d875b4bd8bca16c1f381355612ef34f6879c5674 100755
--- a/contrib/dg-cmp-results.sh
+++ b/contrib/dg-cmp-results.sh
@@ -91,8 +91,7 @@ sed $E -e '/^[[:space:]]+===/,$d' $NFILE
 
 # Create a temporary file from the old file's interesting section.
 sed $E -e "1,/$header/d" \
-  -e '/^[[:space:]]+===/,$d' \
-  -e '/^[A-Z]+:/!d' \
+  -e '/^Running target /,/^[[:space:]]+===.*Summary ===/!d' \
   -e '/^(WARNING|ERROR):/d' \
   -e 's/\r$//' \
   -e 's/^/O:/' \
@@ -102,8 +101,7 @@ sed $E -e "1,/$header/d" \
 
 # Create a temporary file from the new file's interesting section.
 sed $E -e "1,/$header/d" \
-  -e '/^[[:space:]]+===/,$d' \
-  -e '/^[A-Z]+:/!d' \
+  -e '/^Running target /,/^[[:space:]]+===.*Summary ===/!d' \
   -e '/^(WARNING|ERROR):/d' \
   -e 's/\r$//' \
   -e 's/^/N:/' \


Re: [PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-15 Thread Thomas Preudhomme

Hi,

Conditions checked for ARM targets in vector-related effective targets
are inconsistent:

* sometimes arm*-*-* is checked
* sometimes Neon is checked
* sometimes arm_neon_ok and sometimes arm_neon is used for neon check
* sometimes check_effective_target_* is used, sometimes is-effective-target

This patch consolidates all of these checks into using is-effective-target
arm_neon; where little endian was checked, that check is kept.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_vect_int): Replace
current ARM check by ARM NEON's availability check.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_whole_vector_shift): Likewise.
(check_effective_target_vect_bswap): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_vect_float): Likewise.
(check_effective_target_vect_perm): Likewise.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi_pattern): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_shift): Likewise.
(check_effective_target_vect_extract_even_odd): Likewise.
(check_effective_target_vect_interleave): Likewise.
(check_effective_target_vect_multiple_sizes): Likewise.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_max_reduc): Likewise.

Testing: Testsuite shows no regression when targeting ARMv7-A with 
-mfpu=neon-vfpv4 and -mfloat-abi=hard or when targeting Cortex-M3 with default 
FPU and float ABI (soft). Testing was done with both compare_tests and the 
updated dg-cmp-results proposed in 
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01030.html


Is this ok for trunk?

Best regards,

Thomas

On 13/06/17 20:22, Christophe Lyon wrote:

Hi Thomas,

On 13 June 2017 at 11:08, Thomas Preudhomme
 wrote:

Hi,

Conditions checked for ARM targets in vector-related effective targets
are inconsistent:

* sometimes arm*-*-* is checked
* sometimes Neon is checked
* sometimes arm_neon_ok and sometimes arm_neon is used for neon check
* sometimes check_effective_target_* is used, sometimes is-effective-target

This patch consolidates all of these checks into using is-effective-target
arm_neon; where little endian was checked, that check is kept.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_vect_int): Replace
current ARM check by ARM NEON's availability check.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_whole_vector_shift): Likewise.
(check_effective_target_vect_bswap): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_vect_float): Likewise.
(check_effective_target_vect_perm): Likewise.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi_pattern): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_shift): Likewise.
(check_effective_target_vect_extract_even_odd): Likewise.
(check_effective_target_vect_interleave): Likewise.
(check_effective_target_vect_multiple_sizes): Likewise.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_max_reduc): Likewise.

Testing: Testsuite show

Re: [PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-19 Thread Thomas Preudhomme



On 19/06/17 08:41, Christophe Lyon wrote:

Hi Thomas,


On 15 June 2017 at 18:18, Thomas Preudhomme
 wrote:

Hi,

Conditions checked for ARM targets in vector-related effective targets
are inconsistent:

* sometimes arm*-*-* is checked
* sometimes Neon is checked
* sometimes arm_neon_ok and sometimes arm_neon is used for neon check
* sometimes check_effective_target_* is used, sometimes is-effective-target

This patch consolidates all of these checks into using is-effective-target
arm_neon; where little endian was checked, that check is kept.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_vect_int): Replace
current ARM check by ARM NEON's availability check.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_whole_vector_shift): Likewise.
(check_effective_target_vect_bswap): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_vect_float): Likewise.
(check_effective_target_vect_perm): Likewise.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi_pattern): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_shift): Likewise.
(check_effective_target_vect_extract_even_odd): Likewise.
(check_effective_target_vect_interleave): Likewise.
(check_effective_target_vect_multiple_sizes): Likewise.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_max_reduc): Likewise.

Testing: Testsuite shows no regression when targeting ARMv7-A with
-mfpu=neon-vfpv4 and -mfloat-abi=hard or when targeting Cortex-M3 with
default FPU and float ABI (soft). Testing was done with both compare_tests
and the updated dg-cmp-results proposed in
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01030.html

Is this ok for trunk?



I applied your patch on top of r249233, and noticed quite a few changes:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/249233-consistent_neon_check.patch/report-build-info.html

Note that "Big-Regression" cases are caused by the fact that there
are PASS->XPASS and XFAILs disappear with your patch, and many
(3000-4000) PASS disappear.
Is that intended?


It certainly is not. I'd like to investigate this but the link to results for 
rev 249233 is broken. Could you provide me with the results you have for that so 
that I can compare manually?


Best regards,

Thomas


Re: [PATCH, contrib] Support multi-tool sum files in dg-cmp-results.sh

2017-06-19 Thread Thomas Preudhomme
Wrong copy paste between the patch I tested and the patch I sent. The first and 
second command of the sed should be replaced, not the second and third as in the 
patch I sent. For more safety I'll rerun the tests.


Best regards,

Thomas

On 15/06/17 17:15, Thomas Preudhomme wrote:

Forgetting the patch: check!
Sending it later as a reply to the wrong message: check!

Hopefully I won't check a second time any of those.

Best regards,

Thomas

On 14/06/17 13:30, Thomas Preudhomme wrote:

Hi,

dg-cmp-results.sh contrib script is written to work with sum file for
a single tool only. It throws away the header including the first ===
line and everything starting from the following ===, assuming it is the
test result. This does not work well for sum files with results for
multiple tools.

This patch changes the logic to instead keep everything between "Running
target" line and the beginning of Summary line. Other existing filter
mechanism will ensure only FAIL, PASS, etc. lines are kept after that.

ChangeLog entry is as follows:

*** contrib/ChangeLog ***

2017-06-14  Thomas Preud'homme  

* dg-cmp-results.sh: Keep test result lines rather than throwing
header and summary to support sum files with multiple tools.

Tested successfully on sum file with single tool with similar results
and on sum file with multiple tools now showing a regression with patch
proposed in https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00875.html

Is this ok for trunk?

Best regards,

Thomas


Re: [PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-19 Thread Thomas Preudhomme



On 19/06/17 10:16, Thomas Preudhomme wrote:



On 19/06/17 08:41, Christophe Lyon wrote:

Hi Thomas,


On 15 June 2017 at 18:18, Thomas Preudhomme
 wrote:

Hi,

Conditions checked for ARM targets in vector-related effective targets
are inconsistent:

* sometimes arm*-*-* is checked
* sometimes Neon is checked
* sometimes arm_neon_ok and sometimes arm_neon is used for neon check
* sometimes check_effective_target_* is used, sometimes is-effective-target

This patch consolidates all of these checks into using is-effective-target
arm_neon; where little endian was checked, that check is kept.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_vect_int): Replace
current ARM check by ARM NEON's availability check.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_whole_vector_shift): Likewise.
(check_effective_target_vect_bswap): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_vect_float): Likewise.
(check_effective_target_vect_perm): Likewise.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi_pattern): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_shift): Likewise.
(check_effective_target_vect_extract_even_odd): Likewise.
(check_effective_target_vect_interleave): Likewise.
(check_effective_target_vect_multiple_sizes): Likewise.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_max_reduc): Likewise.

Testing: Testsuite shows no regression when targeting ARMv7-A with
-mfpu=neon-vfpv4 and -mfloat-abi=hard or when targeting Cortex-M3 with
default FPU and float ABI (soft). Testing was done with both compare_tests
and the updated dg-cmp-results proposed in
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01030.html

Is this ok for trunk?



I applied your patch on top of r249233, and noticed quite a few changes:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/249233-consistent_neon_check.patch/report-build-info.html


Note that "Big-Regression" cases are caused by the fact that there
are PASS->XPASS and XFAILs disappear with your patch, and many
(3000-4000) PASS disappear.
Is that intended?


It certainly is not. I'd like to investigate this but the link to results for
rev 249233 is broken. Could you provide me with the results you have for that so
that I can compare manually?


Actually yes it is, at least for the configurations with default (which still 
uses -mfpu=vfp in r249233) or VFP (whatever version) FPU. I've checked all the 
->NA and ->UNSUPPORTED for the arm-none-linux-gnueabi configuration and none of 
them has a dg directive to select the neon unit (such as dg-additional-options 
). I've also looked at 
arm-none-linux-gnueabihf configuration with neon FPU and there is no regression 
there.


I therefore think this is all normal and expected. Note that under current trunk 
this should be different because neon-fp16 would be selected instead of vfp for 
default FPU with Cortex-A9.


Best regards,

Thomas


Re: [PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-19 Thread Thomas Preudhomme



On 19/06/17 15:31, Christophe Lyon wrote:

On 19 June 2017 at 16:11, Thomas Preudhomme
 wrote:



On 19/06/17 10:16, Thomas Preudhomme wrote:




On 19/06/17 08:41, Christophe Lyon wrote:


Hi Thomas,


On 15 June 2017 at 18:18, Thomas Preudhomme
 wrote:


Hi,

Conditions checked for ARM targets in vector-related effective targets
are inconsistent:

* sometimes arm*-*-* is checked
* sometimes Neon is checked
* sometimes arm_neon_ok and sometimes arm_neon is used for neon check
* sometimes check_effective_target_* is used, sometimes
is-effective-target

This patch consolidates all of these checks into using is-effective-target
arm_neon; where little endian was checked, that check is kept.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_vect_int):
Replace
current ARM check by ARM NEON's availability check.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_whole_vector_shift): Likewise.
(check_effective_target_vect_bswap): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_vect_float): Likewise.
(check_effective_target_vect_perm): Likewise.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern):
Likewise.
(check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi_pattern):
Likewise.
(check_effective_target_vect_widen_mult_hi_to_si_pattern):
Likewise.
(check_effective_target_vect_widen_shift): Likewise.
(check_effective_target_vect_extract_even_odd): Likewise.
(check_effective_target_vect_interleave): Likewise.
(check_effective_target_vect_multiple_sizes): Likewise.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_max_reduc): Likewise.

Testing: Testsuite shows no regression when targeting ARMv7-A with
-mfpu=neon-vfpv4 and -mfloat-abi=hard or when targeting Cortex-M3 with
default FPU and float ABI (soft). Testing was done with both
compare_tests
and the updated dg-cmp-results proposed in
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01030.html

Is this ok for trunk?



I applied your patch on top of r249233, and noticed quite a few changes:

http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/249233-consistent_neon_check.patch/report-build-info.html


Note that "Big-Regression" cases are caused by the fact that there
are PASS->XPASS and XFAILs disappear with your patch, and many
(3000-4000) PASS disappear.
Is that intended?



It certainly is not. I'd like to investigate this but the link to results
for
rev 249233 is broken. Could you provide me with the results you have for
that so
that I can compare manually?



Actually yes it is, at least for the configurations with default (which
still uses -mfpu=vfp in r249233) or VFP (whatever version) FPU. I've checked
all the ->NA and ->UNSUPPORTED for the arm-none-linux-gnueabi configuration
and none of them has a dg directive to select the neon unit (such as
dg-additional-options ).
I've also looked at arm-none-linux-gnueabihf configuration with neon FPU and
there is no regression there.

I therefore think this is all normal and expected. Note that under current
trunk this should be different because neon-fp16 would be selected instead
of vfp for default FPU with Cortex-A9.



OK, thanks for checking. So the version you sent on June 15th is OK?


Yes.


I can start a validation against current trunk, after Richard's series,
it probably makes sense, doesn't it?


I think it'll give cleaner results yes. Note that the one with an explicit 
-mfpu=vfp* without neon will still have a lot of changes but at least the one 
with default FPU should be more readable.


Thanks,

Christophe


Best regards,

Thomas


[arm-embedded] [PATCH, ARM] Implement __ARM_FEATURE_COPROC coprocessor intrinsic feature macro

2017-06-20 Thread Thomas Preudhomme

Hi,

We have decided to apply the following patch to the ARM/embedded-6-branch and 
ARM/embedded-7-branch to implement the __ARM_FEATURE_COPROC coprocessor 
intrinsic feature macro.
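
For readers unfamiliar with the macro: it is a bitmap, and the bit assignments
are given in the table of the forwarded message further down. A small Python
sketch of how a value decodes under those assignments follows; the function
and table names are hypothetical and exist only for this illustration.

```python
# Bit values and intrinsic names as listed in the forwarded message's table.
COPROC_INTRINSICS = [
    (0x1, ["__arm_cdp", "__arm_ldc", "__arm_ldcl", "__arm_stc",
           "__arm_stcl", "__arm_mcr", "__arm_mrc"]),
    (0x2, ["__arm_cdp2", "__arm_ldc2", "__arm_stc2", "__arm_ldc2l",
           "__arm_stc2l", "__arm_mcr2", "__arm_mrc2"]),
    (0x4, ["__arm_mcrr", "__arm_mrrc"]),
    (0x8, ["__arm_mcrr2", "__arm_mrrc2"]),
]


def available_intrinsics(feature_coproc):
    """Return the intrinsics implied by a __ARM_FEATURE_COPROC value.

    None stands for "macro undefined", i.e. no coprocessor intrinsics.
    """
    if feature_coproc is None:
        return []
    names = []
    for bit, intrinsics in COPROC_INTRINSICS:
        if feature_coproc & bit:
            names.extend(intrinsics)
    return names


# Example: bits 0 and 2 set.
print(available_intrinsics(0x5))
```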


2017-06-20  Thomas Preud'homme  

Backport from mainline
2017-06-20  Prakhar Bahuguna  

gcc/
* config/arm/arm-c.c (arm_cpu_builtins): New block to define
__ARM_FEATURE_COPROC according to support.

gcc/testsuite/
* gcc.target/arm/acle/cdp.c: Add feature macro bitmap test.
* gcc.target/arm/acle/cdp2.c: Likewise.
* gcc.target/arm/acle/ldc.c: Likewise.
* gcc.target/arm/acle/ldc2.c: Likewise.
* gcc.target/arm/acle/ldc2l.c: Likewise.
* gcc.target/arm/acle/ldcl.c: Likewise.
* gcc.target/arm/acle/mcr.c: Likewise.
* gcc.target/arm/acle/mcr2.c: Likewise.
* gcc.target/arm/acle/mcrr.c: Likewise.
* gcc.target/arm/acle/mcrr2.c: Likewise.
* gcc.target/arm/acle/mrc.c: Likewise.
* gcc.target/arm/acle/mrc2.c: Likewise.
* gcc.target/arm/acle/mrrc.c: Likewise.
* gcc.target/arm/acle/mrrc2.c: Likewise.
* gcc.target/arm/acle/stc.c: Likewise.
* gcc.target/arm/acle/stc2.c: Likewise.
* gcc.target/arm/acle/stc2l.c: Likewise.
* gcc.target/arm/acle/stcl.c: Likewise.

Best regards,

Thomas
--- Begin Message ---
On 16/06/2017 15:37:18, Richard Earnshaw (lists) wrote:
> On 16/06/17 08:48, Prakhar Bahuguna wrote:
> > On 15/06/2017 17:23:43, Richard Earnshaw (lists) wrote:
> >> On 14/06/17 10:35, Prakhar Bahuguna wrote:
> >>> The ARM ACLE defines the __ARM_FEATURE_COPROC macro which indicates which
> >>> coprocessor intrinsics are available for the target. If
> >>> __ARM_FEATURE_COPROC is undefined, the target does not support coprocessor
> >>> intrinsics. The feature levels are defined as follows:
> >>>
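> >>> +---------+-----------+--------------------------------------------------+
> >>> | **Bit** | **Value** | **Intrinsics Available**                         |
> >>> +---------+-----------+--------------------------------------------------+
> >>> | 0       | 0x1       | __arm_cdp, __arm_ldc, __arm_ldcl, __arm_stc,     |
> >>> |         |           | __arm_stcl, __arm_mcr and __arm_mrc              |
> >>> +---------+-----------+--------------------------------------------------+
> >>> | 1       | 0x2       | __arm_cdp2, __arm_ldc2, __arm_stc2, __arm_ldc2l, |
> >>> |         |           | __arm_stc2l, __arm_mcr2 and __arm_mrc2           |
> >>> +---------+-----------+--------------------------------------------------+
> >>> | 2       | 0x4       | __arm_mcrr and __arm_mrrc                        |
> >>> +---------+-----------+--------------------------------------------------+
> >>> | 3       | 0x8       | __arm_mcrr2 and __arm_mrrc2                      |
> >>> +---------+-----------+--------------------------------------------------+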
> >>>
> >>> This patch implements full support for this feature macro as defined in
> >>> section 5.9 of the ACLE
> >>> (https://developer.arm.com/products/software-development-tools/compilers/arm-compiler-5/docs/101028/latest/5-feature-test-macros).
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>> 2017-06-14  Prakhar Bahuguna  
> >>>
> >>>   * config/arm/arm-c.c (arm_cpu_builtins): New block to define
> >>>__ARM_FEATURE_COPROC according to support.
> >>>
> >>> 2017-06-14  Prakhar Bahuguna  
> >>>   * gcc/testsuite/gcc.target/arm/acle/cdp.c: Add feature macro bitmap
> >>>   test.
> >>>   * gcc/testsuite/gcc.target/arm/acle/cdp2.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/ldc.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/ldc2.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/ldc2l.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/ldcl.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/mcr.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/mcr2.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/mcrr.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/mcrr2.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/mrc.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/mrc2.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/mrrc.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/mrrc2.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/stc.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/stc2.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/stc2l.c: Likewise.
> >>>   * gcc/testsuite/gcc.target/arm/acle/stcl.c: Likewise.
> >>>
> >>> Testing done: ACLE regression tests updated with tests for feature macro
> >>> bits. All regression tests pass.
> >>>
> >>> Okay for trunk?
> >>>
> >>>
> >>> 0001-Implement-__ARM_FEATURE_COPROC-coprocessor-intrinsic.patch
> >>>
> >>>
> >>> From 79d71aec9d2bdee936b240ae49368ff5f8d8fc48 Mon Sep 17 00:00:00 2001
> >>> From: Prakhar Bahuguna 
> >>> Date: Tue, 2 May 2017 13:43:40 +0100
> >>> Subject: [PAT

[arm-embedded] [PATCH, GCC/LTO, ping] Fix PR69866: LTO with def for weak alias in regular object file

2017-06-20 Thread Thomas Preudhomme

Hi,

We have decided to apply the referenced fix (r249352) to the 
ARM/embedded-6-branch along with its initial commit (r249224) to fix an ICE with 
LTO and aliases.


Fix PR69866

2017-06-20  Thomas Preud'homme  

Backport from mainline
2017-06-15  Jan Hubicka  
Thomas Preud'homme  

gcc/
PR lto/69866
* lto-symtab.c (lto_symtab_merge_symbols): Drop useless definitions
that resolved externally.
2017-06-15  Thomas Preud'homme  

gcc/testsuite/
PR lto/69866
* gcc.dg/lto/pr69866_0.c: New test.
* gcc.dg/lto/pr69866_1.c: Likewise.

Backport from mainline
2017-06-18  Jan Hubicka  

gcc/testsuite/
* gcc.dg/lto/pr69866_0.c: This test needs alias.

Best regards,

Thomas
--- Begin Message ---
> The new test fails on darwin with the usual
> 
> FAIL: gcc.dg/lto/pr69866 c_lto_pr69866_0.o-c_lto_pr69866_1.o link, -O0 -flto 
> -flto-partition=none
> 
> IMO it requires a
> 
> /* { dg-require-alias "" } */

Yep, I will add it shortly.

Honza
> 
> directive.
> 
> TIA
> 
> Dominique
--- End Message ---


[arm-embedded] [PATCH, GCC/ARM, Stage 1] Rename FPSCR builtins to correct names

2017-06-20 Thread Thomas Preudhomme

Hi,

We have decided to apply the following patch to the embedded-6-branch to fix 
naming of an ARM intrinsic.


ChangeLog entry is as follows:

2017-06-20  Thomas Preud'homme  

Backport from mainline
2017-05-04  Prakhar Bahuguna  

gcc/
* gcc/config/arm/arm-builtins.c (arm_init_builtins): Rename
__builtin_arm_ldfscr to __builtin_arm_get_fpscr, and rename
__builtin_arm_stfscr to __builtin_arm_set_fpscr.

gcc/testsuite/
* gcc.target/arm/fpscr.c: New file.


Best regards,

Thomas
--- Begin Message ---

Hi Prakhar,
Sorry for the delay,

On 22/03/17 10:46, Prakhar Bahuguna wrote:

The GCC documentation in section 6.60.8 ARM Floating Point Status and Control
Intrinsics states that the FPSCR register can be read and written to using the
intrinsics __builtin_arm_get_fpscr and __builtin_arm_set_fpscr. However, these
are misnamed within GCC itself and these intrinsic names are not recognised.
This patch corrects the intrinsic names to match the documentation, and adds
tests to verify these intrinsics generate the correct instructions.

Testing done: Ran regression tests on arm-none-eabi for Cortex-M4.

2017-03-09  Prakhar Bahuguna  

gcc/ChangeLog:

* gcc/config/arm/arm-builtins.c (arm_init_builtins): Rename
  __builtin_arm_ldfscr to __builtin_arm_get_fpscr, and rename
  __builtin_arm_stfscr to __builtin_arm_set_fpscr.
* gcc/testsuite/gcc.target/arm/fpscr.c: New file.

Okay for stage 1?


I see that the mistake was in not addressing one of the review comments in:
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01832.html
properly in the patch that added these functions :(

This is ok for stage 1 if a bootstrap and test on arm-none-linux-gnueabihf
works fine.
I don't think we want to maintain the __builtin_arm_[ld,st]fscr names for
backwards compatibility as they were not documented and are __builtin_arm*
functions that we don't guarantee to maintain.

Thanks,
Kyrill


--

Prakhar Bahuguna


--- End Message ---


[PATCH, GCC/ARM] Remove ARMv8-M code for D17-D31

2017-06-20 Thread Thomas Preudhomme

Hi,

Function cmse_nonsecure_entry_clear_before_return has code to deal with
high VFP registers (D16-D31) while ARMv8-M Baseline and Mainline both do
not support more than 16 double VFP registers (D0-D15). This makes this
security-sensitive code harder to read for not much benefit since
libcalls for cmse_nonsecure_call functions do not deal with those high
VFP registers anyway.

This commit gets rid of this code for simplicity and fixes 2 issues in
the same function:

- stop the first loop when reaching maxregno to avoid dealing with VFP
  registers if targeting Thumb-1 or using -mfloat-abi=soft
- include maxregno in that loop
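
A rough Python model of the two points above and of why one 64-bit mask is
enough once D16-D31 are excluded. The register numbers are my recollection of
GCC's config/arm/arm.h and should be treated as assumptions; the loop is
simplified (it omits the skipping of argument VFP registers) and the helper
names are invented for illustration.

```python
# Register numbering assumed from config/arm/arm.h (illustrative only).
NUM_ARG_REGS = 4          # r0-r3
IP_REGNUM = 12
PC_REGNUM = 15
FIRST_VFP_REGNUM = 16     # S0
LAST_LO_VFP_REGNUM = 47   # S31, the upper half of D15
LAST_HI_VFP_REGNUM = 79   # upper half of D31
MASK_BITS = 64            # width of a uint64_t to_clear_mask


def fits_in_single_mask(last_regno):
    """A register number is usable as a bit index only if it is < 64."""
    return last_regno < MASK_BITS


def call_used_mask(call_used, maxregno):
    """Model of the first loop: bounded by maxregno, and including it."""
    mask = 0
    for regno in range(NUM_ARG_REGS, maxregno + 1):
        if IP_REGNUM <= regno <= PC_REGNUM:
            continue                  # ip, sp, lr and pc are left alone
        if call_used(regno):
            mask |= 1 << regno
    return mask


def everything_call_used(regno):
    return True


# Soft float / Thumb-1: maxregno stays at IP_REGNUM, no VFP bit is set.
print(hex(call_used_mask(everything_call_used, IP_REGNUM)))
# Hard float with D0-D15 only: the highest bit index is 47.
print(hex(call_used_mask(everything_call_used, LAST_LO_VFP_REGNUM)))
print(fits_in_single_mask(LAST_LO_VFP_REGNUM),
      fits_in_single_mask(LAST_HI_VFP_REGNUM))
```

Under these assumed numbers, D16-D31 would need bit indices up to 79, which is
what the second to_clear_mask entry was for.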

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-06-13  Thomas Preud'homme  

* config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
Extensions with more than 16 double VFP registers.
(cmse_nonsecure_entry_clear_before_return): Remove second entry of
to_clear_mask and all code related to it and make the remaining
entry a 64-bit scalar integer variable and adapt code accordingly.

Testing: Testsuite shows no regression when run for ARMv8-M Baseline and
ARMv8-M Mainline.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 259597d8890ee84c5bd92b12b6f9f6521c8dcd2e..60a4d1f46765d285de469f51fbb5a0ad76d56d9b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3620,6 +3620,11 @@ arm_option_override (void)
   if (use_cmse && !arm_arch_cmse)
 error ("target CPU does not support ARMv8-M Security Extensions");
 
+  /* We don't clear D16-D31 VFP registers for cmse_nonsecure_call functions
+ and ARMv8-M Baseline and Mainline do not allow such configuration.  */
+  if (use_cmse && LAST_VFP_REGNUM > LAST_LO_VFP_REGNUM)
+error ("ARMv8-M Security Extensions incompatible with selected FPU");
+
   /* Disable scheduling fusion by default if it's not armv7 processor
  or doesn't prefer ldrd/strd.  */
   if (flag_schedule_fusion == 2
@@ -24996,15 +25001,15 @@ thumb1_expand_prologue (void)
 void
 cmse_nonsecure_entry_clear_before_return (void)
 {
-  uint64_t to_clear_mask[2];
+  uint64_t to_clear_mask;
   uint32_t padding_bits_to_clear = 0;
   uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
   int regno, maxregno = IP_REGNUM;
   tree result_type;
   rtx result_rtl;
 
-  to_clear_mask[0] = (1ULL << (NUM_ARG_REGS)) - 1;
-  to_clear_mask[0] |= (1ULL << IP_REGNUM);
+  to_clear_mask = (1ULL << (NUM_ARG_REGS)) - 1;
+  to_clear_mask |= (1ULL << IP_REGNUM);
 
   /* If we are not dealing with -mfloat-abi=soft we will need to clear VFP
  registers.  We also check that TARGET_HARD_FLOAT and !TARGET_THUMB1 hold
@@ -25015,23 +25020,22 @@ cmse_nonsecure_entry_clear_before_return (void)
   maxregno = LAST_VFP_REGNUM;
 
   float_mask &= ~((1ULL << FIRST_VFP_REGNUM) - 1);
-  to_clear_mask[0] |= float_mask;
-
-  float_mask = (1ULL << (maxregno - 63)) - 1;
-  to_clear_mask[1] = float_mask;
+  to_clear_mask |= float_mask;
 
   /* Make sure we don't clear the two scratch registers used to clear the
 	 relevant FPSCR bits in output_return_instruction.  */
   emit_use (gen_rtx_REG (SImode, IP_REGNUM));
-  to_clear_mask[0] &= ~(1ULL << IP_REGNUM);
+  to_clear_mask &= ~(1ULL << IP_REGNUM);
   emit_use (gen_rtx_REG (SImode, 4));
-  to_clear_mask[0] &= ~(1ULL << 4);
+  to_clear_mask &= ~(1ULL << 4);
 }
 
+  gcc_assert ((unsigned) maxregno <= sizeof (to_clear_mask) * __CHAR_BIT__);
+
   /* If the user has defined registers to be caller saved, these are no longer
  restored by the function before returning and must thus be cleared for
  security purposes.  */
-  for (regno = NUM_ARG_REGS; regno < LAST_VFP_REGNUM; regno++)
+  for (regno = NUM_ARG_REGS; regno <= maxregno; regno++)
 {
   /* We do not touch registers that can be used to pass arguments as per
 	 the AAPCS, since these should never be made callee-saved by user
@@ -25041,7 +25045,7 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (IN_RANGE (regno, IP_REGNUM, PC_REGNUM))
 	continue;
   if (call_used_regs[regno])
-	to_clear_mask[regno / 64] |= (1ULL << (regno % 64));
+	to_clear_mask |= (1ULL << regno);
 }
 
   /* Make sure we do not clear the registers used to return the result in.  */
@@ -25052,7 +25056,7 @@ cmse_nonsecure_entry_clear_before_return (void)
 
   /* No need to check that we return in registers, because we don't
 	 support returning on stack yet.  */
-  to_clear_mask[0]
+  to_clear_mask
 	&= ~compute_not_to_clear_mask (result_type, result_rtl, 0,
    padding_bits_to_clear_ptr);
 }
@@ -25063,7 +25067,7 @@ cmse_nonsecure_entry_clear_before_return (void)
   /* Padding bits to clear is not 0 so we know we are dealing with
 	 returning a composite type, which only uses r0.  Let's make sure that
 	 r1-r3 is cleared too, we will use r1 as a scratch register.  */
-  gcc_assert ((to_cle

Re: [PATCH, contrib] Support multi-tool sum files in dg-cmp-results.sh

2017-06-20 Thread Thomas Preudhomme

Hi Mike,

Sorry, there was a mistake in the patch I sent. Please find an updated patch 
below.

ChangeLog entry unchanged:


*** contrib/ChangeLog ***

2017-06-14  Thomas Preud'homme  

* dg-cmp-results.sh: Keep test result lines rather than throwing
header and summary to support sum files with multiple tools.


Is this still ok?

Best regards,

Thomas

On 19/06/17 16:55, Mike Stump wrote:

On Jun 14, 2017, at 5:30 AM, Thomas Preudhomme  
wrote:


2017-06-14  Thomas Preud'homme  

* dg-cmp-results.sh: Keep test result lines rather than throwing
header and summary to support sum files with multiple tools.

Tested successfully on sum file with single tool with similar results
and on sum file with multiple tools now showing a regression with patch
proposed in https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00875.html

Is this ok for trunk?


Ok.

diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
index d291769547dcd2a02ecf6f80d60d6be7802af4fd..921e9337d1f8ffea78ef566c351fb48a8f6ca064 100755
--- a/contrib/dg-cmp-results.sh
+++ b/contrib/dg-cmp-results.sh
@@ -90,8 +90,7 @@ echo "Newer log file: $NFILE"
 sed $E -e '/^[[:space:]]+===/,$d' $NFILE
 
 # Create a temporary file from the old file's interesting section.
-sed $E -e "1,/$header/d" \
-  -e '/^[[:space:]]+===/,$d' \
+sed $E -e '/^Running target /,/^[[:space:]]+===.*Summary ===/!d' \
   -e '/^[A-Z]+:/!d' \
   -e '/^(WARNING|ERROR):/d' \
   -e 's/\r$//' \
@@ -101,8 +100,7 @@ sed $E -e "1,/$header/d" \
   >/tmp/o$$-$OBASE
 
 # Create a temporary file from the new file's interesting section.
-sed $E -e "1,/$header/d" \
-  -e '/^[[:space:]]+===/,$d' \
+sed $E -e '/^Running target /,/^[[:space:]]+===.*Summary ===/!d' \
   -e '/^[A-Z]+:/!d' \
   -e '/^(WARNING|ERROR):/d' \
   -e 's/\r$//' \


[PATCH, GCC/contrib] Fix variant selection in dg-cmp-results.sh

2017-06-21 Thread Thomas Preudhomme

Hi,

Commit r249422 to dg-cmp-results.sh broke the variant selection feature
where one can restrict the regression test to a specific target variant. This
fix restores the feature.


ChangeLog entry is as follows:

*** contrib/ChangeLog ***

2017-06-21  Thomas Preud'homme  

* dg-cmp-results.sh: Restore filtering on target variant.


Tested on a file with multiple variants which now gives sane results.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
index 921e9337d1f8ffea78ef566c351fb48a8f6ca064..5f2fed5ec3ff0c66d22bc07c84571568730fbcac 100755
--- a/contrib/dg-cmp-results.sh
+++ b/contrib/dg-cmp-results.sh
@@ -90,7 +90,7 @@ echo "Newer log file: $NFILE"
 sed $E -e '/^[[:space:]]+===/,$d' $NFILE
 
 # Create a temporary file from the old file's interesting section.
-sed $E -e '/^Running target /,/^[[:space:]]+===.*Summary ===/!d' \
+sed $E -e "/$header/,/^[[:space:]]+===.*Summary ===/!d" \
   -e '/^[A-Z]+:/!d' \
   -e '/^(WARNING|ERROR):/d' \
   -e 's/\r$//' \
@@ -100,7 +100,7 @@ sed $E -e '/^Running target /,/^[[:space:]]+===.*Summary ===/!d' \
   >/tmp/o$$-$OBASE
 
 # Create a temporary file from the new file's interesting section.
-sed $E -e '/^Running target /,/^[[:space:]]+===.*Summary ===/!d' \
+sed $E -e "/$header/,/^[[:space:]]+===.*Summary ===/!d" \
   -e '/^[A-Z]+:/!d' \
   -e '/^(WARNING|ERROR):/d' \
   -e 's/\r$//' \


Re: [PATCH, GCC/ARM, Stage 1] Rename FPSCR builtins to correct names

2017-06-23 Thread Thomas Preudhomme

Hi Kyrill,

On 10/04/17 15:01, Kyrill Tkachov wrote:

Hi Prakhar,
Sorry for the delay,

On 22/03/17 10:46, Prakhar Bahuguna wrote:

The GCC documentation in section 6.60.8 ARM Floating Point Status and Control
Intrinsics states that the FPSCR register can be read and written to using the
intrinsics __builtin_arm_get_fpscr and __builtin_arm_set_fpscr. However, these
are misnamed within GCC itself and these intrinsic names are not recognised.
This patch corrects the intrinsic names to match the documentation, and adds
tests to verify these intrinsics generate the correct instructions.

Testing done: Ran regression tests on arm-none-eabi for Cortex-M4.

2017-03-09  Prakhar Bahuguna  

gcc/ChangeLog:

* gcc/config/arm/arm-builtins.c (arm_init_builtins): Rename
  __builtin_arm_ldfscr to __builtin_arm_get_fpscr, and rename
  __builtin_arm_stfscr to __builtin_arm_set_fpscr.
* gcc/testsuite/gcc.target/arm/fpscr.c: New file.

Okay for stage 1?


I see that the mistake was in not addressing one of the review comments in:
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01832.html
properly in the patch that added these functions :(

This is ok for stage 1 if a bootstrap and test on arm-none-linux-gnueabihf works
fine.
I don't think we want to maintain the __builtin_arm_[ld,st]fscr names for
backwards compatibility, as they were not documented and are __builtin_arm*
functions that we don't guarantee to maintain.


How about a backport to GCC 5, 6 & 7? The patch applied cleanly on each of these
versions and the testsuite didn't show any regression for any of the backports
when run for Cortex-M7.


Patches attached for reference.

ChangeLog entries:

*** gcc/ChangeLog ***

2017-06-20  Thomas Preud'homme  

Backport from mainline
2017-05-04  Prakhar Bahuguna  

* gcc/config/arm/arm-builtins.c (arm_init_builtins): Rename
__builtin_arm_ldfscr to __builtin_arm_get_fpscr, and rename
__builtin_arm_stfscr to __builtin_arm_set_fpscr.


*** gcc/testsuite/ChangeLog ***

2017-06-20  Thomas Preud'homme  

Backport from mainline
2017-05-04  Prakhar Bahuguna  

gcc/testsuite/
* gcc.target/arm/fpscr.c: New file.


Best regards,

Thomas
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index da321440384628fb1770ff9e96377b341c61da6a..ab0e7c0167ac287b774378c3ecfb15a37d5362e7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2017-06-20  Thomas Preud'homme  
+
+	Backport from mainline
+	2017-05-04  Prakhar Bahuguna  
+
+	* gcc/config/arm/arm-builtins.c (arm_init_builtins): Rename
+	__builtin_arm_ldfscr to __builtin_arm_get_fpscr, and rename
+	__builtin_arm_stfscr to __builtin_arm_set_fpscr.
+
 2017-06-22  Martin Liska  
 
 	Backport from mainline
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 6f4fd9bdb9774b942f7f51145a406258a82ac1e7..edd6dac6ab73d24447e8c9f6e39c5ba22fbf9302 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -1747,10 +1747,10 @@ arm_init_builtins (void)
 	= build_function_type_list (unsigned_type_node, NULL);
 
   arm_builtin_decls[ARM_BUILTIN_GET_FPSCR]
-	= add_builtin_function ("__builtin_arm_ldfscr", ftype_get_fpscr,
+	= add_builtin_function ("__builtin_arm_get_fpscr", ftype_get_fpscr,
 ARM_BUILTIN_GET_FPSCR, BUILT_IN_MD, NULL, NULL_TREE);
   arm_builtin_decls[ARM_BUILTIN_SET_FPSCR]
-	= add_builtin_function ("__builtin_arm_stfscr", ftype_set_fpscr,
+	= add_builtin_function ("__builtin_arm_set_fpscr", ftype_set_fpscr,
 ARM_BUILTIN_SET_FPSCR, BUILT_IN_MD, NULL, NULL_TREE);
 }
 }
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index b411b9dbc108f12bd1931f57d3f4c1f315161ca0..a865ed054597c12de76a953fcf751209c1e4b84c 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2017-06-20  Thomas Preud'homme  
+
+	Backport from mainline
+	2017-05-04  Prakhar Bahuguna  
+
+	* gcc.target/arm/fpscr.c: New file.
+
 2017-06-22  Martin Liska  
 
 	Backport from mainline
diff --git a/gcc/testsuite/gcc.target/arm/fpscr.c b/gcc/testsuite/gcc.target/arm/fpscr.c
new file mode 100644
index ..7b4d71d72d8964f6da0d0604bf59aeb4a895df43
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/fpscr.c
@@ -0,0 +1,16 @@
+/* Test the fpscr builtins.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-skip-if "need fp instructions" { *-*-* } { "-mfloat-abi=soft" } { "" } } */
+/* { dg-add-options arm_fp } */
+
+void
+test_fpscr ()
+{
+  volatile unsigned int status = __builtin_arm_get_fpscr ();
+  __builtin_arm_set_fpscr (status);
+}
+
+/* { dg-final { scan-assembler "mrc\tp10, 7, r\[0-9\]+, cr1, cr0, 0" } } */
+/* { dg-final { scan-assembler "mcr\tp10, 7, r\[0-9\]+, cr1, cr0, 0" } } */
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b24b70c7ef819ea3b45b6019b0db4ad37c6dfce8..61578113c5e3dd8cadcaf5e234f0cd5bb7cced38 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2017-06-20  Thomas Preud

Re: [PATCH, GCC/ARM, Stage 1] Rename FPSCR builtins to correct names

2017-06-26 Thread Thomas Preudhomme

Hi Christophe,

On 23/06/17 20:10, Christophe Lyon wrote:

Hi Thomas,

On 23 June 2017 at 17:48, Thomas Preudhomme
 wrote:

Hi Kyrill,


On 10/04/17 15:01, Kyrill Tkachov wrote:


Hi Prakhar,
Sorry for the delay,

On 22/03/17 10:46, Prakhar Bahuguna wrote:


The GCC documentation in section 6.60.8 ARM Floating Point Status and Control
Intrinsics states that the FPSCR register can be read and written to using the
intrinsics __builtin_arm_get_fpscr and __builtin_arm_set_fpscr. However, these
are misnamed within GCC itself and these intrinsic names are not recognised.
This patch corrects the intrinsic names to match the documentation, and adds
tests to verify these intrinsics generate the correct instructions.

Testing done: Ran regression tests on arm-none-eabi for Cortex-M4.

2017-03-09  Prakhar Bahuguna  

gcc/ChangeLog:

 * gcc/config/arm/arm-builtins.c (arm_init_builtins): Rename
   __builtin_arm_ldfscr to __builtin_arm_get_fpscr, and rename
   __builtin_arm_stfscr to __builtin_arm_set_fpscr.
 * gcc/testsuite/gcc.target/arm/fpscr.c: New file.

Okay for stage 1?



I see that the mistake was in not addressing one of the review comments
in:
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01832.html
properly in the patch that added these functions :(

This is ok for stage 1 if a bootstrap and test on arm-none-linux-gnueabihf works
fine.
I don't think we want to maintain the __builtin_arm_[ld,st]fscr names for
backwards compatibility, as they were not documented and are __builtin_arm*
functions that we don't guarantee to maintain.



How about a backport to GCC 5, 6 & 7? The patch applied cleanly on each of
these versions and the testsuite didn't show any regression for any of the
backports when run for Cortex-M7.



There's a problem with GCC-5:
 gcc.target/arm/fpscr.c: unknown effective target keyword
`arm_fp_ok' for " dg-require-effective-target 4 arm_fp_ok "

Indeed arm_fp_ok effective-target does not exist in the gcc-5 branch.


Oh no. I remember not seeing anything, but I can indeed see this with
compare_tests on the sum file I save after each test run. Alright, what is done
is done; working on a patch now.


Best regards,

Thomas


[PATCH, GCC/ARM, gcc-5-branch] Fix gcc.target/arm/fpscr.c

2017-06-26 Thread Thomas Preudhomme

Hi,

As raised by Christophe Lyon, fpscr.c FAILs because arm_fp_ok and arm_fp
are not defined in GCC 5. This commit changes the test to use the same
recipe as gcc.target/arm/cmp-2.c.

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2017-06-26  Thomas Preud'homme  

* gcc.target/arm/fpscr.c: Require arm_vfp_ok instead of arm_fp_ok and
add -mfpu=vfp -mfloat-abi=softfp instead of fp_ok options.


Ok for GCC 5?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/fpscr.c b/gcc/testsuite/gcc.target/arm/fpscr.c
index 7b4d71d72d8964f6da0d0604bf59aeb4a895df43..cafba4e8d67545bd210477230b9682fe86620e23 100644
--- a/gcc/testsuite/gcc.target/arm/fpscr.c
+++ b/gcc/testsuite/gcc.target/arm/fpscr.c
@@ -1,9 +1,9 @@
 /* Test the fpscr builtins.  */
 
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-require-effective-target arm_vfp_ok } */
 /* { dg-skip-if "need fp instructions" { *-*-* } { "-mfloat-abi=soft" } { "" } } */
-/* { dg-add-options arm_fp } */
+/* { dg-options "-mfpu=vfp -mfloat-abi=softfp" } */
 
 void
 test_fpscr ()


Re: [PATCH, ARM] Implement __ARM_FEATURE_COPROC coprocessor intrinsic feature macro

2017-06-26 Thread Thomas Preudhomme

Hi Christophe,

On 21/06/17 17:57, Christophe Lyon wrote:

Hi,


On 19 June 2017 at 11:32, Richard Earnshaw (lists)
 wrote:

On 16/06/17 15:56, Prakhar Bahuguna wrote:

On 16/06/2017 15:37:18, Richard Earnshaw (lists) wrote:

On 16/06/17 08:48, Prakhar Bahuguna wrote:

On 15/06/2017 17:23:43, Richard Earnshaw (lists) wrote:

On 14/06/17 10:35, Prakhar Bahuguna wrote:

The ARM ACLE defines the __ARM_FEATURE_COPROC macro which indicates which
coprocessor intrinsics are available for the target. If __ARM_FEATURE_COPROC is
undefined, the target does not support coprocessor intrinsics. The feature
levels are defined as follows:

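+---------+-----------+--------------------------------------------------+
| **Bit** | **Value** | **Intrinsics Available**                         |
+---------+-----------+--------------------------------------------------+
| 0       | 0x1       | __arm_cdp, __arm_ldc, __arm_ldcl, __arm_stc,     |
|         |           | __arm_stcl, __arm_mcr and __arm_mrc              |
+---------+-----------+--------------------------------------------------+
| 1       | 0x2       | __arm_cdp2, __arm_ldc2, __arm_stc2, __arm_ldc2l, |
|         |           | __arm_stc2l, __arm_mcr2 and __arm_mrc2           |
+---------+-----------+--------------------------------------------------+
| 2       | 0x4       | __arm_mcrr and __arm_mrrc                        |
+---------+-----------+--------------------------------------------------+
| 3       | 0x8       | __arm_mcrr2 and __arm_mrrc2                      |
+---------+-----------+--------------------------------------------------+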

This patch implements full support for this feature macro as defined in section
5.9 of the ACLE
(https://developer.arm.com/products/software-development-tools/compilers/arm-compiler-5/docs/101028/latest/5-feature-test-macros).

gcc/ChangeLog:

2017-06-14  Prakhar Bahuguna  

   * config/arm/arm-c.c (arm_cpu_builtins): New block to define
__ARM_FEATURE_COPROC according to support.

2017-06-14  Prakhar Bahuguna  
   * gcc/testsuite/gcc.target/arm/acle/cdp.c: Add feature macro bitmap
   test.
   * gcc/testsuite/gcc.target/arm/acle/cdp2.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/ldc.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/ldc2.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/ldc2l.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/ldcl.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/mcr.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/mcr2.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/mcrr.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/mcrr2.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/mrc.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/mrc2.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/mrrc.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/mrrc2.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/stc.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/stc2.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/stc2l.c: Likewise.
   * gcc/testsuite/gcc.target/arm/acle/stcl.c: Likewise.

Testing done: ACLE regression tests updated with tests for feature macro bits.
All regression tests pass.

Okay for trunk?


0001-Implement-__ARM_FEATURE_COPROC-coprocessor-intrinsic.patch


 From 79d71aec9d2bdee936b240ae49368ff5f8d8fc48 Mon Sep 17 00:00:00 2001
From: Prakhar Bahuguna 
Date: Tue, 2 May 2017 13:43:40 +0100
Subject: [PATCH] Implement __ARM_FEATURE_COPROC coprocessor intrinsic feature
  macro

---
  gcc/config/arm/arm-c.c| 19 +++
  gcc/testsuite/gcc.target/arm/acle/cdp.c   |  3 +++
  gcc/testsuite/gcc.target/arm/acle/cdp2.c  |  3 +++
  gcc/testsuite/gcc.target/arm/acle/ldc.c   |  3 +++
  gcc/testsuite/gcc.target/arm/acle/ldc2.c  |  3 +++
  gcc/testsuite/gcc.target/arm/acle/ldc2l.c |  3 +++
  gcc/testsuite/gcc.target/arm/acle/ldcl.c  |  3 +++
  gcc/testsuite/gcc.target/arm/acle/mcr.c   |  3 +++
  gcc/testsuite/gcc.target/arm/acle/mcr2.c  |  3 +++
  gcc/testsuite/gcc.target/arm/acle/mcrr.c  |  3 +++
  gcc/testsuite/gcc.target/arm/acle/mcrr2.c |  3 +++
  gcc/testsuite/gcc.target/arm/acle/mrc.c   |  3 +++
  gcc/testsuite/gcc.target/arm/acle/mrc2.c  |  3 +++
  gcc/testsuite/gcc.target/arm/acle/mrrc.c  |  3 +++
  gcc/testsuite/gcc.target/arm/acle/mrrc2.c |  3 +++
  gcc/testsuite/gcc.target/arm/acle/stc.c   |  3 +++
  gcc/testsuite/gcc.target/arm/acle/stc2.c  |  3 +++
  gcc/testsuite/gcc.target/arm/acle/stc2l.c |  3 +++
  gcc/testsuite/gcc.target/arm/acle/stcl.c  |  3 +++
  19 files changed, 73 insertions(+)

diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 3abe7d1f1f5..3daf4e5e1f3 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -200,6 +200,25 @@ arm_cpu_builtins (struct cpp_reader* pfile)
def_or_undef_macro (pfile, "__ARM_FEATURE_IDIV", TARGET_IDIV);

def_or_undef_macro (pfile, "__ARM_ASM_SYNTAX_UNIFIED__", 
inline_asm_unified);
+
+  if ((!TARGET_THUMB || TARGET_THUMB2) && arm_arch4 &&


(!TARGET_THU

Re: [PATCH, ARM] Implement __ARM_FEATURE_COPROC coprocessor intrinsic feature macro

2017-06-26 Thread Thomas Preudhomme



On 26/06/17 15:16, Christophe Lyon wrote:

On 26 June 2017 at 16:09, Thomas Preudhomme
 wrote:

Hi Christophe,


On 21/06/17 17:57, Christophe Lyon wrote:


Hi,


On 19 June 2017 at 11:32, Richard Earnshaw (lists)
 wrote:


On 16/06/17 15:56, Prakhar Bahuguna wrote:


On 16/06/2017 15:37:18, Richard Earnshaw (lists) wrote:


On 16/06/17 08:48, Prakhar Bahuguna wrote:


On 15/06/2017 17:23:43, Richard Earnshaw (lists) wrote:


On 14/06/17 10:35, Prakhar Bahuguna wrote:


The ARM ACLE defines the __ARM_FEATURE_COPROC macro which indicates which
coprocessor intrinsics are available for the target. If __ARM_FEATURE_COPROC is
undefined, the target does not support coprocessor intrinsics. The feature
levels are defined as follows:


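+---------+-----------+--------------------------------------------------+
| **Bit** | **Value** | **Intrinsics Available**                         |
+---------+-----------+--------------------------------------------------+
| 0       | 0x1       | __arm_cdp, __arm_ldc, __arm_ldcl, __arm_stc,     |
|         |           | __arm_stcl, __arm_mcr and __arm_mrc              |
+---------+-----------+--------------------------------------------------+
| 1       | 0x2       | __arm_cdp2, __arm_ldc2, __arm_stc2, __arm_ldc2l, |
|         |           | __arm_stc2l, __arm_mcr2 and __arm_mrc2           |
+---------+-----------+--------------------------------------------------+
| 2       | 0x4       | __arm_mcrr and __arm_mrrc                        |
+---------+-----------+--------------------------------------------------+
| 3       | 0x8       | __arm_mcrr2 and __arm_mrrc2                      |
+---------+-----------+--------------------------------------------------+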

This patch implements full support for this feature macro as defined in
section 5.9 of the ACLE

(https://developer.arm.com/products/software-development-tools/compilers/arm-compiler-5/docs/101028/latest/5-feature-test-macros).

gcc/ChangeLog:

2017-06-14  Prakhar Bahuguna  

* config/arm/arm-c.c (arm_cpu_builtins): New block to define
 __ARM_FEATURE_COPROC according to support.

2017-06-14  Prakhar Bahuguna  
* gcc/testsuite/gcc.target/arm/acle/cdp.c: Add feature macro
bitmap
test.
* gcc/testsuite/gcc.target/arm/acle/cdp2.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/ldc.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/ldc2.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/ldc2l.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/ldcl.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/mcr.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/mcr2.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/mcrr.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/mcrr2.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/mrc.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/mrc2.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/mrrc.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/mrrc2.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/stc.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/stc2.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/stc2l.c: Likewise.
* gcc/testsuite/gcc.target/arm/acle/stcl.c: Likewise.

Testing done: ACLE regression tests updated with tests for feature
macro bits.
All regression tests pass.

Okay for trunk?


0001-Implement-__ARM_FEATURE_COPROC-coprocessor-intrinsic.patch


  From 79d71aec9d2bdee936b240ae49368ff5f8d8fc48 Mon Sep 17 00:00:00
2001
From: Prakhar Bahuguna 
Date: Tue, 2 May 2017 13:43:40 +0100
Subject: [PATCH] Implement __ARM_FEATURE_COPROC coprocessor
intrinsic feature
   macro

---
   gcc/config/arm/arm-c.c| 19 +++
   gcc/testsuite/gcc.target/arm/acle/cdp.c   |  3 +++
   gcc/testsuite/gcc.target/arm/acle/cdp2.c  |  3 +++
   gcc/testsuite/gcc.target/arm/acle/ldc.c   |  3 +++
   gcc/testsuite/gcc.target/arm/acle/ldc2.c  |  3 +++
   gcc/testsuite/gcc.target/arm/acle/ldc2l.c |  3 +++
   gcc/testsuite/gcc.target/arm/acle/ldcl.c  |  3 +++
   gcc/testsuite/gcc.target/arm/acle/mcr.c   |  3 +++
   gcc/testsuite/gcc.target/arm/acle/mcr2.c  |  3 +++
   gcc/testsuite/gcc.target/arm/acle/mcrr.c  |  3 +++
   gcc/testsuite/gcc.target/arm/acle/mcrr2.c |  3 +++
   gcc/testsuite/gcc.target/arm/acle/mrc.c   |  3 +++
   gcc/testsuite/gcc.target/arm/acle/mrc2.c  |  3 +++
   gcc/testsuite/gcc.target/arm/acle/mrrc.c  |  3 +++
   gcc/testsuite/gcc.target/arm/acle/mrrc2.c |  3 +++
   gcc/testsuite/gcc.target/arm/acle/stc.c   |  3 +++
   gcc/testsuite/gcc.target/arm/acle/stc2.c  |  3 +++
   gcc/testsuite/gcc.target/arm/acle/stc2l.c |  3 +++
   gcc/testsuite/gcc.target/arm/acle/stcl.c  |  3 +++
   19 files changed, 73 insertions(+)

diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 3abe7d1f1f5..3daf4e5e1f3 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -200,6 +200,25 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 def_or_undef_macro (pfile, "__ARM_FEATURE_IDIV", TARGET_IDIV);

 def_or_undef_macro (pfile, "__ARM_ASM_SYNTAX_UNIFIED__",
inline_a

Re: [PATCH, ARM] Implement __ARM_FEATURE_COPROC coprocessor intrinsic feature macro

2017-06-28 Thread Thomas Preudhomme



On 26/06/17 17:01, Thomas Preudhomme wrote:

On 26/06/17 15:16, Christophe Lyon wrote:




You mean the macro is expected not to be defined on ARMv8-A ?


Correct. Most of the instructions its value represents are not available on
ARMv8-A, and for those that are, the intrinsics are deprecated.


I've just noticed that many such instructions not available on ARMv8-A are 
accepted by GNU as. I would like to enable/disable coprocessor intrinsics tests 
based on what GNU as returns regarding availability of these instructions so 
hold on a bit more.


Best regards,

Thomas


Re: [PATCH, GCC/ARM, gcc-5-branch, ping] Fix gcc.target/arm/fpscr.c

2017-06-28 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On 26/06/17 12:32, Thomas Preudhomme wrote:

Hi,

As raised by Christophe Lyon, fpscr.c FAILs because arm_fp_ok and arm_fp
are not defined in GCC 5. This commit changes the test to use the same
recipe as gcc.target/arm/cmp-2.c

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2017-06-26  Thomas Preud'homme  

 * gcc.target/arm/fpscr.c: Require arm_vfp_ok instead of arm_fp_ok and
 add -mfpu=vfp -mfloat-abi=softfp instead of fp_ok options.


Ok for GCC 5?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/fpscr.c b/gcc/testsuite/gcc.target/arm/fpscr.c
index 7b4d71d72d8964f6da0d0604bf59aeb4a895df43..cafba4e8d67545bd210477230b9682fe86620e23 100644
--- a/gcc/testsuite/gcc.target/arm/fpscr.c
+++ b/gcc/testsuite/gcc.target/arm/fpscr.c
@@ -1,9 +1,9 @@
 /* Test the fpscr builtins.  */
 
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-require-effective-target arm_vfp_ok } */
 /* { dg-skip-if "need fp instructions" { *-*-* } { "-mfloat-abi=soft" } { "" } } */
-/* { dg-add-options arm_fp } */
+/* { dg-options "-mfpu=vfp -mfloat-abi=softfp" } */
 
 void
 test_fpscr ()


Re: [PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-28 Thread Thomas Preudhomme

On 20/06/17 13:44, Christophe Lyon wrote:




The results with a more recent trunk (r249356) are here:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/249356-consistent_neon_check.patch/report-build-info.html

They are slightly different, but still tedious to check ;-)


I've checked arm-none-linux-gnueabi and arm-none-linux-gnueabihf and found that:

* there's no new FAIL
* changes to UNSUPPORTED and NA are for the same files
* changes are only for tests in a vect directory
* changes for arm-none-linux-gnueabihf are only when targeting vfp without neon 
(tests are disabled because there is no vector unit)


Changes to arm-none-linux-gnueabi make sense since this defaults to soft
floating point and none of the disabled tests adds any option to select another
variant.


I believe this all makes sense.

Therefore, is this ok to commit?

Best regards,

Thomas
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ded6383cc1f9a1489cd83e1dace0c2fc48e252c3..aa8550c9d2cf0ae7e157d9c67fa06ad811651421 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2916,7 +2916,7 @@ proc check_effective_target_vect_int { } {
 	 || [istarget alpha*-*-*]
 	 || [istarget ia64-*-*] 
 	 || [istarget aarch64*-*-*]
-	 || [check_effective_target_arm32]
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && ([et-is-effective-target mips_loongson]
 		 || [et-is-effective-target mips_msa])) } {
@@ -2944,8 +2944,7 @@ proc check_effective_target_vect_intfloat_cvt { } {
 if { [istarget i?86-*-*] || [istarget x86_64-*-*]
 	 || ([istarget powerpc*-*-*]
 		 && ![istarget powerpc-*-linux*paired*])
-	 || ([istarget arm*-*-*]
-		 && [check_effective_target_arm_neon_ok])
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && [et-is-effective-target mips_msa]) } {
 	set et_vect_intfloat_cvt_saved($et_index) 1
@@ -2987,8 +2986,7 @@ proc check_effective_target_vect_uintfloat_cvt { } {
 	 || ([istarget powerpc*-*-*]
 		 && ![istarget powerpc-*-linux*paired*])
 	 || [istarget aarch64*-*-*]
-	 || ([istarget arm*-*-*]
-		 && [check_effective_target_arm_neon_ok])
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && [et-is-effective-target mips_msa]) } {
 	set et_vect_uintfloat_cvt_saved($et_index) 1
@@ -3016,8 +3014,7 @@ proc check_effective_target_vect_floatint_cvt { } {
 if { [istarget i?86-*-*] || [istarget x86_64-*-*]
 	 || ([istarget powerpc*-*-*]
 		 && ![istarget powerpc-*-linux*paired*])
-	 || ([istarget arm*-*-*]
-		 && [check_effective_target_arm_neon_ok])
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && [et-is-effective-target mips_msa]) } {
 	set et_vect_floatint_cvt_saved($et_index) 1
@@ -3043,8 +3040,7 @@ proc check_effective_target_vect_floatuint_cvt { } {
 	set et_vect_floatuint_cvt_saved($et_index) 0
 if { ([istarget powerpc*-*-*]
 	  && ![istarget powerpc-*-linux*paired*])
-	|| ([istarget arm*-*-*]
-		&& [check_effective_target_arm_neon_ok])
+	|| [is-effective-target arm_neon]
 	|| ([istarget mips*-*-*]
 		&& [et-is-effective-target mips_msa]) } {
 	   set et_vect_floatuint_cvt_saved($et_index) 1
@@ -4903,7 +4899,7 @@ proc check_effective_target_vect_shift { } {
 	 || [istarget ia64-*-*]
 	 || [istarget i?86-*-*] || [istarget x86_64-*-*]
 	 || [istarget aarch64*-*-*]
-	 || [check_effective_target_arm32]
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && ([et-is-effective-target mips_msa]
 		 || [et-is-effective-target mips_loongson])) } {
@@ -4921,7 +4917,7 @@ proc check_effective_target_whole_vector_shift { } {
 	 || [istarget ia64-*-*]
 	 || [istarget aarch64*-*-*]
 	 || [istarget powerpc64*-*-*]
-	 || ([check_effective_target_arm32]
+	 || ([is-effective-target arm_neon]
 	 && [check_effective_target_arm_little_endian])
 	 || ([istarget mips*-*-*]
 	 && [et-is-effective-target mips_loongson]) } {
@@ -4945,8 +4941,7 @@ proc check_effective_target_vect_bswap { } {
 } else {
 	set et_vect_bswap_saved($et_index) 0
 	if { [istarget aarch64*-*-*]
- || ([istarget arm*-*-*]
-&& [check_effective_target_arm_neon])
+ || [is-effective-target arm_neon]
 	   } {
 	   set et_vect_bswap_saved($et_index) 1
 	}
@@ -4969,7 +4964,7 @@ proc check_effective_target_vect_shift_char { } {
 	set et_vect_shift_char_saved($et_index) 0
 	if { ([istarget powerpc*-*-*]
  && ![istarget powerpc-*-linux*paired*])
-	 || [check_effective_target_arm32]
+	 || [is-effective-target arm_neon]
 	 || ([istarget mips*-*-*]
 		 && [et-is-effective-target mips_msa]) } {
 	   set et_vect_shift_char_saved($et_index) 1
@@ -4987,10 +4982,10 @@ proc check_effective_target_vect_shift_char { } {
 
 proc check_effective_target_vect_long { } {
  

Re: [PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-28 Thread Thomas Preudhomme



On 28/06/17 15:59, Kyrill Tkachov wrote:

Hi Thomas,

On 28/06/17 15:49, Thomas Preudhomme wrote:

On 20/06/17 13:44, Christophe Lyon wrote:




The results with a more recent trunk (r249356) are here:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/249356-consistent_neon_check.patch/report-build-info.html 



They are slightly different, but still tedious to check ;-)


I've checked arm-none-linux-gnueabi and arm-none-linux-gnueabihf and found that:

* there's no new FAIL
* changes to UNSUPPORTED and NA are for the same files
* changes are only for tests in a vect directory
* changes for arm-none-linux-gnueabihf are only when targeting vfp without 
neon (tests are disabled because there is no vector unit)


Changes to arm-none-linux-gnueabi make sense since this defaults to soft
floating point and none of the disabled tests adds any option to select another
variant.


I believe this all makes sense.

Therefore, is this ok to commit?

Best regards,

Thomas


@@ -4987,10 +4982,10 @@ proc check_effective_target_vect_shift_char { } {

  proc check_effective_target_vect_long { } {
  if { [istarget i?86-*-*] || [istarget x86_64-*-*]
- || (([istarget powerpc*-*-*]
-  && ![istarget powerpc-*-linux*paired*])
+ || (([istarget powerpc*-*-*]
+  && ![istarget powerpc-*-linux*paired*])
&& [check_effective_target_ilp32])


Is this just a whitespace change?
If it is intended then okay.


It is, yes: trailing whitespace. I took the liberty of fixing it because I was
changing other things in the same procedure.




This is okay with a ChangeLog entry.


Sorry, I should have pasted it again from the initial message.

2017-06-06  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_vect_int): Replace
current ARM check by ARM NEON's availability check.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_whole_vector_shift): Likewise.
(check_effective_target_vect_bswap): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_vect_float): Likewise.
(check_effective_target_vect_perm): Likewise.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi_pattern): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_shift): Likewise.
(check_effective_target_vect_extract_even_odd): Likewise.
(check_effective_target_vect_interleave): Likewise.
(check_effective_target_vect_multiple_sizes): Likewise.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_max_reduc): Likewise.



Thanks, this looks like a good change.
Kyrill


Thanks!

Best regards,

Thomas


Re: [PATCH, GCC/ARM, ping] Remove ARMv8-M code for D17-D31

2017-06-28 Thread Thomas Preudhomme

Ping?

*** gcc/ChangeLog ***

2017-06-13  Thomas Preud'homme  

* config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
Extensions with more than 16 double VFP registers.
(cmse_nonsecure_entry_clear_before_return): Remove second entry of
to_clear_mask and all code related to it and make the remaining
entry a 64-bit scalar integer variable and adapt code accordingly.

Best regards,

Thomas

On 20/06/17 16:01, Thomas Preudhomme wrote:

Hi,

Function cmse_nonsecure_entry_clear_before_return has code to deal with
high VFP registers (D16-D31) while ARMv8-M Baseline and Mainline both do
not support more than 16 double VFP registers (D0-D15). This makes this
security-sensitive code harder to read for not much benefit since the
libcall for cmse_nonsecure_call functions does not deal with those high
VFP registers anyway.

This commit gets rid of this code for simplicity and fixes 2 issues in
the same function:

- stop the first loop when reaching maxregno to avoid dealing with VFP
   registers if targeting Thumb-1 or using -mfloat-abi=soft
- include maxregno in that loop
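
As a rough illustration of the two loop fixes listed above, here is a minimal standalone sketch, not the GCC code itself. The register numbers in the enum are hypothetical, and the mask is seeded only with the argument registers. The sketch checks that every bit index is strictly below the width of the mask before shifting.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register numbering, for illustration only; the real values
   are defined by the ARM backend.  */
enum { NUM_ARG_REGS = 4, IP_REGNUM = 12, PC_REGNUM = 15, NUM_REGS = 48 };

/* Build a mask of registers to clear, visiting NUM_ARG_REGS..MAXREGNO
   inclusive and skipping IP..PC.  */
static uint64_t
build_clear_mask (const bool call_used[NUM_REGS], int maxregno)
{
  /* Seed the mask with the argument registers.  */
  uint64_t mask = (UINT64_C (1) << NUM_ARG_REGS) - 1;

  /* Every bit index must be strictly below the width of the mask.  */
  assert (maxregno >= 0 && (unsigned) maxregno < sizeof (mask) * CHAR_BIT);
  assert (maxregno < NUM_REGS);

  /* The bound is maxregno itself, so no register above it is visited.  */
  for (int regno = NUM_ARG_REGS; regno <= maxregno; regno++)
    {
      if (regno >= IP_REGNUM && regno <= PC_REGNUM)
        continue;
      if (call_used[regno])
        mask |= UINT64_C (1) << regno;
    }
  return mask;
}
```

With maxregno left at IP_REGNUM, as in the soft-float case, the loop never reaches the higher-numbered registers.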

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-06-13  Thomas Preud'homme  

 * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
 Extensions with more than 16 double VFP registers.
 (cmse_nonsecure_entry_clear_before_return): Remove second entry of
 to_clear_mask and all code related to it and make the remaining
 entry a 64-bit scalar integer variable and adapt code accordingly.

Testing: Testsuite shows no regression when run for ARMv8-M Baseline and
ARMv8-M Mainline.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 259597d8890ee84c5bd92b12b6f9f6521c8dcd2e..60a4d1f46765d285de469f51fbb5a0ad76d56d9b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3620,6 +3620,11 @@ arm_option_override (void)
   if (use_cmse && !arm_arch_cmse)
 error ("target CPU does not support ARMv8-M Security Extensions");
 
+  /* We don't clear D16-D31 VFP registers for cmse_nonsecure_call functions
+ and ARMv8-M Baseline and Mainline do not allow such configuration.  */
+  if (use_cmse && LAST_VFP_REGNUM > LAST_LO_VFP_REGNUM)
+error ("ARMv8-M Security Extensions incompatible with selected FPU");
+
   /* Disable scheduling fusion by default if it's not armv7 processor
  or doesn't prefer ldrd/strd.  */
   if (flag_schedule_fusion == 2
@@ -24996,15 +25001,15 @@ thumb1_expand_prologue (void)
 void
 cmse_nonsecure_entry_clear_before_return (void)
 {
-  uint64_t to_clear_mask[2];
+  uint64_t to_clear_mask;
   uint32_t padding_bits_to_clear = 0;
   uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
   int regno, maxregno = IP_REGNUM;
   tree result_type;
   rtx result_rtl;
 
-  to_clear_mask[0] = (1ULL << (NUM_ARG_REGS)) - 1;
-  to_clear_mask[0] |= (1ULL << IP_REGNUM);
+  to_clear_mask = (1ULL << (NUM_ARG_REGS)) - 1;
+  to_clear_mask |= (1ULL << IP_REGNUM);
 
   /* If we are not dealing with -mfloat-abi=soft we will need to clear VFP
  registers.  We also check that TARGET_HARD_FLOAT and !TARGET_THUMB1 hold
@@ -25015,23 +25020,22 @@ cmse_nonsecure_entry_clear_before_return (void)
   maxregno = LAST_VFP_REGNUM;
 
   float_mask &= ~((1ULL << FIRST_VFP_REGNUM) - 1);
-  to_clear_mask[0] |= float_mask;
-
-  float_mask = (1ULL << (maxregno - 63)) - 1;
-  to_clear_mask[1] = float_mask;
+  to_clear_mask |= float_mask;
 
   /* Make sure we don't clear the two scratch registers used to clear the
 	 relevant FPSCR bits in output_return_instruction.  */
   emit_use (gen_rtx_REG (SImode, IP_REGNUM));
-  to_clear_mask[0] &= ~(1ULL << IP_REGNUM);
+  to_clear_mask &= ~(1ULL << IP_REGNUM);
   emit_use (gen_rtx_REG (SImode, 4));
-  to_clear_mask[0] &= ~(1ULL << 4);
+  to_clear_mask &= ~(1ULL << 4);
 }
 
+  gcc_assert ((unsigned) maxregno <= sizeof (to_clear_mask) * __CHAR_BIT__);
+
   /* If the user has defined registers to be caller saved, these are no longer
  restored by the function before returning and must thus be cleared for
  security purposes.  */
-  for (regno = NUM_ARG_REGS; regno < LAST_VFP_REGNUM; regno++)
+  for (regno = NUM_ARG_REGS; regno <= maxregno; regno++)
 {
   /* We do not touch registers that can be used to pass arguments as per
 	 the AAPCS, since these should never be made callee-saved by user
@@ -25041,7 +25045,7 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (IN_RANGE (regno, IP_REGNUM, PC_REGNUM))
 	continue;
   if (call_used_regs[regno])
-	to_clear_mask[regno / 64] |= (1ULL << (regno % 64));
+	to_clear_mask |= (1ULL << regno);
 }
 
   /* Make sure we do not clear the registers used to return the result in. 

Re: [PATCH, GCC/ARM] Remove ARMv8-M code for D17-D31

2017-06-29 Thread Thomas Preudhomme

Hi Richard,

On 28/06/17 16:56, Richard Earnshaw (lists) wrote:

On 20/06/17 16:01, Thomas Preudhomme wrote:

Hi,

Function cmse_nonsecure_entry_clear_before_return has code to deal with
high VFP registers (D16-D31) while ARMv8-M Baseline and Mainline both do
not support more than 16 double VFP registers (D0-D15). This makes this
security-sensitive code harder to read for not much benefit since the
libcall for cmse_nonsecure_call functions does not deal with those high
VFP registers anyway.

This commit gets rid of this code for simplicity and fixes 2 issues in
the same function:

- stop the first loop when reaching maxregno to avoid dealing with VFP
   registers if targeting Thumb-1 or using -mfloat-abi=soft
- include maxregno in that loop



This is silently baking in dangerous assumptions about GCC's internal
numbering of the registers.  That's not a good idea from a long-term
portability perspective.

At the very least you need to assert that all the interesting registers
are numbered in the range 0..63; but ideally the code should just handle
pretty much any assignment of internal register numbers.


Well there is already this:

gcc_assert ((unsigned) maxregno <= sizeof (to_clear_mask) * __CHAR_BIT__);



Did you consider using sbitmaps rather than doing all the multi-word
stuff by steam?


No but am happy to. I'll respin the patch.
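
To make the suggestion concrete, here is a minimal sketch of a register set that does not depend on register numbers fitting in 0..63. It is a hypothetical stand-in written for this note, not GCC's sbitmap interface; all names in it are invented.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a register set; this is not GCC's sbitmap.  */
struct regset
{
  unsigned nbits;
  unsigned long *words;
};

#define REGSET_WORD_BITS (sizeof (unsigned long) * CHAR_BIT)

/* Allocate a zeroed set able to hold bits 0..NBITS-1.  */
static struct regset
regset_alloc (unsigned nbits)
{
  struct regset s;
  size_t nwords = (nbits + REGSET_WORD_BITS - 1) / REGSET_WORD_BITS;

  s.nbits = nbits;
  s.words = calloc (nwords ? nwords : 1, sizeof (unsigned long));
  return s;
}

/* Set bit REGNO; out-of-range numbers are ignored.  */
static void
regset_set (struct regset *s, unsigned regno)
{
  if (s->words && regno < s->nbits)
    s->words[regno / REGSET_WORD_BITS] |= 1UL << (regno % REGSET_WORD_BITS);
}

/* Return true if bit REGNO is set.  */
static bool
regset_test (const struct regset *s, unsigned regno)
{
  return s->words && regno < s->nbits
	 && ((s->words[regno / REGSET_WORD_BITS]
	      >> (regno % REGSET_WORD_BITS)) & 1UL);
}

static void
regset_free (struct regset *s)
{
  free (s->words);
  s->words = NULL;
  s->nbits = 0;
}
```

Because the word index and bit index are computed inside the helpers, callers never shift by a raw register number, so no assumption about the internal numbering is needed.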

Best regards,

Thomas


[PATCH, GCC/ARM, 0/3] Add support for ARMv8-R

2017-06-29 Thread Thomas Preudhomme

Hi,

This patch series adds support for the ARMv8-R architecture[1] and ARM 
Cortex-R52[2] to GCC. The patch series consists of the following patches:


[ 1/3] Add missing MIDR information for ARM Cortex-R7 and Cortex-R8 processors
[ 2/3] Add support for ARMv8-R architecture
[ 3/3] Add support for ARM Cortex-R52

[1] 
https://developer.arm.com/products/architecture/r-profile/docs/ddi0568/latest/arm-architecture-reference-manual-supplement-armv8-for-the-armv8-r-aarch32-architecture-profile

[2] https://developer.arm.com/products/processors/cortex-r/cortex-r52


[PATCH 1/3, GCC/ARM] Add MIDR info for ARM Cortex-R7 and Cortex-R8

2017-06-29 Thread Thomas Preudhomme

Hi,

The driver is missing MIDR information for processors ARM Cortex-R7 and
Cortex-R8 to support -march/-mcpu/-mtune=native on the command line.
This patch adds the missing information.
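
As an illustration of what such a table is for, the sketch below decodes the implementer and primary part number fields of a raw MIDR value and looks the part up. It is an assumption-laden simplification: the driver's table stores part numbers as strings, and the struct and function names here are hypothetical. Only the four R-profile entries shown in this patch are included.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry with the part number stored as an integer.  */
struct cpu_entry
{
  unsigned part;	/* MIDR primary part number, bits [15:4].  */
  const char *arch;
  const char *cpu;
};

/* R-profile entries taken from the patch.  */
static const struct cpu_entry cpu_table[] = {
  { 0xc14, "armv7-r", "cortex-r4" },
  { 0xc15, "armv7-r", "cortex-r5" },
  { 0xc17, "armv7-r", "cortex-r7" },
  { 0xc18, "armv7-r", "cortex-r8" },
};

/* Return the entry matching MIDR, or NULL if it is unknown.  */
static const struct cpu_entry *
lookup_midr (uint32_t midr)
{
  unsigned implementer = (midr >> 24) & 0xff;	/* Bits [31:24].  */
  unsigned part = (midr >> 4) & 0xfff;		/* Bits [15:4].  */

  if (implementer != 0x41)	/* 0x41 is the Arm implementer code.  */
    return NULL;

  for (size_t i = 0; i < sizeof (cpu_table) / sizeof (cpu_table[0]); i++)
    if (cpu_table[i].part == part)
      return &cpu_table[i];
  return NULL;
}
```

Without the two new entries, a lookup for part 0xc17 or 0xc18 would return NULL and the driver would have nothing to substitute for "native".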

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  

* config/arm/driver-arm.c (arm_cpu_table): Add entry for ARM
Cortex-R7 and Cortex-R8 processors.

Is this ok for master?

Best regards,

Thomas
diff --git a/gcc/config/arm/driver-arm.c b/gcc/config/arm/driver-arm.c
index b034f13fda63f5892bbd9879d72f4b02e2632d69..29873d57a1e45fd989f6ff01dd4a2ae7320d93bb 100644
--- a/gcc/config/arm/driver-arm.c
+++ b/gcc/config/arm/driver-arm.c
@@ -54,6 +54,8 @@ static struct vendor_cpu arm_cpu_table[] = {
 {"0xd09", "armv8-a+crc", "cortex-a73"},
 {"0xc14", "armv7-r", "cortex-r4"},
 {"0xc15", "armv7-r", "cortex-r5"},
+{"0xc17", "armv7-r", "cortex-r7"},
+{"0xc18", "armv7-r", "cortex-r8"},
 {"0xc20", "armv6-m", "cortex-m0"},
 {"0xc21", "armv6-m", "cortex-m1"},
 {"0xc23", "armv7-m", "cortex-m3"},


[PATCH 2/3, GCC/ARM] Add support for ARMv8-R architecture

2017-06-29 Thread Thomas Preudhomme

Hi,

This patch adds support for ARMv8-R architecture [1] which was recently
announced. User level instructions for ARMv8-R are the same as those in
ARMv8-A AArch32 mode, so this patch defines ARMv8-R to have the same
features as ARMv8-A in ARM backend.
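
The aliasing can be pictured with the following simplified sketch. The enum is a hypothetical, cut-down version of the backend's feature list, and the has_feature helper is invented for this note. It shows how one feature-list macro can simply expand to another, and how a +crc variant appends one more bit before the isa_nobit terminator.

```c
#include <stdbool.h>

/* Hypothetical, cut-down feature enum; the real one is much larger.  */
enum isa_feature
{
  isa_nobit,		/* Terminates every feature list.  */
  isa_bit_ARMv8,
  isa_bit_thumb2,
  isa_bit_crc32
};

/* A feature set is a comma-separated list, so one set can alias another.  */
#define ISA_ARMv8a	isa_bit_ARMv8, isa_bit_thumb2
#define ISA_ARMv8r	ISA_ARMv8a

static const enum isa_feature armv8r_bits[] = { ISA_ARMv8r, isa_nobit };
static const enum isa_feature armv8r_crc_bits[]
  = { ISA_ARMv8r, isa_bit_crc32, isa_nobit };

/* Return true if WANTED appears in the isa_nobit-terminated list BITS.  */
static bool
has_feature (const enum isa_feature *bits, enum isa_feature wanted)
{
  for (; *bits != isa_nobit; bits++)
    if (*bits == wanted)
      return true;
  return false;
}
```

In this sketch, has_feature (armv8r_bits, isa_bit_crc32) is false while has_feature (armv8r_crc_bits, isa_bit_crc32) is true, which mirrors the difference between the two new architecture entries.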

[1] 
https://developer.arm.com/products/architecture/r-profile/docs/ddi0568/latest/arm-architecture-reference-manual-supplement-armv8-for-the-armv8-r-aarch32-architecture-profile


ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  

* config/arm/arm-cpus.in (armv8-r, armv8-r+crc): Add new entry.
* config/arm/arm-cpu-cdata.h: Regenerate.
* config/arm/arm-cpu-data.h: Regenerate.
* config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R
enumerator.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add entry for ARMv8-R and
ARMv8-R with CRC extensions.
* doc/invoke.texi: Mention -march=armv8-r and -march=armv8-r+crc
options.  Document meaning of -march=armv8-r+crc.

*** gcc/testsuite/ChangeLog ***

2017-01-31  Thomas Preud'homme  

* lib/target-supports.exp: Generate
check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r
and check_effective_target_arm_arch_v8r_multilib.

*** libgcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  

* config/arm/lib1funcs.S: Define __ARM_ARCH__ to 8 for ARMv8-R.

Tested by building an arm-none-eabi GCC cross-compiler targeting
ARMv8-R.

Is this ok for stage1?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-cpu-cdata.h b/gcc/config/arm/arm-cpu-cdata.h
index b3888120daa8494eb41bde0368122ad2f06d81af..0a122f5febaaceeeb5a405cb5a64e1edd9b044f3 100644
--- a/gcc/config/arm/arm-cpu-cdata.h
+++ b/gcc/config/arm/arm-cpu-cdata.h
@@ -1041,6 +1041,20 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
 },
   },
   {
+"armv8-r",
+{
+  ISA_ARMv8r,
+  isa_nobit
+},
+  },
+  {
+"armv8-r+crc",
+{
+  ISA_ARMv8r,isa_bit_crc32,
+  isa_nobit
+},
+  },
+  {
 "iwmmxt",
 {
   ISA_ARMv5te,isa_bit_xscale,isa_bit_iwmmxt,
diff --git a/gcc/config/arm/arm-cpu-data.h b/gcc/config/arm/arm-cpu-data.h
index d6200f9bdc09a9d0c973853b0152a2800eaf2fe5..48c1d88032c1c5dc7c6cba71511f79fe9f2533ea 100644
--- a/gcc/config/arm/arm-cpu-data.h
+++ b/gcc/config/arm/arm-cpu-data.h
@@ -1478,6 +1478,26 @@ static const struct processors all_architectures[] =
 NULL
   },
   {
+"armv8-r", TARGET_CPU_cortexr4,
+(TF_CO_PROC),
+"8R", BASE_ARCH_8R,
+{
+  ISA_ARMv8r,
+  isa_nobit
+},
+NULL
+  },
+  {
+"armv8-r+crc", TARGET_CPU_cortexr4,
+(TF_CO_PROC),
+"8R", BASE_ARCH_8R,
+{
+  ISA_ARMv8r,isa_bit_crc32,
+  isa_nobit
+},
+NULL
+  },
+  {
 "iwmmxt", TARGET_CPU_iwmmxt,
 (TF_LDSCHED | TF_STRONG | TF_XSCALE),
 "5TE", BASE_ARCH_5TE,
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index fc5d935182ba70de5ab2aefeec492318f42e95c5..be1f0ca4e38ae76683b77d8c3b79a066e62325d7 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -287,6 +287,20 @@ begin arch armv8-m.main+dsp
  isa ARMv8m_main bit_ARMv7em
 end arch armv8-m.main+dsp
 
+begin arch armv8-r
+ tune for cortex-r4
+ tune flags CO_PROC
+ base 8R
+ isa ARMv8r
+end arch armv8-r
+
+begin arch armv8-r+crc
+ tune for cortex-r4
+ tune flags CO_PROC
+ base 8R
+ isa ARMv8r bit_crc32
+end arch armv8-r+crc
+
 begin arch iwmmxt
  tune for iwmmxt
  tune flags LDSCHED STRONG XSCALE
diff --git a/gcc/config/arm/arm-isa.h b/gcc/config/arm/arm-isa.h
index 6050bca95587f68a3671dd2144cf845b83da3692..24ec398b346f8effb346235d6f3ab20eb6f70e0f 100644
--- a/gcc/config/arm/arm-isa.h
+++ b/gcc/config/arm/arm-isa.h
@@ -125,6 +125,7 @@ enum isa_feature
 #define ISA_ARMv8_2a	ISA_ARMv8_1a, isa_bit_ARMv8_2
 #define ISA_ARMv8m_base ISA_ARMv6m, isa_bit_ARMv8, isa_bit_cmse, isa_bit_tdiv
 #define ISA_ARMv8m_main ISA_ARMv7m, isa_bit_ARMv8, isa_bit_cmse
+#define ISA_ARMv8r	ISA_ARMv8a
 
 /* List of all FPU bits to strip out if -mfpu is used to override the
default.  isa_bit_fp16 is deliberately missing from this list.  */
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index cbcd85d9906d1fc797ab33b3d61969f32b9cc566..7bab5de5a39e9192c97851929b83175648158cdf 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -461,10 +461,16 @@ EnumValue
 Enum(arm_arch) String(armv8-m.main+dsp) Value(33)
 
 EnumValue
-Enum(arm_arch) String(iwmmxt) Value(34)
+Enum(arm_arch) String(armv8-r) Value(34)
 
 EnumValue
-Enum(arm_arch) String(iwmmxt2) Value(35)
+Enum(arm_arch) String(armv8-r+crc) Value(35)
+
+EnumValue
+Enum(arm_arch) String(iwmmxt) Value(36)
+
+EnumValue
+Enum(arm_arch) String(iwmmxt2) Value(37)
 
 Enum
 Name(arm_fpu) Type(enum fpu_type)
diff --git a/gcc/config/arm/arm.h b/gcc/config/ar

[PATCH 3/3, GCC/ARM] Add support for ARM Cortex-R52 processor

2017-06-29 Thread Thomas Preudhomme

Hi,

This patch adds support for the recently announced ARM Cortex-R52
processor [1].

[1] https://developer.arm.com/products/processors/cortex-r/cortex-r52

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  

* config/arm/arm-cpus.in (cortex-r52): Add new entry.
* config/arm/arm-cpu.h: Regenerate.
* config/arm/arm-cpu-cdata.h: Regenerate.
* config/arm/arm-cpu-data.h: Regenerate.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add entry for ARM Cortex-R52.
* config/arm/driver-arm.c (arm_cpu_table): Add entry for ARM
Cortex-R52.
* doc/invoke.texi: Mention -mtune=cortex-r52.

Tested by building an arm-none-eabi GCC cross-compiler targeting Cortex-R52.

Is this ok for stage1?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-cpu-cdata.h b/gcc/config/arm/arm-cpu-cdata.h
index 0a122f5febaaceeeb5a405cb5a64e1edd9b044f3..043b5b2db09146b5686a5fe602f907164f9d84c5 100644
--- a/gcc/config/arm/arm-cpu-cdata.h
+++ b/gcc/config/arm/arm-cpu-cdata.h
@@ -803,6 +803,13 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
 },
   },
   {
+"cortex-r52",
+{
+  ISA_ARMv8r,isa_bit_crc32,
+  isa_nobit
+},
+  },
+  {
 "armv2",
 {
   ISA_ARMv2,isa_bit_mode26,
diff --git a/gcc/config/arm/arm-cpu-data.h b/gcc/config/arm/arm-cpu-data.h
index 48c1d88032c1c5dc7c6cba71511f79fe9f2533ea..0677132382fad2f1baf1fbdf5c0b03fe32f752e2 100644
--- a/gcc/config/arm/arm-cpu-data.h
+++ b/gcc/config/arm/arm-cpu-data.h
@@ -1132,6 +1132,16 @@ static const struct processors all_cores[] =
 },
 &arm_v7m_tune
   },
+  {
+"cortex-r52", TARGET_CPU_cortexr52,
+(TF_LDSCHED),
+"8R", BASE_ARCH_8R,
+{
+  ISA_ARMv8r,isa_bit_crc32,
+  isa_nobit
+},
+&arm_cortex_tune
+  },
   {NULL, TARGET_CPU_arm_none, 0, NULL, BASE_ARCH_0, {isa_nobit}, NULL}
 };
 
diff --git a/gcc/config/arm/arm-cpu.h b/gcc/config/arm/arm-cpu.h
index cd282db02f56f4416ff82eb3d8d569cd99fb0d41..4d6ea61d07dc98540f0f75679d8ef6f7eafc10bb 100644
--- a/gcc/config/arm/arm-cpu.h
+++ b/gcc/config/arm/arm-cpu.h
@@ -132,6 +132,7 @@ enum processor_type
   TARGET_CPU_cortexa73cortexa53,
   TARGET_CPU_cortexm23,
   TARGET_CPU_cortexm33,
+  TARGET_CPU_cortexr52,
   TARGET_CPU_arm_none
 };
 
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index be1f0ca4e38ae76683b77d8c3b79a066e62325d7..139aa561d3f918655978e44b5bcb6c0b50747a08 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1104,6 +1104,16 @@ begin cpu cortex-m33
  costs v7m
 end cpu cortex-m33
 
+
+# V8 R-profile implementations.
+begin cpu cortex-r52
+ cname cortexr52
+ tune flags LDSCHED
+ architecture armv8-r+crc
+ costs cortex
+end cpu cortex-r52
+
+
 # FPU entries
 # format:
 # begin fpu 
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 7bab5de5a39e9192c97851929b83175648158cdf..ccd1a7661fb97938ddea7670eebe1a0f48efb929 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -354,6 +354,9 @@ Enum(processor_type) String(cortex-m23) Value( TARGET_CPU_cortexm23)
 EnumValue
 Enum(processor_type) String(cortex-m33) Value( TARGET_CPU_cortexm33)
 
+EnumValue
+Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)
+
 Enum
 Name(arm_arch) Type(int)
 Known ARM architectures (for use with the -march= option):
diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
index c394ac805c7577113ed72b31a06ff93dc7f5f490..c3dca1cd4833afd67e56a276ef0e9c1e17f4fae4 100644
--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -100,7 +100,7 @@
|march=armv8-m.main	\
|march=armv8-m.main+dsp|mcpu=cortex-m33		\
|march-armv8-r	\
-   |march-armv8-r+crc	\
+   |march-armv8-r+crc|mcpu=cortex-r52			\
:%{!r:--be8}}}"
 #else
 #define BE8_LINK_SPEC \
@@ -142,7 +142,7 @@
|march=armv8-m.main	\
|march=armv8-m.main+dsp|mcpu=cortex-m33		\
|march=armv8-r	\
-   |march=armv8-r+crc	\
+   |march=armv8-r+crc|mcpu=cortex-r52			\
:%{!r:--be8}}}"
 #endif
 
diff --git a/gcc/config/arm/driver-arm.c b/gcc/config/arm/driver-arm.c
index 29873d57a1e45fd989f6ff01dd4a2ae7320d93bb..00f8128e6911a79f83da03bf731c1cc9127c7285 100644
--- a/gcc/config/arm/driver-arm.c
+++ b/gcc/config/arm/driver-arm.c
@@ -56,6 +56,7 @@ static struct vendor_cpu arm_cpu_table[] = {
 {"0xc15", "armv7-r", "cortex-r5"},
 {"0xc17", "armv7-r", "cortex-r7"},
 {"0xc18", "armv7-r", "cortex-r8"},
+{"0xd13", "armv8-r+crc", "cortex-r52"},
 {"0xc20", "armv6-m", "cortex-m0"},
 {"0xc21", "armv6-m", "cortex-m1"},
 {"0xc23", "armv7-m", "cortex-m3"},
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9ea580626749dc9d27bb72d56bbbef6a474a5055..a871837426485dd6a87c541386964bf85dfafde7 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15212,6 +15212,7 @@ Permissible names are: @samp{arm2}, @samp{arm250},
 @sa

Re: [PATCH, GCC/ARM, 0/3] Add support for ARMv8-R

2017-06-29 Thread Thomas Preudhomme

On 29/06/17 15:34, Christophe Lyon wrote:

On 29 June 2017 at 15:52, Thomas Preudhomme wrote:

Hi,

This patch series adds support for the ARMv8-R architecture[1] and ARM
Cortex-R52[2] to GCC. The patch series consists of the following patches:


Hi Thomas,

I think you need to rebase your patch because Richard's recent series
changed the contents
of arm-cpu-data.h and arm-cpu-cdata.h.


Err yes indeed. Thanks!



Why do you link armv8-r architecture definition to cortex-r4?


I understand, where did I do such a thing?

Best regards,

Thomas


Re: [PATCH 2/3, GCC/ARM] Add support for ARMv8-R architecture

2017-06-29 Thread Thomas Preudhomme

Please ignore this patch. I'll respin the patch on a more recent GCC.

Best regards,

Thomas


Re: [PATCH 3/3, GCC/ARM] Add support for ARM Cortex-R52 processor

2017-06-29 Thread Thomas Preudhomme

Please ignore this patch. I'll respin the patch on a more recent GCC.

Best regards,

Thomas


Re: [PATCH, GCC/ARM, 0/3] Add support for ARMv8-R

2017-06-29 Thread Thomas Preudhomme

On 29/06/17 16:12, Christophe Lyon wrote:

On 29 June 2017 at 16:37, Thomas Preudhomme




Why do you link armv8-r architecture definition to cortex-r4?



I understand, where did I do such a thing?



In patch #2 you have:
diff --git a/gcc/config/arm/arm-cpu-data.h b/gcc/config/arm/arm-cpu-data.h
index d6200f9bdc09a9d0c973853b0152a2800eaf2fe5..48c1d88032c1c5dc7c6cba71511f79fe9f2533ea 100644
--- a/gcc/config/arm/arm-cpu-data.h
+++ b/gcc/config/arm/arm-cpu-data.h
@@ -1478,6 +1478,26 @@ static const struct processors all_architectures[] =
  NULL
},
{
+"armv8-r", TARGET_CPU_cortexr4,
+(TF_CO_PROC),
+"8R", BASE_ARCH_8R,
+{
+  ISA_ARMv8r,
+  isa_nobit
+},
+NULL
+  },
+  {
+"armv8-r+crc", TARGET_CPU_cortexr4,
+(TF_CO_PROC),
+"8R", BASE_ARCH_8R,
+{
+  ISA_ARMv8r,isa_bit_crc32,
+  isa_nobit
+},
+NULL
+  },
+  {
  "iwmmxt", TARGET_CPU_iwmmxt,
  (TF_LDSCHED | TF_STRONG | TF_XSCALE),
  "5TE", BASE_ARCH_5TE,

Both entries point to TARGET_CPU_cortexr4. I guess that's because r52
is only defined in patch #3, but then why not update this in patch #3
and replace r4 with r52?

Not sure I'm very clear :-)


You are. I must have forgotten about that setting when working on patch #3. I'll 
update this. Thanks for your vigilance :-)


Best regards,

Thomas


Re: [PATCH, GCC/ARM, gcc-5-branch, ping2] Fix gcc.target/arm/fpscr.c

2017-06-30 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On 28/06/17 12:35, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 26/06/17 12:32, Thomas Preudhomme wrote:

Hi,

As raised by Christophe Lyon, fpscr.c FAILs because arm_fp_ok and arm_fp
are not defined in GCC 5. This commit changes the test to use the same
recipe as gcc.target/arm/cmp-2.c.

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2017-06-26  Thomas Preud'homme  

 * gcc.target/arm/fpscr.c: Require arm_vfp_ok instead of arm_fp_ok and
 add -mfpu=vfp -mfloat-abi=softfp instead of fp_ok options.


Ok for GCC 5?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/fpscr.c b/gcc/testsuite/gcc.target/arm/fpscr.c
index 7b4d71d72d8964f6da0d0604bf59aeb4a895df43..cafba4e8d67545bd210477230b9682fe86620e23 100644
--- a/gcc/testsuite/gcc.target/arm/fpscr.c
+++ b/gcc/testsuite/gcc.target/arm/fpscr.c
@@ -1,9 +1,9 @@
 /* Test the fpscr builtins.  */
 
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-require-effective-target arm_vfp_ok } */
 /* { dg-skip-if "need fp instructions" { *-*-* } { "-mfloat-abi=soft" } { "" } } */
-/* { dg-add-options arm_fp } */
+/* { dg-options "-mfpu=vfp -mfloat-abi=softfp" } */
 
 void
 test_fpscr ()


Fix ChangeLog format in r247584

2017-07-04 Thread Thomas Preudhomme

Hi,

This patch fixes relative pathnames in gcc/ChangeLog for r247584. Committed as 
obvious to trunk, GCC 5, 6 and 7.


Best regards,

Thomas
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f9e00198bbfd352960685b5c72193570e232e68a..39bdcb12ebbad3cdbdce6b9d4dd87c28610e37fe 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -5826,7 +5826,7 @@
 
 2017-05-04  Prakhar Bahuguna  
 
-	* gcc/config/arm/arm-builtins.c (arm_init_builtins): Rename
+	* config/arm/arm-builtins.c (arm_init_builtins): Rename
 	__builtin_arm_ldfscr to __builtin_arm_get_fpscr, and rename
 	__builtin_arm_stfscr to __builtin_arm_set_fpscr.
 


Re: [PATCH 1/3, GCC/ARM, ping] Add MIDR info for ARM Cortex-R7 and Cortex-R8

2017-07-04 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas


[PATCH, GCC/ARM] Fix cmse_nonsecure_entry return insn size

2017-11-08 Thread Thomas Preudhomme

Hi,

A number of instructions are output in assembler form by
output_return_instruction () when compiling a function with the
cmse_nonsecure_entry attribute for Armv8-M Mainline with the hard float
ABI. However, the corresponding thumb2_cmse_entry_return insn pattern
does not account for all these instructions when computing the length
of the instruction.

This may lead GCC to use the wrong branching instruction due to
incorrect computation of the offset between the branch instruction's
address and the target address.
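
The effect can be illustrated with a small arithmetic sketch. The 12 and 34 byte lengths come from the pattern change in this patch; the 236 bytes of intervening code and the 254 byte reach of the short branch are assumed values chosen for the example, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed forward reach, in bytes, of a short branch encoding.  */
enum { SHORT_BRANCH_REACH = 254 };

/* Sum the lengths of the N insns between a branch and its target.  */
static long
branch_distance (const int *lengths, size_t n)
{
  long total = 0;

  for (size_t i = 0; i < n; i++)
    total += lengths[i];
  return total;
}

/* Return true if a short branch can span the given insns.  */
static bool
short_branch_ok (const int *lengths, size_t n)
{
  return branch_distance (lengths, n) <= SHORT_BRANCH_REACH;
}
```

With 236 bytes of ordinary code followed by the return pattern, the claimed length of 12 gives a distance of 248 bytes, so the short branch looks safe. The 34 bytes actually emitted give 270 bytes, which is out of reach, so a branch chosen from the claimed lengths would be wrong.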

This commit fixes the mismatch between what output_return_instruction ()
does and what the pattern thinks it does, and adds a note warning about
this mismatch to the affected functions' heading comments so that the
code does not get out of sync again.

Note: no test is provided because the C testcase is fragile (only works
on GCC 6) and the extracted RTL test fails to compile due to bugs in the
RTL frontend (PR82815 and PR82817).

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-30  Thomas Preud'homme  

* config/arm/arm.c (output_return_instruction): Add comments to
indicate requirement for cmse_nonsecure_entry return to account
for the size of clearing instruction output here.
(thumb_exit): Likewise.
* config/arm/thumb2.md (thumb2_cmse_entry_return): Fix length for
return in hardfloat mode.

Testing: Bootstrapped on arm-linux-gnueabihf and testsuite shows no
regression.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 033ec255a577f782201527f57f45802bc0eb45e0..9919f54242d9317125a104f9777d76a85de80e9b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19417,7 +19417,12 @@ arm_get_vfp_saved_size (void)
 
 /* Generate a function exit sequence.  If REALLY_RETURN is false, then do
everything bar the final return instruction.  If simple_return is true,
-   then do not output epilogue, because it has already been emitted in RTL.  */
+   then do not output epilogue, because it has already been emitted in RTL.
+
+   Note: do not forget to update length attribute of corresponding insn pattern
+   when changing assembly output (eg. length attribute of
+   thumb2_cmse_entry_return when updating Armv8-M Mainline Security Extensions
+   register clearing sequences).  */
 const char *
 output_return_instruction (rtx operand, bool really_return, bool reverse,
bool simple_return)
@@ -23950,7 +23955,12 @@ thumb_pop (FILE *f, unsigned long mask)
 
 /* Generate code to return from a thumb function.
If 'reg_containing_return_addr' is -1, then the return address is
-   actually on the stack, at the stack pointer.  */
+   actually on the stack, at the stack pointer.
+
+   Note: do not forget to update length attribute of corresponding insn pattern
+   when changing assembly output (eg. length attribute of epilogue_insns when
+   updating Armv8-M Baseline Security Extensions register clearing
+   sequences).  */
 static void
 thumb_exit (FILE *f, int reg_containing_return_addr)
 {
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index b78c3d256aeafc2eeb3dcdc2b9b07b1af9df5294..776d611d2538e790a5f504995050ffdfc51d7193 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1132,7 +1132,7 @@
; we adapt the length accordingly.
(set (attr "length")
  (if_then_else (match_test "TARGET_HARD_FLOAT")
-  (const_int 12)
+  (const_int 34)
   (const_int 8)))
; We do not support predicate execution of returns from cmse_nonsecure_entry
; functions because we need to clear the APSR.  Since predicable has to be


[PATCH, GCC/testsuite] Fix retrieval of testname

2017-11-09 Thread Thomas Preudhomme

When gcc-dg-runtest is used to run a test, the test is run several times
with different options. For clarity of the log, the test infrastructure
then appends the options to the testname. This means that all the code
that must deal with the testcase itself (eg. removing the output files
after the test has run) needs to strip the options from the testname.

There is already a pattern (see below) for this in several places in the
testsuite framework but it is also missing in many places. This patch
fixes all of these places. The pattern is as follows:

set testcase [testname-for-summary]
# The name might include a list of options; extract the file name.
set testcase [lindex $testcase 0]
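
For readers less familiar with Tcl, the following C sketch approximates what the pattern and the subsequent file name handling compute. It is only a rough transliteration under simplifying assumptions: the function name is hypothetical, and splitting on the first blank ignores Tcl's list quoting rules.

```c
#include <stdio.h>
#include <string.h>

/* Write "<root>.s" for the file named by the first word of TESTNAME into
   OUT.  Return 0 on success, or -1 if the result does not fit.  */
static int
asm_output_name (const char *testname, char *out, size_t outsz)
{
  /* First blank-separated word: rough analogue of [lindex $testcase 0].  */
  size_t len = strcspn (testname, " \t");

  /* Drop leading directories: analogue of [file tail].  */
  const char *start = testname;
  for (size_t i = 0; i < len; i++)
    if (testname[i] == '/')
      start = testname + i + 1;
  len -= (size_t) (start - testname);

  /* Drop the last extension: analogue of [file rootname].  */
  size_t root = len;
  for (size_t i = len; i > 0; i--)
    if (start[i - 1] == '.')
      {
	root = i - 1;
	break;
      }

  int n = snprintf (out, outsz, "%.*s.s", (int) root, start);
  return (n < 0 || (size_t) n >= outsz) ? -1 : 0;
}
```

Given "gcc.target/arm/fpscr.c -O2 -g", the sketch yields "fpscr.s". If the options were not stripped first, the extension search would run over the whole string and produce a name that does not match the file the compiler wrote.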

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-08  Thomas Preud'homme  

* lib/scanasm.exp (scan-assembler): Extract filename from testname used
in summary.
(scan-assembler-not): Likewise.
(scan-hidden): Likewise.
(scan-not-hidden): Likewise.
(scan-stack-usage): Likewise.
(scan-stack-usage-not): Likewise.
(scan-assembler-times): Likewise.
(scan-assembler-dem): Likewise.
(scan-assembler-dem-not): Likewise.
(object-size): Likewise.
(scan-lto-assembler): Likewise.
* lib/scandump.exp (scan-dump): Likewise.
(scan-dump-times): Likewise.
(scan-dump-not): Likewise.
(scan-dump-dem): Likewise.
(scan-dump-dem-not): Likewise.

Testing: Ran the testsuite on bootstrapped aarch64-linux-gnu and
x86_64-linux-gnu compilers built with C, Fortran and Ada support, without
any regression.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index a66bb28253196410554405facefa8641d1020c1d..33286152f30df959a4bffa81634d0bfe7b898e8f 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -78,7 +78,9 @@ proc dg-scan { name positive testcase output_file orig_args } {
 
 proc scan-assembler { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].s"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].s"
 dg-scan "scan-assembler" 1 $testcase $output_file $args
 }
 
@@ -89,7 +91,9 @@ force_conventional_output_for scan-assembler
 
 proc scan-assembler-not { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].s"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].s"
 
 dg-scan "scan-assembler-not" 0 $testcase $output_file $args
 }
@@ -117,7 +121,9 @@ proc hidden-scan-for { symbol } {
 
 proc scan-hidden { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].s"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].s"
 
 set symbol [lindex $args 0]
 
@@ -133,7 +139,9 @@ proc scan-hidden { args } {
 
 proc scan-not-hidden { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].s"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].s"
 
 set symbol [lindex $args 0]
 set hidden_scan [hidden-scan-for $symbol]
@@ -163,7 +171,9 @@ proc scan-file-not { output_file args } {
 
 proc scan-stack-usage { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].su"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].su"
 
 dg-scan "scan-file" 1 $testcase $output_file $args
 }
@@ -173,7 +183,9 @@ proc scan-stack-usage { args } {
 
 proc scan-stack-usage-not { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].su"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].su"
 
 dg-scan "scan-file-not" 0 $testcase $output_file $args
 }
@@ -230,12 +242,14 @@ proc scan-assembler-times { args } {
 }
 
 set testcase [testname-for-summary]
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
 set pattern [lindex $args 0]
 set times [lindex $args 1]
 set pp_pattern [make_pattern_printable $pattern]
 
 # This must match the rule in gcc-dg.exp.
-set output_file "[file rootname [file tail $testcase]].s"
+set output_file "[fi

[PATCH, GCC/testsuite/ARM] Consolidate sources for cmse tests

2017-11-10 Thread Thomas Preudhomme

For the most part, testcases under gcc.target/arm/cmse/baseline and
gcc.target/arm/cmse/mainline are duplicate copies that differ only in
their dejagnu directives. Although there is no requirement for them to be
similar, keeping them identical allows the generated code to be compared
and makes it easier to update the testcases when code generation changes
for both architectures (if one needs updating, so does the other).

Similarly, the tests for the various float ABIs under
gcc.target/arm/cmse/mainline/ have the same source but are duplicate
copies.

This patch moves all the code in the tests to a parent directory:
gcc.target/arm/cmse for tests shared by Armv8-M Baseline and Mainline,
and gcc.target/arm/cmse/mainline for tests shared *only* by the various
float ABIs of Armv8-M Mainline. C includes are then used where the code
used to sit.

Note that the cmse-13.c test used to differ slightly between the
architectures and float ABIs tested, in the first floating-point constant
passed to bar: sometimes 1.0 and sometimes 3.0. This patch settles on
3.0 to avoid confusion with the 1.0 constant used to clear VFP registers
in some of the configurations.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-03  Thomas Preud'homme  

* gcc.target/arm/cmse/bitfield-4.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-4.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-4.c: Likewise.
* gcc.target/arm/cmse/bitfield-5.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-5.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-5.c: Likewise.
* gcc.target/arm/cmse/bitfield-6.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-6.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-6.c: Likewise.
* gcc.target/arm/cmse/bitfield-7.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-7.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-7.c: Likewise.
* gcc.target/arm/cmse/bitfield-8.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-8.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-8.c: Likewise.
* gcc.target/arm/cmse/bitfield-9.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-9.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-9.c: Likewise.
* gcc.target/arm/cmse/bitfield-and-union.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: Rename into ...
* gcc.target/arm/cmse/baseline/bitfield-and-union.c: This.  Remove code
and include above bitfield-and-union.x file.
* gcc.target/arm/cmse/mainline/bitfield-and-union-1.c: Rename into ...
* gcc.target/arm/cmse/mainline/bitfield-and-union.c: this.  Remove code
and include above bitfield-and-union.x file.
* gcc.target/arm/cmse/cmse-13.x: New file.
* gcc.target/arm/cmse/baseline/cmse-13.c: Remove code and include above
file.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/cmse-5.x: New file.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Remove code and
include above file.
	* gcc.target/arm/cmse/mainline/har

Re: [PATCH, GCC/testsuite/ARM] Consolidate sources for cmse tests

2017-11-10 Thread Thomas Preudhomme
ewise.
* gcc.target/arm/cmse/union-2.x: New file.
* gcc.target/arm/cmse/baseline/union-2.c: Remove code and include above
file.
* gcc.target/arm/cmse/mainline/union-2.c: Likewise.

Testing: Running cmse.exp for both Armv8-M Baseline and Mainline
shows no regression.

Is this ok for trunk?

Best regards,

Thomas


[PATCH, GCC/ARM] Fix ICE in Armv8-M Security Extensions code

2017-11-15 Thread Thomas Preudhomme

Hi,

Commit r253825, which introduced some sanity checks for sbitmap, revealed
a bug in the conversion of cmse_nonsecure_entry_clear_before_return ()
to using the bitmap structure. bitmap_and expects the two bitmaps to have
the same length, yet the code in
cmse_nonsecure_entry_clear_before_return () uses different sizes for
to_clear_bitmap and to_clear_arg_regs_bitmap, on the assumption that
bitmap_and would behave as if the bits not allocated were in fact zero.
This commit makes sure both bitmaps are equally sized.
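
To illustrate the failure mode outside of GCC, here is a minimal
standalone sketch in C. It does not use GCC's sbitmap: toy_bitmap,
toy_and and toy_demo_and are hypothetical names made up for this note,
and the sizes a caller passes are arbitrary. The only point is that a
word-by-word AND is well defined when both operands hold the same number
of bits, which is the property the sanity check enforces.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy fixed-size bit set.  NOT GCC's sbitmap, only an illustration.  */
struct toy_bitmap
{
  unsigned n_bits;   /* Number of valid bits.  */
  uint32_t *words;   /* Storage, (n_bits + 31) / 32 words.  */
};

unsigned
toy_words (unsigned n_bits)
{
  return (n_bits + 31) / 32;
}

struct toy_bitmap
toy_alloc (unsigned n_bits)
{
  struct toy_bitmap b;

  b.n_bits = n_bits;
  b.words = calloc (toy_words (n_bits), sizeof (uint32_t));
  assert (b.words != NULL);
  return b;
}

void
toy_set_bit (struct toy_bitmap *b, unsigned bit)
{
  assert (bit < b->n_bits);
  b->words[bit / 32] |= UINT32_C (1) << (bit % 32);
}

/* DST = A & B.  Returns 0 on success and -1 when the sizes differ, which
   is where a checking build would assert instead.  */
int
toy_and (struct toy_bitmap *dst, const struct toy_bitmap *a,
	 const struct toy_bitmap *b)
{
  unsigned i;

  if (dst->n_bits != a->n_bits || a->n_bits != b->n_bits)
    return -1;
  for (i = 0; i < toy_words (dst->n_bits); i++)
    dst->words[i] = a->words[i] & b->words[i];
  return 0;
}

/* AND two bitmaps of the given (hypothetical) sizes in place and report
   whether the operation was accepted.  */
int
toy_demo_and (unsigned size_a, unsigned size_b)
{
  struct toy_bitmap a = toy_alloc (size_a);
  struct toy_bitmap b = toy_alloc (size_b);
  int status;

  toy_set_bit (&a, 0);
  toy_set_bit (&b, 0);
  status = toy_and (&a, &a, &b);
  free (a.words);
  free (b.words);
  return status;
}
```

With this sketch, toy_demo_and (4, 48) is rejected while
toy_demo_and (48, 48) is accepted, which mirrors sizing
to_clear_arg_regs_bitmap from to_clear_bitmap rather than from the
number of argument registers.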

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-11-13  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_entry_clear_before_return): Allocate
to_clear_arg_regs_bitmap to the same size as to_clear_bitmap.

Testing: Bootstrapped GCC on arm-none-linux-gnueabihf target and
testsuite shows no regression. Running cmse.exp tests for Armv8-M
Baseline and Mainline shows FAIL->PASS for bitfield-1, bitfield-2,
bitfield-3 and struct-1 testcases.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index db99303f3fb7a2196f48358e74fa4d98f31f045e..106e3edce0d6f2518eb391c436c5213a78d1275b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25205,7 +25205,8 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (padding_bits_to_clear != 0)
 {
   rtx reg_rtx;
-  auto_sbitmap to_clear_arg_regs_bitmap (R0_REGNUM + NUM_ARG_REGS);
+  int to_clear_bitmap_size = SBITMAP_SIZE ((sbitmap) to_clear_bitmap);
+  auto_sbitmap to_clear_arg_regs_bitmap (to_clear_bitmap_size);
 
   /* Padding bits to clear is not 0 so we know we are dealing with
 	 returning a composite type, which only uses r0.  Let's make sure that


[PATCH, GCC/testsuite/ARM] Fix selection of effective target for cmse tests

2017-11-15 Thread Thomas Preudhomme

Hi,

Some of the tests in the gcc.target/arm/cmse directory (e.g.
gcc.target/arm/cmse/mainline/bitfield-4.c) fail when run without
an architecture specified in RUNTESTFLAGS because they do not add the
option to select an Armv8-M architecture.

This patch fixes the issue by adding the right option from the exp file
so that no architecture fiddling is necessary in the individual tests.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-03  Thomas Preud'homme  

* gcc.target/arm/cmse/cmse.exp: Add option to select Armv8-M Baseline
or Armv8-M Mainline when running the respective tests.
* gcc.target/arm/cmse/baseline/cmse-11.c: Remove architecture check and
selection.
* gcc.target/arm/cmse/baseline/cmse-13.c: Likewise.
* gcc.target/arm/cmse/baseline/cmse-2.c: Likewise.
* gcc.target/arm/cmse/baseline/cmse-6.c: Likewise.
* gcc.target/arm/cmse/baseline/softfp.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-8.c: Likewise.

Testing: Running cmse.exp for both Armv8-M Baseline and Mainline shows
no regression. Running it for a toolchain defaulting to Armv8-M Baseline
but with RUNTESTFLAGS unset sees some FAIL->PASS.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
index 795544fe11d9d7f24086be16916a5bfee89d7b44..230b255963f56a6c29b91d2501b43fed6eda2476 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 int __attribute__ ((cmse_nonsecure_call)) (*bar) (int);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
index 7208a2cedd2f4f8296b2801d6f5e5d7838b26551..7ab3219e860e993e2eca3bbee2e885f59b7b3cb4 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" } */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 #include "../cmse-13.x"
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
index fec7dc10484b14db5796f5f431a9306c3b2e307c..d5115ecf2bdb3e87dc6a92244cb204e753f25b07 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 extern float bar (void);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
index 43d45e7a63e56edfebc203c8f0e516dc13fbbd65..cae4f343621d1a19a8893ea4950d33e5e1842fb5 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 int __attribute__ ((cmse_nonsecure_call)) (*bar) (double);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c
index ca76e12cd9287fd12b7eb7add638973f5d314939..3d383ff6ee17677120e3e1e81726785c30f3b25c 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c
+++ b/gcc/testsuite/gcc.target/arm/cm

[PATCH, GCC/testsuite/ARM] Rework expectation for call to Armv8-M nonsecure function

2017-11-15 Thread Thomas Preudhomme

Hi,

Testcase gcc.target/arm/cmse/cmse-14.c checks whether bar is called via
the __gnu_cmse_nonsecure_call libcall and not via a direct call. However,
the pattern is a bit surprising in that it needs to explicitly allow "by"
due to allowing anything before the 'b'.

This patch rewrites the logic to look for b as the first non-whitespace
letter, followed by anything (to match bl and conditional branches),
followed by some spaces and then bar.
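
For readers who want to try the new expression on a few assembly lines
without running the testsuite, here is a rough sketch in C using POSIX
regcomp/regexec. It is an approximation only: the testsuite uses Tcl
regular expressions, so \s has been transliterated to [[:space:]] here,
and toy_line_matches and toy_branch_to_bar are hypothetical names
invented for this note.

```c
#include <regex.h>
#include <stddef.h>

/* POSIX ERE transliteration of the new scan-assembler-not pattern; the
   Tcl original uses \s where this uses [[:space:]].  */
const char toy_branch_to_bar[]
  = "^(.*[[:space:]])?bl?[^[:space:]]*[[:space:]]+bar";

/* Return 1 if LINE matches PATTERN, 0 if it does not, and -1 if PATTERN
   fails to compile.  */
int
toy_line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB | REG_NEWLINE) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

Under these assumptions, lines such as "\tbl\tbar" and "\tbne\tbar"
match (so scan-assembler-not would fail on them), whereas
"\tbl\t__gnu_cmse_nonsecure_call" does not.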

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-01  Thomas Preud'homme  

* gcc.target/arm/cmse/cmse-14.c: Change logic to match branch
instruction to bar.

Testing: Test still passes for both Armv8-M Baseline and Mainline.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
index 701e9ee7e318a07278099548f9b7042a1fde1204..df1ea52bec533c36a738d7d3b2b2ff749b0f3713 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
@@ -10,4 +10,4 @@ int foo (void)
 }
 
 /* { dg-final { scan-assembler "bl\t__gnu_cmse_nonsecure_call" } } */
-/* { dg-final { scan-assembler-not "b\[^ y\n\]*\\s+bar" } } */
+/* { dg-final { scan-assembler-not "^(.*\\s)?bl?\[^\\s]*\\s+bar" } } */


[PATCH, GCC/ARM] Use bitmap to control cmse_nonsecure_call register clearing

2017-11-15 Thread Thomas Preudhomme

Hi,

As part of r253256, cmse_nonsecure_entry_clear_before_return was
rewritten to use auto_sbitmap instead of an integer bitfield to control
which registers need to be cleared. This commit continues this work in
cmse_nonsecure_call_clear_caller_saved.
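
As a rough illustration of the conversion, the sketch below builds the
same register set twice in plain C: once as a 64-bit mask, as the old
code did, and once as an explicit set with range and single-bit
operations, in the spirit of the bitmap calls used by the patch. It is
not GCC code. The toy_* names are hypothetical, the register numbers
(4 argument registers, VFP registers numbered 16 to 31) are assumptions
made for the example, and one byte per register is used purely for
readability.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed register numbering, for illustration only.  */
enum
{
  TOY_R0 = 0,
  TOY_NUM_ARG_REGS = 4,
  TOY_FIRST_VFP = 16,
  TOY_D7_VFP = 31,
  TOY_NREGS = TOY_D7_VFP + 1
};

/* One flag per register; stands in for a real bitmap type.  */
struct toy_regset
{
  unsigned char bit[TOY_NREGS];
};

void
toy_regset_set_range (struct toy_regset *s, unsigned start, unsigned count)
{
  unsigned i;

  assert (start + count <= TOY_NREGS);
  for (i = 0; i < count; i++)
    s->bit[start + i] = 1;
}

void
toy_regset_clear_bit (struct toy_regset *s, unsigned regno)
{
  assert (regno < TOY_NREGS);
  s->bit[regno] = 0;
}

/* Old style: registers to clear as an integer mask.  */
uint64_t
toy_mask_to_clear (unsigned address_regno)
{
  uint64_t mask = (UINT64_C (1) << TOY_NUM_ARG_REGS) - 1;
  uint64_t float_mask = (UINT64_C (1) << (TOY_D7_VFP + 1)) - 1;

  float_mask &= ~((UINT64_C (1) << TOY_FIRST_VFP) - 1);
  mask |= float_mask;
  mask &= ~(UINT64_C (1) << address_regno);
  return mask;
}

/* New style: the same set built with range and bit operations.  */
void
toy_regset_to_clear (struct toy_regset *s, unsigned address_regno)
{
  memset (s->bit, 0, sizeof s->bit);
  toy_regset_set_range (s, TOY_R0, TOY_NUM_ARG_REGS);
  toy_regset_set_range (s, TOY_FIRST_VFP, TOY_D7_VFP - TOY_FIRST_VFP + 1);
  toy_regset_clear_bit (s, address_regno);
}

/* Return 1 if both representations describe the same registers.  */
int
toy_forms_agree (unsigned address_regno)
{
  struct toy_regset s;
  uint64_t mask = toy_mask_to_clear (address_regno);
  unsigned regno;

  toy_regset_to_clear (&s, address_regno);
  for (regno = 0; regno < TOY_NREGS; regno++)
    if (s.bit[regno] != ((mask >> regno) & 1))
      return 0;
  return 1;
}
```

The set form does not depend on the register count fitting in one
integer, which is the main attraction of moving away from the mask.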

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-16  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Use
auto_sbitmap instead of integer bitfield to control which registers need
clearing.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9919f54242d9317125a104f9777d76a85de80e9b..7384b96fea0179334a6010b099df68c8e2a0fc32 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16990,10 +16990,11 @@ cmse_nonsecure_call_clear_caller_saved (void)
 
   FOR_BB_INSNS (bb, insn)
 	{
-	  uint64_t to_clear_mask, float_mask;
+	  unsigned address_regnum, regno, maxregno =
+	TARGET_HARD_FLOAT_ABI ? D7_VFP_REGNUM : NUM_ARG_REGS - 1;
+	  auto_sbitmap to_clear_bitmap (maxregno + 1);
 	  rtx_insn *seq;
 	  rtx pat, call, unspec, reg, cleared_reg, tmp;
-	  unsigned int regno, maxregno;
 	  rtx address;
 	  CUMULATIVE_ARGS args_so_far_v;
 	  cumulative_args_t args_so_far;
@@ -17024,18 +17025,21 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	continue;
 
 	  /* Determine the caller-saved registers we need to clear.  */
-	  to_clear_mask = (1LL << (NUM_ARG_REGS)) - 1;
-	  maxregno = NUM_ARG_REGS - 1;
+	  bitmap_clear (to_clear_bitmap);
+	  bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS);
+
 	  /* Only look at the caller-saved floating point registers in case of
 	 -mfloat-abi=hard.  For -mfloat-abi=softfp we will be using the
 	 lazy store and loads which clear both caller- and callee-saved
 	 registers.  */
 	  if (TARGET_HARD_FLOAT_ABI)
 	{
-	  float_mask = (1LL << (D7_VFP_REGNUM + 1)) - 1;
-	  float_mask &= ~((1LL << FIRST_VFP_REGNUM) - 1);
-	  to_clear_mask |= float_mask;
-	  maxregno = D7_VFP_REGNUM;
+	  auto_sbitmap float_bitmap (maxregno + 1);
+
+	  bitmap_clear (float_bitmap);
+	  bitmap_set_range (float_bitmap, FIRST_VFP_REGNUM,
+D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1);
+	  bitmap_ior (to_clear_bitmap, to_clear_bitmap, float_bitmap);
 	}
 
 	  /* Make sure the register used to hold the function address is not
@@ -17043,7 +17047,9 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  address = RTVEC_ELT (XVEC (unspec, 0), 0);
 	  gcc_assert (MEM_P (address));
 	  gcc_assert (REG_P (XEXP (address, 0)));
-	  to_clear_mask &= ~(1LL << REGNO (XEXP (address, 0)));
+	  address_regnum = REGNO (XEXP (address, 0));
+	  if (address_regnum < R0_REGNUM + NUM_ARG_REGS)
+	bitmap_clear_bit (to_clear_bitmap, address_regnum);
 
 	  /* Set basic block of call insn so that df rescan is performed on
 	 insns inserted here.  */
@@ -17064,6 +17070,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter)
 	{
 	  rtx arg_rtx;
+	  uint64_t to_clear_args_mask;
 	  machine_mode arg_mode = TYPE_MODE (arg_type);
 
 	  if (VOID_TYPE_P (arg_type))
@@ -17076,10 +17083,18 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type,
 	  true);
 	  gcc_assert (REG_P (arg_rtx));
-	  to_clear_mask
-		&= ~compute_not_to_clear_mask (arg_type, arg_rtx,
-	   REGNO (arg_rtx),
-	   padding_bits_to_clear_ptr);
+	  to_clear_args_mask
+		= compute_not_to_clear_mask (arg_type, arg_rtx,
+	 REGNO (arg_rtx),
+	 padding_bits_to_clear_ptr);
+	  if (to_clear_args_mask)
+		{
+		  for (regno = R0_REGNUM; regno <= maxregno; regno++)
+		{
+		  if (to_clear_args_mask & (1ULL << regno))
+			bitmap_clear_bit (to_clear_bitmap, regno);
+		}
+		}
 
 	  first_param = false;
 	}
@@ -17138,7 +17153,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	 call.  */
 	  for (regno = R0_REGNUM; regno <= maxregno; regno++)
 	{
-	  if (!(to_clear_mask & (1LL << regno)))
+	  if (!bitmap_bit_p (to_clear_bitmap, regno))
 		continue;
 
 	  /* If regno is an even vfp register and its successor is also to
@@ -17147,7 +17162,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 		{
 		  if (TARGET_VFP_DOUBLE
 		  && VFP_REGNO_OK_FOR_DOUBLE (regno)
-		  && to_clear_mask & (1LL << (regno + 1)))
+		  && bitmap_bit_p (to_clear_bitmap, (regno + 1)))
 		emit_move_insn (gen_rtx_REG (DFmode, regno++),
 CONST0_RTX (DFmode));
 		  else
@@ -17161,7 +17176,6 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  seq = get_insns ();
 	  end_sequence ();
 	  emit_insn_before (seq, insn);
-
 	}
 }
 }
@@ -25188,7 +25202,7 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (padding_bits_to_clear != 0)
 {
   rtx

[PATCH, GCC/ARM] Factor out CMSE register clearing code

2017-11-15 Thread Thomas Preudhomme

Hi,

Functions cmse_nonsecure_call_clear_caller_saved and
cmse_nonsecure_entry_clear_before_return both contain very similar code
to clear registers. What's worse, they differ slightly at times, so if a
bug is found in one, careful thought is needed to decide whether the
other function needs fixing too.

This commit addresses the situation by factoring the two pieces of code
into a new function. In doing so, the code generated to clear VFP
registers in cmse_nonsecure_call now uses the same sequence as
cmse_nonsecure_entry functions. Test expectations are thus updated
accordingly.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* config/arm/arm.c (cmse_clear_registers): New function.
(cmse_nonsecure_call_clear_caller_saved): Replace register clearing
code by call to cmse_clear_registers.
(cmse_nonsecure_entry_clear_before_return): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Adapt expectations
to vmov instructions now generated.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9b494e9529a4470c18192a4561e03d2f80e90797..22c9add0722974902b2a89b2b0a75759ff8ba37c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16991,6 +16991,128 @@ compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,
   return not_to_clear_mask;
 }
 
+/* Clear registers secret before doing a cmse_nonsecure_call or returning from
+   a cmse_nonsecure_entry function.  TO_CLEAR_BITMAP indicates which registers
+   are to be fully cleared, using the value in register CLEARING_REG if more
+   efficient.  The PADDING_BITS_LEN entries array PADDING_BITS_TO_CLEAR gives
+   the bits that needs to be cleared in caller-saved core registers, with
+   SCRATCH_REG used as a scratch register for that clearing.
+
+   NOTE: one of three following assertions must hold:
+   - SCRATCH_REG is a low register
+   - CLEARING_REG is in the set of registers fully cleared (ie. its bit is set
+ in TO_CLEAR_BITMAP)
+   - CLEARING_REG is a low register.  */
+
+static void
+cmse_clear_registers (sbitmap to_clear_bitmap, uint32_t *padding_bits_to_clear,
+		  int padding_bits_len, rtx scratch_reg, rtx clearing_reg)
+{
+  bool saved_clearing = false;
+  rtx saved_clearing_reg = NULL_RTX;
+  int i, regno, clearing_regno, minregno = R0_REGNUM, maxregno = minregno - 1;
+
+  gcc_assert (arm_arch_cmse);
+
+  if (!bitmap_empty_p (to_clear_bitmap))
+{
+  minregno = bitmap_first_set_bit (to_clear_bitmap);
+  maxregno = bitmap_last_set_bit (to_clear_bitmap);
+}
+  clearing_regno = REGNO (clearing_reg);
+
+  /* Clear padding bits.  */
+  gcc_assert (padding_bits_len <= NUM_ARG_REGS);
+  for (i = 0, regno = R0_REGNUM; i < padding_bits_len; i++, regno++)
+{
+  uint64_t mask;
+  rtx rtx16, dest, cleared_reg = gen_rtx_REG (SImode, regno);
+
+  if (padding_bits_to_clear[i] == 0)
+	continue;
+
+  /* If this is a Thumb-1 target and SCRATCH_REG is not a low register, use
+	 CLEARING_REG as scratch.  */
+  if (TARGET_THUMB1
+	  && REGNO (scratch_reg) > LAST_LO_REGNUM)
+	{
+	  /* clearing_reg is not to be cleared, copy its value into scratch_reg
+	 such that we can use clearing_reg to clear the unused bits in the
+	 arguments.  */
+	  if ((clearing_regno > maxregno
+	   || !bitmap_bit_p (to_clear_bitmap, clearing_regno))
+	  && !saved_clearing)
+	{
+	  gcc_assert (clearing_regno <= LAST_LO_REGNUM);
+	  emit_move_insn (scratch_reg, clearing_reg);
+	  saved_clearing = true;
+	  saved_clearing_reg = scratch_reg;
+	}
+	  scratch_reg = clearing_reg;
+	}
+
+  /* Fill the lower half of the negated padding_bits_to_clear[i].  */
+  mask = (~padding_bits_to_clear[i]) & 0x;
+  emit_move_insn (scratch_reg, gen_int_mode (mask, SImode));
+
+  /* Fill the top half of the negated padding_bits_to_clear[i].  */
+  mask = (~padding_bits_to_clear[i]) >> 16;
+  rtx16 = gen_int_mode (16, SImode);
+  dest = gen_rtx_ZERO_EXTRACT (SImode, scratch_reg, rtx16, rtx16);
+  if (mask)
+	emit_insn (gen_rtx_SET (dest, gen_int_mode (mask, SImode)));
+
+  emit_insn (gen_andsi3 (cleared_reg, cleared_reg, scratch_reg));
+}
+  if (saved_clearing)
+emit_move_insn (clearing_reg, saved_clearing_reg);
+
+
+  /* Clear full registers.  */
+
+  /* If not marked for clearing, clearing_reg already does not contain
+ any secret.  */
+  if (clearing_regno <= ma

[PATCH, GCC/ARM] Do not clobber r4 in Armv8-M nonsecure call

2017-11-15 Thread Thomas Preudhomme

Hi,

Expanders for Armv8-M nonsecure call unnecessarily clobber r4 despite
the libcall they perform not writing to r4.  Furthermore, the
requirement for the branch target address to be in r4 as expected by
the libcall is modeled in a convoluted way in the define_insn patterns:
the address is a register match_operand constrained by the match_dup
for the clobber which is guaranteed to be r4 due to the expander.

This patch simplifies all this by simply requiring the address to be in
r4 and removing the clobbers. Expanders are left alone because
cmse_nonsecure_call_clear_caller_saved relies on branch target memory
attributes which would be lost if expanding to reg:SI R4_REGNUM.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* config/arm/arm.md (R4_REGNUM): Define constant.
(nonsecure_call_internal): Remove r4 clobber.
(nonsecure_call_value_internal): Likewise.
* config/arm/thumb1.md (nonsecure_call_reg_thumb1_v5): Remove second
clobber and resequence match_operands.
(nonsecure_call_value_reg_thumb1_v5): Likewise.
* config/arm/thumb2.md (nonsecure_call_reg_thumb2): Likewise.
(nonsecure_call_value_reg_thumb2): Likewise.

Testing: Bootstrapped on arm-linux-gnueabihf and testsuite shows no
regression.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ddb9d8f359007c1d86d497aef0ff5fc0e4061813..6b0794ede9fbc5a4f41e1f4a92acb9b649a277bc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -30,6 +30,7 @@
 (define_constants
   [(R0_REGNUM 0)	; First CORE register
(R1_REGNUM	  1)	; Second CORE register
+   (R4_REGNUM	  4)	; Fifth CORE register
(IP_REGNUM	 12)	; Scratch register
(SP_REGNUM	 13)	; Stack pointer
(LR_REGNUM	 14)	; Return address register
@@ -8118,14 +8119,13 @@
 			   UNSPEC_NONSECURE_MEM)
 		(match_operand 1 "general_operand" ""))
 	  (use (match_operand 2 "" ""))
-	  (clobber (reg:SI LR_REGNUM))
-	  (clobber (reg:SI 4))])]
+	  (clobber (reg:SI LR_REGNUM))])]
   "use_cmse"
   "
   {
 rtx tmp;
 tmp = copy_to_suggested_reg (XEXP (operands[0], 0),
- gen_rtx_REG (SImode, 4),
+ gen_rtx_REG (SImode, R4_REGNUM),
  SImode);
 
 operands[0] = replace_equiv_address (operands[0], tmp);
@@ -8210,14 +8210,13 @@
 UNSPEC_NONSECURE_MEM)
 			 (match_operand 2 "general_operand" "")))
 	  (use (match_operand 3 "" ""))
-	  (clobber (reg:SI LR_REGNUM))
-	  (clobber (reg:SI 4))])]
+	  (clobber (reg:SI LR_REGNUM))])]
   "use_cmse"
   "
   {
 rtx tmp;
 tmp = copy_to_suggested_reg (XEXP (operands[1], 0),
- gen_rtx_REG (SImode, 4),
+ gen_rtx_REG (SImode, R4_REGNUM),
  SImode);
 
 operands[1] = replace_equiv_address (operands[1], tmp);
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 5d196a673355a7acf7d0ed30f21b997b815913f5..f91659386bf240172bd9a3076722683c8a50dff4 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1732,12 +1732,11 @@
 )
 
 (define_insn "*nonsecure_call_reg_thumb1_v5"
-  [(call (unspec:SI [(mem:SI (match_operand:SI 0 "register_operand" "l*r"))]
+  [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))]
 		UNSPEC_NONSECURE_MEM)
-	 (match_operand 1 "" ""))
-   (use (match_operand 2 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 0))]
+	 (match_operand 0 "" ""))
+   (use (match_operand 1 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB1 && use_cmse && !SIBLING_CALL_P (insn)"
   "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
@@ -1779,12 +1778,11 @@
 (define_insn "*nonsecure_call_value_reg_thumb1_v5"
   [(set (match_operand 0 "" "")
 	(call (unspec:SI
-	   [(mem:SI (match_operand:SI 1 "register_operand" "l*r"))]
+	   [(mem:SI (reg:SI R4_REGNUM))]
 	   UNSPEC_NONSECURE_MEM)
-	  (match_operand 2 "" "")))
-   (use (match_operand 3 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 1))]
+	  (match_operand 1 "" "")))
+   (use (match_operand 2 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB1 && use_cmse"
   "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 776d611d2538e790a5f504995050ffdfc51d7193..d56a8bd167575263edc2a4b3f66bda34a4a7a72a 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -555,12 +555,11 @@
 )
 
 (define_insn "*nonsecure_call_reg_thumb2"
-  [(call (unspec:SI [(mem:SI (match_operand:SI 0 "s_register_operand" "r"))]
+  [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))]
 		UNSPEC_NONSECURE_MEM)
-	 (match_operand 1 "" ""))
-   (use (match_operand 2 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 0))]
+	 (match_operand 0 "" ""))
+   (use (match_operand 1 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB2 && use_cmse"
   "bl\\t__gnu_cmse_nonsecure_ca

Re: [PATCH] Use bswap framework in store-merging (PR tree-optimization/78821)

2017-11-17 Thread Thomas Preudhomme

Hi Jakub,

On 16/11/17 17:06, Jakub Jelinek wrote:

Hi!

This patch uses the bswap pass framework inside of the store merging
pass to handle adjacent stores which produce together a 16/32/64 bit
store of bswapped value (loaded or from SSA_NAME) or identity (usually
only from SSA_NAME, the code prefers to use the existing store merging
code if coming from identity load, because it e.g. can handle arbitrary
sizes, not just 16/32/64 bits).

There are small tweaks to the bswap code to make it usable inside of
the store merging pass.  Then when processing the stores, we record
what find_bswap_or_nop_1 returns and do a small sanity check on it,
and when doing coalesce_immediate_stores (i.e. the splitting into
groups), we try for 64-bit, 32-bit and 16-bit sizes if we can extend/shift
(according to endianness) and perform_symbolic_merge them together.
If it is possible, we turn those 2+ adjacent stores that make together
{64,32,16} bits into a separate group and process it specially later
(we need to treat it as a single store rather than multiple, so
split_group is only very lightweight for that case).
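
As a hedged illustration of the kind of source this targets (not taken from
the patch or its testcase), the sketch below writes a 32-bit value with four
adjacent byte stores in big-endian order. On a little-endian target that is
equivalent to a single 32-bit store of the byte-swapped value, which
store_bswap32 spells out. The function names are hypothetical, and whether
GCC actually merges such stores depends on the version, target and options.

```c
#include <stdint.h>

/* Four adjacent byte stores that together write X in big-endian order.  */
void
store_be32 (unsigned char *p, uint32_t x)
{
  p[0] = (unsigned char) (x >> 24);
  p[1] = (unsigned char) (x >> 16);
  p[2] = (unsigned char) (x >> 8);
  p[3] = (unsigned char) x;
}

/* Reference form: swap the bytes explicitly, then write the result
   least-significant byte first, i.e. the memory layout a little-endian
   32-bit store would produce.  */
void
store_bswap32 (unsigned char *p, uint32_t x)
{
  uint32_t swapped = ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8)
                     | ((x & 0x00FF0000u) >> 8) | ((x & 0xFF000000u) >> 24);

  p[0] = (unsigned char) (swapped & 0xFF);
  p[1] = (unsigned char) ((swapped >> 8) & 0xFF);
  p[2] = (unsigned char) ((swapped >> 16) & 0xFF);
  p[3] = (unsigned char) ((swapped >> 24) & 0xFF);
}
```

Both functions leave the same four bytes in memory, which is why the group
of byte stores can be treated as one wider store of a bswapped value.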


Nice, the two finally merged! I took a look at the bswap part and it all looked
good to me, both code- and comment-wise. I only have one small nit regarding a
space/tab change (see below).




Bootstrapped/regtested on {x86_64,i686,powerpc64le,powerpc64}-linux, ok for 
trunk?

The cases this patch can handle are less common than rhs_code INTEGER_CST
(stores of constants to adjacent memory) or MEM_REF (adjacent memory
copying), but are more common than the bitwise ops, during combined
x86_64+i686 bootstraps/regtests it triggered:
lrotate_expr  974   2528
nop_expr  720   1711
(lrotate_expr stands for bswap, nop_expr for identity, the first column is
the actual count of such new stores, the second is the original number of
stores that have been optimized this way).


Are you saying that lrotate_expr is just the title and it also includes 32- and 
64-bit bswap or is it only the count of lrotate_expr nodes?




2017-11-16  Jakub Jelinek  

PR tree-optimization/78821
* gimple-ssa-store-merging.c (find_bswap_or_nop_load): Give up
if base is TARGET_MEM_REF.  If base is not MEM_REF, set base_addr
to the address of the base rather than the base itself.
(find_bswap_or_nop_1): Just use pointer comparison for vuse check.
(find_bswap_or_nop_finalize): New function.
(find_bswap_or_nop): Use it.
(bswap_replace): Return a tree rather than bool, change first
argument from gimple * to gimple_stmt_iterator, allow inserting
into an empty sequence, allow ins_stmt to be NULL - then emit
all stmts into gsi.  Fix up MEM_REF address gimplification.
(pass_optimize_bswap::execute): Adjust bswap_replace caller.
Formatting fix.
(struct store_immediate_info): Add N and INS_STMT non-static
data members.
(store_immediate_info::store_immediate_info): Initialize them
from newly added ctor args.
(merged_store_group::apply_stores): Formatting fixes.  Sort by
bitpos at the end.
(stmts_may_clobber_ref_p): For stores call also
refs_anti_dependent_p.
(gather_bswap_load_refs): New function.
(imm_store_chain_info::try_coalesce_bswap): New method.
(imm_store_chain_info::coalesce_immediate_stores): Use it.
(split_group): Handle LROTATE_EXPR and NOP_EXPR rhs_code specially.
(imm_store_chain_info::output_merged_store): Fail if number of
new estimated stmts is bigger or equal than old.  Handle LROTATE_EXPR
and NOP_EXPR rhs_code.
(pass_store_merging::process_store): Compute n and ins_stmt, if
ins_stmt is non-NULL and the store rhs is otherwise invalid, use
LROTATE_EXPR rhs_code.  Pass n and ins_stmt to store_immediate_info
ctor.
(pass_store_merging::execute): Calculate dominators.

* gcc.dg/store_merging_16.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2017-11-16 10:45:09.239185205 +0100
+++ gcc/gimple-ssa-store-merging.c  2017-11-16 15:34:08.560080214 +0100
@@ -369,7 +369,10 @@ find_bswap_or_nop_load (gimple *stmt, tr
base_addr = get_inner_reference (ref, &bitsize, &bitpos, &offset, &mode,
   &unsignedp, &reversep, &volatilep);
  
-  if (TREE_CODE (base_addr) == MEM_REF)

+  if (TREE_CODE (base_addr) == TARGET_MEM_REF)
+/* Do not rewrite TARGET_MEM_REF.  */
+return false;
+  else if (TREE_CODE (base_addr) == MEM_REF)
  {
offset_int bit_offset = 0;
tree off = TREE_OPERAND (base_addr, 1);
@@ -401,6 +404,8 @@ find_bswap_or_nop_load (gimple *stmt, tr
  
bitpos += bit_offset.to_shwi ();

  }
+  else
+base_addr = build_fold_addr_expr (base_addr);
  
if (bitpos % BITS_PER_UNIT)

  return false;
@@ -743,8 +748,7 @@ find_bswap_or_nop_1 (gimple *stmt, struc
  if (TYPE_PRECISION (n1.type) != TYPE_PRECIS

Re: [PATCH, GCC/ARM] Fix cmse_nonsecure_entry return insn size

2017-11-21 Thread Thomas Preudhomme

Hi Kyrill,

On 09/11/17 14:26, Kyrill Tkachov wrote:

Hi Thomas,

On 08/11/17 09:50, Thomas Preudhomme wrote:

Hi,

A number of instructions are output in assembler form by
output_return_instruction () when compiling a function with the
cmse_nonsecure_entry attribute for Armv8-M Mainline with the hard-float
ABI. However, the corresponding thumb2_cmse_entry_return insn pattern
does not account for all of these instructions when computing the
length of the instruction.

This may lead GCC to use the wrong branching instruction due to
incorrect computation of the offset between the branch instruction's
address and the target address.
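
To make the failure mode concrete, here is a minimal sketch, assuming a
simplified model in which the distance to a branch target is the sum of the
declared lengths of the instructions in between. The lengths 12 and 34 are
the old and new values from the patch; the short-branch reach of 254 bytes,
the 240-byte filler and the function names are hypothetical and chosen only
for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reach of a short forward branch, in bytes.  */
#define SHORT_BRANCH_MAX 254

/* Sum the byte lengths of the instructions lying between a branch and
   its target.  */
int
distance_in_bytes (const int *insn_lengths, size_t count)
{
  int total = 0;

  for (size_t i = 0; i < count; i++)
    total += insn_lengths[i];
  return total;
}

/* Return true if a short branch can cover DISTANCE bytes.  */
bool
short_branch_reaches (int distance)
{
  return distance <= SHORT_BRANCH_MAX;
}
```

With a 240-byte filler followed by the return sequence, the declared length
of 12 gives 252 bytes and a short branch looks safe, whereas the 34 bytes
actually emitted give 274 bytes, which is out of reach in this model.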

This commit fixes the mismatch between what output_return_instruction ()
does and what the pattern thinks it does, and adds a note warning about
such mismatches to the heading comments of the affected functions so that
the code does not get out of sync again.

Note: no test is provided because the C testcase is fragile (only works
on GCC 6) and the extracted RTL test fails to compile due to bugs in the
RTL frontend (PR82815 and PR82817).

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-30  Thomas Preud'homme 

* config/arm/arm.c (output_return_instruction): Add comments to
indicate requirement for cmse_nonsecure_entry return to account
for the size of clearing instruction output here.
(thumb_exit): Likewise.
* config/arm/thumb2.md (thumb2_cmse_entry_return): Fix length for
return in hardfloat mode.

Testing: Bootstrapped on arm-linux-gnueabihf and testsuite shows no
regression.

Is this ok for trunk?



Ok for trunk and for the branches after a few days.


I've committed the patch to gcc-7-branch (see attached) after another round of
testing, since nobody has reported a regression in the meantime. Thanks.


Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 989957f048e3c757ef4665d0387ecdc66d26a7dd..7b3f4c1011dc37cb01654f70cfbffadd57d382ec 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19316,7 +19316,12 @@ arm_get_vfp_saved_size (void)
 
 /* Generate a function exit sequence.  If REALLY_RETURN is false, then do
everything bar the final return instruction.  If simple_return is true,
-   then do not output epilogue, because it has already been emitted in RTL.  */
+   then do not output epilogue, because it has already been emitted in RTL.
+
+   Note: do not forget to update length attribute of corresponding insn pattern
+   when changing assembly output (eg. length attribute of
+   thumb2_cmse_entry_return when updating Armv8-M Mainline Security Extensions
+   register clearing sequences).  */
 const char *
 output_return_instruction (rtx operand, bool really_return, bool reverse,
bool simple_return)
@@ -23809,7 +23814,12 @@ thumb_pop (FILE *f, unsigned long mask)
 
 /* Generate code to return from a thumb function.
If 'reg_containing_return_addr' is -1, then the return address is
-   actually on the stack, at the stack pointer.  */
+   actually on the stack, at the stack pointer.
+
+   Note: do not forget to update length attribute of corresponding insn pattern
+   when changing assembly output (eg. length attribute of epilogue_insns when
+   updating Armv8-M Baseline Security Extensions register clearing
+   sequences).  */
 static void
 thumb_exit (FILE *f, int reg_containing_return_addr)
 {
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 2e7580f220eae1524fef69719b1796f50f5cf27c..35f8e9bbf24058c129cbb117c74d1a4bebbf9f38 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1132,7 +1132,7 @@
; we adapt the length accordingly.
(set (attr "length")
  (if_then_else (match_test "TARGET_HARD_FLOAT")
-  (const_int 12)
+  (const_int 34)
   (const_int 8)))
; We do not support predicate execution of returns from cmse_nonsecure_entry
; functions because we need to clear the APSR.  Since predicable has to be


Re: [PATCH, GCC/ARM] Use bitmap to control cmse_nonsecure_call register clearing

2017-11-22 Thread Thomas Preudhomme

Thanks Kyrill.

Committed the attached rebased patch (same patch but without the last hunk 
because a better fix was done in an earlier commit).


Best regards,

Thomas

On 22/11/17 11:57, Kyrill Tkachov wrote:

Hi Thomas,

On 15/11/17 17:08, Thomas Preudhomme wrote:

Hi,

As part of r253256, cmse_nonsecure_entry_clear_before_return has been
rewritten to use auto_sbitmap instead of an integer bitfield to control
which registers need to be cleared. This commit continues this work in
cmse_nonsecure_call_clear_caller_saved.
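
The difference between the two styles can be sketched as follows. This is
not GCC's sbitmap: struct regset and the regset_* helpers are hypothetical
stand-ins that only mimic the bitmap_clear, bitmap_set_range,
bitmap_clear_bit and bitmap_bit_p calls used in the patch, and the register
numbers passed in are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size register set with room for 128 register
   numbers; callers must keep register numbers below 128.  */
struct regset
{
  uint64_t word[2];
};

void
regset_clear (struct regset *set)
{
  memset (set, 0, sizeof *set);
}

void
regset_set_range (struct regset *set, unsigned start, unsigned count)
{
  for (unsigned regno = start; regno < start + count; regno++)
    set->word[regno / 64] |= UINT64_C (1) << (regno % 64);
}

void
regset_clear_bit (struct regset *set, unsigned regno)
{
  set->word[regno / 64] &= ~(UINT64_C (1) << (regno % 64));
}

bool
regset_bit_p (const struct regset *set, unsigned regno)
{
  return (set->word[regno / 64] >> (regno % 64)) & 1;
}

/* Mark the four argument registers and a range of floating-point
   registers for clearing, then exempt the register holding the call
   address.  No explicit shifts or masks appear at this level.  */
void
mark_registers_to_clear (struct regset *set, unsigned first_fp,
                         unsigned last_fp, unsigned address_regno)
{
  regset_clear (set);
  regset_set_range (set, 0, 4);
  regset_set_range (set, first_fp, last_fp - first_fp + 1);
  regset_clear_bit (set, address_regno);
}
```

The caller reads as a list of set operations, and the set is not limited to
the width of a single integer.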

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-16  Thomas Preud'homme 

    * config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Use
    auto_sbitmap instead of integer bitfield to control register needing
    clearing.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?



Ok for trunk.
Thanks for this conversion. It's much easier to understand the code
without having to think about the bitmasks and shifts.

Kyrill


Best regards,

Thomas


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 106e3edce0d6f2518eb391c436c5213a78d1275b..092cd61d49382101bce9b8c5f04de31965dcdc77 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17007,10 +17007,11 @@ cmse_nonsecure_call_clear_caller_saved (void)
 
   FOR_BB_INSNS (bb, insn)
 	{
-	  uint64_t to_clear_mask, float_mask;
+	  unsigned address_regnum, regno, maxregno =
+	TARGET_HARD_FLOAT_ABI ? D7_VFP_REGNUM : NUM_ARG_REGS - 1;
+	  auto_sbitmap to_clear_bitmap (maxregno + 1);
 	  rtx_insn *seq;
 	  rtx pat, call, unspec, reg, cleared_reg, tmp;
-	  unsigned int regno, maxregno;
 	  rtx address;
 	  CUMULATIVE_ARGS args_so_far_v;
 	  cumulative_args_t args_so_far;
@@ -17041,18 +17042,21 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	continue;
 
 	  /* Determine the caller-saved registers we need to clear.  */
-	  to_clear_mask = (1LL << (NUM_ARG_REGS)) - 1;
-	  maxregno = NUM_ARG_REGS - 1;
+	  bitmap_clear (to_clear_bitmap);
+	  bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS);
+
 	  /* Only look at the caller-saved floating point registers in case of
 	 -mfloat-abi=hard.  For -mfloat-abi=softfp we will be using the
 	 lazy store and loads which clear both caller- and callee-saved
 	 registers.  */
 	  if (TARGET_HARD_FLOAT_ABI)
 	{
-	  float_mask = (1LL << (D7_VFP_REGNUM + 1)) - 1;
-	  float_mask &= ~((1LL << FIRST_VFP_REGNUM) - 1);
-	  to_clear_mask |= float_mask;
-	  maxregno = D7_VFP_REGNUM;
+	  auto_sbitmap float_bitmap (maxregno + 1);
+
+	  bitmap_clear (float_bitmap);
+	  bitmap_set_range (float_bitmap, FIRST_VFP_REGNUM,
+D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1);
+	  bitmap_ior (to_clear_bitmap, to_clear_bitmap, float_bitmap);
 	}
 
 	  /* Make sure the register used to hold the function address is not
@@ -17060,7 +17064,9 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  address = RTVEC_ELT (XVEC (unspec, 0), 0);
 	  gcc_assert (MEM_P (address));
 	  gcc_assert (REG_P (XEXP (address, 0)));
-	  to_clear_mask &= ~(1LL << REGNO (XEXP (address, 0)));
+	  address_regnum = REGNO (XEXP (address, 0));
+	  if (address_regnum < R0_REGNUM + NUM_ARG_REGS)
+	bitmap_clear_bit (to_clear_bitmap, address_regnum);
 
 	  /* Set basic block of call insn so that df rescan is performed on
 	 insns inserted here.  */
@@ -17081,6 +17087,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter)
 	{
 	  rtx arg_rtx;
+	  uint64_t to_clear_args_mask;
 	  machine_mode arg_mode = TYPE_MODE (arg_type);
 
 	  if (VOID_TYPE_P (arg_type))
@@ -17093,10 +17100,18 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type,
 	  true);
 	  gcc_assert (REG_P (arg_rtx));
-	  to_clear_mask
-		&= ~compute_not_to_clear_mask (arg_type, arg_rtx,
-	   REGNO (arg_rtx),
-	   padding_bits_to_clear_ptr);
+	  to_clear_args_mask
+		= compute_not_to_clear_mask (arg_type, arg_rtx,
+	 REGNO (arg_rtx),
+	 padding_bits_to_clear_ptr);
+	  if (to_clear_args_mask)
+		{
+		  for (regno = R0_REGNUM; regno <= maxregno; regno++)
+		{
+		  if (to_clear_args_mask & (1ULL << regno))
+			bitmap_clear_bit (to_clear_bitmap, regno);
+		}
+		}
 
 	  first_param = false;
 	}
@@ -17155,7 +17170,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	 call.  */
 	  for (regno = R0_REGNUM; regno <= maxregno; regno++)
 	{
-	  if (!(to_clear_mask & (1LL << regno)))
+	  if (!bitmap_bit_p (to_clear_bitmap, regno))
 		continue;
 
 	  /* If regno is an even vfp register and its successor is also to
@@ -17164,7 +17179,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 		{
 		  if (TARGET_VFP_DOUBLE
 		  &&

[PATCH, GCC/ARM] Remove useless variable in CMSE code

2017-11-22 Thread Thomas Preudhomme

Hi,

Functions cmse_nonsecure_call_clear_caller_saved () and
cmse_nonsecure_entry_clear_before_return () use a separate variable
holding a pointer to padding_bits_to_clear array's first entry which is
used when calling function compute_not_to_clear_mask ().  This does not
save space over using &padding_bits_to_clear[0] directly so this commit
gets rid of it.
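
A minimal sketch of the equivalence, assuming a hypothetical callee
record_padding in place of compute_not_to_clear_mask: passing the address
of the first array element directly behaves the same as passing a separate
pointer variable initialised to that address.

```c
#include <stdint.h>

/* Hypothetical stand-in for a callee that records padding bits through
   a pointer to the first array entry.  */
void
record_padding (uint32_t *padding_bits, uint32_t bits)
{
  padding_bits[0] |= bits;
}

/* Before: a separate variable holds the pointer.  */
uint32_t
with_extra_pointer (void)
{
  uint32_t padding_bits_to_clear[4] = {0U, 0U, 0U, 0U};
  uint32_t *padding_bits_to_clear_ptr = &padding_bits_to_clear[0];

  record_padding (padding_bits_to_clear_ptr, 0xFF00U);
  return padding_bits_to_clear[0];
}

/* After: the address is taken at the call site.  */
uint32_t
without_extra_pointer (void)
{
  uint32_t padding_bits_to_clear[4] = {0U, 0U, 0U, 0U};

  record_padding (&padding_bits_to_clear[0], 0xFF00U);
  return padding_bits_to_clear[0];
}
```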

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-11-08  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Get rid of
padding_bits_to_clear_ptr.
(cmse_nonsecure_entry_clear_before_return): Likewise.

Testing: Bootstrapped an arm-none-linux-gnueabihf compiler; the
testsuite shows no regression.

Committed as obvious.

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7384b96fea0179334a6010b099df68c8e2a0fc32..bcb708c1b316ea08969e118fb0949b941ff19c27 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17002,7 +17002,6 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  bool using_r4, first_param = true;
 	  function_args_iterator args_iter;
 	  uint32_t padding_bits_to_clear[4] = {0U, 0U, 0U, 0U};
-	  uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear[0];
 
 	  if (!NONDEBUG_INSN_P (insn))
 	continue;
@@ -17086,7 +17085,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  to_clear_args_mask
 		= compute_not_to_clear_mask (arg_type, arg_rtx,
 	 REGNO (arg_rtx),
-	 padding_bits_to_clear_ptr);
+	 &padding_bits_to_clear[0]);
 	  if (to_clear_args_mask)
 		{
 		  for (regno = R0_REGNUM; regno <= maxregno; regno++)
@@ -25134,7 +25133,6 @@ cmse_nonsecure_entry_clear_before_return (void)
 {
   int regno, maxregno = TARGET_HARD_FLOAT ? LAST_VFP_REGNUM : IP_REGNUM;
   uint32_t padding_bits_to_clear = 0;
-  uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
   auto_sbitmap to_clear_bitmap (maxregno + 1);
   tree result_type;
   rtx result_rtl;
@@ -25187,7 +25185,7 @@ cmse_nonsecure_entry_clear_before_return (void)
   gcc_assert (REG_P (result_rtl));
   to_clear_return_mask
 	= compute_not_to_clear_mask (result_type, result_rtl, 0,
- padding_bits_to_clear_ptr);
+ &padding_bits_to_clear);
   if (to_clear_return_mask)
 	{
 	  gcc_assert ((unsigned) maxregno < sizeof (long long) * __CHAR_BIT__);


Re: [PATCH, GCC/ARM] Factor out CMSE register clearing code

2017-11-22 Thread Thomas Preudhomme



On 22/11/17 14:45, Kyrill Tkachov wrote:

Hi Thomas,

On 15/11/17 17:12, Thomas Preudhomme wrote:

Hi,

Functions cmse_nonsecure_call_clear_caller_saved and
cmse_nonsecure_entry_clear_before_return both contain very similar code
to clear registers. What's worse, they differ slightly at times, so if a
bug is found in one, careful thought is needed to decide whether the
other function needs fixing too.

This commit addresses the situation by factoring the two pieces of code
into a new function. In doing so the code generated to clear VFP
registers in cmse_nonsecure_call now uses the same sequence as
cmse_nonsecure_entry functions. Test expectations are thus updated
accordingly.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme 

    * config/arm/arm.c (cmse_clear_registers): New function.
    (cmse_nonsecure_call_clear_caller_saved): Replace register clearing
    code by call to cmse_clear_registers.
    (cmse_nonsecure_entry_clear_before_return): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-24  Thomas Preud'homme 

    * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Adapt expectations
    to vmov instructions now generated.
    * gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
    * gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?



This looks mostly ok, but I have a concern from reading the code that I'd like 
some help with...


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9b494e9529a4470c18192a4561e03d2f80e90797..22c9add0722974902b2a89b2b0a75759ff8ba37c 100644

--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16991,6 +16991,128 @@ compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,

    return not_to_clear_mask;
  }

+/* Clear registers secret before doing a cmse_nonsecure_call or returning from
+   a cmse_nonsecure_entry function.  TO_CLEAR_BITMAP indicates which registers
+   are to be fully cleared, using the value in register CLEARING_REG if more
+   efficient.  The PADDING_BITS_LEN entries array PADDING_BITS_TO_CLEAR gives
+   the bits that needs to be cleared in caller-saved core registers, with
+   SCRATCH_REG used as a scratch register for that clearing.
+
+   NOTE: one of three following assertions must hold:
+   - SCRATCH_REG is a low register
+   - CLEARING_REG is in the set of registers fully cleared (ie. its bit is set
+ in TO_CLEAR_BITMAP)
+   - CLEARING_REG is a low register.  */
+
+static void
+cmse_clear_registers (sbitmap to_clear_bitmap, uint32_t *padding_bits_to_clear,
+  int padding_bits_len, rtx scratch_reg, rtx clearing_reg)
+{
+  bool saved_clearing = false;
+  rtx saved_clearing_reg = NULL_RTX;
+  int i, regno, clearing_regno, minregno = R0_REGNUM, maxregno = minregno - 1;
+

Here minregno becomes 0 and maxregno becomes -1...

+  gcc_assert (arm_arch_cmse);
+
+  if (!bitmap_empty_p (to_clear_bitmap))
+    {
+  minregno = bitmap_first_set_bit (to_clear_bitmap);
+  maxregno = bitmap_last_set_bit (to_clear_bitmap);
+    }


...and here is a path on which maxregno may not be set to a proper register number...



If the bitmap is empty, yes, i.e. if no bit is set and no register should be cleared.



+
+  for (regno = minregno; regno <= maxregno; regno++)
+    {
+  if (!bitmap_bit_p (to_clear_bitmap, regno))
+    continue;
+

...and here we iterate from minregno (potentially 0) to maxregno (potentially 
-1) which will lead to trouble.

Are there any guarantees that this case will not occur?


It absolutely does occur and that's on purpose. If maxregno is -1 it means there 
is no bit to clear and so it is fine to do nothing.
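
The empty-range behaviour can be checked with a small sketch. It assumes a
plain 32-bit mask in place of the sbitmap, and the function name is
hypothetical; the point is only that with signed int variables the loop
condition 0 <= -1 is false on entry, so the body never runs.

```c
#include <stdint.h>

/* Hypothetical stand-in for the loop in cmse_clear_registers: TO_CLEAR
   is a 32-bit mask rather than an sbitmap.  Returns how many registers
   the loop would clear.  */
int
count_registers_to_clear (uint32_t to_clear)
{
  /* Same initialisation as in the patch: the empty range [0, -1].  */
  int minregno = 0, maxregno = minregno - 1;
  int regno, cleared = 0;

  if (to_clear != 0)
    {
      /* Plays the role of bitmap_first_set_bit and bitmap_last_set_bit.  */
      for (minregno = 0; !(to_clear & (UINT32_C (1) << minregno)); minregno++)
        ;
      for (maxregno = 31; !(to_clear & (UINT32_C (1) << maxregno)); maxregno--)
        ;
    }

  /* With an empty mask this compares 0 <= -1 and the body never runs.  */
  for (regno = minregno; regno <= maxregno; regno++)
    {
      if (!(to_clear & (UINT32_C (1) << regno)))
        continue;
      cleared++;
    }
  return cleared;
}
```

This relies on the variables being signed, as they are in the patch. If
they were unsigned, -1 would wrap to a large value and the loop would not
be empty.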


Best regards,

Thomas


[arm-embedded] [PATCH, GCC/LTO, ping] Fix PR69866: LTO with def for weak alias in regular object file

2017-11-28 Thread Thomas Preudhomme

Hi,

We have decided to apply the forwarded patch to the embedded-7-branch to fix an 
ICE when doing partial LTO with weak symbols.


ChangeLog entry is as follows:

2017-11-28  Thomas Preud'homme  

Backport from mainline
2017-06-15  Jan Hubicka  
Thomas Preud'homme  

PR lto/69866
* lto-symtab.c (lto_symtab_merge_symbols): Drop useless definitions
that resolved externally.

Backport from mainline
2017-06-15  Thomas Preud'homme  

PR lto/69866
* gcc.dg/lto/pr69866_0.c: New test.
* gcc.dg/lto/pr69866_1.c: Likewise.


Best regards,

Thomas
--- Begin Message ---
Hi,
I am testing the following. Let me know if it works for you.

Honza

Index: lto/lto-symtab.c
===
--- lto/lto-symtab.c(revision 249213)
+++ lto/lto-symtab.c(working copy)
@@ -952,6 +952,42 @@
  if (tgt)
node->resolve_alias (tgt, true);
}
+ /* If the symbol was preempted outside IR, see if we want to get rid
+of the definition.  */
+ if (node->analyzed
+ && !DECL_EXTERNAL (node->decl)
+ && (node->resolution == LDPR_PREEMPTED_REG
+ || node->resolution == LDPR_RESOLVED_IR
+ || node->resolution == LDPR_RESOLVED_EXEC
+ || node->resolution == LDPR_RESOLVED_DYN))
+   {
+ DECL_EXTERNAL (node->decl) = 1;
+ /* If alias to local symbol was preempted by external definition,
+we know it is not pointing to the local symbol.  Remove it.  */
+ if (node->alias
+ && !node->weakref
+ && !node->transparent_alias
+ && node->get_alias_target ()->binds_to_current_def_p ())
+   {
+ node->alias = false;
+ node->remove_all_references ();
+ node->definition = false;
+ node->analyzed = false;
+ node->cpp_implicit_alias = false;
+   }
+ else if (!node->alias
+  && node->definition
+  && node->get_availability () <= AVAIL_INTERPOSABLE)
+   {
+ if ((cnode = dyn_cast <cgraph_node *> (node)) != NULL)
+   cnode->reset ();
+ else
+   {
+ node->analyzed = node->definition = false;
+ node->remove_all_references ();
+   }
+   }
+   }
 
  if (!(cnode = dyn_cast <cgraph_node *> (node))
  || !cnode->clone_of
--- End Message ---


[PATCH, GCC/testsuite] Improve fstack_protector effective target

2017-11-30 Thread Thomas Preudhomme

Hi,

Effective target fstack_protector fails to return an error for
newlib-based targets (such as arm-none-eabi targets), which do not
support stack protector. This is because the test is too simplistic for
GCC to generate stack protection code: it does not contain a
local buffer and does not read unknown input.

This commit adds a small local buffer with a copy of the filename to
trigger stack protector code to be generated. The filename is used
instead of the full path so as to ensure the size will fit in the local
buffer.
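
The idea can be sketched outside the testsuite as follows. This is a hedged
variant, not the test program from the patch: copy_filename is a
hypothetical helper, and unlike the patch it falls back to the whole string
when the path contains no '/' and bounds the copy to the buffer size.

```c
#include <stddef.h>
#include <string.h>

/* Copy the last component of PATH (including its leading '/') into a
   small local buffer and return the number of characters copied.  The
   local character array is what gives the compiler a reason to emit
   stack protector code for this function.  */
size_t
copy_filename (const char *path)
{
  char buf[64];
  const char *name = strrchr (path, '/');

  /* Assumption not made by the patch: tolerate a path without '/'.  */
  if (name == NULL)
    name = path;

  strncpy (buf, name, sizeof buf - 1);
  buf[sizeof buf - 1] = '\0';
  return strlen (buf);
}
```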

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-28  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_fstack_protector):
Copy filename in local buffer to trigger stack protection.

Testing: Ran gcc.dg/pr38616 on arm-none-eabi and arm-linux-gnueabihf;
the former is now UNSUPPORTED while the latter continues to PASS.

Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index d30fd368922713d3695f22710197ce7094c977cd..8aff16a25823ec48e76ad6ad8fdc8db998a45877 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1064,7 +1064,11 @@ proc check_effective_target_static {} {
 # Return 1 if the target supports -fstack-protector
 proc check_effective_target_fstack_protector {} {
 return [check_runtime fstack_protector {
-	int main (void) { return 0; }
+	#include <string.h>
+	int main (int argc, char *argv[]) {
+	  char buf[64];
+	  return !strcpy (buf, strrchr (argv[0], '/'));
+	}
 } "-fstack-protector"]
 }
 


[PATCH, GCC/testsuite] Fix dump-noaddr dumpbase

2017-12-05 Thread Thomas Preudhomme

Hi,

dump-noaddr test FAILS when $tmpdir is not the same as the directory
where runtest is called from. Note that this does not happen when
running make check because tmpdir is set to srcdir.

In that case, file mkdir will create the directory in the current
directory while GCC is invoked from tmpdir and hence -dumpbase looks
for dump1 and dump2 relative to tmpdir. This patch forces dumpbase to
be relative to tmpdir which will work in all cases.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-12-05  Thomas Preud'homme  

* gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Set dump base
relative to tmpdir.

Testing: Successfully ran unsorted.exp via make check and out of tree
testing using runtest from /test with tmpdir set in
/test/site.exp to .

Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
index d14d494570944b2be82c2575204cdbf4b15721ca..68d6c3e38325cabbdd280ecf05e663dbcda99900 100644
--- a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
+++ b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
@@ -11,10 +11,10 @@ proc dump_compare { src options } {
 foreach option $option_list {
 	file delete -force dump1
 	file mkdir dump1
-	c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
+	c-torture-compile $src "$option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
 	file delete -force dump2
 	file mkdir dump2
-	c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
+	c-torture-compile $src "$option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
 	foreach dump1 [lsort [glob -nocomplain dump1/*]] {
 	regsub dump1/ $dump1 dump2/ dump2
 	set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"


Re: [PATCH, GCC/testsuite] Fix dump-noaddr dumpbase

2017-12-05 Thread Thomas Preudhomme



On 05/12/17 17:54, Andrew Pinski wrote:

On Tue, Dec 5, 2017 at 9:50 AM, Thomas Preudhomme
 wrote:

Hi,

dump-noaddr test FAILS when $tmpdir is not the same as the directory
where runtest is called from. Note that this does not happen when
running make check because tmpdir is set to srcdir.

In that case, file mkdir will create the directory in the current
directory while GCC is invoked from tmpdir and hence -dumpbase looks
for dump1 and dump2 relative to tmpdir. This patch forces dumpbase to
be relative to tmpdir which will work in all cases.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-12-05  Thomas Preud'homme  

 * gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Set dump base
 relative to tmpdir.

Testing: Successfully ran unsorted.exp via make check and out of tree
testing using runtest from /test with tmpdir set in
/test/site.exp to .

Is this ok for stage3?


https://gcc.gnu.org/ml/gcc-patches/2012-06/msg01752.html
I don't remember where this discussion went last time.
Maybe this time there will be a resolution :).


FWIW, I agree with Matt, creating the dump in tmpdir makes more sense. I think 
his patch can be simplified though because the compiler seems to be invoked from 
tmpdir so it can at least be omitted from the -dumpbase.


Best regards,

Thomas


Re: [PATCH, GCC/testsuite] Fix dump-noaddr dumpbase

2017-12-05 Thread Thomas Preudhomme

Hi Mike,

Thanks, I've tested after the two commits and it works both in tree and out of
tree. It'll simplify comparing in-tree results with out-of-tree ones for us, thanks a lot!


Would you consider a backport to stable branches if nobody complains after a 
week?

Best regards,

Thomas

On 05/12/17 19:27, Mike Stump wrote:

On Dec 5, 2017, at 11:11 AM, Thomas Preudhomme  
wrote:


On 05/12/17 17:54, Andrew Pinski wrote:

On Tue, Dec 5, 2017 at 9:50 AM, Thomas Preudhomme
 wrote:

Hi,

dump-noaddr test FAILS when $tmpdir is not the same as the directory
where runtest is called from. Note that this does not happen when
running make check because tmpdir is set to srcdir.

In that case, file mkdir will create the directory in the current
directory while GCC is invoked from tmpdir and hence -dumpbase looks
for dump1 and dump2 relative to tmpdir. This patch forces dumpbase to
be relative to tmpdir which will work in all cases.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-12-05  Thomas Preud'homme  

 * gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Set dump base
 relative to tmpdir.

Testing: Successfully ran unsorted.exp via make check and out of tree
testing using runtest from /test with tmpdir set in
/test/site.exp to .

Is this ok for stage3?

https://gcc.gnu.org/ml/gcc-patches/2012-06/msg01752.html
I don't remember where this discussion went last time.
Maybe this time there will be a resolution :).


FWIW, I agree with Matt, creating the dump in tmpdir makes more sense. I think 
his patch can be simplified though because the compiler seems to be invoked 
from tmpdir so it can at least be omitted from the -dumpbase.


Sounds reasonable.  I've added that on top of his patch and checked that in.  
Let us know if it works or not.



[PATCH, GCC/ARM] Multilib mapping for Armv8-R

2018-02-13 Thread Thomas Preudhomme

Hi,

Because there is no multilib mapping for Armv8-R, the default multilib,
which targets -march=armv4t with softfloat floating-point arithmetic, is
used. This patch maps Armv8-R to the existing Armv7 multilibs instead.
Note that since there is no single-precision multilib compatible with
the R profile, -march=armv8-r+fp.sp is mapped to -march=armv7, i.e. Armv7
with softfloat floating-point.
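
As a reading aid, the mapping can be summarised as a small shell function.
This is only a sketch derived from the -print-multi-directory output in the
appendix of this message, not from the t-multilib rules themselves; the
function name expected_multilib is made up for illustration.

```shell
# expected_multilib EXT FLOAT_ABI
# Mirrors the appendix output: EXT is the extension string passed after
# -march=armv8-r (possibly empty), FLOAT_ABI is soft, softfp or hard.
expected_multilib() {
    ext=$1
    abi=$2
    # With -mfloat-abi=soft every combination selects the no-FP Armv7 multilib.
    if [ "$abi" = soft ]; then
        echo "thumb/v7/nofp"
        return
    fi
    case "$ext" in
        *simd*|*crypto*)
            # +simd and +crypto imply an FPU, so the Armv7+FP multilib is used.
            echo "thumb/v7+fp/$abi" ;;
        *)
            # No extension, +crc and +fp.sp only: no FP multilib applies.
            if [ "$abi" = softfp ]; then
                echo "thumb/v7/nofp"
            else
                echo "."
            fi ;;
    esac
}

expected_multilib "+crc+fp.sp" softfp
expected_multilib "+fp.sp+simd" hard
```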

Changelog entry is as follows:

*** gcc/ChangeLog ***

2018-02-12  Thomas Preud'homme  

* config/arm/t-multilib: Map Armv8-R to Armv7 multilibs.

Testing:

Ran -print-multi-directory for all combinations of extensions one can
pass to -march=armv8-r (including no extension, but considering only a
single ordering of the extensions). All gave the expected result. Details in
appendix.

Is this ok for stage4?

Best regards,

Thomas

Appendix: output of -print-multi-directory for all extensions available
to -march=armv8-r

% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto \
    +fp.sp+simd +fp.sp+crypto +simd+crypto +crc+fp.sp+simd +crc+fp.sp+crypto \
    +crc+simd+crypto +fp.sp+simd+crypto +crc+fp.sp+simd+crypto ; do \
    cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=soft -print-multi-directory" ; \
    echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+simd -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+simd -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+simd+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp


% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto \
    +fp.sp+simd +fp.sp+crypto +simd+crypto +crc+fp.sp+simd +crc+fp.sp+crypto \
    +crc+simd+crypto +fp.sp+simd+crypto +crc+fp.sp+simd+crypto ; do \
    cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=softfp -print-multi-directory" ; \
    echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=softfp -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=softfp -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp -mfloat-abi=softfp -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+simd -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp -mfloat-abi=softfp -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+simd -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp


% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto 
+fp.sp+simd +fp.sp+crypto +simd+cryp

Re: [PATCH, GCC/ARM] Multilib mapping for Armv8-R

2018-02-16 Thread Thomas Preudhomme
/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp


% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto \
    +fp.sp+simd +fp.sp+crypto +simd+crypto +crc+fp.sp+simd +crc+fp.sp+crypto \
    +crc+simd+crypto +fp.sp+simd+crypto +crc+fp.sp+simd+crypto ; do \
    cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=hard -print-multi-directory" ; \
    echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+fp.sp -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+simd -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+crc+simd -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+fp.sp+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+simd+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard


On 13/02/18 10:27, Kyrill Tkachov wrote:

Hi Thomas,

On 13/02/18 10:24, Thomas Preudhomme wrote:

Hi,

Because there is no multilib mapping for Armv8-R, the default multilib,
which targets -march=armv4t with softfloat floating-point arithmetic, is
used. This patch maps Armv8-R to the existing Armv7 multilibs instead.
Note that since there is no single-precision multilib compatible with
the R profile, -march=armv8-r+fp.sp is mapped to -march=armv7, i.e. Armv7
with softfloat floating-point.



Thanks for doing this.


Changelog entry is as follows:

*** gcc/ChangeLog ***

2018-02-12  Thomas Preud'homme 

    * config/arm/t-multilib: Map Armv8-R to Armv7 multilibs.

Testing:

Ran -print-multi-directory for all combinations of extensions one can
pass to -march=armv8-r (including no extension, but considering only a
single ordering of the extensions). All gave the expected result. Details in
appendix.

Is this ok for stage4?

Best regards,

Thomas

Appendix: output of -print-multi-directory for all extensions available
to -march=armv8-r



Can you please add a representative subset of these as tests to 
gcc.target/arm/multilib.exp.
That way we can have the peace of mind that they have sane mappings as we go 
forward.




diff --git a/gcc/config/arm/t-multilib b/gcc/config/arm/t-multilib
index 2f790097670e1bf81b56b069a6b1582763aab6e9..cd5927a7c9ec053b4d5b9725f7b30daeca3b1aa3 100644

--- a/gcc/config/arm/t-multilib
+++ b/gcc/config/arm/t-multilib
@@ -70,6 +70,7 @@ v8_a_simd_variants    := $(call all_feat_combs, simd crypto)
  v8_1_a_simd_variants    := $(call all_feat_combs, simd crypto)
  v8_2_a_simd_variants    := $(call all_feat_combs, simd fp16 fp16fml crypto dotprod)

  v8_4_a_simd_variants    := $(call all_feat_combs, simd fp16 crypto)
+v8_r_nosimd_variants    := $(call all_feat_combs, crc fp.sp)

  ifneq (,$(HAS_APROFILE))
  include $(srcdir)/config/arm/t-aprofile
@@ -105,6 +106,20 @@ MULTILIB_MATCHES    += march?armv7+fp=march?armv7-r+fp+idiv

  MULTILIB_MATCHES    += $(foreach ARCH, $(all_early_arch), \
   march?armv5te+fp=march?$(ARCH)+fp)
+#
+# Armv8-r: map down onto common v7 code.


Please use Armv8-R.



+# Note 1: there is no single-precision armv7 multilib so +fp.sp is mapped
+# down to softfloat armv7 (second MULTILIB_MATCHES).
+# Note 2: +fp.sp being a subset of +simd and +crypto, there is no need to
+# consider the combination of +fp.sp with a simd extension since matching
+# is run after canonicalization
+MULTILIB_MATCHES    += march?armv7=march?armv8-r
+MULTILIB_MATCHES    += $(foreach ARCH, $(v8_r_nosimd_variants), \
+ march?armv7=march?armv8-r$(ARCH))
+MULTILIB_MATCHES    += $(foreach ARCH

[PATCH, arm-embedded] Multilib mapping for Armv8-R

2018-02-27 Thread Thomas Preudhomme

Hi,

We have decided to apply the following patch to the
ARM/embedded-7-branch to provide better multilib for Armv8-R targets.

Because there is no multilib mapping for Armv8-R, the default multilib
built for -march=armv4t with softfloat floating-point arithmetic is
used. This patch maps Armv8-R to the existing Armv7 multilibs instead.
Note that the mapping for single-precision Armv8-R has been left out because
there is no Arm implementation of that architecture variant.

Changelog entry is as follows:

*** gcc/ChangeLog ***

2018-02-26  Thomas Preud'homme  

* config/arm/t-rmprofile: Map Armv8-R and Armv8-R with CRC extension to
Armv7 multilibs.

Testing:

Ran -print-multi-directory for all combinations of
-march=armv8-r/-march=armv8-r+crc with
-mfpu=neon-fp-armv8/crypto-neon-fp-armv8. All gave the expected result. Details
in appendix.

Is this ok for stage4?

Best regards,

Thomas

Appendix: output of -print-multi-directory for all supported Armv8-R
configurations, single-precision FPU excepted.

% for ext in "" +crc; do \
    cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=soft -print-multi-directory" ; \
    echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=soft -print-multi-directory: thumb/v7-ar
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=soft -print-multi-directory: thumb/v7-ar


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do \
    cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=softfp -print-multi-directory" ; \
    echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do \
    cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=hard -print-multi-directory" ; \
    echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard


% for ext in "" +crc; do \
    cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=soft -print-multi-directory" ; \
    echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=soft -print-multi-directory: .
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=soft -print-multi-directory: .


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do \
    cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=softfp -print-multi-directory" ; \
    echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do \
    cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=hard -print-multi-directory" ; \
    echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index d4bc9fde4c5544812bde4743ccc18d68c1c25132..a3a24d59fb29b42a36177bd2d2ebfae4e50e5a10 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -135,6 +135,8 @@ MULTILIB_MATCHES   += ma

[arm-embedded] Allow -mcpu=cortex-m33+nodsp

2018-02-27 Thread Thomas Preudhomme

Hi, we decided to apply the following patch to ARM/embedded-7-branch to
support -mcpu=cortex-m33+nodsp.

DSP instructions are optional for Arm Cortex-M33, yet its -mcpu option
does not allow +nodsp. Users are thus left with using
-march=armv8-m.main -mtune=cortex-m33. This patch creates a new CPU,
cortex-m33+nodsp, since GCC 7 has no mechanism for CPU
extensions. Since GCC passes the -mcpu parameter down to GAS verbatim
and GAS does not support +nodsp for cortex-m33, this patch also
special-cases -mcpu=cortex-m33+nodsp in arm_file_start to output a .arch
directive instead of .cpu.

2018-02-26  Thomas Preud'homme  

* config/arm/arm-cpus.in (cortex-m33+nodsp): New CPU.
* config/arm/arm-cpu-cdata.h: Regenerate.
* config/arm/arm-cpu-data.h: Likewise.
* config/arm/arm-cpu.h: Likewise.
* config/arm/arm-tables.opt: Likewise.
* config/arm/arm-tune.md: Likewise.
* config/arm/arm.c (arm_file_start): Special case
-mcpu=cortex-m33+nodsp to emit .arch armv8-m.main instead.
* doc/invoke.texi: Document cortex-m33+nodsp as a valid value for -mcpu
and -mtune.

Testing: Compiled a hello world with -S -mcpu=cortex-m33 and with
-S -mcpu=cortex-m33+nodsp and compared both assembly files. The latter
correctly emits .arch armv8-m.main instead of .cpu cortex-m33.

Best regards,

Thomas
diff --git a/gcc/ChangeLog.arm b/gcc/ChangeLog.arm
index a98ecb028f6800a516f6cd252390ceac1e08911b..e09bd132d224aee511591143d86efff8bb156d60 100644
--- a/gcc/ChangeLog.arm
+++ b/gcc/ChangeLog.arm
@@ -1,3 +1,9 @@
+2018-02-26  Thomas Preud'homme  
+
+	* config/arm/arm-cpus.in (cortex-m33+nodsp): Define.
+	* doc/invoke.texi: Document +nodsp as a valid extension for
+	-mcpu=cortex-m33.
+
 2017-11-23  Thomas Preud'homme  
 
 	Cherry-pick from GCC 7
diff --git a/gcc/config/arm/arm-cpu-cdata.h b/gcc/config/arm/arm-cpu-cdata.h
index 27571c841d928fe9c331006bfc9608c4e75b60d8..f5e34c830ca28196ded0912c230f719a6ff5681e 100644
--- a/gcc/config/arm/arm-cpu-cdata.h
+++ b/gcc/config/arm/arm-cpu-cdata.h
@@ -789,6 +789,13 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
 },
   },
   {
+"cortex-m33+nodsp",
+{
+  ISA_ARMv8m_main,
+  isa_nobit
+},
+  },
+  {
 "cortex-r52",
 {
   ISA_ARMv8r,isa_bit_crc32,
diff --git a/gcc/config/arm/arm-cpu-data.h b/gcc/config/arm/arm-cpu-data.h
index e474efa02ed93a93ae00ac2057a9bc841c48b87f..30902ecabc6c72e46e6f6aa1d92b9980fd639dcd 100644
--- a/gcc/config/arm/arm-cpu-data.h
+++ b/gcc/config/arm/arm-cpu-data.h
@@ -1221,6 +1221,17 @@ static const struct processors all_cores[] =
 &arm_v7m_tune
   },
   {
+"cortex-m33+nodsp",
+TARGET_CPU_cortexm33nodsp,
+(TF_LDSCHED),
+"8M_MAIN", BASE_ARCH_8M_MAIN,
+{
+  ISA_ARMv8m_main,
+  isa_nobit
+},
+&arm_v7m_tune
+  },
+  {
 "cortex-r52",
 TARGET_CPU_cortexr52,
 (TF_LDSCHED),
diff --git a/gcc/config/arm/arm-cpu.h b/gcc/config/arm/arm-cpu.h
index 502965081faa625abc93d97559517baf50972e1b..22566495fdf0da0ad75b81a5956eecb898c38684 100644
--- a/gcc/config/arm/arm-cpu.h
+++ b/gcc/config/arm/arm-cpu.h
@@ -130,6 +130,7 @@ enum processor_type
   TARGET_CPU_cortexa73cortexa53,
   TARGET_CPU_cortexm23,
   TARGET_CPU_cortexm33,
+  TARGET_CPU_cortexm33nodsp,
   TARGET_CPU_cortexr52,
   TARGET_CPU_arm_none
 };
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 5f18dfb35687888bc7f642785693f75658a96733..7368a067db92b384f83fdb4a0af6cb77cff4e6f4 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1090,6 +1090,13 @@ begin cpu cortex-m33
  costs v7m
 end cpu cortex-m33
 
+begin cpu cortex-m33+nodsp
+ cname cortexm33nodsp
+ tune flags LDSCHED
+ architecture armv8-m.main
+ costs v7m
+end cpu cortex-m33+nodsp
+
 # V8 R-profile implementations.
 begin cpu cortex-r52
  cname cortexr52
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index ede44f497edd69390bbbe6de5a913430b546c547..a46bc3c7f8ba6048969bae4d37a7be3c5242ce6a 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -349,6 +349,9 @@ EnumValue
 Enum(processor_type) String(cortex-m33) Value( TARGET_CPU_cortexm33)
 
 EnumValue
+Enum(processor_type) String(cortex-m33+nodsp) Value( TARGET_CPU_cortexm33nodsp)
+
+EnumValue
 Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)
 
 Enum
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 519c0556fe76a5a391cd268bb50541c77a4596d4..542b7972d21cd3c9986229e91ce0841522e3b52f 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -57,5 +57,5 @@
 	cortexa73,exynosm1,xgene1,
 	cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,
 	cortexa73cortexa53,cortexm23,cortexm33,
-	cortexr52"
+	cortexm33nodsp,cortexr52"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8f2639f722b1c6a7a3541aa030221811f565fe5e..b37a8ae475489f2f12f8d0

Re: [PATCH, GCC/testsuite] Fix dump-noaddr dumpbase

2018-03-01 Thread Thomas Preudhomme
Finally committed to gcc-7-branch, sorry for doing this so late. I've merged the 
two commits into one. Patch attached for reference.


Best regards,

Thomas

On 05/12/17 21:26, Mike Stump wrote:

On Dec 5, 2017, at 12:56 PM, Thomas Preudhomme  
wrote:


Thanks, I've tested after the two commits and it works both in tree and out of 
tree. It'll simplify comparing in-tree results vs out-of-tree ones for us, thanks a 
lot!

Would you consider a backport to stable branches if nobody complains after a 
week?


Yeah, back port is Ok.

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index b211dec4ffb20359f50bbc695481977282eb0525..b78c5f59bfc1121cf61071e41bd11551a9ab7122 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,12 @@
+2017-02-27  Thomas Preud'homme  
+
+	Backport from mainline
+	2017-12-05  Matthew Gretton-Dann  
+	with follow-up r255433 commit.
+
+	* gcc.c-torture/unsorted/dump-noaddr.x: Generate dump files in
+	tmpdir.
+
 2018-02-26  Carl Love  
 
 	Backport from mainline: commit 257747 on 2018-02-16.
diff --git a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
index d14d494570944b2be82c2575204cdbf4b15721ca..e86f36a1861fc4dc46bd449d78403f510ec4b920 100644
--- a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
+++ b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
@@ -9,14 +9,14 @@ proc dump_compare { src options } {
 
 # loop through all the options
 foreach option $option_list {
-	file delete -force dump1
-	file mkdir dump1
+	file delete -force $tmpdir/dump1
+	file mkdir $tmpdir/dump1
 	c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
-	file delete -force dump2
-	file mkdir dump2
+	file delete -force $tmpdir/dump2
+	file mkdir $tmpdir/dump2
 	c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
-	foreach dump1 [lsort [glob -nocomplain dump1/*]] {
-	regsub dump1/ $dump1 dump2/ dump2
+	foreach dump1 [lsort [glob -nocomplain $tmpdir/dump1/*]] {
+	set dump2 "$tmpdir/dump2/[file tail $dump1]"
 	set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"
 	regsub {\.\d+((t|r|i)\.[^.]+)$} $dumptail {.*\1} dumptail
 	set tmp [ diff "$dump1" "$dump2" ]
@@ -29,8 +29,8 @@ proc dump_compare { src options } {
 	}
 	}
 }
-file delete -force dump1
-file delete -force dump2
+file delete -force $tmpdir/dump1
+file delete -force $tmpdir/dump2
 }
 
 dump_compare $src $options


[PATCH, GCC/testsuite/ARM] Fix copysign_softfloat_1.c option directives

2018-03-01 Thread Thomas Preudhomme

gcc.target/arm/copysign_softfloat_1.c's use of arm_arch_v6t2 in
dg-add-options changes the architecture to -march=armv6t2. Since the test
only requires a Thumb-2 capable architecture, we just need to add -mthumb
to the command line: arm_thumb2_ok guarantees by definition that
doing so is enough to select Thumb-2. This fixes a warning on the
command line when -mcpu=cortex-m3 is in RUNTESTFLAGS, for instance.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2018-03-01  Thomas Preud'homme  

diff --git a/gcc/testsuite/gcc.target/arm/copysign_softfloat_1.c b/gcc/testsuite/gcc.target/arm/copysign_softfloat_1.c
index fdbeeadc01e1c9b9a7810a8ff8b23c58f6c429a5..a14922f1c12aeb4a22ee38fde188691d5a89de81 100644
--- a/gcc/testsuite/gcc.target/arm/copysign_softfloat_1.c
+++ b/gcc/testsuite/gcc.target/arm/copysign_softfloat_1.c
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target arm_thumb2_ok } */
-/* { dg-add-options arm_arch_v6t2 } */
-/* { dg-additional-options "-O2 --save-temps" } */
+/* { dg-additional-options "-mthumb -O2 --save-temps" } */
 
 extern void abort (void);
 


[PATCH, GCC/testsuite] Fix FAIL display for some scan-*-times directives

2018-03-13 Thread Thomas Preudhomme

Hi,

scan-assembler-times and scan-tree-dump-times dejagnu directives show a
different output in the summary files depending on whether they PASS or
FAIL. This means that dg-cmp-results would not show a regression because
it would not see a connection between the two outputs.

The difference comes from the FAIL showing the actual number of times
the pattern was matched, presumably to help debugging. This patch moves
the information about the actual number of matches into a
separate verbose message. This keeps the FAIL message identical to the PASS
message but lets developers get the required debug message with -v.
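
To illustrate why the extra text matters, here is a minimal sketch that
assumes a comparison tool pairs results by the text following the PASS:/FAIL:
status. The helper names test_name and same_test are made up, and the pattern
and counts in the sample lines are invented for illustration; this is not how
dg-cmp-results is actually implemented.

```shell
# Strip the leading "PASS: " / "FAIL: " status so only the test name is left.
test_name() {
    printf '%s\n' "${1#*: }"
}

# Print "yes" when two summary lines refer to the same test name, else "no".
same_test() {
    if [ "$(test_name "$1")" = "$(test_name "$2")" ]; then
        echo yes
    else
        echo no
    fi
}

# Invented sample lines: a PASS, the old FAIL format and the new FAIL format.
pass_line='PASS: gcc.dg/nand.c scan-assembler-times nand 2'
old_fail='FAIL: gcc.dg/nand.c scan-assembler-times nand 2 (found 3 times)'
new_fail='FAIL: gcc.dg/nand.c scan-assembler-times nand 2'

echo "old FAIL format pairs with PASS: $(same_test "$pass_line" "$old_fail")"
echo "new FAIL format pairs with PASS: $(same_test "$pass_line" "$new_fail")"
```

With the old format the two names differ, so a PASS turning into a FAIL looks
like one test disappearing and another appearing rather than a regression.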

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2018-03-13  Thomas Preud'homme  

* lib/scanasm.exp (scan-assembler-times): Move FAIL debug info into a
separate verbose message.
* lib/scandump.exp (scan-dump-times): Likewise.

Testing: Made a modified version of gcc.dg/nand.c and
gcc.dg/torture/pr61772.c to FAIL their scan-assembler-times and
scan-tree-dump-times respective directives. Without the patch
dg-cmp-results does not flag any regression but does with the patch.

Is this ok for stage 4?

Best regards,

Thomas
diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 3a775b0a812775193cf1181337a5b890cde74133..61e0f3f48aeea5785689c5df7a15dc2ccbc71029 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -266,7 +266,8 @@ proc scan-assembler-times { args } {
 if {$result_count == $times} {
 	pass "$testcase scan-assembler-times $pp_pattern $times"
 } else {
-	fail "$testcase scan-assembler-times $pp_pattern $times (found $result_count times)"
+	verbose -log "$testcase: $pp_pattern found $result_count times"
+	fail "$testcase scan-assembler-times $pp_pattern $times"
 }
 }
 
diff --git a/gcc/testsuite/lib/scandump.exp b/gcc/testsuite/lib/scandump.exp
index 4e3da972ae4ed09c9874eb384daf825e6e2dcde3..be8fbe8b461dc81d5683fe323c0913f678daa1e0 100644
--- a/gcc/testsuite/lib/scandump.exp
+++ b/gcc/testsuite/lib/scandump.exp
@@ -110,7 +110,8 @@ proc scan-dump-times { args } {
 if {$result_count == $times} {
 pass "$testname"
 } else {
-fail "$testname (found $result_count times)"
+	verbose -log "$testcase: pattern found $result_count times"
+fail "$testname"
 }
 }
 


[arm-embedded][PATCH] Add multilib mapping for -mcpu=cortex-m33+nodsp

2018-03-15 Thread Thomas Preudhomme

Hi,

Currently -mcpu=cortex-m33+nodsp gets assigned the thumb multilib due to
the lack of a mapping from -mcpu=cortex-m33+nodsp to an -march option. This
leads to link failures, because Armv4T Thumb code from the multilib
gets linked with the Armv8-M Mainline code being compiled.

This patch adds the appropriate mapping.

ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2018-03-14  Thomas Preud'homme  

* config/arm/t-rmprofile: Add mapping from -mcpu=cortex-m33+nodsp to
-march=armv8-m.main.

Testing: A hello world fails to link without the patch with a multilib
build but succeeds with the patch.

We've decided to apply this patch to the ARM/embedded-7-branch branch.

Best regards,

Thomas
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index a3a24d59fb29b42a36177bd2d2ebfae4e50e5a10..54411795215b8aff90ba9cfb806ec7b33db4caea 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -102,6 +102,7 @@ MULTILIB_MATCHES   += march?armv7e-m=mcpu?cortex-m4
 MULTILIB_MATCHES   += march?armv7e-m=mcpu?cortex-m7
 MULTILIB_MATCHES   += march?armv8-m.base=mcpu?cortex-m23
 MULTILIB_MATCHES   += march?armv8-m.main=mcpu?cortex-m33
+MULTILIB_MATCHES   += march?armv8-m.main=mcpu?cortex-m33+nodsp
 MULTILIB_MATCHES   += march?armv7=mcpu?cortex-r4
 MULTILIB_MATCHES   += march?armv7=mcpu?cortex-r4f
 MULTILIB_MATCHES   += march?armv7=mcpu?cortex-r5


[arm-embedded][PATCH] Add multilib mapping for -mcpu=cortex-r52

2018-03-15 Thread Thomas Preudhomme

Hi,

Currently -mcpu=cortex-r52 gets assigned the default multilib due to
lack of mapping from -mcpu=cortex-r52 to an -march option. This is
inconsistent with -march=armv8-r which gets the thumb/v7-ar multilib.

This patch adds the appropriate mapping.

ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2018-03-15  Thomas Preud'homme  

* config/arm/t-rmprofile: Add mapping from -mcpu=cortex-r52 to
-march=armv7.

Testing: -mcpu=cortex-r52 -print-multi-directory prints . (i.e. default
multilib) without the patch with a multilib build but prints the
expected thumb/v7-ar with the patch.

We've decided to apply this patch to the ARM/embedded-7-branch.

Best regards,

Thomas


[PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-04 Thread Thomas Preudhomme

Hi,

The __builtin_cmse_nonsecure_caller implementation returns true in almost
all cases due to two separate bugs:

* gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
* the lsb boolean value is not negated, but the specification [1] says
  the intrinsic should return true for a nonsecure caller, and a
  nonsecure caller is characterized by LR's lsb being 0
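
The effect of the two bugs can be checked with plain integer arithmetic. The
sketch below only models the computation on a 32-bit LR value; the function
names and the sample return addresses are made up for illustration and are
not part of the patch.

```shell
# Pre-patch behaviour: LR + 1 (wrapped to 32 bits), treated as a boolean.
buggy_nonsecure_caller() {
    sum=$(( ($1 + 1) & 0xFFFFFFFF ))
    echo $(( sum != 0 ))
}

# Patched behaviour: extract the lsb with AND, then negate it.
fixed_nonsecure_caller() {
    echo $(( ($1 & 1) == 0 ))
}

# Hypothetical return addresses: lsb 0 (nonsecure caller), lsb 1 (secure caller).
for lr in 0x08000100 0x08000101; do
    echo "LR=$lr buggy=$(buggy_nonsecure_caller $lr) fixed=$(fixed_nonsecure_caller $lr)"
done
```

The buggy version yields 1 for both sample addresses, and only yields 0 when
LR + 1 wraps around to zero, which matches the "almost all cases" observation.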

This was not caught due to (1) the lack of a runtime test and (2) the existing
RTL scan not taking into account that '.' matches newline in Tcl regular
expressions.

This patch fixes the implementation issues and improves testing of
cmse_nonsecure_caller by (1) adding a runtime test for the secure caller
case and (2) looking for a SET insn of an AND expression in the right
function. This leaves the nonsecure caller case only partly tested,
since the exact value being ANDed and the negation are not covered by the
scan, and the existing test infrastructure does not allow two separate
compilations and links to be performed. It is enough, though, to catch the
current incorrect behavior.

The patch also reorganizes the scan directives in cmse-1.c to make it easier to
identify which function in the file they are intended to test.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-04  Thomas Preud'homme  

PR target/85203
* config/arm/arm-builtins.c (arm_expand_builtin): Change
expansion to perform a bitwise AND of the argument followed by a
boolean negation of the result.

*** gcc/testsuite/ChangeLog ***

2018-04-04  Thomas Preud'homme  

PR target/85203
* gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan
to match a single insn of the baz function.  Move scan directives at
the end of the file below the functions they are trying to test for
better readability.
* gcc.target/arm/cmse/cmse-16.c: New testcase.

Testing: No bootstrap since only M profile builtin code has been changed
but regression testing for arm-none-eabi targeting Arm Cortex-M23 and
Cortex-M33 shows no regression.

Is this ok for stage4?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 8940d1f6311bccf86664ab2eaa938735eec595f6..184eb2a934308717b6e1054e376487a297f8d5de 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -2600,7 +2600,9 @@ arm_expand_builtin (tree exp,
 case ARM_BUILTIN_CMSE_NONSECURE_CALLER:
   target = gen_reg_rtx (SImode);
   op0 = arm_return_addr (0, NULL_RTX);
-  emit_insn (gen_addsi3 (target, op0, const1_rtx));
+  emit_insn (gen_andsi3 (target, op0, const1_rtx));
+  op1 = gen_rtx_EQ (SImode, target, const0_rtx);
+  emit_insn (gen_cstoresi4 (target, op1, target, const0_rtx));
   return target;
 
 case ARM_BUILTIN_TEXTRMSB:
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
index c13272eed683aa06db027cd4646e5fe67817212b..f764153cb17b796ccd0d20abb78d5cf56be52911 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
@@ -71,6 +71,20 @@ baz (void)
 {
   return cmse_nonsecure_caller ();
 }
+/* { dg-final { scan-assembler "baz:" } } */
+/* { dg-final { scan-assembler "__acle_se_baz:" } } */
+/* { dg-final { scan-assembler-not "\tcmse_nonsecure_caller" } } */
+/* Look for an andsi of 1 with a register in function baz, ie.
+
+;; Function baz
+
+(insn  (set (reg:SI )
+ (and:SI (reg:SI )
+	 (const_int 1 )
+   >
+(insn
+*/
+/* { dg-final { scan-rtl-dump "\n;; Function baz\[^\n\]*\[^(\]+\[^;\]*\n\\(insn \[^(\]+ \\(set \\(reg\[^:\]*:SI \[^)\]+\\)\[^(\]*\\(and:SI \\(reg\[^:\]*:SI \[^)\]+\\)\[^(\]*\\((const_int 1|reg\[^:\]*:SI) \[^)\]+\\)\[^(\]+(\\(nil\\)\[^(\]+)?\\(insn" expand } } */
 
 typedef int __attribute__ ((cmse_nonsecure_call)) (int_nsfunc_t) (void);
 
@@ -86,6 +100,11 @@ qux (int_nsfunc_t * callback)
 {
   fp = cmse_nsfptr_create (callback);
 }
+/* { dg-final { scan-assembler "qux:" } } */
+/* { dg-final { scan-assembler "__acle_se_qux:" } } */
+/* { dg-final { scan-assembler "bic" } } */
+/* { dg-final { scan-assembler "push\t\{r4, r5, r6" } } */
+/* { dg-final { scan-assembler "msr\tAPSR_nzcvq" } } */
 
 int call_callback (void)
 {
@@ -94,13 +113,4 @@ int call_callback (void)
   else
 return default_callback ();
 }
-/* { dg-final { scan-assembler "baz:" } } */
-/* { dg-final { scan-assembler "__acle_se_baz:" } } */
-/* { dg-final { scan-assembler "qux:" } } */
-/* { dg-final { scan-assembler "__acle_se_qux:" } } */
-/* { dg-final { scan-assembler-not "\tcmse_nonsecure_caller" } } */
-/* { dg-final { scan-rtl-dump "and.*reg.*const_int 1" expand } } */
-/* { dg-final { scan-assembler "bic" } } */
-/* { dg-final { scan-assembler "push\t\{r4, r5, r6" } } */
-/* { dg-final { scan-assembler "msr\tAPSR_nzcvq" } } */
 /* { dg-final { scan-assembler-times "bl\\s+__gnu_cmse_nonsecure_call" 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-16.c b/gcc/testsui

Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-04 Thread Thomas Preudhomme

Oops, forgot the link.

On 04/04/18 18:03, Thomas Preudhomme wrote:

Hi,

__builtin_cmse_nonsecure_caller implementation returns true in almost
all cases due to 2 separate bugs:

* gen_addsi is used instead of gen_andsi to retrieve the lsb
* the lsb boolean value is not negated but the specification [1] says
   the intrinsic should return true for a nonsecure caller and a
   nonsecure caller is characterized with LR's lsb being 0


[1] https://static.docs.arm.com/ecm0359818/10/ECM0359818_armv8m_security_extensions_reqs_on_dev_tools_1_0.pdf


Best regards,

Thomas





Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-04 Thread Thomas Preudhomme

Hi Kyrill,

On 04/04/18 18:19, Kyrill Tkachov wrote:

Hi Thomas,


Ok, thanks for fixing this.
Does this need backporting to the branches?


Yes to gcc-7-branch only.

Best regards,

Thomas


Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-05 Thread Thomas Preudhomme

Hi Kyrill,

On 04/04/18 18:20, Thomas Preudhomme wrote:

Hi Kyrill,

On 04/04/18 18:19, Kyrill Tkachov wrote:

Hi Thomas,


Ok, thanks for fixing this.
Does this need backporting to the branches?


Yes to gcc-7-branch only.


The patch applies cleanly on gcc-7-branch and the same testing shows no 
regression. Ok to apply to gcc-7-branch once the patch has baked for 7 days in 
trunk?


Best regards,

Thomas


[PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin

2018-04-06 Thread Thomas Preudhomme

The instruction pattern for setting the FPSCR expects the input value to
be in a register. However, the __builtin_arm_set_fpscr expander does not
ensure that this is the case and, as a result, GCC ICEs when the builtin
is called with a constant literal.

This commit fixes the builtin to force the input value into a register.
It also removes the unneeded volatile in the existing fpscr test and
fixes the function prototype.
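
The failure mode and the fix can be pictured with a toy model in C. None
of the names below (toy_operand, toy_const, toy_force_reg,
toy_emit_set_fpscr) exist in GCC; they are assumptions made for
illustration, with assert standing in for the internal compiler error.

```c
#include <assert.h>

/* Toy stand-ins for RTL operands; these are not GCC types.  */
enum toy_kind { TOY_CONST, TOY_REG };

struct toy_operand
{
  enum toy_kind kind;
  unsigned value;
};

/* Build a constant operand, as expanding a literal argument would.  */
static struct toy_operand
toy_const (unsigned value)
{
  struct toy_operand op = { TOY_CONST, value };
  return op;
}

/* Models force_reg: a register operand is returned unchanged, anything
   else is copied into a (toy) register holding the same value.  */
static struct toy_operand
toy_force_reg (struct toy_operand op)
{
  op.kind = TOY_REG;
  return op;
}

/* Models emitting the set_fpscr insn: the pattern only accepts a
   register operand, and a mismatch is an internal error.  Returns the
   value that would be written to the FPSCR.  */
static unsigned
toy_emit_set_fpscr (struct toy_operand op)
{
  assert (op.kind == TOY_REG);
  return op.value;
}
```

In this model, toy_emit_set_fpscr (toy_const (0)) would trip the
assertion, while toy_emit_set_fpscr (toy_force_reg (toy_const (0)))
succeeds, which mirrors the one-line change in the expander.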

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-04-06  Thomas Preud'homme  

PR target/85261
* config/arm/arm-builtins.c (arm_expand_builtin): Force input operand
into register.

*** gcc/testsuite/ChangeLog ***

2018-04-06  Thomas Preud'homme  

PR target/85261
* gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with
literal value.  Expect 2 MCR instructions.  Fix function prototype.
Remove volatile keyword.

Testing: Built an arm-none-eabi GCC cross-compiler and testsuite shows
no regression.

Is this ok for stage4?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 8940d1f6311bccf86664ab2eaa938735eec595f6..e100d933a77c5de4a13cb961d1bff40f57f2ea80 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -2592,7 +2592,7 @@ arm_expand_builtin (tree exp,
 	  icode = CODE_FOR_set_fpscr;
 	  arg0 = CALL_EXPR_ARG (exp, 0);
 	  op0 = expand_normal (arg0);
-	  pat = GEN_FCN (icode) (op0);
+	  pat = GEN_FCN (icode) (force_reg (SImode, op0));
 	}
   emit_insn (pat);
   return target;
diff --git a/gcc/testsuite/gcc.target/arm/fpscr.c b/gcc/testsuite/gcc.target/arm/fpscr.c
index 7b4d71d72d8964f6da0d0604bf59aeb4a895df43..4c3eaf7fcf75ad8582071ecb110fd1e4976a3b24 100644
--- a/gcc/testsuite/gcc.target/arm/fpscr.c
+++ b/gcc/testsuite/gcc.target/arm/fpscr.c
@@ -6,11 +6,14 @@
 /* { dg-add-options arm_fp } */
 
 void
-test_fpscr ()
+test_fpscr (void)
 {
-  volatile unsigned int status = __builtin_arm_get_fpscr ();
+  unsigned status;
+
+  __builtin_arm_set_fpscr (0);
+  status = __builtin_arm_get_fpscr ();
   __builtin_arm_set_fpscr (status);
 }
 
 /* { dg-final { scan-assembler "mrc\tp10, 7, r\[0-9\]+, cr1, cr0, 0" } } */
-/* { dg-final { scan-assembler "mcr\tp10, 7, r\[0-9\]+, cr1, cr0, 0" } } */
+/* { dg-final { scan-assembler-times "mcr\tp10, 7, r\[0-9\]+, cr1, cr0, 0" 2 } } */


Re: [PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin

2018-04-06 Thread Thomas Preudhomme



On 06/04/18 17:08, Ramana Radhakrishnan wrote:


(sorry about the duplicate for those who get it)


LGTM, though in this case I would prefer a bootstrap and regression run,
as this is automatically exercised most with gcc.dg/atomic_*.c and you
really need this tested on Linux rather than just bare-metal, as I'm not
sure how this gets tested on arm-none-eabi.


Oh it is indeed. Didn't realize it was used anywhere. Will start a
bootstrap right away.




What about earlier branches, have you looked? This is a silly target
bug and fixes should go back to older branches in this particular case
after baking this on trunk for some time.


GCC 6 and 7 are affected as well and a backport will be done once it has baked 
long enough of course.


Best regards,

Thomas


Re: [PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin

2018-04-09 Thread Thomas Preudhomme

Hi Ramana,

On 06/04/18 17:17, Thomas Preudhomme wrote:



On 06/04/18 17:08, Ramana Radhakrishnan wrote:


(sorry about the duplicate for those who get it)


LGTM, though in this case I would prefer a bootstrap and regression run,
as this is automatically exercised most with gcc.dg/atomic_*.c and you
really need this tested on Linux rather than just bare-metal, as I'm not
sure how this gets tested on arm-none-eabi.


Oh it is indeed. Didn't realize it was used anywhere. Will start a
bootstrap right away.


Done with --with-arch=armv8-a --with-mode=thumb --with-fpu=neon-vfpv4 
--with-float=hard --enable-languages=c,c++,fortran --with-system-zlib 
--enable-plugins --enable-bootstrap. Testsuite for that GCC does not show any 
regression either.


Ok to commit?





What about earlier branches, have you looked? This is a silly target
bug and fixes should go back to older branches in this particular case
after baking this on trunk for some time.


GCC 6 and 7 are affected as well and a backport will be done once it has baked 
long enough of course.


Will now bootstrap and regtest against GCC 6 and 7. Will let you know once that 
is finished.


Best regards,

Thomas


Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-11 Thread Thomas Preudhomme

Hi Kyrill,

One week went by so I've committed the change to GCC 7 as announced.

Best regards,

Thomas

On 05/04/18 16:36, Kyrill Tkachov wrote:


On 05/04/18 16:13, Thomas Preudhomme wrote:

Hi Kyrill,

On 04/04/18 18:20, Thomas Preudhomme wrote:

Hi Kyrill,

On 04/04/18 18:19, Kyrill Tkachov wrote:

Hi Thomas,


Ok, thanks for fixing this.
Does this need backporting to the branches?


Yes to gcc-7-branch only.


The patch applies cleanly on gcc-7-branch and the same testing shows no 
regression. Ok to apply to gcc-7-branch once the patch has baked for 7 days in 
trunk?



Yes, thanks.
Kyrill


Best regards,

Thomas




Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-09-13 Thread Thomas Preudhomme
Hi all,

Ping? This new version changes both the middle-end and back-end parts,
so it will need a review for both of those.

Best regards,

Thomas
On Wed, 29 Aug 2018 at 11:07, Thomas Preudhomme
 wrote:
>
> Forgot another important change in the ARM backend:
>
> The expanders were causing one indirection too many, which is what
> caused the test failure in glibc. The new expander code skips the
> creation of a move from the memory reference of the guard's address to
> a register since this is done in the insns themselves. I think during
> the initial implementation of the first version of the patch I had
> issues with loading the address and used that to load the address. As
> can be seen from the absence of regression on the runtime stack
> protector test in glibc, this is now working properly, which is also
> confirmed by manual inspection of the code.
>
> I've attached the interdiff from previous version for reference.
>
> Best regards,
>
> Thomas
> On Wed, 29 Aug 2018 at 10:51, Thomas Preudhomme
>  wrote:
> >
> > Resend hopefully without HTML this time.
> >
> > On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme
> >  wrote:
> > >
> > > Hi,
> > >
> > > I've reworked the patch fixing PR85434 (spilling of stack protector 
> > > guard's address on ARM) to address the testsuite regression on powerpc 
> > > and x86 as well as glibc testsuite regression on ARM. Issues were due to 
> > > unconditionally attempting to generate the new patterns. The code now 
> > > tests if there is a pattern for them for the target before generating 
> > > them. In the ARM side of the patch, I've also added a more specific 
> > > predicate for the new patterns. The new patch is found below.
> > >
> > >
> > > In case of high register pressure in PIC mode, address of the stack
> > > protector's guard can be spilled on ARM targets as shown in PR85434,
> > > thus allowing an attacker to control what the canary would be compared
> > > against. ARM does lack stack_protect_set and stack_protect_test insn
> > > patterns, defining them does not help as the address is expanded
> > > regularly and the patterns only deal with the copy and test of the
> > > guard with the canary.
> > >
> > > This problem does not occur for x86 targets because the PIC access and
> > > the test can be done in the same instruction. Aarch64 is exempt too
> > > because PIC access insn pattern are mov of UNSPEC which prevents it from
> > > the second access in the epilogue being CSEd in cse_local pass with the
> > > first access in the prologue.
> > >
> > > The approach followed here is to create new "combined" set and test
> > > standard pattern names that take the unexpanded guard and do the set or
> > > test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> > > to hide the individual instructions being generated to the compiler and
> > > split the pattern into generic load, compare and branch instruction
> > > after register allocator, therefore avoiding any spilling. This is here
> > > implemented for the ARM targets. For targets not implementing these new
> > > standard pattern names, the existing stack_protect_set and
> > > stack_protect_test pattern names are used.
> > >
> > > To be able to split PIC access after register allocation, the functions
> > > had to be augmented to force a new PIC register load and to control
> > > which register it loads into. This is because sharing the PIC register
> > > between prologue and epilogue could lead to spilling due to CSE again
> > > which an attacker could use to control what the canary gets compared
> > > against.
> > >
> > > ChangeLog entries are as follows:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-08-09  Thomas Preud'homme  
> > >
> > > * target-insns.def (stack_protect_combined_set): Define new standard
> > > pattern name.
> > > (stack_protect_combined_test): Likewise.
> > > * cfgexpand.c (stack_protect_prologue): Try new
> > > stack_protect_combined_set pattern first.
> > > * function.c (stack_protect_epilogue): Try new
> > > stack_protect_combined_test pattern first.
> > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > > parameters to control which register to use as PIC register and force
> > > reloading PIC register respectively.  Insert in the stream of insns if
> > &

[PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-09-26 Thread Thomas Preudhomme
Hi,

GCC ICEs under -mslow-flash-data and -mword-relocations because there
is no way to load an address, both literal pools and MOVW/MOVT being
forbidden. This patch gives an error message when both options are
specified by the user and adds the corresponding dg-skip-if directives
to tests that use either of these options.
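
The shape of the restructured check can be sketched as a standalone C
function. This is a simplified model, not the code in arm.c: struct
toy_opts, its fields and toy_option_check are hypothetical, and the
target conditions (MOVT available, M-profile, non-PIC, no NEON) are
folded into a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the option state consulted by the check.  */
struct toy_opts
{
  bool pure_code;          /* -mpure-code */
  bool slow_flash_data;    /* -mslow-flash-data */
  bool word_relocations;   /* -mword-relocations */
  bool target_supported;   /* M-profile with MOVT, non-PIC, no NEON */
};

/* Return the number of diagnostics the model would emit for OPTS.  */
static int
toy_option_check (const struct toy_opts *opts)
{
  int errors = 0;

  assert (opts != NULL);
  if (opts->pure_code || opts->slow_flash_data)
    {
      /* Pre-existing restriction on the kind of target.  */
      if (!opts->target_supported)
        errors++;

      /* New restriction: no literal pool and no MOVW/MOVT relocation
         leaves no way to load an address.  */
      if (opts->word_relocations)
        errors++;
    }
  return errors;
}
```

As in the patch, the new diagnostic sits inside the block guarded by both
flags, so in this model -mpure-code combined with -mword-relocations is
rejected too, and -mword-relocations on its own is still accepted.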

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-09-25  Thomas Preud'homme  

PR target/87374
* config/arm/arm.c (arm_option_check_internal): Disable the combined
use of -mslow-flash-data and -mword-relocations.

*** gcc/testsuite/ChangeLog ***

2018-09-25  Thomas Preud'homme  

PR target/87374
* gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
-mword-relocations would be passed when compiling the test.
* gcc.target/arm/movsi_movt.c: Likewise.
* gcc.target/arm/pr81863.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
* gcc.target/arm/tls-disable-literal-pool.c: Likewise.


Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
targeting arm-none-eabi. Modified tests get skipped as expected when
running the testsuite with -mslow-flash-data (pr81863.c) or
-mword-relocations (all the others).


Is this ok for trunk? I'd also appreciate guidance on whether this is
worth a backport. It's a simple patch, but on the other hand it only
rejects an option combination and does not fix anything, so I have
mixed feelings.

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6332e68df05..5beffc875c1 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2893,17 +2893,22 @@ arm_option_check_internal (struct gcc_options *opts)
   flag_pic = 0;
 }
 
-  /* We only support -mpure-code and -mslow-flash-data on M-profile targets
- with MOVT.  */
-  if ((target_pure_code || target_slow_flash_data)
-  && (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON))
+  if (target_pure_code || target_slow_flash_data)
 {
   const char *flag = (target_pure_code ? "-mpure-code" :
 	 "-mslow-flash-data");
-  error ("%s only supports non-pic code on M-profile targets with the "
-	 "MOVT instruction", flag);
-}
 
+  /* We only support -mpure-code and -mslow-flash-data on M-profile targets
+	 with MOVT.  */
+  if (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON)
+	error ("%s only supports non-pic code on M-profile targets with the "
+	   "MOVT instruction", flag);
+
+  /* Cannot load addresses: -mslow-flash-data forbids literal pool and
+	 -mword-relocations forbids relocation of MOVT/MOVW.  */
+  if (target_word_relocations)
+	error ("%s incompatible with -mword-relocations", flag);
+}
 }
 
 /* Recompute the global settings depending on target attribute options.  */
diff --git a/gcc/testsuite/gcc.target/arm/movdi_movt.c b/gcc/testsuite/gcc.target/arm/movdi_movt.c
index e2a28ccbd99..a01ffa0dc93 100644
--- a/gcc/testsuite/gcc.target/arm/movdi_movt.c
+++ b/gcc/testsuite/gcc.target/arm/movdi_movt.c
@@ -1,4 +1,5 @@
 /* { dg-do compile { target { arm_cortex_m && { arm_thumb2_ok || arm_thumb1_movt_ok } } } } */
+/* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
 /* { dg-options "-O2 -mslow-flash-data" } */
 
 unsigned long long
diff --git a/gcc/testsuite/gcc.target/arm/movsi_movt.c b/gcc/testsuite/gcc.target/arm/movsi_movt.c
index 3cf46e2fd17..19d202ecd33 100644
--- a/gcc/testsuite/gcc.target/arm/movsi_movt.c
+++ b/gcc/testsuite/gcc.target/arm/movsi_movt.c
@@ -1,4 +1,5 @@
 /* { dg-do compile { target { arm_cortex_m && { arm_thumb2_ok || arm_thumb1_movt_ok } } } } */
+/* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
 /* { dg-options "-O2 -mslow-flash-data" } */
 
 unsigned
diff --git a/gcc/testsuite/gcc.target/arm/pr81863.c b/gcc/testsuite/gcc.target/arm/pr81863.c
index 63b1ed66b2c..225a0c5cc2b 100644
--- a/gcc/testsuite/gcc.target/arm/pr81863.c
+++ b/gcc/testsuite/gcc.target/arm/pr81863.c
@@ -1,5 +1,6 @@
 /* testsuite/gcc.target/arm/pr48183.c */
 /* { dg-do compile } */
+/* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mslow-flash-data" } } */
 /* { dg-options "-O2 -mword-relocations -march=armv7-a -marm" } */
 /* { dg-final { scan-assembler-not "\[\\t \]+movw" } } */
 
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
index 089a72b67f3..d10391a69ac 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
@@ -6,6 +6,7 @@
 /* { dg-do compile } */
 /* { dg-require-e
