[arm-embedded] Back port to 4.7 embedded branch to use single mul instruction for Os

2013-08-28 Thread Terry Guo
Hi,

I have just backported this trunk change so that the 4.7 embedded branch uses
a single multiply instruction when optimizing for size. Without this backport,
the current 4.7 embedded branch replaces a single multiply instruction with a
group of instructions.
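
To illustrate the kind of code affected, here is a minimal sketch. It is a
hypothetical example, not the actual thumb1-Os-mult.c test case, and the
function names are made up. Both functions compute the same value; the second
spells out the shift/add group a compiler may pick when its cost model rates
the multiply as too expensive.

```c
#include <stdint.h>

/* Hypothetical example, not the real thumb1-Os-mult.c.  */

/* Written as a plain multiply.  With a suitable size cost model the
   compiler can keep this as a single multiply instruction.  */
uint32_t
mul_by_10 (uint32_t x)
{
  return x * 10u;
}

/* The same value written as a shift/add group: 10*x = 8*x + 2*x.  */
uint32_t
mul_by_10_shift_add (uint32_t x)
{
  return (x << 3) + (x << 1);
}
```

Comparing the assembly of the two functions when compiled with -Os for a
Thumb-1 target is one way to see which form the compiler chooses.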

BR,
Terry

gcc/

2013-08-28  Terry Guo  

Backport from mainline r201237
2013-07-24  Terry Guo  

* config/arm/arm.c (thumb1_size_rtx_costs): Assign proper cost for
shift_add/shift_sub0/shift_sub1 RTXs.


gcc/testsuite/

2013-08-28  Terry Guo  

Backport from mainline r201237
2013-07-24  Terry Guo  

* gcc.target/arm/thumb1-Os-mult.c: New test case.






Ask for approval to backport a trunk LTO fix to 4.7 branch

2013-09-12 Thread Terry Guo
Hi there,

The FSF 4.7 branch still has the bug
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54598. May I backport the trunk
fix to the 4.7 branch?

Thanks and best regards,
Terry




[arm-embedded] merged with FSF 4.7 branch until revision 202551

2013-09-12 Thread Terry Guo
Hello,

The arm/embedded-4_7-branch has just been synced with the FSF 4.7 branch, so
many bug fixes are now included.

BR,
Terry




[Patch ARM] Fix that miss DMB instruction for ARMv6-M

2012-10-08 Thread Terry Guo
Hi,

When running the libstdc++ regression tests on Cortex-M0, the test case
49445.cc fails with the following error message:

/tmp/ccMqZdgc.o: In function `std::atomic::load(std::memory_order) const':
/home/build/work/GCC-4-7-build/build-native/gcc-final/arm-none-eabi/armv6-m/libstdc++-v3/include/atomic:202: undefined reference to `__sync_synchronize'
/home/build/work/GCC-4-7-build/build-native/gcc-final/arm-none-eabi/armv6-m/libstdc++-v3/include/atomic:202: undefined reference to `__sync_synchronize'
/tmp/ccMqZdgc.o: In function `std::atomic::load(std::memory_order) const':
/home/build/work/GCC-4-7-build/build-native/gcc-final/arm-none-eabi/armv6-m/libstdc++-v3/include/atomic:202: undefined reference to `__sync_synchronize'
/home/build/work/GCC-4-7-build/build-native/gcc-final/arm-none-eabi/armv6-m/libstdc++-v3/include/atomic:202: undefined reference to `__sync_synchronize'
collect2: error: ld returned 1 exit status
compiler exited with status 1

After investigation, the cause is that gcc currently assumes armv6-m has no
DMB instruction, although according to the ARM manuals it does. Because of
this wrong assumption, expand_mem_thread_fence generates a call to the library
function __sync_synchronize rather than a DMB instruction. No code implements
this library function, hence the link error.

The attached patch fixes this issue by letting gcc treat armv6-m as having the
DMB instruction too. Is it OK for trunk?
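
For reference, a minimal sketch of source code that needs a full memory
barrier and therefore goes through expand_mem_thread_fence. It uses GCC's
__atomic and __sync builtins; the variable and function names are only
illustrative and are not taken from 49445.cc.

```c
#include <stdint.h>

/* Illustrative shared variable, not from the failing test.  */
static uint32_t shared_flag;

/* Sequentially consistent store.  */
void
store_flag (uint32_t value)
{
  __atomic_store_n (&shared_flag, value, __ATOMIC_SEQ_CST);
}

/* Sequentially consistent load, the kind of operation an atomic load
   with the default memory order requests.  */
uint32_t
load_flag (void)
{
  return __atomic_load_n (&shared_flag, __ATOMIC_SEQ_CST);
}

/* Explicit full barrier.  On a target that gcc believes has no barrier
   instruction, this becomes a call to the library function
   __sync_synchronize.  */
void
full_barrier (void)
{
  __sync_synchronize ();
}
```

When this is built for armv6-m, the generated assembly should contain either
a dmb instruction or a call to __sync_synchronize, depending on whether the
compiler knows about DMB on that architecture.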

BR,
Terry

2012-10-08  Terry Guo  

* config/arm/arm.c (arm_arch6m): New variable to denote armv6-m
architecture.
* config/arm/arm.h (TARGET_HAVE_DMB): The armv6-m also has DMB
instruction.



armv6m-dmb.patch
Description: Binary data


RE: [Patch ARM] Fix that miss DMB instruction for ARMv6-M

2012-10-09 Thread Terry Guo


> -----Original Message-----
> From: Richard Earnshaw
> Sent: Tuesday, October 09, 2012 10:01 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Patch ARM] Fix that miss DMB instruction for ARMv6-M
> 
> On 08/10/12 08:29, Terry Guo wrote:
> > Hi,
> >
> > When running libstdc++ regression test on Cortex-M0, the case 49445.cc
> > fails with error message:
> >
> > /tmp/ccMqZdgc.o: In function `std::atomic::load(std::memory_order) const':
> > /home/build/work/GCC-4-7-build/build-native/gcc-final/arm-none-eabi/armv6-m/libstdc++-v3/include/atomic:202: undefined reference to `__sync_synchronize'
> > /home/build/work/GCC-4-7-build/build-native/gcc-final/arm-none-eabi/armv6-m/libstdc++-v3/include/atomic:202: undefined reference to `__sync_synchronize'
> > /tmp/ccMqZdgc.o: In function `std::atomic::load(std::memory_order) const':
> > /home/build/work/GCC-4-7-build/build-native/gcc-final/arm-none-eabi/armv6-m/libstdc++-v3/include/atomic:202: undefined reference to `__sync_synchronize'
> > /home/build/work/GCC-4-7-build/build-native/gcc-final/arm-none-eabi/armv6-m/libstdc++-v3/include/atomic:202: undefined reference to `__sync_synchronize'
> > collect2: error: ld returned 1 exit status
> > compiler exited with status 1
> >
> > After investigation, the reason is current gcc doesn't think armv6-m has
> > DMB instruction. While according to ARM manuals, it has. With this wrong
> > assumption, the expand_mem_thread_fence will generate a call to library
> > function __sync_synchronize rather than DMB instruction. While no code to
> > implement this library function, so the error generates.
> >
> > The attached patch intends to fix this issue by letting gcc also think
> > armv6-m has DMB instruction. Is it OK to trunk?
> >
> > BR,
> > Terry
> >
> > 2012-10-08  Terry Guo
> >
> >  * config/arm/arm.c (arm_arch6m): New variable to denote armv6-m
> >  architecture.
> >  * config/arm/arm.h (TARGET_HAVE_DMB): The armv6-m also has DMB
> >  instruction.
> >
> >
> 
> Ok.
> 
> R.

Thanks, Richard. Is it also OK for 4.7?

BR,
Terry





[RFC] New feature to reuse one multilib among different targets

2012-10-09 Thread Terry Guo
Hello Joseph,

Please help to review this new multilib feature. It gives users a way to
define their own multilib selection rules. Those rules are appended to the
rules generated by the gcc script genmultilib. Thus, when gcc can't find a
suitable multilib from its own rules, it can fall back to a certain existing
multilib according to the rules provided by the user. This feature is called
multilib reuse.

With multilib reuse, we can link against a better multilib rather than always
using the default multilib when no exactly matching multilib is found. This
feature can also help to reduce the total number of multilib variants.

For simplicity, the rules used by multilib reuse have the same format as the
rules in the variable multilib_select. For example, to reuse a multilib (in
folder dirM and built with options "optA optB optC") for targets "optA optD
optE" and "optA optF optG", we can define the following reuse rules:

MULTILIB_REUSE = dirM optA optD optE;dirM optA optF optG;

The above method requires the user to define such rules in the multilib
Makefile fragment, and those rules eventually become gcc built-in rules. Any
change to them requires rebuilding gcc. To make the reuse rules easy to adjust,
my patch turns MULTILIB_REUSE into a gcc spec named multilib_reuse. Following
the way gcc handles specs, it is then easy to change the reuse rules without
rebuilding gcc. Suppose we need to share the same multilib with target "optH
optI optJ"; we can write the following spec file and feed it to gcc:

*multilib_reuse:
+ dirM optH optI optJ;

In summary, we can use the Makefile fragment to provide some predefined rules
and use a spec to change rules on the fly.
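
The fallback lookup can be sketched roughly as follows. This is only a
simplified model under stated assumptions, not the code in gcc.c: the function
names reuse_lookup and has_option are hypothetical, a rule is assumed to be
"dir opt opt ...;" without negated options, a rule is assumed not to repeat an
option, and a rule is taken to match when it lists exactly the given options
in any order.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if WORD (of length LEN) is one of the NOPTS strings in OPTS.  */
static int
has_option (const char *word, size_t len,
            const char *const *opts, size_t nopts)
{
  for (size_t i = 0; i < nopts; i++)
    if (strlen (opts[i]) == len && strncmp (opts[i], word, len) == 0)
      return 1;
  return 0;
}

/* Look up OPTS in RULES, a string of "dir opt opt ...;" entries.
   On a match, copy the directory name into DIR (of size DIR_SIZE) and
   return 1; otherwise return 0.  Assumes no rule repeats an option.  */
int
reuse_lookup (const char *rules, const char *const *opts, size_t nopts,
              char *dir, size_t dir_size)
{
  const char *p = rules;

  while (*p != '\0')
    {
      /* A rule ends at the next ';' or at the end of the string.  */
      const char *end = strchr (p, ';');
      if (end == NULL)
        end = p + strlen (p);

      /* The first word of the rule is the multilib directory.  */
      while (p < end && *p == ' ')
        p++;
      const char *dir_start = p;
      while (p < end && *p != ' ')
        p++;
      size_t dir_len = (size_t) (p - dir_start);

      /* The remaining words are the options this rule stands for.  */
      size_t seen = 0;
      int ok = dir_len > 0;
      while (ok && p < end)
        {
          while (p < end && *p == ' ')
            p++;
          if (p == end)
            break;
          const char *word = p;
          while (p < end && *p != ' ')
            p++;
          ok = has_option (word, (size_t) (p - word), opts, nopts);
          seen++;
        }

      if (ok && seen == nopts && dir_len < dir_size)
        {
          memcpy (dir, dir_start, dir_len);
          dir[dir_len] = '\0';
          return 1;
        }

      /* Move on to the next rule.  */
      p = (*end == ';') ? end + 1 : end;
    }

  return 0;
}
```

With the rules string "dirM optA optD optE;dirM optA optF optG;", looking up
the options optA, optD and optE yields dirM, while optA, optB and optC match
no reuse rule in this model.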

Does this feature make sense, and is it OK for trunk? Please advise. Thanks.

BR,
Terry

2012-10-10  Terry Guo  

* genmultilib (MULTILIB_REUSE): New macro.
* Makefile.in (s-mlib): Add a new argument MULTILIB_REUSE.
* gcc.c (multilib_reuse): New spec.
(set_multilib_dir): Use multilib_reuse.

gcc-multilib-reuse.patch
Description: Binary data


RE: [Patch, test] Enable to prune warnings for tests defined in one exp file

2012-10-10 Thread Terry Guo


> -----Original Message-----
> From: Eric Botcazou [mailto:ebotca...@adacore.com]
> Sent: Wednesday, October 10, 2012 3:56 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org; Richard Guenther
> Subject: Re: [Patch, test] Enable to prune warnings for tests defined
> in one exp file
> 
> > 2012-08-27  Terry Guo  
> >
> > * lib/gcc-dg.exp (dg_runtest_extra_prunes): New variable to define
> > rules that will be applied to all tests in a .exp file.
> > (gcc-dg-prune): Include rules defined by the above variable.
> >
> > * gcc.target/arm/arm.exp (dg_runtest_extra_prunes): Skip all the
> > harmless architecture switch conflict warnings.
> 
> Why does this appear in the gcc/ChangeLog on the 4.7 branch?
> 
> --
> Eric Botcazou
> 

Thanks for the reminder, Eric, and sorry for my mistake. I am correcting it.

BR,
Terry




[PATCH,ARM] Fix PR55019 Incorrectly use live argument register to save high register in thumb1 prologue

2012-10-22 Thread Terry Guo
Hi,

The attached patch fixes bug 55019, which is exposed on the 4.7 branch.
Although this bug can't be reproduced on trunk, I think the fix is still
useful to make trunk more robust. Tested with the trunk regression tests on
cortex-m0 and cortex-m3; no regressions found. Also tested with various
benchmarks such as Dhrystone/coremark/eembc_v1 on cortex-m0; no regression in
performance or code size. Is it OK to go upstream and into the 4.7 branch?

BR,
Terry

gcc/ChangeLog

2012-10-22  Terry Guo  

PR target/55019
* config/arm/arm.c (thumb1_expand_prologue): Don't push high regs with
live argument regs.

gcc/testsuite/ChangeLog

2012-10-22  Terry Guo  

PR target/55019
* gcc.target/arm/pr55019.c: New.

thumb1-argument-register-issue.patch
Description: Binary data


[arm-embedded] Backport trunk thumb1 pic fix to embedded-4_8-branch

2013-11-26 Thread Terry Guo
Hi,

This patch backports the trunk fix r205391 to arm/embedded-4_8-branch.

BR,
Terry

gcc/ChangeLog.arm
2013-11-27  Terry Guo  

Backport mainline r205391
2013-11-26  Terry Guo  

* config/arm/arm.c (require_pic_register): Handle high pic base
register for thumb-1.
(arm_load_pic_register): Also initialize high pic base register.
* doc/invoke.texi: Update documentation for option -mpic-register.

gcc/testsuite/ChangeLog.arm
2013-11-27  Terry Guo  

Backport mainline r205391
2013-11-26  Terry Guo  

* gcc.target/arm/thumb1-pic-high-reg.c: New case.
* gcc.target/arm/thumb1-pic-single-base.c: New case.




RE: [arm-embedded] Backport trunk thumb1 pic fix to embedded-4_8-branch

2013-11-27 Thread Terry Guo


> -----Original Message-----
> From: Joey Ye [mailto:joey.ye...@gmail.com]
> Sent: Thursday, November 28, 2013 10:56 AM
> To: Terry Guo
> Cc: gcc-patches
> Subject: Re: [arm-embedded] Backport trunk thumb1 pic fix to embedded-
> 4_8-branch
> 
> Terry, this is a bug fix to pic register. I feel it should also be in
> gcc-4_8-branch.
> 
> - Joey

OK. I will ask the maintainer to approve this.

BR,
Terry

> 
> On Wed, Nov 27, 2013 at 11:45 AM, Terry Guo  wrote:
> > Hi,
> >
> > This patch back ported trunk fix at r205391 to arm/embedded-4_8-branch.
> >
> > BR,
> > Terry
> >
> > gcc/ChangeLog.arm
> > 2013-11-27  Terry Guo  
> >
> > Backport mainline r205391
> > 2013-11-26  Terry Guo  
> >
> > * config/arm/arm.c (require_pic_register): Handle high pic base
> > register for thumb-1.
> > (arm_load_pic_register): Also initialize high pic base register.
> > * doc/invoke.texi: Update documentation for option
> > -mpic-register.
> >
> > gcc/testsuite/ChangeLog.arm
> > 2013-11-27  Terry Guo  
> >
> > Backport mainline r205391
> > 2013-11-26  Terry Guo  
> >
> > * gcc.target/arm/thumb1-pic-high-reg.c: New case.
> > * gcc.target/arm/thumb1-pic-single-base.c: New case.
> >
> >





[arm-embedded] Backport a trunk M4 CPU pipeline tuning to embedded-4_8-branch

2013-11-27 Thread Terry Guo
Hi,

This patch backports a trunk Cortex-M4 CPU pipeline tuning to
embedded-4_8-branch.

BR,
Terry

gcc/ChangeLog.arm
2013-11-28  Terry Guo  

Backport mainline r198021
2013-04-17  Terry Guo  

* config/arm/cortex-m4.md: Add a new bypass.

The patch itself:
Index: gcc/config/arm/cortex-m4.md
===
--- gcc/config/arm/cortex-m4.md (revision 205472)
+++ gcc/config/arm/cortex-m4.md (working copy)
@@ -84,6 +84,10 @@
(eq_attr "type" "store4"))
   "cortex_m4_ex*5")
 
+(define_bypass 1 "cortex_m4_load1"
+ "cortex_m4_store1_1,cortex_m4_store1_2"
+ "arm_no_early_store_addr_dep")
+
 ;; If the address of load or store depends on the result of the preceding
 ;; instruction, the latency is increased by one.




[arm-embedded] Backport trunk Cortex-M4 FPU tuning to embedded-4_8-branch

2013-11-27 Thread Terry Guo
Hi,

This patch backports a trunk Cortex-M4 FPU tuning to embedded-4_8-branch.

BR,
Terry

gcc/ChangeLog.arm
2013-11-28  Terry Guo  
 
Backport mainline r198084
2013-04-19  Terry Guo  

* config/arm/cortex-m4-fpu.md (cortex_m4_v): Delete cpu unit.
Replace with ...
(cortex_m4_v_a,  cortex_m4_v_b): ... new cpu units.
(cortex_m4_v, cortex_m4_exa_va, cortex_m4_exb_vb): New reservations.
(cortex_m4_fmacs): Use new reservations.
(cortex_m4_f_load, cortex_m4_f_store): Likewise.

Index: gcc/config/arm/cortex-m4-fpu.md
===
--- gcc/config/arm/cortex-m4-fpu.md (revision 205473)
+++ gcc/config/arm/cortex-m4-fpu.md (working copy)
@@ -18,10 +18,14 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
-;; Use an artifial unit to model FPU.
-(define_cpu_unit "cortex_m4_v" "cortex_m4")
+;; Use two artificial units to model FPU.
+(define_cpu_unit "cortex_m4_v_a" "cortex_m4")
+(define_cpu_unit "cortex_m4_v_b" "cortex_m4")
 
+(define_reservation "cortex_m4_v" "cortex_m4_v_a+cortex_m4_v_b")
 (define_reservation "cortex_m4_ex_v" "cortex_m4_ex+cortex_m4_v")
+(define_reservation "cortex_m4_exa_va" "cortex_m4_a+cortex_m4_v_a")
+(define_reservation "cortex_m4_exb_vb" "cortex_m4_b+cortex_m4_v_b")
 
 ;; Integer instructions following VDIV or VSQRT complete out-of-order.
 (define_insn_reservation "cortex_m4_fdivs" 15
@@ -44,10 +48,12 @@
(eq_attr "type" "fmuls"))
   "cortex_m4_ex_v")
 
+;; Integer instructions following multiply-accumulate instructions
+;; complete out-of-order.
 (define_insn_reservation "cortex_m4_fmacs" 4
   (and (eq_attr "tune" "cortexm4")
(eq_attr "type" "fmacs,ffmas"))
-  "cortex_m4_ex_v*3")
+  "cortex_m4_ex_v,cortex_m4_v*2")
 
 (define_insn_reservation "cortex_m4_ffariths" 1
   (and (eq_attr "tune" "cortexm4")
@@ -77,12 +83,12 @@
 (define_insn_reservation "cortex_m4_f_load" 2
   (and (eq_attr "tune" "cortexm4")
(eq_attr "type" "f_loads"))
-  "cortex_m4_ex_v*2")
+  "cortex_m4_exa_va,cortex_m4_exb_vb")
 
-(define_insn_reservation "cortex_m4_f_store" 2
+(define_insn_reservation "cortex_m4_f_store" 1
   (and (eq_attr "tune" "cortexm4")
(eq_attr "type" "f_stores"))
-  "cortex_m4_ex_v*2")
+  "cortex_m4_exa_va")
 
 (define_insn_reservation "cortex_m4_f_loadd" 3
   (and (eq_attr "tune" "cortexm4")

RE: [Patch, ARM] Fix ICE when high register is used as pic base register for thumb1 target

2013-11-27 Thread Terry Guo


> -----Original Message-----
> From: Richard Earnshaw
> Sent: Tuesday, November 26, 2013 5:44 PM
> To: Terry Guo
> Cc: Ramana Radhakrishnan; gcc-patches@gcc.gnu.org
> Subject: Re: [Patch, ARM] Fix ICE when high register is used as pic base
> register for thumb1 target
> 
> On 26/11/13 04:18, Terry Guo wrote:
> > Hi,
> >
> > This patch intends to fix ICE when high register is used for pic base
> > register for thumb1 target. Tested with gcc regression test, no new
> > regressions. Is it OK to trunk?
> >
> > BR,
> > Terry
> >
> > gcc/ChangeLog:
> >
> > 2013-11-26  Terry Guo  
> >
> > * config/arm/arm.c (require_pic_register): Handle high pic
> > base register for
> > thumb-1.
> > (arm_load_pic_register): Also initialize high pic base register.
> > * doc/invoke.texi: Update documentation for option
> > -mpic-register.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2013-11-26  Terry Guo  
> >
> > * gcc.target/arm/thumb1-pic-high.c: New case.
> > * gcc.target/arm/thumb1-pic-single-base.c: New case.
> >
> >
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index
> > 501d080..f0b46e9 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -12216,8 +12216,11 @@ before execution begins.
> >
> >  @item -mpic-register=@var{reg}
> >  @opindex mpic-register
> > -Specify the register to be used for PIC addressing.  The default is
> > R10 -unless stack-checking is enabled, when R9 is used.
> > +Specify the register to be used for PIC addressing.
> > +For standard PIC base case, the default will be any suitable register
> > +determined by compiler.  For single PIC base case, the default is R9
> > +if target is EABI based or stack-checking is enabled, otherwise the
> > +default is R10.
> >
> 
> Please can you put @samp{} around the uses of R9 and R10.
> Otherwise, OK.
> R.
> 

Thanks, Richard. The updated patch has been committed to trunk. Is it OK to
backport it to the FSF 4.8 branch as a bug fix?

BR,
Terry




[Patch, ARM] Add v7m specific extra rtx cost table

2013-11-28 Thread Terry Guo
Hello,

This patch adds a specific extra rtx cost table for v7-m profile targets.
Tested with the gcc regression tests; no new regressions. Is it OK for trunk?
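
As a rough illustration of how such a table is read: in GCC, COSTS_N_INSNS (N)
is defined as N * 4, and an "extra" cost is added on top of the baseline cost
of one instruction. The sketch below is not GCC code. The names
mini_extra_costs, v7m_sample and total_cost are hypothetical, the three values
are copied from the table in this patch, and adding the extra cost only when
optimizing for speed is an assumption of the sketch.

```c
/* Same definition as in GCC's rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical, heavily reduced stand-in for struct cpu_cost_table.  */
struct mini_extra_costs
{
  int mult_simple;  /* MULT SImode, Simple.  */
  int idiv;         /* MULT SImode, Idiv.  */
  int load;         /* LD/ST, Load.  */
};

/* Three values copied from the v7m table in the patch.  */
static const struct mini_extra_costs v7m_sample =
{
  COSTS_N_INSNS (1),
  COSTS_N_INSNS (8),
  COSTS_N_INSNS (2)
};

/* Baseline of one instruction, plus the extra cost when optimizing
   for speed (assumption of this sketch).  */
int
total_cost (int extra, int speed_p)
{
  return COSTS_N_INSNS (1) + (speed_p ? extra : 0);
}
```

Under these assumptions an integer divide is rated at COSTS_N_INSNS (9) for
speed and at COSTS_N_INSNS (1) for size.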

BR,
Terry

2013-11-28  Terry Guo  

   * config/arm/aarch-cost-tables.h (v7m_extra_costs): New table.

diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index d3e7dd2..52e18a1 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -223,5 +223,105 @@ const struct cpu_cost_table cortexa53_extra_costs =
 };
 
 
+const struct cpu_cost_table v7m_extra_costs =
+{
+  /* ALU */
+  {
+    0,                  /* Arith.  */
+    0,                  /* Logical.  */
+    0,                  /* Shift.  */
+    0,                  /* Shift_reg.  */
+    0,                  /* Arith_shift.  */
+    COSTS_N_INSNS (1),  /* Arith_shift_reg.  */
+    0,                  /* Log_shift.  */
+    COSTS_N_INSNS (1),  /* Log_shift_reg.  */
+    0,                  /* Extend.  */
+    COSTS_N_INSNS (1),  /* Extend_arith.  */
+    0,                  /* Bfi.  */
+    0,                  /* Bfx.  */
+    0,                  /* Clz.  */
+    COSTS_N_INSNS (1),  /* non_exec.  */
+    false               /* non_exec_costs_exec.  */
+  },
+  {
+    /* MULT SImode */
+    {
+      COSTS_N_INSNS (1),  /* Simple.  */
+      COSTS_N_INSNS (1),  /* Flag_setting.  */
+      COSTS_N_INSNS (2),  /* Extend.  */
+      COSTS_N_INSNS (1),  /* Add.  */
+      COSTS_N_INSNS (3),  /* Extend_add.  */
+      COSTS_N_INSNS (8)   /* Idiv.  */
+    },
+    /* MULT DImode */
+    {
+      0,                  /* Simple (N/A).  */
+      0,                  /* Flag_setting (N/A).  */
+      COSTS_N_INSNS (2),  /* Extend.  */
+      0,                  /* Add (N/A).  */
+      COSTS_N_INSNS (3),  /* Extend_add.  */
+      0                   /* Idiv (N/A).  */
+    }
+  },
+  /* LD/ST */
+  {
+    COSTS_N_INSNS (2),  /* Load.  */
+    0,                  /* Load_sign_extend.  */
+    COSTS_N_INSNS (3),  /* Ldrd.  */
+    COSTS_N_INSNS (2),  /* Ldm_1st.  */
+    1,                  /* Ldm_regs_per_insn_1st.  */
+    1,                  /* Ldm_regs_per_insn_subsequent.  */
+    COSTS_N_INSNS (2),  /* Loadf.  */
+    COSTS_N_INSNS (3),  /* Loadd.  */
+    COSTS_N_INSNS (1),  /* Load_unaligned.  */
+    COSTS_N_INSNS (2),  /* Store.  */
+    COSTS_N_INSNS (3),  /* Strd.  */
+    COSTS_N_INSNS (2),  /* Stm_1st.  */
+    1,                  /* Stm_regs_per_insn_1st.  */
+    1,                  /* Stm_regs_per_insn_subsequent.  */
+    COSTS_N_INSNS (2),  /* Storef.  */
+    COSTS_N_INSNS (3),  /* Stored.  */
+    COSTS_N_INSNS (1)   /* Store_unaligned.  */
+  },
+  {
+    /* FP SFmode */
+    {
+      COSTS_N_INSNS (7),  /* Div.  */
+      COSTS_N_INSNS (2),  /* Mult.  */
+      COSTS_N_INSNS (5),  /* Mult_addsub.  */
+      COSTS_N_INSNS (3),  /* Fma.  */
+      COSTS_N_INSNS (1),  /* Addsub.  */
+      0,                  /* Fpconst.  */
+      0,                  /* Neg.  */
+      0,                  /* Compare.  */
+      0,                  /* Widen.  */
+      0,                  /* Narrow.  */
+      0,                  /* Toint.  */
+      0,                  /* Fromint.  */
+      0                   /* Roundint.  */
+    },
+    /* FP DFmode */
+    {
+      COSTS_N_INSNS (15), /* Div.  */
+      COSTS_N_INSNS (5),  /* Mult.  */
+      COSTS_N_INSNS (7),  /* Mult_addsub.  */
+      COSTS_N_INSNS (7),  /* Fma.  */
+      COSTS_N_INSNS (3),  /* Addsub.  */
+      0,                  /* Fpconst.  */
+      0,                  /* Neg.  */
+      0,                  /* Compare.  */
+      0,                  /* Widen.  */
+      0,                  /* Narrow.  */
+      0,                  /* Toint.  */
+      0,                  /* Fromint.  */
+      0                   /* Roundint.  */
+    }
+  },
+  /* Vector */
+  {
+    COSTS_N_INSNS (1)   /* Alu.  */
+  }
+};
+
 #endif /* GCC_AARCH_COST_TABLES_H */
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 129e428..cbd201e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1473,7 +1473,7 @@ const struct tune_params arm_cortex_a9_tune =
 const struct tune_params arm_v7m_tune =
 {
   arm_9e_rtx_costs,
-  &generic_extra_costs,
+  &v7m_extra_costs,
   NULL,/* Sched adj cost.  */
   1,   /* Constant limit.  */
   5,   /* Max cond insns.  */


RE: [Patch, ARM] Add v7m specific extra rtx cost table

2013-11-28 Thread Terry Guo

> -----Original Message-----
> From: Richard Earnshaw
> Sent: Thursday, November 28, 2013 7:09 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Patch, ARM] Add v7m specific extra rtx cost table
> 
> On 28/11/13 10:34, Terry Guo wrote:
> > Hello,
> >
> > This patch intends to add a specific extra rtx cost table for v7-m
> > profile targets. Tested with gcc regression test, no new regressions.
> > Is it OK to trunk?
> >
> > BR,
> > Terry
> >
> > 2013-11-28  Terry Guo  
> >
> >* config/arm/aarch-cost-tables.h (v7m_extra_costs): New table.
> >
> 
> You haven't mentioned the change in arm.c
> 
> >
> > m4-extra-cost-table-upstream-v1.txt
> >
> >
> > diff --git a/gcc/config/arm/aarch-cost-tables.h
> > b/gcc/config/arm/aarch-cost-tables.h
> > index d3e7dd2..52e18a1 100644
> > --- a/gcc/config/arm/aarch-cost-tables.h
> > +++ b/gcc/config/arm/aarch-cost-tables.h
> > @@ -223,5 +223,105 @@ const struct cpu_cost_table
> > cortexa53_extra_costs =  };
> >
> >
> > +const struct cpu_cost_table v7m_extra_costs =
> 
> As Kyrill says, this should be in arm.c
> 
> OK with that change.
> 
> R.

Thank you all. I will update the patch and commit it.

BR,
Terry




[Patch, ARM] check value of --with-arch against arm-arches.def

2013-12-29 Thread Terry Guo
Hi There,

This patch checks the value of --with-arch against arm-arches.def, rather than
against the hard-coded list currently in config.gcc. Tested with various values
of --with-arch, and it works. Is it OK for trunk?
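
The check itself is a grep for lines starting with ARM_ARCH("<arch>", in
arm-arches.def. The same idea can be sketched in C as below. This is only an
illustration, not part of the patch: the function names are hypothetical, the
file contents are assumed to be passed in as an array of lines, and the
architecture name is compared literally, whereas grep would interpret it as a
regular expression.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if LINE starts with ARM_ARCH("<arch>", and 0 otherwise.  */
int
line_defines_arch (const char *line, const char *arch)
{
  static const char prefix[] = "ARM_ARCH(\"";
  size_t plen = sizeof prefix - 1;
  size_t alen = strlen (arch);

  if (strncmp (line, prefix, plen) != 0)
    return 0;
  if (strncmp (line + plen, arch, alen) != 0)
    return 0;
  /* The name must be closed by a quote and followed by a comma.  */
  return line[plen + alen] == '"' && line[plen + alen + 1] == ',';
}

/* Return 1 if WITH_ARCH is empty or is defined by one of the NLINES
   lines in DEF_LINES, mirroring the two accepted cases of the check.  */
int
with_arch_is_valid (const char *with_arch,
                    const char *const *def_lines, size_t nlines)
{
  if (with_arch[0] == '\0')
    return 1;
  for (size_t i = 0; i < nlines; i++)
    if (line_defines_arch (def_lines[i], with_arch))
      return 1;
  return 0;
}
```

Requiring the closing quote and comma keeps a prefix such as armv7 from being
accepted just because an armv7-m entry exists.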

BR,
Terry

2013-12-30  Terry Guo  

 * config.gcc (arm*-*-*): Check --with-arch against arm-arches.def.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 24dbaf9..7c4a0b9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3455,19 +3455,17 @@ case "${target}" in
fi
done
 
-   case "$with_arch" in
-   "" \
-   | armv[23456] | armv2a | armv3m | armv4t | armv5t \
-   | armv5te | armv6j |armv6k | armv6z | armv6zk | armv6-m \
-   | armv7 | armv7-a | armv7-r | armv7-m | armv8-a \
-   | iwmmxt | ep9312)
-   # OK
-   ;;
-   *)
-   echo "Unknown arch used in --with-arch=$with_arch" 1>&2
-   exit 1
-   ;;
-   esac
+   # See if it matches any of the entries in arm-arches.def
+   if [ x"$with_arch" = x ] \
+   || grep "^ARM_ARCH(\"$with_arch\"," \
+   ${srcdir}/config/arm/arm-arches.def \
+   > /dev/null; then
+ # OK
+ true
+   else
+ echo "Unknown arch used in --with-arch=$with_arch" 1>&2
+ exit 1
+   fi
 
case "$with_float" in
"" \


[GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction

2014-01-15 Thread Terry Guo
Hi there,

With the trunk enhancement at
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00533.html, gcc can properly
handle the PLD rtx. Without it, the PLD rtx is treated as a SET rtx and gcc
ends up with an ICE. The attached patch backports this enhancement to the 4.8
branch. Tested with the gcc regression tests; no new regressions. Is it OK to
backport?

BR,
Terry

2014-01-15  Terry Guo  

Backported from mainline r204575 and applied to file arm.c.
2013-11-08  James Greenhalgh  

* config/arm/aarch-common.c
(search_term): New typedef.
(shift_rtx_costs): New array.
(arm_rtx_shift_left_p): New.
(arm_find_sub_rtx_with_search_term): Likewise.
(arm_find_sub_rtx_with_code): Likewise.
(arm_early_load_addr_dep): Add sanity checking.
(arm_no_early_alu_shift_dep): Likewise.
(arm_no_early_alu_shift_value_dep): Likewise.
(arm_no_early_mul_dep): Likewise.
(arm_no_early_store_addr_dep): Likewise.

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 206619)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,20 @@
+2014-01-15  Terry Guo  
+
+   Backported from mainline r204575 and applied to file arm.c.
+   2013-11-08  James Greenhalgh  
+
+   * config/arm/aarch-common.c
+   (search_term): New typedef.
+   (shift_rtx_costs): New array.
+   (arm_rtx_shift_left_p): New.
+   (arm_find_sub_rtx_with_search_term): Likewise.
+   (arm_find_sub_rtx_with_code): Likewise.
+   (arm_early_load_addr_dep): Add sanity checking.
+   (arm_no_early_alu_shift_dep): Likewise.
+   (arm_no_early_alu_shift_value_dep): Likewise.
+   (arm_no_early_mul_dep): Likewise.
+   (arm_no_early_store_addr_dep): Likewise.
+
 2014-01-14  Uros Bizjak  
 
Revert:
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 206619)
+++ gcc/config/arm/arm.c(working copy)
@@ -1161,6 +1161,30 @@
   TLS_DESCSEQ  /* GNU scheme */
 };
 
+typedef struct
+{
+  rtx_code search_code;
+  rtx search_result;
+  bool find_any_shift;
+} search_term;
+
+/* Return TRUE if X is either an arithmetic shift left, or
+   is a multiplication by a power of two.  */
+bool
+arm_rtx_shift_left_p (rtx x)
+{
+  enum rtx_code code = GET_CODE (x);
+
+  if (code == MULT && CONST_INT_P (XEXP (x, 1))
+  && exact_log2 (INTVAL (XEXP (x, 1))) > 0)
+return true;
+
+  if (code == ASHIFT)
+return true;
+
+  return false;
+}
+
 /* The maximum number of insns to be used when loading a constant.  */
 inline static int
 arm_constant_limit (bool size_p)
@@ -24604,62 +24628,116 @@
 *pretend_size = (NUM_ARG_REGS - nregs) * UNITS_PER_WORD;
 }
 
-/* Return nonzero if the CONSUMER instruction (a store) does not need
-   PRODUCER's value to calculate the address.  */
+static rtx_code shift_rtx_codes[] =
+  { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT,
+ROTATERT, ZERO_EXTEND, SIGN_EXTEND };
 
-int
-arm_no_early_store_addr_dep (rtx producer, rtx consumer)
+/* Callback function for arm_find_sub_rtx_with_code.
+   DATA is safe to treat as a SEARCH_TERM, ST.  This will
+   hold a SEARCH_CODE.  PATTERN is checked to see if it is an
+   RTX with that code.  If it is, write SEARCH_RESULT in ST
+   and return 1.  Otherwise, or if we have been passed a NULL_RTX
+   return 0.  If ST.FIND_ANY_SHIFT then we are interested in
+   anything which can reasonably be described as a SHIFT RTX.  */
+static int
+arm_find_sub_rtx_with_search_term (rtx *pattern, void *data)
 {
-  rtx value = PATTERN (producer);
-  rtx addr = PATTERN (consumer);
+  search_term *st = (search_term *) data;
+  rtx_code pattern_code;
+  int found = 0;
 
-  if (GET_CODE (value) == COND_EXEC)
-value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (addr) == COND_EXEC)
-addr = COND_EXEC_CODE (addr);
-  if (GET_CODE (addr) == PARALLEL)
-addr = XVECEXP (addr, 0, 0);
-  addr = XEXP (addr, 0);
+  gcc_assert (pattern);
+  gcc_assert (st);
 
-  return !reg_overlap_mentioned_p (value, addr);
+  /* Poorly formed patterns can really ruin our day.  */
+  if (*pattern == NULL_RTX)
+return 0;
+
+  pattern_code = GET_CODE (*pattern);
+
+  if (st->find_any_shift)
+{
+  unsigned i = 0;
+
+  /* Left shifts might have been canonicalized to a MULT of some
+power of two.  Make sure we catch them.  */
+  if (arm_rtx_shift_left_p (*pattern))
+   found = 1;
+  else
+   for (i = 0; i < ARRAY_SIZE (shift_rtx_codes); i++)
+ if (pattern_code == shift_rtx_codes[i])
+   found = 1;
+}
+
+  if (pattern_code == st->search_code)
+found = 1;
+
+  if (found)
+st->search_result = *pattern;
+
+  return found;
 }
 
-/* Return nonzero if the CONSUMER instruction (a stor

RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction

2014-01-15 Thread Terry Guo


> -----Original Message-----
> From: Richard Earnshaw
> Sent: Wednesday, January 15, 2014 5:54 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle
> rtx of ARM PLD instruction
> 
> On 15/01/14 09:23, Terry Guo wrote:
> > Hi there,
> >
> > With trunk enhancement at
> > http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00533.html, gcc can
> > properly handle PLD rtx. Otherwise the PLD rtx will be treated as SET
> > rtx and gcc will end up with ICE. The attached patch intends to back
> > port this enhancement to 4.8 branch. Tested with gcc regression test,
> > no new regressions. Is it ok to back port?
> >
> > BR,
> > Terry
> >
> > 2014-01-15  Terry Guo  
> >
> > Backported from mainline r204575 and applied to file arm.c.
> > 2013-11-08  James Greenhalgh  
> >
> > * config/arm/aarch-common.c
> > (search_term): New typedef.
> > (shift_rtx_costs): New array.
> > (arm_rtx_shift_left_p): New.
> > (arm_find_sub_rtx_with_search_term): Likewise.
> > (arm_find_sub_rtx_with_code): Likewise.
> > (arm_early_load_addr_dep): Add sanity checking.
> > (arm_no_early_alu_shift_dep): Likewise.
> > (arm_no_early_alu_shift_value_dep): Likewise.
> > (arm_no_early_mul_dep): Likewise.
> > (arm_no_early_store_addr_dep): Likewise.
> >
> 
> Is there a PR for this?
> 

No. It was first found on arm/embedded-4_8-branch and then on the upstream 4.8
branch. Trunk doesn't have this issue. Do I need to report a PR against 4.8?
If so, I am willing to do it, along with a test case.

BR,
Terry




[PATCH,committed] [MAINTAINERS] Update email address

2018-09-03 Thread Terry Guo
Hi,

This is to update my email address per my recent job change.

BR,
Terry

2018-09-04  Xuepeng Guo  

   * MAINTAINERS: Update my email address.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 264074)
+++ MAINTAINERS (working copy)
@@ -395,7 +395,7 @@
 Yury Gribov
 Jon Grimm  
 Laurent Guerby 
-Xuepeng Guo
+Xuepeng Guo
 Wei Guozhi 
 Mostafa Hagog  
 Andrew Haley   


Re: [PATCH v2 1/6] [MIPS] Split Loongson (MMI) from loongson3a

2018-09-04 Thread Terry Guo
On Tue, Sep 4, 2018 at 11:53 AM, Paul Hua  wrote:
> On Mon, Sep 3, 2018 at 8:29 PM Paul Hua  wrote:
>>
>>
>
> Hi:
>
> The v2 patch add:
> * gcc/doc/invoke.texi (-mloongson-mmi): Document.
>
> Thanks
> Paul Hua

Hi Paul,

For the new files, I think the copyright year should be just 2018.

diff --git a/gcc/config/mips/loongson-mmi.md b/gcc/config/mips/loongson-mmi.md
new file mode 100644
index 000..ad23f67
--- /dev/null
+++ b/gcc/config/mips/loongson-mmi.md
@@ -0,0 +1,903 @@
+;; Machine description for Loongson MultiMedia extensions Instructions (MMI).
+;; Copyright (C) 2008-2018 Free Software Foundation, Inc.
+;; Contributed by CodeSourcery.
+;;

BR,
Terry


[Patch,GCC/Thumb1]Use immediate_operand in 64bit split pattern

2014-08-19 Thread Terry Guo
Hi there,

gcc now uses immediate_operand where it used const_double_operand, so this
patch updates this split pattern accordingly. Tested with the gcc regression
tests on a thumb1 target; no regressions. Is it OK for trunk?

BR,
Terry

2014-08-20  Terry Guo  

 * config/arm/thumb1.md (64bit splitter): Replace const_double_operand
 with immediate_operand.

diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 65d55dd..020d83b 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -639,7 +639,7 @@
 ; thumb1_movdi_insn has a better way to handle them.
 (define_split
   [(set (match_operand:ANY64 0 "arm_general_register_operand" "")
-   (match_operand:ANY64 1 "const_double_operand" ""))]
+   (match_operand:ANY64 1 "immediate_operand" ""))]
   "TARGET_THUMB1 && reload_completed && !satisfies_constraint_J (operands[1])"
   [(set (match_dup 0) (match_dup 1))
(set (match_dup 2) (match_dup 3))]


[Patch, ARM] New feature to minimize the literal load for armv7-m target

2013-11-05 Thread Terry Guo
Hi,

This patch minimizes the use of the literal pool for armv7-m targets on which
loading data from flash is slower than fetching instructions from flash. The
normal literal load instruction is replaced by MOVW/MOVT instructions. A new
option, -mslow-flash-data, is added for this purpose. So far this feature
supports neither PIC code nor targets that aren't based on armv7-m.
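
As a small illustration of the replacement: MOVW loads a 16-bit immediate into
the low half of a register and clears the high half, and MOVT then writes a
16-bit immediate into the high half. The C model below only shows how a 32-bit
constant splits into those two immediates; the function names are hypothetical
and this is not code from the patch.

```c
#include <stdint.h>

/* Immediate a MOVW would load: the low 16 bits of VALUE.  */
uint16_t
movw_imm (uint32_t value)
{
  return (uint16_t) (value & 0xFFFFu);
}

/* Immediate a MOVT would load: the high 16 bits of VALUE.  */
uint16_t
movt_imm (uint32_t value)
{
  return (uint16_t) (value >> 16);
}

/* Model of the register contents after "movw rd, #lo" followed by
   "movt rd, #hi".  */
uint32_t
movw_movt (uint16_t lo, uint16_t hi)
{
  uint32_t rd = lo;                              /* movw: zero-extended low half.  */
  rd = (rd & 0xFFFFu) | ((uint32_t) hi << 16);   /* movt: replace high half.  */
  return rd;
}
```

So a constant that would otherwise be fetched from the literal pool with one
load costs two instructions and no data access.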

Tested with the GCC regression tests on QEMU for cortex-m3. No new regressions.
Is it OK for trunk?

BR,
Terry

2013-11-06  Terry Guo  

 * doc/invoke.texi (-mslow-flash-data): Document new option.
 * config/arm/arm.opt (mslow-flash-data): New option.
 * config/arm/arm-protos.h
(arm_max_const_double_inline_cost): Declare it.
 * config/arm/arm.h (TARGET_USE_MOVT): Always true when
disable literal pools.
 (arm_disable_literal_pool): Declare it.
 * config/arm/arm.c (arm_disable_literal_pool): New
variable.
 (arm_option_override): Handle new option.
 (thumb2_legitimate_address_p): Invalid certain address
format.
 (arm_max_const_double_inline_cost): New function.
 * config/arm/arm.md (types.md): Include it a little
earlier.
 (use_literal_pool): New attribute.
 (enabled): Use new attribute.
 (split pattern): Replace symbol+offset with MOVW/MOVT.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 944cf10..c5b16da 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -121,6 +121,7 @@ extern rtx arm_gen_compare_reg (RTX_CODE, rtx, rtx, rtx);
 extern rtx arm_gen_return_addr_mask (void);
 extern void arm_reload_in_hi (rtx *);
 extern void arm_reload_out_hi (rtx *);
+extern int arm_max_const_double_inline_cost (void);
 extern int arm_const_double_inline_cost (rtx);
 extern bool arm_const_double_by_parts (rtx);
 extern bool arm_const_double_by_immediates (rtx);
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 1781b75..25927a1 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -329,7 +329,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
 
 /* Should MOVW/MOVT be used in preference to a constant pool.  */
 #define TARGET_USE_MOVT \
-  (arm_arch_thumb2 && !optimize_size && !current_tune->prefer_constant_pool)
+  (arm_arch_thumb2 \
+   && (arm_disable_literal_pool \
+   || (!optimize_size && !current_tune->prefer_constant_pool)))
 
 /* We could use unified syntax for arm mode, but for now we just use it
for Thumb-2.  */
@@ -554,6 +556,9 @@ extern int arm_arch_thumb_hwdiv;
than core registers.  */
 extern int prefer_neon_for_64bits;
 
+/* Nonzero if shouldn't use literal pool in generated code.  */
+extern int arm_disable_literal_pool;
+
 #ifndef TARGET_DEFAULT
 #define TARGET_DEFAULT  (MASK_APCS_FRAME)
 #endif
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 78554e8..de2a9c0 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -864,6 +864,9 @@ int arm_arch_thumb_hwdiv;
than core registers.  */
 int prefer_neon_for_64bits = 0;
 
+/* Nonzero if shouldn't use literal pool in generated code.  */
+int arm_disable_literal_pool = 0;
+
 /* In case of a PRE_INC, POST_INC, PRE_DEC, POST_DEC memory reference,
we must report the mode of the memory reference from
TARGET_PRINT_OPERAND to TARGET_PRINT_OPERAND_ADDRESS.  */
@@ -2505,6 +2508,16 @@ arm_option_override (void)
   if (TARGET_APCS_FRAME)
 flag_shrink_wrap = false;
 
+  /* We only support -mslow-flash-data on armv7-m targets.  */
+  if (target_slow_flash_data
+  && ((!(arm_arch7 && !arm_arch_notm) && !arm_arch7em)
+ || (TARGET_THUMB1 || flag_pic || TARGET_NEON)))
+error ("-mslow-flash-data only supports non-pic code on armv7-m targets");
+
+  /* Currently, for slow flash data, we just disable literal pools.  */
+  if (target_slow_flash_data)
+arm_disable_literal_pool = 1;
+
   /* Register global variables with the garbage collector.  */
   arm_add_gc_roots ();
 }
@@ -6348,6 +6361,25 @@ thumb2_legitimate_address_p (enum machine_mode mode, rtx x, int strict_p)
  && thumb2_legitimate_index_p (mode, xop0, strict_p)));
 }
 
+  /* Normally we can assign constant values to its target register without
+ the help of constant pool.  But there are cases we have to use constant
+ pool like:
+ 1) assign a label to register.
+ 2) sign-extend a 8bit value to 32bit and then assign to register.
+
+ Constant pool access in format:
+ (set (reg r0) (mem (symbol_ref (".LC0"))))
+ will cause the use of literal pool (later in function arm_reorg).
+ So here we mark such format as an invalid format, then compiler
+ will adjust it into:
+ (set (reg r0) (symbol_ref (".LC0")))
+

RE: [Patch, ARM] New feature to minimize the literal load for armv7-m target

2013-11-19 Thread Terry Guo
Ping.

BR,
Terry

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Terry Guo
> Sent: Wednesday, November 06, 2013 2:11 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw; Ramana Radhakrishnan
> Subject: [Patch, ARM] New feature to minimize the literal load for armv7-m
> target
> 
> Hi,
> 
> This patch intends to minimize the use of literal pool for some armv7-m
> targets that have slower speed to load data from flash than to fetch
> instruction from flash. The normal literal load instruction is now replaced
> by MOVW/MOVT instructions. A new option -mslow-flash-data is created for
> this purpose. So far this feature doesn't support PIC code and target that
> isn't based on armv7-m.
> 
> Tested with GCC regression test on QEMU for cortex-m3. No new
> regressions.
> Is it OK to trunk?
> 
> BR,
> Terry
> 
> 2013-11-06  Terry Guo  
> 
>  * doc/invoke.texi (-mslow-flash-data): Document new option.
>  * config/arm/arm.opt (mslow-flash-data): New option.
>  * config/arm/arm-protos.h
> (arm_max_const_double_inline_cost): Declare it.
>  * config/arm/arm.h (TARGET_USE_MOVT): Always true when
> disable literal pools.
>  (arm_disable_literal_pool): Declare it.
>  * config/arm/arm.c (arm_disable_literal_pool): New
> variable.
>  (arm_option_override): Handle new option.
>  (thumb2_legitimate_address_p): Invalid certain address
> format.
>  (arm_max_const_double_inline_cost): New function.
>  * config/arm/arm.md (types.md): Include it a little
> earlier.
>  (use_literal_pool): New attribute.
>  (enabled): Use new attribute.
>  (split pattern): Replace symbol+offset with MOVW/MOVT.




RE: [Patch, ARM] New feature to minimize the literal load for armv7-m target

2013-11-20 Thread Terry Guo


> -Original Message-
> From: Richard Earnshaw
> Sent: Wednesday, November 20, 2013 10:41 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan
> Subject: Re: [Patch, ARM] New feature to minimize the literal load for
> armv7-m target
> 
> On 06/11/13 06:10, Terry Guo wrote:
> > Hi,
> >
> > This patch intends to minimize the use of literal pool for some armv7-m
> > targets that have slower speed to load data from flash than to fetch
> > instruction from flash. The normal literal load instruction is now replaced
> > by MOVW/MOVT instructions. A new option -mslow-flash-data is created for
> > this purpose. So far this feature doesn't support PIC code and target that
> > isn't based on armv7-m.
> >
> > Tested with GCC regression test on QEMU for cortex-m3. No new regressions.
> > Is it OK to trunk?
> >
> > BR,
> > Terry
> >
> > 2013-11-06  Terry Guo  
> >
> >  * doc/invoke.texi (-mslow-flash-data): Document new option.
> >  * config/arm/arm.opt (mslow-flash-data): New option.
> >  * config/arm/arm-protos.h
> > (arm_max_const_double_inline_cost): Declare it.
> >  * config/arm/arm.h (TARGET_USE_MOVT): Always true when
> > disable literal pools.
> literal pools are disabled.
> 
> >  (arm_disable_literal_pool): Declare it.
> >  * config/arm/arm.c (arm_disable_literal_pool): New
> > variable.
> >  (arm_option_override): Handle new option.
> >  (thumb2_legitimate_address_p): Invalid certain address
> > format.
> 
> Invalidate.  What address formats?
> 
> >  (arm_max_const_double_inline_cost): New function.
> >  * config/arm/arm.md (types.md): Include it a little
> > earlier.
> 
> Include it before ...
> 
> >  (use_literal_pool): New attribute.
> >  (enabled): Use new attribute.
> >  (split pattern): Replace symbol+offset with MOVW/MOVT.
> >
> >
> 
> Comments inline.
> 
> > diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> > index 1781b75..25927a1 100644
> > --- a/gcc/config/arm/arm.h
> > +++ b/gcc/config/arm/arm.h
> > @@ -554,6 +556,9 @@ extern int arm_arch_thumb_hwdiv;
> > than core registers.  */
> >  extern int prefer_neon_for_64bits;
> >
> > +/* Nonzero if shouldn't use literal pool in generated code.  */
> 'if we shouldn't use literal pools'
> 
> > +extern int arm_disable_literal_pool;
> 
> This should be a bool, values stored in it should be true/false not 1/0.
> 
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index 78554e8..de2a9c0 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -864,6 +864,9 @@ int arm_arch_thumb_hwdiv;
> > than core registers.  */
> >  int prefer_neon_for_64bits = 0;
> >
> > +/* Nonzero if shouldn't use literal pool in generated code.  */
> > +int arm_disable_literal_pool = 0;
> 
> Similar comments to above.
> 
> > @@ -6348,6 +6361,25 @@ thumb2_legitimate_address_p (enum machine_mode mode, rtx x, int strict_p)
> >   && thumb2_legitimate_index_p (mode, xop0, strict_p)));
> >  }
> >
> > +  /* Normally we can assign constant values to its target register without
> 'to target registers'
> 
> > + the help of constant pool.  But there are cases we have to use constant
> > + pool like:
> > + 1) assign a label to register.
> > + 2) sign-extend a 8bit value to 32bit and then assign to register.
> > +
> > + Constant pool access in format:
> > + (set (reg r0) (mem (symbol_ref (".LC0"))))
> > + will cause the use of literal pool (later in function arm_reorg).
> > + So here we mark such format as an invalid format, then compiler
> 'then the compiler'
> 
> > @@ -16114,6 +16146,18 @@ push_minipool_fix (rtx insn, HOST_WIDE_INT address, rtx *loc,
> >minipool_fix_tail = fix;
> >  }
> >
> > +/* Return maximum allowed cost of synthesizing a 64-bit constant VAL inline.
> > +   Returns 99 if we always want to synthesize the value.  */
> 
> Needs to mention that the cost is in terms of 'insns' (see the function
> below it).
> 
> > +int
> > +arm_max_const_double_inline_cost ()
> > +{
> > +  /* Let the value get synthesized to avoid the use of literal pools.  */
> > +  i

[arm-embedded] backport trunk -mslow-flash-data to embedded-4_8-branch

2013-11-25 Thread Terry Guo
Hi,

The trunk patch that adds the new option -mslow-flash-data (revision 205342)
has been backported to arm/embedded-4_8-branch. Tested with the regression
tests; no regressions.

BR,
Terry

gcc/ChangeLog:
2013-11-26  Terry Guo  

Backport mainline r205342
2013-11-25  Terry Guo  

* doc/invoke.texi (-mslow-flash-data): Document new option.
* config/arm/arm.opt (mslow-flash-data): New option.
* config/arm/arm-protos.h (arm_max_const_double_inline_cost): Declare
it.
* config/arm/arm.h (TARGET_USE_MOVT): Always true when literal pools
are disabled.
(arm_disable_literal_pool): Declare it.
* config/arm/arm.c (arm_disable_literal_pool): New variable.
(arm_option_override): Handle new option.
(thumb2_legitimate_address_p): Don't allow symbol references when
literal pools are disabled.
(arm_max_const_double_inline_cost): New function.
* config/arm/arm.md (types.md): Include it before ...
(use_literal_pool): New attribute.
(enabled): Use new attribute.
(split pattern): Replace symbol+offset with MOVW/MOVT.

gcc/testsuite/ChangeLog:
2013-11-26  Terry Guo  

Backport mainline r205342
2013-11-25  Terry Guo  

* gcc.target/arm/thumb2-slow-flash-data.c: New.




[Patch, ARM] Fix ICE when high register is used as pic base register for thumb1 target

2013-11-25 Thread Terry Guo
Hi,

This patch fixes an ICE that occurs when a high register is used as the PIC
base register on a Thumb-1 target. Tested with the GCC regression tests; no
new regressions. Is it OK for trunk?

BR,
Terry

gcc/ChangeLog:

2013-11-26  Terry Guo  

* config/arm/arm.c (require_pic_register): Handle high pic base
register for thumb-1.
(arm_load_pic_register): Also initialize high pic base register.
* doc/invoke.texi: Update documentation for option -mpic-register.

gcc/testsuite/ChangeLog:

2013-11-26  Terry Guo  

* gcc.target/arm/thumb1-pic-high-reg.c: New case.
* gcc.target/arm/thumb1-pic-single-base.c: New case.

From 44dc01379291b53d6eeb227d7006d3541e27dd93 Mon Sep 17 00:00:00 2001
From: Terry Guo 
Date: Tue, 26 Nov 2013 10:10:50 +0800
Subject: [PATCH] pic v6

---
 gcc/config/arm/arm.c   |   18 --
 gcc/doc/invoke.texi|7 +--
 gcc/testsuite/gcc.target/arm/thumb1-pic-high-reg.c |   11 +++
 .../gcc.target/arm/thumb1-pic-single-base.c|   11 +++
 4 files changed, 43 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb1-pic-high-reg.c
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb1-pic-single-base.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index dc3dbdb..4af6c05 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -5917,7 +5917,8 @@ require_pic_register (void)
   if (!crtl->uses_pic_offset_table)
 {
   gcc_assert (can_create_pseudo_p ());
-  if (arm_pic_register != INVALID_REGNUM)
+  if (arm_pic_register != INVALID_REGNUM
+ && !(TARGET_THUMB1 && arm_pic_register > LAST_LO_REGNUM))
{
  if (!cfun->machine->pic_reg)
cfun->machine->pic_reg = gen_rtx_REG (Pmode, arm_pic_register);
@@ -5943,7 +5944,12 @@ require_pic_register (void)
  crtl->uses_pic_offset_table = 1;
  start_sequence ();
 
- arm_load_pic_register (0UL);
+ if (TARGET_THUMB1 && arm_pic_register != INVALID_REGNUM
+ && arm_pic_register > LAST_LO_REGNUM)
+   emit_move_insn (cfun->machine->pic_reg,
+   gen_rtx_REG (Pmode, arm_pic_register));
+ else
+   arm_load_pic_register (0UL);
 
  seq = get_insns ();
  end_sequence ();
@@ -6202,6 +6208,14 @@ arm_load_pic_register (unsigned long saved_regs ATTRIBUTE_UNUSED)
  emit_insn (gen_movsi (pic_offset_table_rtx, pic_tmp));
  emit_insn (gen_pic_add_dot_plus_four (pic_reg, pic_reg, labelno));
}
+ else if (arm_pic_register != INVALID_REGNUM
+  && arm_pic_register > LAST_LO_REGNUM
+  && REGNO (pic_reg) <= LAST_LO_REGNUM)
+   {
+ emit_insn (gen_pic_load_addr_unified (pic_reg, pic_rtx, labelno));
+ emit_move_insn (gen_rtx_REG (Pmode, arm_pic_register), pic_reg);
+ emit_use (gen_rtx_REG (Pmode, arm_pic_register));
+   }
  else
emit_insn (gen_pic_load_addr_unified (pic_reg, pic_rtx, labelno));
}
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 501d080..f0b46e9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12216,8 +12216,11 @@ before execution begins.
 
 @item -mpic-register=@var{reg}
 @opindex mpic-register
-Specify the register to be used for PIC addressing.  The default is R10
-unless stack-checking is enabled, when R9 is used.
+Specify the register to be used for PIC addressing.
+For standard PIC base case, the default will be any suitable register
+determined by compiler.  For single PIC base case, the default is R9
+if target is EABI based or stack-checking is enabled, otherwise
+the default is R10.
 
 @item -mpic-data-is-text-relative
 @opindex mpic-data-is-text-relative
diff --git a/gcc/testsuite/gcc.target/arm/thumb1-pic-high-reg.c b/gcc/testsuite/gcc.target/arm/thumb1-pic-high-reg.c
new file mode 100644
index 000..df269fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1-pic-high-reg.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-options "-mthumb -fpic -mpic-register=9" } */
+
+int g_test;
+
+int
+foo (int par)
+{
+g_test = par;
+}
diff --git a/gcc/testsuite/gcc.target/arm/thumb1-pic-single-base.c b/gcc/testsuite/gcc.target/arm/thumb1-pic-single-base.c
new file mode 100644
index 000..6e9b257
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1-pic-single-base.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-options "-mthumb -fpic -msingle-pic-base" } */
+
+int g_test;
+
+int
+foo (int par)
+{
+g_test = par;
+}
-- 
1.7.9.5


[arm-embedded] Backport trunk new arm rtx cost model to embedded-4_8-branch

2013-11-26 Thread Terry Guo
Hi,

This backport enables the new ARM rtx cost model from trunk on
embedded-4_8-branch. It incorporates all relevant trunk commits plus some
minor tweaks for embedded-4_8-branch. Tested with the GCC regression tests;
one regression was found, in test case pr42575.c. Upstream GCC has the same
issue, and we are working on it.

BR,
Terry




[1/2][PATCH,ARM]Generate UAL assembly code for Thumb-1 target

2014-10-21 Thread Terry Guo
Hi There,

This is the first patch to enable GCC to generate UAL assembly code for
Thumb-1 targets. It adds a new option that lets users specify which syntax
their inline assembly code uses. If the inline assembly code uses UAL
syntax, GCC does nothing special, because GCC generates UAL code as well. If
the inline assembly code uses non-UAL (divided) syntax, GCC inserts
directives into the final assembly output to switch syntax around it. Is it
OK for trunk?
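
As an illustration of how user code might adapt to the option, the sketch
below selects the mnemonic based on the __ARM_ASM_SYNTAX_UNIFIED__ macro that
this patch defines. The function name add_one is hypothetical, and the
assumption is that the Thumb-1 flag-setting add is spelled "adds" in unified
syntax and "add" in divided syntax. On anything that is not a Thumb-1 target
the function falls back to plain C.

```c
/* Add 1 to X, using Thumb-1 inline assembly when compiling for Thumb-1.  */
static int
add_one (int x)
{
#if defined (__thumb__) && !defined (__thumb2__)
# ifdef __ARM_ASM_SYNTAX_UNIFIED__
  /* Unified (UAL) syntax: the flag-setting form carries an "s" suffix.  */
  __asm__ ("adds %0, %0, #1" : "+l" (x) : : "cc");
# else
  /* Divided (pre-UAL) syntax: no suffix, flags are still set.  */
  __asm__ ("add %0, %0, #1" : "+l" (x) : : "cc");
# endif
#else
  /* Not Thumb-1: no inline assembly needed for this illustration.  */
  x += 1;
#endif
  return x;
}
```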

BR,
Terry

2014-10-21  Terry Guo  

* config/arm/arm.h (TARGET_UNIFIED_ASM): Also include thumb1.
(ASM_APP_ON): Redefined.
* config/arm/arm.c (arm_option_override): Thumb2 always uses UAL
for inline assembly code.
* config/arm/arm.opt (masm-syntax-unified): New option.
* doc/invoke.texi (-masm-syntax-unified): Document new option.

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 3623c70..e654e22 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -165,6 +165,8 @@ extern char arm_arch_name[];
  } \
if (TARGET_IDIV)\
  builtin_define ("__ARM_ARCH_EXT_IDIV__"); \
+   if (inline_asm_unified) \
+ builtin_define ("__ARM_ASM_SYNTAX_UNIFIED__");\
 } while (0)
 
 #include "config/arm/arm-opts.h"
@@ -348,8 +350,8 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
|| (!optimize_size && !current_tune->prefer_constant_pool)))
 
 /* We could use unified syntax for arm mode, but for now we just use it
-   for Thumb-2.  */
-#define TARGET_UNIFIED_ASM TARGET_THUMB2
+   for thumb mode.  */
+#define TARGET_UNIFIED_ASM (TARGET_THUMB)
 
 /* Nonzero if this chip provides the DMB instruction.  */
 #define TARGET_HAVE_DMB (arm_arch6m || arm_arch7)
@@ -2144,8 +2146,13 @@ extern int making_const_table;
 #define CC_STATUS_INIT \
   do { cfun->machine->thumb1_cc_insn = NULL_RTX; } while (0)
 
+#undef ASM_APP_ON
+#define ASM_APP_ON (inline_asm_unified ? "\t.syntax unified" : \
+   "\t.syntax divided\n")
+
 #undef  ASM_APP_OFF
-#define ASM_APP_OFF (TARGET_ARM ? "" : "\t.thumb\n")
+#define ASM_APP_OFF (TARGET_ARM ? "\t.arm\n\t.syntax divided\n" : \
+"\t.thumb\n\t.syntax unified\n")
 
 /* Output a push or a pop instruction (only used when profiling).
We can't push STATIC_CHAIN_REGNUM (r12) directly with Thumb-1.  We know
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 1ee0eb3..9ccf73c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3121,6 +3121,11 @@ arm_option_override (void)
   if (target_slow_flash_data)
 arm_disable_literal_pool = true;
 
+  /* Thumb2 inline assembly code should always use unified syntax.
+ This will apply to ARM and Thumb1 eventually.  */
+  if (TARGET_THUMB2)
+inline_asm_unified = 1;
+
   /* Register global variables with the garbage collector.  */
   arm_add_gc_roots ();
 }
diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
index 0a80513..50f4c7d 100644
--- a/gcc/config/arm/arm.opt
+++ b/gcc/config/arm/arm.opt
@@ -271,3 +271,7 @@ Use Neon to perform 64-bits operations rather than core registers.
 mslow-flash-data
 Target Report Var(target_slow_flash_data) Init(0)
 Assume loading data from flash is slower than fetching instructions.
+
+masm-syntax-unified
+Target Report Var(inline_asm_unified) Init(0)
+Assume unified syntax for Thumb inline assembly code.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 23f272f..c30c858 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -545,6 +545,7 @@ Objective-C and Objective-C++ Dialects}.
 -munaligned-access @gol
 -mneon-for-64bits @gol
 -mslow-flash-data @gol
+-masm-syntax-unified @gol
 -mrestrict-it}
 
 @emph{AVR Options}
@@ -12954,6 +12955,14 @@ Therefore literal load is minimized for better performance.
 This option is only supported when compiling for ARMv7 M-profile and
 off by default.
 
+@item -masm-syntax-unified
+@opindex masm-syntax-unified
+Assume the Thumb1 inline assembly code are using unified syntax.
+The default is currently off, which means divided syntax is assumed.
+However, this may change in future releases of GCC.  Divided syntax
+should be considered deprecated.  This option has no effect when
+generating Thumb2 code.  Thumb2 assembly code always uses unified syntax.
+
 @item -mrestrict-it
 @opindex mrestrict-it
 Restricts generation of IT blocks to conform to the rules of ARMv8.


[2/2][PATCH,ARM]Generate UAL assembly code for Thumb-1 target

2014-10-21 Thread Terry Guo
Hi there,

The attached patch enables GCC to generate UAL-format code for Thumb-1
targets. Tested with the regression tests; no regressions. Is it OK for trunk?

BR,
Terry

2014-10-21  Terry Guo  

   * config/arm/arm.c (arm_output_mi_thunk): Use UAL for Thumb1
target.
   * config/arm/thumb1.md: Likewise.

gcc/testsuite
2014-10-21  Terry Guo  

* gcc.target/arm/anddi_notdi-1.c: Match with UAL format.
* gcc.target/arm/pr40956.c: Likewise.
* gcc.target/arm/thumb1-Os-mult.c: Likewise.
 * gcc.target/arm/thumb1-load-64bit-constant-3.c: Likewise.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9ccf73c..dc73244 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28615,12 +28615,14 @@ arm_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED,
  fputs ("\tldr\tr3, ", file);
  assemble_name (file, label);
  fputs ("+4\n", file);
- asm_fprintf (file, "\t%s\t%r, %r, r3\n",
+ asm_fprintf (file, "\t%ss\t%r, %r, r3\n",
   mi_op, this_regno, this_regno);
}
   else if (mi_delta != 0)
{
- asm_fprintf (file, "\t%s\t%r, %r, #%d\n",
+ /* Thumb1 unified syntax requires s suffix in instruction name when
+one of the operands is immediate.  */
+ asm_fprintf (file, "\t%ss\t%r, %r, #%d\n",
   mi_op, this_regno, this_regno,
   mi_delta);
}
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 020d83b..8a2abe9 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -29,7 +29,7 @@
(clobber (reg:CC CC_REGNUM))
   ]
   "TARGET_THUMB1"
-  "add\\t%Q0, %Q0, %Q2\;adc\\t%R0, %R0, %R2"
+  "adds\\t%Q0, %Q0, %Q2\;adcs\\t%R0, %R0, %R2"
   [(set_attr "length" "4")
(set_attr "type" "multiple")]
 )
@@ -42,9 +42,9 @@
   "*
static const char * const asms[] =
{
- \"add\\t%0, %0, %2\",
- \"sub\\t%0, %0, #%n2\",
- \"add\\t%0, %1, %2\",
+ \"adds\\t%0, %0, %2\",
+ \"subs\\t%0, %0, #%n2\",
+ \"adds\\t%0, %1, %2\",
  \"add\\t%0, %0, %2\",
  \"add\\t%0, %0, %2\",
  \"add\\t%0, %1, %2\",
@@ -56,7 +56,7 @@
if ((which_alternative == 2 || which_alternative == 6)
&& CONST_INT_P (operands[2])
&& INTVAL (operands[2]) < 0)
- return \"sub\\t%0, %1, #%n2\";
+ return (which_alternative == 2) ? \"subs\\t%0, %1, #%n2\" : \"sub\\t%0, %1, #%n2\";
return asms[which_alternative];
   "
   "&& reload_completed && CONST_INT_P (operands[2])
@@ -105,7 +105,7 @@
  (match_operand:DI 2 "register_operand"  "l")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_THUMB1"
-  "sub\\t%Q0, %Q0, %Q2\;sbc\\t%R0, %R0, %R2"
+  "subs\\t%Q0, %Q0, %Q2\;sbcs\\t%R0, %R0, %R2"
   [(set_attr "length" "4")
(set_attr "type" "multiple")]
 )
@@ -115,7 +115,7 @@
(minus:SI (match_operand:SI 1 "register_operand" "l")
  (match_operand:SI 2 "reg_or_int_operand" "lPd")))]
   "TARGET_THUMB1"
-  "sub\\t%0, %1, %2"
+  "subs\\t%0, %1, %2"
   [(set_attr "length" "2")
(set_attr "conds" "set")
(set_attr "type" "alus_sreg")]
@@ -133,9 +133,9 @@
  "TARGET_THUMB1 && !arm_arch6"
   "*
   if (which_alternative < 2)
-return \"mov\\t%0, %1\;mul\\t%0, %2\";
+return \"mov\\t%0, %1\;muls\\t%0, %2\";
   else
-return \"mul\\t%0, %2\";
+return \"muls\\t%0, %2\";
   "
   [(set_attr "length" "4,4,2")
(set_attr "type" "muls")]
@@ -147,9 +147,9 @@
 (match_operand:SI 2 "register_operand" "l,0,0")))]
   "TARGET_THUMB1 && arm_arch6"
   "@
-   mul\\t%0, %2
-   mul\\t%0, %1
-   mul\\t%0, %1"
+   muls\\t%0, %2
+   muls\\t%0, %1
+   muls\\t%0, %1"
   [(set_attr "length" "2")
(set_attr "type" "muls")]
 )
@@ -159,7 +159,7 @@
(and:SI (match_operand:SI 1 "register_operand" "%0")
(match_operand:SI 2 "register_operand" "l")))]
   "TARGET_THUMB1"
-  "and\\t%0, %2"
+  "ands\\t%0, %2"
   [(set_attr "length" "2")
(set_attr "type"  "logic_imm")
(set_attr "conds" "set")])
@@ -202,7 +202,7 @@
(and:SI (not:SI (match_operand

[Patch, GCC/Thumb-1]Mishandle the label type insn in function thumb1_reorg

2014-06-10 Thread Terry Guo
Hi There,

The thumb1_reorg function uses the macro INSN_CODE to find the instructions
it expects, but INSN_CODE does not work for label insns:
INSN_CODE (label_insn) returns the label number. When there are many labels
and the current label_insn is the first insn of a basic block,
INSN_CODE (label_insn) can accidentally equal CODE_FOR_cbranchsi4_insn. This
leads to an ICE at the SET_SRC (label_insn) in the subsequent code. In
general we should skip all such improper insns, which is the purpose of the
attached small patch.
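
The failure mode can be shown with a toy model. This is only a sketch: the
struct, the enum and the constant TOY_CODE_FOR_CBRANCH are hypothetical
stand-ins, not GCC's real rtx layout. The assumption it encodes is the one
described above, namely that the same field holds the insn code for a real
insn and the label number for a label.

```c
/* Hypothetical stand-ins for GCC's insn kinds and insn codes.  */
enum toy_kind { TOY_INSN, TOY_LABEL };
#define TOY_CODE_FOR_CBRANCH 42

struct toy_insn
{
  enum toy_kind kind;
  /* Insn code for TOY_INSN, label number for TOY_LABEL.  */
  int code_or_label;
};

/* Unsafe: interprets the shared field without checking the kind, so the
   label numbered 42 is mistaken for a conditional branch.  */
static int
is_cbranch_unchecked (const struct toy_insn *insn)
{
  return insn->code_or_label == TOY_CODE_FOR_CBRANCH;
}

/* Safe: only a real insn can be a conditional branch.  */
static int
is_cbranch_checked (const struct toy_insn *insn)
{
  return insn->kind == TOY_INSN
	 && insn->code_or_label == TOY_CODE_FOR_CBRANCH;
}
```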

Some failures in the recent GCC regression tests on Thumb-1 targets are
caused by this. With this patch, all of them pass and there are no new
failures. Is it OK for trunk?

BR,
Terry

2014-06-10  Terry Guo  

 * config/arm/arm.c (thumb1_reorg): Move to next basic block if the head
 of current basic block isn’t a proper insn.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index ccad548..3ebe424 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16939,7 +16939,8 @@ thumb1_reorg (void)
insn = PREV_INSN (insn);
 
   /* Find the last cbranchsi4_insn in basic block BB.  */
-  if (INSN_CODE (insn) != CODE_FOR_cbranchsi4_insn)
+  if (!NONDEBUG_INSN_P (insn)
+ || INSN_CODE (insn) != CODE_FOR_cbranchsi4_insn)
continue;
 
   /* Get the register with which we are comparing.  */


RE: [Patch, GCC/Thumb-1]Mishandle the label type insn in function thumb1_reorg

2014-06-18 Thread Terry Guo


> -Original Message-
> From: Richard Earnshaw
> Sent: Wednesday, June 18, 2014 4:31 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan
> Subject: Re: [Patch, GCC/Thumb-1]Mishandle the label type insn in function
> thumb1_reorg
> 
> On 10/06/14 12:42, Terry Guo wrote:
> > Hi There,
> >
> > The thumb1_reorg function use macro INSN_CODE to find expected
> instructions.
> > But the macro INSN_CODE doesn’t work for label type instruction. The
> > INSN_CODE(label_insn) will return the label number. When we have a lot
> of
> > labels and current label_insn is the first insn of basic block, the
> > INSN_CODE(label_insn) could accidentally equal to
> CODE_FOR_cbranchsi4_insn
> > in this case. This leads to ICE due to SET_SRC(label_insn) in subsequent
> > code. In general we should skip all such improper insns. This is the purpose
> > of attached small patch.
> >
> > Some failures in recent gcc regression test on thumb1 target are caused by
> > this reason. So with this patch, all of them passed and no new failures. Is
> > it ok to trunk?
> >
> > BR,
> > Terry
> >
> > 2014-06-10  Terry Guo  
> >
> >  * config/arm/arm.c (thumb1_reorg): Move to next basic block if the
> head
> >  of current basic block isn’t a proper insn.
> >
> 
> I think you should just test that "insn != BB_HEAD (bb)".  The loop
> immediately above this deals with the !NON-DEBUG insns, so the logic is
> confusing the way you've written it.
> 
> R.
> 

Thanks for the comments. The patch is updated and tested. No more ICE. Is this
one OK?

BR,
Terry
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 85d2114..463707e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16946,7 +16946,8 @@ thumb1_reorg (void)
insn = PREV_INSN (insn);
 
   /* Find the last cbranchsi4_insn in basic block BB.  */
-  if (INSN_CODE (insn) != CODE_FOR_cbranchsi4_insn)
+  if (insn == BB_HEAD (bb)
+ || INSN_CODE (insn) != CODE_FOR_cbranchsi4_insn)
continue;
 
   /* Get the register with which we are comparing.  */

Ping^3 : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-04-20 Thread Terry Guo
Hi there,

Is this one ok to trunk?

BR,
Terry

On Wed, Apr 15, 2015 at 6:45 PM, Hale Wang  wrote:
> Ping for trunk?
>
> Hale
>
>> -Original Message-
>> From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
>> Sent: Friday, February 27, 2015 4:04 AM
>> To: Terry Guo
>> Cc: Segher Boessenkool; Richard Sandiford; GCC Patches; Hale Wang
>> Subject: Re: Ping : [PATCH] [gcc, combine] PR46164: Don't combine the
> insns
>> if a volatile register is contained.
>>
>> Terry Guo  writes:
>> > On Thu, Feb 26, 2015 at 1:55 PM, Segher Boessenkool
>> >  wrote:
>> >> On Tue, Feb 17, 2015 at 11:39:34AM +0800, Terry Guo wrote:
>> >>> On Sun, Feb 15, 2015 at 7:35 PM, Segher Boessenkool
>> >>>  wrote:
>> >>> > Hi Terry,
>> >>> >
>> >>> > I still think this is stage1 material.
>> >>> >
>> >>> >> + /* Don't combine if dest contains a user specified register and
>> >>> >> i3 contains
>> >>> >> + ASM_OPERANDS, because the user specified register (same with
>> >>> >> dest) in i3
>> >>> >> + would be replaced by the src of insn which might be different
>> with
>> >>> >> + the user's expectation.  */
>> >>> >
>> >>> > "Do not eliminate a register asm in an asm input" or similar?
>> >>> > Text explaining why REG_USERVAR_P && HARD_REGISTER_P works
>> here
>> >>> > would be good to have, too.
>> >>
>> >>> diff --git a/gcc/combine.c b/gcc/combine.c index f779117..aeb2854
>> >>> 100644
>> >>> --- a/gcc/combine.c
>> >>> +++ b/gcc/combine.c
>> >>> @@ -1779,7 +1779,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3,
>> >>> rtx_insn *pred ATTRIBUTE_UNUSED,  {
>> >>>int i;
>> >>>const_rtx set = 0;
>> >>> -  rtx src, dest;
>> >>> +  rtx src, dest, asm_op;
>> >>>rtx_insn *p;
>> >>>  #ifdef AUTO_INC_DEC
>> >>>rtx link;
>> >>> @@ -1914,6 +1914,14 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3,
>> rtx_insn *pred ATTRIBUTE_UNUSED,
>> >>>set = expand_field_assignment (set);
>> >>>src = SET_SRC (set), dest = SET_DEST (set);
>> >>>
>> >>> +  /* Use REG_USERVAR_P and HARD_REGISTER_P to check whether
>> DEST is a user
>> >>> + specified register, and do not eliminate such register if it is
> in an
>> >>> + asm input because we may end up with something different with
>> user's
>> >>> + expectation.  */
>> >>
>> >> That doesn't explain why this will hit (almost) only on register asms.
>> >> The user's expectation doesn't matter that much either: GCC would
>> >> violate its own documentation / promises, that matters more ;-)
>> >>
>> >>> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P
>> (dest)
>> >>> +  && ((asm_op = extract_asm_operands (PATTERN (i3))) != NULL))
>> >>
>> >> You do not need the temporary variable, nor the != 0 or the extra
>> >> parens; just write
>> >>
>> >>  && extract_asm_operands (PATTERN (i3))
>> >>
>> >> Cheers,
>> >>
>> >>
>> >> Segher
>> >
>> > Thanks for comments. Patch is updated now. Please review again.
>>
>> Looks good to me FWIW.
>>
>> Thanks,
>> Richard
>
>
>


Re: Ping^3 : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-04-21 Thread Terry Guo
On Tue, Apr 21, 2015 at 11:03 AM, Segher Boessenkool
 wrote:
> On Tue, Apr 21, 2015 at 09:39:16AM +0800, Terry Guo wrote:
>> Is this one ok to trunk?
>
> Probably, if you send the patch + changelog entry :-)
>
> Did you fix the comment?  REG_USERVAR_P and HARD_REGISTER_P can be
> set for more than just register asm.
>
>
> Segher

Sorry for omitting the patch. I believe I have addressed your comment.
Please review it again to make sure my understanding is correct. The
patch is attached, and it is also available at
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01593.html. The
ChangeLog:

gcc/ChangeLog:
2015-04-21  Terry Guo  

   PR rtl-optimization/64818
   * combine.c (can_combine_p): Don't combine if DEST is a user-specified
   register.

gcc/testsuite/ChangeLog:

2015-04-21  Terry Guo  

   PR rtl-optimization/64818
   * gcc.target/arm/pr64818.c: New.


pr64818-combine-user-specified-register.patch-5
Description: Binary data


Re: Ping^3 : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-04-21 Thread Terry Guo
On Wed, Apr 22, 2015 at 9:44 AM, Segher Boessenkool
 wrote:
> On Tue, Apr 21, 2015 at 03:13:38PM +0800, Terry Guo wrote:
>> > Did you fix the comment?  REG_USERVAR_P and HARD_REGISTER_P can be
>> > set for more than just register asm.
>>
>> Sorry for missing the patch. I believe that I addressed your patch.
>> Please review it again to make sure my understanding is correct.
>
>> +  /* Use REG_USERVAR_P and HARD_REGISTER_P to check whether DEST is a user
>> + specified register, and do not eliminate such register if it is in an
>> + asm input.  Otherwise if allow such elimination, we may break the
>> + register asm usage defined in GCC manual.  */
>> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
>> +  && extract_asm_operands (PATTERN (i3)))
>> +return 0;
>
> The "to check whether DEST is a user-specified register" part is not
> correct; this check can for example also match for function arguments
> (which are hard regs) that were combined into any "normal" user var.
> I don't see how we would do a better check, and disallowing combination
> in this case is harmless (or even good); but the comment is misleading.
>
>
> Segher

Thanks for reviewing. The patch is updated per your suggestion. The
ChangeLog is also updated as below:

gcc/ChangeLog:
2015-04-22 Hale Wang 
Terry Guo  

   PR rtl-optimization/64818
   * combine.c (can_combine_p): Don't combine user-specified register if
   it is in an asm input.

gcc/testsuite/ChangeLog:
2015-04-22 Hale Wang 
Terry Guo  

   PR rtl-optimization/64818
   * gcc.target/arm/pr64818.c: New.
diff --git a/gcc/combine.c b/gcc/combine.c
index 6f0007a..6cd55dd 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1910,6 +1910,15 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn 
*pred ATTRIBUTE_UNUSED,
   set = expand_field_assignment (set);
   src = SET_SRC (set), dest = SET_DEST (set);
 
+  /* Do not eliminate user-specified register if it is in an
+ asm input because we may break the register asm usage defined
+ in GCC manual if allow to do so.
+ Be aware that this may cover more cases than we expect but this
+ should be harmless.  */
+  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
+  && extract_asm_operands (PATTERN (i3)))
+return 0;
+
   /* Don't eliminate a store in the stack pointer.  */
   if (dest == stack_pointer_rtx
   /* Don't combine with an insn that sets a register to itself if it has
diff --git a/gcc/testsuite/gcc.target/arm/pr64818.c 
b/gcc/testsuite/gcc.target/arm/pr64818.c
new file mode 100644
index 000..bddd846
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr64818.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+char temp[16];
+extern int foo1 (void);
+
+void foo (void)
+{
+  int i;
+  int len;
+
+  while (1)
+  {
+len = foo1 ();
+register int a asm ("r0") = 5;
+register char *b asm ("r1") = temp;
+register int c asm ("r2") = len;
+asm volatile ("mov %[r0], %[r0]\n  mov %[r1], %[r1]\n  mov %[r2], %[r2]\n"
+  : "+m"(*b)
+  : [r0]"r"(a), [r1]"r"(b), [r2]"r"(c));
+
+for (i = 0; i < len; i++)
+{
+  if (temp[i] == 10)
+  return;
+}
+  }
+}
+
+/* { dg-final { scan-assembler "\[\\t \]+mov\ r1,\ r1" } } */
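
The check in this patch can be illustrated outside GCC with a toy model. This is only a sketch under stated assumptions: `toy_reg`, `TOY_FIRST_PSEUDO` and `toy_blocks_combine` are hypothetical names, not GCC's RTL API, and the fields merely stand in for REG_P, REG_USERVAR_P and HARD_REGISTER_P. It shows Segher's point that the predicate fires for any hard register marked as a user variable when i3 is an asm, not only for a register asm.

```c
#include <stdbool.h>

/* Hypothetical boundary between hard and pseudo register numbers.  */
#define TOY_FIRST_PSEUDO 16

/* Toy stand-in for a destination operand; this is not GCC's rtx.  */
struct toy_reg
{
  bool is_reg;    /* stands in for REG_P */
  bool user_var;  /* stands in for REG_USERVAR_P */
  int regno;      /* a hard register if below TOY_FIRST_PSEUDO */
};

/* Return true when the toy combiner should refuse to combine,
   mirroring the shape of the condition added to can_combine_p.  */
bool
toy_blocks_combine (const struct toy_reg *dest, bool i3_is_asm)
{
  return dest->is_reg
         && dest->user_var
         && dest->regno < TOY_FIRST_PSEUDO
         && i3_is_asm;
}
```

In this model a user-variable hard register feeding an asm is left alone, while pseudos, compiler temporaries and non-asm consumers remain eligible for combination.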


Re: Ping^3 : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-04-22 Thread Terry Guo
On Wed, Apr 22, 2015 at 10:30 AM, Segher Boessenkool
 wrote:
> On Wed, Apr 22, 2015 at 10:21:43AM +0800, Terry Guo wrote:
>> gcc/ChangeLog:
>> 2015-04-22 Hale Wang 
>>     Terry Guo  
>>
>>PR rtl-optimization/64818
>>* combine.c (can_combine_p): Don't combine user-specified register if
>>it is in an asm input.
>>
>> gcc/testsuite/ChangeLog:
>> 2015-04-22 Hale Wang 
>> Terry Guo  
>>
>>PR rtl-optimization/64818
>>* gcc.target/arm/pr64818.c: New.
>
> This is okay for trunk, if it has been bootstrapped and regression tested.
>
> Thanks,
>
>
> Segher

Thanks Segher. The patch was bootstrapped and regression tested on
x86_64 with no problems found. Committed as revision 222306.

BR,
Terry


[Patch][ARM]Correct options for arm test case pr65710

2015-04-22 Thread Terry Guo
Hi there,

This patch corrects the options in the ARM test case pr65710.c. I used
an existing test case as a template to produce this one, but forgot
to update the options. Is it OK for trunk?

BR,
Terry

2015-04-23  Terry Guo  

   * gcc.target/arm/pr65710.c: Update the options.
diff --git a/gcc/testsuite/gcc.target/arm/pr65710.c 
b/gcc/testsuite/gcc.target/arm/pr65710.c
index 139bc64..737b7f3 100644
--- a/gcc/testsuite/gcc.target/arm/pr65710.c
+++ b/gcc/testsuite/gcc.target/arm/pr65710.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv6-m -mthumb -O3 -w -mfloat-abi=soft" } */
+/* { dg-options "-mthumb -O2 -mfloat-abi=soft" } */
 
 struct ST {
   char *buffer;


Re: [Patch][ARM]Correct options for arm test case pr65710

2015-04-23 Thread Terry Guo
On Thu, Apr 23, 2015 at 4:23 PM, Kyrill Tkachov  wrote:
> Hi Terry,
>
> On 23/04/15 02:56, Terry Guo wrote:
>>
>>   /* { dg-do compile } */
>> -/* { dg-options "-march=armv6-m -mthumb -O3 -w -mfloat-abi=soft" } */
>> +/* { dg-options "-mthumb -O2 -mfloat-abi=soft" } */
>>
>
>
> If you have really need the -mthumb here, don't you also
> need to check for a thumb effective target?
> I see in the testsuite we use things like:
> /* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
> or
> /* { dg-require-effective-target arm_thumb2_ok } */
>
> Kyrill
>

The -mthumb option is currently necessary to reproduce this bug. But it
is better to let this test run for other targets too, so I prefer not
to limit this case to Thumb-1 targets.

BR,
Terry


Re: [Patch][ARM]Correct options for arm test case pr65710

2015-04-23 Thread Terry Guo
On Thu, Apr 23, 2015 at 4:37 PM, Kyrill Tkachov  wrote:
>
> On 23/04/15 09:25, Terry Guo wrote:
>>
>> On Thu, Apr 23, 2015 at 4:23 PM, Kyrill Tkachov 
>> wrote:
>>>
>>> Hi Terry,
>>>
>>> On 23/04/15 02:56, Terry Guo wrote:
>>>>
>>>>/* { dg-do compile } */
>>>> -/* { dg-options "-march=armv6-m -mthumb -O3 -w -mfloat-abi=soft" } */
>>>> +/* { dg-options "-mthumb -O2 -mfloat-abi=soft" } */
>>>>
>>>
>>> If you have really need the -mthumb here, don't you also
>>> need to check for a thumb effective target?
>>> I see in the testsuite we use things like:
>>> /* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
>>> or
>>> /* { dg-require-effective-target arm_thumb2_ok } */
>>>
>>> Kyrill
>>>
>> The -mthumb is necessary to reproduce this bug currently. But it is
>> better to enable running this test for other targets, so I prefer not
>> limiting this case for thumb1 target.
>
> Hi Terry,
>
> I had a closer look. This test has lots of warnings so
> you really need that -w in the original options as well.
> I was concerned that if you try adding -mthumb when we're
> testing a target that doesn't support Thumb (say -march=armv4)
> you'll get an error "target CPU does not support THUMB instructions"
> but having tried it out myself, I see that we have this as a warning,
> not an error, so the -w will silence it!
>
> I would prefer if you added a:
>
> /* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
>
>
> So that the test will use Thumb (1 or 2) if possible, and not
> test irrelevant to this test (defaulting -marm) behaviour otherwise.
>
> Kyrill
>
>
>

Patch is updated per your suggestion. Is it OK now?

BR,
Terry

diff --git a/gcc/testsuite/gcc.target/arm/pr65710.c
b/gcc/testsuite/gcc.target/arm/pr65710.c
index 139bc64..227059b 100644
--- a/gcc/testsuite/gcc.target/arm/pr65710.c
+++ b/gcc/testsuite/gcc.target/arm/pr65710.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv6-m -mthumb -O3 -w -mfloat-abi=soft" } */
+/* { dg-options "-mthumb -O2 -mfloat-abi=soft -w" } */
+/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */

 struct ST {
   char *buffer;
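
The header agreed on in this thread can be sketched as a standalone test file. The three directives are the ones from the patch; the body is a hypothetical placeholder (`first_byte` is not part of the real pr65710.c, which is larger), included only so that the sketch compiles on its own.

```c
/* { dg-do compile } */
/* { dg-options "-mthumb -O2 -mfloat-abi=soft -w" } */
/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */

/* Hypothetical placeholder body; the real test case differs.  */
struct ST {
  char *buffer;
};

char *
first_byte (struct ST *s)
{
  return s->buffer;
}
```

With this header the test is skipped on targets that support neither Thumb-1 nor Thumb-2, so -w no longer hides an -mthumb request that the target cannot honor.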


[Patch, ARM]Fix pattern that is missed for Thumb-1 UAL

2014-11-14 Thread Terry Guo
Hi there,

The attached patch fixes a pattern that was found to still emit non-UAL
syntax during a GCC Thumb-1 bootstrap. A reduced test case is attached.
Tested with the GCC regression tests on pre-v6 Thumb-1 and v6 Thumb-1,
with no regressions. Multilibs can be built for both of them.
Is it OK for trunk?

BR,
Terry

gcc/ChangeLog:
2014-11-14  Terry Guo  

 * config/arm/thumb1.md (*addsi3_cbranch_scratch): Updated to UAL
format.

gcc/testsuite/ChangeLog:
2014-11-14  Terry Guo  

 * gcc.target/arm/thumb1-ual-1.c: New test.

diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 3d6f80b..ddedc39 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1420,13 +1420,13 @@
 if (INTVAL (operands[2]) < 0)
   output_asm_insn (\"subs\t%0, %1, %2\", operands);
 else
-  output_asm_insn (\"add\t%0, %1, %2\", operands);
+  output_asm_insn (\"adds\t%0, %1, %2\", operands);
 break;
case 3:
 if (INTVAL (operands[2]) < 0)
   output_asm_insn (\"subs\t%0, %0, %2\", operands);
 else
-  output_asm_insn (\"add\t%0, %0, %2\", operands);
+  output_asm_insn (\"adds\t%0, %0, %2\", operands);
 break;
}
 
diff --git a/gcc/testsuite/gcc.target/arm/thumb1-ual-1.c 
b/gcc/testsuite/gcc.target/arm/thumb1-ual-1.c
new file mode 100644
index 000..a2e439c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1-ual-1.c
@@ -0,0 +1,87 @@
+/* Test Thumb1 insn pattern addsi3_cbranch_scratch.  */
+/* { dg-options "-O2" } */
+/* { dg-skip-if "" { ! { arm_thumb1 } } } */
+
+struct real_value {
+
+  unsigned int cl : 2;
+  unsigned int decimal : 1;
+  unsigned int sign : 1;
+  unsigned int signalling : 1;
+  unsigned int canonical : 1;
+  unsigned int uexp : (32 - 6);
+  unsigned long sig[((128 + (8 * 4)) / (8 * 4))];
+};
+
+enum real_value_class {
+  rvc_zero,
+  rvc_normal,
+  rvc_inf,
+  rvc_nan
+};
+
+extern void exit(int);
+extern int foo(long long *, int, int);
+
+int
+real_to_integer (const struct real_value *r, int *fail, int precision)
+{
+  long long val[2 * (((64*(8)) + 64) / 64)];
+  int exp;
+  int words, w;
+  int result;
+
+  switch (r->cl)
+{
+case rvc_zero:
+underflow:
+  return 100;
+
+case rvc_inf:
+case rvc_nan:
+overflow:
+  *fail = 1;
+
+  if (r->sign)
+ return 200;
+  else
+ return 300;
+
+case rvc_normal:
+  if (r->decimal)
+ return 400;
+
+  exp = ((int)((r)->uexp ^ (unsigned int)(1 << ((32 - 6) - 1))) - (1 << ((32 - 6) - 1)));
+  if (exp <= 0)
+ goto underflow;
+
+
+  if (exp > precision)
+ goto overflow;
+  words = (precision + 64 - 1) / 64;
+  w = words * 64;
+  for (int i = 0; i < words; i++)
+ {
+   int j = ((128 + (8 * 4)) / (8 * 4)) - (words * 2) + (i * 2);
+   if (j < 0)
+ val[i] = 0;
+   else
+ val[i] = r->sig[j];
+   j += 1;
+   if (j >= 0)
+ val[i] |= (unsigned long long) r->sig[j] << (8 * 4);
+ }
+
+
+  result = foo(val, words, w);
+
+  if (r->sign)
+ return -result;
+  else
+ return result;
+
+default:
+  exit(2);
+}
+}
+
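
The template choice that this patch touches can be sketched as a small standalone function. This is a toy model, not the thumb1.md code: `toy_addsi3_template` is a hypothetical name, and the only thing it mirrors is the selection between the two output templates shown in the hunk, where the flag-setting add is spelled "adds" in unified syntax.

```c
/* Toy model of the template choice; not the actual thumb1.md code.  */
const char *
toy_addsi3_template (long imm)
{
  /* Unified syntax spells the flag-setting forms with an "s" suffix,
     so both branches now return an "s"-suffixed mnemonic.  */
  return imm < 0 ? "subs\t%0, %1, %2" : "adds\t%0, %1, %2";
}
```

Before the patch the non-negative branch returned the plain "add" spelling, which is what left one pattern in non-unified form.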


[PATCH][wwwdocs] Update 5.0 changes.html with Thumb1 UAL

2014-11-17 Thread Terry Guo
Hi there,

This patch documents the recent Thumb-1 UAL feature in trunk. Is it OK?

BR,
Terry

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.27
diff -u -r1.27 changes.html
--- changes.html17 Nov 2014 20:14:38 -  1.27
+++ changes.html18 Nov 2014 02:42:31 -
@@ -368,6 +368,16 @@
 

 
+ARM
+ 
+   The Thumb-1 assembly code are now generated in unified syntax. The 
new option
+-masm-syntax-unified can be used to specify whether 
inline assembly
+code are using unified syntax. By default the option is off which means
+non-unified syntax is used. However this is subject to change in 
future releases.
+Eventually the non-unified syntax will be deprecated.
+  
+ 
+
 Operating Systems
 
   DragonFly BSD

RE: [PATCH][wwwdocs] Update 5.0 changes.html with Thumb1 UAL

2014-11-20 Thread Terry Guo


> -Original Message-
> From: Kyrill Tkachov [mailto:kyrylo.tkac...@arm.com]
> Sent: Tuesday, November 18, 2014 11:08 PM
> To: Terry Guo; gcc-patches@gcc.gnu.org
> Cc: ger...@pfeifer.com
> Subject: Re: [PATCH][wwwdocs] Update 5.0 changes.html with Thumb1 UAL
> 
> 
> On 18/11/14 02:48, Terry Guo wrote:
> > + 
> > +   The Thumb-1 assembly code are now generated in unified
syntax.
> The new option
> > +-masm-syntax-unified can be used to specify
whether
> inline assembly
> > +code are using unified syntax. By default the option is off
which
> means
> > +non-unified syntax is used. However this is subject to change
in future
> releases.
> > +Eventually the non-unified syntax will be deprecated.
> > +  
> > + 
> Hi Terry,
> 
> Sorry for the late comment, I see this has already been committed.
> 
> I think it should be "assembly code is now generated".
> Also "whether inline assembly code is using unified syntax".
> 
> Kyrill

Thanks for the comments. I committed the patch below to fix those typos.

BR,
Terry

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.39
diff -u -r1.39 changes.html
--- htdocs/gcc-5/changes.html   19 Nov 2014 12:13:00 -  1.39
+++ htdocs/gcc-5/changes.html   20 Nov 2014 03:48:26 -
@@ -387,9 +387,9 @@
 
 ARM
  
-   The Thumb-1 assembly code are now generated in unified syntax.
The new option
+   The Thumb-1 assembly code is now generated in unified syntax.
The new option
 -masm-syntax-unified can be used to specify whether
inline assembly
-code are using unified syntax. By default the option is off which
means
+code is using unified syntax. By default the option is off which
means
 non-unified syntax is used. However this is subject to change in
future releases.
 Eventually the non-unified syntax will be deprecated.
   





[Patch,wwwdoc]Update 5.0 change for ARM new core cortex-m7

2014-11-25 Thread Terry Guo
Hi there,

This patch will document support and tuning for Cortex-M7 in GCC 5.0
changes. Is it ok to commit?

BR,
Terry

2014-11-26  Terry Guo  

 * htdocs/gcc-5/changes.html: Mention Cortex-M7.

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.35
diff -u -r1.35 changes.html
--- htdocs/gcc-5/changes.html   18 Nov 2014 06:50:55 -  1.35
+++ htdocs/gcc-5/changes.html   18 Nov 2014 08:35:49 -
@@ -392,6 +392,9 @@
 non-unified syntax is used. However this is subject to change in
future releases.
 Eventually the non-unified syntax will be deprecated.
   
+ Support for the Cortex-M7 processor is now available through the
+-mcpu=cortex-m7 and -mtune=cortex-m7 options.
+ 
  
 
 IA-32/x86-64


RE: [Patch][ARM]Don't put volatile memory access in IT block for cortex-m7

2015-02-25 Thread Terry Guo


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Richard Earnshaw
> Sent: Wednesday, February 18, 2015 2:45 AM
> To: Terry Guo; gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw; Ramana Radhakrishnan
> Subject: Re: [Patch][ARM]Don't put volatile memory access in IT block for
> cortex-m7
> 
> On 12/02/15 11:12, Terry Guo wrote:
> > Hi there,
> >
> > This patch intends to prevent gcc from putting volatile memory access
> > into IT block for target like cortex-m7.
> >
> > gcc/ChangeLog:
> >
> > 2015-02-12  Terry Guo  
> >
> > * config/arm/arm.c (arm_tune_cortex_m7): New global variable.
> > * config/arm/arm.h (TARGET_NO_VOLATILE_CE): New macro.
> > (arm_tune_cortex_m7): Declare new global variable.
> > * config/arm/arm.md (arm_comparison_operator): Disabled if not
> allow
> >  volatile memory access in IT block.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2015-02-12  Terry Guo  
> >
> > * gcc.target/arm/cortex-m7-it-volatile.c: New test.
> >
> 
> Not ok.
> 
> +/* Targets that don't support accessing volatile memory inside IT
> block.  */
> +#define TARGET_NO_VOLATILE_CE(arm_tune_cortex_m7)
> 
> Please don't create feature bits that explicitly test for a particular target.
> Instead, define generic 'features' and then arrange for either the
> architecture tables, or tuning tables (as appropriate) to enable that feature.
> 
> See how arm_arch_arm_hwdiv is defined for how to do this.
> 
> R.
> 

Thanks Richard.  Patch is updated per your suggestion. Is this one OK for 
current stage and 4.8/4.9?

BR,
Terry

gcc/testsuite/ChangeLog:

2015-02-25  Terry Guo  

* gcc.target/arm/no-volatile-in-it.c: New test.


gcc/ChangeLog:

2015-02-25  Terry Guo  

* config/arm/arm-cores.def (cortex-m7): Add flag FL_NO_VOLATILE_CE.
* config/arm/arm-protos.h (FL_NO_VOLATILE_CE): New flag.
(arm_arch_no_volatile_ce): Declare new global variable.
* config/arm/arm.c (arm_arch_no_volatile_ce): Define new global variable.
(arm_option_override): Assign value to arm_arch_no_volatile_ce.
* config/arm/arm.h (arm_arch_no_volatile_ce): Declare it.
(TARGET_NO_VOLATILE_CE): New macro.
* config/arm/arm.md (arm_comparison_operator): Disabled if not allow
volatile memory access in IT block.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d7e730d..b22ea7f 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -155,7 +155,7 @@ ARM_CORE("cortex-r4",   cortexr4, cortexr4, 
7R,  FL_LDSCHED, cortex)
 ARM_CORE("cortex-r4f", cortexr4f, cortexr4f,   7R,  
FL_LDSCHED, cortex)
 ARM_CORE("cortex-r5",  cortexr5, cortexr5, 7R,  FL_LDSCHED 
| FL_ARM_DIV, cortex)
 ARM_CORE("cortex-r7",  cortexr7, cortexr7, 7R,  FL_LDSCHED 
| FL_ARM_DIV, cortex)
-ARM_CORE("cortex-m7",  cortexm7, cortexm7, 7EM, 
FL_LDSCHED, cortex_m7)
+ARM_CORE("cortex-m7",  cortexm7, cortexm7, 7EM, FL_LDSCHED 
| FL_NO_VOLATILE_CE, cortex_m7)
 ARM_CORE("cortex-m4",  cortexm4, cortexm4, 7EM, 
FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3",  cortexm3, cortexm3, 7M,  
FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",marvell_pj4, marvell_pj4,   7A,  
FL_LDSCHED, 9e)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 307babb..28ffe52 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -360,6 +360,7 @@ extern bool arm_is_constant_pool_ref (rtx);
 #define FL_CRC32  (1 << 25)  /* ARMv8 CRC32 instructions.  */
 
 #define FL_SMALLMUL   (1 << 26)   /* Small multiply supported.  */
+#define FL_NO_VOLATILE_CE   (1 << 27) /* No volatile memory in IT block.  */
 
 #define FL_IWMMXT (1 << 29)  /* XScale v2 or "Intel Wireless 
MMX technology".  */
 #define FL_IWMMXT2(1 << 30)   /* "Intel Wireless MMX2 technology".  */
@@ -482,6 +483,9 @@ extern int arm_arch_thumb2;
 extern int arm_arch_arm_hwdiv;
 extern int arm_arch_thumb_hwdiv;
 
+/* Nonzero if chip disallows volatile memory access in IT block.  */
+extern int arm_arch_no_volatile_ce;
+
 /* Nonzero if we should use Neon to handle 64-bits operations rather
than core registers.  */
 extern int prefer_neon_for_64bits;
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 297dfe1..8c10ea3 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -383,6 +383,9 @@ extern void (*arm_lang_output
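
Richard's suggestion, a generic feature bit enabled per core through the tables rather than a test for one particular core, can be sketched outside GCC as follows. This is a toy model under stated assumptions: the two bit positions are the ones visible in the patch, but `toy_core`, `toy_cores` and `toy_no_volatile_ce` are hypothetical names, and the real table is generated from arm-cores.def.

```c
#include <string.h>

/* Bit positions as they appear in the patch.  */
#define FL_LDSCHED        (1u << 7)
#define FL_NO_VOLATILE_CE (1u << 27)

/* Toy core table; illustrative entries only.  */
struct toy_core
{
  const char *name;
  unsigned int flags;
};

static const struct toy_core toy_cores[] = {
  { "cortex-m7", FL_LDSCHED | FL_NO_VOLATILE_CE },
  { "cortex-m4", FL_LDSCHED },
  { "cortex-m3", FL_LDSCHED },
};

/* Derive the generic feature from the selected core's flags, so that
   no code needs to test for a particular core by name.  */
int
toy_no_volatile_ce (const char *cpu)
{
  size_t i;

  for (i = 0; i < sizeof toy_cores / sizeof toy_cores[0]; i++)
    if (strcmp (cpu, toy_cores[i].name) == 0)
      return (toy_cores[i].flags & FL_NO_VOLATILE_CE) != 0;

  return 0;  /* Unknown core: assume no restriction.  */
}
```

Adding the restriction to another core then means setting one more bit in its table entry; the code that consumes the feature does not change.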

[Ping^1] [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-02-25 Thread Terry Guo
On Tue, Feb 17, 2015 at 11:39 AM, Terry Guo  wrote:
> On Sun, Feb 15, 2015 at 7:35 PM, Segher Boessenkool
>  wrote:
>> Hi Terry,
>>
>> I still think this is stage1 material.
>>
>>> +  /* Don't combine if dest contains a user specified register and i3 
>>> contains
>>> + ASM_OPERANDS, because the user specified register (same with dest) in 
>>> i3
>>> + would be replaced by the src of insn which might be different with
>>> + the user's expectation.  */
>>
>> "Do not eliminate a register asm in an asm input" or similar?  Text
>> explaining why REG_USERVAR_P && HARD_REGISTER_P works here would be
>> good to have, too.
>>
>>> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
>>> +  && (GET_CODE (PATTERN (i3)) == SET
>>> +   && GET_CODE (SET_SRC (PATTERN (i3))) == ASM_OPERANDS))
>>> +return 0;
>>
>> That works only for asms with exactly one output.  You want
>> extract_asm_operands.
>>
>>
>> Segher
>
> Thanks Segher. Patch is updated per you suggestion. Is this one ok for stage 
> 1?
>
> BR,
> Terry

Hi Segher,

Any comments on the updated patch, is it OK?

BR,
Terry


Re: Ping : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-02-26 Thread Terry Guo
On Thu, Feb 26, 2015 at 1:55 PM, Segher Boessenkool
 wrote:
> On Tue, Feb 17, 2015 at 11:39:34AM +0800, Terry Guo wrote:
>> On Sun, Feb 15, 2015 at 7:35 PM, Segher Boessenkool
>>  wrote:
>> > Hi Terry,
>> >
>> > I still think this is stage1 material.
>> >
>> >> +  /* Don't combine if dest contains a user specified register and i3 
>> >> contains
>> >> + ASM_OPERANDS, because the user specified register (same with dest) 
>> >> in i3
>> >> + would be replaced by the src of insn which might be different with
>> >> + the user's expectation.  */
>> >
>> > "Do not eliminate a register asm in an asm input" or similar?  Text
>> > explaining why REG_USERVAR_P && HARD_REGISTER_P works here would be
>> > good to have, too.
>
>> diff --git a/gcc/combine.c b/gcc/combine.c
>> index f779117..aeb2854 100644
>> --- a/gcc/combine.c
>> +++ b/gcc/combine.c
>> @@ -1779,7 +1779,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn 
>> *pred ATTRIBUTE_UNUSED,
>>  {
>>int i;
>>const_rtx set = 0;
>> -  rtx src, dest;
>> +  rtx src, dest, asm_op;
>>rtx_insn *p;
>>  #ifdef AUTO_INC_DEC
>>rtx link;
>> @@ -1914,6 +1914,14 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn 
>> *pred ATTRIBUTE_UNUSED,
>>set = expand_field_assignment (set);
>>src = SET_SRC (set), dest = SET_DEST (set);
>>
>> +  /* Use REG_USERVAR_P and HARD_REGISTER_P to check whether DEST is a user
>> + specified register, and do not eliminate such register if it is in an
>> + asm input because we may end up with something different with user's
>> + expectation.  */
>
> That doesn't explain why this will hit (almost) only on register asms.
> The user's expectation doesn't matter that much either: GCC would violate
> its own documentation / promises, that matters more ;-)
>
>> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
>> +  && ((asm_op = extract_asm_operands (PATTERN (i3))) != NULL))
>
> You do not need the temporary variable, nor the != 0 or the extra parens;
> just write
>
>  && extract_asm_operands (PATTERN (i3))
>
> Cheers,
>
>
> Segher

Thanks for comments. Patch is updated now. Please review again.

BR,
Terry


pr64818-combine-user-specified-register.patch-5
Description: Binary data


[PATCH][ARM]Automatically add -mthumb for thumb-only target when mode isn't specified

2015-03-01 Thread Terry Guo
Hi there,

If the target mode isn't specified via either the gcc configure option
--with-mode or the command line, this patch makes the gcc driver
automatically add -mthumb for Thumb-only targets. Tested with the gcc
regression tests for various ARM targets, with no regressions. Is it OK?

BR,
Terry

gcc/ChangeLog:

2015-03-02  Terry Guo  

* common/config/arm/arm-common.c (arm_is_target_thumb_only): New
function.
* config/arm/arm-protos.h (FL_ Macros): Move to ...
* config/arm/arm-opts.h (FL_ Macros): ... here.
(struct arm_arch_core_flag): New struct.
(arm_arch_core_flags): New array for arch/core and flag map.
* config/arm/arm.h (MODE_SET_SPEC_FUNCTIONS): Define new SPEC
function.
(EXTRA_SPEC_FUNCTIONS): Include new SPEC function.
(MODE_SET_SPECS): New SPEC.
(DRIVER_SELF_SPECS): Include new SPEC.

diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index 86673b7..e17ee03 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -97,6 +97,28 @@ arm_rewrite_mcpu (int argc, const char **argv)
   return arm_rewrite_selected_cpu (argv[argc - 1]);
 }
 
+/* Called by driver to check whether the target denoted by current
+   command line options is thumb-only target.  If -march present,
+   check the last -march option.  If no -march, check the last -mcpu
+   option.  */
+const char *
+arm_is_target_thumb_only (int argc, const char **argv)
+{
+  unsigned int opt;
+
+  if (argc)
+{
+  for (opt = 0; opt < (ARRAY_SIZE (arm_arch_core_flags) - 1); opt++)
+   if ((strcmp (argv[argc - 1], arm_arch_core_flags[opt].name) == 0)
+   && ((arm_arch_core_flags[opt].flags & FL_NOTM) == 0))
+ return "-mthumb";
+
+  return NULL;
+}
+  else
+return NULL;
+}
+
 #undef ARM_CPU_NAME_LENGTH
 
 
diff --git a/gcc/config/arm/arm-opts.h b/gcc/config/arm/arm-opts.h
index 039e333..222d20e 100644
--- a/gcc/config/arm/arm-opts.h
+++ b/gcc/config/arm/arm-opts.h
@@ -77,4 +77,93 @@ enum arm_tls_type {
   TLS_GNU,
   TLS_GNU2
 };
+
+/* Flags used to identify the presence of processor capabilities.  */
+
+/* Bit values used to identify processor capabilities.  */
+#define FL_CO_PROC(1 << 0)/* Has external co-processor bus */
+#define FL_ARCH3M (1 << 1)/* Extended multiply */
+#define FL_MODE26 (1 << 2)/* 26-bit mode support */
+#define FL_MODE32 (1 << 3)/* 32-bit mode support */
+#define FL_ARCH4  (1 << 4)/* Architecture rel 4 */
+#define FL_ARCH5  (1 << 5)/* Architecture rel 5 */
+#define FL_THUMB  (1 << 6)/* Thumb aware */
+#define FL_LDSCHED(1 << 7)   /* Load scheduling necessary */
+#define FL_STRONG (1 << 8)   /* StrongARM */
+#define FL_ARCH5E (1 << 9)/* DSP extensions to v5 */
+#define FL_XSCALE (1 << 10)  /* XScale */
+/* spare (1 << 11) */
+#define FL_ARCH6  (1 << 12)   /* Architecture rel 6.  Adds
+media instructions.  */
+#define FL_VFPV2  (1 << 13)   /* Vector Floating Point V2.  */
+#define FL_WBUF  (1 << 14)   /* Schedule for write buffer ops.
+Note: ARM6 & 7 derivatives only.  */
+#define FL_ARCH6K (1 << 15)   /* Architecture rel 6 K extensions.  */
+#define FL_THUMB2 (1 << 16)  /* Thumb-2.  */
+#define FL_NOTM  (1 << 17)   /* Instructions not present in 
the 'M'
+profile.  */
+#define FL_THUMB_DIV  (1 << 18)  /* Hardware divide (Thumb mode).  
*/
+#define FL_VFPV3  (1 << 19)   /* Vector Floating Point V3.  */
+#define FL_NEON   (1 << 20)   /* Neon instructions.  */
+#define FL_ARCH7EM(1 << 21)  /* Instructions present in the 
ARMv7E-M
+architecture.  */
+#define FL_ARCH7  (1 << 22)   /* Architecture 7.  */
+#define FL_ARM_DIV(1 << 23)  /* Hardware divide (ARM mode).  */
+#define FL_ARCH8  (1 << 24)   /* Architecture 8.  */
+#define FL_CRC32  (1 << 25)  /* ARMv8 CRC32 instructions.  */
+
+#define FL_SMALLMUL   (1 << 26)   /* Small multiply supported.  */
+
+#define FL_IWMMXT (1 << 29)  /* XScale v2 or "Intel Wireless 
MMX technology".  */
+#define FL_IWMMXT2(1 << 30)   /* "Intel Wireless MMX2 technology".  */
+
+/* Flags that only effect tuning, not available instructions.  */
+#define FL_TUNE(FL_WBUF | FL_VFPV2 | FL_STRONG | FL_LDSCHED \
+| FL_CO_PROC)
+
+#define FL_FOR_ARCH2   FL_NOTM
+#define FL_F

Re: [PATCH][ARM]Automatically add -mthumb for thumb-only target when mode isn't specified

2015-03-02 Thread Terry Guo
On Mon, Mar 2, 2015 at 9:08 PM, Maxim Kuvyrkov
 wrote:
>> On Mar 2, 2015, at 4:44 AM, Terry Guo  wrote:
>>
>> Hi there,
>>
>> If target mode isn't specified via either gcc configuration option
>> --with-mode or command line, this patch intends to improve gcc driver to
>> automatically add option -mthumb for thumb-only target. Tested with gcc
>> regression test for various arm targets, no regression. Is it OK?
>>
>> BR,
>> Terry
>>
>> gcc/ChangeLog:
>>
>> 2015-03-02  Terry Guo  
>>
>>* common/config/arm/arm-common.c (arm_is_target_thumb_only): New
>> function.
>>* config/arm/arm-protos.h (FL_ Macros): Move to ...
>>* config/arm/arm-opts.h (FL_ Macros): ... here.
>>(struct arm_arch_core_flag): New struct.
>>(arm_arch_core_flags): New array for arch/core and flag map.
>>* config/arm/arm.h (MODE_SET_SPEC_FUNCTIONS): Define new SPEC
>> function.
>>(EXTRA_SPEC_FUNCTIONS): Include new SPEC function.
>>(MODE_SET_SPECS): New SPEC.
>>(DRIVER_SELF_SPECS): Include new SPEC.
>
> Did you consider approach of implementing this purely inside cc1 rather than 
> driver?
>
> We do not seem to need to pass -mthumb to assembler or linker since those 
> will pick up ARM-ness / Thumb-ness from function annotations.  Therefore we 
> need to handle -marm / -mthumb for cc1 only.  What am I missing?
>

The way GCC finds multilibs prevents us from doing this in cc1. The
target options must be fully constructed for the gcc driver to decide
the multilib path, which happens before cc1 runs. For example, for the
command line "arm-none-eabi-gcc -mcpu=cortex-m3 -o hello.axf hello.c",
we need to work out inside the gcc driver that -mthumb should be added;
otherwise the command line behaves like "arm-none-eabi-gcc
-marm -mcpu=cortex-m3 -o hello.axf hello.c" and the ARM mode multilib
will be linked. Thus we have to do this in the gcc driver rather than cc1.

> Also, what's the significance of moving FL_* flags to arm-opts.h?  If you had 
> to separate FL_* definitions from the rest of arm-protos.h, then a new 
> dedicated file (e.g., arm-fl.h) would be a better choice for new home of FL_* 
> definitions.
>

I set up an arch/core<->flags map array so that the gcc driver can
figure out whether the target is Thumb-only. Those FL_* flags are needed
for this map array. arm-opts.h is used to share back end information
with the gcc driver. Normally we try to minimize such information. That's
why I moved only those FL_* flags rather than simply including the header
file that has them. But maybe it is a good idea to put the FL_* flags
into a separate file. I will try.

BR,
Terry
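
The lookup that the driver needs can be sketched as a standalone function. This is a toy model, assuming a tiny illustrative table: `toy_arch_core_flag`, `toy_map` and `toy_thumb_only_option` are hypothetical names, the entries are examples rather than GCC's real arch/core map, and only the FL_NOTM bit position is taken from the patch. A name whose flags lack FL_NOTM is treated as an M-profile, Thumb-only target.

```c
#include <stddef.h>
#include <string.h>

#define FL_NOTM (1u << 17)  /* Bit position taken from the patch.  */

struct toy_arch_core_flag
{
  const char *name;
  unsigned int flags;
};

/* Illustrative entries only; the patch builds its map from GCC's
   own architecture and core tables.  */
static const struct toy_arch_core_flag toy_map[] = {
  { "cortex-m3", 0 },
  { "armv6-m", 0 },
  { "cortex-a9", FL_NOTM },
  { "armv7-a", FL_NOTM },
};

/* Return "-mthumb" when NAME is a known Thumb-only target, else NULL.  */
const char *
toy_thumb_only_option (const char *name)
{
  size_t i;

  for (i = 0; i < sizeof toy_map / sizeof toy_map[0]; i++)
    if (strcmp (name, toy_map[i].name) == 0
        && (toy_map[i].flags & FL_NOTM) == 0)
      return "-mthumb";

  return NULL;
}
```

Because the result is computed in the driver, it is available before the multilib directory is chosen, which is the ordering constraint described above.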


Re: [PATCH][ARM]Automatically add -mthumb for thumb-only target when mode isn't specified

2015-03-03 Thread Terry Guo
On Mon, Mar 2, 2015 at 9:08 PM, Maxim Kuvyrkov
 wrote:
>> On Mar 2, 2015, at 4:44 AM, Terry Guo  wrote:
>>
>> Hi there,
>>
>> If target mode isn't specified via either gcc configuration option
>> --with-mode or command line, this patch intends to improve gcc driver to
>> automatically add option -mthumb for thumb-only target. Tested with gcc
>> regression test for various arm targets, no regression. Is it OK?
>>
>> BR,
>> Terry
>>
>> gcc/ChangeLog:
>>
>> 2015-03-02  Terry Guo  
>>
>>* common/config/arm/arm-common.c (arm_is_target_thumb_only): New
>> function.
>>* config/arm/arm-protos.h (FL_ Macros): Move to ...
>>* config/arm/arm-opts.h (FL_ Macros): ... here.
>>(struct arm_arch_core_flag): New struct.
>>(arm_arch_core_flags): New array for arch/core and flag map.
>>* config/arm/arm.h (MODE_SET_SPEC_FUNCTIONS): Define new SPEC
>> function.
>>(EXTRA_SPEC_FUNCTIONS): Include new SPEC function.
>>(MODE_SET_SPECS): New SPEC.
>>(DRIVER_SELF_SPECS): Include new SPEC.
>
> Did you consider approach of implementing this purely inside cc1 rather than 
> driver?
>
> We do not seem to need to pass -mthumb to assembler or linker since those 
> will pick up ARM-ness / Thumb-ness from function annotations.  Therefore we 
> need to handle -marm / -mthumb for cc1 only.  What am I missing?
>
> Also, what's the significance of moving FL_* flags to arm-opts.h?  If you had 
> to separate FL_* definitions from the rest of arm-protos.h, then a new 
> dedicated file (e.g., arm-fl.h) would be a better choice for new home of FL_* 
> definitions.
>

Please find my answers in another email. The attached patch follows
your idea and puts those FL_* flags into a separate file named
arm-flags.h. Does it look good to you?

BR,
Terry


Re: [PATCH][ARM]Automatically add -mthumb for thumb-only target when mode isn't specified

2015-03-03 Thread Terry Guo
On Wed, Mar 4, 2015 at 10:44 AM, Terry Guo  wrote:
> On Mon, Mar 2, 2015 at 9:08 PM, Maxim Kuvyrkov
>  wrote:
>>> On Mar 2, 2015, at 4:44 AM, Terry Guo  wrote:
>>>
>>> Hi there,
>>>
>>> If target mode isn't specified via either gcc configuration option
>>> --with-mode or command line, this patch intends to improve gcc driver to
>>> automatically add option -mthumb for thumb-only target. Tested with gcc
>>> regression test for various arm targets, no regression. Is it OK?
>>>
>>> BR,
>>> Terry
>>>
>>> gcc/ChangeLog:
>>>
>>> 2015-03-02  Terry Guo  
>>>
>>>* common/config/arm/arm-common.c (arm_is_target_thumb_only): New
>>> function.
>>>* config/arm/arm-protos.h (FL_ Macros): Move to ...
>>>* config/arm/arm-opts.h (FL_ Macros): ... here.
>>>(struct arm_arch_core_flag): New struct.
>>>(arm_arch_core_flags): New array for arch/core and flag map.
>>>* config/arm/arm.h (MODE_SET_SPEC_FUNCTIONS): Define new SPEC
>>> function.
>>>(EXTRA_SPEC_FUNCTIONS): Include new SPEC function.
>>>(MODE_SET_SPECS): New SPEC.
>>>(DRIVER_SELF_SPECS): Include new SPEC.
>>
>> Did you consider approach of implementing this purely inside cc1 rather than 
>> driver?
>>
>> We do not seem to need to pass -mthumb to assembler or linker since those 
>> will pick up ARM-ness / Thumb-ness from function annotations.  Therefore we 
>> need to handle -marm / -mthumb for cc1 only.  What am I missing?
>>
>> Also, what's the significance of moving FL_* flags to arm-opts.h?  If you 
>> had to separate FL_* definitions from the rest of arm-protos.h, then a new 
>> dedicated file (e.g., arm-fl.h) would be a better choice for new home of 
>> FL_* definitions.
>>
>
> Please find my answers in another email. The attached patch follows
> your suggestion and puts the FL_* definitions into a separate file named
> arm-flags.h. Does it look good to you?
>
> BR,
> Terry

Sorry for the missing patch.
diff --git a/gcc/common/config/arm/arm-common.c 
b/gcc/common/config/arm/arm-common.c
index 86673b7..e17ee03 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -97,6 +97,28 @@ arm_rewrite_mcpu (int argc, const char **argv)
   return arm_rewrite_selected_cpu (argv[argc - 1]);
 }
 
+/* Called by driver to check whether the target denoted by current
+   command line options is thumb-only target.  If -march present,
+   check the last -march option.  If no -march, check the last -mcpu
+   option.  */
+const char *
+arm_is_target_thumb_only (int argc, const char **argv)
+{
+  unsigned int opt;
+
+  if (argc)
+{
+  for (opt = 0; opt < (ARRAY_SIZE (arm_arch_core_flags) - 1); opt++)
+   if ((strcmp (argv[argc - 1], arm_arch_core_flags[opt].name) == 0)
+   && ((arm_arch_core_flags[opt].flags & FL_NOTM) == 0))
+ return "-mthumb";
+
+  return NULL;
+}
+  else
+return NULL;
+}
+
 #undef ARM_CPU_NAME_LENGTH
 
 
diff --git a/gcc/config/arm/arm-flags.h b/gcc/config/arm/arm-flags.h
new file mode 100644
index 000..fe3a723
--- /dev/null
+++ b/gcc/config/arm/arm-flags.h
@@ -0,0 +1,92 @@
+/* Flags used to identify the presence of processor capabilities. 
+
+   Copyright (C) 2015 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_ARM_FLAGS_H
+#define GCC_ARM_FLAGS_H
+
+/* Bit values used to identify processor capabilities.  */
+#define FL_CO_PROC(1 << 0)/* Has external co-processor bus */
+#define FL_ARCH3M (1 << 1)/* Extended multiply */
+#define FL_MODE26 (1 << 2)/* 26-bit mode support */
+#define FL_MODE32 (1 << 3)/* 32-bit mode support */
+#define FL_ARCH4  (1 << 4)/* Architecture rel 4 */
+#define FL_ARCH5  (1 << 5)/* Architecture rel 5 */
+#define FL_THUMB  (1 << 6)/* Thumb aware */
+#define FL_LDSCHED(1 << 7)   /* Load 

Re: [PATCH][ARM]Automatically add -mthumb for thumb-only target when mode isn't specified

2015-03-04 Thread Terry Guo
>
> Thanks Terry (and everyone else) for explaining why we want to do this in the 
> driver.  The substance of the patch looks good to me, and below are some 
> comments and nit-picks.  (Also, I'm not an ARM maintainer, so this is a 
> review, not an approval to commit).
>
> Please make sure to update changelog before committing.
>

Thanks Maxim, your comments are great. I have accepted all of them, and
I have replied inline to the two that I am not clear about.

<>
>> +
>> +struct arm_arch_core_flag
>> +{
>> +  const char *const name;
>> +  const unsigned long flags;
>> +};
>> +
>> +static const struct arm_arch_core_flag arm_arch_core_flags[] =
>> +{
>> +#undef ARM_CORE
>> +#define ARM_CORE(NAME, X, IDENT, ARCH, FLAGS, COSTS) \
>> +  {NAME, FLAGS | FL_FOR_ARCH##ARCH},
>> +#include "arm-cores.def"
>> +#undef ARM_CORE
>> +#undef ARM_ARCH
>> +#define ARM_ARCH(NAME, CORE, ARCH, FLAGS) \
>> +  {NAME, FLAGS},
>> +#include "arm-arches.def"
>> +#undef ARM_ARCH
>> +  {NULL, 0}
>> +};
>
> Did you consider implications from mixing ARCHes and CPUs in the same array?  
> It should not be a problem, but would you please double-check that cases like 
> "-march=cortex-a15" are properly caught as errors elsewhere in the driver?
>

I am not sure I follow you correctly here. This array is only used by my
new arm_target_thumb_only function; it isn't used by any other code
in gcc. So I don't think mixing them will break gcc's option checking
mechanism. I tried the command below and got an error message:

$ ./install-native/bin/arm-none-eabi-gcc -march=cortex-a15 x.c -S -mthumb
arm-none-eabi-gcc: error: unrecognized argument in option '-march=cortex-a15'
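
To make the lookup concrete, here is a minimal standalone sketch of how a combined arch/core table can drive the thumb-only check. It is not the code from the patch: the `sketch_` names, the flag bit value and the four table entries are assumptions chosen for illustration, whereas the real table is generated from arm-cores.def and arm-arches.def and the real FL_NOTM definition lives with the other FL_* macros.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for FL_NOTM: set when the target also has
   instructions that are not present in the M profile.  */
#define SKETCH_FL_NOTM (1UL << 0)

struct sketch_arch_core_flag
{
  const char *name;
  unsigned long flags;
};

/* Illustrative entries only; cores and arches share one table.  */
static const struct sketch_arch_core_flag sketch_flags[] =
{
  {"cortex-m3", 0},
  {"cortex-a15", SKETCH_FL_NOTM},
  {"armv7-m", 0},
  {"armv7-a", SKETCH_FL_NOTM},
  {NULL, 0}
};

/* Return "-mthumb" when the last argument names a target whose flags
   lack SKETCH_FL_NOTM, otherwise NULL.  */
const char *
sketch_target_thumb_only (int argc, const char **argv)
{
  size_t i;

  if (argc <= 0)
    return NULL;

  for (i = 0; sketch_flags[i].name != NULL; i++)
    if (strcmp (argv[argc - 1], sketch_flags[i].name) == 0
        && (sketch_flags[i].flags & SKETCH_FL_NOTM) == 0)
      return "-mthumb";

  return NULL;
}
```

Because the function only answers "is this name thumb-only?", an unknown or misplaced name simply yields NULL, and rejecting something like -march=cortex-a15 is left to the normal option checking.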

>>  #endif
>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> index 28ffe52..325a81c 100644
>> --- a/gcc/config/arm/arm-protos.h
>> +++ b/gcc/config/arm/arm-protos.h
>> @@ -325,75 +325,6 @@ extern const char *arm_rewrite_selected_cpu (const char 
>> *name);
>>
>>  extern bool arm_is_constant_pool_ref (rtx);
>>
>> -/* Flags used to identify the presence of processor capabilities.  */
>
> You've lost this comment in the new file.  Was it intentional?
>

The line is used as the first line of the new file arm-flags.h.

BR,
Terry

Here is updated ChangeLog:

gcc/ChangeLog:

2015-03-05  Terry Guo  

  * common/config/arm/arm-common.c (arm_target_thumb_only): New function.
  * config/arm/arm-protos.h: Move FL_* stuff into below file and
then include it.
  * config/arm/arm-flags.h: New file for FL_* stuff.
  * config/arm/arm-opts.h (struct arm_arch_core_flag): New struct.
  (arm_arch_core_flags): New array for arch/core and flag map.
  * config/arm/arm.h (TARGET_MODE_SPEC_FUNCTIONS): New SPEC function.
  (EXTRA_SPEC_FUNCTIONS): Include new SPEC function.
  (TARGET_MODE_SPECS): New SPEC.
  (DRIVER_SELF_SPECS): Include new SPEC.
diff --git a/gcc/common/config/arm/arm-common.c 
b/gcc/common/config/arm/arm-common.c
index 86673b7..5efb55e 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -97,6 +97,28 @@ arm_rewrite_mcpu (int argc, const char **argv)
   return arm_rewrite_selected_cpu (argv[argc - 1]);
 }
 
+/* Called by driver to check whether the target denoted by current
+   command line options is thumb-only target.  If -march present,
+   check the last -march option.  If no -march, check the last -mcpu
+   option.  */
+const char *
+arm_target_thumb_only (int argc, const char **argv)
+{
+  unsigned int opt;
+
+  if (argc)
+{
+  for (opt = 0; opt < (ARRAY_SIZE (arm_arch_core_flags) - 1); opt++)
+   if ((strcmp (argv[argc - 1], arm_arch_core_flags[opt].name) == 0)
+   && ((arm_arch_core_flags[opt].flags & FL_NOTM) == 0))
+ return "-mthumb";
+
+  return NULL;
+}
+  else
+return NULL;
+}
+
 #undef ARM_CPU_NAME_LENGTH
 
 
diff --git a/gcc/config/arm/arm-flags.h b/gcc/config/arm/arm-flags.h
new file mode 100644
index 000..e206ad1
--- /dev/null
+++ b/gcc/config/arm/arm-flags.h
@@ -0,0 +1,92 @@
+/* Flags used to identify the presence of processor capabilities.
+
+   Copyright (C) 2015 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should 

Re: [PATCH][wwwdocs] Update 5.0 changes.html with Thumb1 UAL

2015-04-12 Thread Terry Guo
On Sat, Apr 11, 2015 at 5:48 AM, Gerald Pfeifer  wrote:
> Hi Terry,
>
> I went ahead and committed some small changes to the description of
> -masm-syntax-unified.  Let me know if you disagree or would like to
> see further changes.
>
> Gerald
>

Thanks for the improvement. I am totally ok with them.

BR,
Terry

> Index: changes.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
> retrieving revision 1.101
> diff -u -r1.101 changes.html
> --- changes.html9 Apr 2015 23:30:47 -   1.101
> +++ changes.html10 Apr 2015 21:47:01 -
> @@ -636,8 +636,8 @@
>
>  ARM
>   
> -   The Thumb-1 assembly code is now generated in unified syntax. The 
> new option
> --masm-syntax-unified can be used to specify whether 
> inline assembly
> +  Thumb-1 assembly code is now generated in unified syntax. The new 
> option
> +-masm-syntax-unified specifies whether inline assembly
>  code is using unified syntax. By default the option is off which 
> means
>  non-unified syntax is used. However this is subject to change in 
> future releases.
>  Eventually the non-unified syntax will be deprecated.


RE: [PATCH, ARM, testsuite] Improve scd42-1.c for UAL

2015-01-25 Thread Terry Guo


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Ramana Radhakrishnan
> Sent: Friday, January 23, 2015 6:35 PM
> To: Tony Liu
> Cc: gcc-patches; Ramana Radhakrishnan; Richard Earnshaw
> Subject: Re: [PATCH, ARM, testsuite] Improve scd42-1.c for UAL
> 
> On Thu, Jan 15, 2015 at 12:10 PM, Tony Liu  wrote:
> > Hi,
> >
> > This is the patch to improve the test case gcc.target/arm/scd42-1.c
> > for both UAL and non-UAL. It now checks UAL format assembly code for
> > Thumb1 and
> > Thumb2 while non-UAL format assembly code for ARM mode.
> 
> 
> OK.
> 
> Ramana
> >
> > With this patch, the test passes for both cases.
> >
> > Thanks,
> > Tony
> >
> > 2015-01-15  Tony Liu  
> >
> >* gcc.target/arm/scd42-1.c: Improve the check for UAL and
> > non-UAL cases.

Committed this patch on behalf of Tony:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=220102

BR,
Terry






[Patch][wwwdocs]Deprecate the ARM TPCS related options in gcc 5.0

2015-01-26 Thread Terry Guo
Hi there,

This patch updates the gcc 5.0 changes.html to deprecate the TPCS-related
options, because TPCS is obsolete per the ABI document at
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf.
Is it OK?

BR,
Terry

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.72
diff -u -r1.72 changes.html
--- htdocs/gcc-5/changes.html   25 Jan 2015 23:47:32 -  1.72
+++ htdocs/gcc-5/changes.html   26 Jan 2015 03:23:35 -
@@ -501,8 +501,9 @@
The deprecated option -mwords-little-endian
has been removed.
   
-   The options relating to the old ABI -mapcs and
-  -mapcs-frame have been deprecated.
+   The options relating to the old ABI -mapcs,
+  -mapcs-frame, -mtpcs-frame and
+  -mtpcs-leaf-frame have been deprecated.
   
   The transitional options -mlra and -mno-lra
have been removed. The ARM backend now uses the local register allocator


RE: [Patch][wwwdocs]Deprecate the ARM TPCS related options in gcc 5.0

2015-01-27 Thread Terry Guo


> -Original Message-
> From: Gerald Pfeifer [mailto:ger...@pfeifer.com]
> Sent: Monday, January 26, 2015 7:34 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan
> Subject: Re: [Patch][wwwdocs]Deprecate the ARM TPCS related options in
> gcc 5.0
> 
> On Monday 2015-01-26 16:47, Terry Guo wrote:
> > This patch intends to update gcc 5.0 change.html to deprecate TPCS
> > related options because TPCS is obsoleted per the ABI document at
> >
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapc
> s.pdf.
> > Is it OK?
> 
> From a language perspective I suggest to say "The options < here>>
> related to the old ABI..." or "The options related to the old ABI --
> <> -- ...", where I somewhat prefer the former.
> 
> Please wait for Richard or Ramana for final review and approval.
> 
> Gerald

Thanks Gerald. Patch is updated. Is this one OK?

BR,
Terry

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.73
diff -u -p -r1.73 changes.html
--- htdocs/gcc-5/changes.html   26 Jan 2015 09:40:03 -  1.73
+++ htdocs/gcc-5/changes.html   27 Jan 2015 09:35:32 -
@@ -513,8 +513,9 @@ void operator delete[] (void *, std::siz
The deprecated option -mwords-little-endian
has been removed.
   
-   The options relating to the old ABI -mapcs and
-  -mapcs-frame have been deprecated.
+   The options -mapcs, -mapcs-frame,
+  -mtpcs-frame and -mtpcs-leaf-frame
+  which are only applicable to the old ABI have been deprecated.
   
   The transitional options -mlra and
-mno-lra
have been removed. The ARM backend now uses the local register
allocator





RE: [Patch][wwwdocs]Deprecate the ARM TPCS related options in gcc 5.0

2015-01-28 Thread Terry Guo


> -Original Message-
> From: Gerald Pfeifer [mailto:ger...@pfeifer.com]
> Sent: Thursday, January 29, 2015 2:53 AM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan
> Subject: RE: [Patch][wwwdocs]Deprecate the ARM TPCS related options in
> gcc 5.0
> 
> On Wednesday 2015-01-28 09:57, Terry Guo wrote:
> > Thanks Gerald. Patch is updated. Is this one OK?
> 
> This looks good to me.  (Perhaps say "which were only applicable", since
> they are gone now?)
> 
> Gerald

Thanks Gerald. The patch is committed. Because the options have not been
removed from GCC yet, I am not using "which were". We give a warning first
and remove the code from GCC later, so users have a chance to update their
existing projects.

BR,
Terry 





[Patch][ARM]Don't put volatile memory access in IT block for cortex-m7

2015-02-12 Thread Terry Guo
Hi there,

This patch prevents gcc from putting volatile memory accesses into an
IT block for targets like cortex-m7.

gcc/ChangeLog:

2015-02-12  Terry Guo  

* config/arm/arm.c (arm_tune_cortex_m7): New global variable.
* config/arm/arm.h (TARGET_NO_VOLATILE_CE): New macro.
(arm_tune_cortex_m7): Declare new global variable.
* config/arm/arm.md (arm_comparison_operator): Disabled if not allow
 volatile memory access in IT block.

gcc/testsuite/ChangeLog:

2015-02-12  Terry Guo  

* gcc.target/arm/cortex-m7-it-volatile.c: New test.

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 297dfe1..d6b854d 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -290,6 +290,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
 
 #define TARGET_CRC32   (arm_arch_crc)
 
+/* Targets that don't support accessing volatile memory inside IT block.  */
+#define TARGET_NO_VOLATILE_CE  (arm_tune_cortex_m7)
+
 /* The following two macros concern the ability to execute coprocessor
instructions for VFPv3 or NEON.  TARGET_VFP3/TARGET_VFPD32 are currently
only ever tested when we know we are generating for VFP hardware; we need
@@ -552,6 +555,9 @@ extern int arm_tune_wbuf;
 /* Nonzero if tuning for Cortex-A9.  */
 extern int arm_tune_cortex_a9;
 
+/* Nonzero if tuning for Cortex-M7.  */
+extern int arm_tune_cortex_m7;
+
 /* Nonzero if we should define __THUMB_INTERWORK__ in the
preprocessor.
XXX This is a bit of a hack, it's intended to help work around
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7bf5b4d..081ccec 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -846,6 +846,9 @@ int arm_tune_wbuf = 0;
 /* Nonzero if tuning for Cortex-A9.  */
 int arm_tune_cortex_a9 = 0;
 
+/* Nonzero if tuning for Cortex-M7.  */
+int arm_tune_cortex_m7 = 0;
+
 /* Nonzero if generating Thumb instructions.  */
 int thumb_code = 0;
 
@@ -2859,7 +2862,8 @@ arm_option_override (void)
   arm_arch_iwmmxt2 = (insn_flags & FL_IWMMXT2) != 0;
   arm_arch_thumb_hwdiv = (insn_flags & FL_THUMB_DIV) != 0;
   arm_arch_arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
-  arm_tune_cortex_a9 = (arm_tune == cortexa9) != 0;
+  arm_tune_cortex_a9 = (arm_tune == cortexa9);
+  arm_tune_cortex_m7 = (arm_tune == cortexm7);
   arm_arch_crc = (insn_flags & FL_CRC32) != 0;
   arm_m_profile_small_mul = (insn_flags & FL_SMALLMUL) != 0;
   if (arm_restrict_it == 2)
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index c13e9b2..164ac13 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -10755,7 +10755,8 @@
   [(match_operator 0 "arm_comparison_operator"
 [(match_operand 1 "cc_register" "")
  (const_int 0)])]
-  "TARGET_32BIT"
+  "TARGET_32BIT
+   && (!TARGET_NO_VOLATILE_CE || !volatile_refs_p (PATTERN (insn)))"
   ""
 [(set_attr "predicated" "yes")]
 )
diff --git a/gcc/testsuite/gcc.target/arm/cortex-m7-it-volatile.c 
b/gcc/testsuite/gcc.target/arm/cortex-m7-it-volatile.c
new file mode 100644
index 000..206afdb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/cortex-m7-it-volatile.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-options "-Os -mthumb -mcpu=cortex-m7" } */
+
+int
+foo (int a, int b, volatile int *c, volatile int *d)
+{
+  if (a > b)
+return c[0];
+  else
+return d[0];
+}
+
+/* { dg-final { scan-assembler-not "ldrgt" } } */
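
The condition added to arm.md can be read as a small boolean predicate. The sketch below models it in plain C under that assumption; the function and parameter names are hypothetical, and the real check operates on RTL through volatile_refs_p rather than on a boolean.

```c
#include <stdbool.h>

/* Model of the new insn condition:
   TARGET_32BIT && (!TARGET_NO_VOLATILE_CE || !volatile_refs_p (...)).
   Returns true when the instruction may be conditionally executed.  */
bool
sketch_may_predicate (bool target_32bit, bool tune_cortex_m7,
                      bool insn_has_volatile_ref)
{
  /* In the patch, TARGET_NO_VOLATILE_CE expands to arm_tune_cortex_m7.  */
  bool target_no_volatile_ce = tune_cortex_m7;

  return target_32bit
         && (!target_no_volatile_ce || !insn_has_volatile_ref);
}
```

Only the combination "tuning for Cortex-M7 and the instruction references volatile memory" is newly rejected; every other case behaves as before, which is why the test only needs to scan for the absence of a predicated load such as ldrgt.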


Re: Ping : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-02-15 Thread Terry Guo
On Fri, Feb 13, 2015 at 5:06 PM, Richard Sandiford
 wrote:
> Segher Boessenkool  writes:
>> On Thu, Feb 12, 2015 at 03:54:21PM +, Richard Sandiford wrote:
>>> "Hale Wang"  writes:
>>> > Ping?
>>
>> It's not a regression (or is it?), so it is not appropriate for stage4.
>>
>>
>>> >> diff --git a/gcc/combine.c b/gcc/combine.c index 5c763b4..6901ac2 100644
>>> >> --- a/gcc/combine.c
>>> >> +++ b/gcc/combine.c
>>> >> @@ -1904,6 +1904,12 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3,
>>> >> rtx_insn *pred ATTRIBUTE_UNUSED,
>>> >>set = expand_field_assignment (set);
>>> >>src = SET_SRC (set), dest = SET_DEST (set);
>>> >>
>>> >> +  /* Don't combine if dest contains a user specified register, because
>>> > the
>>> >> + user specified register (same with dest) in i3 would be replaced by
>>> > the
>>> >> + src of insn which might be different with the user's expectation.
>>> >> + */  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P
>>> >> (dest))
>>> >> +return 0;
>>>
>>> I suppose this is similar to Andrew's comment, but I think the rule
>>> is that it's invalid to replace a REG_USERVAR_P operand in an inline asm.
>>
>> Why not?  You probably mean register asm, not all user variables?
>
> Yeah, meant hard REG_USERVAR_P, sorry, as for the patch.
>
>>> Outside of an inline asm we make no guarantee about whether something is
>>> stored in a particular register or not.
>>>
>>> So IMO we should be checking whether either INSN or I3 is an asm as well
>>> as the above.
>>
>> [ INSN can never be an asm, that is already refused by can_combine_p. ]
>>
>> We do not guarantee things will end up in the specified reg (except for asm),
>> but will it hurt to leave things in the reg the user said it should be in, 
>> even
>> if we do not guarantee this behaviour?
>
> Whether it does not, making the test unnecessarily wide is at best only
> going to paper over problems elsewhere.  I really think we should test
> for i3 being an asm.
>
> Thanks,
> Richard

Thanks for reviewing. Hale wants me to continue his work because he
will be on holiday for the next ten days. The check for asm has been
added. Is this one OK?

BR,
Terry


pr64818-combine-user-specified-register.patch-3
Description: Binary data


Re: Ping : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-02-16 Thread Terry Guo
On Sun, Feb 15, 2015 at 7:35 PM, Segher Boessenkool
 wrote:
> Hi Terry,
>
> I still think this is stage1 material.
>
>> +  /* Don't combine if dest contains a user specified register and i3 
>> contains
>> + ASM_OPERANDS, because the user specified register (same with dest) in 
>> i3
>> + would be replaced by the src of insn which might be different with
>> + the user's expectation.  */
>
> "Do not eliminate a register asm in an asm input" or similar?  Text
> explaining why REG_USERVAR_P && HARD_REGISTER_P works here would be
> good to have, too.
>
>> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
>> +  && (GET_CODE (PATTERN (i3)) == SET
>> +   && GET_CODE (SET_SRC (PATTERN (i3))) == ASM_OPERANDS))
>> +return 0;
>
> That works only for asms with exactly one output.  You want
> extract_asm_operands.
>
>
> Segher

Thanks Segher. The patch is updated per your suggestion. Is this one OK for stage 1?

BR,
Terry


pr64818-combine-user-specified-register.patch-4
Description: Binary data


RE: [2/2][PATCH,ARM]Generate UAL assembly code for Thumb-1 target

2014-11-07 Thread Terry Guo


> -Original Message-
> From: Christian Bruel [mailto:christian.br...@st.com]
> Sent: Friday, November 07, 2014 5:27 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [2/2][PATCH,ARM]Generate UAL assembly code for Thumb-1
> target
> 
> hi,
> 
> the ARM bootstrap seems to fail for libgcc2.c on the thumb multilib for
> libgcc2: muldi3 -mthumb -O2  -g
> 
> /tmp/ccYrycUw.s: Assembler messages:
> /tmp/ccYrycUw.s:69: Error: MOV Rd, Rs with two low registers is not
> permitted on this architecture -- `mov r6,r7'
> 
> preprocessed attached.
> 
> Thanks
> 
> Christian

Many thanks. I am looking into it now.

BR,
Terry






[Patch,ARM/Thumb1]Fix 'mov' instruction for Thumb-1 UAL

2014-11-11 Thread Terry Guo
Hi there,

The attached patch intends to fix the trunk failure below, caused by the
recent Thumb-1 UAL patch:

/tmp/cc9EfnXy.s: Assembler messages:
/tmp/cc9EfnXy.s:69: Error: MOV Rd, Rs with two low registers is not
permitted on this architecture -- `mov r6,r7'

Now for pre-v6 Thumb-1, the 'movs' will be used rather than the 'mov'.
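
As a rough illustration of the *thumb_mulsi3 change in this patch, the sketch below picks the output template per constraint alternative. The function name is hypothetical and the ";" separator stands in for the "\;" used in the machine description; it only mirrors the three alternatives ("l", "*h" and "0") and is not the generated compiler code.

```c
#include <stddef.h>

/* Return the assembly template for a pre-v6 Thumb-1 multiply,
   indexed by the matched constraint alternative.  */
const char *
sketch_thumb1_mulsi3_template (int which_alternative)
{
  switch (which_alternative)
    {
    case 0:
      /* Source in a low register: unified syntax needs the
         flag-setting form for a low-to-low move before ARMv6.  */
      return "movs\t%0, %1;muls\t%0, %2";
    case 1:
      /* Source in a high register: plain mov is accepted.  */
      return "mov\t%0, %1;muls\t%0, %2";
    case 2:
      /* Operand 1 is already tied to the destination.  */
      return "muls\t%0, %2";
    default:
      return NULL;
    }
}
```

Splitting the first two alternatives is what removes the "mov r6,r7" form the assembler rejected, since both registers in that case are low registers.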

The multilibs for ARM/Thumb1/hard-float can all be built. Regression
tested on armv4t Thumb and v6m Thumb with no regressions. Is it OK for
trunk?

BR,
Terry

2014-11-11  Terry Guo  

 * doc/invoke.texi (-masm-syntax-unified): Reword and fix typo.
 * config/arm/thumb1.md (*thumb_mulsi3): Use movs to move low registers.
 (*thumb1_movhf): Likewise.

diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 8a2abe9..3d6f80b 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -131,12 +131,10 @@
(mult:SI (match_operand:SI 1 "register_operand" "%l,*h,0")
 (match_operand:SI 2 "register_operand" "l,l,l")))]
  "TARGET_THUMB1 && !arm_arch6"
-  "*
-  if (which_alternative < 2)
-return \"mov\\t%0, %1\;muls\\t%0, %2\";
-  else
-return \"muls\\t%0, %2\";
-  "
+  "@
+   movs\\t%0, %1\;muls\\t%0, %2
+   mov\\t%0, %1\;muls\\t%0, %2
+   muls\\t%0, %2"
   [(set_attr "length" "4,4,2")
(set_attr "type" "muls")]
 )
@@ -787,6 +785,8 @@
   "*
   switch (which_alternative)
 {
+case 0:
+  return \"movs\\t%0, %1\";
 case 1:
   {
rtx addr;
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd20b6e..13270bc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13040,13 +13040,11 @@ off by default.
 
 @item -masm-syntax-unified
 @opindex masm-syntax-unified
-Assume the Thumb1 inline assembly code are using unified syntax.
-The default is currently off, which means divided syntax is assumed.
+Assume inline assembler is using unified asm syntax.  The default is
+currently off which implies divided syntax.  Currently this option is
+available only for Thumb1 and has no effect on ARM state and Thumb2.
 However, this may change in future releases of GCC.  Divided syntax
-should be considered deprecated.  This option has no effect when
-generating Thumb2 code.  Thumb2 assembly code always uses unified syntax.
-This option has no effect for ARM state assembly code which will still
-uses divided syntax.
+should be considered deprecated.
 
 @item -mrestrict-it
 @opindex mrestrict-it


RE: [2/2][PATCH,ARM]Generate UAL assembly code for Thumb-1 target

2014-11-11 Thread Terry Guo


> -Original Message-
> From: Terry Guo [mailto:terry@arm.com]
> Sent: Friday, November 07, 2014 6:01 PM
> To: 'Christian Bruel'
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [2/2][PATCH,ARM]Generate UAL assembly code for Thumb-1
> target
> 
> 
> 
> > -Original Message-
> > From: Christian Bruel [mailto:christian.br...@st.com]
> > Sent: Friday, November 07, 2014 5:27 PM
> > To: Terry Guo
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [2/2][PATCH,ARM]Generate UAL assembly code for Thumb-1
> > target
> >
> > hi,
> >
> > the ARM bootstrap seems to fail for libgcc2.c on the thumb multilib
> > for
> > libgcc2: muldi3 -mthumb -O2  -g
> >
> > /tmp/ccYrycUw.s: Assembler messages:
> > /tmp/ccYrycUw.s:69: Error: MOV Rd, Rs with two low registers is not
> > permitted on this architecture -- `mov r6,r7'
> >
> > preprocessed attached.
> >
> > Thanks
> >
> > Christian
> 
> Many thanks. I am looking into it now.
> 
> BR,
> Terry

Fix is committed to trunk at 
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=217341.

BR,
Terry





[Patch, ARM]Add pipeline description for ARM Cortex-M7

2014-11-12 Thread Terry Guo
Hi there,

The attached patch adds a pipeline description for the ARM Cortex-M7 MCU.
Is it OK for trunk?

BR,
Terry

2014-11-12  Terry Guo  

* config/arm/arm.c (arm_issue_rate): Return 2 for cortex-m7.
* config/arm/arm.md (generic_sched): Exclude cortex-m7.
(generic_vfp): Likewise.
* config/arm/cortex-m7.md: New pipeline description for cortex-m7.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3f2ddd4..8ad2690 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -29925,6 +29925,7 @@ arm_issue_rate (void)
 case cortexa57:
   return 3;
 
+case cortexm7:
 case cortexr4:
 case cortexr4f:
 case cortexr5:
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 8106943..c028405 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -377,7 +377,11 @@
 
 (define_attr "generic_sched" "yes,no"
   (const (if_then_else
-  (ior (eq_attr "tune" 
"fa526,fa626,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1020e,arm1026ejs,arm1136js,arm1136jfs,cortexa5,cortexa7,cortexa8,cortexa9,cortexa12,cortexa15,cortexa53,cortexm4,marvell_pj4")
+  (ior (eq_attr "tune" "fa526,fa626,fa606te,fa626te,fmp626,fa726te,\
+arm926ejs,arm1020e,arm1026ejs,arm1136js,\
+arm1136jfs,cortexa5,cortexa7,cortexa8,\
+cortexa9,cortexa12,cortexa15,cortexa53,\
+cortexm4,cortexm7,marvell_pj4")
   (eq_attr "tune_cortexr4" "yes"))
   (const_string "no")
   (const_string "yes"
@@ -385,7 +389,9 @@
 (define_attr "generic_vfp" "yes,no"
   (const (if_then_else
  (and (eq_attr "fpu" "vfp")
-  (eq_attr "tune" 
"!arm1020e,arm1022e,cortexa5,cortexa7,cortexa8,cortexa9,cortexa53,cortexm4,marvell_pj4")
+  (eq_attr "tune" "!arm1020e,arm1022e,cortexa5,cortexa7,\
+cortexa8,cortexa9,cortexa53,cortexm4,\
+cortexm7,marvell_pj4")
   (eq_attr "tune_cortexr4" "no"))
  (const_string "yes")
  (const_string "no"
@@ -409,6 +415,7 @@
 (include "cortex-a53.md")
 (include "cortex-r4.md")
 (include "cortex-r4f.md")
+(include "cortex-m7.md")
 (include "cortex-m4.md")
 (include "cortex-m4-fpu.md")
 (include "vfp11.md")
diff --git a/gcc/config/arm/cortex-m7.md b/gcc/config/arm/cortex-m7.md
new file mode 100644
index 000..aab1da1
--- /dev/null
+++ b/gcc/config/arm/cortex-m7.md
@@ -0,0 +1,181 @@
+;; ARM Cortex-M7 pipeline description
+;; Copyright (C) 2014 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_automaton "cortex_m7")
+
+;; We model the dual-issue constraints of this core with
+;; following units.
+
+(define_cpu_unit "cm7_i0, cm7_i1" "cortex_m7")
+(define_cpu_unit "cm7_a0, cm7_a1" "cortex_m7")
+(define_cpu_unit "cm7_branch,cm7_wb,cm7_ext,cm7_shf" "cortex_m7")
+(define_cpu_unit "cm7_lsu" "cortex_m7")
+(define_cpu_unit "cm7_mac" "cortex_m7")
+(define_cpu_unit "cm7_fpu" "cortex_m7")
+
+(define_reservation "cm7_all_units"
+"cm7_i0+cm7_i1+cm7_a0+cm7_a1+cm7_branch\
+ +cm7_wb+cm7_ext+cm7_shf+cm7_lsu+cm7_mac\
+ +cm7_fpu")
+
+;; Simple alu instruction without inline shift operation.
+(define_insn_reservation "cortex_m7_alu_simple" 2
+  (and (eq_attr "tune" "cortexm7")
+   (eq_attr "type" "alu_imm,alus_imm,logic_imm,logics_imm,\
+alu_sreg,alus_sreg,logic_reg,logics_reg,\
+adc_imm,adcs_imm,adc_reg,adcs_reg,\
+adr,bfm,rev,\
+shift_imm,shift_reg,\
+mov_imm,mov_reg,mvn_imm,mvn_reg,\
+mov_shift_reg,mov_shift,\
+mvn_shift,mvn_shif

Re: add taishanv110 pipeline scheduling

2018-12-05 Thread Terry Guo
On Thu, Dec 6, 2018 at 9:31 AM wuyuan (E)  wrote:
>
> Hi ARM maintainers:
> The taishanv110 core uses generic pipeline scheduling, which
> restricts its performance. Adding a pipeline scheduling description
> for the taishanv110 core to GCC improves its performance.
> The patch is as follows; please merge.
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> old mode 100644
> new mode 100755
> index c4ec556..d6cf1d3
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2018-12-05  wuyuan  
> +

Better be "Wu Yuan"

> +   * config/aarch64/aarch64-cores.def: New CPU.
> +   * config/aarch64/aarch64.md : Add "tsv110.md"
> +   * gcc/config/aarch64/tsv110.md : pipeline description
> +
Can remove the "gcc/" part.

> 2018-11-26  David Malcolm  
>
>   * dump-context.h (dump_context::dump_loc): Convert 1st param from
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index 74be5db..8e84844 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -99,7 +99,7 @@ AARCH64_CORE("ares",  ares, cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_F
> /* ARMv8.4-A Architecture Processors.  */
>
> /* HiSilicon ('H') cores. */
> -AARCH64_CORE("tsv110", tsv110,cortexa57,8_4A, 
> AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES 
> | AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
> +AARCH64_CORE("tsv110", tsv110,tsv110,8_4A, 
> AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES 
> | AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
>
> /* Qualcomm ('Q') cores. */
> AARCH64_CORE("saphira", saphira,saphira,8_4A,  
> AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   
> 0x51, 0xC01, -1)
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 82af4d4..5278d6b 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -348,7 +348,7 @@
> (include "thunderx.md")
> (include "../arm/xgene1.md")
> (include "thunderx2t99.md")
> -
> +(include "tsv110.md")
> ;; ---
> ;; Jumps and other miscellaneous insns
> ;; ---
> diff --git a/gcc/config/aarch64/tsv110.md b/gcc/config/aarch64/tsv110.md
> new file mode 100644
> index 000..e912447
> --- /dev/null
> +++ b/gcc/config/aarch64/tsv110.md
> @@ -0,0 +1,708 @@
> +;; tsv110 pipeline description
> +;; Copyright (C) 2014-2016 Free Software Foundation, Inc.
> +;;

Given this is a new file, I think the copyright year should be updated.

BR,
Terry

> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it
> +;; under the terms of the GNU General Public License as published by
> +;; the Free Software Foundation; either version 3, or (at your option)
> +;; any later version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but
> +;; WITHOUT ANY WARRANTY; without even the implied warranty of
> +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +;; General Public License for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +(define_automaton "tsv110")
> +
> +(define_attr "tsv110_neon_type"
> +  "neon_arith_acc, neon_arith_acc_q,
> +   neon_arith_basic, neon_arith_complex,
> +   neon_reduc_add_acc, neon_multiply, neon_multiply_q,
> +   neon_multiply_long, neon_mla, neon_mla_q, neon_mla_long,
> +   neon_sat_mla_long, neon_shift_acc, neon_shift_imm_basic,
> +   neon_shift_imm_complex,
> +   neon_shift_reg_basic, neon_shift_reg_basic_q, neon_shift_reg_complex,
> +   neon_shift_reg_complex_q, neon_fp_negabs, neon_fp_arith,
> +   neon_fp_arith_q, neon_fp_reductions_q, neon_fp_cvt_int,
> +   neon_fp_cvt_int_q, neon_fp_cvt16, neon_fp_minmax, neon_fp_mul,
> +   neon_fp_mul_q, neon_fp_mla, neon_fp_mla_q, neon_fp_recpe_rsqrte,
> +   neon_fp_recpe_rsqrte_q, neon_fp_recps_rsqrts, neon_fp_recps_rsqrts_q,
> +   neon_bitops, neon_bitops_q, neon_from_gp,
> +   neon_from_gp_q, neon_move, neon_tbl3_tbl4, neon_zip_q, neon_to_gp,
> +   neon_load_a, neon_load_b, neon_load_c, neon_load_d, neon_load_e,
> +   neon_load_f, neon_store_a, neon_store_b, neon_store_complex,
> +   unknown"
> +  (cond [
> + (eq_attr "type" "neon_arith_acc, neon_reduc_add_acc,\
> +  neon_reduc_add_acc_q")
> +   (const_string "neon_arith_acc")
> + (eq_attr "type" "neon_arith_acc_q")
> +   (const_string "neon_arith_acc_q")
> + (eq_attr "type" "neon_abs,neon_abs_q,neon_add, neon_add_q, 
> neon_add_long,\
> +  neon_add_widen, neon_neg, neon_neg_q,\
> +  neon

[PATCH][x86_64] Fix PR87853, _mm_cmpgt_epi8 broken with -funsigned-char

2018-11-04 Thread Terry Guo
Hi there,

This patch fixes PR87853 by introducing a new 'signed char' vector type, so
that the comparison intrinsics are not affected by option -funsigned-char.
Tested with bootstrap and regression tests on x86_64. No regressions.

Is it OK for trunk and the release branch?

BR,
Terry

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ac121a8..dc10a11 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2018-11-05  Xuepeng Guo  
+
+   PR target/87853
+   * config/i386/emmintrin.h (__v16qs): New to cope with option
+   -funsigned-char.
+   (_mm_cmpeq_epi8): Replace __v16qi with __v16qs.
+   (_mm_cmplt_epi8): Likewise.
+   (_mm_cmpgt_epi8): Likewise.
+
 2018-11-04  Bernd Edlinger  

PR tree-optimization/86572
diff --git a/gcc/config/i386/emmintrin.h b/gcc/config/i386/emmintrin.h
index 7a6ff80..3c1f04b 100644
--- a/gcc/config/i386/emmintrin.h
+++ b/gcc/config/i386/emmintrin.h
@@ -45,6 +45,7 @@ typedef unsigned int __v4su __attribute__
((__vector_size__ (16)));
 typedef short __v8hi __attribute__ ((__vector_size__ (16)));
 typedef unsigned short __v8hu __attribute__ ((__vector_size__ (16)));
 typedef char __v16qi __attribute__ ((__vector_size__ (16)));
+typedef signed char __v16qs __attribute__ ((__vector_size__ (16)));
 typedef unsigned char __v16qu __attribute__ ((__vector_size__ (16)));

 /* The Intel API is flexible enough that we must allow aliasing with other
@@ -1295,7 +1296,7 @@ _mm_xor_si128 (__m128i __A, __m128i __B)
 extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
 _mm_cmpeq_epi8 (__m128i __A, __m128i __B)
 {
-  return (__m128i) ((__v16qi)__A == (__v16qi)__B);
+  return (__m128i) ((__v16qs)__A == (__v16qs)__B);
 }

 extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
@@ -1313,7 +1314,7 @@ _mm_cmpeq_epi32 (__m128i __A, __m128i __B)
 extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
 _mm_cmplt_epi8 (__m128i __A, __m128i __B)
 {
-  return (__m128i) ((__v16qi)__A < (__v16qi)__B);
+  return (__m128i) ((__v16qs)__A < (__v16qs)__B);
 }

 extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
@@ -1331,7 +1332,7 @@ _mm_cmplt_epi32 (__m128i __A, __m128i __B)
 extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
 _mm_cmpgt_epi8 (__m128i __A, __m128i __B)
 {
-  return (__m128i) ((__v16qi)__A > (__v16qi)__B);
+  return (__m128i) ((__v16qs)__A > (__v16qs)__B);
 }

 extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
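
To illustrate why the element type matters, here is a minimal scalar sketch of
a per-lane "greater than" comparison. It is not the emmintrin.h code: the
function names are hypothetical, and each lane is modelled as one raw byte.
The byte 0xFF compares as -1 when lanes are signed and as 255 when they are
unsigned, which is the difference -funsigned-char introduces for a plain
'char' vector.

```c
#include <assert.h>

/* Hypothetical scalar model of one lane of a byte comparison; this is
   not the emmintrin.h implementation.  A lane result is 0xFF when the
   comparison holds and 0x00 otherwise.  */

/* Reinterpret a raw byte as a two's-complement signed value without
   relying on implementation-defined conversions.  */
static int
as_signed (unsigned char byte)
{
  return (int) (byte ^ 0x80) - 0x80;
}

/* Lane compared as 'signed char': the behaviour _mm_cmpgt_epi8 needs.  */
unsigned char
cmpgt_lane_signed (unsigned char a, unsigned char b)
{
  return as_signed (a) > as_signed (b) ? 0xFF : 0x00;
}

/* Lane compared as 'unsigned char': what a plain 'char' lane turns
   into when -funsigned-char is in effect.  */
unsigned char
cmpgt_lane_unsigned (unsigned char a, unsigned char b)
{
  return a > b ? 0xFF : 0x00;
}

/* Usage: the same two bytes give opposite answers.  */
void
example_cmpgt_lane (void)
{
  assert (cmpgt_lane_signed (0xFF, 0x01) == 0x00);   /* -1 > 1 is false */
  assert (cmpgt_lane_unsigned (0xFF, 0x01) == 0xFF); /* 255 > 1 is true */
}
```

Typing the vector elements as 'signed char' explicitly, as the patch does with
__v16qs, keeps the signed interpretation regardless of the option.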


Re: [PATCH] x86: Optimize VFIXUPIMM* patterns with multiple-alternative constraints

2018-11-09 Thread Terry Guo
On Fri, Nov 9, 2018 at 6:05 PM Uros Bizjak  wrote:
>
> On Fri, Nov 9, 2018 at 10:54 AM Wei Xiao  wrote:
> >
> > Hi Uros
> >
> > Thanks for the remarks!
> > I improve the patch as attached to address the issues you mentioned:
> > 1. No changes to substs any more.
> > 2. Adopt established approach (e.g "rcp14") 
> > to
> > handle zero masks.
> >
> > I'd like to explain our motivation of combining vfixupimm patterns: there 
> > will
> > be a lot of new x86 instructions with both masking and rounding like 
> > vfixupimm
> > in the future but we still want to keep x86 MD as short as possible and 
> > don't
> > want to write 2 patterns for each of these new instructions, which will also
> > raise code review cost for maintainer. We want to make sure the new pattern
> > paradigm is ok for x86 maintainer through this patch.
>
> Yes, the patch looks much nicer now.
>
> +2018-11-09 Wei Xiao 
> + *config/i386/sse.md: Combine VFIXUPIMM* patterns
> + (_fixupimm_maskz): Update.
> + (_fixupimm): Update.
> + (_fixupimm_mask): Remove.
> + (avx512f_sfixupimm_maskz): Update.
> + (avx512f_sfixupimm): Update.
> + (avx512f_sfixupimm_mask): Remove.
>
> (In future,  please add ChangeLog entry to the text of the mail).
>
> OK for mainline.
>
> Thanks,
> Uros.

Thanks for the review. In future we will pay more attention to following
the convention.

BR,
Terry


Re: [PATCH] x86-64: Use TI->SF and TI->DF conversions in soft-fp

2019-01-21 Thread Terry Guo
On Tue, Jan 22, 2019 at 7:48 AM Joseph Myers  wrote:
>
> On Mon, 21 Jan 2019, H.J. Lu wrote:
>
> > TI->SF and TI->DF conversions in libgcc2.c:
> >
> > FSTYPE
> > FUNC (DWtype u)
> > {
> >   ...
> > }
> >
> > have no rounding mode support.  We should replace __floattisf, __floattidf,
> > __floatuntisf and __floatuntidf in libgcc2.c with these from soft-fp.
>
> Please explain what you mean by "have no rounding mode support" (i.e., the
> exact flow through a function that is incorrect in a non-default rounding
> mode).  This patch is missing testcases - which of course should be
> architecture-independent.  (Any bug in libgcc2.c should first have an
> architecture-independent fix - it can't be considered fixed based on a fix
> for one architecture.  Then, if some other approach is optimal on
> particular architectures, they can get optimized variants.)
>
> I believe all those function implementations are designed so that only a
> single rounding occurs, which is for the final result, so no explicit
> handling of rounding modes is ever needed (the integer code before then
> may set up sticky bits appropriately to ensure the floating-point parts of
> the code only need a single rounding, which works in all modes), but maybe
> there are bugs in certain cases.  To identify the correct fix, we need
> details of the exact code path being used (the exact values of the various
> macros, choices for the various conditional parts of the function, values
> each variable has at each point) and where the existing,
> rounding-mode-independent logic goes wrong.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com

Hi Joseph,

I believe HJ is proposing a patch to fix bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88931. The test case in that
bug uses "#pragma STDC FENV_ACCESS ON" and exercises four rounding modes:

  {
ROUNDING (FE_DOWNWARD),
ROUNDING (FE_UPWARD),
ROUNDING (FE_TOWARDZERO),
ROUNDING (FE_TONEAREST)
  }

The current __floattisf from libgcc2 doesn't support those four rounding modes.

BR,
Terry
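
As a rough illustration of what honouring the four modes means for an
integer-to-float conversion, the sketch below rounds an integer to 24
significant bits in software. It is not the libgcc2 or soft-fp code:
round_to_24_bits and enum round_mode are hypothetical names, and the 24-bit
width assumes the target float is IEEE binary32. The modes only differ when
the integer needs more than 24 significant bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for FE_DOWNWARD, FE_UPWARD, FE_TOWARDZERO and
   FE_TONEAREST; this sketch does not touch the real FP environment.  */
enum round_mode { RM_DOWNWARD, RM_UPWARD, RM_TOWARDZERO, RM_TONEAREST };

/* Round V to a value with at most 24 significant bits, as an exact
   conversion to an IEEE binary32 float would have to.  Assumes
   |V| < 2^62 so that rounding away from zero cannot overflow.  */
int64_t
round_to_24_bits (int64_t v, enum round_mode mode)
{
  int negative = v < 0;
  uint64_t mag = negative ? 0 - (uint64_t) v : (uint64_t) v;

  /* Count the significant bits of the magnitude.  */
  int bits = 0;
  for (uint64_t t = mag; t != 0; t >>= 1)
    bits++;
  if (bits <= 24)
    return v;                   /* Exactly representable.  */

  int shift = bits - 24;
  uint64_t unit = (uint64_t) 1 << shift;  /* Weight of the last kept bit.  */
  uint64_t rem = mag & (unit - 1);        /* Discarded low bits.  */
  uint64_t low = mag - rem;               /* Magnitude truncated toward 0.  */

  int round_up = 0;             /* Whether to increase the magnitude.  */
  if (rem != 0)
    switch (mode)
      {
      case RM_TOWARDZERO:
        break;
      case RM_DOWNWARD:
        round_up = negative;
        break;
      case RM_UPWARD:
        round_up = !negative;
        break;
      case RM_TONEAREST:
        if (rem > unit / 2)
          round_up = 1;
        else if (rem == unit / 2)
          round_up = (int) ((low >> shift) & 1);  /* Ties to even.  */
        break;
      }

  if (round_up)
    low += unit;
  return negative ? -(int64_t) low : (int64_t) low;
}

/* Usage: 2^24 + 1 needs 25 bits, so the modes disagree.  */
void
example_round_to_24_bits (void)
{
  assert (round_to_24_bits (16777217, RM_DOWNWARD) == 16777216);
  assert (round_to_24_bits (16777217, RM_UPWARD) == 16777218);
  assert (round_to_24_bits (16777217, RM_TOWARDZERO) == 16777216);
  assert (round_to_24_bits (16777217, RM_TONEAREST) == 16777216);
}
```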


[Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the instruction that moves pc to low register

2014-12-08 Thread Terry Guo
Hi there,

When compiling the simple code below:

terguo01@terry-pc01:mtpcs-frame$ cat test.c
int main(void)
{
return 0;
}

I got an ICE with option -mtpcs-leaf-frame (there is no error if this option is removed).

terguo01@terry-pc01:mtpcs-frame$
/work/terguo01/tools/gcc-arm-none-eabi-5_0-2014q4/bin/arm-none-eabi-gcc
-mtpcs-leaf-frame test.c -c -mcpu=cortex-m0plus -mthumb -da
test.c: In function 'main':
test.c:4:1: error: unrecognizable insn:
 }
 ^
(insn 20 19 21 (set (reg:SI 2 r2)
(reg:SI 15 pc)) test.c:2 -1
 (nil))
test.c:4:1: internal compiler error: in extract_insn, at recog.c:2327
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

This RTL is generated in function thumb1_expand_prologue. The expected insn
pattern is thumb1_movsi_insn in thumb1.md, and an instruction like "mov r2, pc"
is legal. However, because GCC returns NO_REG for the PC register, no valid
pattern matches an instruction that moves pc to a low register. This patch
adds a new insn pattern so that such an instruction is recognized.

Tested with the GCC regression tests. No regressions. Is it OK for trunk?

BR,
Terry

2014-12-08  Terry Guo  terry@arm.com

 * config/arm/predicates.md (pc_register): New to match PC register.
 * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern.

gcc/testsuite/ChangeLog:
2014-12-08  Terry Guo  terry@arm.com

 * gcc.target/arm/thumb1-mov-pc.c: New test.

diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 032808c..c5ef5ed 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -361,6 +361,10 @@
   (and (match_code "smin,smax,umin,umax")
(match_test "mode == GET_MODE (op)")))
 
+(define_special_predicate "pc_register"
+  (and (match_code "reg")
+   (match_test "REGNO (op) == PC_REGNUM")))
+
 (define_special_predicate "cc_register"
   (and (match_code "reg")
(and (match_test "REGNO (op) == CC_REGNUM")
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index ddedc39..8e6057c 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1780,6 +1780,16 @@
   "
 )
 
+(define_insn "*thumb1_movpc_insn"
+  [(set (match_operand:SI 0 "low_register_operand")
+(match_operand:SI 1 "pc_register"))]
+  "TARGET_THUMB1"
+  "mov\\t%0, pc"
+  [(set_attr "length" "2")
+   (set_attr "conds" "nocond")
+   (set_attr "type"   "mov_reg")]
+)
+
 ;; NB never uses BX.
 (define_insn "*thumb1_tablejump"
   [(set (pc) (match_operand:SI 0 "register_operand" "l*r"))
diff --git a/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c 
b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c
new file mode 100644
index 000..9f94131
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1-mov-pc.c
@@ -0,0 +1,7 @@
+/* { dg-options "-mtpcs-leaf-frame -O2" } */
+/* { dg-skip-if "" { ! { arm_thumb1 } } } */
+int
+main ()
+{
+  return 0;
+}


[Backport]Is it ok to backport this bug fix to 4.8 branch?

2014-12-17 Thread Terry Guo
Hi Jakub,

Is it OK to backport revision 209293 to the upstream 4.8 branch, as it fixes bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60663? Thanks.

BR,
Terry





RE: [Backport]Is it ok to backport this bug fix to 4.8 branch?

2014-12-17 Thread Terry Guo
> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, December 17, 2014 4:20 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Backport]Is it ok to backport this bug fix to 4.8 branch?
> 
> On Wed, Dec 17, 2014 at 04:13:59PM +0800, Terry Guo wrote:
> > Is it OK to back port revision 209293 to upstream 4.8 branch as it fixed bug
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60663? Thanks.
> 
> I don't think it is ok, certainly not for 4.8.4 now.  See PR63637.
> 
>   Jakub

Got it and thank you.

BR,
Terry





RE: [Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the instruction that moves pc to low register

2015-01-08 Thread Terry Guo


> -Original Message-
> From: Richard Earnshaw
> Sent: Monday, December 08, 2014 7:31 PM
> To: Terry Guo; gcc-patches@gcc.gnu.org
> Cc: Ramana Radhakrishnan
> Subject: Re: [Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the
> instruction that moves pc to low register
> 
> On 08/12/14 08:24, Terry Guo wrote:
> > Hi there,
> >
> > When compile below simple code:
> >
> > terguo01@terry-pc01:mtpcs-frame$ cat test.c
> > int main(void)
> > {
> > return 0;
> > }
> >
> > I got ICE with option -mtpcs-leaf-frame (no error if remove this option).
> >
> > terguo01@terry-pc01:mtpcs-frame$
> > /work/terguo01/tools/gcc-arm-none-eabi-5_0-2014q4/bin/arm-none-eabi-gcc
> > -mtpcs-leaf-frame test.c -c -mcpu=cortex-m0plus -mthumb -da
> > test.c: In function 'main':
> > test.c:4:1: error: unrecognizable insn:
> >  }
> >  ^
> > (insn 20 19 21 (set (reg:SI 2 r2)
> > (reg:SI 15 pc)) test.c:2 -1
> >  (nil))
> > test.c:4:1: internal compiler error: in extract_insn, at recog.c:2327
> > Please submit a full bug report,
> > with preprocessed source if appropriate.
> > See <http://gcc.gnu.org/bugs.html> for instructions.
> >
> > This RTL is generated in function thumb1_expand_prologue. The expected
> > insn pattern is thumb1_movsi_insn in thumb1.md. And instruction like
> > "mov r2, pc"
> > is a legal instruction. Because gcc returns NO_REG for PC register, so
> > no valid pattern to match instruction that move pc to low register.
> > This patch intends to add a new insn pattern to legalize such thing.
> >
> > Tested with GCC regression test. No regression. Is it OK to trunk?
> >
> > BR,
> > Terry
> >
> > 2014-12-08  Terry Guo  terry@arm.com
> >
> >  * config/arm/predicates.md (pc_register): New to match PC register.
> >  * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern.
> >
> > gcc/testsuite/ChangeLog:
> > 2014-12-08  Terry Guo  terry@arm.com
> >
> >  * gcc.target/arm/thumb1-mov-pc.c: New test.
> >
> >
> > thumb1-move-pc-v1.txt
> >
> >
> > diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
> > index 032808c..c5ef5ed 100644
> > --- a/gcc/config/arm/predicates.md
> > +++ b/gcc/config/arm/predicates.md
> > @@ -361,6 +361,10 @@
> >(and (match_code "smin,smax,umin,umax")
> > (match_test "mode == GET_MODE (op)")))
> >
> > +(define_special_predicate "pc_register"
> > +  (and (match_code "reg")
> > +   (match_test "REGNO (op) == PC_REGNUM")))
> > +
> >  (define_special_predicate "cc_register"
> >(and (match_code "reg")
> > (and (match_test "REGNO (op) == CC_REGNUM")
> > diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
> > index ddedc39..8e6057c 100644
> > --- a/gcc/config/arm/thumb1.md
> > +++ b/gcc/config/arm/thumb1.md
> > @@ -1780,6 +1780,16 @@
> >"
> >  )
> >
> > +(define_insn "*thumb1_movpc_insn"
> > +  [(set (match_operand:SI 0 "low_register_operand")
> 
> This needs constraints.
> 

A constraint is used now. Is this version OK?

BR,
Terry

2015-01-09  Terry Guo  terry@arm.com

 * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern.

diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 2208ae6..e04 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1753,6 +1753,16 @@
   "
 )
 
+(define_insn "*thumb1_movpc_insn"
+  [(set (match_operand:SI 0 "s_register_operand" "=l")
+   (reg:SI PC_REGNUM))]
+  "TARGET_THUMB1"
+  "mov\\t%0, pc"
+  [(set_attr "length" "2")
+   (set_attr "conds"  "nocond")
+   (set_attr "type"   "mov_reg")]
+)
+
 ;; NB never uses BX.
 (define_insn "*thumb1_tablejump"
   [(set (pc) (match_operand:SI 0 "register_operand" "l*r"))


[Patch, ARM]Update GCC to generate Tag_ABI_HardFP_use per the latest EABI doc

2015-01-14 Thread Terry Guo
Hi there,

According to the latest EABI document at
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0045d/IHI0045D_ABI_addenda.pdf,
the new definition of Tag_ABI_HardFP_use is as follows:

Tag_ABI_HardFP_use, (=27), uleb128
0 The user intended that FP use should be implied by Tag_FP_arch
1 The user intended this code to execute on the single-precision variant
 derived from Tag_FP_arch
2 Reserved
3 The user intended that FP use should be implied by Tag_FP_arch
 (Note: This is a deprecated duplicate of the default encoded by 0)

The attached patch updates GCC to conform to this definition. Tested
with the GCC regression tests; no regressions. Is it OK?

BR,
Terry

2015-01-14  Terry Guo  

   * config/arm/arm.c (arm_file_start): Update the assignment of
Tag_ABI_HardFP_use.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0ec526b..378bed9 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25576,7 +25576,13 @@ arm_file_start (void)
  if (arm_fpu_desc->model == ARM_FP_MODEL_VFP)
{
  if (TARGET_HARD_FLOAT)
-   arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 3);
+   {
+ if (TARGET_VFP_SINGLE)
+   arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1);
+ else
+   arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 0);
+   }
+
  if (TARGET_HARD_FLOAT_ABI)
arm_emit_eabi_attribute ("Tag_ABI_VFP_args", 28, 1);
}
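
For reference, the value table quoted in the message and the selection made
by the patch can be summarised in a short sketch. Both function names are
hypothetical and are not part of GCC or binutils. The second helper only
mirrors the TARGET_VFP_SINGLE branch of the patch, which applies when the FPU
model is VFP and hard float is in use.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: describe a Tag_ABI_HardFP_use value using the
   wording of the EABI table quoted in the message.  */
const char *
describe_hardfp_use (unsigned value)
{
  switch (value)
    {
    case 0:
      return "FP use implied by Tag_FP_arch";
    case 1:
      return "single-precision variant derived from Tag_FP_arch";
    case 2:
      return "reserved";
    case 3:
      return "FP use implied by Tag_FP_arch (deprecated duplicate of 0)";
    default:
      return "unknown";
    }
}

/* Hypothetical helper mirroring the patch: emit 1 for a
   single-precision-only VFP and 0 otherwise, instead of the old 3.  */
unsigned
hardfp_use_value (int single_precision_only)
{
  return single_precision_only ? 1u : 0u;
}

/* Usage: the deprecated value 3 is no longer produced.  */
void
example_hardfp_use (void)
{
  assert (hardfp_use_value (1) == 1u);
  assert (hardfp_use_value (0) == 0u);
  assert (strcmp (describe_hardfp_use (2), "reserved") == 0);
}
```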


RE: [RFC] New feature to reuse one multilib among different targets

2012-11-07 Thread Terry Guo

[...]
> > Please help to review this new Multilib feature. It intends to provide
> > user
> 
> Your patch doesn't include documentation for fragments.texi (which
> needs to define the semantics without reference to the details of what
> gcc.c's internal datastructures for multilibs, as output by genmultilib,
> might look like).
> 
> I am unconvinced that directly adding to the drivers' internal
> datastructures like this is a sensible interface for specifying
> multilib choice in target makefile fragments.
> 

I very much appreciate your review and comments. Here is an updated patch that
follows the approach used in the current multilib implementation. With this
update, the following statement means that the target represented by
"optC optD" can reuse the existing multilib built with options "optA optB":

MULTILIB_REUSE = optA/optB=optC/optD

To convert such statements into the data structure used by multilib_raw, I
refactored the code in genmultilib into two functions, combo_to_dir and
options_output. combo_to_dir converts the left part into the multilib folder
name, and options_output converts the right part into an option list.

Inside gcc.c, these reuse rules are consulted when gcc cannot find a multilib
that exactly matches the current command-line options.

I built trunk with this patch and --enable-multilib for targets
arm-none-eabi/x86/m6800/mips/powerpc. No problems found.

Is this patch OK? Please comment.

BR,
Terry

2012-11-08  Terry Guo  

* genmultilib (combo_to_dir): New function.
(options_output): New function.
(MULTILIB_REUSE): New argument.
* Makefile.in (s-mlib): Add a new argument MULTILIB_REUSE.
* gcc.c (multilib_reuse): New spec.
(set_multilib_dir): Use multilib_reuse.

multilib-reuse-v2.patch
Description: Binary data
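
The shape of a reuse rule such as "optA/optB=optC/optD" can be sketched with a
small parser. This is only an illustration of the rule syntax, assuming one
'=' per rule and '/' between options. split_reuse_rule is a hypothetical
helper and does not reproduce what genmultilib or gcc.c actually do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy SRC[0..LEN) into DST, turning the '/' separators into spaces.
   Returns 0 on success, -1 if DST (of size DST_SIZE) is too small.  */
static int
copy_options (char *dst, size_t dst_size, const char *src, size_t len)
{
  if (len + 1 > dst_size)
    return -1;
  for (size_t i = 0; i < len; i++)
    dst[i] = (src[i] == '/') ? ' ' : src[i];
  dst[len] = '\0';
  return 0;
}

/* Hypothetical helper: split RULE at its '=' sign.  BUILT_WITH receives
   the options the existing multilib was built with (left part) and
   REUSED_FOR the options that should reuse it (right part).
   Returns 0 on success, -1 on a malformed rule or a short buffer.  */
int
split_reuse_rule (const char *rule, char *built_with, size_t built_size,
                  char *reused_for, size_t reused_size)
{
  const char *eq = strchr (rule, '=');
  if (eq == NULL || eq == rule || eq[1] == '\0')
    return -1;
  if (copy_options (built_with, built_size, rule, (size_t) (eq - rule)) != 0)
    return -1;
  return copy_options (reused_for, reused_size, eq + 1, strlen (eq + 1));
}

/* Usage with the rule from the message.  */
void
example_split_reuse_rule (void)
{
  char built_with[32], reused_for[32];
  assert (split_reuse_rule ("optA/optB=optC/optD", built_with,
                            sizeof built_with, reused_for,
                            sizeof reused_for) == 0);
  assert (strcmp (built_with, "optA optB") == 0);
  assert (strcmp (reused_for, "optC optD") == 0);
}
```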


RE: [RFC] New feature to reuse one multilib among different targets

2012-11-08 Thread Terry Guo
> -Original Message-
> From: Joseph Myers [mailto:jos...@codesourcery.com]
> Sent: Friday, November 09, 2012 5:10 AM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [RFC] New feature to reuse one multilib among different
> targets
> 
> On Thu, 8 Nov 2012, Terry Guo wrote:
> 
> > To convert such statements to data structure used by multilib_raw, I
> > refactor codes in genmultilib into two functions combo_to_dir and
> 
> The "function" keyword for creating shell functions is not POSIX, and I
> don't know if we ensure that $SHELL is a shell supporting functions.
> (It's documented that CONFIG_SHELL may need to be set to a POSIX shell
> if /bin/sh isn't sufficient, but does that feed through to the value of
> SHELL used to run this script?)
> 

You are right that we should make the script POSIX compliant. This v3 patch
removes "function" and "local", which are not part of the POSIX standard. I
also verified that CONFIG_SHELL is passed to this script with the value
"/bin/sh".

I checked the new genmultilib script with the command "checkbashisms --posix
genmultilib" on Ubuntu. No warnings or errors were reported.

[...]
> 
> Documentation changes need mentioning in ChangeLog entries.
> 

Added them to the following ChangeLog.

BR,
Terry

2012-11-09  Terry Guo  

* genmultilib (combo_to_dir): New function.
(options_output): New function.
(MULTILIB_REUSE): New argument.
* Makefile.in (s-mlib): Add a new argument MULTILIB_REUSE.
* gcc.c (multilib_reuse): New spec.
(set_multilib_dir): Use multilib_reuse.
* doc/fragments.texi: Mention MULTILIB_REUSE.

multilib-reuse-v3.patch
Description: Binary data


RE: [RFC] New feature to reuse one multilib among different targets

2012-11-12 Thread Terry Guo
> -Original Message-
> From: Joseph Myers [mailto:jos...@codesourcery.com]
> Sent: Saturday, November 10, 2012 12:35 AM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [RFC] New feature to reuse one multilib among different
> targets
> 
> On Fri, 9 Nov 2012, Terry Guo wrote:
> 
> > You are right that we should make script POSIX compliant. This v3 patch
> > removed "function" and "local" which don't belong to POSIX standard. I also
> > verified that CONFIG_SHELL is passed to this script with value "/bin/sh".
> 
> Suppose /bin/sh is not a POSIX shell but the user sets CONFIG_SHELL to
> something else (which is a POSIX shell).  Will SHELL in the makefile get
> set to the POSIX shell the user specified as CONFIG_SHELL?  That's what's
> needed to be able to use POSIX shell features in this script.
> 

The attached patch is updated to use sub-scripts rather than shell functions
to reuse code. Does this avoid the issue you just mentioned?

BR,
Terry

2012-11-13  Terry Guo  

* genmultilib (tmpmultilib3): New refactored sub-script
to convert the option combination into folder name.
(tmpmultilib4): New refactored sub-script to output the
options in a option combination.
(MULTILIB_REUSE): New argument.
* Makefile.in (s-mlib): Add a new argument MULTILIB_REUSE.
* gcc.c (multilib_reuse): New spec.
(set_multilib_dir): Use multilib_reuse.
* doc/fragments.texi: Mention MULTILIB_REUSE.

multilib-reuse-v4.patch
Description: Binary data


Ping: RE: [RFC] New feature to reuse one multilib among different targets

2012-11-23 Thread Terry Guo
Hi Joseph,

Can you please help to review this patch and share your thoughts on this
feature? Thanks.

BR,
Terry

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Terry Guo
> Sent: Tuesday, November 13, 2012 12:47 PM
> To: jos...@codesourcery.com
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [RFC] New feature to reuse one multilib among different
> targets
> 
> > -Original Message-
> > From: Joseph Myers [mailto:jos...@codesourcery.com]
> > Sent: Saturday, November 10, 2012 12:35 AM
> > To: Terry Guo
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: RE: [RFC] New feature to reuse one multilib among different
> > targets
> >
> > On Fri, 9 Nov 2012, Terry Guo wrote:
> >
> > > You are right that we should make script POSIX compliant. This v3
> > patch
> > > removed "function" and "local" which don't belong to POSIX standard.
> > I also
> > > verified that CONFIG_SHELL is passed to this script with value
> > "/bin/sh".
> >
> > Suppose /bin/sh is not a POSIX shell but the user sets CONFIG_SHELL
> to
> > something else (which is a POSIX shell).  Will SHELL in the makefile
> > get set to the POSIX shell the user specified as CONFIG_SHELL?
> That's
> > what's needed to be able to use POSIX shell features in this script.
> >
> 
> The attached patch is updated to use sub-script rather than the
> function to
> reuse code. Is it ok to avoid the issue you just mentioned?
> 
> BR,
> Terry
> 
> 2012-11-13  Terry Guo  
> 
>   * genmultilib (tmpmultilib3): New refactored sub-script
>   to convert the option combination into folder name.
>   (tmpmultilib4): New refactored sub-script to output the
>   options in a option combination.
>   (MULTILIB_REUSE): New argument.
>   * Makefile.in (s-mlib): Add a new argument MULTILIB_REUSE.
>   * gcc.c (multilib_reuse): New spec.
>   (set_multilib_dir): Use multilib_reuse.
>   * doc/fragments.texi: Mention MULTILIB_REUSE.




[Patch, ARM] Fix the check on arg reg number in function thumb_find_work_register

2012-11-27 Thread Terry Guo
Hello,

The attached patch fixes a bug in how the argument register number is checked;
the check should take the PCS into account. A test case is also included.
Without this fix, one of the function arguments is overwritten in the test
case. Tested on QEMU for Cortex-M3; no regressions found. Is it OK for trunk?

BR,
Terry

gcc/ChangeLog:

2012-11-28  Terry Guo  

* config/arm/arm.c (thumb_find_work_register): Check
argument register number based on current PCS.

gcc/testsuite/ChangeLog:

2012-11-28  Terry Guo  

* gcc.target/arm/thumb-find-work-register.c: New.

fix-thumb-find-work-register.patch
Description: Binary data


Ping: RE: [Patch, ARM] Fix the check on arg reg number in function thumb_find_work_register

2012-12-03 Thread Terry Guo
Hi Ramana,

Can you please help to review this patch? Thanks.

BR,
Terry

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Terry Guo
> Sent: Wednesday, November 28, 2012 1:53 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [Patch, ARM] Fix the check on arg reg number in function
> thumb_find_work_register
> 
> Hello,
> 
> Attached patch intends to fix a bug on how to check argument register
> number
> which should consider the PCS. A test case is also included. Without
> this
> fix, one of the function argument will be overridden in the case.
> Tested on
> QEMU for cortex-m3, no regression found. Is it OK to trunk?
> 
> BR,
> Terry
> 
> gcc/ChangeLog:
> 
> 2012-11-28  Terry Guo  
> 
> * config/arm/arm.c (thumb_find_work_register): Check
> argument register number based on current PCS.
> 
> gcc/testsuite/ChangeLog:
> 
> 2012-11-28  Terry Guo  
> 
> * gcc.target/arm/thumb-find-work-register.c: New.




Ping-2: RE: [RFC] New feature to reuse one multilib among different targets

2012-12-03 Thread Terry Guo
Hi Joseph,

Can you please review this patch? If I missed something, please point it out.
Thanks.

BR,
Terry

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Terry Guo
> Sent: Friday, November 23, 2012 5:12 PM
> To: jos...@codesourcery.com
> Cc: gcc-patches@gcc.gnu.org
> Subject: Ping: RE: [RFC] New feature to reuse one multilib among
> different targets
> 
> Hi Joseph,
> 
> Can you please help to review this patch and share your thoughts on
> this
> feature? Thanks.
> 
> BR,
> Terry
> 
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Terry Guo
> > Sent: Tuesday, November 13, 2012 12:47 PM
> > To: jos...@codesourcery.com
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: RE: [RFC] New feature to reuse one multilib among different
> > targets
> >
> > > -Original Message-
> > > From: Joseph Myers [mailto:jos...@codesourcery.com]
> > > Sent: Saturday, November 10, 2012 12:35 AM
> > > To: Terry Guo
> > > Cc: gcc-patches@gcc.gnu.org
> > > Subject: RE: [RFC] New feature to reuse one multilib among
> different
> > > targets
> > >
> > > On Fri, 9 Nov 2012, Terry Guo wrote:
> > >
> > > > You are right that we should make script POSIX compliant. This v3
> > > patch
> > > > removed "function" and "local" which don't belong to POSIX
> standard.
> > > I also
> > > > verified that CONFIG_SHELL is passed to this script with value
> > > "/bin/sh".
> > >
> > > Suppose /bin/sh is not a POSIX shell but the user sets CONFIG_SHELL
> > to
> > > something else (which is a POSIX shell).  Will SHELL in the
> makefile
> > > get set to the POSIX shell the user specified as CONFIG_SHELL?
> > That's
> > > what's needed to be able to use POSIX shell features in this script.
> > >
> >
> > The attached patch is updated to use sub-script rather than the
> > function to
> > reuse code. Is it ok to avoid the issue you just mentioned?
> >
> > BR,
> > Terry
> >
> > 2012-11-13  Terry Guo  
> >
> > * genmultilib (tmpmultilib3): New refactored sub-script
> > to convert the option combination into folder name.
> > (tmpmultilib4): New refactored sub-script to output the
> > options in a option combination.
> > (MULTILIB_REUSE): New argument.
> > * Makefile.in (s-mlib): Add a new argument MULTILIB_REUSE.
> > * gcc.c (multilib_reuse): New spec.
> > (set_multilib_dir): Use multilib_reuse.
> > * doc/fragments.texi: Mention MULTILIB_REUSE.
> 
> 





RE: [RFC] New feature to reuse one multilib among different targets

2012-12-07 Thread Terry Guo


> -Original Message-
> From: Joseph Myers [mailto:jos...@codesourcery.com]
> Sent: Friday, December 07, 2012 2:04 AM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [RFC] New feature to reuse one multilib among different
> targets
> 
> On Tue, 13 Nov 2012, Terry Guo wrote:
> 
> > +@findex MULTILIB_REUSE
> > +@item MULTILIB_REUSE
> > +Sometimes it is desirable to reuse one existing multilib among
> different
> > +targets.  Such kind of reuse can minimize the number of multilib
> variants.
> 
> I don't think "among different targets" is the right wording here.
> "for
> different sets of options"?
> 

Updated with your comments.

> > +A typical reuse rule is comprised of two parts connected by equality
> sign.
> 
> Not just a typical reuse rule, but *any* reuse rule, as I understand it.
> 

You are right. Removed the word "typical".

> > +The left part of the rule are the options used to build multilib and
> the right
> > +part are the options representing target that will reuse this
> multilib.  The
> > +equality sign in both parts should be replaced with period.
> 
> Is the order of the options significant, on either or both sides?  Can
> the
> options on the right hand side be options that aren't used to build any
> multilibs?  I think the documentation should answer that sort of
> question.
> 

Yes, the option order in the left part matters. More explanation has been
added in the patch.

> > @@ -7475,10 +7484,16 @@ set_multilib_dir (void)
> >
> >first = 1;
> >p = multilib_select;
> > +
> > +  /* Append multilib reuse rules if any.  With those rules, we can
> reuse
> > + one multilib for certain different targets.  */
> > +  if (strlen(multilib_reuse) > 0)
> 
> Missing space before '('.
> 

Added the missing space.

> > -  /* Ignore newlines.  */
> > -  if (*p == '\n')
> > +  /* Ignore newlinesi and spaces.  */
> 
> Typo "newlinesi".  And why the change to ignore spaces as well - what's
> different about this new feature to require that?
> 

The typo is corrected. I once wanted to let users change the multilib spec in
the gcc driver on the fly by defining their own spec file with content:

*multilib:
OWN RULES

or

*multilib:
+ OWN RULES

In the latter case, an extra space is introduced and breaks the parsing of the
multilib spec, so I ignored spaces here.

But as you said we had better not touch those internal data structures, I have
given up this idea now.

> > @@ -7491,8 +7506,8 @@ set_multilib_dir (void)
> >   if (*p == '\0')
> > {
> > invalid_select:
> > - fatal_error ("multilib select %qs is invalid",
> > -  multilib_select);
> > + fatal_error ("multilib select %qs%qs is invalid",
> > +  multilib_select, multilib_reuse);
> 
> Printing two quoted strings with no space between the closing quote of
> one
> and the opening quote of the other certainly doesn't seem right.
> (Really
> this whole error message seems pretty bad - it won't make sense to
> users -
> but that's a pre-existing condition.)
> 

An extra space is inserted.

> > +rm -rf tmpmultilib3
> > +cat >tmpmultilib3 <<\EOF
> 
> As I understand it, this is a refactoring of existing code.  The patch
> might be easier to review if the bits that just refactored existing
> code into sub-scripts (without any changes to that code) were sent as a
> separate self-contained patch, and then the new feature patch was sent
> as a patch applying on top of those.
> 

Your understanding is correct. The refactoring now lives in patch 00, and
patch 01 is meant to be applied on top of it.

> > +  # We only care rule that has concrete multilib.
> 
> "care about", I think, but this sentence still doesn't really make
> sense to me.  What are the cases that aren't being cared about here, and
> why are they valid inputs?  Surely, given a proper MULTILIB_REUSE
> setting, every rule in that setting should do something meaningful and
> rules that don't should result in errors?
> 

Now an error is reported when a rule tries to reuse a nonexistent
multilib.
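
As a rough illustration of that check (a toy sketch in C, not the genmultilib
code, which is a shell script): assuming a reuse rule is written as
left=right and the multilibs that are actually built are available as a list
of option strings, a rule is rejected when its left part names none of them.
The function name, the rule syntax and the sample strings are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if the left part of RULE (the text before the first '=')
   names one of the COUNT multilibs listed in BUILT.  The comparison is an
   exact string match, so the options must appear in the same order.  */
bool reuse_rule_is_valid(const char *rule, const char *const *built, int count)
{
    assert(rule != NULL && built != NULL);

    const char *eq = strchr(rule, '=');
    if (eq == NULL || eq == rule)
        return false;               /* no left part at all */

    size_t len = (size_t) (eq - rule);
    for (int i = 0; i < count; i++)
        if (strlen(built[i]) == len && strncmp(built[i], rule, len) == 0)
            return true;            /* left part is a multilib that exists */

    return false;                   /* would reuse a nonexistent multilib */
}
```

Because the match is exact, the same options listed in a different order on
the left do not match any built multilib, which is one way to picture why the
order matters there.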

Thank you again, Joseph.

BR,
Terry

2012-12-07  Terry Guo  

* gcc/Makefile.in (s-mlib): New argument MULTILIB_REUSE.
* gcc/doc/fragments.texi: Document MULTILIB_REUSE.
* gcc/gcc.c (multilib_reuse): New internal spec.
(set_multilib_dir): Also search multilib from multilib_reuse.
* gcc/genmultilib (tmpmultilib3): Refactor code.
(tmpmultilib4): Ditto.
(multilib_reuse): New multilib argument.


00-multilib-reuse-v5.patch
Description: Binary data


01-multilib-reuse-v5.patch
Description: Binary data


[PATCH,ARM] Fix PR57329 - backport to gcc 4.8

2013-06-03 Thread Terry Guo
Hello,

This patch (trunk r197155)
http://gcc.gnu.org/ml/gcc-cvs/2013-03/msg00784.html
fixes an ICE in gcc 4.8:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57329

OK to backport to 4.8 branch?  Tested with 4.8 regression test on QEMU, no
new regression.

BR,
Terry




[PATCH, ARM]Option support to new ARM MCU Cortex-M7

2014-09-23 Thread Terry Guo
Hi there,

The attached patch adds option support for the newly announced Cortex-M7 core
and its FPU:
http://www.arm.com/about/newsroom/arm-supercharges-mcu-market-with-high-performance-cortex-m7-processor.php
http://www.arm.com/products/processors/cortex-m/cortex-m7-processor.php

The required Binutils support is
https://sourceware.org/ml/binutils/2014-09/msg00201.html.

Is it OK to trunk?

BR,
Terry

2014-09-24  Terry Guo  

 * config/arm/arm-cores.def (cortex-m7): New core name.
 * config/arm/arm-fpus.def (fpv5-sp-d16): New fpu name.
 (fpv5-d16): Ditto.
 * config/arm/arm-tables.opt: Regenerated.
 * config/arm/arm-tune.md: Likewise. 
 * doc/invoke.texi: Document new cpu and fpu names.
 * config/arm/arm.h (TARGET_VFP5): New macro.
 * config/arm/vfp.md (2,
 smax3, smin3): Enabled for FPU FPv5.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index a830a83..56ec7fd 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -149,6 +149,7 @@ ARM_CORE("cortex-r4",   cortexr4, cortexr4, 7R,  FL_LDSCHED, cortex)
 ARM_CORE("cortex-r4f", cortexr4f, cortexr4f,   7R,  FL_LDSCHED, cortex)
 ARM_CORE("cortex-r5",  cortexr5, cortexr5, 7R,  FL_LDSCHED | FL_ARM_DIV, cortex)
 ARM_CORE("cortex-r7",  cortexr7, cortexr7, 7R,  FL_LDSCHED | FL_ARM_DIV, cortex)
+ARM_CORE("cortex-m7",  cortexm7, cortexm7, 7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m4",  cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3",  cortexm3, cortexm3, 7M,  FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",marvell_pj4, marvell_pj4,   7A,  FL_LDSCHED, 9e)
diff --git a/gcc/config/arm/arm-fpus.def b/gcc/config/arm/arm-fpus.def
index 85d9693..edd0c35 100644
--- a/gcc/config/arm/arm-fpus.def
+++ b/gcc/config/arm/arm-fpus.def
@@ -37,6 +37,8 @@ ARM_FPU("neon-fp16",  ARM_FP_MODEL_VFP, 3, VFP_REG_D32, true, true, false)
 ARM_FPU("vfpv4",   ARM_FP_MODEL_VFP, 4, VFP_REG_D32, false, true, false)
 ARM_FPU("vfpv4-d16",   ARM_FP_MODEL_VFP, 4, VFP_REG_D16, false, true, false)
 ARM_FPU("fpv4-sp-d16", ARM_FP_MODEL_VFP, 4, VFP_REG_SINGLE, false, true, false)
+ARM_FPU("fpv5-sp-d16", ARM_FP_MODEL_VFP, 5, VFP_REG_SINGLE, false, true, false)
+ARM_FPU("fpv5-d16",ARM_FP_MODEL_VFP, 5, VFP_REG_D16, false, true, false)
 ARM_FPU("neon-vfpv4",  ARM_FP_MODEL_VFP, 4, VFP_REG_D32, true, true, false)
 ARM_FPU("fp-armv8",ARM_FP_MODEL_VFP, 8, VFP_REG_D32, false, true, false)
 ARM_FPU("neon-fp-armv8",ARM_FP_MODEL_VFP, 8, VFP_REG_D32, true, true, false)
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index bc046a0..04191bc 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -274,6 +274,9 @@ EnumValue
 Enum(processor_type) String(cortex-r7) Value(cortexr7)
 
 EnumValue
+Enum(processor_type) String(cortex-m7) Value(cortexm7)
+
+EnumValue
 Enum(processor_type) String(cortex-m4) Value(cortexm4)
 
 EnumValue
@@ -423,17 +426,23 @@ EnumValue
 Enum(arm_fpu) String(fpv4-sp-d16) Value(11)
 
 EnumValue
-Enum(arm_fpu) String(neon-vfpv4) Value(12)
+Enum(arm_fpu) String(fpv5-sp-d16) Value(12)
 
 EnumValue
-Enum(arm_fpu) String(fp-armv8) Value(13)
+Enum(arm_fpu) String(fpv5-d16) Value(13)
 
 EnumValue
-Enum(arm_fpu) String(neon-fp-armv8) Value(14)
+Enum(arm_fpu) String(neon-vfpv4) Value(14)
 
 EnumValue
-Enum(arm_fpu) String(crypto-neon-fp-armv8) Value(15)
+Enum(arm_fpu) String(fp-armv8) Value(15)
 
 EnumValue
-Enum(arm_fpu) String(vfp3) Value(16)
+Enum(arm_fpu) String(neon-fp-armv8) Value(16)
+
+EnumValue
+Enum(arm_fpu) String(crypto-neon-fp-armv8) Value(17)
+
+EnumValue
+Enum(arm_fpu) String(vfp3) Value(18)
 
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 954cab8..4217fbe 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -28,7 +28,8 @@
genericv7a,cortexa5,cortexa7,
cortexa8,cortexa9,cortexa12,
cortexa15,cortexr4,cortexr4f,
-   cortexr5,cortexr7,cortexm4,
-   cortexm3,marvell_pj4,cortexa15cortexa7,
-   cortexa53,cortexa57,cortexa57cortexa53"
+   cortexr5,cortexr7,cortexm7,
+   cortexm4,cortexm3,marvell_pj4,
+   cortexa15cortexa7,cortexa53,cortexa57,
+   cortexa57cortexa53"
(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index ff4ddac..3623c70 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -296,6 +296,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
 /* FPU supports VFPv3 instructions.  */
 #define TARGET_VFP3 (TARGET_VFP && arm_fpu_desc->rev >= 3)
 
+/* FPU supports FP

[Ping]RE: [PATCH,ARM] Fix PR57329 - backport to gcc 4.8

2013-06-28 Thread Terry Guo
Ping.

BR,
Terry

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Terry Guo
> Sent: Monday, June 03, 2013 6:02 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw; Ramana Radhakrishnan
> Subject: [PATCH,ARM] Fix PR57329 - backport to gcc 4.8
> 
> Hello,
> 
> This patch (trunk r197155)
> http://gcc.gnu.org/ml/gcc-cvs/2013-03/msg00784.html
> fixes an ICE in gcc 4.8:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57329
> 
> OK to backport to 4.8 branch?  Tested with 4.8 regression test on QEMU, no
> new regression.
> 
> BR,
> Terry
> 
> 





[PATCH, ARM/Thumb1] Adjust rtx cost to prevent expanding MULT into shift/add instructions

2013-07-23 Thread Terry Guo
Hi there,

This patch updates the thumb1_size_rtx_costs function so that it correctly
costs the RTXs generated by the RTL expansion pass. As a result, a GIMPLE
multiplication is expanded to a single mul instruction instead of a sequence
of shift/add/sub instructions, which is in fact more expensive.
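
To make the trade-off concrete, here is a small C sketch of the two ways to
compute x * 0x555, the constant used in the new test case. The function names
and the particular shift/add decomposition are illustrative assumptions; the
sequence GCC actually chooses may differ.

```c
#include <assert.h>
#include <stdint.h>

/* The product as one multiply: the form wanted when optimizing for size.  */
uint32_t mul_single(uint32_t x)
{
    return x * 0x555u;
}

/* The same product built from shifts and adds.
   0x555 = 5 * 0x111, so first form 5*x, then scale by 0x111 = 1 + 16 + 256.  */
uint32_t mul_shift_add(uint32_t x)
{
    uint32_t t = (x << 2) + x;      /* shift-add step 1: 5 * x */
    uint32_t u = (t << 4) + t;      /* shift-add step 2: 0x11 * t */
    return (t << 8) + u;            /* shift-add step 3: 0x111 * t */
}

/* Spot-check that both forms agree (unsigned arithmetic wraps identically).  */
void check_equivalence(void)
{
    for (uint32_t x = 0; x < 1000u; x++)
        assert(mul_single(x) == mul_shift_add(x));
}
```

With two Thumb-1 instructions per shift-add step (the figure given in the
patch comment), the second form needs about six instructions, so costing
each step at COSTS_N_INSNS (2) lets the single mul win at -Os.
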

Tested with GCC regression test on QEMU ARM926, no regression. Is it OK to
trunk and 4.8 branch?

BR,
Terry


gcc/ChangeLog:
2013-07-24  Terry Guo  

* config/arm/arm.c (thumb1_size_rtx_costs): Assign proper cost for
shift_add/shift_sub0/shift_sub1 RTXs.

gcc/testsuite/ChangeLog:
2013-07-24  Terry Guo  

* gcc.target/arm/thumb1-Os-mult.c: New test case.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e6fd420..5c07832 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -7925,6 +7925,15 @@ thumb1_size_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer)
 
 case PLUS:
 case MINUS:
+  /* Thumb-1 needs two instructions to fulfill shiftadd/shiftsub0/shiftsub1
+defined by RTL expansion, especially for the expansion of
+multiplication.  */
+  if ((GET_CODE (XEXP (x, 0)) == MULT
+  && power_of_two_operand (XEXP (XEXP (x,0),1), SImode))
+ || (GET_CODE (XEXP (x, 1)) == MULT
+ && power_of_two_operand (XEXP (XEXP (x, 1), 1), SImode)))
+   return COSTS_N_INSNS (2);
+  /* On purpose fall through for normal RTX.  */
 case COMPARE:
 case NEG:
 case NOT:
diff --git a/gcc/testsuite/gcc.target/arm/thumb1-Os-mult.c b/gcc/testsuite/gcc.target/arm/thumb1-Os-mult.c
new file mode 100644
index 000..31b8bd6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1-Os-mult.c
@@ -0,0 +1,12 @@
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+/* { dg-skip-if "" { ! { arm_thumb1 } } } */
+
+int
+mymul3 (int x)
+{
+  return x * 0x555;
+}
+
+/* { dg-final { scan-assembler "mul\[\\t \]*r.,\[\\t \]*r." } } */


[arm-embedded] Patch to define multilibs for arm embedded-4_8-branch

2013-07-24 Thread Terry Guo
Hi Joey,

This patch defines multilibs for the recently created embedded-4_8-branch.
Is it OK to commit?

BR,
Terry

2013-07-24  Terry Guo  

* configure.ac (with_multilib_list): Export its value.
* Makefile.in (with_multilib_list): Import it from configure files.
* configure: Regenerated.
* config/arm/t-mlibs: New files to define multilibs.
* config.gcc: Use above multilib fragment.

Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 201200)
+++ gcc/config.gcc  (working copy)
@@ -917,7 +917,7 @@
case ${target} in
arm*-*-eabi*)
  tm_file="$tm_file newlib-stdint.h"
- tmake_file="${tmake_file} arm/t-bpabi"
+ tmake_file="${tmake_file} arm/t-bpabi arm/t-mlibs"
  use_gcc_stdint=wrap
  ;;
arm*-*-rtems*)
Index: gcc/ChangeLog.arm
===
--- gcc/ChangeLog.arm   (revision 0)
+++ gcc/ChangeLog.arm   (working copy)
@@ -0,0 +1,7 @@
+2013-07-24  Terry Guo  
+
+   * configure.ac (with_multilib_list): Export its value.
+   * Makefile.in (with_multilib_list): Import it from configure files.
+   * configure: Regenerated.
+   * config/arm/t-mlibs: New files to define multilibs.
+   * config.gcc: Use above multilib fragment.
Index: gcc/configure
===
--- gcc/configure   (revision 201200)
+++ gcc/configure   (working copy)
@@ -753,6 +753,7 @@
 LN_S
 AWK
 SET_MAKE
+with_multilib_list
 REPORT_BUGS_TEXI
 REPORT_BUGS_TO
 PKGVERSION
@@ -1660,7 +1661,7 @@
   --with-specs=SPECS  add SPECS to driver command-line processing
   --with-pkgversion=PKG   Use PKG in the version string in place of "GCC"
   --with-bugurl=URL   Direct users to URL to report a bug
-  --with-multilib-listselect multilibs (SH and x86-64 only)
+  --with-multilib-listselect multilibs (ARM, SH and x86-64 only)
   --with-gnu-ld   assume the C compiler uses GNU ld default=no
   --with-libiconv-prefix[=DIR]  search for libiconv in DIR/include and DIR/lib
   --without-libiconv-prefix don't search for libiconv in includedir and 
libdir
@@ -7397,6 +7398,7 @@
 fi
 
 
+
 # -
 # Checks for other programs
 # -
@@ -17828,7 +17830,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17831 "configure"
+#line 17833 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -17934,7 +17936,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17937 "configure"
+#line 17939 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 201200)
+++ gcc/configure.ac(working copy)
@@ -839,9 +839,10 @@
 [enable_languages=c])
 
 AC_ARG_WITH(multilib-list,
-[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH and x86-64 only)])],
+[AS_HELP_STRING([--with-multilib-list], [select multilibs (ARM, SH and x86-64 only)])],
 :,
 with_multilib_list=default)
+AC_SUBST(with_multilib_list)
 
 # -
 # Checks for other programs
Index: gcc/config/arm/t-mlibs
===
--- gcc/config/arm/t-mlibs  (revision 0)
+++ gcc/config/arm/t-mlibs  (working copy)
@@ -0,0 +1,85 @@
+# A set of predefined MULTILIB which can be used for different ARM targets.
+# Via the configure option --with-multilib-list, user can customize the
+# final MULTILIB implementation.
+
+comma := ,
+space :=
+space +=
+
+MULTILIB_OPTIONS   = mthumb/marm
+MULTILIB_DIRNAMES  = thumb arm
+MULTILIB_OPTIONS  += march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7
+MULTILIB_DIRNAMES += armv6-m armv7-m armv7e-m armv7-ar
+MULTILIB_OPTIONS  += mfloat-abi=softfp/mfloat-abi=hard
+MULTILIB_DIRNAMES += softfp fpu
+MULTILIB_OPTIONS  += mfpu=fpv4-sp-d16/mfpu=vfpv3-d16
+MULTILIB_DIRNAMES += fpv4-sp-d16 vfpv3-d16
+
+MULTILIB_MATCHES   = march?armv6s-m=mcpu?cortex-m0
+MULTILIB_MATCHES  += march?armv6s-m=mcpu?cortex-m0plus
+MULTILIB_MATCHES  += march?armv6s-m=mcpu?cortex-m1
+MULTILIB_MATCHES  += march?armv6s-m=march?armv6-m
+MULTILIB_MATCHES  += march?armv7-m=mcpu?cortex-m3
+MULTILIB_MATCHES  += march?armv7e-m=mcpu?cortex-m4
+MULTILIB_MATCHES  += march?armv7=march?armv7-r
+MULTILIB_MATCHES  += march?armv7=march?armv7-a
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-r4
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-r4f
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-r5
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-a5
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-a7
+M

[arm-embedded] Request to back port Cortex-R7 option support patch

2013-08-05 Thread Terry Guo
Hi Joey,

The attached patch is a backport that adds support for cortex-r7 on the gcc
command line. It has been tested and works.

Is it OK to commit?

BR,
Terry

2013-08-05  Terry Guo  

Backport from mainline r197153
2013-03-27  Terry Guo  

* config/arm/arm-cores.def: Added core cortex-r7.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm-tables.opt: Regenerated.
* doc/invoke.texi: Added entry for core cortex-r7.
Index: gcc/config/arm/arm-tables.opt
===
--- gcc/config/arm/arm-tables.opt   (revision 201479)
+++ gcc/config/arm/arm-tables.opt   (working copy)
@@ -259,6 +259,9 @@
 Enum(processor_type) String(cortex-r5) Value(cortexr5)
 
 EnumValue
+Enum(processor_type) String(cortex-r7) Value(cortexr7)
+
+EnumValue
 Enum(processor_type) String(cortex-m4) Value(cortexm4)
 
 EnumValue
Index: gcc/config/arm/arm-cores.def
===
--- gcc/config/arm/arm-cores.def(revision 201479)
+++ gcc/config/arm/arm-cores.def(working copy)
@@ -132,6 +132,7 @@
 ARM_CORE("cortex-r4",cortexr4, 7R,  FL_LDSCHED, cortex)
 ARM_CORE("cortex-r4f",   cortexr4f,7R,  FL_LDSCHED, cortex)
 ARM_CORE("cortex-r5",cortexr5, 7R,  FL_LDSCHED | FL_ARM_DIV, cortex)
+ARM_CORE("cortex-r7",cortexr7, 7R,  FL_LDSCHED | FL_ARM_DIV, cortex)
 ARM_CORE("cortex-m4",cortexm4, 7EM, FL_LDSCHED, cortex)
 ARM_CORE("cortex-m3",cortexm3, 7M,  FL_LDSCHED, cortex)
 ARM_CORE("cortex-m1",cortexm1, 6M,  FL_LDSCHED, v6m)
Index: gcc/config/arm/arm-tune.md
===
--- gcc/config/arm/arm-tune.md  (revision 201479)
+++ gcc/config/arm/arm-tune.md  (working copy)
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from arm-cores.def
 (define_attr "tune"
-   
"arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,genericv7a,cortexa5,cortexa7,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexm4,cortexm3,cortexm1,cortexm0,cortexm0plus,marvell_pj4"
+   
"arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,genericv7a,cortexa5,cortexa7,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexr7,cortexm4,cortexm3,cortexm1,cortexm0,cortexm0plus,marvell_pj4"
(const (symbol_ref "((enum attr_tune) arm_tune)")))
Index: gcc/ChangeLog.arm
===
--- gcc/ChangeLog.arm   (revision 201479)
+++ gcc/ChangeLog.arm   (working copy)
@@ -1,3 +1,13 @@
+2013-08-05  Terry Guo  
+
+   Backport from mainline r197153
+   2013-03-27  Terry Guo  
+
+   * config/arm/arm-cores.def: Added core cortex-r7.
+   * config/arm/arm-tune.md: Regenerated.
+   * config/arm/arm-tables.opt: Regenerated.
+   * doc/invoke.texi: Added entry for core cortex-r7.
+
 2013-07-24  Terry Guo  
 
* configure.ac (with_multilib_list): Export its value.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 201479)
+++ gcc/doc/invoke.texi (working copy)
@@ -11264,7 +11264,7 @@
 @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, @samp{arm1176jzf-s},
 @samp{cortex-a5}, @samp{cortex-a7}, @samp{cortex-a8}, @samp{cortex-a9}, 
 @samp{cortex-a15}, @samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5},
-@samp{cortex-m4}, @samp{cortex-m3},
+@samp{cortex-r7}, @samp{cortex-m4}, @samp{cortex-m3},
 @samp{cortex-m1},
 @samp{cortex-m0},
 @samp{cortex-m0plus},

[arm-embedded] Request to backport thumb1 far jump patch to embedded 4.8 branch

2013-08-05 Thread Terry Guo
Hello Joey,

The thumb1 far jump patch is an optimization that avoids an unnecessary lr
save instruction. It is now in trunk. Is it OK to backport it to the embedded
4.8 branch?

BR,
Terry

gcc/ChangeLog.arm

 2013-08-05  Terry Guo  
 
Backport from mainline r197956
2013-04-15  Joey Ye  

* config/arm/arm.c (thumb1_final_prescan_insn): Assert lr save
for real far jump.
(thumb_far_jump_used_p): Count instruction size and set
far_jump_used.

gcc/testsuite/ChangeLog.arm

2013-08-05  Terry Guo  

Backport from mainline r197956
2013-04-15  Joey Ye  

* gcc.target/arm/thumb1-far-jump-1.c: New test.
* gcc.target/arm/thumb1-far-jump-2.c: New test.

Index: gcc/ChangeLog.arm
===
--- gcc/ChangeLog.arm   (revision 201517)
+++ gcc/ChangeLog.arm   (working copy)
@@ -1,5 +1,15 @@
 2013-08-05  Terry Guo  
 
+   Backport from mainline r197956
+   2013-04-15  Joey Ye  
+
+   * config/arm/arm.c (thumb1_final_prescan_insn): Assert lr save
+   for real far jump.
+   (thumb_far_jump_used_p): Count instruction size and set
+   far_jump_used.
+
+2013-08-05  Terry Guo  
+
Backport from mainline r197153
2013-03-27  Terry Guo  
 
Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c
===
--- gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c(revision 0)
+++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c(working copy)
@@ -0,0 +1,34 @@
+/* Check for thumb1 far jump. Shouldn't save lr for small leaf functions
+ * even with a branch in it.  */
+/* { dg-options "-Os" } */
+/* { dg-skip-if "" { ! { arm_thumb1 } } } */
+
+void f()
+{
+  for (;;);
+}
+
+volatile int g;
+void f2(int i)
+{
+  if (i) g=0;
+}
+
+void f3(int i)
+{
+  if (i) {
+g=0;
+g=1;
+g=2;
+g=3;
+g=4;
+g=5;
+g=6;
+g=7;
+g=8;
+g=9;
+  }
+}
+
+/* { dg-final { scan-assembler-not "push.*lr" } } */
+
Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c
===
--- gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c(revision 0)
+++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c(working copy)
@@ -0,0 +1,57 @@
+/* Check for thumb1 far jump. This is the extreme case that far jump
+ * will be used with minimum number of instructions. By passing this case
+ * it means the heuristic of saving lr for far jump meets the most extreme
+ * requirement.  */
+/* { dg-options "-Os" } */
+/* { dg-skip-if "" { ! { arm_thumb1 } } } */
+
+volatile register r4 asm("r4");
+void f3(int i)
+{
+#define GO(n) \
+  extern volatile int g_##n; \
+  r4=(int)&g_##n;
+
+#define GO8(n) \
+  GO(n##_0) \
+  GO(n##_1) \
+  GO(n##_2) \
+  GO(n##_3) \
+  GO(n##_4) \
+  GO(n##_5) \
+  GO(n##_6) \
+  GO(n##_7)
+
+#define GO64(n) \
+  GO8(n##_0) \
+  GO8(n##_1) \
+  GO8(n##_2) \
+  GO8(n##_3) \
+  GO8(n##_4) \
+  GO8(n##_5) \
+  GO8(n##_6) \
+  GO8(n##_7) \
+
+#define GO498(n) \
+  GO64(n##_0) \
+  GO64(n##_1) \
+  GO64(n##_2) \
+  GO64(n##_3) \
+  GO64(n##_4) \
+  GO64(n##_5) \
+  GO64(n##_6) \
+  GO8(n##_0) \
+  GO8(n##_1) \
+  GO8(n##_2) \
+  GO8(n##_3) \
+  GO8(n##_4) \
+  GO8(n##_5) \
+  GO(n##_0) \
+  GO(n##_1) \
+
+  if (i) {
+GO498(0);
+  }
+}
+
+/* { dg-final { scan-assembler "push.*lr" } } */
Index: gcc/testsuite/ChangeLog.arm
===
--- gcc/testsuite/ChangeLog.arm (revision 0)
+++ gcc/testsuite/ChangeLog.arm (working copy)
@@ -0,0 +1,7 @@
+2013-08-05  Terry Guo  
+
+   Backport from mainline r197956
+   2013-04-15  Joey Ye  
+
+   * gcc.target/arm/thumb1-far-jump-1.c: New test.
+   * gcc.target/arm/thumb1-far-jump-2.c: New test.
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 201517)
+++ gcc/config/arm/arm.c(working copy)
@@ -22577,6 +22577,11 @@
   else if (conds != CONDS_NOCOND)
cfun->machine->thumb1_cc_insn = NULL_RTX;
 }
+
+/* Check if unexpected far jump is used.  */
+if (cfun->machine->lr_save_eliminated
+&& get_attr_far_jump (insn) == FAR_JUMP_YES)
+  internal_error("Unexpected thumb1 far jump");
 }
 
 int
@@ -22602,6 +22607,8 @@
 thumb_far_jump_used_p (void)
 {
   rtx insn;
+  bool far_jump = false;
+  unsigned int func_size = 0;
 
   /* This test is only important for leaf functions.  */
   /* assert (!leaf_function_p ()); */
@@ -22657,6 +22664,26 @@
  && get_attr_far_jump (insn) == FAR_JUMP_YES
  )
{
+ far_jump = true;
+   }
+  func_size += get_attr_length (insn);
+}
+
+  /* Attribute far_jump will always be true for thumb1 before
+ shorten_branch pass.  So checking

[PATCH][ARM] GCC command line support for Cortex-R7

2013-02-24 Thread Terry Guo
Hi,

This patch enables GCC to accept the new command line option
-mcpu=cortex-r7. Is it OK for trunk?
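
For readers unfamiliar with why a one-line ARM_CORE entry is enough to make a
new -mcpu= value known, the following is a minimal C sketch of the general
table-driven ("X-macro") pattern. It is not GCC's actual code: CORE_LIST,
core_type and lookup_core are hypothetical names, and the real entries carry
more fields than the two modeled here.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for a core definition file: one line per core.  */
#define CORE_LIST(X)            \
    X("cortex-r5", cortexr5)    \
    X("cortex-r7", cortexr7)    \
    X("cortex-m4", cortexm4)

/* First expansion: an enum of core identifiers.  */
#define AS_ENUM(name, ident) ident,
enum core_type { CORE_LIST(AS_ENUM) core_none };
#undef AS_ENUM

/* Second expansion: the option strings, in the same order as the enum.  */
#define AS_NAME(name, ident) name,
static const char *const core_names[] = { CORE_LIST(AS_NAME) };
#undef AS_NAME

/* Map the argument of -mcpu= to an identifier; core_none if unknown.  */
enum core_type lookup_core(const char *arg)
{
    assert(arg != NULL);
    for (int i = 0; i < (int) core_none; i++)
        if (strcmp(arg, core_names[i]) == 0)
            return (enum core_type) i;
    return core_none;
}
```

Adding one line to the list updates both the enum and the name table, which
mirrors how the generated files in this patch are derived from
arm-cores.def rather than edited by hand.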

BR,
Terry

2013-02-25  Terry Guo  

* config/arm/arm-cores.def: Added core cortex-r7.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm-tables.opt: Regenerated.
* doc/invoke.texi: Added entry for core cortex-r7.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index a4cb7c6..185f78e 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -132,6 +132,7 @@ ARM_CORE("cortex-a15",	  cortexa15,	7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_D
 ARM_CORE("cortex-r4",	  cortexr4,	7R, FL_LDSCHED, cortex)
 ARM_CORE("cortex-r4f",	  cortexr4f,	7R, FL_LDSCHED, cortex)
 ARM_CORE("cortex-r5",	  cortexr5,	7R, FL_LDSCHED | FL_ARM_DIV, cortex)
+ARM_CORE("cortex-r7",	  cortexr7,	7R, FL_LDSCHED | FL_ARM_DIV, cortex)
 ARM_CORE("cortex-m4",	  cortexm4,	7EM, FL_LDSCHED, cortex)
 ARM_CORE("cortex-m3",	  cortexm3,	7M, FL_LDSCHED, cortex)
 ARM_CORE("cortex-m1",	  cortexm1,	6M, FL_LDSCHED, v6m)
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 06a529d..ad52d18 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -259,6 +259,9 @@ EnumValue
 Enum(processor_type) String(cortex-r5) Value(cortexr5)
 
 EnumValue
+Enum(processor_type) String(cortex-r7) Value(cortexr7)
+
+EnumValue
 Enum(processor_type) String(cortex-m4) Value(cortexm4)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 26c2e1f..ac85f94 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from arm-cores.def
 (define_attr "tune"
-	"arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,genericv7a,cortexa5,cortexa7,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexm4,cortexm3,cortexm1,cortexm0,cortexm0plus,marvell_pj4"
+	"arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,genericv7a,cortexa5,cortexa7,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexr7,cortexm4,cortexm3,cortexm1,cortexm0,cortexm0plus,marvell_pj4"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e9fe4ef..177ed70 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11260,7 +11260,7 @@ assembly code.  Permissible names are: @samp{arm2}, @samp{arm250},
 @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, @samp{arm1176jzf-s},
 @samp{cortex-a5}, @samp{cortex-a7}, @samp{cortex-a8}, @samp{cortex-a9}, 
 @samp{cortex-a15}, @samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5},
-@samp{cortex-m4}, @samp{cortex-m3},
+@samp{cortex-r7}, @samp{cortex-m4}, @samp{cortex-m3},
 @samp{cortex-m1},
 @samp{cortex-m0},
 @samp{cortex-m0plus},

Ping: [PATCH][ARM] GCC command line support for Cortex-R7

2013-03-03 Thread Terry Guo
Ping...

The patch is at http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01105.html.

BR,
Terry

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Terry Guo
> Sent: Monday, February 25, 2013 10:23 AM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH][ARM] GCC command line support for Cortex-R7
> 
> Hi,
> 
> This patch is to enable GCC to accept new command line option
> -mcpu=cortex-r7. Is it OK to trunk?
> 
> BR,
> Terry
> 
> 2013-02-25  Terry Guo  
> 
> * config/arm/arm-cores.def: Added core cortex-r7.
> * config/arm/arm-tune.md: Regenerated.
> * config/arm/arm-tables.opt: Regenerated.
> * doc/invoke.texi: Added entry for core cortex-r7.




Ping*2: [PATCH][ARM] GCC command line support for Cortex-R7

2013-03-11 Thread Terry Guo
Hi Richard,

Can you please help to review this patch?

BR,
Terry


> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Terry Guo
> Sent: Monday, March 04, 2013 10:46 AM
> To: gcc-patches@gcc.gnu.org
> Subject: Ping: [PATCH][ARM] GCC command line support for Cortex-R7
> 
> Ping...
> 
> The patch is at http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01105.html.
> 
> BR,
> Terry
> 
> > -----Original Message-----
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Terry Guo
> > Sent: Monday, February 25, 2013 10:23 AM
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH][ARM] GCC command line support for Cortex-R7
> >
> > Hi,
> >
> > This patch is to enable GCC to accept new command line option
> > -mcpu=cortex-r7. Is it OK to trunk?
> >
> > BR,
> > Terry
> >
> > 2013-02-25  Terry Guo  
> >
> > * config/arm/arm-cores.def: Added core cortex-r7.
> > * config/arm/arm-tune.md: Regenerated.
> > * config/arm/arm-tables.opt: Regenerated.
> > * doc/invoke.texi: Added entry for core cortex-r7.
> 
> 





[Patch/ARM] Cortex-M4 core pipeline patch to tune LDR/STR pairs

2013-03-29 Thread Terry Guo
Hello,

The attached pipeline patch is intended to change the following generated code

ldr r5, [r4, #12]
adds r2, r2, #16
str r5, [r3, #8]

to

ldr r5, [r4, #12]
str r5, [r3, #8]
adds r2, r2, #16

The reason is that an STR can start in the second cycle of the preceding
LDR, which takes 2 cycles, as long as the LDR result is not used as the
memory address of the STR.

Tested with various benchmarks on Cortex-M4 MPS. Apart from one regression
caused by register allocation, the benchmarks show either a performance
improvement or no change.

Is it OK to trunk?

BR,
Terry

2013-03-29  Terry Guo  

* gcc/config/arm/cortex-m4.md: New bypass to tune LDR/STR
pairs.

From 19dd8bdc9a03f78690700ded911e0cee66328c01 Mon Sep 17 00:00:00 2001
From: Terry Guo 
Date: Wed, 27 Mar 2013 17:23:09 +0800
Subject: [PATCH] improve m4 pipeline description

---
 gcc/config/arm/cortex-m4.md |4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/arm/cortex-m4.md b/gcc/config/arm/cortex-m4.md
index 187867b..47b0364 100644
--- a/gcc/config/arm/cortex-m4.md
+++ b/gcc/config/arm/cortex-m4.md
@@ -84,6 +84,10 @@
(eq_attr "type" "store4"))
   "cortex_m4_ex*5")
 
+(define_bypass 1 "cortex_m4_load1"
+ "cortex_m4_store1_1,cortex_m4_store1_2"
+ "arm_no_early_store_addr_dep")
+
 ;; If the address of load or store depends on the result of the preceding
 ;; instruction, the latency is increased by one.
 
-- 
1.7.9.5


RE: [Patch/ARM] Cortex-M4 core pipeline patch to tune LDR/STR pairs

2013-04-15 Thread Terry Guo
Hello Ramana,

Can you please review my patch at
http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01252.html.

Thanks.

Terry

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Terry Guo
> Sent: Friday, March 29, 2013 6:00 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [Patch/ARM] Cortex-M4 core pipeline patch to tune LDR/STR pairs
> 
> Hello,
> 
> The attached pipeline patch intends to turn following code generation
> 
> ldr r5, [r4, #12]
> adds r2, r2, #16
> str r5, [r3, #8]
> 
> to
> 
> ldr r5, [r4, #12]
> str r5, [r3, #8]
> adds r2, r2, #16
> 
> The reason is that the STR can be started from the second cycle of its
> preceding LDR which takes 2 cycles, as long as the result of LDR isn't
> used as memory address of STR.
> 
> Tested with various benchmarks on Cortex-M4 MPS. Except one regression
> caused by register allocation, the others either show performance
> improvement or no change.
> 
> Is it OK to trunk?
> 
> BR,
> Terry
> 
> 2013-03-29  Terry Guo  
> 
> * gcc/config/arm/cortex-m4.md: New bypass to tune LDR/STR
> pairs.




[PATCH, ARM] Improve GCC pipeline description for Cortex-M4 FPU

2013-04-16 Thread Terry Guo
Hi,

This patch improves the Cortex-M4 FPU pipeline description based on the
following findings:

1) Integer instructions can be pipelined with fused/chained MAC
instructions.
2) Two-cycle 32-bit floating-point load instructions should be placed back
to back to save one cycle. The three-cycle 64-bit floating-point load
instructions have no such feature.
3) The 32-bit floating-point store instructions need 1 cycle, not 2 cycles.

I used some f32 functions from CMSIS DSPLib to benchmark this patch. All of
them show a performance improvement, i.e. fewer cycles are needed to run
those functions.

Is it OK for trunk?

BR,
Terry

2013-04-16  Terry Guo  

* config/arm/cortex-m4-fpu.md (cortex_m4_v): Delete cpu unit.
Replace with ...
(cortex_m4_v_a,  cortex_m4_v_b): ... new cpu units.
(cortex_m4_v, cortex_m4_exa_va, cortex_m4_exb_vb): New reservations.
(cortex_m4_fmacs): Use new reservations.
(cortex_m4_f_load, cortex_m4_f_store): Likewise.

diff --git a/gcc/config/arm/cortex-m4-fpu.md b/gcc/config/arm/cortex-m4-fpu.md
index a1945be..4ce3f10 100644
--- a/gcc/config/arm/cortex-m4-fpu.md
+++ b/gcc/config/arm/cortex-m4-fpu.md
@@ -18,10 +18,14 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
-;; Use an artifial unit to model FPU.
-(define_cpu_unit "cortex_m4_v" "cortex_m4")
+;; Use two artificial units to model FPU.
+(define_cpu_unit "cortex_m4_v_a" "cortex_m4")
+(define_cpu_unit "cortex_m4_v_b" "cortex_m4")
 
+(define_reservation "cortex_m4_v" "cortex_m4_v_a+cortex_m4_v_b")
 (define_reservation "cortex_m4_ex_v" "cortex_m4_ex+cortex_m4_v")
+(define_reservation "cortex_m4_exa_va" "cortex_m4_a+cortex_m4_v_a")
+(define_reservation "cortex_m4_exb_vb" "cortex_m4_b+cortex_m4_v_b")
 
 ;; Integer instructions following VDIV or VSQRT complete out-of-order.
 (define_insn_reservation "cortex_m4_fdivs" 15
@@ -44,10 +48,12 @@
(eq_attr "type" "fmuls"))
   "cortex_m4_ex_v")
 
+;; Integer instructions following multiply-accumulate instructions
+;; complete out-of-order.
 (define_insn_reservation "cortex_m4_fmacs" 4
   (and (eq_attr "tune" "cortexm4")
(eq_attr "type" "fmacs,ffmas"))
-  "cortex_m4_ex_v*3")
+  "cortex_m4_ex_v,cortex_m4_v*2")
 
 (define_insn_reservation "cortex_m4_ffariths" 1
   (and (eq_attr "tune" "cortexm4")
@@ -77,12 +83,12 @@
 (define_insn_reservation "cortex_m4_f_load" 2
   (and (eq_attr "tune" "cortexm4")
(eq_attr "type" "f_loads"))
-  "cortex_m4_ex_v*2")
+  "cortex_m4_exa_va,cortex_m4_exb_vb")
 
-(define_insn_reservation "cortex_m4_f_store" 2
+(define_insn_reservation "cortex_m4_f_store" 1
   (and (eq_attr "tune" "cortexm4")
(eq_attr "type" "f_stores"))
-  "cortex_m4_ex_v*2")
+  "cortex_m4_exa_va")
 
 (define_insn_reservation "cortex_m4_f_loadd" 3
   (and (eq_attr "tune" "cortexm4")

[arm-embedded] enable multilib for embedded-4_9-branch

2014-05-12 Thread Terry Guo
Hi there,

I just committed the attached patch to enable building multilibs for the ARM
embedded-4_9-branch.

BR,
Terry

2014-05-12  Terry Guo  

* config.gcc (--with-multilib-list): Accept arm embedded cores.
* configure.ac (with_multilib_list): Export for being used in arm
embedded multilib fragment.
* configure: Regenerated.
* Makefile.in (with_multilib_list): Import for being used in
multilib fragment.
* config/arm/t-rmprofile: New multilib fragment for arm embedded
cores.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 210319)
+++ gcc/Makefile.in (working copy)
@@ -540,6 +540,7 @@
 lang_specs_files=@lang_specs_files@
 lang_tree_files=@lang_tree_files@
 target_cpu_default=@target_cpu_default@
+with_multilib_list=@with_multilib_list@
 OBJC_BOEHM_GC=@objc_boehm_gc@
 extra_modes_file=@extra_modes_file@
 extra_opt_files=@extra_opt_files@
Index: gcc/config/arm/t-rmprofile
===
--- gcc/config/arm/t-rmprofile  (revision 0)
+++ gcc/config/arm/t-rmprofile  (working copy)
@@ -0,0 +1,86 @@
+# A set of predefined MULTILIB which can be used for different ARM targets.
+# Via the configure option --with-multilib-list, user can customize the
+# final MULTILIB implementation.
+
+comma := ,
+space :=
+space +=
+
+MULTILIB_OPTIONS   = mthumb/marm
+MULTILIB_DIRNAMES  = thumb arm
+MULTILIB_OPTIONS  += march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7
+MULTILIB_DIRNAMES += armv6-m armv7-m armv7e-m armv7-ar
+MULTILIB_OPTIONS  += mfloat-abi=softfp/mfloat-abi=hard
+MULTILIB_DIRNAMES += softfp fpu
+MULTILIB_OPTIONS  += mfpu=fpv4-sp-d16/mfpu=vfpv3-d16
+MULTILIB_DIRNAMES += fpv4-sp-d16 vfpv3-d16
+
+MULTILIB_MATCHES   = march?armv6s-m=mcpu?cortex-m0
+MULTILIB_MATCHES  += march?armv6s-m=mcpu?cortex-m0plus
+MULTILIB_MATCHES  += march?armv6s-m=mcpu?cortex-m1
+MULTILIB_MATCHES  += march?armv6s-m=march?armv6-m
+MULTILIB_MATCHES  += march?armv7-m=mcpu?cortex-m3
+MULTILIB_MATCHES  += march?armv7e-m=mcpu?cortex-m4
+MULTILIB_MATCHES  += march?armv7=march?armv7-r
+MULTILIB_MATCHES  += march?armv7=march?armv7-a
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-r4
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-r4f
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-r5
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-r7
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-a5
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-a7
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-a8
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-a9
+MULTILIB_MATCHES  += march?armv7=mcpu?cortex-a15
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?vfpv3
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?vfpv3-fp16
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?vfpv3-d16-fp16
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?vfpv3xd
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?vfpv3xd-fp16
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?vfpv4
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?vfpv4-d16
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?neon
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?neon-fp16
+MULTILIB_MATCHES  += mfpu?vfpv3-d16=mfpu?neon-vfpv4
+
+MULTILIB_EXCEPTIONS =
+MULTILIB_REUSE =
+
+MULTILIB_REQUIRED  = mthumb
+MULTILIB_REQUIRED += marm
+MULTILIB_REQUIRED += mfloat-abi=hard
+
+MULTILIB_OSDIRNAMES  = mthumb=!thumb
+MULTILIB_OSDIRNAMES += marm=!arm
+MULTILIB_OSDIRNAMES += mfloat-abi.hard=!fpu
+
+ifneq (,$(findstring armv6-m,$(subst $(comma),$(space),$(with_multilib_list))))
+MULTILIB_REQUIRED   += mthumb/march=armv6s-m
+MULTILIB_OSDIRNAMES += mthumb/march.armv6s-m=!armv6-m
+endif
+
+ifneq (,$(findstring armv7-m,$(subst $(comma),$(space),$(with_multilib_list))))
+MULTILIB_REQUIRED   += mthumb/march=armv7-m
+MULTILIB_OSDIRNAMES += mthumb/march.armv7-m=!armv7-m
+endif
+
+ifneq (,$(findstring armv7e-m,$(subst $(comma),$(space),$(with_multilib_list))))
+MULTILIB_REQUIRED   += mthumb/march=armv7e-m
+MULTILIB_REQUIRED   += mthumb/march=armv7e-m/mfloat-abi=softfp/mfpu=fpv4-sp-d16
+MULTILIB_REQUIRED   += mthumb/march=armv7e-m/mfloat-abi=hard/mfpu=fpv4-sp-d16
+MULTILIB_OSDIRNAMES += mthumb/march.armv7e-m=!armv7e-m
+MULTILIB_OSDIRNAMES += mthumb/march.armv7e-m/mfloat-abi.hard/mfpu.fpv4-sp-d16=!armv7e-m/fpu
+MULTILIB_OSDIRNAMES += mthumb/march.armv7e-m/mfloat-abi.softfp/mfpu.fpv4-sp-d16=!armv7e-m/softfp
+endif
+
+ifneq (,$(filter armv7 armv7-r armv7-a,$(subst $(comma),$(space),$(with_multilib_list))))
+MULTILIB_REQUIRED   += mthumb/march=armv7
+MULTILIB_REQUIRED   += mthumb/march=armv7/mfloat-abi=softfp/mfpu=vfpv3-d16
+MULTILIB_REQUIRED   += mthumb/march=armv7/mfloat-abi=hard/mfpu=vfpv3-d16
+MULTILIB_OSDIRNAMES += mthumb/march.armv7=!armv7-ar/thumb
+MULTILIB_OSDIRNAMES += mthumb/march.armv7/mfloat-abi.hard/mfpu.vfpv3-d16=!armv7-ar/thumb/fpu
+MULTILIB_OSDIRNAMES += mthumb/march.armv7/mfloat-abi.softfp/mfpu.vfpv3-d16=!armv7-ar/thumb/softfp
+MULTILIB_REUSE  += mthumb/march.armv7=marm/march.armv7
+MULTILIB_REUSE  += 
mthumb
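
The fragment above tests --with-multilib-list in two ways: $(findstring ...)
is a substring search, while $(filter ...) matches whole words after the
commas are turned into spaces. A plausible reason for using filter in the
armv7 clause is that "armv7" is also a substring of "armv7-m", but that is an
inference, not something the patch states. The difference can be sketched in
C; both helper names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Like $(findstring NAME,LIST): true if NAME occurs anywhere in LIST.  */
static int list_findstring (const char *list, const char *name)
{
  assert (list != NULL && name != NULL);
  return strstr (list, name) != NULL;
}

/* Like $(filter NAME,$(subst $(comma),$(space),LIST)): true only if NAME
   is a whole comma-separated entry of LIST.  */
static int list_has_entry (const char *list, const char *name)
{
  size_t len;

  assert (list != NULL && name != NULL);
  len = strlen (name);
  while (*list != '\0')
    {
      size_t n = strcspn (list, ",");   /* length of the current entry */
      if (n == len && strncmp (list, name, n) == 0)
        return 1;
      list += n;
      if (*list == ',')
        list++;
    }
  return 0;
}
```

For the list "armv6-m,armv7-m", the substring test reports a hit for "armv7"
while the whole-entry test does not.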

RE: [Patch, GCC/Thumb1] Improve 64bit constant load for Thumb1

2014-05-21 Thread Terry Guo


> -----Original Message-----
> From: Ramana Radhakrishnan [mailto:ramana@googlemail.com]
> Sent: Wednesday, May 21, 2014 4:56 PM
> To: Terry Guo
> Cc: gcc-patches; Richard Earnshaw; Ramana Radhakrishnan
> Subject: Re: [Patch, GCC/Thumb1] Improve 64bit constant load for Thumb1
> 
> -(define_split
> +; Split the load of 64-bit constant into two loads for high and low 32-bit parts respectively
> +; to see if we can load them in fewer instructions or fewer cycles.
> +; For the small 64-bit integer constants that satisfy constraint J, the instruction pattern
> +; thumb1_movdi_insn has a better way to handle them.
> >+(define_split
> >+  [(set (match_operand:ANY64 0 "arm_general_register_operand" "")
> >+(match_operand:ANY64 1 "const_double_operand" ""))]
> >+  "TARGET_THUMB1 && reload_completed && !satisfies_constraint_J (operands[1])"
> >+  [(set (match_dup 0) (match_dup 1))
> >+   (set (match_dup 2) (match_dup 3))]
> 
> Not ok - this splitter used to kick in in ARM state, now you've turned it off.
> Look at the movdi patterns in ARM state which deal with it immediate
> moves with a #.
> 
> So, the condition should read
> 
> (TARGET_32BIT) || ( TARGET_THUMB1 && )
> 
> Ok with that change and if no regressions.
> 
> regards
> Ramana
> 

Thanks for reviewing. This is a completely new splitter dedicated to Thumb1
targets. The existing one for ARM state is not touched.

BR,
Terry

> 
> 
> 
> On Fri, Apr 11, 2014 at 8:36 AM, Terry Guo  wrote:
> > Hi there,
> >
> > Currently GCC prefers to use two LDR instructions to load 64-bit constants.
> > This can miss cases where a 64-bit load can be done in fewer
> > instructions or fewer cycles. For example, the code below to load
> > 0x100000001
> > mov r0, #1
> > mov r1, #1
> >
> > is better than the current solution:
> >
> > ldr r1, .L2+4
> > ldr r0, .L2
> > .L2:
> > .word   1
> > .word   1
> >
> > The attached patch splits 64-bit constant loads to take advantage of
> > such cases. Tested with the GCC regression tests on Cortex-M0. No new
> > regressions.
> >
> > Is it OK for stage 1?
> >
> > BR,
> > Terry
> >
> > gcc/
> > 2014-04-11  Terry Guo  
> >
> > * config/arm/arm.md (split 64-bit constant for Thumb1): New
> > split pattern.
> >
> > gcc/testsuite/
> > 2014-04-11  Terry Guo  
> >
> > * gcc.target/arm/thumb1-load-64bit-constant-1.c: New test.
> > * gcc.target/arm/thumb1-load-64bit-constant-2.c: Ditto.
> > * gcc.target/arm/thumb1-load-64bit-constant-3.c: Ditto.
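
The idea in the proposal quoted above can be sketched as follows: split the
64-bit constant into its two 32-bit words and check whether each word is
cheap to materialise on its own. The check below only covers the simplest
case, a Thumb-1 MOVS with an 8-bit unsigned immediate; the real splitter
leaves the choice of instructions to the existing 32-bit move patterns, and
all names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Low and high 32-bit words of a 64-bit constant.  */
struct halves { uint32_t lo, hi; };

static struct halves split64 (uint64_t value)
{
  struct halves h;
  h.lo = (uint32_t) (value & 0xffffffffu);
  h.hi = (uint32_t) (value >> 32);
  /* Sanity check: the two words recombine to the original value.  */
  assert ((((uint64_t) h.hi << 32) | h.lo) == value);
  return h;
}

/* Thumb-1 MOVS accepts an 8-bit unsigned immediate (0..255).  */
static int fits_movs_imm8 (uint32_t word)
{
  return word <= 255u;
}

/* True if both words can be loaded with one MOVS each, i.e. two
   instructions and no literal-pool entries.  */
static int loadable_with_two_movs (uint64_t value)
{
  struct halves h = split64 (value);
  return fits_movs_imm8 (h.lo) && fits_movs_imm8 (h.hi);
}
```

For a constant whose high and low words are both 1, as in the example in the
quoted mail, the check succeeds, so two MOVS instructions replace two LDRs
plus two literal-pool words.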





RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction

2014-01-15 Thread Terry Guo
> 
> Preferably, particularly since you haven't supplied a testcase.
> 
> R.

Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826. I
shall update the patch to include the test case.

BR,
Terry




RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction

2014-01-15 Thread Terry Guo


> -----Original Message-----
> From: Terry Guo [mailto:terry@arm.com]
> Sent: Wednesday, January 15, 2014 8:21 PM
> To: Richard Earnshaw
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle
> rtx of ARM PLD instruction
> 
> >
> > Preferably, particularly since you haven't supplied a testcase.
> >
> > R.
> 
> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826. I shall
> update the patch to include the test case.
> 
> BR,
> Terry

Here is the updated patch along with a test case. Is it OK?

BR,
Terry

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 206619)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,21 @@
+2014-01-15  Terry Guo  
+
+   PR target/59826
+   Backported from mainline r204575 and applied to file arm.c.
+   2013-11-08  James Greenhalgh  
+
+   * config/arm/aarch-common.c
+   (search_term): New typedef.
+   (shift_rtx_costs): New array.
+   (arm_rtx_shift_left_p): New.
+   (arm_find_sub_rtx_with_search_term): Likewise.
+   (arm_find_sub_rtx_with_code): Likewise.
+   (arm_early_load_addr_dep): Add sanity checking.
+   (arm_no_early_alu_shift_dep): Likewise.
+   (arm_no_early_alu_shift_value_dep): Likewise.
+   (arm_no_early_mul_dep): Likewise.
+   (arm_no_early_store_addr_dep): Likewise.
+
 2014-01-14  Uros Bizjak  
 
Revert:
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 206619)
+++ gcc/config/arm/arm.c(working copy)
@@ -1161,6 +1161,30 @@
   TLS_DESCSEQ  /* GNU scheme */
 };
 
+typedef struct
+{
+  rtx_code search_code;
+  rtx search_result;
+  bool find_any_shift;
+} search_term;
+
+/* Return TRUE if X is either an arithmetic shift left, or
+   is a multiplication by a power of two.  */
+bool
+arm_rtx_shift_left_p (rtx x)
+{
+  enum rtx_code code = GET_CODE (x);
+
+  if (code == MULT && CONST_INT_P (XEXP (x, 1))
+  && exact_log2 (INTVAL (XEXP (x, 1))) > 0)
+return true;
+
+  if (code == ASHIFT)
+return true;
+
+  return false;
+}
+
 /* The maximum number of insns to be used when loading a constant.  */
 inline static int
 arm_constant_limit (bool size_p)
@@ -24604,62 +24628,116 @@
 *pretend_size = (NUM_ARG_REGS - nregs) * UNITS_PER_WORD;
 }
 
-/* Return nonzero if the CONSUMER instruction (a store) does not need
-   PRODUCER's value to calculate the address.  */
+static rtx_code shift_rtx_codes[] =
+  { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT,
+ROTATERT, ZERO_EXTEND, SIGN_EXTEND };
 
-int
-arm_no_early_store_addr_dep (rtx producer, rtx consumer)
+/* Callback function for arm_find_sub_rtx_with_code.
+   DATA is safe to treat as a SEARCH_TERM, ST.  This will
+   hold a SEARCH_CODE.  PATTERN is checked to see if it is an
+   RTX with that code.  If it is, write SEARCH_RESULT in ST
+   and return 1.  Otherwise, or if we have been passed a NULL_RTX
+   return 0.  If ST.FIND_ANY_SHIFT then we are interested in
+   anything which can reasonably be described as a SHIFT RTX.  */
+static int
+arm_find_sub_rtx_with_search_term (rtx *pattern, void *data)
 {
-  rtx value = PATTERN (producer);
-  rtx addr = PATTERN (consumer);
+  search_term *st = (search_term *) data;
+  rtx_code pattern_code;
+  int found = 0;
 
-  if (GET_CODE (value) == COND_EXEC)
-value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (addr) == COND_EXEC)
-addr = COND_EXEC_CODE (addr);
-  if (GET_CODE (addr) == PARALLEL)
-addr = XVECEXP (addr, 0, 0);
-  addr = XEXP (addr, 0);
+  gcc_assert (pattern);
+  gcc_assert (st);
 
-  return !reg_overlap_mentioned_p (value, addr);
+  /* Poorly formed patterns can really ruin our day.  */
+  if (*pattern == NULL_RTX)
+return 0;
+
+  pattern_code = GET_CODE (*pattern);
+
+  if (st->find_any_shift)
+{
+  unsigned i = 0;
+
+  /* Left shifts might have been canonicalized to a MULT of some
+power of two.  Make sure we catch them.  */
+  if (arm_rtx_shift_left_p (*pattern))
+   found = 1;
+  else
+   for (i = 0; i < ARRAY_SIZE (shift_rtx_codes); i++)
+ if (pattern_code == shift_rtx_codes[i])
+   found = 1;
+}
+
+  if (pattern_code == st->search_code)
+found = 1;
+
+  if (found)
+st->search_result = *pattern;
+
+  return found;
 }
 
-/* Return nonzero if the CONSUMER instruction (a store) does need
-   PRODUCER's value to calculate the address.  */
+/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE.  */
+static rtx
+arm_find_sub_rtx_with_code (rtx pattern, rtx_code code, bool find_any_shift)
+{
+  search_term st;
+  int result = 0;
 
-int
-arm_early_store_addr_dep (rtx produ
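
The check in arm_rtx_shift_left_p above treats a multiplication by a power of
two as a left shift, because left shifts may be canonicalized to MULT. A
minimal standalone sketch of that test follows, with the RTX replaced by a
simplified node description. exact_log2_sketch is a stand-in written for this
example (it returns -1 for values that are not powers of two), and the enum
and parameter names are hypothetical.

```c
#include <assert.h>

/* Stand-in for exact_log2: log2 of X if X is a power of two, else -1.  */
static int exact_log2_sketch (unsigned long x)
{
  int n = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while (x > 1)
    {
      x >>= 1;
      n++;
    }
  return n;
}

enum code_sketch { CODE_ASHIFT, CODE_MULT, CODE_PLUS };

/* Mirrors the logic of arm_rtx_shift_left_p on a simplified node: CODE is
   the operation, HAS_CONST says whether the second operand is a constant
   integer, and CONSTANT is its value.  */
static int is_shift_left (enum code_sketch code, int has_const,
                          unsigned long constant)
{
  assert (code == CODE_ASHIFT || code == CODE_MULT || code == CODE_PLUS);
  if (code == CODE_MULT && has_const && exact_log2_sketch (constant) > 0)
    return 1;
  return code == CODE_ASHIFT;
}
```

Note that the "> 0" comparison excludes a multiplication by 1, whose log2 is
0, as well as multipliers that are not powers of two.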

RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction

2014-01-15 Thread Terry Guo


> -----Original Message-----
> From: Richard Earnshaw
> Sent: Wednesday, January 15, 2014 10:30 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle
> rtx of ARM PLD instruction
> 
> On 15/01/14 12:37, Terry Guo wrote:
> >
> >
> >> -----Original Message-----
> >> From: Terry Guo [mailto:terry@arm.com]
> >> Sent: Wednesday, January 15, 2014 8:21 PM
> >> To: Richard Earnshaw
> >> Cc: gcc-patches@gcc.gnu.org
> >> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle
> >> rtx of ARM PLD instruction
> >>
> >>>
> >>> Preferably, particularly since you haven't supplied a testcase.
> >>>
> >>> R.
> >>
> >> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826.
> >> I shall update the patch to include the test case.
> >>
> >> BR,
> >> Terry
> >
> > Here is updated patch along with test case. Is it OK?
> >
> 
> I'm rather concerned about the complexity of this patch as a backport.
> Furthermore, part of the problem is that the preload insn is misclassified as
> an alu_reg operation, which it clearly isn't.
> 
> Instead of doing this, please could you try a simpler patch that simply
> reclassifies the "type" of preload as "load1".  This would break the
> alu->load/store dependency and thereby avoid triggering the problem.
> R.

Thanks Richard, you are right. What you describe should be the root cause of
this issue. I will implement another patch for trunk to correctly classify
the preload instruction as load1, and then backport it to the 4.8 branch.
Therefore please consider my request in this thread withdrawn.

BR,
Terry






[GCC, ARM] Backport trunk patch to 4.8 to reclassify ARM preload insn

2014-01-15 Thread Terry Guo
Hi,

The current 4.8 branch assigns the alu_reg type attribute to the ARM preload
insn, which is clearly wrong. The attached patch backports the trunk
patch that reclassifies the type attribute as load1. With this backport, the
4.8 bug PR59826 is fixed too. Tested with the GCC regression tests on QEMU
Cortex-M3, no new regressions. Is it OK to backport this patch
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00322.html?

BR,
Terry

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 206657)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,271 @@
+2014-01-16  Terry Guo  
+
+   PR target/59826
+   Partial Backport from mainline r202323.
+   2013-09-06  James Greenhalgh  
+
+   * config/arm/types.md: Add "no_insn", "multiple" and "untyped"
+   types.
+   * config/arm/arm-fixed.md: Add type attribute to all insn
+   patterns.
+   (add3): Add type attribute.
+   (add3): Likewise.
+   (usadd3): Likewise.
+   (ssadd3): Likewise.
+   (sub3): Likewise.
+   (sub3): Likewise.
+   (ussub3): Likewise.
+   (sssub3): Likewise.
+   (ssmulsa3): Likewise.
+   (usmulusa3): Likewise.
+   (arm_usatsihi): Likewise.
+   * config/arm/vfp.md
+   (*movdi_vfp): Add types for all instructions.
+   (*movdi_vfp_cortexa8): Likewise.
+   (*movhf_vfp_neon): Likewise.
+   (*movhf_vfp): Likewise.
+   (*movdf_vfp): Likewise.
+   (*thumb2_movdf_vfp): Likewise.
+   (*thumb2_movdfcc_vfp): Likewise.
+   * config/arm/arm.md: Add type attribute to all insn patterns.
+   (*thumb1_adddi3): Add type attribute.
+   (*arm_adddi3): Likewise.
+   (*adddi_sesidi_di): Likewise.
+   (*adddi_zesidi_di): Likewise.
+   (*thumb1_addsi3): Likewise.
+   (addsi3_compare0): Likewise.
+   (*addsi3_compare0_scratch): Likewise.
+   (*compare_negsi_si): Likewise.
+   (cmpsi2_addneg): Likewise.
+   (*addsi3_carryin_): Likewise.
+   (*addsi3_carryin_alt2_): Likewise.
+   (*addsi3_carryin_clobercc_): Likewise.
+   (*subsi3_carryin): Likewise.
+   (*subsi3_carryin_const): Likewise.
+   (*subsi3_carryin_compare): Likewise.
+   (*subsi3_carryin_compare_const): Likewise.
+   (*arm_subdi3): Likewise.
+   (*thumb_subdi3): Likewise.
+   (*subdi_di_zesidi): Likewise.
+   (*subdi_di_sesidi): Likewise.
+   (*subdi_zesidi_di): Likewise.
+   (*subdi_sesidi_di): Likewise.
+   (*subdi_zesidi_ze): Likewise.
+   (thumb1_subsi3_insn): Likewise.
+   (*arm_subsi3_insn): Likewise.
+   (*anddi3_insn): Likewise.
+   (*anddi_zesidi_di): Likewise.
+   (*anddi_sesdi_di): Likewise.
+   (*ne_zeroextracts): Likewise.
+   (*ne_zeroextracts): Likewise.
+   (*ite_ne_zeroextr): Likewise.
+   (*ite_ne_zeroextr): Likewise.
+   (*anddi_notdi_di): Likewise.
+   (*anddi_notzesidi): Likewise.
+   (*anddi_notsesidi): Likewise.
+   (andsi_notsi_si): Likewise.
+   (thumb1_bicsi3): Likewise.
+   (*iordi3_insn): Likewise.
+   (*iordi_zesidi_di): Likewise.
+   (*iordi_sesidi_di): Likewise.
+   (*thumb1_iorsi3_insn): Likewise.
+   (*xordi3_insn): Likewise.
+   (*xordi_zesidi_di): Likewise.
+   (*xordi_sesidi_di): Likewise.
+   (*arm_xorsi3): Likewise.
+   (*andsi_iorsi3_no): Likewise.
+   (*smax_0): Likewise.
+   (*smax_m1): Likewise.
+   (*arm_smax_insn): Likewise.
+   (*smin_0): Likewise.
+   (*arm_smin_insn): Likewise.
+   (*arm_umaxsi3): Likewise.
+   (*arm_uminsi3): Likewise.
+   (*minmax_arithsi): Likewise.
+   (*minmax_arithsi_): Likewise.
+   (*satsi_): Likewise.
+   (arm_ashldi3_1bit): Likewise.
+   (arm_ashrdi3_1bit): Likewise.
+   (arm_lshrdi3_1bit): Likewise.
+   (*arm_negdi2): Likewise.
+   (*thumb1_negdi2): Likewise.
+   (*arm_negsi2): Likewise.
+   (*thumb1_negsi2): Likewise.
+   (*negdi_extendsid): Likewise.
+   (*negdi_zero_extend): Likewise.
+   (*arm_abssi2): Likewise.
+   (*thumb1_abssi2): Likewise.
+   (*arm_neg_abssi2): Likewise.
+   (*thumb1_neg_abss): Likewise.
+   (one_cmpldi2): Likewise.
+   (extenddi2): Likewise.
+   (*compareqi_eq0): Likewise.
+   (*arm_extendhisi2addsi): Likewise.
+   (*arm_movdi): Likewise.
+   (*thumb1_movdi_insn): Likewise.
+   (*arm_movt): Likewise.
+   (*thumb1_movsi_insn): Likewise.
+   (pic_add_dot_plus_four): Likewise.
+   (pic_add_dot_plus_eight): Likewise.
+   (tls_load_dot_plus_eight): Likewise.
+   (*thumb1_movhi_insn): Likewise.
+   (*thumb1_movsf_insn): Likewise.
+   (*movdf_soft_insn): Likewise.
+   (*thumb_movdf_insn): Likewise.
+   (cbranchsi4_insn): Likewise.
+   (cbranchsi4_scratch): Likewise.
+   (*negated_cbranchsi4): Likewise.
+   (*tbit_cbranch): Likewise.
+   (*tlobits_cbranch): Like

RE: [GCC, ARM] Backport trunk patch to 4.8 to reclassify ARM preload insn

2014-01-16 Thread Terry Guo


> -----Original Message-----
> From: Richard Earnshaw
> Sent: Friday, January 17, 2014 12:22 AM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [GCC, ARM] Backport trunk patch to 4.8 to reclassify ARM
> preload insn
> 
> On 16/01/14 07:33, Terry Guo wrote:
> > Hi,
> >
> > Current 4.8 branch will assign alu_reg attribute to the type of arm
> > preload insn, which is clearly wrong. The attached patch intends to
> > back port trunk patch to reclassify the type attribute as load1. With
> > this back port, the
> > 4.8 bug PR59826 can be fixed too. Tested with gcc regression test on
> > QEMU Cortex-M3, no new regressions. Is it OK to back port this patch
> > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00322.html?
> >
> > BR,
> > Terry
> >
> >
> 
> The patch to arm.md is OK.  The ChangeLog entry should just describe the
> change you're making.  Don't call this a back-port - it isn't really.
> 
> R.
> 

OK. The updated patch is committed at
http://gcc.gnu.org/ml/gcc-cvs/2014-01/msg00436.html.

BR,
Terry




[PATCH, ARM] Document armv7e-m for ARM option -march

2014-02-07 Thread Terry Guo
Hi,

This small patch adds the missing armv7e-m to the documentation of the ARM
option -march. I will commit it to trunk and then backport it to the 4.7/4.8
branches as obvious.

BR,
Terry

2014-02-08  Terry Guo  

* doc/invoke.texi: Document ARM -march=armv7e-m.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e3dc9df..4d1b657 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12231,8 +12231,8 @@ of the @option{-mcpu=} option.  Permissible names are: @samp{armv2},
 @samp{armv5}, @samp{armv5t}, @samp{armv5e}, @samp{armv5te},
 @samp{armv6}, @samp{armv6j},
 @samp{armv6t2}, @samp{armv6z}, @samp{armv6zk}, @samp{armv6-m},
-@samp{armv7}, @samp{armv7-a}, @samp{armv7-r}, @samp{armv7-m}, @samp{armv7ve},
-@samp{armv8-a}, @samp{armv8-a+crc},
+@samp{armv7}, @samp{armv7-a}, @samp{armv7-r}, @samp{armv7-m}, @samp{armv7e-m},
+@samp{armv7ve}, @samp{armv8-a}, @samp{armv8-a+crc},
 @samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}.
 
 @option{-march=armv7ve} is the armv7-a architecture with virtualization



