On 30/11/16 09:50, Thomas Preudhomme wrote:
> Hi,
>
> Is this ok to backport to gcc-5-branch and gcc-6-branch? Patch applies
> cleanly (patches attached for reference).
>
>
> 2016-11-17  Thomas Preud'homme  <thomas.preudho...@arm.com>
>
>     Backport from mainline
>     2016-11-17  Thomas Preud'homme  <thomas.preudho...@arm.com>
>
>     gcc/
>     PR target/77933
>     * config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
>     being live in the function and lr needing to be saved.  Distinguish
>     between already saved pushable registers and registers to push.
>     Check for LR being an available pushable register.
>
>     gcc/testsuite/
>     PR target/77933
>     * gcc.target/arm/pr77933-1.c: New test.
>     * gcc.target/arm/pr77933-2.c: Likewise.
>
Your attached patch doesn't appear to match your ChangeLog. (rmprofile patch?).

R.

>
> Best regards,
>
> Thomas
>
>
> On 17/11/16 20:15, Thomas Preudhomme wrote:
>> Hi Kyrill,
>>
>> I've committed the following updated patch, where the test is restricted
>> to Thumb execution mode and skipped when that is not possible, since
>> -mtpcs-leaf-frame is only available in Thumb mode. I've considered the
>> change obvious.
>>
>> *** gcc/ChangeLog ***
>>
>> 2016-11-08  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>
>>     PR target/77933
>>     * config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
>>     being live in the function and lr needing to be saved.  Distinguish
>>     between already saved pushable registers and registers to push.
>>     Check for LR being an available pushable register.
>>
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2016-11-08  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>
>>     PR target/77933
>>     * gcc.target/arm/pr77933-1.c: New test.
>>     * gcc.target/arm/pr77933-2.c: Likewise.
>>
>> Best regards,
>>
>> Thomas
>>
>> On 17/11/16 10:04, Kyrill Tkachov wrote:
>>>
>>> On 09/11/16 16:41, Thomas Preudhomme wrote:
>>>> I've reworked the patch following comments from Wilco [1] (sorry, I could
>>>> not find it in my MUA for some reason).
>>>>
>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00317.html
>>>>
>>>>
>>>> == Context ==
>>>>
>>>> When saving registers, function thumb1_expand_prologue () aims at
>>>> minimizing the number of push instructions. One of the optimizations it
>>>> does is to push LR alongside high register(s) (after having moved them to
>>>> low register(s)) when there is no low register to save. The way this is
>>>> implemented is to add LR to the pushable_regs mask if it is live just
>>>> before pushing the registers in that mask. The mask of live pushable
>>>> registers, which is used to detect whether LR needs to be saved, is then
>>>> cleared to ensure LR is only saved once.
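To make the masks in the Context paragraph concrete, here is a minimal sketch in plain C. It is not code from arm.c: the helper names are hypothetical, the values 0x40ff and LR_REGNUM = 14 are taken from the attached arm.c patch, and the 0x0f00 mask for high registers (r8-r11) is an assumption about how the prologue counts them. The TARGET_BACKTRACE path is ignored.

```c
#include <stdbool.h>

enum { LR_REGNUM = 14 };

/* r0-r7 plus LR: the registers a Thumb-1 push can name
   (the 0x40ff constant used in the patch).  */
#define THUMB1_PUSH_MASK 0x40ffUL

/* Assumption: high registers that may need saving are r8-r11.  */
#define HIGH_REGS_MASK 0x0f00UL

/* Equivalent of "l_mask = live_regs_mask & 0x40ff".  */
unsigned long live_pushable_mask(unsigned long live_regs_mask)
{
    return live_regs_mask & THUMB1_PUSH_MASK;
}

/* Equivalent of the initial value of lr_needs_saving in the patch.  */
bool lr_initially_needs_saving(unsigned long live_regs_mask)
{
    return (live_regs_mask & (1UL << LR_REGNUM)) != 0;
}

/* True when LR is not pushed with the low registers and instead rides
   along with the first high-register push: LR is live, no low register
   is live, and at least one high register must be saved.  */
bool lr_deferred_to_high_push(unsigned long live_regs_mask)
{
    unsigned long l_mask = live_pushable_mask(live_regs_mask);
    bool high_regs_live = (live_regs_mask & HIGH_REGS_MASK) != 0;

    return lr_initially_needs_saving(live_regs_mask)
           && (l_mask & 0xffUL) == 0
           && high_regs_live;
}
```

With r8, r9 and lr live (mask 0x4300), as in the clobber list of the new tests, the sketch reports that LR is deferred to the high-register push; adding a live low register such as r4 (mask 0x4310) makes it part of the ordinary low-register push instead.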
>>>>
>>>>
>>>> == Problem ==
>>>>
>>>> However, beyond deciding what register to push, pushable_regs is used to
>>>> track what pushable register can be used to move a high register before
>>>> being pushed, hence the name. That mask is cleared when all high
>>>> registers have been assigned a low register, but the clearing assumes the
>>>> high registers were assigned to the registers with the biggest number in
>>>> that mask. This is not the case because LR is not considered when looking
>>>> for a register in that mask. Furthermore, LR might have been saved in the
>>>> TARGET_BACKTRACE path above, yet the mask of live pushable registers is
>>>> not cleared in that case.
>>>>
>>>>
>>>> == Solution ==
>>>>
>>>> This patch changes the loop to iterate over register LR to r0 so as to
>>>> both fix the stack corruption reported in PR77933 and reuse lr to push
>>>> some high register when possible. This patch also introduces a new
>>>> variable lr_needs_saving to record whether LR (still) needs to be saved
>>>> at a given point in code and sets the variable accordingly throughout the
>>>> code, thus fixing the second issue. Finally, this patch creates a new
>>>> push_mask variable to distinguish between the mask of registers to push
>>>> and the mask of live pushable registers.
>>>>
>>>>
>>>> == Note ==
>>>>
>>>> Other bits could have been improved but have been left out to allow the
>>>> patch to be backported to a stable branch:
>>>>
>>>> (1) using argument registers that are not holding an argument
>>>> (2) using push_mask consistently instead of l_mask (in TARGET_BACKTRACE),
>>>>     mask (low register push) and push_mask
>>>> (3) the !l_mask case improved in TARGET_BACKTRACE since offset == 0
>>>> (4) rename l_mask to a more appropriate name (live_pushable_regs_mask?)
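The Solution section can be illustrated with a small stand-alone model of the patched loop. This is only a sketch under stated assumptions, not the arm.c code: model_high_reg_pushes, struct push_record and the other helper names are hypothetical, no RTL is emitted, the caller must supply at least one staging register (the real code falls back to thumb_find_work_register), and high registers are assumed to be r8-r11. What it keeps from the patch is the scan from LR down to r0, the separate push_mask, and the lr_needs_saving flag.

```c
#include <assert.h>
#include <stdbool.h>

enum { LAST_LO_REGNUM = 7, LR_REGNUM = 14 };

/* One emitted push: the registers named by the instruction (push_mask)
   and the registers whose values really end up on the stack
   (real_regs_mask).  */
struct push_record {
    unsigned long push_mask;
    unsigned long real_regs_mask;
};

/* Number of set bits in X.  */
static int bit_count_ul(unsigned long x)
{
    int n = 0;
    for (; x != 0; x &= x - 1)
        n++;
    return n;
}

/* Highest-numbered high register (r8-r11, an assumption) still waiting
   to be saved, or -1 if none is left.  */
static int next_live_high_reg(unsigned long live_high_regs)
{
    for (int regno = 11; regno > LAST_LO_REGNUM; regno--)
        if (live_high_regs & (1UL << regno))
            return regno;
    return -1;
}

/* Words stored by one push; checks that the instruction names exactly
   as many registers as there are values being saved.  */
int words_in_push(const struct push_record *rec)
{
    int words = bit_count_ul(rec->push_mask);
    assert(words == bit_count_ul(rec->real_regs_mask));
    return words;
}

/* Simplified model of the patched loop.  Fills OUT with at most MAX_OUT
   pushes and returns how many were recorded.  */
int model_high_reg_pushes(unsigned long live_high_regs,
                          unsigned long pushable_regs,
                          bool lr_needs_saving,
                          struct push_record *out, int max_out)
{
    int count = 0;

    live_high_regs &= 0x0f00UL;
    /* LR cannot stage a high register while it still holds a live value.  */
    if (lr_needs_saving)
        pushable_regs &= ~(1UL << LR_REGNUM);
    /* The model needs a staging register among r0-r7 and LR.  */
    assert((pushable_regs & 0x40ffUL) != 0);

    while (live_high_regs != 0 && count < max_out) {
        struct push_record rec = { 0, 0 };

        /* Scan from LR down to r0, as the patched loop does.  */
        for (int regno = LR_REGNUM; regno >= 0 && live_high_regs != 0; regno--) {
            if (pushable_regs & (1UL << regno)) {
                int hi = next_live_high_reg(live_high_regs);
                live_high_regs &= ~(1UL << hi);
                rec.real_regs_mask |= 1UL << hi;
                rec.push_mask |= 1UL << regno;
            }
        }

        /* LR joins the first push only, then never again.  */
        if (lr_needs_saving) {
            rec.push_mask |= 1UL << LR_REGNUM;
            rec.real_regs_mask |= 1UL << LR_REGNUM;
            lr_needs_saving = false;
        }

        out[count++] = rec;
    }
    return count;
}
```

For the situation exercised by the new tests (r8, r9 and lr to save, a single staging register, here arbitrarily r3), the model emits two pushes that store three words in total, so the prologue saves exactly as many words as there are live values. When LR has already been saved earlier, it can be passed in pushable_regs and the scan will pick it first as a staging register.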
>>>>
>>>> ChangeLog entries are as follows:
>>>>
>>>> *** gcc/ChangeLog ***
>>>>
>>>> 2016-11-08  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>>>
>>>>     PR target/77933
>>>>     * config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
>>>>     being live in the function and lr needing to be saved.  Distinguish
>>>>     between already saved pushable registers and registers to push.
>>>>     Check for LR being an available pushable register.
>>>>
>>>>
>>>> *** gcc/testsuite/ChangeLog ***
>>>>
>>>> 2016-11-08  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>>>
>>>>     PR target/77933
>>>>     * gcc.target/arm/pr77933-1.c: New test.
>>>>     * gcc.target/arm/pr77933-2.c: Likewise.
>>>>
>>>>
>>>> Testing: no regression on arm-none-eabi GCC cross-compiler targeting
>>>> Cortex-M0
>>>>
>>>> Is this ok for trunk?
>>>>
>>>
>>> Ok.
>>> Thanks,
>>> Kyrill
>>>
>>>> Best regards,
>>>>
>>>> Thomas
>>>>
>>>> On 02/11/16 17:08, Thomas Preudhomme wrote:
>>>>> Hi,
>>>>>
>>>>> When saving registers, function thumb1_expand_prologue () aims at
>>>>> minimizing the number of push instructions. One of the optimizations it
>>>>> does is to push lr alongside high register(s) (after having moved them
>>>>> to low register(s)) when there is no low register to save. The way this
>>>>> is implemented is to add lr to the list of registers that can be pushed
>>>>> just before the push happens. This would then push lr and allow it to be
>>>>> used for further pushes if there were not enough registers to push all
>>>>> the high registers.
>>>>>
>>>>> However, the logic that decides what register to move high registers to
>>>>> before being pushed only looks at low registers (see for loop
>>>>> initialization). This means not only that lr is not used for pushing
>>>>> high registers but also that lr is not removed from the list of
>>>>> registers to be pushed when it's not used.
>>>>> This extra lr push is not popped in the epilogue, leading to stack
>>>>> corruption.
>>>>>
>>>>> This patch changes the loop to iterate over register r0 to lr so as to
>>>>> both fix the stack corruption and reuse lr to push some high register
>>>>> when possible.
>>>>>
>>>>> ChangeLog entries are as follows:
>>>>>
>>>>> *** gcc/ChangeLog ***
>>>>>
>>>>> 2016-11-01  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>>>>
>>>>>     PR target/77933
>>>>>     * config/arm/arm.c (thumb1_expand_prologue): Also check for lr
>>>>>     being a pushable register.
>>>>>
>>>>>
>>>>> *** gcc/testsuite/ChangeLog ***
>>>>>
>>>>> 2016-11-01  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>>>>
>>>>>     PR target/77933
>>>>>     * gcc.target/arm/pr77933.c: New test.
>>>>>
>>>>>
>>>>> Testing: no regression on arm-none-eabi GCC cross-compiler targeting
>>>>> Cortex-M0
>>>>>
>>>>> Is this ok for trunk?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Thomas
>>>
>
> 1_rmprofile_multilib.patch
>
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index d956da22ad60abfe9c6b4be0882f9e7dd64ac39f..15b662ad5449f8b91eb760b7fbe45f33d8cecb4b 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -3739,6 +3739,16 @@ case "${target}" in
>  				# pragmatic.
>  				tmake_profile_file="arm/t-aprofile"
>  				;;
> +			rmprofile)
> +				# Note that arm/t-rmprofile is a
> +				# stand-alone make file fragment to be
> +				# used only with itself.  We do not
> +				# specifically use the
> +				# TM_MULTILIB_OPTION framework because
> +				# this shorthand is more
> +				# pragmatic.
> +				tmake_profile_file="arm/t-rmprofile"
> +				;;
>  			default)
>  				;;
>  			*)
> @@ -3748,9 +3758,10 @@ case "${target}" in
>  			esac
>
>  			if test "x${tmake_profile_file}" != x ; then
> -				# arm/t-aprofile is only designed to work
> -				# without any with-cpu, with-arch, with-mode,
> -				# with-fpu or with-float options.
> +				# arm/t-aprofile and arm/t-rmprofile are only
> +				# designed to work without any with-cpu,
> +				# with-arch, with-mode, with-fpu or with-float
> +				# options.
> if test "x$with_arch" != x \ > || test "x$with_cpu" != x \ > || test "x$with_float" != x \ > diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile > new file mode 100644 > index > 0000000000000000000000000000000000000000..c8b5c9cbd03694eea69855e20372afa3e97d6b4c > --- /dev/null > +++ b/gcc/config/arm/t-rmprofile > @@ -0,0 +1,174 @@ > +# Copyright (C) 2016 Free Software Foundation, Inc. > +# > +# This file is part of GCC. > +# > +# GCC is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3, or (at your option) > +# any later version. > +# > +# GCC is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with GCC; see the file COPYING3. If not see > +# <http://www.gnu.org/licenses/>. > + > +# This is a target makefile fragment that attempts to get > +# multilibs built for the range of CPU's, FPU's and ABI's that > +# are relevant for the ARM architecture. It should not be used in > +# conjunction with another make file fragment and assumes --with-arch, > +# --with-cpu, --with-fpu, --with-float, --with-mode have their default > +# values during the configure step. We enforce this during the > +# top-level configury. > + > +MULTILIB_OPTIONS = > +MULTILIB_DIRNAMES = > +MULTILIB_EXCEPTIONS = > +MULTILIB_MATCHES = > +MULTILIB_REUSE = > + > +# We have the following hierachy: > +# ISA: A32 (.) or T16/T32 (thumb). > +# Architecture: ARMv6S-M (v6-m), ARMv7-M (v7-m), ARMv7E-M (v7e-m), > +# ARMv8-M Baseline (v8-m.base) or ARMv8-M Mainline > (v8-m.main). > +# FPU: VFPv3-D16 (fpv3), FPV4-SP-D16 (fpv4-sp), FPV5-SP-D16 (fpv5-sp), > +# VFPv5-D16 (fpv5), or None (.). 
> +# Float-abi: Soft (.), softfp (softfp), or hard (hardfp). > + > +# Options to build libraries with > + > +MULTILIB_OPTIONS += mthumb > +MULTILIB_DIRNAMES += thumb > + > +MULTILIB_OPTIONS += > march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7/march=armv8-m.base/march=armv8-m.main > +MULTILIB_DIRNAMES += v6-m v7-m v7e-m v7-ar v8-m.base v8-m.main > + > +MULTILIB_OPTIONS += > mfpu=vfpv3-d16/mfpu=fpv4-sp-d16/mfpu=fpv5-sp-d16/mfpu=fpv5-d16 > +MULTILIB_DIRNAMES += fpv3 fpv4-sp fpv5-sp fpv5 > + > +MULTILIB_OPTIONS += mfloat-abi=softfp/mfloat-abi=hard > +MULTILIB_DIRNAMES += softfp hard > + > + > +# Option combinations to build library with > + > +# Default CPU/Arch > +MULTILIB_REQUIRED += mthumb > +MULTILIB_REQUIRED += mfloat-abi=hard > + > +# ARMv6-M > +MULTILIB_REQUIRED += mthumb/march=armv6s-m > + > +# ARMv8-M Baseline > +MULTILIB_REQUIRED += mthumb/march=armv8-m.base > + > +# ARMv7-M > +MULTILIB_REQUIRED += mthumb/march=armv7-m > + > +# ARMv7E-M > +MULTILIB_REQUIRED += mthumb/march=armv7e-m > +MULTILIB_REQUIRED += > mthumb/march=armv7e-m/mfpu=fpv4-sp-d16/mfloat-abi=softfp > +MULTILIB_REQUIRED += > mthumb/march=armv7e-m/mfpu=fpv4-sp-d16/mfloat-abi=hard > +MULTILIB_REQUIRED += > mthumb/march=armv7e-m/mfpu=fpv5-d16/mfloat-abi=softfp > +MULTILIB_REQUIRED += mthumb/march=armv7e-m/mfpu=fpv5-d16/mfloat-abi=hard > +MULTILIB_REQUIRED += > mthumb/march=armv7e-m/mfpu=fpv5-sp-d16/mfloat-abi=softfp > +MULTILIB_REQUIRED += > mthumb/march=armv7e-m/mfpu=fpv5-sp-d16/mfloat-abi=hard > + > +# ARMv8-M Mainline > +MULTILIB_REQUIRED += mthumb/march=armv8-m.main > +MULTILIB_REQUIRED += > mthumb/march=armv8-m.main/mfpu=fpv5-d16/mfloat-abi=softfp > +MULTILIB_REQUIRED += > mthumb/march=armv8-m.main/mfpu=fpv5-d16/mfloat-abi=hard > +MULTILIB_REQUIRED += > mthumb/march=armv8-m.main/mfpu=fpv5-sp-d16/mfloat-abi=softfp > +MULTILIB_REQUIRED += > mthumb/march=armv8-m.main/mfpu=fpv5-sp-d16/mfloat-abi=hard > + > +# ARMv7-R as well as ARMv7-A and ARMv8-A if aprofile was not specified > 
+MULTILIB_REQUIRED += mthumb/march=armv7 > +MULTILIB_REQUIRED += mthumb/march=armv7/mfpu=vfpv3-d16/mfloat-abi=softfp > +MULTILIB_REQUIRED += mthumb/march=armv7/mfpu=vfpv3-d16/mfloat-abi=hard > + > + > +# Matches > + > +# CPU Matches > +MULTILIB_MATCHES += march?armv6s-m=mcpu?cortex-m0 > +MULTILIB_MATCHES += march?armv6s-m=mcpu?cortex-m0.small-multiply > +MULTILIB_MATCHES += march?armv6s-m=mcpu?cortex-m0plus > +MULTILIB_MATCHES += march?armv6s-m=mcpu?cortex-m0plus.small-multiply > +MULTILIB_MATCHES += march?armv6s-m=mcpu?cortex-m1 > +MULTILIB_MATCHES += march?armv6s-m=mcpu?cortex-m1.small-multiply > +MULTILIB_MATCHES += march?armv7-m=mcpu?cortex-m3 > +MULTILIB_MATCHES += march?armv7e-m=mcpu?cortex-m4 > +MULTILIB_MATCHES += march?armv7e-m=mcpu?cortex-m7 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-r4 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-r4f > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-r5 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-r7 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-r8 > +MULTILIB_MATCHES += march?armv7=mcpu?marvell-pj4 > +MULTILIB_MATCHES += march?armv7=mcpu?generic-armv7-a > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a8 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a9 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a5 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a7 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a15 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a12 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a17 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a15.cortex-a7 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a17.cortex-a7 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a32 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a35 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a53 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a57 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a57.cortex-a53 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a72 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a72.cortex-a53 > +MULTILIB_MATCHES += 
march?armv7=mcpu?cortex-a73 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a73.cortex-a35 > +MULTILIB_MATCHES += march?armv7=mcpu?cortex-a73.cortex-a53 > +MULTILIB_MATCHES += march?armv7=mcpu?exynos-m1 > +MULTILIB_MATCHES += march?armv7=mcpu?qdf24xx > +MULTILIB_MATCHES += march?armv7=mcpu?xgene1 > + > +# Arch Matches > +MULTILIB_MATCHES += march?armv6s-m=march?armv6-m > +MULTILIB_MATCHES += march?armv8-m.main=march?armv8-m.main+dsp > +MULTILIB_MATCHES += march?armv7=march?armv7-r > +ifeq (,$(HAS_APROFILE)) > +MULTILIB_MATCHES += march?armv7=march?armv7-a > +MULTILIB_MATCHES += march?armv7=march?armv7ve > +MULTILIB_MATCHES += march?armv7=march?armv8-a > +MULTILIB_MATCHES += march?armv7=march?armv8-a+crc > +MULTILIB_MATCHES += march?armv7=march?armv8.1-a > +MULTILIB_MATCHES += march?armv7=march?armv8.1-a+crc > +MULTILIB_MATCHES += march?armv7=march?armv8.2-a > +MULTILIB_MATCHES += march?armv7=march?armv8.2-a+fp16 > +endif > + > +# FPU matches > +ifeq (,$(HAS_APROFILE)) > +MULTILIB_MATCHES += mfpu?vfpv3-d16=mfpu?vfpv3 > +MULTILIB_MATCHES += mfpu?vfpv3-d16=mfpu?vfpv3-fp16 > +MULTILIB_MATCHES += mfpu?vfpv3-d16=mfpu?vfpv3-d16-fp16 > +MULTILIB_MATCHES += mfpu?vfpv3-d16=mfpu?neon > +MULTILIB_MATCHES += mfpu?vfpv3-d16=mfpu?neon-fp16 > +MULTILIB_MATCHES += mfpu?vfpv3-d16=mfpu?vfpv4 > +MULTILIB_MATCHES += mfpu?vfpv3-d16=mfpu?vfpv4-d16 > +MULTILIB_MATCHES += mfpu?vfpv3-d16=mfpu?neon-vfpv4 > +MULTILIB_MATCHES += mfpu?fpv5-d16=mfpu?fp-armv8 > +MULTILIB_MATCHES += mfpu?fpv5-d16=mfpu?neon-fp-armv8 > +MULTILIB_MATCHES += mfpu?fpv5-d16=mfpu?crypto-neon-fp-armv8 > +endif > + > + > +# We map all requests for ARMv7-R or ARMv7-A in ARM mode to Thumb mode and > +# any FPU to VFPv3-d16 if possible. 
> +MULTILIB_REUSE += mthumb/march.armv7=march.armv7 > +MULTILIB_REUSE += > mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.softfp=march.armv7/mfpu.vfpv3-d16/mfloat-abi.softfp > +MULTILIB_REUSE += > mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.hard=march.armv7/mfpu.vfpv3-d16/mfloat-abi.hard > +MULTILIB_REUSE += > mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.softfp=march.armv7/mfpu.fpv5-d16/mfloat-abi.softfp > +MULTILIB_REUSE += > mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.hard=march.armv7/mfpu.fpv5-d16/mfloat-abi.hard > +MULTILIB_REUSE += > mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.softfp=mthumb/march.armv7/mfpu.fpv5-d16/mfloat-abi.softfp > +MULTILIB_REUSE += > mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.hard=mthumb/march.armv7/mfpu.fpv5-d16/mfloat-abi.hard > diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi > index > e4c686e60c7f479ca3ea71e94c4bb6ad52373085..0b94bc1931a226e58d06a7ed5a726454142c006a > 100644 > --- a/gcc/doc/install.texi > +++ b/gcc/doc/install.texi > @@ -1107,19 +1107,59 @@ sysv, aix. > > @item --with-multilib-list=@var{list} > @itemx --without-multilib-list > -Specify what multilibs to build. > -Currently only implemented for arm*-*-*, sh*-*-* and x86-64-*-linux*. > +Specify what multilibs to build. @var{list} is a comma separated list of > +values, possibly consisting of a single value. Currently only implemented > +for arm*-*-*, sh*-*-* and x86-64-*-linux*. The accepted values and meaning > +for each target is given below. > > @table @code > @item arm*-*-* > -@var{list} is either @code{default} or @code{aprofile}. 
Specifying > -@code{default} is equivalent to omitting this option while specifying > -@code{aprofile} builds multilibs for each combination of ISA (@code{-marm} or > -@code{-mthumb}), architecture (@code{-march=armv7-a}, @code{-march=armv7ve}, > -or @code{-march=armv8-a}), FPU available (none, @code{-mfpu=vfpv3-d16}, > -@code{-mfpu=neon}, @code{-mfpu=vfpv4-d16}, @code{-mfpu=neon-vfpv4} or > -@code{-mfpu=neon-fp-armv8} depending on architecture) and floating-point ABI > -(@code{-mfloat-abi=softfp} or @code{-mfloat-abi=hard}). > +@var{list} is one of@code{default}, @code{aprofile} or @code{rmprofile}. > +Specifying @code{default} is equivalent to omitting this option, ie. only the > +default runtime library will be enabled. Specifying @code{aprofile} or > +@code{rmprofile} builds multilibs for a combination of ISA, architecture, > +FPU available and floating-point ABI. > + > +The table below gives the combination of ISAs, architectures, FPUs and > +floating-point ABIs for which multilibs are built for each accepted value. 
> + > +@multitable @columnfractions .15 .28 .30 > +@item Option @tab aprofile @tab rmprofile > +@item ISAs > +@tab @code{-marm} and @code{-mthumb} > +@tab @code{-mthumb} > +@item Architectures@*@*@*@*@*@* > +@tab default architecture@* > +@code{-march=armv7-a}@* > +@code{-march=armv7ve}@* > +@code{-march=armv8-a}@*@*@* > +@tab default architecture@* > +@code{-march=armv6s-m}@* > +@code{-march=armv7-m}@* > +@code{-march=armv7e-m}@* > +@code{-march=armv8-m.base}@* > +@code{-march=armv8-m.main}@* > +@code{-march=armv7} > +@item FPUs@*@*@*@*@* > +@tab none@* > +@code{-mfpu=vfpv3-d16}@* > +@code{-mfpu=neon}@* > +@code{-mfpu=vfpv4-d16}@* > +@code{-mfpu=neon-vfpv4}@* > +@code{-mfpu=neon-fp-armv8} > +@tab none@* > +@code{-mfpu=vfpv3-d16}@* > +@code{-mfpu=fpv4-sp-d16}@* > +@code{-mfpu=fpv5-sp-d16}@* > +@code{-mfpu=fpv5-d16}@* > +@item floating-point@/ ABIs@*@* > +@tab @code{-mfloat-abi=soft}@* > +@code{-mfloat-abi=softfp}@* > +@code{-mfloat-abi=hard} > +@tab @code{-mfloat-abi=soft}@* > +@code{-mfloat-abi=softfp}@* > +@code{-mfloat-abi=hard} > +@end multitable > > @item sh*-*-* > @var{list} is a comma separated list of CPU names. These must be of the > > > fix_pr77933_gcc5.patch > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index > c01a3c878968f6e6f07358b0686e4a59e34f56b7..5c975625bfa25d2c71c27db348cd3e70fe44a951 > 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -24457,6 +24457,7 @@ thumb1_expand_prologue (void) > unsigned long live_regs_mask; > unsigned long l_mask; > unsigned high_regs_pushed = 0; > + bool lr_needs_saving; > > func_type = arm_current_func_type (); > > @@ -24479,6 +24480,7 @@ thumb1_expand_prologue (void) > > offsets = arm_get_frame_offsets (); > live_regs_mask = offsets->saved_regs_mask; > + lr_needs_saving = live_regs_mask & (1 << LR_REGNUM); > > /* Extract a mask of the ones we can give to the Thumb's push instruction. 
> */ > l_mask = live_regs_mask & 0x40ff; > @@ -24545,6 +24547,7 @@ thumb1_expand_prologue (void) > { > insn = thumb1_emit_multi_reg_push (l_mask, l_mask); > RTX_FRAME_RELATED_P (insn) = 1; > + lr_needs_saving = false; > > offset = bit_count (l_mask) * UNITS_PER_WORD; > } > @@ -24609,12 +24612,13 @@ thumb1_expand_prologue (void) > be a push of LR and we can combine it with the push of the first high > register. */ > else if ((l_mask & 0xff) != 0 > - || (high_regs_pushed == 0 && l_mask)) > + || (high_regs_pushed == 0 && lr_needs_saving)) > { > unsigned long mask = l_mask; > mask |= (1 << thumb1_extra_regs_pushed (offsets, true)) - 1; > insn = thumb1_emit_multi_reg_push (mask, mask); > RTX_FRAME_RELATED_P (insn) = 1; > + lr_needs_saving = false; > } > > if (high_regs_pushed) > @@ -24632,7 +24636,9 @@ thumb1_expand_prologue (void) > /* Here we need to mask out registers used for passing arguments > even if they can be pushed. This is to avoid using them to stash the > high > registers. Such kind of stash may clobber the use of arguments. 
*/ > - pushable_regs = l_mask & (~arg_regs_mask) & 0xff; > + pushable_regs = l_mask & (~arg_regs_mask); > + if (lr_needs_saving) > + pushable_regs &= ~(1 << LR_REGNUM); > > if (pushable_regs == 0) > pushable_regs = 1 << thumb_find_work_register (live_regs_mask); > @@ -24640,8 +24646,9 @@ thumb1_expand_prologue (void) > while (high_regs_pushed > 0) > { > unsigned long real_regs_mask = 0; > + unsigned long push_mask = 0; > > - for (regno = LAST_LO_REGNUM; regno >= 0; regno --) > + for (regno = LR_REGNUM; regno >= 0; regno --) > { > if (pushable_regs & (1 << regno)) > { > @@ -24650,6 +24657,7 @@ thumb1_expand_prologue (void) > > high_regs_pushed --; > real_regs_mask |= (1 << next_hi_reg); > + push_mask |= (1 << regno); > > if (high_regs_pushed) > { > @@ -24659,23 +24667,20 @@ thumb1_expand_prologue (void) > break; > } > else > - { > - pushable_regs &= ~((1 << regno) - 1); > - break; > - } > + break; > } > } > > /* If we had to find a work register and we have not yet > saved the LR then add it to the list of regs to push. 
*/ > - if (l_mask == (1 << LR_REGNUM)) > + if (lr_needs_saving) > { > - pushable_regs |= l_mask; > - real_regs_mask |= l_mask; > - l_mask = 0; > + push_mask |= 1 << LR_REGNUM; > + real_regs_mask |= 1 << LR_REGNUM; > + lr_needs_saving = false; > } > > - insn = thumb1_emit_multi_reg_push (pushable_regs, real_regs_mask); > + insn = thumb1_emit_multi_reg_push (push_mask, real_regs_mask); > RTX_FRAME_RELATED_P (insn) = 1; > } > } > diff --git a/gcc/testsuite/gcc.target/arm/pr77933-1.c > b/gcc/testsuite/gcc.target/arm/pr77933-1.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..95cf68ea7531bcc453371f493a05bd40caa5541b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/pr77933-1.c > @@ -0,0 +1,46 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2" } */ > + > +__attribute__ ((noinline, noclone)) void > +clobber_lr_and_highregs (void) > +{ > + __asm__ volatile ("" : : : "r8", "r9", "lr"); > +} > + > +int > +main (void) > +{ > + int ret; > + > + __asm volatile ("mov\tr4, #0xf4\n\t" > + "mov\tr5, #0xf5\n\t" > + "mov\tr6, #0xf6\n\t" > + "mov\tr7, #0xf7\n\t" > + "mov\tr0, #0xf8\n\t" > + "mov\tr8, r0\n\t" > + "mov\tr0, #0xfa\n\t" > + "mov\tr10, r0" > + : : : "r0", "r4", "r5", "r6", "r7", "r8", "r10"); > + > + clobber_lr_and_highregs (); > + > + __asm volatile ("cmp\tr4, #0xf4\n\t" > + "bne\tfail\n\t" > + "cmp\tr5, #0xf5\n\t" > + "bne\tfail\n\t" > + "cmp\tr6, #0xf6\n\t" > + "bne\tfail\n\t" > + "cmp\tr7, #0xf7\n\t" > + "bne\tfail\n\t" > + "mov\tr0, r8\n\t" > + "cmp\tr0, #0xf8\n\t" > + "bne\tfail\n\t" > + "mov\tr0, r10\n\t" > + "cmp\tr0, #0xfa\n\t" > + "bne\tfail\n\t" > + "mov\t%0, #1\n" > + "fail:\n\t" > + "sub\tr0, #1" > + : "=r" (ret) : :); > + return ret; > +} > diff --git a/gcc/testsuite/gcc.target/arm/pr77933-2.c > b/gcc/testsuite/gcc.target/arm/pr77933-2.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..9028c4fcab4229591fa057f15c641d2b5597cd1d > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/pr77933-2.c > @@ -0,0 
+1,47 @@ > +/* { dg-do run } */ > +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */ > +/* { dg-options "-mthumb -O2 -mtpcs-leaf-frame" } */ > + > +__attribute__ ((noinline, noclone)) void > +clobber_lr_and_highregs (void) > +{ > + __asm__ volatile ("" : : : "r8", "r9", "lr"); > +} > + > +int > +main (void) > +{ > + int ret; > + > + __asm volatile ("mov\tr4, #0xf4\n\t" > + "mov\tr5, #0xf5\n\t" > + "mov\tr6, #0xf6\n\t" > + "mov\tr7, #0xf7\n\t" > + "mov\tr0, #0xf8\n\t" > + "mov\tr8, r0\n\t" > + "mov\tr0, #0xfa\n\t" > + "mov\tr10, r0" > + : : : "r0", "r4", "r5", "r6", "r7", "r8", "r10"); > + > + clobber_lr_and_highregs (); > + > + __asm volatile ("cmp\tr4, #0xf4\n\t" > + "bne\tfail\n\t" > + "cmp\tr5, #0xf5\n\t" > + "bne\tfail\n\t" > + "cmp\tr6, #0xf6\n\t" > + "bne\tfail\n\t" > + "cmp\tr7, #0xf7\n\t" > + "bne\tfail\n\t" > + "mov\tr0, r8\n\t" > + "cmp\tr0, #0xf8\n\t" > + "bne\tfail\n\t" > + "mov\tr0, r10\n\t" > + "cmp\tr0, #0xfa\n\t" > + "bne\tfail\n\t" > + "mov\t%0, #1\n" > + "fail:\n\t" > + "sub\tr0, #1" > + : "=r" (ret) : :); > + return ret; > +} > > > fix_pr77933_gcc6.patch > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index > 83cb13d1195beb19d6301f5c83a7eb544a91d877..1dba035c62c97a5f723d02208636c92108427379 > 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -24710,6 +24710,7 @@ thumb1_expand_prologue (void) > unsigned long live_regs_mask; > unsigned long l_mask; > unsigned high_regs_pushed = 0; > + bool lr_needs_saving; > > func_type = arm_current_func_type (); > > @@ -24732,6 +24733,7 @@ thumb1_expand_prologue (void) > > offsets = arm_get_frame_offsets (); > live_regs_mask = offsets->saved_regs_mask; > + lr_needs_saving = live_regs_mask & (1 << LR_REGNUM); > > /* Extract a mask of the ones we can give to the Thumb's push instruction. 
> */ > l_mask = live_regs_mask & 0x40ff; > @@ -24798,6 +24800,7 @@ thumb1_expand_prologue (void) > { > insn = thumb1_emit_multi_reg_push (l_mask, l_mask); > RTX_FRAME_RELATED_P (insn) = 1; > + lr_needs_saving = false; > > offset = bit_count (l_mask) * UNITS_PER_WORD; > } > @@ -24862,12 +24865,13 @@ thumb1_expand_prologue (void) > be a push of LR and we can combine it with the push of the first high > register. */ > else if ((l_mask & 0xff) != 0 > - || (high_regs_pushed == 0 && l_mask)) > + || (high_regs_pushed == 0 && lr_needs_saving)) > { > unsigned long mask = l_mask; > mask |= (1 << thumb1_extra_regs_pushed (offsets, true)) - 1; > insn = thumb1_emit_multi_reg_push (mask, mask); > RTX_FRAME_RELATED_P (insn) = 1; > + lr_needs_saving = false; > } > > if (high_regs_pushed) > @@ -24885,7 +24889,9 @@ thumb1_expand_prologue (void) > /* Here we need to mask out registers used for passing arguments > even if they can be pushed. This is to avoid using them to stash the > high > registers. Such kind of stash may clobber the use of arguments. 
*/ > - pushable_regs = l_mask & (~arg_regs_mask) & 0xff; > + pushable_regs = l_mask & (~arg_regs_mask); > + if (lr_needs_saving) > + pushable_regs &= ~(1 << LR_REGNUM); > > if (pushable_regs == 0) > pushable_regs = 1 << thumb_find_work_register (live_regs_mask); > @@ -24893,8 +24899,9 @@ thumb1_expand_prologue (void) > while (high_regs_pushed > 0) > { > unsigned long real_regs_mask = 0; > + unsigned long push_mask = 0; > > - for (regno = LAST_LO_REGNUM; regno >= 0; regno --) > + for (regno = LR_REGNUM; regno >= 0; regno --) > { > if (pushable_regs & (1 << regno)) > { > @@ -24903,6 +24910,7 @@ thumb1_expand_prologue (void) > > high_regs_pushed --; > real_regs_mask |= (1 << next_hi_reg); > + push_mask |= (1 << regno); > > if (high_regs_pushed) > { > @@ -24912,23 +24920,20 @@ thumb1_expand_prologue (void) > break; > } > else > - { > - pushable_regs &= ~((1 << regno) - 1); > - break; > - } > + break; > } > } > > /* If we had to find a work register and we have not yet > saved the LR then add it to the list of regs to push. 
*/ > - if (l_mask == (1 << LR_REGNUM)) > + if (lr_needs_saving) > { > - pushable_regs |= l_mask; > - real_regs_mask |= l_mask; > - l_mask = 0; > + push_mask |= 1 << LR_REGNUM; > + real_regs_mask |= 1 << LR_REGNUM; > + lr_needs_saving = false; > } > > - insn = thumb1_emit_multi_reg_push (pushable_regs, real_regs_mask); > + insn = thumb1_emit_multi_reg_push (push_mask, real_regs_mask); > RTX_FRAME_RELATED_P (insn) = 1; > } > } > diff --git a/gcc/testsuite/gcc.target/arm/pr77933-1.c > b/gcc/testsuite/gcc.target/arm/pr77933-1.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..95cf68ea7531bcc453371f493a05bd40caa5541b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/pr77933-1.c > @@ -0,0 +1,46 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2" } */ > + > +__attribute__ ((noinline, noclone)) void > +clobber_lr_and_highregs (void) > +{ > + __asm__ volatile ("" : : : "r8", "r9", "lr"); > +} > + > +int > +main (void) > +{ > + int ret; > + > + __asm volatile ("mov\tr4, #0xf4\n\t" > + "mov\tr5, #0xf5\n\t" > + "mov\tr6, #0xf6\n\t" > + "mov\tr7, #0xf7\n\t" > + "mov\tr0, #0xf8\n\t" > + "mov\tr8, r0\n\t" > + "mov\tr0, #0xfa\n\t" > + "mov\tr10, r0" > + : : : "r0", "r4", "r5", "r6", "r7", "r8", "r10"); > + > + clobber_lr_and_highregs (); > + > + __asm volatile ("cmp\tr4, #0xf4\n\t" > + "bne\tfail\n\t" > + "cmp\tr5, #0xf5\n\t" > + "bne\tfail\n\t" > + "cmp\tr6, #0xf6\n\t" > + "bne\tfail\n\t" > + "cmp\tr7, #0xf7\n\t" > + "bne\tfail\n\t" > + "mov\tr0, r8\n\t" > + "cmp\tr0, #0xf8\n\t" > + "bne\tfail\n\t" > + "mov\tr0, r10\n\t" > + "cmp\tr0, #0xfa\n\t" > + "bne\tfail\n\t" > + "mov\t%0, #1\n" > + "fail:\n\t" > + "sub\tr0, #1" > + : "=r" (ret) : :); > + return ret; > +} > diff --git a/gcc/testsuite/gcc.target/arm/pr77933-2.c > b/gcc/testsuite/gcc.target/arm/pr77933-2.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..9028c4fcab4229591fa057f15c641d2b5597cd1d > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/pr77933-2.c > @@ -0,0 
+1,47 @@ > +/* { dg-do run } */ > +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */ > +/* { dg-options "-mthumb -O2 -mtpcs-leaf-frame" } */ > + > +__attribute__ ((noinline, noclone)) void > +clobber_lr_and_highregs (void) > +{ > + __asm__ volatile ("" : : : "r8", "r9", "lr"); > +} > + > +int > +main (void) > +{ > + int ret; > + > + __asm volatile ("mov\tr4, #0xf4\n\t" > + "mov\tr5, #0xf5\n\t" > + "mov\tr6, #0xf6\n\t" > + "mov\tr7, #0xf7\n\t" > + "mov\tr0, #0xf8\n\t" > + "mov\tr8, r0\n\t" > + "mov\tr0, #0xfa\n\t" > + "mov\tr10, r0" > + : : : "r0", "r4", "r5", "r6", "r7", "r8", "r10"); > + > + clobber_lr_and_highregs (); > + > + __asm volatile ("cmp\tr4, #0xf4\n\t" > + "bne\tfail\n\t" > + "cmp\tr5, #0xf5\n\t" > + "bne\tfail\n\t" > + "cmp\tr6, #0xf6\n\t" > + "bne\tfail\n\t" > + "cmp\tr7, #0xf7\n\t" > + "bne\tfail\n\t" > + "mov\tr0, r8\n\t" > + "cmp\tr0, #0xf8\n\t" > + "bne\tfail\n\t" > + "mov\tr0, r10\n\t" > + "cmp\tr0, #0xfa\n\t" > + "bne\tfail\n\t" > + "mov\t%0, #1\n" > + "fail:\n\t" > + "sub\tr0, #1" > + : "=r" (ret) : :); > + return ret; > +} >