On 30/11/16 09:50, Thomas Preudhomme wrote:
> Hi,
> 
> Is this ok to backport to gcc-5-branch and gcc-6-branch? Patch applies
> cleanly (patches attached for reference).
> 
> 
> 2016-11-17  Thomas Preud'homme  <thomas.preudho...@arm.com>
> 
>     Backport from mainline
>     2016-11-17  Thomas Preud'homme  <thomas.preudho...@arm.com>
> 
>     gcc/
>     PR target/77933
>     * config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
>     being live in the function and lr needing to be saved.  Distinguish
>     between already saved pushable registers and registers to push.
>     Check for LR being an available pushable register.
> 
>     gcc/testsuite/
>     PR target/77933
>     * gcc.target/arm/pr77933-1.c: New test.
>     * gcc.target/arm/pr77933-2.c: Likewise.
> 

Your attached patch doesn't appear to match your ChangeLog.  (rmprofile
patch?).

R.

> 
> Best regards,
> 
> Thomas
> 
> 
> On 17/11/16 20:15, Thomas Preudhomme wrote:
>> Hi Kyrill,
>>
>> I've committed the following updated patch, in which the test is restricted
>> to Thumb execution mode and skipped when that is not possible, since
>> -mtpcs-leaf-frame is only available in Thumb mode. I considered the change
>> obvious.
>>
>> *** gcc/ChangeLog ***
>>
>> 2016-11-08  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>
>>         PR target/77933
>>         * config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
>>         being live in the function and lr needing to be saved.  Distinguish
>>         between already saved pushable registers and registers to push.
>>         Check for LR being an available pushable register.
>>
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2016-11-08  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>
>>         PR target/77933
>>         * gcc.target/arm/pr77933-1.c: New test.
>>         * gcc.target/arm/pr77933-2.c: Likewise.
>>
>> Best regards,
>>
>> Thomas
>>
>> On 17/11/16 10:04, Kyrill Tkachov wrote:
>>>
>>> On 09/11/16 16:41, Thomas Preudhomme wrote:
>>>> I've reworked the patch following comments from Wilco [1] (sorry, I could
>>>> not find the message in my MUA for some reason).
>>>>
>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00317.html
>>>>
>>>>
>>>> == Context ==
>>>>
>>>> When saving registers, thumb1_expand_prologue () tries to minimize the number
>>>> of push instructions. One of the optimizations it performs is to push LR
>>>> alongside high register(s) (after having moved them to low register(s)) when
>>>> there is no low register to save. This is implemented by adding LR to the
>>>> pushable_regs mask, if LR is live, just before pushing the registers in that
>>>> mask. The mask of live pushable registers, which is used to detect whether LR
>>>> needs to be saved, is then cleared to ensure LR is only saved once.
>>>>
>>>>
>>>> == Problem ==
>>>>
>>>> However, beyond deciding which registers to push, pushable_regs is also used
>>>> to track which pushable registers a high register can be moved to before
>>>> being pushed, hence the name. That mask is cleared once all high registers
>>>> have been assigned a low register, but the clearing assumes the high
>>>> registers were assigned to the highest-numbered registers in that mask. This
>>>> is not the case, because LR is not considered when looking for a register in
>>>> that mask. Furthermore, LR might already have been saved in the
>>>> TARGET_BACKTRACE path above, yet the mask of live pushable registers is not
>>>> cleared in that case.
>>>>
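The failure mode described above can be modelled outside GCC. The following is a minimal sketch, not the code from arm.c: old_lr_push_count is a hypothetical helper, the masks are plain bit sets, and the moves and the search for the next high register are left out. It only counts how often LR ends up in a push mask under the pre-fix logic.

```c
#include <assert.h>

#define LR_REGNUM 14
#define LAST_LO_REGNUM 7

/* Hypothetical model of the pre-fix loop.  Returns the number of emitted
   pushes whose register mask contains LR.  pushable_regs must contain at
   least one low register, otherwise the loop would never terminate.  */
static unsigned
old_lr_push_count (unsigned long pushable_regs, unsigned high_regs_pushed,
                   unsigned long l_mask)
{
  unsigned lr_pushes = 0;

  assert ((pushable_regs & 0xff) != 0);

  while (high_regs_pushed > 0)
    {
      /* Only low registers are considered as scratch registers.  */
      for (int regno = LAST_LO_REGNUM; regno >= 0; regno--)
        if (pushable_regs & (1UL << regno))
          {
            high_regs_pushed--;
            if (high_regs_pushed == 0)
              {
                /* Clears only the bits below regno, so an LR bit that was
                   added earlier survives.  */
                pushable_regs &= ~((1UL << regno) - 1);
                break;
              }
          }

      /* LR is the only live pushable register: add it to the push.  */
      if (l_mask == (1UL << LR_REGNUM))
        {
          pushable_regs |= l_mask;
          l_mask = 0;
        }

      /* Stands in for thumb1_emit_multi_reg_push (pushable_regs, ...).  */
      if (pushable_regs & (1UL << LR_REGNUM))
        lr_pushes++;
    }

  return lr_pushes;
}
```

With a single scratch register, two high registers to save and LR as the only live pushable register, old_lr_push_count (1UL << 3, 2, 1UL << LR_REGNUM) reports LR in two push masks, which is the extra push that the epilogue does not pop.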
>>>>
>>>> == Solution ==
>>>>
>>>> This patch changes the loop to iterate over registers from LR down to r0 so
>>>> as to both fix the stack corruption reported in PR77933 and reuse LR to push
>>>> a high register when possible. This patch also introduces a new variable,
>>>> lr_needs_saving, to record whether LR (still) needs to be saved at a given
>>>> point in the code, and sets the variable accordingly throughout the code,
>>>> thus fixing the second issue. Finally, this patch creates a new push_mask
>>>> variable to distinguish between the mask of registers to push and the mask
>>>> of live pushable registers.
>>>>
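A minimal sketch of the reworked loop, under the same simplifications as the patch description: plan_high_reg_pushes is a hypothetical helper that does not exist in GCC, masks are plain bit sets, and the register moves and real_regs_mask bookkeeping are omitted. It records the register mask of each push the loop would emit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LR_REGNUM 14

/* Hypothetical model of the post-fix loop.  Stores the mask of each push in
   push_masks[] (at most max_pushes entries) and returns how many pushes
   were planned.  */
static size_t
plan_high_reg_pushes (unsigned long pushable_regs, unsigned high_regs_pushed,
                      bool lr_needs_saving, unsigned long *push_masks,
                      size_t max_pushes)
{
  size_t n = 0;

  /* LR cannot serve as a scratch register while it still holds a value
     that has to be saved.  */
  if (lr_needs_saving)
    pushable_regs &= ~(1UL << LR_REGNUM);

  /* The real code falls back to a work register here.  */
  assert (pushable_regs != 0);

  while (high_regs_pushed > 0 && n < max_pushes)
    {
      /* Rebuilt for every push, so nothing leaks between iterations.  */
      unsigned long push_mask = 0;

      for (int regno = LR_REGNUM; regno >= 0; regno--)
        if (pushable_regs & (1UL << regno))
          {
            push_mask |= 1UL << regno;
            if (--high_regs_pushed == 0)
              break;
          }

      /* Save LR at most once, alongside the first group of registers.  */
      if (lr_needs_saving)
        {
          push_mask |= 1UL << LR_REGNUM;
          lr_needs_saving = false;
        }

      push_masks[n++] = push_mask;
    }

  return n;
}
```

For one scratch register (bit 3), two high registers and LR still to be saved, the model plans two pushes with masks 0x4008 and 0x8, so LR appears only once. When LR has already been saved and is in the pushable mask, it is picked first as a scratch register.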
>>>>
>>>> == Note ==
>>>>
>>>> Other bits could have been improved but have been left out to allow the
>>>> patch to be backported to the stable branches:
>>>>
>>>> (1) using argument registers that are not holding an argument
>>>> (2) using push_mask consistently instead of l_mask (in TARGET_BACKTRACE),
>>>>     mask (low register push) and push_mask
>>>> (3) improving the !l_mask case in TARGET_BACKTRACE since offset == 0
>>>> (4) renaming l_mask to a more appropriate name (live_pushable_regs_mask?)
>>>>
>>>> ChangeLog entries are as follows:
>>>>
>>>> *** gcc/ChangeLog ***
>>>>
>>>> 2016-11-08  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>>>
>>>>         PR target/77933
>>>>         * config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
>>>>         being live in the function and lr needing to be saved.  Distinguish
>>>>         between already saved pushable registers and registers to push.
>>>>         Check for LR being an available pushable register.
>>>>
>>>>
>>>> *** gcc/testsuite/ChangeLog ***
>>>>
>>>> 2016-11-08  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>>>
>>>>         PR target/77933
>>>>         * gcc.target/arm/pr77933-1.c: New test.
>>>>         * gcc.target/arm/pr77933-2.c: Likewise.
>>>>
>>>>
>>>> Testing: no regression on arm-none-eabi GCC cross-compiler targeting
>>>> Cortex-M0
>>>>
>>>> Is this ok for trunk?
>>>>
>>>
>>> Ok.
>>> Thanks,
>>> Kyrill
>>>
>>>> Best regards,
>>>>
>>>> Thomas
>>>>
>>>> On 02/11/16 17:08, Thomas Preudhomme wrote:
>>>>> Hi,
>>>>>
>>>>> When saving registers, thumb1_expand_prologue () tries to minimize the
>>>>> number of push instructions. One of the optimizations it performs is to push
>>>>> lr alongside high register(s) (after having moved them to low register(s))
>>>>> when there is no low register to save. This is implemented by adding lr to
>>>>> the list of registers that can be pushed just before the push happens. This
>>>>> pushes lr and would allow it to be used for further pushes if there were not
>>>>> enough registers to push all the high registers.
>>>>>
>>>>> However, the logic that decides which register to move high registers to
>>>>> before they are pushed only looks at low registers (see the for loop
>>>>> initialization). This means not only that lr is not used for pushing high
>>>>> registers but also that lr is not removed from the list of registers to be
>>>>> pushed when it is not used. This extra lr push is not popped in the epilogue,
>>>>> leading to stack corruption.
>>>>>
>>>>> This patch changes the loop to iterate over registers r0 to lr so as to both
>>>>> fix the stack corruption and reuse lr to push some high register when
>>>>> possible.
>>>>>
>>>>> ChangeLog entries are as follows:
>>>>>
>>>>> *** gcc/ChangeLog ***
>>>>>
>>>>> 2016-11-01  Thomas Preud'homme <thomas.preudho...@arm.com>
>>>>>
>>>>>         PR target/77933
>>>>>         * config/arm/arm.c (thumb1_expand_prologue): Also check for lr being a
>>>>>         pushable register.
>>>>>
>>>>>
>>>>> *** gcc/testsuite/ChangeLog ***
>>>>>
>>>>> 2016-11-01  Thomas Preud'homme <thomas.preudho...@arm.com>
>>>>>
>>>>>         PR target/77933
>>>>>         * gcc.target/arm/pr77933.c: New test.
>>>>>
>>>>>
>>>>> Testing: no regression on arm-none-eabi GCC cross-compiler
>>>>> targeting Cortex-M0
>>>>>
>>>>> Is this ok for trunk?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Thomas
>>>
> 
> 1_rmprofile_multilib.patch
> 
> 
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index d956da22ad60abfe9c6b4be0882f9e7dd64ac39f..15b662ad5449f8b91eb760b7fbe45f33d8cecb4b 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -3739,6 +3739,16 @@ case "${target}" in
>                               # pragmatic.
>                               tmake_profile_file="arm/t-aprofile"
>                               ;;
> +                     rmprofile)
> +                             # Note that arm/t-rmprofile is a
> +                             # stand-alone make file fragment to be
> +                             # used only with itself.  We do not
> +                             # specifically use the
> +                             # TM_MULTILIB_OPTION framework because
> +                             # this shorthand is more
> +                             # pragmatic.
> +                             tmake_profile_file="arm/t-rmprofile"
> +                             ;;
>                       default)
>                               ;;
>                       *)
> @@ -3748,9 +3758,10 @@ case "${target}" in
>                       esac
>  
>                       if test "x${tmake_profile_file}" != x ; then
> -                             # arm/t-aprofile is only designed to work
> -                             # without any with-cpu, with-arch, with-mode,
> -                             # with-fpu or with-float options.
> +                             # arm/t-aprofile and arm/t-rmprofile are only
> +                             # designed to work without any with-cpu,
> +                             # with-arch, with-mode, with-fpu or with-float
> +                             # options.
>                               if test "x$with_arch" != x \
>                                   || test "x$with_cpu" != x \
>                                   || test "x$with_float" != x \
> diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
> new file mode 100644
> index 0000000000000000000000000000000000000000..c8b5c9cbd03694eea69855e20372afa3e97d6b4c
> --- /dev/null
> +++ b/gcc/config/arm/t-rmprofile
> @@ -0,0 +1,174 @@
> +# Copyright (C) 2016 Free Software Foundation, Inc.
> +#
> +# This file is part of GCC.
> +#
> +# GCC is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3, or (at your option)
> +# any later version.
> +#
> +# GCC is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# This is a target makefile fragment that attempts to get
> +# multilibs built for the range of CPU's, FPU's and ABI's that
> +# are relevant for the ARM architecture.  It should not be used in
> +# conjunction with another make file fragment and assumes --with-arch,
> +# --with-cpu, --with-fpu, --with-float, --with-mode have their default
> +# values during the configure step.  We enforce this during the
> +# top-level configury.
> +
> +MULTILIB_OPTIONS     =
> +MULTILIB_DIRNAMES    =
> +MULTILIB_EXCEPTIONS  =
> +MULTILIB_MATCHES     =
> +MULTILIB_REUSE       =
> +
> +# We have the following hierachy:
> +#   ISA: A32 (.) or T16/T32 (thumb).
> +#   Architecture: ARMv6S-M (v6-m), ARMv7-M (v7-m), ARMv7E-M (v7e-m),
> +#                 ARMv8-M Baseline (v8-m.base) or ARMv8-M Mainline (v8-m.main).
> +#   FPU: VFPv3-D16 (fpv3), FPV4-SP-D16 (fpv4-sp), FPV5-SP-D16 (fpv5-sp),
> +#        VFPv5-D16 (fpv5), or None (.).
> +#   Float-abi: Soft (.), softfp (softfp), or hard (hardfp).
> +
> +# Options to build libraries with
> +
> +MULTILIB_OPTIONS       += mthumb
> +MULTILIB_DIRNAMES      += thumb
> +
> +MULTILIB_OPTIONS       += march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7/march=armv8-m.base/march=armv8-m.main
> +MULTILIB_DIRNAMES      += v6-m v7-m v7e-m v7-ar v8-m.base v8-m.main
> +
> +MULTILIB_OPTIONS       += mfpu=vfpv3-d16/mfpu=fpv4-sp-d16/mfpu=fpv5-sp-d16/mfpu=fpv5-d16
> +MULTILIB_DIRNAMES      += fpv3 fpv4-sp fpv5-sp fpv5
> +
> +MULTILIB_OPTIONS       += mfloat-abi=softfp/mfloat-abi=hard
> +MULTILIB_DIRNAMES      += softfp hard
> +
> +
> +# Option combinations to build library with
> +
> +# Default CPU/Arch
> +MULTILIB_REQUIRED      += mthumb
> +MULTILIB_REQUIRED      += mfloat-abi=hard
> +
> +# ARMv6-M
> +MULTILIB_REQUIRED      += mthumb/march=armv6s-m
> +
> +# ARMv8-M Baseline
> +MULTILIB_REQUIRED      += mthumb/march=armv8-m.base
> +
> +# ARMv7-M
> +MULTILIB_REQUIRED      += mthumb/march=armv7-m
> +
> +# ARMv7E-M
> +MULTILIB_REQUIRED      += mthumb/march=armv7e-m
> +MULTILIB_REQUIRED      += mthumb/march=armv7e-m/mfpu=fpv4-sp-d16/mfloat-abi=softfp
> +MULTILIB_REQUIRED      += mthumb/march=armv7e-m/mfpu=fpv4-sp-d16/mfloat-abi=hard
> +MULTILIB_REQUIRED      += mthumb/march=armv7e-m/mfpu=fpv5-d16/mfloat-abi=softfp
> +MULTILIB_REQUIRED      += mthumb/march=armv7e-m/mfpu=fpv5-d16/mfloat-abi=hard
> +MULTILIB_REQUIRED      += mthumb/march=armv7e-m/mfpu=fpv5-sp-d16/mfloat-abi=softfp
> +MULTILIB_REQUIRED      += mthumb/march=armv7e-m/mfpu=fpv5-sp-d16/mfloat-abi=hard
> +
> +# ARMv8-M Mainline
> +MULTILIB_REQUIRED      += mthumb/march=armv8-m.main
> +MULTILIB_REQUIRED      += mthumb/march=armv8-m.main/mfpu=fpv5-d16/mfloat-abi=softfp
> +MULTILIB_REQUIRED      += mthumb/march=armv8-m.main/mfpu=fpv5-d16/mfloat-abi=hard
> +MULTILIB_REQUIRED      += mthumb/march=armv8-m.main/mfpu=fpv5-sp-d16/mfloat-abi=softfp
> +MULTILIB_REQUIRED      += mthumb/march=armv8-m.main/mfpu=fpv5-sp-d16/mfloat-abi=hard
> +
> +# ARMv7-R as well as ARMv7-A and ARMv8-A if aprofile was not specified
> +MULTILIB_REQUIRED      += mthumb/march=armv7
> +MULTILIB_REQUIRED      += mthumb/march=armv7/mfpu=vfpv3-d16/mfloat-abi=softfp
> +MULTILIB_REQUIRED      += mthumb/march=armv7/mfpu=vfpv3-d16/mfloat-abi=hard
> +
> +
> +# Matches
> +
> +# CPU Matches
> +MULTILIB_MATCHES       += march?armv6s-m=mcpu?cortex-m0
> +MULTILIB_MATCHES       += march?armv6s-m=mcpu?cortex-m0.small-multiply
> +MULTILIB_MATCHES       += march?armv6s-m=mcpu?cortex-m0plus
> +MULTILIB_MATCHES       += march?armv6s-m=mcpu?cortex-m0plus.small-multiply
> +MULTILIB_MATCHES       += march?armv6s-m=mcpu?cortex-m1
> +MULTILIB_MATCHES       += march?armv6s-m=mcpu?cortex-m1.small-multiply
> +MULTILIB_MATCHES       += march?armv7-m=mcpu?cortex-m3
> +MULTILIB_MATCHES       += march?armv7e-m=mcpu?cortex-m4
> +MULTILIB_MATCHES       += march?armv7e-m=mcpu?cortex-m7
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-r4
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-r4f
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-r5
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-r7
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-r8
> +MULTILIB_MATCHES       += march?armv7=mcpu?marvell-pj4
> +MULTILIB_MATCHES       += march?armv7=mcpu?generic-armv7-a
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a8
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a9
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a5
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a7
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a15
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a12
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a17
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a15.cortex-a7
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a17.cortex-a7
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a32
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a35
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a53
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a57
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a57.cortex-a53
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a72
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a72.cortex-a53
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a73
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a73.cortex-a35
> +MULTILIB_MATCHES       += march?armv7=mcpu?cortex-a73.cortex-a53
> +MULTILIB_MATCHES       += march?armv7=mcpu?exynos-m1
> +MULTILIB_MATCHES       += march?armv7=mcpu?qdf24xx
> +MULTILIB_MATCHES       += march?armv7=mcpu?xgene1
> +
> +# Arch Matches
> +MULTILIB_MATCHES       += march?armv6s-m=march?armv6-m
> +MULTILIB_MATCHES       += march?armv8-m.main=march?armv8-m.main+dsp
> +MULTILIB_MATCHES       += march?armv7=march?armv7-r
> +ifeq (,$(HAS_APROFILE))
> +MULTILIB_MATCHES       += march?armv7=march?armv7-a
> +MULTILIB_MATCHES       += march?armv7=march?armv7ve
> +MULTILIB_MATCHES       += march?armv7=march?armv8-a
> +MULTILIB_MATCHES       += march?armv7=march?armv8-a+crc
> +MULTILIB_MATCHES       += march?armv7=march?armv8.1-a
> +MULTILIB_MATCHES       += march?armv7=march?armv8.1-a+crc
> +MULTILIB_MATCHES       += march?armv7=march?armv8.2-a
> +MULTILIB_MATCHES       += march?armv7=march?armv8.2-a+fp16
> +endif
> +
> +# FPU matches
> +ifeq (,$(HAS_APROFILE))
> +MULTILIB_MATCHES       += mfpu?vfpv3-d16=mfpu?vfpv3
> +MULTILIB_MATCHES       += mfpu?vfpv3-d16=mfpu?vfpv3-fp16
> +MULTILIB_MATCHES       += mfpu?vfpv3-d16=mfpu?vfpv3-d16-fp16
> +MULTILIB_MATCHES       += mfpu?vfpv3-d16=mfpu?neon
> +MULTILIB_MATCHES       += mfpu?vfpv3-d16=mfpu?neon-fp16
> +MULTILIB_MATCHES       += mfpu?vfpv3-d16=mfpu?vfpv4
> +MULTILIB_MATCHES       += mfpu?vfpv3-d16=mfpu?vfpv4-d16
> +MULTILIB_MATCHES       += mfpu?vfpv3-d16=mfpu?neon-vfpv4
> +MULTILIB_MATCHES       += mfpu?fpv5-d16=mfpu?fp-armv8
> +MULTILIB_MATCHES       += mfpu?fpv5-d16=mfpu?neon-fp-armv8
> +MULTILIB_MATCHES       += mfpu?fpv5-d16=mfpu?crypto-neon-fp-armv8
> +endif
> +
> +
> +# We map all requests for ARMv7-R or ARMv7-A in ARM mode to Thumb mode and
> +# any FPU to VFPv3-d16 if possible.
> +MULTILIB_REUSE         += mthumb/march.armv7=march.armv7
> +MULTILIB_REUSE         += mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.softfp=march.armv7/mfpu.vfpv3-d16/mfloat-abi.softfp
> +MULTILIB_REUSE         += mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.hard=march.armv7/mfpu.vfpv3-d16/mfloat-abi.hard
> +MULTILIB_REUSE         += mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.softfp=march.armv7/mfpu.fpv5-d16/mfloat-abi.softfp
> +MULTILIB_REUSE         += mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.hard=march.armv7/mfpu.fpv5-d16/mfloat-abi.hard
> +MULTILIB_REUSE         += mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.softfp=mthumb/march.armv7/mfpu.fpv5-d16/mfloat-abi.softfp
> +MULTILIB_REUSE         += mthumb/march.armv7/mfpu.vfpv3-d16/mfloat-abi.hard=mthumb/march.armv7/mfpu.fpv5-d16/mfloat-abi.hard
> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> index e4c686e60c7f479ca3ea71e94c4bb6ad52373085..0b94bc1931a226e58d06a7ed5a726454142c006a 100644
> --- a/gcc/doc/install.texi
> +++ b/gcc/doc/install.texi
> @@ -1107,19 +1107,59 @@ sysv, aix.
>  
>  @item --with-multilib-list=@var{list}
>  @itemx --without-multilib-list
> -Specify what multilibs to build.
> -Currently only implemented for arm*-*-*, sh*-*-* and x86-64-*-linux*.
> +Specify what multilibs to build.  @var{list} is a comma separated list of
> +values, possibly consisting of a single value.  Currently only implemented
> +for arm*-*-*, sh*-*-* and x86-64-*-linux*.  The accepted values and meaning
> +for each target is given below.
>  
>  @table @code
>  @item arm*-*-*
> -@var{list} is either @code{default} or @code{aprofile}.  Specifying
> -@code{default} is equivalent to omitting this option while specifying
> -@code{aprofile} builds multilibs for each combination of ISA (@code{-marm} or
> -@code{-mthumb}), architecture (@code{-march=armv7-a}, @code{-march=armv7ve},
> -or @code{-march=armv8-a}), FPU available (none, @code{-mfpu=vfpv3-d16},
> -@code{-mfpu=neon}, @code{-mfpu=vfpv4-d16}, @code{-mfpu=neon-vfpv4} or
> -@code{-mfpu=neon-fp-armv8} depending on architecture) and floating-point ABI
> -(@code{-mfloat-abi=softfp} or @code{-mfloat-abi=hard}).
> +@var{list} is one of@code{default}, @code{aprofile} or @code{rmprofile}.
> +Specifying @code{default} is equivalent to omitting this option, ie. only the
> +default runtime library will be enabled.  Specifying @code{aprofile} or
> +@code{rmprofile} builds multilibs for a combination of ISA, architecture,
> +FPU available and floating-point ABI.
> +
> +The table below gives the combination of ISAs, architectures, FPUs and
> +floating-point ABIs for which multilibs are built for each accepted value.
> +
> +@multitable @columnfractions .15 .28 .30
> +@item Option @tab aprofile @tab rmprofile
> +@item ISAs
> +@tab @code{-marm} and @code{-mthumb}
> +@tab @code{-mthumb}
> +@item Architectures@*@*@*@*@*@*
> +@tab default architecture@*
> +@code{-march=armv7-a}@*
> +@code{-march=armv7ve}@*
> +@code{-march=armv8-a}@*@*@*
> +@tab default architecture@*
> +@code{-march=armv6s-m}@*
> +@code{-march=armv7-m}@*
> +@code{-march=armv7e-m}@*
> +@code{-march=armv8-m.base}@*
> +@code{-march=armv8-m.main}@*
> +@code{-march=armv7}
> +@item FPUs@*@*@*@*@*
> +@tab none@*
> +@code{-mfpu=vfpv3-d16}@*
> +@code{-mfpu=neon}@*
> +@code{-mfpu=vfpv4-d16}@*
> +@code{-mfpu=neon-vfpv4}@*
> +@code{-mfpu=neon-fp-armv8}
> +@tab none@*
> +@code{-mfpu=vfpv3-d16}@*
> +@code{-mfpu=fpv4-sp-d16}@*
> +@code{-mfpu=fpv5-sp-d16}@*
> +@code{-mfpu=fpv5-d16}@*
> +@item floating-point@/ ABIs@*@*
> +@tab @code{-mfloat-abi=soft}@*
> +@code{-mfloat-abi=softfp}@*
> +@code{-mfloat-abi=hard}
> +@tab @code{-mfloat-abi=soft}@*
> +@code{-mfloat-abi=softfp}@*
> +@code{-mfloat-abi=hard}
> +@end multitable
>  
>  @item sh*-*-*
>  @var{list} is a comma separated list of CPU names.  These must be of the
> 
> 
> fix_pr77933_gcc5.patch
> 
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index c01a3c878968f6e6f07358b0686e4a59e34f56b7..5c975625bfa25d2c71c27db348cd3e70fe44a951 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -24457,6 +24457,7 @@ thumb1_expand_prologue (void)
>    unsigned long live_regs_mask;
>    unsigned long l_mask;
>    unsigned high_regs_pushed = 0;
> +  bool lr_needs_saving;
>  
>    func_type = arm_current_func_type ();
>  
> @@ -24479,6 +24480,7 @@ thumb1_expand_prologue (void)
>  
>    offsets = arm_get_frame_offsets ();
>    live_regs_mask = offsets->saved_regs_mask;
> +  lr_needs_saving = live_regs_mask & (1 << LR_REGNUM);
>  
>    /* Extract a mask of the ones we can give to the Thumb's push instruction.  */
>    l_mask = live_regs_mask & 0x40ff;
> @@ -24545,6 +24547,7 @@ thumb1_expand_prologue (void)
>       {
>         insn = thumb1_emit_multi_reg_push (l_mask, l_mask);
>         RTX_FRAME_RELATED_P (insn) = 1;
> +       lr_needs_saving = false;
>  
>         offset = bit_count (l_mask) * UNITS_PER_WORD;
>       }
> @@ -24609,12 +24612,13 @@ thumb1_expand_prologue (void)
>       be a push of LR and we can combine it with the push of the first high
>       register.  */
>    else if ((l_mask & 0xff) != 0
> -        || (high_regs_pushed == 0 && l_mask))
> +        || (high_regs_pushed == 0 && lr_needs_saving))
>      {
>        unsigned long mask = l_mask;
>        mask |= (1 << thumb1_extra_regs_pushed (offsets, true)) - 1;
>        insn = thumb1_emit_multi_reg_push (mask, mask);
>        RTX_FRAME_RELATED_P (insn) = 1;
> +      lr_needs_saving = false;
>      }
>  
>    if (high_regs_pushed)
> @@ -24632,7 +24636,9 @@ thumb1_expand_prologue (void)
>        /* Here we need to mask out registers used for passing arguments
>        even if they can be pushed.  This is to avoid using them to stash the high
>        registers.  Such kind of stash may clobber the use of arguments.  */
> -      pushable_regs = l_mask & (~arg_regs_mask) & 0xff;
> +      pushable_regs = l_mask & (~arg_regs_mask);
> +      if (lr_needs_saving)
> +     pushable_regs &= ~(1 << LR_REGNUM);
>  
>        if (pushable_regs == 0)
>       pushable_regs = 1 << thumb_find_work_register (live_regs_mask);
> @@ -24640,8 +24646,9 @@ thumb1_expand_prologue (void)
>        while (high_regs_pushed > 0)
>       {
>         unsigned long real_regs_mask = 0;
> +       unsigned long push_mask = 0;
>  
> -       for (regno = LAST_LO_REGNUM; regno >= 0; regno --)
> +       for (regno = LR_REGNUM; regno >= 0; regno --)
>           {
>             if (pushable_regs & (1 << regno))
>               {
> @@ -24650,6 +24657,7 @@ thumb1_expand_prologue (void)
>  
>                 high_regs_pushed --;
>                 real_regs_mask |= (1 << next_hi_reg);
> +               push_mask |= (1 << regno);
>  
>                 if (high_regs_pushed)
>                   {
> @@ -24659,23 +24667,20 @@ thumb1_expand_prologue (void)
>                         break;
>                   }
>                 else
> -                 {
> -                   pushable_regs &= ~((1 << regno) - 1);
> -                   break;
> -                 }
> +                 break;
>               }
>           }
>  
>         /* If we had to find a work register and we have not yet
>            saved the LR then add it to the list of regs to push.  */
> -       if (l_mask == (1 << LR_REGNUM))
> +       if (lr_needs_saving)
>           {
> -           pushable_regs |= l_mask;
> -           real_regs_mask |= l_mask;
> -           l_mask = 0;
> +           push_mask |= 1 << LR_REGNUM;
> +           real_regs_mask |= 1 << LR_REGNUM;
> +           lr_needs_saving = false;
>           }
>  
> -       insn = thumb1_emit_multi_reg_push (pushable_regs, real_regs_mask);
> +       insn = thumb1_emit_multi_reg_push (push_mask, real_regs_mask);
>         RTX_FRAME_RELATED_P (insn) = 1;
>       }
>      }
> diff --git a/gcc/testsuite/gcc.target/arm/pr77933-1.c b/gcc/testsuite/gcc.target/arm/pr77933-1.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..95cf68ea7531bcc453371f493a05bd40caa5541b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr77933-1.c
> @@ -0,0 +1,46 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +__attribute__ ((noinline, noclone)) void
> +clobber_lr_and_highregs (void)
> +{
> +  __asm__ volatile ("" : : : "r8", "r9", "lr");
> +}
> +
> +int
> +main (void)
> +{
> +  int ret;
> +
> +  __asm volatile ("mov\tr4, #0xf4\n\t"
> +               "mov\tr5, #0xf5\n\t"
> +               "mov\tr6, #0xf6\n\t"
> +               "mov\tr7, #0xf7\n\t"
> +               "mov\tr0, #0xf8\n\t"
> +               "mov\tr8, r0\n\t"
> +               "mov\tr0, #0xfa\n\t"
> +               "mov\tr10, r0"
> +               : : : "r0", "r4", "r5", "r6", "r7", "r8", "r10");
> +
> +  clobber_lr_and_highregs ();
> +
> +  __asm volatile ("cmp\tr4, #0xf4\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr5, #0xf5\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr6, #0xf6\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr7, #0xf7\n\t"
> +               "bne\tfail\n\t"
> +               "mov\tr0, r8\n\t"
> +               "cmp\tr0, #0xf8\n\t"
> +               "bne\tfail\n\t"
> +               "mov\tr0, r10\n\t"
> +               "cmp\tr0, #0xfa\n\t"
> +               "bne\tfail\n\t"
> +               "mov\t%0, #1\n"
> +               "fail:\n\t"
> +               "sub\tr0, #1"
> +               : "=r" (ret) : :);
> +  return ret;
> +}
> diff --git a/gcc/testsuite/gcc.target/arm/pr77933-2.c b/gcc/testsuite/gcc.target/arm/pr77933-2.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9028c4fcab4229591fa057f15c641d2b5597cd1d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr77933-2.c
> @@ -0,0 +1,47 @@
> +/* { dg-do run } */
> +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
> +/* { dg-options "-mthumb -O2 -mtpcs-leaf-frame" } */
> +
> +__attribute__ ((noinline, noclone)) void
> +clobber_lr_and_highregs (void)
> +{
> +  __asm__ volatile ("" : : : "r8", "r9", "lr");
> +}
> +
> +int
> +main (void)
> +{
> +  int ret;
> +
> +  __asm volatile ("mov\tr4, #0xf4\n\t"
> +               "mov\tr5, #0xf5\n\t"
> +               "mov\tr6, #0xf6\n\t"
> +               "mov\tr7, #0xf7\n\t"
> +               "mov\tr0, #0xf8\n\t"
> +               "mov\tr8, r0\n\t"
> +               "mov\tr0, #0xfa\n\t"
> +               "mov\tr10, r0"
> +               : : : "r0", "r4", "r5", "r6", "r7", "r8", "r10");
> +
> +  clobber_lr_and_highregs ();
> +
> +  __asm volatile ("cmp\tr4, #0xf4\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr5, #0xf5\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr6, #0xf6\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr7, #0xf7\n\t"
> +               "bne\tfail\n\t"
> +               "mov\tr0, r8\n\t"
> +               "cmp\tr0, #0xf8\n\t"
> +               "bne\tfail\n\t"
> +               "mov\tr0, r10\n\t"
> +               "cmp\tr0, #0xfa\n\t"
> +               "bne\tfail\n\t"
> +               "mov\t%0, #1\n"
> +               "fail:\n\t"
> +               "sub\tr0, #1"
> +               : "=r" (ret) : :);
> +  return ret;
> +}
> 
> 
> fix_pr77933_gcc6.patch
> 
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 83cb13d1195beb19d6301f5c83a7eb544a91d877..1dba035c62c97a5f723d02208636c92108427379 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -24710,6 +24710,7 @@ thumb1_expand_prologue (void)
>    unsigned long live_regs_mask;
>    unsigned long l_mask;
>    unsigned high_regs_pushed = 0;
> +  bool lr_needs_saving;
>  
>    func_type = arm_current_func_type ();
>  
> @@ -24732,6 +24733,7 @@ thumb1_expand_prologue (void)
>  
>    offsets = arm_get_frame_offsets ();
>    live_regs_mask = offsets->saved_regs_mask;
> +  lr_needs_saving = live_regs_mask & (1 << LR_REGNUM);
>  
>    /* Extract a mask of the ones we can give to the Thumb's push instruction.  */
>    l_mask = live_regs_mask & 0x40ff;
> @@ -24798,6 +24800,7 @@ thumb1_expand_prologue (void)
>       {
>         insn = thumb1_emit_multi_reg_push (l_mask, l_mask);
>         RTX_FRAME_RELATED_P (insn) = 1;
> +       lr_needs_saving = false;
>  
>         offset = bit_count (l_mask) * UNITS_PER_WORD;
>       }
> @@ -24862,12 +24865,13 @@ thumb1_expand_prologue (void)
>       be a push of LR and we can combine it with the push of the first high
>       register.  */
>    else if ((l_mask & 0xff) != 0
> -        || (high_regs_pushed == 0 && l_mask))
> +        || (high_regs_pushed == 0 && lr_needs_saving))
>      {
>        unsigned long mask = l_mask;
>        mask |= (1 << thumb1_extra_regs_pushed (offsets, true)) - 1;
>        insn = thumb1_emit_multi_reg_push (mask, mask);
>        RTX_FRAME_RELATED_P (insn) = 1;
> +      lr_needs_saving = false;
>      }
>  
>    if (high_regs_pushed)
> @@ -24885,7 +24889,9 @@ thumb1_expand_prologue (void)
>        /* Here we need to mask out registers used for passing arguments
>        even if they can be pushed.  This is to avoid using them to stash the high
>        registers.  Such kind of stash may clobber the use of arguments.  */
> -      pushable_regs = l_mask & (~arg_regs_mask) & 0xff;
> +      pushable_regs = l_mask & (~arg_regs_mask);
> +      if (lr_needs_saving)
> +     pushable_regs &= ~(1 << LR_REGNUM);
>  
>        if (pushable_regs == 0)
>       pushable_regs = 1 << thumb_find_work_register (live_regs_mask);
> @@ -24893,8 +24899,9 @@ thumb1_expand_prologue (void)
>        while (high_regs_pushed > 0)
>       {
>         unsigned long real_regs_mask = 0;
> +       unsigned long push_mask = 0;
>  
> -       for (regno = LAST_LO_REGNUM; regno >= 0; regno --)
> +       for (regno = LR_REGNUM; regno >= 0; regno --)
>           {
>             if (pushable_regs & (1 << regno))
>               {
> @@ -24903,6 +24910,7 @@ thumb1_expand_prologue (void)
>  
>                 high_regs_pushed --;
>                 real_regs_mask |= (1 << next_hi_reg);
> +               push_mask |= (1 << regno);
>  
>                 if (high_regs_pushed)
>                   {
> @@ -24912,23 +24920,20 @@ thumb1_expand_prologue (void)
>                         break;
>                   }
>                 else
> -                 {
> -                   pushable_regs &= ~((1 << regno) - 1);
> -                   break;
> -                 }
> +                 break;
>               }
>           }
>  
>         /* If we had to find a work register and we have not yet
>            saved the LR then add it to the list of regs to push.  */
> -       if (l_mask == (1 << LR_REGNUM))
> +       if (lr_needs_saving)
>           {
> -           pushable_regs |= l_mask;
> -           real_regs_mask |= l_mask;
> -           l_mask = 0;
> +           push_mask |= 1 << LR_REGNUM;
> +           real_regs_mask |= 1 << LR_REGNUM;
> +           lr_needs_saving = false;
>           }
>  
> -       insn = thumb1_emit_multi_reg_push (pushable_regs, real_regs_mask);
> +       insn = thumb1_emit_multi_reg_push (push_mask, real_regs_mask);
>         RTX_FRAME_RELATED_P (insn) = 1;
>       }
>      }
> diff --git a/gcc/testsuite/gcc.target/arm/pr77933-1.c b/gcc/testsuite/gcc.target/arm/pr77933-1.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..95cf68ea7531bcc453371f493a05bd40caa5541b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr77933-1.c
> @@ -0,0 +1,46 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +__attribute__ ((noinline, noclone)) void
> +clobber_lr_and_highregs (void)
> +{
> +  __asm__ volatile ("" : : : "r8", "r9", "lr");
> +}
> +
> +int
> +main (void)
> +{
> +  int ret;
> +
> +  __asm volatile ("mov\tr4, #0xf4\n\t"
> +               "mov\tr5, #0xf5\n\t"
> +               "mov\tr6, #0xf6\n\t"
> +               "mov\tr7, #0xf7\n\t"
> +               "mov\tr0, #0xf8\n\t"
> +               "mov\tr8, r0\n\t"
> +               "mov\tr0, #0xfa\n\t"
> +               "mov\tr10, r0"
> +               : : : "r0", "r4", "r5", "r6", "r7", "r8", "r10");
> +
> +  clobber_lr_and_highregs ();
> +
> +  __asm volatile ("cmp\tr4, #0xf4\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr5, #0xf5\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr6, #0xf6\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr7, #0xf7\n\t"
> +               "bne\tfail\n\t"
> +               "mov\tr0, r8\n\t"
> +               "cmp\tr0, #0xf8\n\t"
> +               "bne\tfail\n\t"
> +               "mov\tr0, r10\n\t"
> +               "cmp\tr0, #0xfa\n\t"
> +               "bne\tfail\n\t"
> +               "mov\t%0, #1\n"
> +               "fail:\n\t"
> +               "sub\tr0, #1"
> +               : "=r" (ret) : :);
> +  return ret;
> +}
> diff --git a/gcc/testsuite/gcc.target/arm/pr77933-2.c b/gcc/testsuite/gcc.target/arm/pr77933-2.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9028c4fcab4229591fa057f15c641d2b5597cd1d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr77933-2.c
> @@ -0,0 +1,47 @@
> +/* { dg-do run } */
> +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
> +/* { dg-options "-mthumb -O2 -mtpcs-leaf-frame" } */
> +
> +__attribute__ ((noinline, noclone)) void
> +clobber_lr_and_highregs (void)
> +{
> +  __asm__ volatile ("" : : : "r8", "r9", "lr");
> +}
> +
> +int
> +main (void)
> +{
> +  int ret;
> +
> +  __asm volatile ("mov\tr4, #0xf4\n\t"
> +               "mov\tr5, #0xf5\n\t"
> +               "mov\tr6, #0xf6\n\t"
> +               "mov\tr7, #0xf7\n\t"
> +               "mov\tr0, #0xf8\n\t"
> +               "mov\tr8, r0\n\t"
> +               "mov\tr0, #0xfa\n\t"
> +               "mov\tr10, r0"
> +               : : : "r0", "r4", "r5", "r6", "r7", "r8", "r10");
> +
> +  clobber_lr_and_highregs ();
> +
> +  __asm volatile ("cmp\tr4, #0xf4\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr5, #0xf5\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr6, #0xf6\n\t"
> +               "bne\tfail\n\t"
> +               "cmp\tr7, #0xf7\n\t"
> +               "bne\tfail\n\t"
> +               "mov\tr0, r8\n\t"
> +               "cmp\tr0, #0xf8\n\t"
> +               "bne\tfail\n\t"
> +               "mov\tr0, r10\n\t"
> +               "cmp\tr0, #0xfa\n\t"
> +               "bne\tfail\n\t"
> +               "mov\t%0, #1\n"
> +               "fail:\n\t"
> +               "sub\tr0, #1"
> +               : "=r" (ret) : :);
> +  return ret;
> +}
> 
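
For readers following the quoted hunks, the mask handling around
pushable_regs can be modelled outside the compiler.  The following is only
a rough standalone sketch, not code from arm.c: the helper name
stash_candidates and its parameters are hypothetical, and it assumes
nothing beyond what the hunks show (LR is r14, l_mask holds the low
registers plus LR, and lr_needs_saving stays true until LR has actually
been pushed).

```c
#include <assert.h>

#define LR_REGNUM 14	/* LR is r14 on ARM.  */

/* Hypothetical model of the pushable_regs computation in the quoted hunk.
   Returns the set of registers that may be used to stash high registers
   before pushing them.  */
static unsigned long
stash_candidates (unsigned long l_mask, unsigned long arg_regs_mask,
		  int lr_needs_saving)
{
  /* Only low registers and LR are expected in l_mask.  */
  assert ((l_mask & ~0x40ffUL) == 0);

  /* Argument registers are never used as stash registers.  */
  unsigned long pushable_regs = l_mask & ~arg_regs_mask;

  /* LR can only hold a high register once its own value has been saved.  */
  if (lr_needs_saving)
    pushable_regs &= ~(1UL << LR_REGNUM);

  return pushable_regs;
}
```

With l_mask covering r4 and LR (0x4010) and r0-r3 holding arguments (0xf),
the sketch returns 0x4010 once LR has been saved and 0x10 while LR still
needs saving.  In the second case the patched loop pushes LR together with
the stashed high registers through push_mask and real_regs_mask.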
