Jeff Law <[email protected]> writes:
> So this is another nasty latent bug exposed by ext-dce.
>
> Similar to the prior m68k failure it's another problem with how we
> handle paradoxical subregs on big endian targets.
>
> In this instance when we remove the hard subregs we take something like:
>
> (subreg:DI (reg:SI 0) 0)
>
> And turn it into
>
> (reg:SI -1)
>
> Which is clearly wrong. (reg:SI 0) is correct.
>
>
> The transformation happens in alter_subreg, but I really wanted to fix
> this in subreg_regno since we could have similar problems in some of the
> other callers of subreg_regno.
>
> Unfortunately reload depends on the current behavior of subreg_regno; in
> the cases where the return value is an invalid register, the wrong half
> of a register pair, etc the resulting bogus value is detected by reload
> and triggers reloading of the inner object. So that's the new comment
> in subreg_regno.
>
> The second best place to fix is alter_subreg which is what this patch
> does. If presented with a paradoxical subreg, then the base register
> number should always be REGNO (SUBREG_REG (object)). It's just how
> paradoxicals are designed to work.
>
> I haven't tried to fix the other places that call subreg_regno. After
> being burned by reload, I'm more than a bit worried about unintended
> fallout.
>
> I must admit I'm surprised we haven't stumbled over this before and that
> it didn't fix any failures on the big endian embedded targets.
>
> Boostrapped & regression tested on x86_64, also went through all the
> embedded targets in my tester and bootstrapped on m68k & s390x to get
> some additional big endian testing.
>
> Pushing to the trunk.
>
> jeff
>
>
> commit e9738e77674e23f600315ca1efed7d1c7944d0cc
> Author: Jeff Law <[email protected]>
> Date: Mon Aug 12 07:29:25 2024 -0600
>
> [rtl-optimization/116244] Don't create bogus regs in alter_subreg
>
> So this is another nasty latent bug exposed by ext-dce.
>
> Similar to the prior m68k failure it's another problem with how we handle
> paradoxical subregs on big endian targets.
>
> In this instance when we remove the hard subregs we take something like:
>
> (subreg:DI (reg:SI 0) 0)
>
> And turn it into
>
> (reg:SI -1)
>
> Which is clearly wrong. (reg:SI 0) is correct.
>
> The transformation happens in alter_subreg, but I really wanted to fix
> this in
> subreg_regno since we could have similar problems in some of the other
> callers
> of subreg_regno.
>
> Unfortunately reload depends on the current behavior of subreg_regno; in
> the
> cases where the return value is an invalid register, the wrong half of a
> register pair, etc the resulting bogus value is detected by reload and
> triggers
> reloading of the inner object. So that's the new comment in subreg_regno.
>
> The second best place to fix is alter_subreg which is what this patch
> does. If
> presented with a paradoxical subreg, then the base register number should
> always be REGNO (SUBREG_REG (object)). It's just how paradoxicals are
> designed
> to work.
>
> I haven't tried to fix the other places that call subreg_regno. After
> being
> burned by reload, I'm more than a bit worried about unintended fallout.
>
> I must admit I'm surprised we haven't stumbled over this before and that
> it
> didn't fix any failures on the big endian embedded targets.
>
> Boostrapped & regression tested on x86_64, also went through all the
> embedded
> targets in my tester and bootstrapped on m68k & s390x to get some
> additional
> big endian testing.
>
> Pushing to the trunk.
>
> rtl-optimization/116244
> gcc/
> * rtlanal.cc (subreg_regno): Update comment.
> * final.cc (alter_subrg): Always use REGNO (SUBREG_REG ()) to get
> the base regsiter for paradoxical subregs.
>
> gcc/testsuite/
> * g++.target/m68k/m68k.exp: New test driver.
> * g++.target/m68k/pr116244.C: New test.
>
> diff --git a/gcc/final.cc b/gcc/final.cc
> index eb9e065d9f0..0167b2f8602 100644
> --- a/gcc/final.cc
> +++ b/gcc/final.cc
> @@ -3123,7 +3123,17 @@ alter_subreg (rtx *xp, bool final_p)
> unsigned int regno;
> poly_int64 offset;
>
> - regno = subreg_regno (x);
> + /* A paradoxical should always be REGNO (y) + 0. Using subreg_regno
> + for something like (subreg:DI (reg:SI N) 0) on a WORDS_BIG_ENDIAN
> + target will return N-1 which is catastrophic for N == 0 and just
> + wrong for other cases.
> +
> + Fixing subreg_regno would be a better option, except that reload
> + depends on its current behavior. */
> + if (paradoxical_subreg_p (x))
> + regno = REGNO (y);
> + else
> + regno = subreg_regno (x);
Are you sure that's right? For a 32-bit big-endian target,
(subreg:DI (reg:SI 1) 0) really should simplify to (reg:DI 0) rather
than (reg:DI 1).
Like you say, this leads to an invalid result for (reg:SI 0) in place
of (reg:SI 1). But AIUI, it's reload's/LRA's job to ensure that that
never happens. This is part of the reason why LRA/IRA need to track
the widest referenced mode.
So it sounds like something might have gone wrong earlier/elsewhere.
Thanks,
Richard
> if (subreg_lowpart_p (x))
> offset = byte_lowpart_offset (GET_MODE (x), GET_MODE (y));
> else
> diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
> index 4158a531bdd..6f6e6544755 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -4313,7 +4313,16 @@ lowpart_subreg_regno (unsigned int regno, machine_mode
> xmode,
> return simplify_subreg_regno (regno, xmode, offset, ymode);
> }
>
> -/* Return the final regno that a subreg expression refers to. */
> +/* Return the final regno that a subreg expression refers to.
> +
> + Callers such as reload_inner_reg_of_subreg rely on this returning
> + the simplified hard reg, even if that result is not a valid regno for
> + the given mode. That triggers reloading the inner part of the
> + subreg.
> +
> + That inherently means other uses of this routine probably need
> + to be audited for their behavior when requested subreg can't
> + be expressed as a hard register after apply the offset. */
> unsigned int
> subreg_regno (const_rtx x)
> {
> diff --git a/gcc/testsuite/g++.target/m68k/m68k.exp
> b/gcc/testsuite/g++.target/m68k/m68k.exp
> new file mode 100644
> index 00000000000..8f6416e9fdf
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/m68k/m68k.exp
> @@ -0,0 +1,34 @@
> +# Copyright (C) 2019-2024 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3. If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# GCC testsuite that uses the `dg.exp' driver.
> +
> +# Exit immediately if this isn't a m68k target.
> +if ![istarget m68k*-*-*] then {
> + return
> +}
> +
> +# Load support procs.
> +load_lib g++-dg.exp
> +
> +# Initialize `dg'.
> +dg-init
> +
> +# Main loop.
> +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C]] "" ""
> +
> +# All done.
> +dg-finish
> diff --git a/gcc/testsuite/g++.target/m68k/pr116244.C
> b/gcc/testsuite/g++.target/m68k/pr116244.C
> new file mode 100644
> index 00000000000..2e78832efbb
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/m68k/pr116244.C
> @@ -0,0 +1,226 @@
> +// { dg-do compile }
> +// { dg-additional-options "-std=gnu++20 -O2 -w -march=isaa" }
> +namespace std {
> +struct __conditional {
> + template <typename _Tp, typename> using type = _Tp;
> +};
> +template <bool, typename _If, typename _Else>
> +using __conditional_t = __conditional::type<_If, _Else>;
> +template <typename> constexpr bool is_lvalue_reference_v = true;
> +template <typename> bool is_same_v;
> +template <typename _From, typename _To>
> +constexpr bool is_convertible_v = __is_convertible(_From, _To);
> +template <typename> bool is_invocable_v;
> +template <typename...> using common_reference_t = int;
> +template <typename _Iterator> struct iterator_traits : _Iterator {};
> +namespace __detail {
> +template <typename, typename _Up>
> +concept __same_as = is_same_v<_Up>;
> +}
> +template <typename _Tp, typename _Up>
> +concept same_as = __detail::__same_as<_Up, _Tp>;
> +template <typename _Derived, typename _Base>
> +concept derived_from = is_convertible_v<_Derived, _Base>;
> +template <typename, typename>
> +concept common_reference_with =
> + same_as<common_reference_t<>, common_reference_t<>>;
> +namespace __detail {
> +template <typename _Tp>
> +concept __boolean_testable = requires(_Tp __t) { __t; };
> +template <typename _Tp, typename>
> +concept __weakly_eq_cmp_with = requires(_Tp __t) { __t; };
> +} // namespace __detail
> +template <typename... _Args>
> +concept invocable = is_invocable_v<_Args...>;
> +template <typename>
> +concept regular_invocable = invocable<>;
> +template <typename _Fn>
> +concept predicate = __detail::__boolean_testable<_Fn>;
> +template <typename _Rel, typename, typename>
> +concept relation = predicate<_Rel>;
> +template <typename _Rel, typename _Tp, typename _Up>
> +concept strict_weak_order = relation<_Rel, _Tp, _Up>;
> +namespace {
> +struct duration {
> + duration() = default;
> + template <typename _Rep2> duration(_Rep2 __rep) : __r(__rep) {}
> + long long count() { return __r; }
> + long long __r;
> +};
> +long long operator+(duration __lhs, duration __rhs) {
> + long __trans_tmp_1 = __lhs.count();
> + return __trans_tmp_1 + __rhs.count();
> +}
> +struct time_point {
> + time_point();
> + time_point(duration __dur) : __d(__dur) {}
> + duration time_since_epoch() { return __d; }
> + duration __d;
> +};
> +time_point operator+(time_point __lhs, duration __rhs) {
> + duration __trans_tmp_2 = __lhs.time_since_epoch();
> + return time_point(__trans_tmp_2 + __rhs);
> +}
> +template <typename> using sys_time = time_point;
> +} // namespace
> +template <typename> class allocator;
> +namespace {
> +struct less {};
> +} // namespace
> +struct forward_iterator_tag {};
> +namespace __detail {
> +template <typename _Iter> struct __iter_traits_impl {
> + using type = iterator_traits<_Iter>;
> +};
> +template <typename _Iter> using __iter_traits =
> __iter_traits_impl<_Iter>::type;
> +template <typename> struct __iter_concept_impl;
> +template <typename _Iter>
> + requires requires { typename _Iter; }
> +struct __iter_concept_impl<_Iter> {
> + using type = __iter_traits<_Iter>::iterator_concept;
> +};
> +template <typename _Iter>
> +using __iter_concept = __iter_concept_impl<_Iter>::type;
> +template <typename _In>
> +concept __indirectly_readable_impl = common_reference_with<_In, _In>;
> +} // namespace __detail
> +template <typename _In>
> +concept indirectly_readable = __detail::__indirectly_readable_impl<_In>;
> +template <typename _Iter>
> +concept input_or_output_iterator = requires(_Iter __i) { __i; };
> +template <typename _Sent, typename _Iter>
> +concept sentinel_for = __detail::__weakly_eq_cmp_with<_Sent, _Iter>;
> +template <typename _Iter>
> +concept forward_iterator =
> + derived_from<__detail::__iter_concept<_Iter>, forward_iterator_tag>;
> +template <typename _Fn, typename>
> +concept indirectly_regular_unary_invocable = regular_invocable<_Fn>;
> +template <typename _Fn, typename _I1, typename _I2>
> +concept indirect_strict_weak_order = strict_weak_order<_Fn, _I1, _I2>;
> +namespace __detail {
> +template <typename, typename> struct __projected;
> +}
> +template <indirectly_readable _Iter,
> + indirectly_regular_unary_invocable<_Iter> _Proj>
> +using projected = __detail::__projected<_Iter, _Proj>;
> +namespace ranges::__access {
> +template <typename _Tp> auto __begin(_Tp __t) { return __t.begin(); }
> +} // namespace ranges::__access
> +namespace __detail {
> +template <typename _Tp>
> +using __range_iter_t = decltype(ranges::__access::__begin(_Tp()));
> +}
> +template <typename _Tp> struct iterator_traits<_Tp *> {
> + typedef forward_iterator_tag iterator_concept;
> + using reference = _Tp;
> +};
> +} // namespace std
> +template <typename _Iterator, typename> struct __normal_iterator {
> + using iterator_concept = std::__detail::__iter_concept<_Iterator>;
> + std::iterator_traits<_Iterator>::reference operator*();
> + void operator++();
> +};
> +template <typename _IteratorL, typename _IteratorR, typename _Container>
> +bool operator==(__normal_iterator<_IteratorL, _Container>,
> + __normal_iterator<_IteratorR, _Container>);
> +int _M_get_sys_info_ri;
> +namespace std {
> +template <typename> struct allocator_traits;
> +template <typename _Tp> struct allocator_traits<allocator<_Tp>> {
> + using pointer = _Tp *;
> +};
> +namespace {
> +namespace __detail {
> +template <typename _Tp>
> +concept __maybe_borrowed_range = is_lvalue_reference_v<_Tp>;
> +}
> +int end;
> +template <typename>
> +concept range = requires { end; };
> +template <typename _Tp>
> +concept borrowed_range = __detail::__maybe_borrowed_range<_Tp>;
> +template <typename _Tp> using iterator_t =
> std::__detail::__range_iter_t<_Tp>;
> +template <typename _Tp>
> +concept forward_range = forward_iterator<iterator_t<_Tp>>;
> +struct dangling;
> +template <input_or_output_iterator _It, sentinel_for<_It> _Sent = _It>
> +struct subrange {
> + _It begin();
> + _Sent end();
> +};
> +template <range _Range>
> +using borrowed_subrange_t =
> + __conditional_t<borrowed_range<_Range>, subrange<iterator_t<_Range>>,
> + dangling>;
> +} // namespace
> +template <typename _Alloc> struct _Vector_base {
> + typedef allocator_traits<_Alloc>::pointer pointer;
> +};
> +template <typename _Tp, typename _Alloc = allocator<_Tp>> struct vector {
> + __normal_iterator<typename _Vector_base<_Alloc>::pointer, int> begin();
> +};
> +template <typename _Tp> struct __shared_ptr_access {
> + _Tp *operator->();
> +};
> +namespace chrono {
> +struct weekday {
> + char _M_wd;
> + weekday _S_from_days() {
> + long __trans_tmp_3;
> + return 7 > __trans_tmp_3;
> + }
> + weekday(unsigned __wd) : _M_wd(__wd == 7 ?: __wd) {}
> + weekday() : weekday{_S_from_days()} {}
> + unsigned c_encoding() { return _M_wd; }
> + friend duration operator-(weekday __x, weekday __y) {
> + auto __n = __x.c_encoding() - __y.c_encoding();
> + return int(__n) >= 0 ? __n : duration{};
> + }
> +};
> +struct year_month_day {
> + year_month_day _S_from_days(duration __dp) {
> + auto __r0 = int(__dp.count()), __u2 = 5 * __r0;
> + year_month_day{__u2};
> + }
> + year_month_day();
> + year_month_day(int);
> + year_month_day(sys_time<long> __dp)
> + : year_month_day(_S_from_days(__dp.time_since_epoch())) {}
> +};
> +struct time_zone {
> + void _M_get_sys_info() const;
> +};
> +} // namespace chrono
> +namespace ranges {
> +struct {
> + template <
> + forward_range _Range, typename _Tp, typename _Proj,
> + indirect_strict_weak_order<_Tp, projected<_Range, _Proj>> _Comp = less>
> + borrowed_subrange_t<_Range> operator()(_Range, _Tp, _Comp, _Proj);
> +} equal_range;
> +} // namespace ranges
> +} // namespace std
> +namespace std::chrono {
> +struct Rule;
> +unsigned day_of_week;
> +struct _Node {
> + vector<Rule> rules;
> +};
> +char name;
> +struct Rule {
> + int from;
> + void start_time(int, long) {
> + year_month_day ymd;
> + sys_time<long> date;
> + duration diff = day_of_week - weekday{};
> + ymd = date + diff;
> + }
> +};
> +__shared_ptr_access<_Node> __trans_tmp_5;
> +void time_zone::_M_get_sys_info() const {
> + auto node = __trans_tmp_5;
> + auto rules = ranges::equal_range(node->rules, _M_get_sys_info_ri, {},
> name);
> + for (auto rule : rules)
> + rule.start_time(rule.from, {});
> +}
> +} // namespace std::chrono