Jeff Law <j...@ventanamicro.com> writes: > So this is another nasty latent bug exposed by ext-dce. > > Similar to the prior m68k failure it's another problem with how we > handle paradoxical subregs on big endian targets. > > In this instance when we remove the hard subregs we take something like: > > (subreg:DI (reg:SI 0) 0) > > And turn it into > > (reg:SI -1) > > Which is clearly wrong. (reg:SI 0) is correct. > > > The transformation happens in alter_subreg, but I really wanted to fix > this in subreg_regno since we could have similar problems in some of the > other callers of subreg_regno. > > Unfortunately reload depends on the current behavior of subreg_regno; in > the cases where the return value is an invalid register, the wrong half > of a register pair, etc the resulting bogus value is detected by reload > and triggers reloading of the inner object. So that's the new comment > in subreg_regno. > > The second best place to fix is alter_subreg which is what this patch > does. If presented with a paradoxical subreg, then the base register > number should always be REGNO (SUBREG_REG (object)). It's just how > paradoxicals are designed to work. > > I haven't tried to fix the other places that call subreg_regno. After > being burned by reload, I'm more than a bit worried about unintended > fallout. > > I must admit I'm surprised we haven't stumbled over this before and that > it didn't fix any failures on the big endian embedded targets. > > Boostrapped & regression tested on x86_64, also went through all the > embedded targets in my tester and bootstrapped on m68k & s390x to get > some additional big endian testing. > > Pushing to the trunk. > > jeff > > > commit e9738e77674e23f600315ca1efed7d1c7944d0cc > Author: Jeff Law <j...@ventanamicro.com> > Date: Mon Aug 12 07:29:25 2024 -0600 > > [rtl-optimization/116244] Don't create bogus regs in alter_subreg > > So this is another nasty latent bug exposed by ext-dce. > > Similar to the prior m68k failure it's another problem with how we handle > paradoxical subregs on big endian targets. > > In this instance when we remove the hard subregs we take something like: > > (subreg:DI (reg:SI 0) 0) > > And turn it into > > (reg:SI -1) > > Which is clearly wrong. (reg:SI 0) is correct. > > The transformation happens in alter_subreg, but I really wanted to fix > this in > subreg_regno since we could have similar problems in some of the other > callers > of subreg_regno. > > Unfortunately reload depends on the current behavior of subreg_regno; in > the > cases where the return value is an invalid register, the wrong half of a > register pair, etc the resulting bogus value is detected by reload and > triggers > reloading of the inner object. So that's the new comment in subreg_regno. > > The second best place to fix is alter_subreg which is what this patch > does. If > presented with a paradoxical subreg, then the base register number should > always be REGNO (SUBREG_REG (object)). It's just how paradoxicals are > designed > to work. > > I haven't tried to fix the other places that call subreg_regno. After > being > burned by reload, I'm more than a bit worried about unintended fallout. > > I must admit I'm surprised we haven't stumbled over this before and that > it > didn't fix any failures on the big endian embedded targets. > > Boostrapped & regression tested on x86_64, also went through all the > embedded > targets in my tester and bootstrapped on m68k & s390x to get some > additional > big endian testing. > > Pushing to the trunk. > > rtl-optimization/116244 > gcc/ > * rtlanal.cc (subreg_regno): Update comment. > * final.cc (alter_subrg): Always use REGNO (SUBREG_REG ()) to get > the base regsiter for paradoxical subregs. > > gcc/testsuite/ > * g++.target/m68k/m68k.exp: New test driver. > * g++.target/m68k/pr116244.C: New test. > > diff --git a/gcc/final.cc b/gcc/final.cc > index eb9e065d9f0..0167b2f8602 100644 > --- a/gcc/final.cc > +++ b/gcc/final.cc > @@ -3123,7 +3123,17 @@ alter_subreg (rtx *xp, bool final_p) > unsigned int regno; > poly_int64 offset; > > - regno = subreg_regno (x); > + /* A paradoxical should always be REGNO (y) + 0. Using subreg_regno > + for something like (subreg:DI (reg:SI N) 0) on a WORDS_BIG_ENDIAN > + target will return N-1 which is catastrophic for N == 0 and just > + wrong for other cases. > + > + Fixing subreg_regno would be a better option, except that reload > + depends on its current behavior. */ > + if (paradoxical_subreg_p (x)) > + regno = REGNO (y); > + else > + regno = subreg_regno (x);
Are you sure that's right? For a 32-bit big-endian target, (subreg:DI (reg:SI 1) 0) really should simplify to (reg:DI 0) rather than (reg:DI 1). Like you say, this leads to an invalid result for (reg:SI 0) in place of (reg:SI 1). But AIUI, it's reload's/LRA's job to ensure that that never happens. This is part of the reason why LRA/IRA need to track the widest referenced mode. So it sounds like something might have gone wrong earlier/elsewhere. Thanks, Richard > if (subreg_lowpart_p (x)) > offset = byte_lowpart_offset (GET_MODE (x), GET_MODE (y)); > else > diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc > index 4158a531bdd..6f6e6544755 100644 > --- a/gcc/rtlanal.cc > +++ b/gcc/rtlanal.cc > @@ -4313,7 +4313,16 @@ lowpart_subreg_regno (unsigned int regno, machine_mode > xmode, > return simplify_subreg_regno (regno, xmode, offset, ymode); > } > > -/* Return the final regno that a subreg expression refers to. */ > +/* Return the final regno that a subreg expression refers to. > + > + Callers such as reload_inner_reg_of_subreg rely on this returning > + the simplified hard reg, even if that result is not a valid regno for > + the given mode. That triggers reloading the inner part of the > + subreg. > + > + That inherently means other uses of this routine probably need > + to be audited for their behavior when requested subreg can't > + be expressed as a hard register after apply the offset. */ > unsigned int > subreg_regno (const_rtx x) > { > diff --git a/gcc/testsuite/g++.target/m68k/m68k.exp > b/gcc/testsuite/g++.target/m68k/m68k.exp > new file mode 100644 > index 00000000000..8f6416e9fdf > --- /dev/null > +++ b/gcc/testsuite/g++.target/m68k/m68k.exp > @@ -0,0 +1,34 @@ > +# Copyright (C) 2019-2024 Free Software Foundation, Inc. > + > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with GCC; see the file COPYING3. If not see > +# <http://www.gnu.org/licenses/>. > + > +# GCC testsuite that uses the `dg.exp' driver. > + > +# Exit immediately if this isn't a m68k target. > +if ![istarget m68k*-*-*] then { > + return > +} > + > +# Load support procs. > +load_lib g++-dg.exp > + > +# Initialize `dg'. > +dg-init > + > +# Main loop. > +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C]] "" "" > + > +# All done. > +dg-finish > diff --git a/gcc/testsuite/g++.target/m68k/pr116244.C > b/gcc/testsuite/g++.target/m68k/pr116244.C > new file mode 100644 > index 00000000000..2e78832efbb > --- /dev/null > +++ b/gcc/testsuite/g++.target/m68k/pr116244.C > @@ -0,0 +1,226 @@ > +// { dg-do compile } > +// { dg-additional-options "-std=gnu++20 -O2 -w -march=isaa" } > +namespace std { > +struct __conditional { > + template <typename _Tp, typename> using type = _Tp; > +}; > +template <bool, typename _If, typename _Else> > +using __conditional_t = __conditional::type<_If, _Else>; > +template <typename> constexpr bool is_lvalue_reference_v = true; > +template <typename> bool is_same_v; > +template <typename _From, typename _To> > +constexpr bool is_convertible_v = __is_convertible(_From, _To); > +template <typename> bool is_invocable_v; > +template <typename...> using common_reference_t = int; > +template <typename _Iterator> struct iterator_traits : _Iterator {}; > +namespace __detail { > +template <typename, typename _Up> > +concept __same_as = is_same_v<_Up>; > +} > +template <typename _Tp, typename _Up> > +concept same_as = __detail::__same_as<_Up, _Tp>; > +template <typename _Derived, typename _Base> > +concept derived_from = is_convertible_v<_Derived, _Base>; > +template <typename, typename> > +concept common_reference_with = > + same_as<common_reference_t<>, common_reference_t<>>; > +namespace __detail { > +template <typename _Tp> > +concept __boolean_testable = requires(_Tp __t) { __t; }; > +template <typename _Tp, typename> > +concept __weakly_eq_cmp_with = requires(_Tp __t) { __t; }; > +} // namespace __detail > +template <typename... _Args> > +concept invocable = is_invocable_v<_Args...>; > +template <typename> > +concept regular_invocable = invocable<>; > +template <typename _Fn> > +concept predicate = __detail::__boolean_testable<_Fn>; > +template <typename _Rel, typename, typename> > +concept relation = predicate<_Rel>; > +template <typename _Rel, typename _Tp, typename _Up> > +concept strict_weak_order = relation<_Rel, _Tp, _Up>; > +namespace { > +struct duration { > + duration() = default; > + template <typename _Rep2> duration(_Rep2 __rep) : __r(__rep) {} > + long long count() { return __r; } > + long long __r; > +}; > +long long operator+(duration __lhs, duration __rhs) { > + long __trans_tmp_1 = __lhs.count(); > + return __trans_tmp_1 + __rhs.count(); > +} > +struct time_point { > + time_point(); > + time_point(duration __dur) : __d(__dur) {} > + duration time_since_epoch() { return __d; } > + duration __d; > +}; > +time_point operator+(time_point __lhs, duration __rhs) { > + duration __trans_tmp_2 = __lhs.time_since_epoch(); > + return time_point(__trans_tmp_2 + __rhs); > +} > +template <typename> using sys_time = time_point; > +} // namespace > +template <typename> class allocator; > +namespace { > +struct less {}; > +} // namespace > +struct forward_iterator_tag {}; > +namespace __detail { > +template <typename _Iter> struct __iter_traits_impl { > + using type = iterator_traits<_Iter>; > +}; > +template <typename _Iter> using __iter_traits = > __iter_traits_impl<_Iter>::type; > +template <typename> struct __iter_concept_impl; > +template <typename _Iter> > + requires requires { typename _Iter; } > +struct __iter_concept_impl<_Iter> { > + using type = __iter_traits<_Iter>::iterator_concept; > +}; > +template <typename _Iter> > +using __iter_concept = __iter_concept_impl<_Iter>::type; > +template <typename _In> > +concept __indirectly_readable_impl = common_reference_with<_In, _In>; > +} // namespace __detail > +template <typename _In> > +concept indirectly_readable = __detail::__indirectly_readable_impl<_In>; > +template <typename _Iter> > +concept input_or_output_iterator = requires(_Iter __i) { __i; }; > +template <typename _Sent, typename _Iter> > +concept sentinel_for = __detail::__weakly_eq_cmp_with<_Sent, _Iter>; > +template <typename _Iter> > +concept forward_iterator = > + derived_from<__detail::__iter_concept<_Iter>, forward_iterator_tag>; > +template <typename _Fn, typename> > +concept indirectly_regular_unary_invocable = regular_invocable<_Fn>; > +template <typename _Fn, typename _I1, typename _I2> > +concept indirect_strict_weak_order = strict_weak_order<_Fn, _I1, _I2>; > +namespace __detail { > +template <typename, typename> struct __projected; > +} > +template <indirectly_readable _Iter, > + indirectly_regular_unary_invocable<_Iter> _Proj> > +using projected = __detail::__projected<_Iter, _Proj>; > +namespace ranges::__access { > +template <typename _Tp> auto __begin(_Tp __t) { return __t.begin(); } > +} // namespace ranges::__access > +namespace __detail { > +template <typename _Tp> > +using __range_iter_t = decltype(ranges::__access::__begin(_Tp())); > +} > +template <typename _Tp> struct iterator_traits<_Tp *> { > + typedef forward_iterator_tag iterator_concept; > + using reference = _Tp; > +}; > +} // namespace std > +template <typename _Iterator, typename> struct __normal_iterator { > + using iterator_concept = std::__detail::__iter_concept<_Iterator>; > + std::iterator_traits<_Iterator>::reference operator*(); > + void operator++(); > +}; > +template <typename _IteratorL, typename _IteratorR, typename _Container> > +bool operator==(__normal_iterator<_IteratorL, _Container>, > + __normal_iterator<_IteratorR, _Container>); > +int _M_get_sys_info_ri; > +namespace std { > +template <typename> struct allocator_traits; > +template <typename _Tp> struct allocator_traits<allocator<_Tp>> { > + using pointer = _Tp *; > +}; > +namespace { > +namespace __detail { > +template <typename _Tp> > +concept __maybe_borrowed_range = is_lvalue_reference_v<_Tp>; > +} > +int end; > +template <typename> > +concept range = requires { end; }; > +template <typename _Tp> > +concept borrowed_range = __detail::__maybe_borrowed_range<_Tp>; > +template <typename _Tp> using iterator_t = > std::__detail::__range_iter_t<_Tp>; > +template <typename _Tp> > +concept forward_range = forward_iterator<iterator_t<_Tp>>; > +struct dangling; > +template <input_or_output_iterator _It, sentinel_for<_It> _Sent = _It> > +struct subrange { > + _It begin(); > + _Sent end(); > +}; > +template <range _Range> > +using borrowed_subrange_t = > + __conditional_t<borrowed_range<_Range>, subrange<iterator_t<_Range>>, > + dangling>; > +} // namespace > +template <typename _Alloc> struct _Vector_base { > + typedef allocator_traits<_Alloc>::pointer pointer; > +}; > +template <typename _Tp, typename _Alloc = allocator<_Tp>> struct vector { > + __normal_iterator<typename _Vector_base<_Alloc>::pointer, int> begin(); > +}; > +template <typename _Tp> struct __shared_ptr_access { > + _Tp *operator->(); > +}; > +namespace chrono { > +struct weekday { > + char _M_wd; > + weekday _S_from_days() { > + long __trans_tmp_3; > + return 7 > __trans_tmp_3; > + } > + weekday(unsigned __wd) : _M_wd(__wd == 7 ?: __wd) {} > + weekday() : weekday{_S_from_days()} {} > + unsigned c_encoding() { return _M_wd; } > + friend duration operator-(weekday __x, weekday __y) { > + auto __n = __x.c_encoding() - __y.c_encoding(); > + return int(__n) >= 0 ? __n : duration{}; > + } > +}; > +struct year_month_day { > + year_month_day _S_from_days(duration __dp) { > + auto __r0 = int(__dp.count()), __u2 = 5 * __r0; > + year_month_day{__u2}; > + } > + year_month_day(); > + year_month_day(int); > + year_month_day(sys_time<long> __dp) > + : year_month_day(_S_from_days(__dp.time_since_epoch())) {} > +}; > +struct time_zone { > + void _M_get_sys_info() const; > +}; > +} // namespace chrono > +namespace ranges { > +struct { > + template < > + forward_range _Range, typename _Tp, typename _Proj, > + indirect_strict_weak_order<_Tp, projected<_Range, _Proj>> _Comp = less> > + borrowed_subrange_t<_Range> operator()(_Range, _Tp, _Comp, _Proj); > +} equal_range; > +} // namespace ranges > +} // namespace std > +namespace std::chrono { > +struct Rule; > +unsigned day_of_week; > +struct _Node { > + vector<Rule> rules; > +}; > +char name; > +struct Rule { > + int from; > + void start_time(int, long) { > + year_month_day ymd; > + sys_time<long> date; > + duration diff = day_of_week - weekday{}; > + ymd = date + diff; > + } > +}; > +__shared_ptr_access<_Node> __trans_tmp_5; > +void time_zone::_M_get_sys_info() const { > + auto node = __trans_tmp_5; > + auto rules = ranges::equal_range(node->rules, _M_get_sys_info_ri, {}, > name); > + for (auto rule : rules) > + rule.start_time(rule.from, {}); > +} > +} // namespace std::chrono