https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106523

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Jakub Jelinek
<ja...@gcc.gnu.org>:

https://gcc.gnu.org/g:a558a4d3d1b488783b96dff7141d12e02ded3ad3

commit r12-9157-ga558a4d3d1b488783b96dff7141d12e02ded3ad3
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Tue Jan 17 12:14:25 2023 +0100

    forwprop: Fix up rotate pattern matching [PR106523]

    The comment above simplify_rotate roughly describes what patterns
    are matched into what:
       We are looking for X with unsigned type T with bitsize B, OP being
       +, | or ^, some type T2 wider than T.  For:
       (X << CNT1) OP (X >> CNT2)                           iff CNT1 + CNT2 == B
       ((T) ((T2) X << CNT1)) OP ((T) ((T2) X >> CNT2))     iff CNT1 + CNT2 == B

       transform these into:
       X r<< CNT1

       Or for:
       (X << Y) OP (X >> (B - Y))
       (X << (int) Y) OP (X >> (int) (B - Y))
       ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
       ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
       (X << Y) | (X >> ((-Y) & (B - 1)))
       (X << (int) Y) | (X >> (int) ((-Y) & (B - 1)))
       ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
       ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))

       transform these into (last 2 only if ranger can prove Y < B):
       X r<< Y

       Or for:
       (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
       (X << (int) (Y & (B - 1))) | (X >> (int) ((-Y) & (B - 1)))
       ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
       ((T) ((T2) X << (int) (Y & (B - 1)))) \
         | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))

       transform these into:
       X r<< (Y & (B - 1))

    The following testcase shows that 2 of these are problematic.
    If T2 is wider than T, then the 2 which use (-Y) & (B - 1) on one
    of the shift counts but Y on the other can do something different from
    rotate.  E.g.:
    __attribute__((noipa)) unsigned char
    f7 (unsigned char x, unsigned int y)
    {
      unsigned int t = x;
      return (t << y) | (t >> ((-y) & 7));
    }
    If y is in [0, 7], then it is a normal rotate, and if y is in [32, ~0U]
    then it is UB, but for y in [9, 31] the left shift in this case
    will never leave any bits in the result, while in a rotate they are
    left there.  Say for y 5 and x 0xaa the expression gives
    0x55, which is the same thing as rotate, while for y 19 and x 0xaa
    it gives 0x5, which is different.
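
    The two values quoted above can be reproduced with a small standalone
    sketch.  It assumes a 32-bit unsigned int; rotl8 and check_examples are
    hypothetical helper names used only for illustration and are not part
    of the patch or the testsuite:

```c
#include <assert.h>

/* The expression from the commit message, without the GCC attribute. */
unsigned char f7(unsigned char x, unsigned int y)
{
    unsigned int t = x;
    return (unsigned char)((t << y) | (t >> ((-y) & 7)));
}

/* Reference 8-bit rotate left (hypothetical helper). */
unsigned char rotl8(unsigned char x, unsigned int y)
{
    unsigned int k = y & 7;
    return (unsigned char)((x << k) | (x >> ((8 - k) & 7)));
}

/* Checks the two data points given in the commit message. */
void check_examples(void)
{
    /* y in [0, 7]: the expression behaves like a rotate. */
    assert(f7(0xaa, 5) == 0x55);
    assert(rotl8(0xaa, 5) == 0x55);
    /* y in [9, 31]: the left shift contributes nothing to the low byte. */
    assert(f7(0xaa, 19) == 0x05);
    assert(rotl8(0xaa, 19) == 0x55);
}
```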
    Now, I believe the
       ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
       ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
    forms are ok, because B - Y still needs to be a valid shift count,
    and if Y > B then B - Y should be either negative or very large
    positive (for unsigned types).
    And similarly the last 2 cases above which use & (B - 1) on both
    shift operands are definitely ok.
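
    As an illustration of why the forms masking both shift counts are safe,
    the following sketch compares such an expression against a reference
    rotate for every 8-bit x and for shift counts near both ends of the
    unsigned int range.  The names rot_masked, rotl8 and count_mismatches
    are hypothetical and not taken from the patch:

```c
#include <assert.h>

/* Both shift counts are masked with B - 1 == 7, computed in a wider type. */
unsigned char rot_masked(unsigned char x, unsigned int y)
{
    unsigned int t = x;
    return (unsigned char)((t << (y & 7)) | (t >> ((-y) & 7)));
}

/* Reference 8-bit rotate left (hypothetical helper). */
unsigned char rotl8(unsigned char x, unsigned int y)
{
    unsigned int k = y & 7;
    return (unsigned char)((x << k) | (x >> ((8 - k) & 7)));
}

/* Counts the (x, y) pairs where the two functions disagree. */
unsigned int count_mismatches(void)
{
    unsigned int bad = 0;
    for (unsigned int x = 0; x < 256; x++) {
        for (unsigned int i = 0; i < 64; i++) {
            /* Cover small counts and counts near the maximum value. */
            unsigned int ys[2] = { i, ~0u - i };
            for (int j = 0; j < 2; j++)
                if (rot_masked((unsigned char)x, ys[j])
                    != rotl8((unsigned char)x, ys[j]))
                    bad++;
        }
    }
    return bad;
}
```

    Because both counts stay in [0, 7], neither shift can be out of range
    for the wider type, so no value of y is problematic here.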

    The following patch disables the
       ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
       ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
    unless ranger says Y is not in [B, B2 - 1] range.
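
    One way for source code to supply such range information is an
    assertion on the shift count using __builtin_unreachable, which is what
    the testsuite changes below mention.  The following is only a sketch of
    that idea; the function name f7_checked and the exact condition are
    assumptions, not copied from rotate-2.c:

```c
/* Requires a compiler providing __builtin_unreachable (GCC, Clang). */
unsigned char f7_checked(unsigned char x, unsigned int y)
{
    /* Tell the compiler that y is in [0, 7]; calling with a larger
       value is undefined behavior.  */
    if (y >= 8)
        __builtin_unreachable();
    unsigned int t = x;
    return (unsigned char)((t << y) | (t >> ((-y) & 7)));
}
```

    With the count restricted like this, the expression is a plain 8-bit
    rotate for every value the function may legally receive.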

    And, looking at it again this morning, actually the Y equal to B
    case is still fine.  If Y is equal to 0, then it is
    (T) (((T2) X << 0) | ((T2) X >> 0))
    and so X, for Y == B it is
    (T) (((T2) X << B) | ((T2) X >> 0))
    which is the same as
    (T) (0 | ((T2) X >> 0))
    which is also X.  So instead of the [B, B2 - 1] range we could use
    [B + 1, B2 - 1].  And, if we wanted to go further, even multiplies
    of B are ok if they are smaller than B2, so we could construct a detailed
    int_range_max if we wanted.
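
    The claim about Y == B and other multiples of B can be checked by brute
    force for T == unsigned char and T2 == unsigned int.  This sketch
    assumes a 32-bit unsigned int; wide_expr, rotl8 and mismatch_mask are
    hypothetical names used only for illustration:

```c
#include <assert.h>

/* The problematic form: Y on one count, (-Y) & (B - 1) on the other. */
unsigned char wide_expr(unsigned char x, unsigned int y)
{
    unsigned int t = x;
    return (unsigned char)((t << y) | (t >> ((-y) & 7)));
}

/* Reference 8-bit rotate left (hypothetical helper). */
unsigned char rotl8(unsigned char x, unsigned int y)
{
    unsigned int k = y & 7;
    return (unsigned char)((x << k) | (x >> ((8 - k) & 7)));
}

/* Bit y of the result is set if some x makes the expression differ
   from a rotate for that y.  Only y in [0, 31] is tried, since larger
   counts are undefined for a 32-bit shift.  */
unsigned int mismatch_mask(void)
{
    unsigned int mask = 0;
    for (unsigned int y = 0; y < 32; y++)
        for (unsigned int x = 0; x < 256; x++)
            if (wide_expr((unsigned char)x, y) != rotl8((unsigned char)x, y)) {
                mask |= 1u << y;
                break;
            }
    return mask;
}
```

    Under the stated assumption the mask comes out as 0xfefefe00: the
    counts 0 through 8, 16 and 24 behave like a rotate, and every other
    count in [9, 31] does not.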

    2023-01-17  Jakub Jelinek  <ja...@redhat.com>

            PR tree-optimization/106523
            * tree-ssa-forwprop.cc (simplify_rotate): For the
            patterns with (-Y) & (B - 1) in one operand's shift
            count and Y in another, if T2 has wider precision than T,
            punt if Y could have a value in [B, B2 - 1] range.

            * c-c++-common/rotate-2.c (f5, f6, f7, f8, f13, f14, f15, f16,
            f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
            __builtin_unreachable about shift count.
            * c-c++-common/rotate-2b.c: New test.
            * c-c++-common/rotate-4.c (f5, f6, f7, f8, f13, f14, f15, f16,
            f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
            __builtin_unreachable about shift count.
            * c-c++-common/rotate-4b.c: New test.
            * gcc.c-torture/execute/pr106523.c: New test.

    (cherry picked from commit 001121e8921d5d1a439ce0e64ab04c5959b0bfd8)
