https://gcc.gnu.org/g:b6b238ddcb119bb51555ead9be0fa7b06b8a6be7

commit r16-1191-gb6b238ddcb119bb51555ead9be0fa7b06b8a6be7
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Thu Jun 5 18:10:22 2025 +0200

    ranger: Add support for float <-> int casts [PR120231]
    
    The following patch adds support for float <-> integer conversions in
    ranger.
    The patch reverts part of the r16-571 changes; those changes were right
    for fold_range, but not for op1_range, where RO_IFI and RO_FIF are actually
    called rather than the RO_IFF and RO_FII that r16-571 expected.
    Also, the float -> int operation actually uses the FIX_TRUNC_EXPR tree code
    rather than NOP_EXPR or CONVERT_EXPR, and int -> float uses FLOAT_EXPR,
    but I think we can just handle all of them using operator_cast, at least
    as long as we don't try to handle VIEW_CONVERT_EXPR with it too; I'm not
    really sure handling VCE, at least for floating to integral or vice versa,
    would actually be useful though.
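
To make the forward direction concrete, here is a minimal standalone sketch (not GCC code) of the two folds. IntRange, FloatRange and both function names are made up for illustration. Returning false stands in for ranger giving up and using a varying range, and the int -> float case assumes the default round-to-nearest mode rather than -frounding-math.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical closed intervals; these are not GCC's irange/frange.
struct IntRange { long long lo, hi; };
struct FloatRange { double lo, hi; bool maybe_nan; };

// int -> float: the conversion is monotonic under round-to-nearest,
// so converting the two endpoints bounds every converted value.
FloatRange fold_int_to_float (IntRange op1)
{
  return FloatRange{ static_cast<double> (op1.lo),
                     static_cast<double> (op1.hi), false };
}

// float -> int: truncate both endpoints toward zero.  Returns false
// (meaning "no useful range") if NaN or infinity is possible, or if a
// truncated endpoint falls outside the target type [tmin, tmax].
bool fold_float_to_int (IntRange &r, FloatRange op1,
                        long long tmin, long long tmax)
{
  // Assumption of this sketch: the type limits are exactly
  // representable as double, so the comparisons below are exact.
  const long long exact = 1LL << 53;
  assert (tmin >= -exact && tmax <= exact && tmin <= tmax);

  if (op1.maybe_nan || std::isinf (op1.lo) || std::isinf (op1.hi))
    return false;
  double lo = std::trunc (op1.lo);
  double hi = std::trunc (op1.hi);
  if (lo < static_cast<double> (tmin) || hi > static_cast<double> (tmax))
    return false;
  r = IntRange{ static_cast<long long> (lo), static_cast<long long> (hi) };
  return true;
}
```

With the bounds used by f7 in the new pr120231-2.c test, [-78.5, 98.25] converted to signed char folds to [-78, 98], which is why the link_failure call guarded by a < -78 || a > 98 can be removed.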
    
    The patch "regressed" two tests, gfortran.dg/inline_matmul_16.f90 and
    g++.dg/tree-ssa/loop-split-1.C.  In the first case, there is a loop doing
    matmul on various sizes of matrices, up to 10x10 matrices, and Fortran
    FE given the options emits two implementations of the matmul, one inline
    for the case where the matmul has less than 1000 elements and one for
    larger matmuls.  The check for whatever reason uses floating point
    calculations and before this patch we weren't able to prove that all the
    matrices will be smaller than the cutoff and the test was checking for
    presence of the fallback call; with the patch we are able to figure it
    out and only keep the inline copy.  I've duplicated the test: the
    unmodified source no longer expects the _gfortran_matmul string in the
    optimized dump, and a second copy uses a volatile ten instead of 10 in the
    loop upper bounds so that it has to keep the fallback, and scans for it.
    The other test is g++.dg/tree-ssa/loop-split-1.C, which does
    constexpr unsigned s = 100000000;
    ...
        for(unsigned i = 0; i < s; ++i)
        {
            if(i == 0)
                a[i] = b[i] * c[i];
            else
                a[i] = (b[i] + c[i]) * c[i-1] * std::log(i);
        }
    and for some reason the successful loop splitting for which the test
    searches in a dump file is dependent on the errno fallback of std::log,
    where we do t = std::log((double)i); if ((double)i u> 0); else log ((double)i);
    But i goes only from 1 to 100000000, so (double)i has the range
    [1.0, 100000000.0] with the patch, and so we see it will never need to set
    errno or raise an exception.  I've tested adding + d to it, where d is 0.0
    but modifiable in some other TU, and tested it also with r14-2851 and
    r14-2852, where the former FAILed the test both unmodified and modified,
    while the latter PASSed both versions.
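
The reasoning about the std::log fallback can be sketched as follows. This is an illustrative standalone example, not GCC code: FloatRange and the three helper names are hypothetical, and the check only models the "argument is known to be greater than zero" condition described above.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>

// Hypothetical closed interval of doubles (no NaN tracking here).
struct FloatRange { double lo, hi; };

// Range of (double) i for an unsigned i in [first, last].  Both values
// are exactly representable, so the endpoints convert without rounding.
FloatRange range_of_converted_index (unsigned first, unsigned last)
{
  assert (first <= last);
  return FloatRange{ static_cast<double> (first),
                     static_cast<double> (last) };
}

// The errno-setting fallback call is only needed if the argument might
// not be greater than zero.
bool log_needs_errno_fallback (FloatRange arg)
{
  return !(arg.lo > 0.0);
}

// Runtime cross-check for a single value: does std::log leave errno alone?
bool log_left_errno_untouched (unsigned i)
{
  errno = 0;
  volatile double t = std::log (static_cast<double> (i));
  (void) t;
  return errno == 0;
}
```

For the else branch of the loop, i is in [1, 100000000], so the sketch reports that no fallback is needed. Adding an unknown d to the argument, as the adjusted test does, removes that guarantee, because the sum no longer has a known positive lower bound.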
    
    2025-06-05  Jakub Jelinek  <ja...@redhat.com>
    
            PR tree-optimization/120231
            * range-op.cc (range_op_table::range_op_table): Register op_cast
            also for FLOAT_EXPR and FIX_TRUNC_EXPR.
            (RO_III): Adjust comment.
            (range_op_handler::op1_range): Handle RO_IFI rather than RO_IFF.
            Don't handle RO_FII.
            (range_operator::op1_range): Remove overload with
            irange &, tree, const frange &, const frange &, relation_trio
            and frange &, tree, const irange &, const irange &, relation_trio
            arguments.  Add overload with
            irange &, tree, const frange &, const irange &, relation_trio
            arguments.
            * range-op-mixed.h (operator_cast::op1_range): Remove overload with
            irange &, tree, const frange &, const frange &, relation_trio
            and frange &, tree, const irange &, const irange &, relation_trio
            arguments.  Add overload with
            irange &, tree, const frange &, const irange &, relation_trio and
            frange &, tree, const irange &, const frange &, relation_trio
            arguments.
            * range-op.h (range_operator::op1_range): Remove overload with
            irange &, tree, const frange &, const frange &, relation_trio
            and frange &, tree, const irange &, const irange &, relation_trio
            arguments.  Add overload with
            irange &, tree, const frange &, const irange &, relation_trio
            arguments.
            * range-op-float.cc (operator_cast::fold_range): Implement
            float to int and int to float casts.
            (operator_cast::op1_range): Remove overload with
            irange &, tree, const frange &, const frange &, relation_trio
            and frange &, tree, const irange &, const irange &, relation_trio
            arguments.  Add overload with
            irange &, tree, const frange &, const irange &, relation_trio and
            frange &, tree, const irange &, const frange &, relation_trio
            arguments and implement reverse op of float to int and int to float
            cast there.
    
            * gcc.dg/tree-ssa/pr120231-2.c: New test.
            * gcc.dg/tree-ssa/pr120231-3.c: New test.
            * gfortran.dg/inline_matmul_16.f90: Don't expect any
            _gfortran_matmul strings in optimized dump.
            * gfortran.dg/inline_matmul_26.f90: New test.
            * g++.dg/tree-ssa/loop-split-1.C (d): New variable.
            (main): Use std::log (i + d) instead of std::log (i).

Diff:
---
 gcc/range-op-float.cc                          | 159 +++++++++++++++++++++++--
 gcc/range-op-mixed.h                           |   4 +-
 gcc/range-op.cc                                |  19 +--
 gcc/range-op.h                                 |   4 -
 gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C   |   3 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr120231-2.c     | 107 +++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr120231-3.c     |  40 +++++++
 gcc/testsuite/gfortran.dg/inline_matmul_16.f90 |   2 +-
 gcc/testsuite/gfortran.dg/inline_matmul_26.f90 |  36 ++++++
 9 files changed, 342 insertions(+), 32 deletions(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index ea344a4d043f..32a6cd7e72bc 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -3022,34 +3022,173 @@ operator_cast::op1_range (frange &r, tree type, const frange &lhs,
 
 // Implement fold for a cast from float to an int.
 bool
-operator_cast::fold_range (irange &, tree, const frange &,
+operator_cast::fold_range (irange &r, tree type, const frange &op1,
                           const irange &, relation_trio) const
 {
-  return false;
+  if (empty_range_varying (r, type, op1, op1))
+    return true;
+  if (op1.maybe_isnan () || op1.maybe_isinf ())
+    {
+      r.set_varying (type);
+      return true;
+    }
+  REAL_VALUE_TYPE lb, ub;
+  real_trunc (&lb, VOIDmode, &op1.lower_bound ());
+  real_trunc (&ub, VOIDmode, &op1.upper_bound ());
+  REAL_VALUE_TYPE l, u;
+  l = real_value_from_int_cst (NULL_TREE, TYPE_MIN_VALUE (type));
+  if (real_less (&lb, &l))
+    {
+      r.set_varying (type);
+      return true;
+    }
+  u = real_value_from_int_cst (NULL_TREE, TYPE_MAX_VALUE (type));
+  if (real_less (&u, &ub))
+    {
+      r.set_varying (type);
+      return true;
+    }
+  bool fail = false;
+  wide_int wlb = real_to_integer (&lb, &fail, TYPE_PRECISION (type));
+  wide_int wub = real_to_integer (&ub, &fail, TYPE_PRECISION (type));
+  if (fail)
+    {
+      r.set_varying (type);
+      return true;
+    }
+  r.set (type, wlb, wub);
+  return true;
 }
 
 // Implement op1_range for a cast from float to an int.
 bool
-operator_cast::op1_range (frange &, tree, const irange &,
-                         const irange &, relation_trio) const
+operator_cast::op1_range (frange &r, tree type, const irange &lhs,
+                         const frange &, relation_trio) const
 {
-  return false;
+  if (lhs.undefined_p ())
+    return false;
+  REAL_VALUE_TYPE lb, lbo, ub, ubo;
+  wide_int lhs_lb = lhs.lower_bound ();
+  wide_int lhs_ub = lhs.upper_bound ();
+  tree lhs_type = lhs.type ();
+  enum machine_mode mode = TYPE_MODE (type);
+  real_from_integer (&lbo, VOIDmode, lhs_lb, TYPE_SIGN (lhs_type));
+  real_from_integer (&ubo, VOIDmode, lhs_ub, TYPE_SIGN (lhs_type));
+  real_convert (&lb, mode, &lbo);
+  real_convert (&ub, mode, &ubo);
+  if (real_identical (&lb, &lbo))
+    {
+      /* If low bound is exactly representable in type,
+        use nextafter (lb - 1., +inf).  */
+      real_arithmetic (&lb, PLUS_EXPR, &lbo, &dconstm1);
+      real_convert (&lb, mode, &lb);
+      if (!real_identical (&lb, &lbo))
+       frange_nextafter (mode, lb, dconstinf);
+      if (real_identical (&lb, &lbo))
+       frange_nextafter (mode, lb, dconstninf);
+    }
+  else if (real_less (&lbo, &lb))
+    frange_nextafter (mode, lb, dconstninf);
+  if (real_identical (&ub, &ubo))
+    {
+      /* If upper bound is exactly representable in type,
+        use nextafter (ub + 1., -inf).  */
+      real_arithmetic (&ub, PLUS_EXPR, &ubo, &dconst1);
+      real_convert (&ub, mode, &ub);
+      if (!real_identical (&ub, &ubo))
+       frange_nextafter (mode, ub, dconstninf);
+      if (real_identical (&ub, &ubo))
+       frange_nextafter (mode, ub, dconstinf);
+    }
+  else if (real_less (&ub, &ubo))
+    frange_nextafter (mode, ub, dconstinf);
+  r.set (type, lb, ub, nan_state (false));
+  return true;
 }
 
 // Implement fold for a cast from int to a float.
 bool
-operator_cast::fold_range (frange &, tree, const irange &,
+operator_cast::fold_range (frange &r, tree type, const irange &op1,
                           const frange &, relation_trio) const
 {
-  return false;
+  if (empty_range_varying (r, type, op1, op1))
+    return true;
+  REAL_VALUE_TYPE lb, ub;
+  wide_int op1_lb = op1.lower_bound ();
+  wide_int op1_ub = op1.upper_bound ();
+  tree op1_type = op1.type ();
+  enum machine_mode mode = flag_rounding_math ? VOIDmode : TYPE_MODE (type);
+  real_from_integer (&lb, mode, op1_lb, TYPE_SIGN (op1_type));
+  real_from_integer (&ub, mode, op1_ub, TYPE_SIGN (op1_type));
+  if (flag_rounding_math)
+    {
+      REAL_VALUE_TYPE lbo = lb, ubo = ub;
+      mode = TYPE_MODE (type);
+      real_convert (&lb, mode, &lb);
+      real_convert (&ub, mode, &ub);
+      if (real_less (&lbo, &lb))
+       frange_nextafter (mode, lb, dconstninf);
+      if (real_less (&ub, &ubo))
+       frange_nextafter (mode, ub, dconstinf);
+    }
+  r.set (type, lb, ub, nan_state (false));
+  frange_drop_infs (r, type);
+  if (r.undefined_p ())
+    r.set_varying (type);
+  return true;
 }
 
 // Implement op1_range for a cast from int to a float.
 bool
-operator_cast::op1_range (irange &, tree, const frange &,
-                         const frange &, relation_trio) const
+operator_cast::op1_range (irange &r, tree type, const frange &lhs,
+                         const irange &, relation_trio) const
 {
-  return false;
+  if (lhs.undefined_p ())
+    return false;
+  if (lhs.known_isnan ())
+    {
+      r.set_varying (type);
+      return true;
+    }
+  REAL_VALUE_TYPE lb = lhs.lower_bound ();
+  REAL_VALUE_TYPE ub = lhs.upper_bound ();
+  enum machine_mode mode = TYPE_MODE (lhs.type ());
+  frange_nextafter (mode, lb, dconstninf);
+  frange_nextafter (mode, ub, dconstinf);
+  if (flag_rounding_math)
+    {
+      real_floor (&lb, mode, &lb);
+      real_ceil (&ub, mode, &ub);
+    }
+  else
+    {
+      real_trunc (&lb, mode, &lb);
+      real_trunc (&ub, mode, &ub);
+    }
+  REAL_VALUE_TYPE l, u;
+  wide_int wlb, wub;
+  l = real_value_from_int_cst (NULL_TREE, TYPE_MIN_VALUE (type));
+  if (real_less (&lb, &l))
+    wlb = wi::min_value (TYPE_PRECISION (type), TYPE_SIGN (type));
+  else
+    {
+      bool fail = false;
+      wlb = real_to_integer (&lb, &fail, TYPE_PRECISION (type));
+      if (fail)
+       wlb = wi::min_value (TYPE_PRECISION (type), TYPE_SIGN (type));
+    }
+  u = real_value_from_int_cst (NULL_TREE, TYPE_MAX_VALUE (type));
+  if (real_less (&u, &ub))
+    wub = wi::max_value (TYPE_PRECISION (type), TYPE_SIGN (type));
+  else
+    {
+      bool fail = false;
+      wub = real_to_integer (&ub, &fail, TYPE_PRECISION (type));
+      if (fail)
+       wub = wi::max_value (TYPE_PRECISION (type), TYPE_SIGN (type));
+    }
+  r.set (type, wlb, wub);
+  return true;
 }
 
 // Initialize any float operators to the primary table
diff --git a/gcc/range-op-mixed.h b/gcc/range-op-mixed.h
index 0edc06ebe116..f8f183069046 100644
--- a/gcc/range-op-mixed.h
+++ b/gcc/range-op-mixed.h
@@ -499,10 +499,10 @@ public:
                  const frange &lhs, const frange &op2,
                  relation_trio = TRIO_VARYING) const final override;
   bool op1_range (frange &r, tree type,
-                 const irange &lhs, const irange &op2,
+                 const irange &lhs, const frange &op2,
                  relation_trio = TRIO_VARYING) const final override;
   bool op1_range (irange &r, tree type,
-                 const frange &lhs, const frange &op2,
+                 const frange &lhs, const irange &op2,
                  relation_trio = TRIO_VARYING) const final override;
 
   relation_kind lhs_op1_relation (const irange &lhs,
diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 04128ee2dde5..0a3f0b6b56c7 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -97,6 +97,8 @@ range_op_table::range_op_table ()
   set (INTEGER_CST, op_cst);
   set (NOP_EXPR, op_cast);
   set (CONVERT_EXPR, op_cast);
+  set (FLOAT_EXPR, op_cast);
+  set (FIX_TRUNC_EXPR, op_cast);
   set (PLUS_EXPR, op_plus);
   set (ABS_EXPR, op_abs);
   set (MINUS_EXPR, op_minus);
@@ -165,7 +167,7 @@ dispatch_trio (unsigned lhs, unsigned op1, unsigned op2)
 // of the routines in range_operator.  Note the last 3 characters are
 // shorthand for the LHS, OP1, and OP2 range discriminator class.
 // Reminder, single operand instructions use the LHS type for op2, even if
-// unused. so FLOAT = INT would be RO_FIF.
+// unused.  So FLOAT = INT would be RO_FIF.
 
 const unsigned RO_III =        dispatch_trio (VR_IRANGE, VR_IRANGE, VR_IRANGE);
 const unsigned RO_IFI = dispatch_trio (VR_IRANGE, VR_FRANGE, VR_IRANGE);
@@ -298,10 +300,10 @@ range_op_handler::op1_range (vrange &r, tree type,
        return m_operator->op1_range (as_a <irange> (r), type,
                                      as_a <irange> (lhs),
                                      as_a <irange> (op2), rel);
-      case RO_IFF:
+      case RO_IFI:
        return m_operator->op1_range (as_a <irange> (r), type,
                                      as_a <frange> (lhs),
-                                     as_a <frange> (op2), rel);
+                                     as_a <irange> (op2), rel);
       case RO_PPP:
        return m_operator->op1_range (as_a <prange> (r), type,
                                      as_a <prange> (lhs),
@@ -322,10 +324,6 @@ range_op_handler::op1_range (vrange &r, tree type,
        return m_operator->op1_range (as_a <frange> (r), type,
                                      as_a <irange> (lhs),
                                      as_a <frange> (op2), rel);
-      case RO_FII:
-       return m_operator->op1_range (as_a <frange> (r), type,
-                                     as_a <irange> (lhs),
-                                     as_a <irange> (op2), rel);
       case RO_FFF:
        return m_operator->op1_range (as_a <frange> (r), type,
                                      as_a <frange> (lhs),
@@ -785,13 +783,6 @@ range_operator::fold_range (frange &, tree, const irange &,
 
 bool
 range_operator::op1_range (irange &, tree, const frange &,
-                          const frange &, relation_trio) const
-{
-  return false;
-}
-
-bool
-range_operator::op1_range (frange &, tree, const irange &,
                           const irange &, relation_trio) const
 {
   return false;
diff --git a/gcc/range-op.h b/gcc/range-op.h
index 107578613783..9e0e65187c82 100644
--- a/gcc/range-op.h
+++ b/gcc/range-op.h
@@ -152,10 +152,6 @@ public:
                          relation_trio = TRIO_VARYING) const;
   virtual bool op1_range (irange &r, tree type,
                          const frange &lhs,
-                         const frange &op2,
-                         relation_trio = TRIO_VARYING) const;
-  virtual bool op1_range (frange &r, tree type,
-                         const irange &lhs,
                          const irange &op2,
                          relation_trio = TRIO_VARYING) const;
 
diff --git a/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C b/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C
index 898100653348..4df85f5e036c 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/loop-split-1.C
@@ -6,6 +6,7 @@
 #include <cmath>
 
 constexpr unsigned s = 100000000;
+double d = 0.0;
 
 int main()
 {
@@ -19,7 +20,7 @@ int main()
         if(i == 0)
             a[i] = b[i] * c[i];
         else
-            a[i] = (b[i] + c[i]) * c[i-1] * std::log(i);
+            a[i] = (b[i] + c[i]) * c[i-1] * std::log(i + d);
     }
 }
 /* { dg-final { scan-tree-dump-times "loop split" 1 "lsplit" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr120231-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr120231-2.c
new file mode 100644
index 000000000000..d2b41baacda7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr120231-2.c
@@ -0,0 +1,107 @@
+/* PR tree-optimization/120231 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-add-options float64 } */
+/* { dg-require-effective-target float64 } */
+/* { dg-final { scan-tree-dump-not "link_failure \\\(\\\);" "optimized" } } */
+
+void link_failure (void);
+
+static _Float64 __attribute__((noinline))
+f1 (signed char x)
+{
+  return x;
+}
+
+static _Float64 __attribute__((noinline))
+f2 (signed char x)
+{
+  if (x >= -37 && x <= 42)
+    return x;
+  return 0.0f64;
+}
+
+void
+f3 (signed char x)
+{
+  _Float64 y = f1 (x);
+  if (y < (_Float64) (-__SCHAR_MAX__ - 1) || y > (_Float64) __SCHAR_MAX__)
+    link_failure ();
+  y = f2 (x);
+  if (y < -37.0f64 || y > 42.0f64)
+    link_failure ();
+}
+
+static _Float64 __attribute__((noinline))
+f4 (long long x)
+{
+  return x;
+}
+
+static _Float64 __attribute__((noinline))
+f5 (long long x)
+{
+  if (x >= -0x3ffffffffffffffeLL && x <= 0x3ffffffffffffffeLL)
+    return x;
+  return 0.0f64;
+}
+
+void
+f6 (long long x)
+{
+  _Float64 y = f4 (x);
+  if (y < (_Float64) (-__LONG_LONG_MAX__ - 1) || y > (_Float64) __LONG_LONG_MAX__)
+    link_failure ();
+  y = f5 (x);
+  if (y < (_Float64) -0x3ffffffffffffffeLL || y > (_Float64) 0x3ffffffffffffffeLL)
+    link_failure ();
+}
+
+static signed char __attribute__((noinline))
+f7 (_Float64 x)
+{
+  if (x >= -78.5f64 && x <= 98.25f64)
+    return x;
+  return 0;
+}
+
+static unsigned char __attribute__((noinline))
+f8 (_Float64 x)
+{
+  if (x >= -0.75f64 && x <= 231.625f64)
+    return x;
+  return 31;
+}
+
+static long long __attribute__((noinline))
+f9 (_Float64 x)
+{
+  if (x >= -3372587051122780362.75f64 && x <= 3955322825938799366.25f64)
+    return x;
+  return 0;
+}
+
+static unsigned long long __attribute__((noinline))
+f10 (_Float64 x)
+{
+  if (x >= 31.25f64 && x <= 16751991430751148048.125f64)
+    return x;
+  return 4700;
+}
+
+void
+f11 (_Float64 x)
+{
+  signed char a = f7 (x);
+  if (a < -78 || a > 98)
+    link_failure ();
+  unsigned char b = f8 (x);
+  if (b > 231)
+    link_failure ();
+  long long c = f9 (x);
+  if (c < -3372587051122780160LL || c > 3955322825938799616LL)
+    link_failure ();
+  unsigned long long d = f10 (x);
+  if (d < 31 || d > 16751991430751148032ULL)
+    link_failure ();
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr120231-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr120231-3.c
new file mode 100644
index 000000000000..d578c5b669f2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr120231-3.c
@@ -0,0 +1,40 @@
+/* PR tree-optimization/120231 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-add-options float64 } */
+/* { dg-require-effective-target float64 } */
+/* { dg-final { scan-tree-dump-not "link_failure \\\(\\\);" "optimized" } } */
+
+void link_failure (void);
+
+void
+foo (long long x)
+{
+  _Float64 y = x;
+  if (y >= -8577328745032543176.25f64 && y <= 699563045341050951.75f64)
+    {
+      if (x < -8577328745032544256LL || x > 699563045341051136LL)
+       link_failure ();
+    }
+  if (y >= -49919160463252.125f64 && y <= 757060336735329.625f64)
+    {
+      if (x < -49919160463252LL || x > 757060336735329LL)
+       link_failure ();
+    }
+}
+
+void
+bar (_Float64 x)
+{
+  long long y = x;
+  if (y >= -6923230004751524066LL && y <= 2202103129706786704LL)
+    {
+      if (x < -6923230004751524864.0f64 || x > 2202103129706786816.0f64)
+       link_failure ();
+    }
+  if (y >= -171621738469699LL && y <= 45962470357748LL)
+    {
+      if (x <= -1716217384696970.f64 || x >= 45962470357749.0f64)
+       link_failure ();
+    }
+}
diff --git a/gcc/testsuite/gfortran.dg/inline_matmul_16.f90 b/gcc/testsuite/gfortran.dg/inline_matmul_16.f90
index 580cb1ac9393..bb1a3cb48ab4 100644
--- a/gcc/testsuite/gfortran.dg/inline_matmul_16.f90
+++ b/gcc/testsuite/gfortran.dg/inline_matmul_16.f90
@@ -58,4 +58,4 @@ program main
      end do
   end do
 end program main
-! { dg-final { scan-tree-dump-times "_gfortran_matmul" 1 "optimized" } }
+! { dg-final { scan-tree-dump-not "_gfortran_matmul" "optimized" } }
diff --git a/gcc/testsuite/gfortran.dg/inline_matmul_26.f90 b/gcc/testsuite/gfortran.dg/inline_matmul_26.f90
new file mode 100644
index 000000000000..0876941ad4cd
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/inline_matmul_26.f90
@@ -0,0 +1,36 @@
+! { dg-do run }
+! { dg-options "-ffrontend-optimize -fdump-tree-optimized -Wrealloc-lhs -finline-matmul-limit=1000 -O" }
+! PR 66094: Check functionality for MATMUL(TRANSPOSE(A),B)) for two-dimensional arrays
+program main
+  implicit none
+  integer :: in, im, icnt
+  integer, volatile :: ten
+
+  ten = 10
+  ! cycle through a few test cases...
+  do in = 2,ten 
+     do im = 2,ten
+        do icnt = 2,ten
+           block
+             real, dimension(icnt,in) :: a2
+             real, dimension(icnt,im) :: b2
+             real, dimension(in,im) :: c2,cr
+             integer :: i,j,k
+             call random_number(a2)
+             call random_number(b2)
+             c2 = 0
+             do i=1,size(a2,2)
+                do j=1, size(b2,2)
+                   do k=1, size(a2,1)
+                      c2(i,j) = c2(i,j) + a2(k,i) * b2(k,j)
+                   end do
+                end do
+             end do
+             cr = matmul(transpose(a2), b2)
+             if (any(abs(c2-cr) > 1e-4)) STOP 7
+           end block
+        end do
+     end do
+  end do
+end program main
+! { dg-final { scan-tree-dump-times "_gfortran_matmul" 1 "optimized" } }
