Okay, let me explain the background of my previous patch.

Prior to applying my patch, for the test case bug-10.c (a reduced example of a larger program with incorrect runtime results), the vsetvli sequence compiled with --param=vsetvl-strategy=simple was as follows:
1. vsetvli zero,a4,e16,m4,ta,ma + vsetvli zero,a4,e32,m8,ta,ma + vsetvli zero,a4,e8,m2,ta,ma

The vsetvli sequence compiled with --param=vsetvl-strategy=optim was as follows:
2. vsetvli zero,a4,e32,m4,ta,ma + vsetvli zero,zero,e8,m2,ta,ma
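Although vl remains unchanged, the SEW/LMUL ratio in sequence 2 changes from 8 (e32,m4) to 4 (e8,m2). The vsetvli zero,zero form is only valid when that ratio, and hence VLMAX, stays the same, so this leads to undefined behavior.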
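
For reference, here is a minimal sketch of the ratio arithmetic behind the two sequences. It is not taken from the patch or from GCC's vsetvl pass; the struct and function names are hypothetical and only illustrate the check. It relies on VLMAX = VLEN / (SEW/LMUL), so two vtype settings have the same VLMAX exactly when their SEW/LMUL ratios match.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type: a vtype setting as written in a vsetvli
   operand list, e.g. "e32,m4".  LMUL is kept as a fraction so that
   mf2/mf4/mf8 stay exact.  */
struct vtype_cfg
{
  unsigned sew;      /* Element width in bits: 8, 16, 32 or 64.  */
  unsigned lmul_num; /* 1, 2, 4, 8 for m1..m8; 1 for fractional LMUL.  */
  unsigned lmul_den; /* 1 for m1..m8; 2, 4, 8 for mf2..mf8.  */
};

/* SEW/LMUL ratio.  For the SEW and LMUL values listed above the
   division is always exact.  */
static unsigned
sew_lmul_ratio (struct vtype_cfg c)
{
  assert (c.lmul_num != 0);
  return c.sew * c.lmul_den / c.lmul_num;
}

/* True if switching from PREV to NEXT keeps the ratio (and therefore
   VLMAX) unchanged, i.e. the "vsetvli zero,zero,..." form is usable.  */
static bool
same_ratio (struct vtype_cfg prev, struct vtype_cfg next)
{
  return sew_lmul_ratio (prev) == sew_lmul_ratio (next);
}
```

With these helpers, e16,m4, e32,m8 and e8,m2 from sequence 1 all give a ratio of 4, whereas e32,m4 in sequence 2 gives 8, so same_ratio is false for the e32,m4 to e8,m2 transition.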

The only difference I see with your patch vs without is

   <       vsetvli zero,zero,e8,m2,ta,ma
   ---
   >       vsetvli zero,a3,e8,m2,ta,ma

and we ensure the former doesn't occur in the test.

But that difference doesn't matter because the ratio is the same before and after. That's why I'm asking. bug-10.c as is doesn't test anything reasonable IMHO. Right, the ratio (or rather the associated LMUL) was wrong but the current test doesn't make sure it isn't. Can you share the non-reduced (or less reduced) case?

/* { dg-do compile { target { rv64 } } } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3" } */

#include <riscv_vector.h>

_Float16 a (uint64_t);
int8_t b () {
  int c = 100;
  double *d;
  _Float16 *e;
  for (size_t f;; c -= f)
    {
      f = c;
      __riscv_vsll_vx_u8mf8 (__riscv_vid_v_u8mf8 (f), 2, f);
      vfloat16mf4_t g;
      a (1);
      g = __riscv_vfmv_s_f_f16mf4 (2, f);
      vfloat64m1_t i = __riscv_vfmv_s_f_f64m1 (30491, f);
      vuint16mf4_t j;
      __riscv_vsoxei16_v_f16mf4 (e, j, g, f);
      vuint8mf8_t k = __riscv_vsll_vx_u8mf8 (__riscv_vid_v_u8mf8 (f), 3, f);
      __riscv_vsoxei8_v_f64m1 (d, k, i, f);
    }
}

/* { dg-final { scan-assembler-not "e64,mf4" } } */

That works, thanks.

--
Regards
Robin
