Okay, let me explain the background of my previous patch.
Prior to applying my patch, for the test case bug-10.c (a reduced example of
a larger program with incorrect runtime results),
the vsetvli sequence compiled with --param=vsetvl-strategy=simple was as
follows:
1. vsetvli zero,a4,e16,m4,ta,ma + vsetvli zero,a4,e32,m8,ta,ma + vsetvli
zero,a4,e8,m2,ta,ma
The vsetvli sequence compiled with --param=vsetvl-strategy=optim was as
follows:
2. vsetvli zero,a4,e32,m4,ta,ma + vsetvli zero,zero,e8,m2,ta,ma
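Although vl remains unchanged, the SEW/LMUL ratio in sequence 2 changes
from 8 (e32,m4) to 4 (e8,m2). The vsetvli zero,zero form is only valid when
that ratio, and hence VLMAX, stays the same, so this leads to undefined
behavior.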
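To make the ratio argument concrete, here is a minimal C sketch that computes
SEW/LMUL for the vtype settings in sequences 1 and 2 above. The struct and
function names are made up for illustration and are not GCC internals. Storing
LMUL in eighths is an assumption of this sketch, chosen so that fractional
LMUL values stay integral.

```c
#include <assert.h>
#include <stdbool.h>

/* LMUL is stored in eighths so fractional settings stay integral:
   mf8 = 1, mf4 = 2, mf2 = 4, m1 = 8, m2 = 16, m4 = 32, m8 = 64.
   (Hypothetical representation, for illustration only.)  */
struct vtype_cfg
{
  unsigned sew;      /* Element width in bits: 8, 16, 32 or 64.  */
  unsigned lmul_x8;  /* LMUL multiplied by 8.  */
};

/* SEW/LMUL ratio.  VLMAX = VLEN / ratio, so equal ratios mean equal VLMAX.  */
static unsigned
sew_lmul_ratio (struct vtype_cfg c)
{
  return c.sew * 8 / c.lmul_x8;
}

/* True if switching from PREV to NEXT keeps the ratio, which is the
   precondition for using the "vsetvli zero,zero" form.  */
static bool
can_use_zero_zero (struct vtype_cfg prev, struct vtype_cfg next)
{
  return sew_lmul_ratio (prev) == sew_lmul_ratio (next);
}

/* Check the vtype settings from the two sequences quoted above.  */
void
check_quoted_sequences (void)
{
  const struct vtype_cfg e16m4 = { 16, 32 };
  const struct vtype_cfg e32m8 = { 32, 64 };
  const struct vtype_cfg e8m2 = { 8, 16 };
  const struct vtype_cfg e32m4 = { 32, 32 };

  /* Sequence 1 (simple): every setting has ratio 4.  */
  assert (sew_lmul_ratio (e16m4) == 4);
  assert (sew_lmul_ratio (e32m8) == 4);
  assert (sew_lmul_ratio (e8m2) == 4);

  /* Sequence 2 (optim): ratio 8 followed by ratio 4, so the
     zero,zero form is not allowed for the second vsetvli.  */
  assert (sew_lmul_ratio (e32m4) == 8);
  assert (!can_use_zero_zero (e32m4, e8m2));
}
```

Calling check_quoted_sequences () from a small main passes silently: all three
settings in sequence 1 share ratio 4, while sequence 2 goes from ratio 8 to
ratio 4.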
The only difference I see with your patch vs without is
< vsetvli zero,zero,e8,m2,ta,ma
---
> vsetvli zero,a3,e8,m2,ta,ma
and we ensure the former doesn't occur in the test.
But that difference doesn't matter because the ratio is the same before and
after. That's why I'm asking. As is, bug-10.c doesn't test anything reasonable
IMHO. Right, the ratio (or rather the associated LMUL) was wrong, but the
current test doesn't verify that it is now correct. Can you share the
non-reduced (or less reduced) case?
/* { dg-do compile { target { rv64 } } } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3" } */

#include <riscv_vector.h>

_Float16 a (uint64_t);
int8_t b () {
  int c = 100;
  double *d;
  _Float16 *e;
  for (size_t f;; c -= f)
    {
      f = c;
      __riscv_vsll_vx_u8mf8 (__riscv_vid_v_u8mf8 (f), 2, f);
      vfloat16mf4_t g;
      a (1);
      g = __riscv_vfmv_s_f_f16mf4 (2, f);
      vfloat64m1_t i = __riscv_vfmv_s_f_f64m1 (30491, f);
      vuint16mf4_t j;
      __riscv_vsoxei16_v_f16mf4 (e, j, g, f);
      vuint8mf8_t k = __riscv_vsll_vx_u8mf8 (__riscv_vid_v_u8mf8 (f), 3, f);
      __riscv_vsoxei8_v_f64m1 (d, k, i, f);
    }
}

/* { dg-final { scan-assembler-not "e64,mf4" } } */
That works, thanks.
--
Regards
Robin