On Wed, 05 Mar 2025 12:17:24 +0100, "Robin Dapp" wrote:
Hi Jin,
I apologize for the delayed response. I spent quite a bit of time trying to
reproduce the case, and given the passage of time, it wasn't easy to refine
the testing. Fortunately, you can see the results here:
https://godbolt.org/z/Mc8veW7oT
Using GCC version 14.2.0 should allow you to
On Fri, 28 Feb 2025 12:48:36 +0100, "Robin Dapp" wrote:
What we could do is
  prev.set_ratio (calculate_ratio (prev.get_sew (), prev.get_vlmul ()));
  prev.set_vlmul (calculate_vlmul (prev.get_sew (), prev.get_ratio ()));
No, that also doesn't work because the ratio can be invalid then.
We fuse two vsetvls. One of them has a larger SEW which w
Okay, let me explain the background of my previous patch.
Prior to applying my patch, for the test case bug-10.c (a reduced example of
a larger program with incorrect runtime results),
the vsetvli sequence compiled with --param=vsetvl-strategy=simple was as
follows:
1. vsetvli zero,a4,e16,m4,ta
On Fri, 28 Feb 2025 06:47:24 +0100, "Robin Dapp" wrote:
It seems the issue is that we didn't set "vlmul"?
Can we do this:
  int max_sew = MAX (prev.get_sew (), next.get_sew ());
  prev.set_sew (max_sew);
  prev.set_vlmul (calculate_vlmul (...));
  prev.set_ratio (calculate_ratio (prev.get_sew (), prev.get_vlmul ()));
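For reference, the arithmetic behind the SEW/LMUL/ratio discussion can be
sketched standalone as below. This is only an illustration, not the code in
GCC: the names Lmul, sew_lmul_ratio, config_valid and lmul_for_ratio are made
up here, ELEN = 64 is an assumption, and the validity rule (for fractional
LMUL, SEW must not exceed LMUL * ELEN) is my reading of the RVV spec.

```cpp
#include <cassert>

// LMUL as a fraction num/den: m4 is {4, 1}, mf4 is {1, 4}.
// Hypothetical type for illustration only.
struct Lmul { int num; int den; };

// Assumed maximum element width of the target.
constexpr int ELEN = 64;

// The SEW/LMUL ratio; VLMAX is VLEN divided by this value.
constexpr int sew_lmul_ratio (int sew, Lmul l)
{
  return sew * l.den / l.num;
}

// A configuration is accepted here if 8 <= SEW <= ELEN and
// SEW <= LMUL * ELEN (the second test only bites for fractional LMUL).
constexpr bool config_valid (int sew, Lmul l)
{
  return sew >= 8 && sew <= ELEN && sew * l.den <= l.num * ELEN;
}

// Pick the LMUL that gives the requested ratio for a given SEW.
// Returns {0, 0} when no LMUL in mf8..m8 fits.
constexpr Lmul lmul_for_ratio (int sew, int ratio)
{
  Lmul l = sew >= ratio ? Lmul{sew / ratio, 1} : Lmul{1, ratio / sew};
  return (l.num > 8 || l.den > 8) ? Lmul{0, 0} : l;
}

// e32,m4 has ratio 8, while e32,m8 and e8,m2 share ratio 4.
static_assert (sew_lmul_ratio (32, Lmul{4, 1}) == 8, "e32,m4");
static_assert (sew_lmul_ratio (32, Lmul{8, 1}) == 4, "e32,m8");
static_assert (sew_lmul_ratio (8, Lmul{2, 1}) == 4, "e8,m2");

// e64,mf4 would need ratio 256 and fails the validity test.
static_assert (sew_lmul_ratio (64, Lmul{1, 4}) == 256, "e64,mf4");
static_assert (!config_valid (64, Lmul{1, 4}), "e64,mf4 is invalid");
```

Under these assumptions, e32,m8 and e8,m2 have the same ratio, so VLMAX does
not change between them, which is the condition for the "vsetvli zero,zero"
form. Raising only the SEW while keeping a fractional LMUL is what can produce
something like e64,mf4.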
This patch modifies the sequence:
vsetvli zero,a4,e32,m4,ta,ma + vsetvli zero,a4,e8,m2,ta,ma
to:
vsetvli zero,a4,e32,m8,ta,ma + vsetvli zero,zero,e8,m2,ta,ma
Functionally, there is no difference. However, this change resolves the
issue with "e64,mf4", and allows the second vsetvli to omit a4, wh
On Thu, 27 Feb 2025 16:00:08 +0100, "Robin Dapp" wrote:
> Hi,
>
> when merging two vsetvls that both only demand "SEW >= ..." we
> use their maximum SEW and keep the LMUL. That may lead to invalid
> vector configurations like
> e64, mf4.
> As we make sure that the SEW requirements overlap we ca