On Wed, 05 Mar 2025 12:17:24 +0100, "Robin Dapp" wrote:
> Hi Jin,
>
> > I apologize for the delayed response. I spent quite a bit of time trying to
> > reproduce
> > the case, and given the passage of time, it wasn't easy to refine the
> > testing.
> > Fortunately, you can see the results here.
Hi Jin,
I apologize for the delayed response. I spent quite a bit of time trying to
reproduce the case, and given the passage of time, it wasn't easy to refine
the testing.
Fortunately, you can see the results here.
https://godbolt.org/z/Mc8veW7oT
Using GCC version 14.2.0 should allow you to
On Fri, 28 Feb 2025 12:48:36 +0100, "Robin Dapp" wrote:
> > Okay, let me explain the background of my previous patch.
> >
> > Prior to applying my patch, for the test case bug-10.c (a reduced example
> > of
> > a larger program with incorrect runtime results),
> > the vsetvli sequence compiled wi
What we could do is
prev.set_ratio (calculate_ratio (prev.get_sew (), prev.get_vlmul ()));
prev.set_vlmul (calculate_vlmul (prev.get_sew (), prev.get_ratio ()));
No, that also doesn't work because the ratio can be invalid then.
We fuse two vsetvls. One of them has a larger SEW which w
Okay, let me explain the background of my previous patch.
Prior to applying my patch, for the test case bug-10.c (a reduced example of
a larger program with incorrect runtime results),
the vsetvli sequence compiled with --param=vsetvl-strategy=simple was as
follows:
1. vsetvli zero,a4,e16,m4,ta
On Fri, 28 Feb 2025 06:47:24 +0100, "Robin Dapp" wrote:
> > This patch modifies the sequence:
> > vsetvli zero,a4,e32,m4,ta,ma + vsetvli zero,a4,e8,m2,ta,ma
> > to:
> > vsetvli zero,a4,e32,m8,ta,ma + vsetvli zero,zero,e8,m2,ta,ma
> > Functionally, there is no difference. However, this change resol
It seems the issue is we didn't set "vlmul" ?
Can we do that:
int max_sew = MAX (prev.get_sew (), next.get_sew ());
prev.set_sew (max_sew);
prev.set_vlmul (calculate_vlmul (...));
prev.set_ratio (calculate_ratio (prev.get_sew (), prev.get_vlmul ()));
What we could do is
prev.set_ratio (cal
This patch modifies the sequence:
vsetvli zero,a4,e32,m4,ta,ma + vsetvli zero,a4,e8,m2,ta,ma
to:
vsetvli zero,a4,e32,m8,ta,ma + vsetvli zero,zero,e8,m2,ta,ma
Functionally, there is no difference. However, this change resolves the
issue with "e64,mf4", and allows the second vsetvli to omit a4, wh
On Thu, 27 Feb 2025 16:00:08 +0100, "Robin Dapp" wrote:
> Hi,
>
> when merging two vsetvls that both only demand "SEW >= ..." we
> use their maximum SEW and keep the LMUL. That may lead to invalid
> vector configurations like
> e64, mf4.
> As we make sure that the SEW requirements overlap we ca
Hi,
when merging two vsetvls that both only demand "SEW >= ..." we
use their maximum SEW and keep the LMUL. That may lead to invalid
vector configurations like
e64, mf4.
As we make sure that the SEW requirements overlap we can use the SEW
and LMUL of the configuration with the larger SEW.
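
To make the e64, mf4 problem concrete, here is a minimal standalone sketch.
It is not the GCC implementation: VConfig, ratio, is_valid and both merge
helpers are hypothetical names invented for illustration, LMUL is stored in
eighths so that fractional values stay integral, and ELEN = 64 is assumed.
The validity rule used (SEW <= LMUL * ELEN) is my reading of the RVV
specification and should be checked against it.

```cpp
#include <algorithm>
#include <cassert>

// Assumption: the target supports elements up to 64 bits (ELEN = 64).
constexpr int ELEN = 64;

// Hypothetical stand-in for a vsetvl configuration.
// lmul_x8 is LMUL * 8: mf8 = 1, mf4 = 2, mf2 = 4, m1 = 8, m2 = 16, ...
struct VConfig {
  int sew;
  int lmul_x8;
};

// SEW/LMUL ratio, computed on the scaled LMUL.
constexpr int ratio(VConfig c) { return c.sew * 8 / c.lmul_x8; }

// Assumed validity rule: SEW <= ELEN and SEW <= LMUL * ELEN.
constexpr bool is_valid(VConfig c) {
  return c.sew <= ELEN && c.sew * 8 <= c.lmul_x8 * ELEN;
}

// Old behaviour as described above: maximum SEW, LMUL of prev kept.
constexpr VConfig merge_keep_lmul(VConfig prev, VConfig next) {
  return VConfig{std::max(prev.sew, next.sew), prev.lmul_x8};
}

// Proposed behaviour: take SEW and LMUL from the config with the larger SEW.
constexpr VConfig merge_take_larger_sew(VConfig prev, VConfig next) {
  return next.sew > prev.sew ? next : prev;
}

constexpr VConfig kPrev{8, 2};   // e8, mf4
constexpr VConfig kNext{64, 8};  // e64, m1

// Both inputs are valid on their own.
static_assert(is_valid(kPrev) && is_valid(kNext), "inputs must be valid");
// Keeping prev's LMUL yields e64, mf4, which fails the validity rule.
static_assert(!is_valid(merge_keep_lmul(kPrev, kNext)), "e64, mf4 is invalid");
// Taking both fields from the larger-SEW config yields e64, m1.
static_assert(is_valid(merge_take_larger_sew(kPrev, kNext)), "e64, m1 is valid");
```

The sketch only models the SEW/LMUL pair; the real fusion code also has to
keep the ratio and the other demands of both vsetvls consistent.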
Ma J