: juzhe.zh...@rivai.ai; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
On Thu, Jan 11, 2024 at 10:52 AM Robin Dapp wrote:
>
> On 1/11/24 10:46, juzhe.zh...@rivai.ai wrote:
> > Oh. I see I think I have done wrong here.
3%]
[local count: 359464610]:
goto ; [100.00%]
}
Final ASM:
main:
lui a5,%hi(a)
li a4,19
sb a4,%lo(a)(a5)
li a0,0
ret
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 20:56
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject:
> 32872 spends 2 scalar instructions + 1 scalar_to_vec cost:
>
> li a4,-32768
> addiw a4,a4,104
> vmv.v.x v16,a4
>
> It seems reasonable, but it only fixes the test with -march=rv64gcv_zvl256b
> and still fails with -march=rv64gcv_zvl4096b.
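For reference, the arithmetic behind the two scalar instructions quoted above can be checked with a small C sketch. It only models the values involved: `li_addiw` and `low16` are hypothetical helper names (not GCC or RISC-V APIs), and it assumes the broadcast uses 16-bit elements, as the e16 `vsetvli` in the original patch description suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `li a4,base ; addiw a4,a4,imm`:
   addiw adds in 32 bits and sign-extends the result to 64 bits.
   The narrowing cast is only well-defined when the sum fits in
   int32_t, which holds for the values used here.  */
int64_t li_addiw(int64_t base, int32_t imm)
{
    int64_t sum = base + imm;
    return (int64_t)(int32_t)sum;
}

/* Low 16 bits of the scalar register, i.e. the element value an
   e16 vmv.v.x would broadcast (assumption: SEW = 16).  */
uint16_t low16(int64_t x)
{
    return (uint16_t)(uint64_t)x;
}

/* -32768 + 104 = -32664, whose low 16 bits are 0x8068 = 32872.  */
void check_32872(void)
{
    int64_t a4 = li_addiw(-32768, 104);
    assert(a4 == -32664);
    assert(low16(a4) == 32872);
}
```

Calling `check_32872()` from a test harness confirms that the li/addiw pair materializes the 16-bit constant 32872, which is why the quoted count is 2 scalar instructions plus 1 scalar_to_vec.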
The scalar version also needs both instructions:
li a0,32
e later pass failed
to CSE it...
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 19:15
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
> I think we sho
: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
> I think we shouldn't vectorize it with any vlen, since the non-vectorized
> codegen is much better.
> And also, I have tested -msve-vector-bits=2048, A
pp
Date: 2024-01-11 19:15
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
> I think we shouldn't vectorize it with any vlen, since the non-vectorized
> codege
h can work well.
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 19:15
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
> I think we shouldn't vectorize it with any vlen, since the non-vectorized
> codegen is much better.
> And also, I have tested -msve-vector-bits=2048, ARM SVE doesn't vectorize it.
> -zvl65536b, RVV Clang also doesn't vectorize it.
Of course I agree that optimizing everything to return 0 is
what
e it.
-zvl65536b, RVV Clang also doesn't vectorize it.
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 18:40
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1
On 1/11/24 11:20, juzhe.zh...@rivai.ai wrote:
> Ok I see your idea and we need to adjust scalar_to_vec accurately. Inside the
> loop we have these 2 scalar_to_vec:
>
> 1. MIN_EXPR 1 times scalar_to_vec costs 1 in prologue
>
> This scalar_to_vec cost should be 0 or 1 since it only generate si
LP since there all invariants are represented by SLP nodes
which we can hand down.
> >
> > juzhe.zh...@rivai.ai
> >
> >
> > From: Robin Dapp
> > Date: 2024-01-11 18:14
> > To: juzhe.zh...@rivai.ai; Richard Biener
> > CC: rd
ut we don't.
>
> juzhe.zh...@rivai.ai
>
>
> From: Robin Dapp
> Date: 2024-01-11 18:14
> To: juzhe.zh...@rivai.ai; Richard Biener
> CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
> Subject: Re: [PATCH] RISC-V: Increase scala
n Dapp
Date: 2024-01-11 18:14
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
> Yeah... I just noticed. I should set it as 4 to fix it with biggest VLEN
> size,
> that is, -march=rv64gcv_zvl4096b --param=riscv-autovec-lmul=m8...
>
> I am confused now how to fix this case.
4 is definitely too high compared to a regular instruction.
vmv.vx could even be zero-cost for const
cheng;
jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
>> The slidedown/vmv.x.s part is of course vec_extract but we indeed
>> don't seem to cost it as vec_to_scalar here.
>
> It looks like a vectorized live operation as it's not in
To: Richard Biener
CC: rdapp.gcc; juzhe.zh...@rivai.ai; gcc-patches; kito.cheng; Kito.cheng;
jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
>> The slidedown/vmv.x.s part is of course vec_extract but we indeed
>> don't seem to cost it as vec_to_scalar here.
>
> It looks like a vectorized live operation as it's not in the loop body
> (and thus really irrelevant for costing in practice). This has
>
> /* ??? Enable for loop costi
On Thu, Jan 11, 2024 at 10:52 AM Robin Dapp wrote:
>
> On 1/11/24 10:46, juzhe.zh...@rivai.ai wrote:
> > Oh. I see I think I have done wrong here.
> >
> > I should adjust cost for VEC_EXTRACT not VEC_SET.
> >
> > But it's odd, I didn't see loop vectorizer is scanning scalar_to_vec
> > cost in vect
ivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
On 1/11/24 10:46, juzhe.zh...@rivai.ai wrote:
> Oh. I see I think I have done wrong here.
>
> I should adjust cost for VEC_EXTRACT not VEC_SET.
>
> But it's odd: I didn't see the loop vectorizer scanning the scalar_to_vec
> cost in vect.dump.
The slidedown/vmv.x.s part is of course vec_extract but we
e: 2024-01-11 17:18
To: Juzhe-Zhong
CC: gcc-patches; kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
On Thu, Jan 11, 2024 at 9:24 AM Juzhe-Zhong wrote:
>
> This patch fixes the following inefficient vectorized co
; jeffreyalaw; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
On Thu, Jan 11, 2024 at 9:24 AM Juzhe-Zhong wrote:
>
> This patch fixes the following inefficient vectorized code:
>
> vsetvli a5,zero,e8,mf2,ta,ma
> li a2,17
> vid.v v1
> li a4,-32768
> vsetvli zero,zero,e16,m1,ta,ma
> addiw a4,a4,104