Hi, Richard.
I tried hard in the RISC-V backend. I found that the case with
-march=rv64gcv_zvl4096b cannot be fixed without counting the vec_to_scalar cost.
Is there an approach by which we can count the vec_to_scalar cost without this
piece of code in the middle-end?
/* ??? Enable for loop costing as well. */
i
I think we can let it vectorize as long as we purely use VLS modes.
The following patch looks reasonable:
diff --git a/gcc/config/riscv/riscv-vector-costs.cc
b/gcc/config/riscv/riscv-vector-costs.cc
index 58ec0b9b503..4e351bc066c 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/confi
I see that after this accurate cost adjustment it is still vectorized, but with
a different vect dump:
[local count: 118111602]:
# a.4_25 = PHI <1(2), _4(11)>
# ivtmp_30 = PHI <18(2), ivtmp_20(11)>
# vect_vec_iv_.12_149 = PHI <{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
15, 16 }(2), _150(11)>
OK. But with this scalar_to_vec set to 2:
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index df9799d9c5e..a14fb36817a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -366,7 +366,7 @@ static const common_vector_cost rvv_vls_vector_cost = {
1, /* ga
Hi, Robin.
I model scalar value initialization accurately with the following patch:
+/* Adjust vectorization cost after calling
+ targetm.vectorize.builtin_vectorization_cost. For some statement, we would
+ like to further fine-grain tweak the cost on top of
+ targetm.vectorize.builtin_vectoriz
Oh, sorry. I made a mistake. With -mrvv-vector-bits=2048, RVV Clang doesn't
vectorize it.
But ARM SVE GCC doesn't fix the issue: if we specify -msve-vector-bits=2048, it
will vectorize it.
https://godbolt.org/z/x7s8Kz87a
I guess LLVM has some magic in their cost model which can work well.
juz
>> (My question whether why we shouldn't vectorize this at 256b
>> and above still stands, though)
I think we shouldn't vectorize it with any vlen, since the non-vectorized
codegen is much better.
And also, I have tested -msve-vector-bits=2048, ARM SVE doesn't vectorize it.
-zvl65536b, RVV Clang a
On Thu, Jan 11, 2024 at 11:18 AM Richard Biener
wrote:
>
> On Thu, Jan 11, 2024 at 11:20 AM juzhe.zh...@rivai.ai
> wrote:
> >
> > Ok I see your idea and we need to adjust scalar_to_vec accurately. Inside
> > the loop we have these 2 scalar_to_vec:
> >
> > 1. MIN_EXPR 1 times scalar_to_vec costs
On Thu, Jan 11, 2024 at 11:20 AM juzhe.zh...@rivai.ai
wrote:
>
> Ok I see your idea and we need to adjust scalar_to_vec accurately. Inside the
> loop we have these 2 scalar_to_vec:
>
> 1. MIN_EXPR 1 times scalar_to_vec costs 1 in prologue
>
>This scalar_to_vec cost should be 0 or 1 since it
Ok I see your idea and we need to adjust scalar_to_vec accurately. Inside the
loop we have these 2 scalar_to_vec:
1. MIN_EXPR 1 times scalar_to_vec costs 1 in prologue
This scalar_to_vec cost should be 0 or 1 since it only generates a single
instruction: vmv.v.i v16,15
2. 32872 >> patt_26 1
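To illustrate why broadcasting the constant 15 is a single instruction: vmv.v.i takes a 5-bit signed immediate (-16 to 15), so no scalar register has to be set up first. A constant outside that range has to be materialized in a scalar register and then broadcast with vmv.v.x, which is at least one more instruction. The sketch below is only a rough estimate under that assumption; the helper names and the returned values are hypothetical and are not what the GCC cost hooks do.

```cpp
#include <cstdint>

// Hypothetical helper, not part of GCC: true if VALUE fits the 5-bit
// signed immediate accepted by vmv.v.i.
constexpr bool fits_simm5(std::int64_t value) {
  return value >= -16 && value <= 15;
}

// Hypothetical estimate of the instructions needed to broadcast a
// constant: 1 for vmv.v.i, otherwise a lower bound of 2 (one scalar
// load-immediate plus vmv.v.x). Large constants may need more than one
// scalar instruction, so 2 is only a minimum.
constexpr int broadcast_insn_estimate(std::int64_t value) {
  return fits_simm5(value) ? 1 : 2;
}
```

Under this model, broadcast_insn_estimate(15) is 1, while a constant such as 32872 falls in the second bucket.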
And I have also investigated the LLVM cost model. They don't cost vsetvli in
their vectorization cost model.
But their cost model does a good job...
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 18:09
To: Richard Biener
CC: rdapp.gcc; juzhe.zh...@rivai.ai; gcc-patches; kito.cheng; Kito.cheng;
j
>> That said, we also don't really cost all our vsetvls yet (difficult...).
If we cost vsetvl, we will need to add a cost of 1 for each STMT.
However, that is not accurate, since our VSETVL pass will eliminate redundant ones...
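A toy model of the over-counting, as a sketch only: assume each vector statement needs one (SEW, LMUL) configuration and that, in straight-line code, a vsetvl is needed only when the configuration changes. The struct and function names are hypothetical, and the real VSETVL pass works over the whole CFG and is far more elaborate than this.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified vector configuration demanded by a statement.
struct VConfig {
  int sew;   // element width in bits
  int lmul;  // register group multiplier
  bool operator==(const VConfig &other) const {
    return sew == other.sew && lmul == other.lmul;
  }
};

// Naive costing: one vsetvl per vector statement.
std::size_t naive_vsetvl_count(const std::vector<VConfig> &stmts) {
  return stmts.size();
}

// Straight-line approximation of redundancy elimination: a vsetvl is
// counted only when the configuration differs from the previous one.
std::size_t deduped_vsetvl_count(const std::vector<VConfig> &stmts) {
  std::size_t count = 0;
  const VConfig *prev = nullptr;
  for (const VConfig &cfg : stmts) {
    if (prev == nullptr || !(*prev == cfg))
      ++count;
    prev = &cfg;
  }
  return count;
}
```

For a sequence of five statements whose configurations are (32,1), (32,1), (64,2), (64,2), (32,1), the naive count is 5 while the deduplicated count is 3, which is the kind of gap that makes a flat per-statement cost inaccurate.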
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 18:09
To: Richard Biener
>> With a cost of "3" we still vectorize for zvl512b and larger.
>>Is that intended? I don't really see why 512 should vectorized
>>but 256 not. Disregarding that everything should be optimized
>>away, 2 iterations for the whole loop with 256 bits doesn't
>>seem that bad.
Yeah... I just noticed.
Oh, I see. I think I did this wrong.
I should adjust the cost for VEC_EXTRACT, not VEC_SET.
But it's odd: I don't see the loop vectorizer scanning the scalar_to_vec
cost in vect.dump.
The vect tree:
# a.4_25 = PHI <1(2), _4(11)>
# ivtmp_30 = PHI <18(2), ivtmp_20(11)>
# vect_vec_iv_.10_137 = P
Thanks Richard.
So you think increasing the scalar_to_vec cost is not the correct approach to
fix this case?
Or could you give me a suggestion for fixing it?
Thanks.
juzhe.zh...@rivai.ai
From: Richard Biener
Date: 2024-01-11 17:18
To: Juzhe-Zhong
CC: gcc-patches; kito.cheng; kito.cheng; jeff