Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-12 Thread juzhe.zh...@rivai.ai
Hi, Richard. I tried hard in the RISC-V backend. I found that the case with -march=rv64gcv_zvl4096b cannot be fixed without counting the vec_to_scalar cost. Is there a way to count the vec_to_scalar cost without this piece of code in the middle-end? /* ??? Enable for loop costing as well. */ i

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
I think we can let it vectorize as long as we purely use VLS modes. The following patch looks reasonable:
diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index 58ec0b9b503..4e351bc066c 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/confi

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
I see that after this accurate cost adjustment it is still vectorized, but with a different vect dump:
[local count: 118111602]:
# a.4_25 = PHI <1(2), _4(11)>
# ivtmp_30 = PHI <18(2), ivtmp_20(11)>
# vect_vec_iv_.12_149 = PHI <{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }(2), _150(11)>

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
OK, but with scalar_to_vec set to 2:
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index df9799d9c5e..a14fb36817a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -366,7 +366,7 @@ static const common_vector_cost rvv_vls_vector_cost = {
   1, /* ga

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
Hi, Robin. I model scalar value initialization accurately with the following patch:
+/* Adjust vectorization cost after calling
+   targetm.vectorize.builtin_vectorization_cost. For some statement, we would
+   like to further fine-grain tweak the cost on top of
+   targetm.vectorize.builtin_vectoriz

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
Oh, sorry, I made a mistake. With -mrvv-vector-bits=2048, RVV Clang doesn't vectorize it. But ARM SVE GCC does not avoid the issue: if we specify -msve-vector-bits=2048, it will vectorize it. https://godbolt.org/z/x7s8Kz87a I guess LLVM has some magic in its cost model that works well. juz

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
>> (My question whether why we shouldn't vectorize this at 256b
>> and above still stands, though)
I think we shouldn't vectorize it with any vlen, since the non-vectorized codegen is much better. And also, I have tested -msve-vector-bits=2048, ARM SVE doesn't vectorize it. -zvl65536b, RVV Clang a

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Richard Biener
On Thu, Jan 11, 2024 at 11:18 AM Richard Biener wrote:
>
> On Thu, Jan 11, 2024 at 11:20 AM juzhe.zh...@rivai.ai
> wrote:
> >
> > Ok I see your idea and we need to adjust scalar_to_vec accurately. Inside
> > the loop we have these 2 scalar_to_vec:
> >
> > 1. MIN_EXPR 1 times scalar_to_vec costs

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Richard Biener
On Thu, Jan 11, 2024 at 11:20 AM juzhe.zh...@rivai.ai wrote:
>
> Ok I see your idea and we need to adjust scalar_to_vec accurately. Inside the
> loop we have these 2 scalar_to_vec:
>
> 1. MIN_EXPR 1 times scalar_to_vec costs 1 in prologue
>
> This scalar_to_vec cost should be 0 or 1 since it

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
OK, I see your idea: we need to adjust scalar_to_vec accurately. Inside the loop we have these 2 scalar_to_vec:
1. MIN_EXPR 1 times scalar_to_vec costs 1 in prologue
   This scalar_to_vec cost should be 0 or 1 since it only generates a single instruction: vmv.v.i v16,15
2. 32872 >> patt_26 1
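To make the effect of the per-broadcast cost concrete, here is a minimal sketch of a profitability check in which scalar_to_vec operations are charged once in the prologue. It is a toy model, not GCC's actual costing code: the `LoopCosts` struct, the function names, and all numbers are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Toy cost model. Every name and field here is hypothetical and
// does not correspond to a GCC data structure.
struct LoopCosts {
  int scalar_iters;         // iterations of the scalar loop
  int scalar_body_cost;     // cost of one scalar iteration
  int vector_iters;         // iterations of the vectorized loop
  int vector_body_cost;     // cost of one vector iteration
  int scalar_to_vec_count;  // broadcasts charged in the prologue
  int scalar_to_vec_cost;   // cost per broadcast (the knob being tuned)
};

// Total cost of running the loop without vectorization.
inline int scalar_total(const LoopCosts &c) {
  assert(c.scalar_iters >= 0 && c.scalar_body_cost >= 0);
  return c.scalar_iters * c.scalar_body_cost;
}

// Total cost of the vectorized loop: prologue broadcasts plus the body.
inline int vector_total(const LoopCosts &c) {
  assert(c.vector_iters >= 0 && c.vector_body_cost >= 0);
  const int prologue = c.scalar_to_vec_count * c.scalar_to_vec_cost;
  return prologue + c.vector_iters * c.vector_body_cost;
}

// Vectorize only when the vector version is strictly cheaper.
inline bool profitable_to_vectorize(const LoopCosts &c) {
  return vector_total(c) < scalar_total(c);
}
```

With the illustrative values `{18, 1, 1, 15, 2, 1}` the vector version costs 17 against 18 for the scalar loop, so it is chosen; raising `scalar_to_vec_cost` to 3 lifts the vector total to 21 and flips the decision. In a model like this a small change to one constant can decide a borderline case.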

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
Also, I have investigated the LLVM cost model. They don't cost vsetvli in the vectorization cost model. But their cost model does a good job...
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 18:09
To: Richard Biener
CC: rdapp.gcc; juzhe.zh...@rivai.ai; gcc-patches; kito.cheng; Kito.cheng; j

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
>> That said, we also don't really cost all our vsetvls yet (difficult...).
If we cost vsetvl, we would need to add 1 more for each STMT. However, that is not accurate, since our VSETVL pass will eliminate redundancy...
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 18:09
To: Richard Biener
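The gap between a per-statement vsetvl charge and what survives redundancy elimination can be sketched as follows. This is a deliberately simplified illustration: it only drops a vsetvl when the immediately preceding statement already used the same configuration, and it says nothing about how GCC's VSETVL pass is actually implemented. `VConfig` and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vector configuration of one statement
// (element width in bits and register grouping factor).
struct VConfig {
  int sew;
  int lmul;
};

inline bool same_config(const VConfig &a, const VConfig &b) {
  return a.sew == b.sew && a.lmul == b.lmul;
}

// Naive costing: charge one vsetvl per vector statement.
inline int naive_vsetvl_count(const std::vector<VConfig> &stmts) {
  return static_cast<int>(stmts.size());
}

// Simplified elimination: a vsetvl is needed only when the configuration
// differs from that of the previous statement in straight-line code.
inline int vsetvl_count_after_elimination(const std::vector<VConfig> &stmts) {
  int count = 0;
  for (std::size_t i = 0; i < stmts.size(); ++i) {
    assert(stmts[i].sew > 0 && stmts[i].lmul > 0);
    if (i == 0 || !same_config(stmts[i], stmts[i - 1]))
      ++count;
  }
  return count;
}
```

For five statements where the first two use one configuration and the last three another, the naive count is 5 while only 2 remain after elimination, which is why a flat "+1 per STMT" would overestimate.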

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
>> With a cost of "3" we still vectorize for zvl512b and larger.
>> Is that intended? I don't really see why 512 should be vectorized
>> but 256 not. Disregarding that everything should be optimized
>> away, 2 iterations for the whole loop with 256 bits doesn't
>> seem that bad.
Yeah... I just noticed.
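The relation between vector length and the number of vector iterations mentioned in the quote can be sketched with simple arithmetic. The trip count and element width used in the usage note are illustrative assumptions, not values confirmed by the thread, and the helper names are hypothetical.

```cpp
#include <cassert>

// Number of elements that fit in one vector register.
inline int lanes_per_vector(int vector_bits, int element_bits) {
  assert(vector_bits > 0 && element_bits > 0);
  assert(vector_bits % element_bits == 0);
  return vector_bits / element_bits;
}

// Vector iterations needed to cover trip_count scalar iterations
// (rounded up, so a partial final iteration counts as one).
inline int vector_iterations(int trip_count, int lanes) {
  assert(trip_count >= 0 && lanes > 0);
  return (trip_count + lanes - 1) / lanes;
}
```

Assuming, for illustration only, a trip count of 18 and 16-bit elements, a 256-bit vector holds 16 lanes and needs 2 iterations, while a 512-bit vector holds 32 lanes and needs 1. A cost threshold that falls between those two totals would explain vectorizing at 512 bits but not at 256.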

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
Oh, I see. I think I did this wrong: I should adjust the cost for VEC_EXTRACT, not VEC_SET. But it's odd, I don't see the loop vectorizer scanning the scalar_to_vec cost in vect.dump. The vect tree:
# a.4_25 = PHI <1(2), _4(11)>
# ivtmp_30 = PHI <18(2), ivtmp_20(11)>
# vect_vec_iv_.10_137 = P

Re: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread juzhe.zh...@rivai.ai
Thanks, Richard. So you think increasing the scalar_to_vec cost is not the correct approach to fix this case? Could you suggest a way to fix it? Thanks.
juzhe.zh...@rivai.ai
From: Richard Biener
Date: 2024-01-11 17:18
To: Juzhe-Zhong
CC: gcc-patches; kito.cheng; kito.cheng; jeff