https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
--- Comment #11 from Avinash Jayakar <avinashd at linux dot ibm.com> ---
(In reply to Tamar Christina from comment #9)
> (In reply to Avinash Jayakar from comment #8)
> > (In reply to Tamar Christina from comment #7)
> > > (In reply to Avinash Jayakar from comment #6)
> > > No, const_vf will be 0 when vector length agnostic code is being used.
> > > i.e. a polynomial VF where it's a runtime constant but not a compile
> > > time one.
> > >
> > > And because it's a runtime constant it's the only case we can't
> > > statically set a range.
> > >
> > > In GCC AArch64 and RISC-V are 2-coeff poly targets.
> > >
> > > The previous code wasn't wrong, it just didn't support setting ranges
> > > for LOOP_VINFO_USING_PARTIAL_VECTORS_P loops.  But you can set a range
> > > for a non-poly LOOP_VINFO_USING_PARTIAL_VECTORS_P loop, because as soon
> > > as your VF is known and your number of iterations are known, you can
> > > tell what the minimum and maximum number of times you iterate over your
> > > vector latch are.
> > >
> > > i.e. should your VF be 4 and your max 14, you know your range is min=1,
> > > max=4, where the final iteration processes 2 values using some kind of
> > > mask.
> > >
> > > So the question here is whether the range in GIMPLE is incorrect (as in
> > > has the wrong value).
> > >
> > > It would be good if you can isolate a testcase (of 1 function) to look
> > > at.
> >
> > Ok sure, yeah I was looking at code where the loop bounds are not known at
> > compile time. I have set up a test case in godbolt
> >
> > https://godbolt.org/z/16nxPTTsx
> >
> > Here as you can see in the gimple generated of the trunk, there is one
> > extra range generated compared to the gcc 15.1.0 version.
> > Global Exported: bnd.31_180 = [irange] unsigned int [1, 1073741824]
> > with the trunk build. I am a little new to this, so please pardon my
> > ignorance, I am not sure why does the function gets invoked 2 times.
>
> Ah, I see.  Thanks for the clear example!  The function gets called twice
> because there are two loops, the main and epilogue.  The calculations are
> correct for the main loop, but for the epilogue one in this example they're
> not.  Which I think is what you were trying to say with
>
> > we assign its value to the vector length.
> >
> > I believe that the previous behaviour was correct, where you do not need
> > to assign the range info to the epilogue. Adding the range info has
> > different lowering path within the rtl for powerpc, which I have not yet
> > investigated yet.
> >
>
> I think the code before worked because a non-partial epilogue would have
> niters_vector be a const (e.g. a gimple value) but the partial iteration
> loop it's a runtime constant.
>
> > So my main doubt here is const_vf, is supposed to be 0 for the epilogue
> > block right, just like log_vf was null for the epilogue. If so, this is a
> > simple fix, by assigning a temporary_const_vf, and assigning the actual
> > value inside the latter mentioned if block. Please do let me know, in this
> > case I can create a patch.
>
> I think we do want a range though for an epilogue if we can, since we know
> an epilogue iterates at most once, due to how we set the bounds for the loop
> itself.
>
> For the specific case here it can be fixed by
>
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index 2d01a4b0ed1..c81aff76efa 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -2857,7 +2857,7 @@ vect_gen_vector_loop_niters (loop_vec_info loop_vinfo, tree niters,
>       we set range information to make niters analyzer's life easier.
>       Note the number of latch iteration value can be TYPE_MAX_VALUE so
>       we have to represent the vector niter TYPE_MAX_VALUE + 1 / vf.  */
> -  if (stmts != NULL && const_vf > 0)
> +  if (stmts != NULL && const_vf > 0 && !LOOP_VINFO_EPILOGUE_P (loop_vinfo))
>      {
>        if (niters_no_overflow
>           && LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
>
> but I think we can provide a range here.  The range should be <1,1>.  I'll
> need to double check.
>
> So I'll take this one if you don't mind.

Ok, for this particular case this fix would work. But as you said, the range
should be <1,1> for the epilogue. I added that particular logic here:

+      if (niters_no_overflow && !LOOP_VINFO_EPILOGUE_P (loop_vinfo))
        {
          int_range<1> vr (type,
                           wi::one (TYPE_PRECISION (type)),
@@ -2882,6 +2874,14 @@ vect_gen_vector_loop_niters (loop_vec_info loop_vinfo, tree niters,
                           TYPE_SIGN (type)));
          set_range_info (niters_vector, vr);
        }
+      else if (niters_no_overflow && LOOP_VINFO_EPILOGUE_P (loop_vinfo)) {
+
+        int_range<1> vr (type,
+                         wi::one (TYPE_PRECISION (type)),
+                         wi::one (TYPE_PRECISION (type))
+                         );
+        set_range_info (niters_vector, vr);
+      }

Please let me know if this is what you had in mind.

However, the test cases still fail with this change, since it generates the
same load/store instructions as the current trunk version. I think the
generated assembly is correct, but I have to investigate further why adding
range information for the epilogue produces a different load/store pattern
(the number of lxvl and lxvx instructions differs, which is what fails the
test cases).
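
For reference, the range arithmetic discussed above (VF 4 and max 14 giving
min=1, max=4, with 2 values left for the last masked iteration) can be
sketched outside of GCC as below. This is only a standalone illustration, not
the vectorizer code: IterRange, partial_vector_iter_range and tail_elements
are hypothetical names, it does not use the irange/wide_int API, and the
<1,1> epilogue range is the assumption from this thread that still needs to
be double checked.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a [min, max] range; not GCC's int_range.
struct IterRange {
  std::uint64_t min;
  std::uint64_t max;
};

// Range of vector-loop iterations for a loop using partial vectors, given a
// compile-time constant VF and an upper bound on the scalar iteration count.
IterRange partial_vector_iter_range(std::uint64_t vf,
                                    std::uint64_t max_niters,
                                    bool is_epilogue) {
  assert(vf > 0 && max_niters > 0);
  if (is_epilogue) {
    // Assumption taken from the discussion: the epilogue iterates once.
    return {1, 1};
  }
  // ceil(max_niters / vf), written so the addition cannot overflow.
  std::uint64_t max_iters = max_niters / vf + (max_niters % vf != 0 ? 1 : 0);
  return {1, max_iters};
}

// Number of scalar elements handled by the final (possibly masked) iteration.
std::uint64_t tail_elements(std::uint64_t vf, std::uint64_t niters) {
  assert(vf > 0 && niters > 0);
  std::uint64_t rem = niters % vf;
  return rem == 0 ? vf : rem;
}
```

With vf = 4 and max_niters = 14 this gives the range [1, 4] and a tail of 2
elements, as in comment #7. With vf = 4 and max_niters = 2^32
(TYPE_MAX_VALUE + 1 for a 32-bit unsigned type) the upper bound is
1073741824, which would be consistent with the exported range in the dump if
the VF there is 4 (the VF is not stated in this comment, so that is a guess).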