https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805

--- Comment #11 from Avinash Jayakar <avinashd at linux dot ibm.com> ---
(In reply to Tamar Christina from comment #9)
> (In reply to Avinash Jayakar from comment #8)
> > (In reply to Tamar Christina from comment #7)
> > > (In reply to Avinash Jayakar from comment #6)
> > > No, const_vf will be 0 when vector length agnostic code is being used.
> > > i.e. a polynomial VF where it's a runtime constant but not a compile time
> > > one.
> > > 
> > > And because it's a runtime constant it's the only case we can't statically
> > > set a range.
> > > 
> > > In GCC AArch64 and RISC-V are 2-coeff poly targets.
> > > 
> > > The previous code wasn't wrong, it just didn't support setting ranges for
> > > LOOP_VINFO_USING_PARTIAL_VECTORS_P loops.  But you can set a range for a
> > > non-poly LOOP_VINFO_USING_PARTIAL_VECTORS_P loop, because as soon as your
> > > VF is known and your number of iterations are known, you can tell what the
> > > minimum and maximum number of times you iterate over your vector latch are.
> > > 
> > > i.e. should your VF be 4 and your max 14, you know your range is min=1,
> > > max=4, where the final iteration processes 2 values using some kind of
> > > mask.
> > > 
> > > So the question here is whether the range in GIMPLE is incorrect (as in
> > > has the wrong value).
> > > 
> > > It would be good if you can isolate a testcase (of 1 function) to look at.
> > 
> > Ok sure, yeah I was looking at code where the loop bounds are not known at
> > compile time. I have set up a test case in godbolt
> > 
> > https://godbolt.org/z/16nxPTTsx
> > 
> > Here as you can see in the gimple generated of the trunk, there is one extra
> > range generated compared to the gcc 15.1.0 version.
> > Global Exported: bnd.31_180 = [irange] unsigned int [1, 1073741824]
> > with the trunk build. I am a little new to this, so please pardon my
> > ignorance, I am not sure why the function gets invoked 2 times.
> 
> Ah, I see. Thanks for the clear example! The function gets called twice
> because there are two loops, the main and epilogue.  The calculations are
> correct for the main loop, but for the epilogue one in this example they're
> not.  Which I think is what you were trying to say with
> 
> > we assign its value to the vector length. 
> > 
> > I believe that the previous behaviour was correct, where you do not need to
> > assign the range info to the epilogue. Adding the range info takes a
> > different lowering path within the RTL for powerpc, which I have not
> > investigated yet.
> > 
> 
> I think the code before worked because a non-partial epilogue would have
> niters_vector be a const (e.g. a gimple value) but for the partial iteration
> loop it's a runtime constant.
> 
> > So my main doubt here is whether const_vf is supposed to be 0 for the
> > epilogue block, just like log_vf was null for the epilogue. If so, this is a
> > simple fix: assign a temporary_const_vf, and assign the actual value inside
> > the latter mentioned if block. Please do let me know, in this case I can
> > create a patch.
> 
> I think we do want a range though for an epilogue if we can, since we know
> an epilogue iterates at most once, due to how we set the bounds for the loop
> itself.
> 
> For the specific case here it can be fixed by
> 
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index 2d01a4b0ed1..c81aff76efa 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -2857,7 +2857,7 @@ vect_gen_vector_loop_niters (loop_vec_info loop_vinfo, tree niters,
>          we set range information to make niters analyzer's life easier.
>          Note the number of latch iteration value can be TYPE_MAX_VALUE so
>          we have to represent the vector niter TYPE_MAX_VALUE + 1 / vf.  */
> -      if (stmts != NULL && const_vf > 0)
> +      if (stmts != NULL && const_vf > 0 && !LOOP_VINFO_EPILOGUE_P (loop_vinfo))
>         {
>           if (niters_no_overflow
>               && LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
> 
> but I think we can provide a range here. The range should be <1,1>.  I'll
> need to double check.
> 
> So I'll take this one if you don't mind.

OK, this fix works for this particular case. But as you said, the range
should be <1,1> for the epilogue. I added that logic here:
+         if (niters_no_overflow && !LOOP_VINFO_EPILOGUE_P (loop_vinfo))
            {
              int_range<1> vr (type,
                               wi::one (TYPE_PRECISION (type)),
@@ -2882,6 +2874,14 @@ vect_gen_vector_loop_niters (loop_vec_info loop_vinfo, tree niters,
                                           TYPE_SIGN (type)));
              set_range_info (niters_vector, vr);
            }
+         else if (niters_no_overflow && LOOP_VINFO_EPILOGUE_P (loop_vinfo))
+           {
+             int_range<1> vr (type,
+                              wi::one (TYPE_PRECISION (type)),
+                              wi::one (TYPE_PRECISION (type)));
+             set_range_info (niters_vector, vr);
+           }
Please let me know if this is what you had in mind.
However, the test cases still fail with this change, since it generates the
same load/store instructions as the current trunk.
I think the generated assembly is correct, but I need to investigate further
why adding range information for the epilogue produces a different load/store
pattern (the counts of lxvl and lxvx differ, which is what fails the test
cases).
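
For reference, here is a small standalone sketch of the range arithmetic
discussed in this thread. It is only an illustration and not GCC code:
partial_vector_iters, iter_range, vector_iter_range and check_examples are
hypothetical names, and it assumes a compile-time constant VF and a loop
using partial vectors, so the last vector iteration may be masked. The
epilogue case additionally assumes that between 1 and VF scalar iterations
remain when the epilogue is entered.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not a GCC function: number of vector iterations
// needed to cover NITERS scalar iterations when the final iteration may
// be partial (masked).  Assumes VF > 0.
constexpr std::uint64_t
partial_vector_iters (std::uint64_t niters, std::uint64_t vf)
{
  return niters / vf + (niters % vf != 0 ? 1 : 0);
}

// Hypothetical pair holding the [min, max] vector iteration count.
struct iter_range
{
  std::uint64_t min;
  std::uint64_t max;
};

// Range of vector iterations when the scalar iteration count lies in
// [1, MAX_NITERS] and VF is a compile-time constant.
constexpr iter_range
vector_iter_range (std::uint64_t max_niters, std::uint64_t vf)
{
  return { partial_vector_iters (1, vf),
	   partial_vector_iters (max_niters, vf) };
}

// Checks the two cases from the thread.
inline void
check_examples ()
{
  // VF = 4, at most 14 scalar iterations: range is min=1, max=4, and the
  // final vector iteration handles the remaining 14 % 4 = 2 values.
  assert (vector_iter_range (14, 4).min == 1);
  assert (vector_iter_range (14, 4).max == 4);
  assert (14 % 4 == 2);

  // Epilogue (assumption: at most VF scalar iterations remain): a single
  // masked iteration, i.e. the range <1,1>.
  assert (vector_iter_range (4, 4).min == 1);
  assert (vector_iter_range (4, 4).max == 1);
}
```

Calling check_examples () exercises both the VF=4, max=14 case and the
<1,1> epilogue case. It says nothing about why the lxvl/lxvx counts change.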
