Hi Richard,

Thanks for the review.

> On 8 Oct 2024, at 7:15 pm, Richard Biener <richard.guent...@gmail.com> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> On Mon, Aug 5, 2024 at 7:05 AM Kugan Vivekanandarajah
> <kvivekana...@nvidia.com> wrote:
>> 
>> 
>> 
>>> On 15 Jul 2024, at 5:18 pm, Jakub Jelinek <ja...@redhat.com> wrote:
>>> 
>>> External email: Use caution opening links or attachments
>>> 
>>> 
>>> On Mon, Jul 15, 2024 at 12:39:22AM +0000, Kugan Vivekanandarajah wrote:
>>>> OMP safelen handling is assigning backend provided max as an int even when 
>>>> the pragma didn’t provide one. As a result, vectoriser is rejecting SVE 
>>>> modes while comparing poly_int with the safelen.
>>>> 
>>>> That is, for the attached test case,  omp_max_vf gets [16, 16] from the 
>>>> backend. This then becomes 16 as omp safelen is an integer. When 
>>>> vectoriser compares the potential vector mode with  maybe_lt (max_vf, 
>>>> min_vf)) , this would fail resulting in any SVE vector mode being  
>>>> selected.
>>>> 
>>>> One suggestion there was to set safelen to INT_MAX when OMP pragma does 
>>>> not provide safely explicitly.
>>> 
>>> This is wrong.  The reason why safelen is set to that sctx.max_vf is that if
>>> there are any "omp simd array" arrays, sctx.max_vf is their size.
>>> The code you're touching has a comment that explains it even:
>>> /* If max_vf is non-zero, then we can use only a vectorization factor
>>>    up to the max_vf we chose.  So stick it into the safelen clause.  */
>>> if (maybe_ne (sctx.max_vf, 0U))
>>> 
>>> If sctx.max_vf is 0, there were no "omp simd array" arrays emitted and so
>>> OMP_CLAUSE_SAFELEN isn't set.
>>> The vectorizer can only shrink the arrays, not grow them and that is why
>>> there is this limit.
>>> 
>>> Now, I think even SVE has a limit, which is not a scalar constant but
>>> poly_int, so I think in that case you need to arrange for the size of the
>>> arrays to be POLY_INT_CST as well and use that as a limit.
>>> Now, the clause argument itself at least in the OpenMP standard needs to be 
>>> an
>>> integer constant (if provided), because the proposals to extend it for the
>>> SVE-like arches (aarch64, RISC-V) have not been voted in I believe.
>>> So, there is a question what to do if user specifies safelen (32) or
>>> something similar.
>>> But if the user didn't specify it (so it is effectively infinitity), then
>>> yes, it might be ok to set it to some POLY_INT_CST representing the sizes of
>>> the arrays and tweak the loop safelen so that it can represent those.
>> 
>> Thanks for the explanation. Does that mean:
>> 1. We change loop->safelen to poly_int
>> 2. Modify the apply_safelen  to account for the poly_int.
>> 
>> I am attaching an RFC patch for your reference.
> 
> @@ -412,10 +413,15 @@ vect_analyze_data_ref_dependence (struct
> data_dependence_relation *ddr,
>      executed concurrently, assume independence.  */
>   auto apply_safelen = [&]()
>     {
> -      if (loop->safelen >= 2)
> +      if (maybe_ge (loop->safelen,  2UL))
>        {
> -         if ((unsigned int) loop->safelen < *max_vf)
> -           *max_vf = loop->safelen;
> +         unsigned int safelen;
> +         if (loop->safelen.is_constant ())
> +           safelen = loop->safelen.coeffs[0];
> +         else
> +           safelen = INT_MAX;
> +         if (safelen < *max_vf)
> +           *max_vf = safelen;
> 
> isn't that effectively the same as doing
> 
>      if (known_lt (safelen, *max_vf))
>        *max_vf = <something>;
> 
> ?  Specifically at this point we have to take a _lower_ bound on 'safelen', so
> I think assuming INT_MAX here is just bogus, or at least turning loop->safelen
> into a POLY_INT just to represent a "maximum" value in a strange way is
> broken given that exact chosen POLY_INT is _not_ the actual safelen (else
> your change above is wrong?).

Doesn’t POLY_INT with two coefficients means that we can (in theory) have large 
 enough value with this? That is why I  thought max_vf in that case can reach 
INT_MAX.
> 
> Can't we just explicitly document that loop->safelen == 0 means there's no
> guaranteed concurrent evaluation but loop->safelen == -1 means it's safe
> to evaluate any number of concurrent iterations and in turn special-case
> the OMP SIMD array handling to use a proper array size in that case
> (dependent on targets)?

This sounds good to me. If you like this approach, I can revise the patch 
accordingly.

Thanks,
Kugan

> 
> Richard.
> 
>> Thanks,
>> Kugan
>> 
>> 
>> 
>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>> 
>>> 
>>>>       PR middle-end/114635
>>>>       PR 114635
>>>> 
>>>> gcc/ChangeLog:
>>>> 
>>>>       * omp-low.cc (lower_rec_input_clauses): Set INT_MAX
>>>>       when safelen is not provided instead of using backend
>>>>       provided safelen.
>>>> 
>>>> gcc/testsuite/ChangeLog:
>>>> 
>>>>       * c-c++-common/pr114635-1.cpp: New test.
>>>>       * c-c++-common/pr114635-2.cpp: New test.
>>>> 
>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>>> 
>>>       Jakub


Reply via email to