On Mon, Aug 5, 2024 at 7:05 AM Kugan Vivekanandarajah
<kvivekana...@nvidia.com> wrote:
>
>
>
> > On 15 Jul 2024, at 5:18 pm, Jakub Jelinek <ja...@redhat.com> wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Mon, Jul 15, 2024 at 12:39:22AM +0000, Kugan Vivekanandarajah wrote:
> >> OMP safelen handling is assigning backend provided max as an int even when 
> >> the pragma didn’t provide one. As a result, vectoriser is rejecting SVE 
> >> modes while comparing poly_int with the safelen.
> >>
> >> That is, for the attached test case,  omp_max_vf gets [16, 16] from the 
> >> backend. This then becomes 16 as omp safelen is an integer. When 
> >> vectoriser compares the potential vector mode with  maybe_lt (max_vf, 
> >> min_vf)) , this would fail resulting in any SVE vector mode being  
> >> selected.
> >>
> >> One suggestion there was to set safelen to INT_MAX when OMP pragma does 
> >> not provide safely explicitly.
> >
> > This is wrong.  The reason why safelen is set to that sctx.max_vf is that if
> > there are any "omp simd array" arrays, sctx.max_vf is their size.
> > The code you're touching has a comment that explains it even:
> >  /* If max_vf is non-zero, then we can use only a vectorization factor
> >     up to the max_vf we chose.  So stick it into the safelen clause.  */
> >  if (maybe_ne (sctx.max_vf, 0U))
> >
> > If sctx.max_vf is 0, there were no "omp simd array" arrays emitted and so
> > OMP_CLAUSE_SAFELEN isn't set.
> > The vectorizer can only shrink the arrays, not grow them and that is why
> > there is this limit.
> >
> > Now, I think even SVE has a limit, which is not a scalar constant but
> > poly_int, so I think in that case you need to arrange for the size of the
> > arrays to be POLY_INT_CST as well and use that as a limit.
> > Now, the clause argument itself at least in the OpenMP standard needs to be 
> > an
> > integer constant (if provided), because the proposals to extend it for the
> > SVE-like arches (aarch64, RISC-V) have not been voted in I believe.
> > So, there is a question what to do if user specifies safelen (32) or
> > something similar.
> > But if the user didn't specify it (so it is effectively infinitity), then
> > yes, it might be ok to set it to some POLY_INT_CST representing the sizes of
> > the arrays and tweak the loop safelen so that it can represent those.
>
> Thanks for the explanation. Does that mean:
> 1. We change loop->safelen to poly_int
> 2. Modify the apply_safelen  to account for the poly_int.
>
> I am attaching an RFC patch for your reference.

@@ -412,10 +413,15 @@ vect_analyze_data_ref_dependence (struct
data_dependence_relation *ddr,
      executed concurrently, assume independence.  */
   auto apply_safelen = [&]()
     {
-      if (loop->safelen >= 2)
+      if (maybe_ge (loop->safelen,  2UL))
        {
-         if ((unsigned int) loop->safelen < *max_vf)
-           *max_vf = loop->safelen;
+         unsigned int safelen;
+         if (loop->safelen.is_constant ())
+           safelen = loop->safelen.coeffs[0];
+         else
+           safelen = INT_MAX;
+         if (safelen < *max_vf)
+           *max_vf = safelen;

isn't that effectively the same as doing

      if (known_lt (safelen, *max_vf))
        *max_vf = <something>;

?  Specifically at this point we have to take a _lower_ bound on 'safelen', so
I think assuming INT_MAX here is just bogus, or at least turning loop->safelen
into a POLY_INT just to represent a "maximum" value in a strange way is
broken given that exact chosen POLY_INT is _not_ the actual safelen (else
your change above is wrong?).

Can't we just explicitly document that loop->safelen == 0 means there's no
guaranteed concurrent evaluation but loop->safelen == -1 means it's safe
to evaluate any number of concurrent iterations and in turn special-case
the OMP SIMD array handling to use a proper array size in that case
(dependent on targets)?

Richard.

> Thanks,
> Kugan
>
>
>
> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>
> >
> >>        PR middle-end/114635
> >>        PR 114635
> >>
> >> gcc/ChangeLog:
> >>
> >>        * omp-low.cc (lower_rec_input_clauses): Set INT_MAX
> >>        when safelen is not provided instead of using backend
> >>        provided safelen.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>        * c-c++-common/pr114635-1.cpp: New test.
> >>        * c-c++-common/pr114635-2.cpp: New test.
> >>
> >> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
> >
> >        Jakub
>
>

Reply via email to