On Mon, Aug 5, 2024 at 7:05 AM Kugan Vivekanandarajah <kvivekana...@nvidia.com> wrote: > > > > > On 15 Jul 2024, at 5:18 pm, Jakub Jelinek <ja...@redhat.com> wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Mon, Jul 15, 2024 at 12:39:22AM +0000, Kugan Vivekanandarajah wrote: > >> OMP safelen handling is assigning backend provided max as an int even when > >> the pragma didn’t provide one. As a result, vectoriser is rejecting SVE > >> modes while comparing poly_int with the safelen. > >> > >> That is, for the attached test case, omp_max_vf gets [16, 16] from the > >> backend. This then becomes 16 as omp safelen is an integer. When > >> vectoriser compares the potential vector mode with maybe_lt (max_vf, > >> min_vf)) , this would fail resulting in any SVE vector mode being > >> selected. > >> > >> One suggestion there was to set safelen to INT_MAX when OMP pragma does > >> not provide safely explicitly. > > > > This is wrong. The reason why safelen is set to that sctx.max_vf is that if > > there are any "omp simd array" arrays, sctx.max_vf is their size. > > The code you're touching has a comment that explains it even: > > /* If max_vf is non-zero, then we can use only a vectorization factor > > up to the max_vf we chose. So stick it into the safelen clause. */ > > if (maybe_ne (sctx.max_vf, 0U)) > > > > If sctx.max_vf is 0, there were no "omp simd array" arrays emitted and so > > OMP_CLAUSE_SAFELEN isn't set. > > The vectorizer can only shrink the arrays, not grow them and that is why > > there is this limit. > > > > Now, I think even SVE has a limit, which is not a scalar constant but > > poly_int, so I think in that case you need to arrange for the size of the > > arrays to be POLY_INT_CST as well and use that as a limit. > > Now, the clause argument itself at least in the OpenMP standard needs to be > > an > > integer constant (if provided), because the proposals to extend it for the > > SVE-like arches (aarch64, RISC-V) have not been voted in I believe. > > So, there is a question what to do if user specifies safelen (32) or > > something similar. > > But if the user didn't specify it (so it is effectively infinitity), then > > yes, it might be ok to set it to some POLY_INT_CST representing the sizes of > > the arrays and tweak the loop safelen so that it can represent those. > > Thanks for the explanation. Does that mean: > 1. We change loop->safelen to poly_int > 2. Modify the apply_safelen to account for the poly_int. > > I am attaching an RFC patch for your reference.
@@ -412,10 +413,15 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr, executed concurrently, assume independence. */ auto apply_safelen = [&]() { - if (loop->safelen >= 2) + if (maybe_ge (loop->safelen, 2UL)) { - if ((unsigned int) loop->safelen < *max_vf) - *max_vf = loop->safelen; + unsigned int safelen; + if (loop->safelen.is_constant ()) + safelen = loop->safelen.coeffs[0]; + else + safelen = INT_MAX; + if (safelen < *max_vf) + *max_vf = safelen; isn't that effectively the same as doing if (known_lt (safelen, *max_vf)) *max_vf = <something>; ? Specifically at this point we have to take a _lower_ bound on 'safelen', so I think assuming INT_MAX here is just bogus, or at least turning loop->safelen into a POLY_INT just to represent a "maximum" value in a strange way is broken given that exact chosen POLY_INT is _not_ the actual safelen (else your change above is wrong?). Can't we just explicitly document that loop->safelen == 0 means there's no guaranteed concurrent evaluation but loop->safelen == -1 means it's safe to evaluate any number of concurrent iterations and in turn special-case the OMP SIMD array handling to use a proper array size in that case (dependent on targets)? Richard. > Thanks, > Kugan > > > > Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> > > > > >> PR middle-end/114635 > >> PR 114635 > >> > >> gcc/ChangeLog: > >> > >> * omp-low.cc (lower_rec_input_clauses): Set INT_MAX > >> when safelen is not provided instead of using backend > >> provided safelen. > >> > >> gcc/testsuite/ChangeLog: > >> > >> * c-c++-common/pr114635-1.cpp: New test. > >> * c-c++-common/pr114635-2.cpp: New test. > >> > >> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> > > > > Jakub > >