Hi Richard, Thanks for the review.
> On 8 Oct 2024, at 7:15 pm, Richard Biener <richard.guent...@gmail.com> wrote: > > External email: Use caution opening links or attachments > > > On Mon, Aug 5, 2024 at 7:05 AM Kugan Vivekanandarajah > <kvivekana...@nvidia.com> wrote: >> >> >> >>> On 15 Jul 2024, at 5:18 pm, Jakub Jelinek <ja...@redhat.com> wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> On Mon, Jul 15, 2024 at 12:39:22AM +0000, Kugan Vivekanandarajah wrote: >>>> OMP safelen handling is assigning backend provided max as an int even when >>>> the pragma didn’t provide one. As a result, vectoriser is rejecting SVE >>>> modes while comparing poly_int with the safelen. >>>> >>>> That is, for the attached test case, omp_max_vf gets [16, 16] from the >>>> backend. This then becomes 16 as omp safelen is an integer. When >>>> vectoriser compares the potential vector mode with maybe_lt (max_vf, >>>> min_vf)) , this would fail resulting in any SVE vector mode being >>>> selected. >>>> >>>> One suggestion there was to set safelen to INT_MAX when OMP pragma does >>>> not provide safely explicitly. >>> >>> This is wrong. The reason why safelen is set to that sctx.max_vf is that if >>> there are any "omp simd array" arrays, sctx.max_vf is their size. >>> The code you're touching has a comment that explains it even: >>> /* If max_vf is non-zero, then we can use only a vectorization factor >>> up to the max_vf we chose. So stick it into the safelen clause. */ >>> if (maybe_ne (sctx.max_vf, 0U)) >>> >>> If sctx.max_vf is 0, there were no "omp simd array" arrays emitted and so >>> OMP_CLAUSE_SAFELEN isn't set. >>> The vectorizer can only shrink the arrays, not grow them and that is why >>> there is this limit. >>> >>> Now, I think even SVE has a limit, which is not a scalar constant but >>> poly_int, so I think in that case you need to arrange for the size of the >>> arrays to be POLY_INT_CST as well and use that as a limit. >>> Now, the clause argument itself at least in the OpenMP standard needs to be >>> an >>> integer constant (if provided), because the proposals to extend it for the >>> SVE-like arches (aarch64, RISC-V) have not been voted in I believe. >>> So, there is a question what to do if user specifies safelen (32) or >>> something similar. >>> But if the user didn't specify it (so it is effectively infinitity), then >>> yes, it might be ok to set it to some POLY_INT_CST representing the sizes of >>> the arrays and tweak the loop safelen so that it can represent those. >> >> Thanks for the explanation. Does that mean: >> 1. We change loop->safelen to poly_int >> 2. Modify the apply_safelen to account for the poly_int. >> >> I am attaching an RFC patch for your reference. > > @@ -412,10 +413,15 @@ vect_analyze_data_ref_dependence (struct > data_dependence_relation *ddr, > executed concurrently, assume independence. */ > auto apply_safelen = [&]() > { > - if (loop->safelen >= 2) > + if (maybe_ge (loop->safelen, 2UL)) > { > - if ((unsigned int) loop->safelen < *max_vf) > - *max_vf = loop->safelen; > + unsigned int safelen; > + if (loop->safelen.is_constant ()) > + safelen = loop->safelen.coeffs[0]; > + else > + safelen = INT_MAX; > + if (safelen < *max_vf) > + *max_vf = safelen; > > isn't that effectively the same as doing > > if (known_lt (safelen, *max_vf)) > *max_vf = <something>; > > ? Specifically at this point we have to take a _lower_ bound on 'safelen', so > I think assuming INT_MAX here is just bogus, or at least turning loop->safelen > into a POLY_INT just to represent a "maximum" value in a strange way is > broken given that exact chosen POLY_INT is _not_ the actual safelen (else > your change above is wrong?). Doesn’t POLY_INT with two coefficients means that we can (in theory) have large enough value with this? That is why I thought max_vf in that case can reach INT_MAX. > > Can't we just explicitly document that loop->safelen == 0 means there's no > guaranteed concurrent evaluation but loop->safelen == -1 means it's safe > to evaluate any number of concurrent iterations and in turn special-case > the OMP SIMD array handling to use a proper array size in that case > (dependent on targets)? This sounds good to me. If you like this approach, I can revise the patch accordingly. Thanks, Kugan > > Richard. > >> Thanks, >> Kugan >> >> >> >> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> >> >>> >>>> PR middle-end/114635 >>>> PR 114635 >>>> >>>> gcc/ChangeLog: >>>> >>>> * omp-low.cc (lower_rec_input_clauses): Set INT_MAX >>>> when safelen is not provided instead of using backend >>>> provided safelen. >>>> >>>> gcc/testsuite/ChangeLog: >>>> >>>> * c-c++-common/pr114635-1.cpp: New test. >>>> * c-c++-common/pr114635-2.cpp: New test. >>>> >>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> >>> >>> Jakub