Paul-Antoine Arras wrote:
As permitted by OpenMP 6, accept an 'omp nothing' directive as intervening code.
Well, as you have in the error message:
- error_at (input_location,
- "intervening code must not contain OpenMP directives");
+ error_at (
+ input_location,
+ "intervening code must not contain executable OpenMP directives");
it is not only 'nothing' but more that is permitted. For
completeness, the OpenMP 6's wording is:
"Additionally, intervening code must not contain **executable**
directives or calls to the OpenMP runtime API in its corresponding
region."
whereas 5.1/5.2 had "... does not contain OpenMP directives ..."
In the spec current draft, OpenMP has the following non-executable
directives:
(A) declarative
allocate
begin declare_target
begin declare_variant
declare_induction
declare_mapper
declare_reduction
declare_simd
declare_target
declare_variant
groupprivate
threadprivate
(B) informational
assume
assumes
begin assumes
requires
(C) meta
begin metadirective
metadirective
(D) subsidiary
scan
section
task_iteration
(E) utility
error
nothing
Some aren't implemented, yet (like groupprivate or declare induction),
others are not permitted in interving code for other reasons (like
declare target), and some are a bit tricky like 'error' which is
'untility' but becomes 'executable' with 'at(execution)'.¹
[¹ I think all 'error' directives that make it into tree/gfc_code
are executable as the others are emitted while parsing.]
Still, it seems as if at least 'assume' makes sense and should be
supported:
#pragma omp for collapse(2)
for (int i = 1; i < n; i++)
{
#pragma omp assume holds (x > 1)
z = fabs(x - i);
for (int k = 0; k < m; k++)
...
}
NOTE: GCC already classifies the directives. In Fortran, there
is the gfc_omp_directives array in openmp.cc and for C/C++,
in c-common.h/c-omp.cc there is the c_omp_directives.
Technically, 'declare {reduction,induction}' would probably
be fine inside a scope - but why would you do this as
intervening code in a loop?
Likewise
int y; / #pragma omp allocate(y) allocator(my_allocator)
is valid but does not seem to make sense in a hot loop.
(BTW: The latter by construction has to use a call to an
allocate function; maybe not need this for a predefined
allocator, but for a user-defined one always.)
* * *
I think the wording in the changelog needs to be adapted to
talk about non-executable directives and not single out
'nothing'.
For the support: While a bunch is permitted, I guess it would
be sufficient to permit additionally:
- 'error at(compilation)' - presumably works already, should be tested
- 'assume'
* * *
Also handle the special case where a metadirective can be resolved at parse time
to 'omp nothing'.
This fixes a build issue that affects 10 out 12 SPECaccel benchmarks.
Namely, by resolving this early, conflicting results like 'collapse(2)'
with an 'omp simd' before the inner loop can be prevented if it is known
that the condition can never be true.
* * *
The early evaluation works using:
+ /* If only one selector matches and it evaluates to 'omp nothing', no need to
+ * proceed. */
+ if (ctxs.length () == 1)
This is based on the evaluation of 'omp_context_selector_matches'
with complete_p = false while parsing. I that happens, this
case is ignored (not pushed to ctxs).
The omp_context_selector_matches returns 'no match' for:
- ignored selectors
- if implementation != "gnu"
- ...
* * *
Moving to Fortran:
--- a/gcc/fortran/parse.cc
+++ b/gcc/fortran/parse.cc
@@ -6518,7 +6518,16 @@ parse_omp_metadirective_body (gfc_statement omp_st)
locus body_locus = gfc_current_locus;
bool saw_error = false;
- accept_statement (omp_st);
+ /* If there is only one variant and it evaluates to 'omp nothing', no need to
+ proceed. */
+ if (variant->selectors == NULL && variant->stmt == ST_NONE
+ && variant->next == NULL)
+ {
+ reject_statement ();
+ return next_statement ();
+ }
+ else
+ accept_statement (omp_st);
In Fortran, parsing and resolution are usually two
separate items as many things are only known after
having parsed a whole subprogram (e.g. module) as
especially procedure definitions might come later.
Likewise, one needs to know the whole specification
part before one knows everything about location
declarations (variables, types, ...).
Thus, the Fortran way is: parse everything, once
finished, resolve it.
In any case, the following is mishandles as I get
Error: not enough DO loops for collapsed !$OMP TARGET TEAMS LOOP (level
1) at (1)
for the program
INTEGER, INTENT(IN) :: x_min, x_max, y_min, y_max
integer, parameter :: one = 1
...
!$omp target teams loop collapse(2))
DO k=y_min,y_max
!$omp metadirective &
when(user={condition((one < 0 ? x_min > 0: one /= 1))}: simd)
DO j=x_min,x_max
I would claim that the condition statically resolves to .false.
* * *
Actually, the parser vs. resolution thing is not as bleak. If
we talk about constant expressions, they can be resolved earlier
and, indeed, there is a call to 'gfc_resolve_expr' during parsing.
The question is why is it not active for the case above?
And, unrelated, this resolve_expr call actually rejects
valid code (function call to a later function; unrelated to
this patch as we talk here about compile-time resolvable
cases) => https://gcc.gnu.org/PR122306
* * *
I guess for now we can live with that broken part as it
was broken before - still: why is the conditional expression
not evaluated is the question.
* * *
@@ -7089,6 +7089,16 @@ match_omp_metadirective (bool begin_p)
return MATCH_ERROR;
}
+ /* Don't record the variant if the selector is a false user condition. */
+ if (selectors != NULL && selectors->code == OMP_TRAIT_SET_USER
+ && selectors->trait_selectors->code == OMP_TRAIT_USER_CONDITION
+ && selectors->trait_selectors->properties->property_kind
+ == OMP_TRAIT_PROPERTY_BOOL_EXPR
+ && selectors->trait_selectors->properties->expr->expr_type
+ == EXPR_CONSTANT
+ && selectors->trait_selectors->properties->expr->value.logical == 0)
+ continue;
This seems to be extremely tailored for the testcase. If one uses:
implementation={vendor(gnu)}
it would fail immediately as well.
The question is how to salvage this.
Currently, there is in gfc_trans_omp_metadirective:
tree ctx = gfc_trans_omp_set_selector (variant->selectors,
variant->where);
ctx = omp_check_context_selector (gfc_get_location (&variant->where),
ctx, OMP_CTX_METADIRECTIVE);
[…] /* If the selector doesn't match, drop the whole variant. */
if (!omp_context_selector_matches (ctx, NULL_TREE, false))
This is fine - but comes much later then the case we want to solve:
resolving loop collapse.
My feeling is that we want to move both some checking and the tree
conversion of *only* the selector (+ omp_context_selector_matches)
to the resolving stage - such that we can then skip some parts.
(This implies adding a 'tree selector_tree' besides 'selector' to
'gfc_omp_variant, but that's fine.)
Admittedly, that's a bit more complex - and we probably want to
stay first with the current version.
Can you add a note to https://gcc.gnu.org/PR122306 about this
case? - It kind of feels the same issue:
- your check + the existing is-boolean check comes too early
- the can be removed check comes too late
And it seems as if properly fixing will involve touching the
same code - and should be done together.
Hence: Please add a note to that PR once this patch has
landed.
* * *
In summary:
- Make clear in the changelog that this is about non-executable
directives not (only) about 'nothing' (even if only a subset
is handled).
- We should probably add a comment before the check that this
only handles a sensible subset of non-executable directives,
possibly cross-referencing to the (not used) classification
in the array.
- 'assume' should be permitted in intervening code.
(and 'error at(compilation)', but I think that's not a
compiler change).
- For 'assume' and 'error at(compilation)' (warning or error,
does not matter), having a testcase would be good.
- Can you check why gfc_match_omp_context_selector's call to
if (!gfc_resolve_expr (otp->expr)
|| (property_kind == OMP_TRAIT_PROPERTY_BOOL_EXPR
&& otp->expr->ts.type != BT_LOGICAL)
failed to simply the conditional expression?
- Once the patch landed, can you add a note to PR122306
about the need to process the context selector earlier
(to be done in lock step with handling moving the
resolution later).
Thanks for the patch - and sorry for nitpicking.
Tobias