http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60661
Bug ID: 60661
Summary: DO CONCURRENT with MASK: Avoid using a temporary for the mask
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: tkoenig at gcc dot gnu.org

Currently, gfortran generates a temporary as shown below. However, the question is whether the temporary can be avoided by moving the mask expression into the loop. I think that usually works - but not always.

It works when:
a) The variable in the mask does not occur on the LHS of an assignment or as an intent([in]out) argument of a pure subroutine
b) The variable only occurs with the same array index as later in the body of the DO CONCURRENT loop

I am not sure whether something related to FORALL prevents this optimization.

I think the simplest fix would be to transform
  DO CONCURRENT(i=1:n, mask(i))
    ...
to
  DO CONCURRENT(i=1:n)
    IF (.not. mask(i)) CYCLE
in the front-end (FE) optimization.

"7.2.4.2.3 Evaluation of the mask expression
The scalar-mask-expr, if any, is evaluated for each combination of index-name values. If there is no scalar-mask-expr, it is as if it appeared with the value true. The index-name variables may be primaries in the scalar-mask-expr.
The set of active combinations of index-name values is the subset of all possible combinations (7.2.4.2.2) for which the scalar-mask-expr has the value true."

C736 (R752) The scalar-mask-expr shall be scalar and of type logical.
C737 (R752) Any procedure referenced in the scalar-mask-expr, including one referenced by a defined operation, shall be a pure procedure (12.7).

    forall (i=start:end:stride; maskexpr)
      e<i> = f<i>
      g<i> = h<i>
    end forall
   (where e,f,g,h<i> are arbitrary expressions possibly involving i)
   Translates to:
    count = ((end + 1 - start) / stride)
    masktmp(:) = maskexpr(:)

    maskindex = 0;
    for (i = start; i <= end; i += stride)
      {
        if (masktmp[maskindex++])
          e<i> = f<i>
      }
    maskindex = 0;
    for (i = start; i <= end; i += stride)
      {
        if (masktmp[maskindex++])
          g<i> = h<i>
      }
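
To make the concern behind conditions a) and b) concrete, here is a minimal C sketch, not gfortran's actual output. It compares the temporary-based scheme with evaluating the mask inside the loop. The names mask_at, run_with_temp, run_inline and the shift parameter are hypothetical and chosen only for illustration. The mask reads a[i], and the body writes a[(i + shift) % n]. Whether such a body is conforming inside DO CONCURRENT is a separate question; the sketch only shows the data dependence.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mask: index i is active when a[i] is positive. */
static bool mask_at(const int *a, size_t i)
{
    return a[i] > 0;
}

/* Scheme 1: evaluate the whole mask into a temporary first, then run
   the body. This mirrors the masktmp translation quoted above. */
static void run_with_temp(int *a, size_t n, size_t shift)
{
    bool *masktmp = malloc(n * sizeof *masktmp);
    if (masktmp == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        masktmp[i] = mask_at(a, i);
    for (size_t i = 0; i < n; i++) {
        if (!masktmp[i])
            continue;
        a[(i + shift) % n] = -1;  /* body writes a at index i + shift */
    }
    free(masktmp);
}

/* Scheme 2: evaluate the mask inside the loop and skip inactive
   iterations, like the proposed IF (.not. mask(i)) CYCLE form. */
static void run_inline(int *a, size_t n, size_t shift)
{
    for (size_t i = 0; i < n; i++) {
        if (!mask_at(a, i))
            continue;
        a[(i + shift) % n] = -1;
    }
}
```

With shift = 0 the body only touches the element the mask just read (condition b), and both schemes leave the same array. With shift = 1 and a = {1, 1, 1, 1}, the temporary-based scheme sets every element to -1, while the inline scheme yields {1, -1, 1, -1}, because each write disables the mask of the following iteration.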