Re: [PATCH] Cilk Plus merging to trunk (2 of n)

Joseph S. Myers Sun, 23 Sep 2012 13:45:41 -0700

On Sun, 23 Sep 2012, Iyer, Balaji V wrote:

> Hello Everyone,
>     Attached, please find a patch that will implement Cilk Plus Array 
> Notations for the C compiler. Array notations are indented to allow 
> programmers to directly express parallelism in their programs. Array 
> notations can be used to possibly see a more predictable performance 
> improvement, hardware resource-utilization and vectorization. To enable 
> the compiler recognize array notation syntax, we have added a flag 
> "-fcilkplus." If this flag is not used, none of these changes are 
> visible to the compiler user. For more information and examples about 
> array notations please see Chapter 4 in the Cilk Plus Specification 
> (http://software.intel.com/sites/default/files/m/6/3/1/cilk_plus_language_specification.pdf).


There seem to be a lot of deficiencies in this specification.  I list some 
here - I don't want answers in email explaining what is intended (unless 
it's already explained in the specification and I missed it), I want the 
specification revised to describe things in closer relation to C standard 
terms, and then you can answer in email pointing to the relevant wording 
in the improved specification - and to the testcases in your patch 
verifying that the semantics are properly implemented, but without needing 
to elaborate beyond pointing to specification text and testcases for each 
question.

What syntax productions, in C11 terms, are <array base>, <lower bound>, 
<length> and <stride> in the syntax given for section operators?  Are 
there constraints that <lower bound>, <length> and <stride> must be of 
integer type?  Are there other constraints on their types and values?  
(For example, must anything be constant?  If there isn't such a 
requirement, what exactly is the requirement that "The right-hand side 
expression must have the same rank and size as the array context." - if a 
constraint, in what cases is it a constraint?  Maybe a constraint that 
values must match if integer constant expressions, with non-matching 
runtime values, at least one not an integer constant expression, being 
undefined behavior at runtime, for example?  What anyway is "size" in that 
quoted sentence?  Unlike "rank" and "shape", it doesn't seem to be a 
defined term.)

What do you mean, in standard terms, by "must have a declared size"?  How 
is this defined in relation to standard C array-to-pointer decay?  What 
about adjustment of function parameter types?  What about array sizes 
declared as [*]?  What if the <array base> is in parentheses?

Should I take it, from the absence of any restrictions mentioned on this 
subject, that the array that is sectioned may have elements of arbitrary 
type - so it's valid to operate this way on arrays of pointers, complex 
numbers or structures, for example?  But that the usual constraints on 
assignment apply, so that if you try A[:] = B[:] where A and B are 
two-dimensional arrays, this is a constraint violation, because the 
elements that would be assigned elementwise are themselves still arrays 
and C doesn't allow array assignment?

Is it valid or invalid to have an expression of the form (A[1:2])[1:2] 
with parentheses around a partial array section expression, of which a 
further section is taken?  It's not in the syntax given, but that syntax 
doesn't actually show the precise syntax productions added to C11.

What is the type of an array section expression?  What is the result of 
applying sizeof to such an expression?  What about GNU C typeof?  Use in 
C11 _Generic?

Can such expressions be used in conditional expressions, scalar ? section 
: other-section?  If you'd defined types, at least this could be deduced 
from the existing C rules on conditional expressions that say what the 
permitted types are.  Can you use them with comma operators, A[:] = (B, 
C[:])?  It's far from clear from the document as-is to what extent such 
expressions have an existence with types and values like normal 
expressions, in which case this would of course be permitted, as opposed 
to being special-case things for the RHS of assignments.

You say (4.3.2.2) that certain operators are applied with the same 
precedence and rules on permitted operands as for scalars.  What 
promotions apply?  If operations are carried out on two signed char 
arrays, for example, are the elements considered to be promoted to int, 
resulting in an expression whose type is thought of as a section of an 
array of int, which may then be converted back to signed char if stored in 
a signed char array?  (Thus, internal operations take place in int and 
what would have been overflow in signed char would not cause undefined 
behavior.)

In general, do such conversions apply between different operands, and on 
assignment?

I note you do not mention shift operators in the list in 4.3.2.2 of those 
permitted - obviously that requires testcases that they are duly rejected.

Are functions such as __sec_reduce_add defined to apply the relevant 
operation to an accumulated value and each element of the section in turn, 
in some unspecified order?  Or may it evaluate, for example, a sum of four 
elements as (a + b) + (c + d)?  What about if a user function is provided?

I take it each function has constraints corresponding to those on the 
relevant arithmetic operation?  (For example, __sec_reduce_add couldn't be 
called for pointers.)  Can the *zero functions be called for pointers 
(testing whether they are NULL)?  What about the *max* and *min* 
functions?  What about those functions for floating-point - do they follow 
the semantics of fmax / fmin regarding NaNs?  Are the results unspecified 
if the answer is a floating-point zero and both +0.0 and -0.0 are present?

(Regarding the patch itself, see my previous comments on previous patches 
in this series.  For example:

* All functions should have comments explaining the semantics of their 
arguments and return values.

* Diagnostic function calls should have explicit locations passed.

* Every diagnostic (that's not an ICE) should have a corresponding 
testcase in the testsuite for the associated cases of invalid code - I see 
no tests using dg-error at all in this patch.  Every case where the 
specification says something is not permitted should have an associated 
check in the compiler, diagnostic and testcase.

* Diagnostic formatting also needs fixing to follow the GNU Coding 
Standards.

* The usage

+         error ("__sec_implicit_index parameter must be constant integer "
+                "expression");
+         error ("Bailing out due to previous error");
+         exit (ICE_EXIT_CODE);

should be avoided.  Front-end code should never need to call exit 
directly.  If you have an internal error - something that should not be 
possible for any user code, valid or invalid - use internal_error.  If an 
error can occur for invalid user code, just call error_at and allow the 
compiler to continue execution to find further problems in the user's 
code, without bailing out.

* There should be nothing x86-specific about the testcases so they 
shouldn't need to be conditioned on x86 targets.

* Avoid using "int" as a type in the compiler to count the number of some 
entity on the host, use size_t instead, there's no need to add new cases 
that will cause trouble when someone wants to build a program with 2^31 of 
something, although there are plenty of such cases already.

* Instead of casting the result of xmalloc, use existing macros such as 
XNEWVEC.

* Use @option not @code for option named in the manual.

* Instead of declaring non-static functions

+int extract_sec_implicit_index_arg (tree);
+bool is_sec_implicit_index_fn (tree);
+void array_notation_init_builtins (void);

in a source file (array-notation-common.c), either make them static or 
declare them in an appropriate header included everywhere needing those 
declarations.

There may also be other issues.)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Cilk Plus merging to trunk (2 of n)

Reply via email to