On Sun, 23 Sep 2012, Iyer, Balaji V wrote:
> Hello Everyone,
> Attached, please find a patch that will implement Cilk Plus Array
> Notations for the C compiler. Array notations are indented to allow
> programmers to directly express parallelism in their programs. Array
> notations can be used to possibly see a more predictable performance
> improvement, hardware resource-utilization and vectorization. To enable
> the compiler recognize array notation syntax, we have added a flag
> "-fcilkplus." If this flag is not used, none of these changes are
> visible to the compiler user. For more information and examples about
> array notations please see Chapter 4 in the Cilk Plus Specification
> (http://software.intel.com/sites/default/files/m/6/3/1/cilk_plus_language_specification.pdf).
There seem to be a lot of deficiencies in this specification. I list some
here - I don't want answers in email explaining what is intended (unless
it's already explained in the specification and I missed it), I want the
specification revised to describe things in closer relation to C standard
terms, and then you can answer in email pointing to the relevant wording
in the improved specification - and to the testcases in your patch
verifying that the semantics are properly implemented, but without needing
to elaborate beyond pointing to specification text and testcases for each
question.
What syntax productions, in C11 terms, are <array base>, <lower bound>,
<length> and <stride> in the syntax given for section operators? Are
there constraints that <lower bound>, <length> and <stride> must be of
integer type? Are there other constraints on their types and values?
(For example, must anything be constant? If there isn't such a
requirement, what exactly is the requirement that "The right-hand side
expression must have the same rank and size as the array context." - if a
constraint, in what cases is it a constraint? Maybe a constraint that
values must match if integer constant expressions, with non-matching
runtime values, at least one not an integer constant expression, being
undefined behavior at runtime, for example? What anyway is "size" in that
quoted sentence? Unlike "rank" and "shape", it doesn't seem to be a
defined term.)
What do you mean, in standard terms, by "must have a declared size"? How
is this defined in relation to standard C array-to-pointer decay? What
about adjustment of function parameter types? What about array sizes
declared as [*]? What if the <array base> is in parentheses?
Should I take it, from the absence of any restrictions mentioned on this
subject, that the array that is sectioned may have elements of arbitrary
type - so it's valid to operate this way on arrays of pointers, complex
numbers or structures, for example? But that the usual constraints on
assignment apply, so that if you try A[:] = B[:] where A and B are
two-dimensional arrays, this is a constraint violation, because the
elements that would be assigned elementwise are themselves still arrays
and C doesn't allow array assignment?
Is it valid or invalid to have an expression of the form (A[1:2])[1:2]
with parentheses around a partial array section expression, of which a
further section is taken? It's not in the syntax given, but that syntax
doesn't actually show the precise syntax productions added to C11.
What is the type of an array section expression? What is the result of
applying sizeof to such an expression? What about GNU C typeof? Use in
C11 _Generic?
Can such expressions be used in conditional expressions, scalar ? section
: other-section? If you'd defined types, at least this could be deduced
from the existing C rules on conditional expressions that say what the
permitted types are. Can you use them with comma operators, A[:] = (B,
C[:])? It's far from clear from the document as-is to what extent such
expressions have an existence with types and values like normal
expressions, in which case this would of course be permitted, as opposed
to being special-case things for the RHS of assignments.
You say (4.3.2.2) that certain operators are applied with the same
precedence and rules on permitted operands as for scalars. What
promotions apply? If operations are carried out on two signed char
arrays, for example, are the elements considered to be promoted to int,
resulting in an expression whose type is thought of as a section of an
array of int, which may then be converted back to signed char if stored in
a signed char array? (Thus, internal operations take place in int and
what would have been overflow in signed char would not cause undefined
behavior.)
In general, do such conversions apply between different operands, and on
assignment?
I note you do not mention shift operators in the list in 4.3.2.2 of those
permitted - obviously that requires testcases that they are duly rejected.
Are functions such as __sec_reduce_add defined to apply the relevant
operation to an accumulated value and each element of the section in turn,
in some unspecified order? Or may it evaluate, for example, a sum of four
elements as (a + b) + (c + d)? What about if a user function is provided?
I take it each function has constraints corresponding to those on the
relevant arithmetic operation? (For example, __sec_reduce_add couldn't be
called for pointers.) Can the *zero functions be called for pointers
(testing whether they are NULL)? What about the *max* and *min*
functions? What about those functions for floating-point - do they follow
the semantics of fmax / fmin regarding NaNs? Are the results unspecified
if the answer is a floating-point zero and both +0.0 and -0.0 are present?
(Regarding the patch itself, see my previous comments on previous patches
in this series. For example:
* All functions should have comments explaining the semantics of their
arguments and return values.
* Diagnostic function calls should have explicit locations passed.
* Every diagnostic (that's not an ICE) should have a corresponding
testcase in the testsuite for the associated cases of invalid code - I see
no tests using dg-error at all in this patch. Every case where the
specification says something is not permitted should have an associated
check in the compiler, diagnostic and testcase.
* Diagnostic formatting also needs fixing to follow the GNU Coding
Standards.
* The usage
+ error ("__sec_implicit_index parameter must be constant integer "
+ "expression");
+ error ("Bailing out due to previous error");
+ exit (ICE_EXIT_CODE);
should be avoided. Front-end code should never need to call exit
directly. If you have an internal error - something that should not be
possible for any user code, valid or invalid - use internal_error. If an
error can occur for invalid user code, just call error_at and allow the
compiler to continue execution to find further problems in the user's
code, without bailing out.
* There should be nothing x86-specific about the testcases so they
shouldn't need to be conditioned on x86 targets.
* Avoid using "int" as a type in the compiler to count the number of some
entity on the host, use size_t instead, there's no need to add new cases
that will cause trouble when someone wants to build a program with 2^31 of
something, although there are plenty of such cases already.
* Instead of casting the result of xmalloc, use existing macros such as
XNEWVEC.
* Use @option not @code for option named in the manual.
* Instead of declaring non-static functions
+int extract_sec_implicit_index_arg (tree);
+bool is_sec_implicit_index_fn (tree);
+void array_notation_init_builtins (void);
in a source file (array-notation-common.c), either make them static or
declare them in an appropriate header included everywhere needing those
declarations.
There may also be other issues.)
--
Joseph S. Myers
[email protected]