Re: [PATCH 02/14] Add D frontend (GDC) implementation.

Richard Sandiford Sat, 20 Oct 2018 02:03:55 -0700

Iain Buclaw <ibuc...@gdcproject.org> writes:
> On 14 October 2018 at 17:29, Richard Sandiford
> <rdsandif...@googlemail.com> wrote:
>> [Sorry if this turns out to do be a dup]
>>
>> Iain Buclaw <ibuc...@gdcproject.org> writes:
>>> +/* Build nodes that are used by the D front-end.
>>> +   These are distinct from C types.  */
>>> +
>>> +static void
>>> +d_build_d_type_nodes (void)
>>> +{
>>> +  /* Integral types.  */
>>> +  byte_type_node = make_signed_type (8);
>>> +  ubyte_type_node = make_unsigned_type (8);
>>> +
>>> +  short_type_node = make_signed_type (16);
>>> +  ushort_type_node = make_unsigned_type (16);
>>> +
>>> +  int_type_node = make_signed_type (32);
>>> +  uint_type_node = make_unsigned_type (32);
>>> +
>>> +  long_type_node = make_signed_type (64);
>>> +  ulong_type_node = make_unsigned_type (64);
>>
>> It's a bit confusing for the D type to be long_type_node but the C/ABI type
>> to be long_integer_type_node.  The D type is surely an integer too. :-)
>> With this coming among the handling of built-in functions, it initially
>> looked related, and I was wondering how it was safe on ILP32 systems
>> before realising the difference.
>>
>> Maybe prefixing them all with "d_" would be too ugly, but it would at
>> least be good to clarify the comment to say that these are "distinct
>> type nodes" (rather than just distinct definitions, as I'd initially
>> assumed) and that they're not used outside the frontend, or by the C
>> imports.
>>
>
> If prefixing with "d_", perhaps dropping the "_node" would make them
> sufficiently not ugly (d_long_type, d_uint_type, d_byte_type).


Sounds good to me FWIW.

>>> +/* Helper routine for all error routines.  Reports a diagnostic specified 
>>> by
>>> +   KIND at the explicit location LOC, where the message FORMAT has not yet
>>> +   been translated by the gcc diagnostic routines.  */
>>> +
>>> +static void ATTRIBUTE_GCC_DIAG(3,0)
>>> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char 
>>> *format,
>>> +                             va_list ap, diagnostic_t kind, bool verbatim)
>>> +{
>>> +  va_list argp;
>>> +  va_copy (argp, ap);
>>> +
>>> +  if (loc.filename || !verbatim)
>>> +    {
>>> +      rich_location rich_loc (line_table, get_linemap (loc));
>>> +      diagnostic_info diagnostic;
>>> +      char *xformat = expand_format (format);
>>> +
>>> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);
>>
>> How does this work with translation?  xgettext will only see the original
>> format string, not the result of expand_format.  Do you have some scripting
>> to do the same format mangling when collecting the translation strings?
>> Same concern:
>>
>
> These diagnostic routines handle errors coming from the dmd front-end,
> which are not translated - all sources are listed under po/EXCLUDES in
> another patch.

OK.  In that case I think you want to use diagnostic_set_info_translated
instead of diagnostic_set_info, so that we don't try to translate things
that aren't meant to be translated.  Also it would be good to reword the
comment above the function, since "where the message FORMAT has not yet
been translated by the gcc diagnostic routines" made it sound like these
messages were supposed to be translated at some point, which is where the
confusion started. :-)

>>> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
>>> +
>>> +void
>>> +Port::writelongLE (unsigned value, void *buffer)
>>> +{
>>> +    unsigned char *p = (unsigned char*) buffer;
>>> +
>>> +    p[0] = (unsigned) value;
>>> +    p[1] = (unsigned) value >> 8;
>>> +    p[2] = (unsigned) value >> 16;
>>> +    p[3] = (unsigned) value >> 24;
>>> +}
>>> ...
>>> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
>>> +
>>> +void
>>> +Port::writelongBE (unsigned value, void *buffer)
>>> +{
>>> +    unsigned char *p = (unsigned char*) buffer;
>>> +
>>> +    p[0] = (unsigned) value >> 24;
>>> +    p[1] = (unsigned) value >> 16;
>>> +    p[2] = (unsigned) value >> 8;
>>> +    p[3] = (unsigned) value;
>>> +}
>>
>> Overindented bodies.  Missing space before "*" in "(unsigned char*)"
>> in all these functions.
>>
>> Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
>> Is it also used in ways that require the target BITS_PER_UNIT to be 8
>> as well?  That could realistically be a different value (and we've had
>> ports like that in the past).
>>
>
> These read(long|word)(BE|LE) functions should only ever be used when
> reading the BOM of a UTF-16 or UTF-32 file.
>
> I've done a grep, and the write(long|word)(BE|LE) are no longer used
> by the dmd frontend, so there's little point keeping them around.
>
> If there's any utility in libiberty or another location then I'd be
> more than happy to delegate this action to that here.

If it's for UTF-16 and UTF-32 then I think it's fine as-is (if you keep it).
Was just asking in case.

>>> +/* Implements the lang_hooks.get_alias_set routine for language D.
>>> +   Get the alias set corresponding the type or expression T.
>>> +   Return -1 if we don't do anything special.  */
>>> +
>>> +static alias_set_type
>>> +d_get_alias_set (tree t)
>>> +{
>>> +  /* Permit type-punning when accessing a union, provided the access
>>> +     is directly through the union.  */
>>> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
>>> +    {
>>> +      if (TREE_CODE (u) == COMPONENT_REF
>>> +       && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
>>> +     return 0;
>>> +    }
>>> +
>>> +  /* That's all the expressions we handle.  */
>>> +  if (!TYPE_P (t))
>>> +    return get_alias_set (TREE_TYPE (t));
>>> +
>>> +  /* For now in D, assume everything aliases everything else,
>>> +     until we define some solid rules.  */
>>> +  return 0;
>>> +}
>>
>> Any reason for not returning 0 unconditionally?  Won't the queries
>> deferred to get_alias_set either return 0 directly or come back here
>> and get 0?
>>
>> Might be worth adding a comment referencing build_vconvert, which stood
>> out as a source of what in C would be alias violations.
>>
>
> All previous attempts at doing more with this has caused silent
> corruption at runtime, the existence of build_vconvert likely doesn't
> help either.
>
> Although it isn't in the spec, D should be "strict aliasing".  But
> there's a lot of production code that looks like `intvar = *(int
> *)&floatvar;` that is currently expected to just work.
>
> By rule of thumb in D, the C behaviour should be followed if it looks
> like C code.  The only places where we deviate are made to be a
> compiler error.

But my point was more that the function seems to have the effect of
returning 0 all the time, so wouldn't it be better to get rid of the
code before the final "return 0"?  Or is there a case I missed?

>>> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
>>> +
>>> +static void
>>> +clear_intrinsic_flag (tree callexp)
>>> +{
>>> +  tree decl = CALL_EXPR_FN (callexp);
>>> +
>>> +  if (TREE_CODE (decl) == ADDR_EXPR)
>>> +    decl = TREE_OPERAND (decl, 0);
>>> +
>>> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
>>> +
>>> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
>>> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
>>> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
>>> +}
>>
>> It seems wrong that a particular call to a function would change the
>> function's properties for all calls, and I don't think there's any
>> expectation in the midend that this kind of change might happen.
>>
>> Why's it needed?  Whatever the reason, I think it needs to be done in a
>> different way.
>>
>
> There are some D library math functions that are only treated as
> built-in during compile-time only.  When it comes to run-time code
> generation, we want to call the D library functions, not the C library
> (or built-ins).

But e.g. we can treat printf as a built-in and still call the real printf
without doing this.

>>> +static tree
>>> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
>>> +{
>>> +  tree arg = CALL_EXPR_ARG (callexp, 0);
>>> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
>>> +
>>> +  /* arg is an integral type - use double precision.  */
>>> +  if (INTEGRAL_TYPE_P (type))
>>> +    arg = fold_convert (double_type_node, arg);
>>> +
>>> +  /* Which variant of __builtin_sqrt* should we call?  */
>>> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
>>> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
>>> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
>>> +
>>> +  gcc_assert (code != END_BUILTINS);
>>> +  return call_builtin_fn (callexp, code, 1, arg);
>>> +}
>>
>> Converting integers to double precision isn't right for SQRTF and SQRTL.
>> But how do such calls reach here?  It looks like the deco strings in the
>> function entries provide types for the 3 sqrt intrinsics, is that right?
>> Does that translate to a well-typed FUNCTION_DECL, or do the function
>> types use some opaque types instead?  If the intrinsic function itself
>> is well-typed then an incoming CALLEXP with integer arguments would be
>> invalid.
>>
>
> As with pow(), I'll have to check by running this through the
> testsuite to see what fails and why.
>
> From memory, the reason is likely some requirement of CTFE, where this
> call *must* be const folded at compile-time, which may not happen if
> the types are wrong.

The reason for picking this one in particular is that the INTRINSIC_SQRT{,F,L}
FUNCTION_DECLs looked like they would have correct types, so any CALL_EXPR
tree with mismatched types would be invalid in terms of GCC internals.
If that happens, it sounds like there's a bug earlier on.

Thanks,
Richard

Re: [PATCH 02/14] Add D frontend (GDC) implementation.

Reply via email to