Martin Sebor <mse...@gmail.com> writes:
> On 11/08/2017 09:51 AM, Richard Sandiford wrote:
>> Martin Sebor <mse...@gmail.com> writes:
>>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>>>> Martin Sebor <mse...@gmail.com> writes:
>>>>> I haven't done nearly a thorough review but the dtor followed by
>>>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>>>> I thought I'd ask sooner rather than later.  Given the macro
>>>>> definition:
>>>>>
>>>>> +   The dummy comparison against a null C * is just a way of checking
>>>>> +   that C gives the right type.  */
>>>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>>>> +  ((void) (&(RES).coeffs[0] == (C *) 0), \
>>>>> +   wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>>>> +   ? (void) ((RES).coeffs[I] = VALUE) \
>>>>> +   : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>>>
>>>>> is the following use well-defined?
>>>>>
>>>>> +template<unsigned int N, typename C>
>>>>> +inline poly_int_pod<N, C>&
>>>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>>>> +{
>>>>> +  POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>>>
>>>>> It looks to me as though the VALUE argument in the ctor invoked
>>>>> by the placement new expression is evaluated after the dtor has
>>>>> destroyed the very array element the VALUE argument expands to.
>>>>
>>>> Good catch!  It should simply have been doing <<= on each coefficient --
>>>> I must have got carried away when converting to POLY_SET_COEFF.
>>>>
>>>> I double-checked the other uses and think that's the only one.
>>>>
>>>>> Whether or not it is, in fact, a problem, it seems to me that using
>>>>> a function template rather than a macro would be a clearer and
>>>>> safer way to do the same thing.  (Safer in that the macro also
>>>>> evaluates its arguments multiple times, which is often a source
>>>>> of subtle bugs.)
>>>>
>>>> That would slow down -O0 builds though, by introducing an extra
>>>> function call and set of temporaries even when the coefficients
>>>> are primitive integers.
>>>
>>> Would decorating the function template with attribute always_inline
>>> help?
>>
>> It would remove the call itself, but we'd still have the extra temporary
>> objects that were the function argument and return value.
>
> Sorry, I do not want to get into another long discussion about
> trade-offs between safety and efficiency but I'm not sure I see
> what extra temporaries it would create.  It seems to me that
> an inline function template that took arguments of user-defined
> types by reference and others by value should be just as efficient
> as a macro.
>
> From GCC's own manual:
>
>    6.43 An Inline Function is As Fast As a Macro
>    https://gcc.gnu.org/onlinedocs/gcc/Inline.html

You can see the difference with something like:

  inline
  void __attribute__((always_inline))
  f(int &dst, const int &src) { dst = src; }

  int g1(const int &y) { int x; f(x, y); return x; }
  int g2(const int &y) { int x; x = y; return x; }

where *.optimized from GCC 7.1 at -O0 is:

int g1(const int&) (const int & y)
{
  int & dst;
  const int & src;
  int x;
  int D.2285;
  int _3;
  int _6;

  <bb 2> [0.00%]:
  src_5 = y_2(D);
  _6 = *src_5;
  x = _6;
  _3 = x;
  x ={v} {CLOBBER};

<L1> [0.00%]:
  return _3;

}

vs:

int g2(const int&) (const int & y)
{
  int x;
  int D.2288;
  int _4;

  <bb 2> [0.00%]:
  x_3 = *y_2(D);
  _4 = x_3;

<L0> [0.00%]:
  return _4;

}

> If that's not the case and there is a significant performance
> penalty associated with inline functions at -O0 then GCC should
> be fixed to avoid it.

I think those docs are really talking about inline functions being as
fast as macros when optimisation is enabled.  I don't think we make
any guarantees about -O0 code quality.

Thanks,
Richard
