Martin Sebor <[email protected]> writes:
> On 11/08/2017 09:51 AM, Richard Sandiford wrote:
>> Martin Sebor <[email protected]> writes:
>>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>>>> Martin Sebor <[email protected]> writes:
>>>>> I haven't done nearly a thorough review but the dtor followed by
>>>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>>>> I thought I'd ask sooner rather than later. Given the macro
>>>>> definition:
>>>>>
>>>>> + The dummy comparison against a null C * is just a way of checking
>>>>> + that C gives the right type. */
>>>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>>>> + ((void) (&(RES).coeffs[0] == (C *) 0), \
>>>>> + wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>>>> + ? (void) ((RES).coeffs[I] = VALUE) \
>>>>> + : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>>>
>>>>> is the following use well-defined?
>>>>>
>>>>> +template<unsigned int N, typename C>
>>>>> +inline poly_int_pod<N, C>&
>>>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>>>> +{
>>>>> + POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>>>
>>>>> It looks to me as though the VALUE argument in the ctor invoked
>>>>> by the placement new expression is evaluated after the dtor has
>>>>> destroyed the very array element the VALUE argument expands to.
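
For illustration, here is a minimal standalone sketch of that ordering concern,
with std::string standing in for a non-trivial C (a hypothetical example, not
the poly_int code itself):

  // Hypothetical stand-alone sketch; std::string plays the role of a
  // non-trivial C.  This is not the actual poly_int code.
  #include <new>
  #include <string>
  #include <utility>

  int
  main ()
  {
    std::string coeff ("abc");

    // Unsafe ordering -- what the macro expansion does when VALUE mentions
    // the element being replaced: by the time the constructor argument
    // "coeff + coeff" is evaluated, the destructor has already run.
    //
    //   coeff.~basic_string ();
    //   new (&coeff) std::string (coeff + coeff);  // reads a destroyed object
    //
    // Safe ordering: evaluate the new value first, then destroy and
    // reconstruct in place from the temporary.
    std::string tmp = coeff + coeff;
    coeff.~basic_string ();
    new (&coeff) std::string (std::move (tmp));
    return 0;
  }

The commented-out lines mirror the macro's expansion; the safe variant
evaluates the new value into a temporary before destroying the old object.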
>>>>
>>>> Good catch! It should simply have been doing <<= on each coefficient --
>>>> I must have got carried away when converting to POLY_SET_COEFF.
>>>>
>>>> I double-checked the other uses and think that's the only one.
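
For reference, doing <<= on each coefficient would presumably look something
like this (a sketch based on the description above, not the exact committed
code):

  template<unsigned int N, typename C>
  inline poly_int_pod<N, C>&
  poly_int_pod<N, C>::operator <<= (unsigned int a)
  {
    // Shift each coefficient in place; no destructor/placement-new dance is
    // needed, so the self-reference problem above does not arise.
    for (unsigned int i = 0; i < N; i++)
      this->coeffs[i] <<= a;
    return *this;
  }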
>>>>
>>>>> Whether or not it is, in fact, a problem, it seems to me that using
>>>>> a function template rather than a macro would be a clearer and
>>>>> safer way to do the same thing. (Safer in that the macro also
>>>>> evaluates its arguments multiple times, which is often a source
>>>>> of subtle bugs.)
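
As an aside on the multiple-evaluation point, a self-contained toy example
(hypothetical names, unrelated to poly_int) shows the usual hazard: an
argument with a side effect is evaluated once per mention in a macro body,
but exactly once when passed to an inline function:

  #include <cstdio>

  // Toy macro that mentions its index argument twice.
  #define SET_AND_CHECK(arr, i, v) ((arr)[(i)] = (v), (arr)[(i)] == (v))

  // Equivalent function template: each argument is evaluated exactly once.
  template<typename T>
  inline bool
  set_and_check (T *arr, unsigned int i, T v)
  {
    arr[i] = v;
    return arr[i] == v;
  }

  int
  main ()
  {
    int a[4] = {};
    unsigned int i = 0;
    SET_AND_CHECK (a, i++, 1);     // i is incremented twice
    std::printf ("macro: i = %u\n", i);
    i = 0;
    set_and_check (a, i++, 2);     // i is incremented exactly once
    std::printf ("function: i = %u\n", i);
    return 0;
  }

Here the first printf reports i = 2 and the second i = 1.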
>>>>
>>>> That would slow down -O0 builds though, by introducing an extra
>>>> function call and set of temporaries even when the coefficients
>>>> are primitive integers.
>>>
>>> Would decorating the function template with attribute always_inline
>>> help?
>>
>> It would remove the call itself, but we'd still have the extra temporary
>> objects that were the function argument and return value.
>
> Sorry, I do not want to get into another long discussion about
> trade-offs between safety and efficiency but I'm not sure I see
> what extra temporaries it would create. It seems to me that
> an inline function template that took arguments of user-defined
> types by reference and others by value should be just as efficient
> as a macro.
>
> From GCC's own manual:
>
> 6.43 An Inline Function is As Fast As a Macro
> https://gcc.gnu.org/onlinedocs/gcc/Inline.html

You can see the difference with something like:

  inline
  void __attribute__((always_inline))
  f(int &dst, const int &src) { dst = src; }

  int g1(const int &y) { int x; f(x, y); return x; }
  int g2(const int &y) { int x; x = y; return x; }

where *.optimized from GCC 7.1 at -O0 is:

int g1(const int&) (const int & y)
{
  int & dst;
  const int & src;
  int x;
  int D.2285;
  int _3;
  int _6;

  <bb 2> [0.00%]:
  src_5 = y_2(D);
  _6 = *src_5;
  x = _6;
  _3 = x;
  x ={v} {CLOBBER};

<L1> [0.00%]:
  return _3;

}

vs:

int g2(const int&) (const int & y)
{
  int x;
  int D.2288;
  int _4;

  <bb 2> [0.00%]:
  x_3 = *y_2(D);
  _4 = x_3;

<L0> [0.00%]:
  return _4;

}

> If that's not the case and there is a significant performance
> penalty associated with inline functions at -O0 then GCC should
> be fixed to avoid it.
I think those docs are really talking about inline functions being as
fast as macros when optimisation is enabled. I don't think we make
any guarantees about -O0 code quality.
Thanks,
Richard