Martin Sebor <mse...@gmail.com> writes:
> On 11/08/2017 09:51 AM, Richard Sandiford wrote:
>> Martin Sebor <mse...@gmail.com> writes:
>>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>>>> Martin Sebor <mse...@gmail.com> writes:
>>>>> I haven't done nearly a thorough review but the dtor followed by
>>>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>>>> I thought I'd ask sooner rather than later.  Given the macro
>>>>> definition:
>>>>>
>>>>> +   The dummy comparison against a null C * is just a way of checking
>>>>> +   that C gives the right type.  */
>>>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>>>> +  ((void) (&(RES).coeffs[0] == (C *) 0), \
>>>>> +   wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION \
>>>>> +   ? (void) ((RES).coeffs[I] = VALUE) \
>>>>> +   : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>>>
>>>>> is the following use well-defined?
>>>>>
>>>>> +template<unsigned int N, typename C>
>>>>> +inline poly_int_pod<N, C>&
>>>>> +poly_int_pod<N, C>::operator <<= (unsigned int a)
>>>>> +{
>>>>> +  POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>>>
>>>>> It looks to me as though the VALUE argument in the ctor invoked
>>>>> by the placement new expression is evaluated after the dtor has
>>>>> destroyed the very array element the VALUE argument expands to.
>>>>
>>>> Good catch!  It should simply have been doing <<= on each coefficient --
>>>> I must have got carried away when converting to POLY_SET_COEFF.
>>>>
>>>> I double-checked the other uses and think that's the only one.
>>>>
>>>>> Whether or not it is, in fact, a problem, it seems to me that using
>>>>> a function template rather than a macro would be a clearer and
>>>>> safer way to do the same thing.  (Safer in that the macro also
>>>>> evaluates its arguments multiple times, which is often a source
>>>>> of subtle bugs.)
>>>>
>>>> That would slow down -O0 builds though, by introducing an extra
>>>> function call and set of temporaries even when the coefficients
>>>> are primitive integers.
>>>
>>> Would decorating the function template with attribute always_inline
>>> help?
>>
>> It would remove the call itself, but we'd still have the extra temporary
>> objects that were the function argument and return value.
>
> Sorry, I do not want to get into another long discussion about
> trade-offs between safety and efficiency but I'm not sure I see
> what extra temporaries it would create.  It seems to me that
> an inline function template that took arguments of user-defined
> types by reference and others by value should be just as efficient
> as a macro.
>
> From GCC's own manual:
>
>   6.43 An Inline Function is As Fast As a Macro
>   https://gcc.gnu.org/onlinedocs/gcc/Inline.html
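(On the <<= issue: as said above, the fix is just to apply <<= to each
coefficient in place rather than routing the shifted value back through
POLY_SET_COEFF, i.e. roughly:

  template<unsigned int N, typename C>
  inline poly_int_pod<N, C>&
  poly_int_pod<N, C>::operator <<= (unsigned int a)
  {
    for (unsigned int i = 0; i < N; i++)
      this->coeffs[i] <<= a;
    return *this;
  }

That's only a sketch of the shape, not the exact patch text.)

Back to the question of extra temporaries at -O0.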
You can see the difference with something like:

  inline void __attribute__((always_inline))
  f(int &dst, const int &src) { dst = src; }

  int g1(const int &y) { int x; f(x, y); return x; }
  int g2(const int &y) { int x; x = y; return x; }

where *.optimized from GCC 7.1 at -O0 is:

  int g1(const int&) (const int & y)
  {
    int & dst;
    const int & src;
    int x;
    int D.2285;
    int _3;
    int _6;

    <bb 2> [0.00%]:
    src_5 = y_2(D);
    _6 = *src_5;
    x = _6;
    _3 = x;
    x ={v} {CLOBBER};

  <L1> [0.00%]:
    return _3;

  }

vs:

  int g2(const int&) (const int & y)
  {
    int x;
    int D.2288;
    int _4;

    <bb 2> [0.00%]:
    x_3 = *y_2(D);
    _4 = x_3;

  <L0> [0.00%]:
    return _4;

  }

> If that's not the case and there is a significant performance
> penalty associated with inline functions at -O0 then GCC should
> be fixed to avoid it.

I think those docs are really talking about inline functions being
as fast as macros when optimisation is enabled.  I don't think we
make any guarantees about -O0 code quality.

Thanks,
Richard
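P.S. For completeness, the always_inline replacement being suggested would
presumably look something like this on the real interface (just a sketch --
the function name and exact signature here are mine, not something from the
patch; the body mirrors the two arms of POLY_SET_COEFF):

  template<unsigned int N, typename C, typename T>
  inline void __attribute__((always_inline))
  poly_set_coeff (poly_int_pod<N, C> &res, unsigned int i, const T &value)
  {
    if (wi::int_traits<C>::precision_type == wi::FLEXIBLE_PRECISION)
      /* Primitive coefficients: plain assignment, as in the macro.  */
      res.coeffs[i] = value;
    else
      {
        /* wide-int-style coefficients: destroy and reconstruct in place.  */
        res.coeffs[i].~C ();
        new (&res.coeffs[i]) C (value);
      }
  }

Like f() in the reduced example, it takes the new value by reference, and at
-O0 the temporary that the reference binds to and the indirection through it
survive into the generated code even with always_inline -- which is the extra
cost for plain-integer C that the macro form avoids.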