On Fri, Oct 2, 2015 at 11:03 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Fri, Oct 02, 2015 at 10:18:01AM +0200, Richard Biener wrote:
>> On Thu, Oct 1, 2015 at 8:36 PM, Jakub Jelinek <ja...@redhat.com> wrote:
>> > On Thu, Oct 01, 2015 at 02:57:15PM +0100, James Greenhalgh wrote:
>> >> 2015-10-01  James Greenhalgh  <james.greenha...@arm.com>
>> >>
>> >>     * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>> >
>> > Also, please note that
>> > +      wide_int m = wi::min_value (TYPE_PRECISION (type), SIGNED);
>> > +      tree tt
>> > +        = build_nonstandard_integer_type (TYPE_PRECISION (type),
>> > +                                          false);
>> > +      tree mask = wide_int_to_tree (tt, m);
>> > is really not a reliable way to determine which bit to change.
>> > In some floating format it is not possible at all, in others it might not
>> > be the topmost bit of the precision, or might depend on
>> > FLOAT_WORDS_BIG_ENDIAN etc., see expand_copysign_bit and expand_copysign
>> > for details (e.g. one has to look at fmt->signbit_rw etc.).
>> > So, I probably agree with Andrew that it would be better optimized during
>> > expansion.  One issue for that though is that TER stops at calls, we'd need
>> > to special case this case.
>>
>> I agree with optimizing this in expansion only.  The copysign form is shorter
>> and it captures the high-level part of the operation better.  Say we later
>> constant-propagate a positive real into y then chances are high we optimize
>> the copysign form but not the lowered one.  Also if we ever get VRP to
>> handle real-type ranges it would need to decipher the sequence as well.
>
> But hacking TER for this special case might not be nice either, perhaps
> we want an internal function that would represent this
> - CHANGESIGN (x, y) -- (x * copysign (1.0, y)) (or some better name) and
> fold this say during fab pass or so, and then let expansion decide how
> exactly to expand it (custom optab, or the generic tweaking of the bit,
> something else?).
In the long run I wanted to make special expansions not require TER by doing
those on the GIMPLE level in a separate pass right before expansion.

> BTW, it seems wrf also in many places uses MAX <copysign (1.0, y), 0.0>
> or MIN <copysign (1.0, y), 0.0> (always in pairs), would that be also
> something to optimize?

Hmm, we'll already CSE copysign so the question is how to optimize

  tem1 = MAX <x, y>;
  tem2 = MIN <x, y>;

Turn them back into control-flow?  What would we like to end up with in
assembly?

> Also, the x * copysign (1.0, y) in wrf is actually
> x * (1/12.) * copysign (1.0, y)
> (or similar - other constants), wouldn't it make more sense to optimize that
> as x * copysign (1/12., y) first (at least if we can reassociate)?

Yeah, I think CST * copysign (CST, ...) should constant fold to
copysign (CST', ...) if that's always valid.  I don't think association comes
into play here, but as always you have to read the fine print of the standard
for FP optimizations...

Richard.

>
>         Jakub
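
For reference, here is a minimal C sketch of the two identities discussed above. It is only an illustration under stated assumptions, not what GCC's expansion does: it assumes `double` is IEEE 754 binary64 with the sign in bit 63 (exactly the assumption Jakub points out is not portable, since the real code has to consult `fmt->signbit_rw`), and it assumes the default round-to-nearest mode. The names `changesign_mul`, `changesign_xor`, `fold_before`, `fold_after` and `bits_of` are hypothetical helpers made up for this sketch.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a double (assumes a 64-bit double). */
static uint64_t bits_of(double d)
{
    uint64_t b;
    memcpy(&b, &d, sizeof b);
    return b;
}

/* High-level form from the thread: x * copysign (1.0, y). */
static double changesign_mul(double x, double y)
{
    return x * copysign(1.0, y);
}

/* Lowered form: flip the sign of x when y's sign bit is set.
   ASSUMPTION: IEEE 754 binary64 with the sign in bit 63.  Other
   floating-point formats may keep the sign elsewhere or not expose
   it as a single bit at all.  Results for NaN inputs are not
   compared here. */
static double changesign_xor(double x, double y)
{
    uint64_t xb = bits_of(x);
    uint64_t yb = bits_of(y);

    xb ^= yb & UINT64_C(0x8000000000000000);
    memcpy(&x, &xb, sizeof x);
    return x;
}

/* The wrf-style expression as written: (x * (1/12.)) * copysign (1.0, y). */
static double fold_before(double x, double y)
{
    return x * (1 / 12.) * copysign(1.0, y);
}

/* The folded candidate: x * copysign (1/12., y).
   ASSUMPTION: the constant is positive and rounding is to nearest;
   with directed rounding modes the two forms can round differently. */
static double fold_after(double x, double y)
{
    return x * copysign(1 / 12., y);
}
```

For non-NaN inputs the two `changesign_*` helpers should agree bit for bit, including for signed zeros, because multiplying by +1.0 or -1.0 is exact. The same reasoning is why `fold_before` and `fold_after` agree under round-to-nearest; whether that holds in every mode the standard allows is the fine print mentioned above.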