On Fri, Oct 2, 2015 at 11:03 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Fri, Oct 02, 2015 at 10:18:01AM +0200, Richard Biener wrote:
>> On Thu, Oct 1, 2015 at 8:36 PM, Jakub Jelinek <ja...@redhat.com> wrote:
>> > On Thu, Oct 01, 2015 at 02:57:15PM +0100, James Greenhalgh wrote:
>> >> 2015-10-01  James Greenhalgh  <james.greenha...@arm.com>
>> >>
>> >>       * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>> >
>> > Also, please note that
>> > +      wide_int m = wi::min_value (TYPE_PRECISION (type), SIGNED);
>> > +      tree tt
>> > +     = build_nonstandard_integer_type (TYPE_PRECISION (type),
>> > +                                       false);
>> > +      tree mask = wide_int_to_tree (tt, m);
>> > is really not a reliable way to determine which bit to change.
>> > In some floating format it is not possible at all, in others it might not
>> > be the topmost bit of the precision, or might depend on
>> > FLOAT_WORDS_BIG_ENDIAN etc., see expand_copysign_bit and expand_copysign
>> > for details (e.g. one has to look at fmt->signbit_rw etc.).
>> > So, I probably agree with Andrew that it would be better optimized during
>> > expansion.  One issue for that though is that TER stops at calls, we'd need
>> > to special case this case.
>>
>> I agree with optimizing this in expansion only.  The copysign form is shorter
>> and it captures the high-level part of the operation better.  Say we later
>> constant-propagate a positive real into y then chances are high we optimize
>> the copysign form but not the lowered one.  Also if we ever get VRP to
>> handle real-type ranges it would need to decipher the sequence as well.
>
> But hacking TER for this special case might not be nice either, perhaps
> we want an internal function that would represent this
> - CHANGESIGN (x, y) -- (x * copysign (1.0, y)) (or some better name) and
> fold this say during fab pass or so, and then let expansion decide how exactly to
> expand it (custom optab, or the generic tweaking of the bit, something else?).

In the long run I wanted to make special expansions not require TER by doing
those on the GIMPLE level in a separate pass right before expansion.

> BTW, it seems wrf also in many places uses MAX <copysign (1.0, y), 0.0>
> or MIN <copysign (1.0, y), 0.0> (always in pairs), would that be also
> something to optimize?

Hmm, we'll already CSE copysign so the question is how to optimize
tem1 = MAX <x, y>; tem2 = MIN <x, y>;  Turn them back into control-flow?
What would we like to end up with in assembly?
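
For reference, a minimal sketch of what such a pair computes, assuming
MAX/MIN behave like the plain comparisons below (the function names are
hypothetical, not taken from wrf or GCC).  Since copysign (1.0, y) is always
+1.0 or -1.0, the MAX is 1.0 or 0.0 and the MIN is 0.0 or -1.0, so both
could in principle be written as selects on the sign bit of y:

```c
#include <math.h>

/* MAX <copysign (1.0, y), 0.0>, spelled as a plain comparison.  */
double pos_part (double y)
{
  double s = copysign (1.0, y);
  return s > 0.0 ? s : 0.0;
}

/* MIN <copysign (1.0, y), 0.0>, spelled as a plain comparison.  */
double neg_part (double y)
{
  double s = copysign (1.0, y);
  return s < 0.0 ? s : 0.0;
}

/* Candidate replacements: selects keyed on the sign bit of y.
   These are an assumption about a possible target form, not
   something GCC is known to generate.  */
double pos_part_select (double y)
{
  return signbit (y) ? 0.0 : 1.0;
}

double neg_part_select (double y)
{
  return signbit (y) ? -1.0 : 0.0;
}
```

Whether the select form is actually cheaper than max/min instructions would
depend on the target.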

> Also, the x * copysign (1.0, y) in wrf is actually x * (1/12.) * copysign (1.0, y)
> (or similar - other constants), wouldn't it make more sense to optimize that
> as x * copysign (1/12., y) first (at least if we can reassociate)?

Yeah, I think CST * copysign (CST, ...) should constant fold to
copysign (CST', ...) if that's always valid.  I don't think association
comes into play here, but as always you have to read the fine print of the
standard for FP optimizations...
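
As a sketch of the wrf case quoted above, assuming round-to-nearest and a
positive constant (the function names are hypothetical): multiplying by
+1.0 or -1.0 is exact and rounding is symmetric in sign, so the two forms
below should give identical results.

```c
#include <math.h>

/* Form as quoted from wrf: x * (1/12.) * copysign (1.0, y).  */
double scaled_sign_unfolded (double x, double y)
{
  return x * (1.0 / 12.0) * copysign (1.0, y);
}

/* Folded form: the constant is moved inside copysign.
   Assumes the constant is positive and the default rounding mode.  */
double scaled_sign_folded (double x, double y)
{
  return x * copysign (1.0 / 12.0, y);
}
```

A negative constant would flip the sign of the result, so it could not be
folded into copysign (CST', y) as is, and directed rounding modes
(-frounding-math) are not covered by this sketch.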

Richard.

>
>         Jakub
