On Thu, Jul 07, 2005 at 05:30:26PM -0700, Dale Johannesen wrote:
>         cvtss2sd        [EMAIL PROTECTED](%ecx), %xmm0
> 
> (this is Linux, the same happens on Darwin).
> This is not really a good idea, as movsd of a double-precision 1.0 is 
> faster.  The change from double to single precision is done in 
> compress_float_constant, and there's no cost computation there;
> presumably the RTL optimizers are expected to change it back if
> that's beneficial.

No, not really.  It was expected that any target that supports
conversion with memory input and register output actually stores
values in registers in a canonical format (think 387 fpu), so the
conversion is actually a plain load, and is therefore really and
truly free.

If I'm not mistaken, -mfpmath=sse is the only exception to this.  :-/

> With -fpic, first, fold_rtx doesn't recognize the PIC form as 
> representing a constant,

While I certainly wouldn't expect fold_rtx to find out about this
all by itself, I'd have thought that there would be a REG_EQUIV or
REG_EQUAL note indicating that the end result is the constant
(const_double:DF 1.0), which could then be used in any simplification.

That said, I don't know that I'd work too hard on this, since most
simplification should have happened already at the tree level.


> Hacking around that doesn't help, because force_const_mem doesn't
> produce the PIC form of constant reference, even though we're in
> PIC mode ...

Correct.  The caller is supposed to use validize_mem.

> At this point I'm wondering if this is the right place to be attacking 
> the problem at all.

No, not really.  We should simply add cost comparisons before
compressing the constant in the first place.
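
As a rough illustration of what such a check might look like, here is a
standalone toy model in C. It is not GCC code: the struct, the function
names and the cost numbers are all hypothetical, and the real change
would use the target's own cost machinery inside
compress_float_constant. The only idea it shows is "narrow the constant
only if it is exactly representable and the extending load is no more
expensive than the plain load".

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-target costs; these are NOT GCC's real cost hooks. */
struct target_costs {
    int load_double;        /* plain load of a double constant (e.g. movsd) */
    int load_float_extend;  /* load of a float constant plus widening
                               (e.g. cvtss2sd from memory) */
};

/* True if d can be stored as a float and widened back without change. */
static bool fits_in_float(double d)
{
    if (d != d)                         /* NaN: keep it out of the toy model */
        return false;
    if (d > FLT_MAX || d < -FLT_MAX)    /* out of float range */
        return false;
    volatile float f = (float)d;        /* volatile defeats excess precision */
    return (double)f == d;
}

/* Compress only when it is both exact and not more expensive. */
static bool should_compress(double d, const struct target_costs *c)
{
    assert(c != NULL);
    if (!fits_in_float(d))
        return false;
    return c->load_float_extend <= c->load_double;
}
```

With made-up costs of {1, 1} for a 387-style target (the widening load
is free) the model compresses 1.0; with {1, 2} for an -mfpmath=sse-style
target it keeps the double, which is the behaviour being asked for above.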


r~
