On Thu, Jul 07, 2005 at 05:30:26PM -0700, Dale Johannesen wrote:
>       cvtss2sd [EMAIL PROTECTED](%ecx), %xmm0
>
> (this is Linux, the same happens on Darwin).
> This is not really a good idea, as movsd of a double-precision 1.0 is
> faster.  The change from double to single precision is done in
> compress_float_constant, and there's no cost computation there;
> presumably the RTL optimizers are expected to change it back if
> that's beneficial.

No, not really.  It was expected that any target that supports
conversion with memory input and register output actually stores
values in registers in a canonical format (think 387 fpu), and so
the conversion is actually a plain load, and so the conversion is
really and truly free.

If I'm not mistaken, -mfpmath=sse is the only exception to this.  :-/

> With -fpic, first, fold_rtx doesn't recognize the PIC form as
> representing a constant,

While I certainly wouldn't expect fold_rtx to find out about this
all by itself, I'd have thought that there would have been a
REG_EQUIV or REG_EQUAL note that indicates that the end result is
the constant (const_double:DF 1.0), and use that in any simplification.

That said, I don't know that I'd work too hard on this, since most
simplification should have happened already at the tree level.

> Hacking around that doesn't help, because force_const_mem doesn't
> produce the PIC form of constant reference, even though we're in
> PIC mode ...

Correct.  The caller is supposed to use validize_mem.

> At this point I'm wondering if this is the right place to be attacking
> the problem at all.

No, not really.  We should simply add cost comparisons before
compressing the constant in the first place.


r~
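
For illustration only, the "cost comparison before compressing" idea
can be sketched as a standalone toy in C.  This is not GCC code: the
struct, the function names and the cost numbers are all hypothetical,
and the real decision would be made on RTL with the target's own cost
hooks.  It only shows the shape of the check: narrow a double constant
to float when the value is exactly representable and the narrow load
plus the extension is no more expensive than the plain double load.

```c
#include <float.h>
#include <stdbool.h>

/* Hypothetical per-target costs, in arbitrary units.  */
struct fp_costs {
    int load_double;      /* load a double constant from memory       */
    int load_float;       /* load a float constant from memory        */
    int extend_from_mem;  /* extra cost of float->double on that load */
};

/* Made-up numbers: on a 387-style unit the extension is part of the
   load, so it costs nothing extra; with SSE math it is a separate,
   slower conversion.  */
static const struct fp_costs x87_like = { 3, 3, 0 };
static const struct fp_costs sse_like = { 3, 3, 2 };

/* True if D survives a round trip through float unchanged.  */
static bool fits_in_float(double d)
{
    if (d > FLT_MAX || d < -FLT_MAX)
        return false;           /* out of float range (incl. infinities) */
    return (double)(float)d == d;   /* false for NaN and inexact values */
}

/* Decide whether to store D as a float and extend it on load.
   Ties go to the narrow form, since it saves constant-pool space.  */
static bool should_compress(double d, const struct fp_costs *c)
{
    if (!fits_in_float(d))
        return false;
    return c->load_float + c->extend_from_mem <= c->load_double;
}
```

With these made-up tables, should_compress(1.0, &x87_like) accepts the
narrowing while should_compress(1.0, &sse_like) rejects it, which is
the distinction discussed above.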