Re: Linemap and pph

2011-07-27 Thread Dodji Seketeli
Gabriel Charette wrote:

> From what I understand, the source_locations allocated for
> everything in a given set of headers (from the LC_ENTER for the
> header in the line_table up to, and including everything in between,
> the corresponding LC_LEAVE) is dependent on only one thing; the
> value of line_table->highest_location when the header was inserted

The source_location handed out by a given line map is dependent on
three things:

1/ The line_map::start_location.  For a given map, this one equals
line_table->highest_location + 1, because at the time the line map is
created, its line_map::start_location must be greater than the highest
source_location handed out by any line map previously created.  At any
point in time, line_table->highest_location equals the
line_map::start_location of the most recently created line_map.  This is part
of the monotonically increasing property of source_location we were
talking about earlier.

2/ The line number of the location

3/ The column number of the location
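As a rough illustration of how these three inputs combine into one
source_location, here is a minimal sketch.  The names toy_line_map and
toy_position_for_line_and_column are hypothetical simplifications, not
the libcpp definitions, and the encoding (line offset shifted left by a
per-map number of column bits, plus the column) is an assumption
modelled on how libcpp packs locations.

```c
#include <assert.h>

typedef unsigned int source_location;

/* Hypothetical, simplified stand-in for libcpp's struct line_map.  */
struct toy_line_map
{
  source_location start_location; /* First location owned by this map.  */
  unsigned int to_line;           /* Source line start_location maps to.  */
  unsigned int column_bits;       /* Bits reserved for the column number.  */
};

/* Combine the map's start_location, the line and the column into a
   single location.  */
static source_location
toy_position_for_line_and_column (const struct toy_line_map *map,
                                  unsigned int line, unsigned int column)
{
  assert (line >= map->to_line);
  assert (column < (1u << map->column_bits));
  return map->start_location
         + ((line - map->to_line) << map->column_bits)
         + column;
}
```

With a map whose start_location is 100, to_line is 1 and column_bits is
7, line 3 column 5 would get 100 + (2 << 7) + 5 under this model.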

> (i.e. if in two different contexts we happen to have the same
> line_table->highest_location when inserting header A.h, all of the
> tokens for A.h in each context will have the same source_location).

Each token coming from a given A.h will have a different
source_location, as they will presumably have a different {line,column}
pair.

If in two different contexts we happen to have the same
line_table->highest_location, the line_map::start_location of the two
different line maps of the two different A.hs will be equal, though.
 
> If the previous statement is true, then we calculate an offset
> between the line_table->highest_location as it was in the serialized
> version of the header and as it is in the C file in which we are
> about to de-serialize the line table.
>
> We then need to update some things based on that offset:

[...]

It seems to me that you could just set the line_map::start_location of
the de-serialized map for the current portion of A.h to the current
line_table->highest_location of the main CU you are currently parsing,
plus 1 (i.e., just don't take the serialized
line_table->highest_location into account.  Actually I would think that
it's unnecessary to serialize line_table->highest_location at all).

Then you should also set the de-serialized line_map::to_line to the
current line in your context.  Then you should add that line map to
the current line_table and set line_table->highest_location to the
line_map::start_location you have just computed.

Then, assign the source_location of each token that belongs to that
de-serialized map to a source_location that is handed out by your new
linemap_position_for_line_and_column, with de-serialized map passed in
argument.  This is where you would progress "backward" in the token
stream, for that given map.

Then somehow, when you need to suck in a new pph (so this time you are
going downward in the stream again), just start this dance of
de-serializing the map for that new pph, updating its properties,
adding it to line_table, and setting the source_location of the tokens
coming for that pph again.
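The steps above can be sketched roughly as follows.  This is only a toy
model under stated assumptions: toy_line_table, toy_line_map, toy_token
and toy_adopt_map are hypothetical names, the table is a fixed-size
array, and the location encoding is a simplification rather than the
real libcpp one.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int source_location;

#define TOY_MAX_MAPS 16

/* Hypothetical, simplified stand-ins for the libcpp structures.  */
struct toy_line_map
{
  source_location start_location;
  unsigned int to_line;
  unsigned int column_bits;
};

struct toy_line_table
{
  struct toy_line_map maps[TOY_MAX_MAPS];
  size_t used;
  source_location highest_location;
};

struct toy_token
{
  unsigned int line;
  unsigned int column;
  source_location loc;
};

/* Simplified location encoding (an assumption, see the lead-in).  */
static source_location
toy_position (const struct toy_line_map *map, unsigned int line,
              unsigned int column)
{
  assert (line >= map->to_line);
  return map->start_location
         + ((line - map->to_line) << map->column_bits)
         + column;
}

/* Re-base a de-serialized MAP onto TABLE, then recompute the location
   of every token that belongs to it.  Returns the map as stored.  */
static const struct toy_line_map *
toy_adopt_map (struct toy_line_table *table, struct toy_line_map map,
               unsigned int to_line, struct toy_token *tokens,
               size_t n_tokens)
{
  size_t i;

  assert (table->used < TOY_MAX_MAPS);

  /* Ignore the serialized start_location; start right after the
     highest location handed out so far.  */
  map.start_location = table->highest_location + 1;
  map.to_line = to_line;
  table->maps[table->used] = map;
  table->highest_location = map.start_location;

  /* Hand out fresh locations from the re-based map.  */
  for (i = 0; i < n_tokens; i++)
    {
      tokens[i].loc = toy_position (&map, tokens[i].line, tokens[i].column);
      if (tokens[i].loc > table->highest_location)
        table->highest_location = tokens[i].loc;
    }

  return &table->maps[table->used++];
}
```

The same function would be called again for each new pph that gets
pulled in, each time starting from the then-current highest_location.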

-- 
Dodji


Question about SMS scheduling windows

2011-07-27 Thread Richard Sandiford
I've been looking at SMS, and have a question about get_sched_window.
When there are previously-scheduled predecessors, we use:

  if (e->data_type == MEM_DEP)
    end = MIN (end, SCHED_TIME (v_node) + ii - 1);

to get an upper bound on the scheduling window that is permitted
by memory dependencies.  I think this:

SCHED_TIME (v_node) + ii - 1

is an inclusive bound, in that scheduling the node at that time
would not break the memory dependence, whereas scheduling at
SCHED_TIME (v_node) would.  Is that right?

I ask because in the final range:

  start = early_start;
  end = MIN (end, early_start + ii);
  /* Schedule the node close to it's predecessors.  */
  step = 1;

END is an exclusive bound.  It seems like we might be double-counting here,
and effectively limiting the schedule to SCHED_TIME (v_node) + ii - 2.
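To make the suspected double-counting concrete, here is a small sketch.
It assumes the window is walked by a loop that stops before END (an
exclusive bound); last_cycle_tried and mem_dep_latest_cycle are
hypothetical helpers, not functions from modulo-sched.c.

```c
#include <assert.h>

/* Walk a scheduling window with END treated as exclusive, and return
   the last cycle that was tried (or START - STEP if the window is
   empty).  */
static int
last_cycle_tried (int start, int end, int step)
{
  int c;
  int last = start - step;

  assert (step == 1 || step == -1);
  assert (step == 1 ? start <= end : start >= end);
  for (c = start; c != end; c += step)
    last = c;
  return last;
}

/* Latest cycle allowed by the memory dependence, as an inclusive
   bound.  */
static int
mem_dep_latest_cycle (int sched_time, int ii)
{
  return sched_time + ii - 1;
}
```

With SCHED_TIME (v_node) = 10 and ii = 4, the inclusive bound is 13.
Passing 13 as END makes the walk stop at 12, i.e. SCHED_TIME (v_node) +
ii - 2, whereas passing 14 lets cycle 13 be tried.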

While I'm here, I was also curious about:

  /* If there are more successors than predecessors schedule the
     node close to it's successors.  */
  if (count_succs >= count_preds)
    {
      int old_start = start;

      start = end - 1;
      end = old_start - 1;
      step = -1;
    }
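For reference, the reversal above can be checked in isolation.  The
sketch below uses a hypothetical toy_window struct and assumes the
window is half-open, [start, end) with step 1, before the swap.

```c
#include <assert.h>

/* Hypothetical container for the three window variables.  */
struct toy_window
{
  int start;
  int end;
  int step;
};

/* Turn a forward half-open window [start, end) into the equivalent
   backward walk, as the quoted code does.  */
static struct toy_window
toy_reverse_window (struct toy_window w)
{
  struct toy_window r;

  assert (w.step == 1);
  r.start = w.end - 1;
  r.end = w.start - 1;
  r.step = -1;
  return r;
}

/* Record the cycles visited by the window loop; returns their count.  */
static int
toy_collect_cycles (struct toy_window w, int *out, int max)
{
  int c;
  int n = 0;

  for (c = w.start; c != w.end && n < max; c += w.step)
    out[n++] = c;
  return n;
}
```

A forward window of cycles 5, 6, 7 becomes 7, 6, 5 after the reversal,
so the same cycles are tried, only starting from the successor side.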

This doesn't seem to be in the paper, and the comment suggests
"count_succs > count_preds" rather than "count_succs >= count_preds".
Is the ">=" vs ">" important?

Thanks,
Richard


Re: Question about SMS scheduling windows

2011-07-27 Thread Richard Sandiford
Richard Sandiford  writes:
> I've been looking at SMS, and have a question about get_sched_window.
> When there are previously-scheduled predecessors, we use:
>
> if (e->data_type == MEM_DEP)
>   end = MIN (end, SCHED_TIME (v_node) + ii - 1);
>
> to get an upper bound on the scheduling window that is permitted
> by memory dependencies.  I think this:
>
> SCHED_TIME (v_node) + ii - 1
>
> is an inclusive bound, in that scheduling the node at that time
> would not break the memory dependence, whereas scheduling at
> SCHED_TIME (v_node) would.  Is that right?

I meant of course "scheduling at SCHED_TIME (v_node) + ii would".

Richard


Re: Question about SMS scheduling windows

2011-07-27 Thread Revital1 Eres
Hello Richard,

> I ask because in the final range:
>
>   start = early_start;
>   end = MIN (end, early_start + ii);
>   /* Schedule the node close to it's predecessors.  */
>   step = 1;
>
> END is an exclusive bound.  It seems like we might be double-counting here,
> and effectively limiting the schedule to SCHED_TIME (v_node) + ii - 2.

Yes, I think it indeed should be fixed. Thanks for reporting on this.

Revital



Re: Linemap and pph

2011-07-27 Thread Gabriel Charette
I think I wasn't clear in the way I expressed my assumptions in my last email:

On Wed, Jul 27, 2011 at 1:11 AM, Dodji Seketeli  wrote:
>
> Gabriel Charette wrote:
>
> > From what I understand, the source_locations allocated for
> > everything in a given set of headers (from the LC_ENTER for the
> > header in the line_table up to, and including everything in between,
> > the corresponding LC_LEAVE) is dependent on only one thing; the
> > value of line_table->highest_location when the header was inserted
>
> The source_location handed out by a given line map is dependent on
> three things:
>
> 1/ The line_map::start_location.  For a given map, this one equals
> line_table->highest_location + 1, because at the time the line map is
> created, its line_map::start_location must be greater than the highest
> source_location handed out by any line map previously created.  At any
> point in time, line_table->highest_location equals the
> line_map::start_location of the lastly created line_map.  This is part
> of the monotonically increasing property of source_location we were
> talking about earlier.
>

Actually from my understanding, highest_location is not equal to the
last start_location, but to the last source_location returned by
either linemap_line_start or linemap_position_for_column (which is
greater than or equal to the start_location of the current line_map).

> 2/ The line number of the location
>
> 3/ The column number of the location
>

Right, that's what I mean, I would not actually stream
highest_location. What I meant by they "all depend on only
highest_location" is that IF highest_location is the same in file B.c
and in a different compiled file C.c when they happen to include A.h,
then all of the source_locations for the tokens in A.h (in both
compilations) would be identical (i.e. token1's loc in B == token1's
loc in C, etc.), as 2/ and 3/ are always the same since we're talking
about the same file A.h in both compilations; hence if 1/ also holds,
we get the same result.

> > (i.e. if in two different contexts we happen to have the same
> > line_table->highest_location when inserting header A.h, all of the
> > tokens for A.h in each context will have the same source_location).
>
> Each token coming from a given A.h will have a different
> source_location, as they will presumably have a different {line,column}
> pair.
>

What I meant is that all of the source locations handed out in the
first compilation will be the same as all of the source locations
handed out in the second compilation, pairwise (not that all of the
tokens' source locations will be the same within a single
compilation, of course!).

> If in two different contexts we happen to have the same
> line_table->highest_location, the line_map::start_location of the two
> different line maps of the two different A.hs will be equal, though.
>
> > If the previous statement is true, then we calculate an offset
> > between the line_table->highest_location as it was in the serialized
> > version of the header and as it is in the C file in which we are
> > about to de-serialize the line table.
> >
> > We then need to update some things based on that offset:
>
> [...]
>

Hence, given that they only depend on start_location, I just have to
calculate an offset between the serialized start_location and the
start_location as it would be (highest_location + 1) in the C file
including the header, and offset all of the source_locations on each
token coming from the pph (without even needing to recalculate them!).
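A minimal sketch of that offsetting, assuming source_location is an
unsigned integer and that nothing else about the maps (such as the
number of column bits) changes between the two contexts.  toy_token and
toy_rebase_tokens are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int source_location;

/* Hypothetical token carrying only its location.  */
struct toy_token
{
  source_location loc;
};

/* Shift every token location by the difference between the
   start_location the header had when it was serialized and the one it
   gets in the including file.  */
static void
toy_rebase_tokens (struct toy_token *tokens, size_t n,
                   source_location serialized_start,
                   source_location new_start)
{
  /* Unsigned arithmetic wraps, so one addition handles a shift in
     either direction.  */
  source_location delta = new_start - serialized_start;
  size_t i;

  assert (tokens != NULL || n == 0);
  for (i = 0; i < n; i++)
    tokens[i].loc += delta;
}
```

Here new_start would be highest_location + 1 in the C file at the point
where the header is included.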

> It seems to me that you could just set the line_map::start_location of
> the de-serialized map for the current portion of A.h to the current
> line_table->highest_location of the main CU your are currently parsing
> (i.e, just forget about the serialized line_table->highest_location
> into account.  Actually I would think that it's unnecessary to
> serialize the line_table->highest_location) + 1.
>
> Then you should also set the de-serialized line_map::to_line to the
> current line in your context.  Then you should add that line map to
> the current line_table and set line_table->highest_location to the
> line_map::start_location you have just computed.
>
> Then, assign the source_location of each token that belongs to that
> de-serialized map to a source_location that is handed out by your new
> linemap_position_for_line_and_column, with de-serialized map passed in
> argument.  This is where you would progress "backward" in the token
> stream, for that given map.
>
> Then somehow, when you need to suck in a new pph (so this time you are
> going downward in the stream again), just start this dance of
> de-serializing the map for that new pph, updating its properties,
> adding it to line_table, and setting the source_location of the tokens
> coming for that pph again.
>

Doing it this way (with the offset) I would read in all the tokens and
linemap entries inherited from that header and its underlying include
tree, thus no need to be tricky about inserting li

Re: Linemap and pph

2011-07-27 Thread Dodji Seketeli
Gabriel Charette wrote:


> Actually from my understanding, highest_location is not equal to the
> last start_location, but to the last source_location returned by
> either linemap_line_start or linemap_position_for_column (which is
> >= to the start_location of the current line_map).

Right, it equals the highest location yielded by any map in the system.
Sorry.

>
>> 2/ The line number of the location
>>
>> 3/ The column number of the location
>>
>
> Right, that's what I mean, I would not actually stream
> highest_location. What I meant by they "all depend on only
> highest_location" is that IF highest_location is the same in file B.c
> and in a different compiled file C.c when they happen to include A.h,
> then all of the source_locations for the tokens in A.h (in both
> compilation) would be identical (i.e. token1's loc in B == token1's
> loc in C, etc.) (as 2/3 always is the same since we're talking about
> the same file A.h in both compilation, hence if 1 also holds, we get
> the same result).
>

Oh, OK.

>> > (i.e. if in two different contexts we happen to have the same
>> > line_table->highest_location when inserting header A.h, all of the
>> > tokens for A.h in each context will have the same source_location).
>>
>> Each token coming from a given A.h will have a different
>> source_location, as they will presumably have a different {line,column}
>> pair.
>>
>
> What I meant is that all of the source locations handed out in the
> first compilation will be the same as all of the source locations
> handed out in the second compilation, pairwise (not that ALL token's
> source locations themselves will be the same within a single
> compilation of course!).
>

OK.

>> If in two different contexts we happen to have the same
>> line_table->highest_location, the line_map::start_location of the two
>> different line maps of the two different A.hs will be equal, though.
>>
>> > If the previous statement is true, then we calculate an offset
>> > between the line_table->highest_location as it was in the serialized
>> > version of the header and as it is in the C file in which we are
>> > about to de-serialize the line table.
>> >
>> > We then need to update some things based on that offset:
>>
>> [...]
>>
>
> Hence, given that they only depend on start_location, I just have to
> calculate an offset between the serialized start_location and the
> start_location as it would be (highest_location + 1) in the C file
> including the header, and offset all of the source_locations on each
> token coming from the pph (without even needing to recalculate them!).
>

That could work.  But then you'd need to do something for a map encoding
the locations of tokens coming from the pph to appear in line_table,
right?  Otherwise, at lookup time, (when you want to find the map that
matches the source_location of a token coming for that pph), you'll be
in trouble.  I am saying this b/c you are not calling linemap_line_start
anymore.  And that function was the one that was including the said map
to line_table.  And the map still must be inserted into line_table
somehow.

>> It seems to me that you could just set the line_map::start_location of
>> the de-serialized map for the current portion of A.h to the current
>> line_table->highest_location of the main CU your are currently parsing
>> (i.e, just forget about the serialized line_table->highest_location
>> into account.  Actually I would think that it's unnecessary to
>> serialize the line_table->highest_location) + 1.
>>
>> Then you should also set the de-serialized line_map::to_line to the
>> current line in your context.  Then you should add that line map to
>> the current line_table and set line_table->highest_location to the
>> line_map::start_location you have just computed.
>>
>> Then, assign the source_location of each token that belongs to that
>> de-serialized map to a source_location that is handed out by your new
>> linemap_position_for_line_and_column, with de-serialized map passed in
>> argument.  This is where you would progress "backward" in the token
>> stream, for that given map.
>>
>> Then somehow, when you need to suck in a new pph (so this time you are
>> going downward in the stream again), just start this dance of
>> de-serializing the map for that new pph, updating its properties,
>> adding it to line_table, and setting the source_location of the tokens
>> coming for that pph again.
>>
>
> Doing it this way (with the offset) I would read in all the tokens and
> linemap entries inherited from that header and it's underlying include
> tree, thus no need to be tricky about inserting line tables for the
> header's included file, as they are part of the header's serialized
> line_table by recursion (a pph'ed header can include other pph'ed
> header),

This is what I am not sure I understand.  There is only one line table
per CU.  The headers included by the CU generate instances of struct
line map that are inserted into the line table of the CU.  So I 

Register pressure analysis

2011-07-27 Thread Sergey Ostanevich
Hello!

Are there any scripts/tools that parse GCC dumps and report register
pressure in the dump - either overall, or in different parts of the
dump?

regards,
Sergos


Re: Linemap and pph

2011-07-27 Thread Gabriel Charette
>>
>> Hence, given that they only depend on start_location, I just have to
>> calculate an offset between the serialized start_location and the
>> start_location as it would be (highest_location + 1) in the C file
>> including the header, and offset all of the source_locations on each
>> token coming from the pph (without even needing to recalculate them!).
>>
>
> That could work.  But then you'd need to do something for a map encoding
> the locations of tokens coming from the pph to appear in line_table,
> right?  Otherwise, at lookup time, (when you want to find the map that
> matches the source_location of a token coming for that pph), you'll be
> in trouble.  I am saying this b/c you are not calling linemap_line_start
> anymore.  And that function was the one that was including the said map
> to line_table.  And the map still must be inserted into line_table
> somehow.
>

The lookup, from what I understand, only depends on the line_map
entries for the header being present and being the same as they would
be *had* the linemap functions been called (it does not require that
each token actually called the linemap getters to get its location).
It also of course depends on the tokens having the correct
source_locations, *as if* they had been obtained from the linemap
getters.
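To illustrate why only the map entries matter at lookup time, here is a
sketch of a lookup over maps sorted by start_location.  toy_line_map
and toy_lookup are hypothetical; the binary search is an assumption
about how such a lookup could be done, not a copy of linemap_lookup.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int source_location;

/* Hypothetical, simplified map entry.  */
struct toy_line_map
{
  source_location start_location;
  const char *to_file;
};

/* Return the last map whose start_location is <= LOC, or NULL if LOC
   precedes every map.  MAPS must be sorted by start_location.  */
static const struct toy_line_map *
toy_lookup (const struct toy_line_map *maps, size_t used,
            source_location loc)
{
  size_t lo = 0;
  size_t hi = used;

  if (used == 0 || loc < maps[0].start_location)
    return NULL;

  /* Invariant: maps[lo].start_location <= loc, and every map at index
     hi or above starts after loc.  */
  while (hi - lo > 1)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (maps[mid].start_location <= loc)
        lo = mid;
      else
        hi = mid;
    }
  return &maps[lo];
}
```

The search never asks how a token obtained its location; it only needs
the header's map entries to be in the table with the right
start_location values.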

>> Doing it this way (with the offset) I would read in all the tokens and
>> linemap entries inherited from that header and it's underlying include
>> tree, thus no need to be tricky about inserting line tables for the
>> header's included file, as they are part of the header's serialized
>> line_table by recursion (a pph'ed header can include other pph'ed
>> header),
>
> This is what I am not sure to understand.  There is only one line table
> per CU.  The headers included by the CU generate instances of struct
> line map that are inserted into the line table of the CU.  So I don't
> understand what you mean by "header's serialized line_table", as I don't
> think there is such a thing as a header's line_table.

What I mean by "serialized header line_table" is the serialized
version of the line_table as it was when we were done parsing the header
being pph'ed.

I would then de-serialize that and insert its line_map entries in the
C file's line_table, doing the necessary offset adjustments in the
process (and updating all other line_table variables like
highest_location that would have changed if we had actually called the
linemap functions).
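A rough sketch of that insertion step, under the assumption that the
pph stores its map entries in order together with the highest location
it handed out.  All names here (toy_line_table,
toy_append_rebased_maps, pph_highest) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int source_location;

#define TOY_MAX_MAPS 16

/* Hypothetical, simplified stand-ins for the libcpp structures.  */
struct toy_line_map
{
  source_location start_location;
};

struct toy_line_table
{
  struct toy_line_map maps[TOY_MAX_MAPS];
  size_t used;
  source_location highest_location;
};

/* Append the N maps saved with a pph to TABLE, shifting each
   start_location so that the first one lands on highest_location + 1.
   PPH_HIGHEST is the highest location handed out while the header was
   parsed on its own.  Returns the offset that was applied.  */
static source_location
toy_append_rebased_maps (struct toy_line_table *table,
                         const struct toy_line_map *pph_maps, size_t n,
                         source_location pph_highest)
{
  source_location delta;
  size_t i;

  if (n == 0)
    return 0;
  assert (table->used + n <= TOY_MAX_MAPS);

  delta = (table->highest_location + 1) - pph_maps[0].start_location;
  for (i = 0; i < n; i++)
    {
      table->maps[table->used] = pph_maps[i];
      table->maps[table->used].start_location += delta;
      table->used++;
    }

  /* Leave the table as if the linemap functions had been called.  */
  table->highest_location = pph_highest + delta;
  return delta;
}
```

The returned offset is the same one that would be applied to the
locations of the tokens read from the pph.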


Best,
Gabriel


Re: Register pressure analysis

2011-07-27 Thread Vladimir Makarov

On 07/27/2011 10:26 AM, Sergey Ostanevich wrote:

> Hello!
>
> Are there any scripts/tools that parse GCC dumps and report register
> pressure in the dump - either overall, or in different parts of the
> dump?

-fdump-rtl-ira creates an info dump of IRA.  It includes the maximal
register pressure for each region (currently, regions are loops) and
each register pressure class.  Loop 0 corresponds to the whole program.
By default, a loop region is created when register pressure is high.
If you need to see register pressure for all loops, you can use the
option -fira-region=all.




Re: Linemap and pph

2011-07-27 Thread Dodji Seketeli
Gabriel Charette wrote:

>>>
>>> Hence, given that they only depend on start_location, I just have to
>>> calculate an offset between the serialized start_location and the
>>> start_location as it would be (highest_location + 1) in the C file
>>> including the header, and offset all of the source_locations on each
>>> token coming from the pph (without even needing to recalculate them!).
>>>
>>
>> That could work.  But then you'd need to do something for a map encoding
>> the locations of tokens coming from the pph to appear in line_table,
>> right?  Otherwise, at lookup time, (when you want to find the map that
>> matches the source_location of a token coming for that pph), you'll be
>> in trouble.  I am saying this b/c you are not calling linemap_line_start
>> anymore.  And that function was the one that was including the said map
>> to line_table.  And the map still must be inserted into line_table
>> somehow.
>>
>
> The lookup, from what I understand, only depends on the line_map
> entries for the header to be present,

Exactly.  The line_map needs to be present inside line_table->maps, so
you need to insert it in there somehow.  It wasn't clear to me from my
reading of your previous messages.

[...]

>>> Doing it this way (with the offset) I would read in all the tokens and
>>> linemap entries inherited from that header and it's underlying include
>>> tree, thus no need to be tricky about inserting line tables for the
>>> header's included file, as they are part of the header's serialized
>>> line_table by recursion (a pph'ed header can include other pph'ed
>>> header),
>>
>> This is what I am not sure to understand.  There is only one line table
>> per CU.  The headers included by the CU generate instances of struct
>> line map that are inserted into the line table of the CU.  So I don't
>> understand what you mean by "header's serialized line_table", as I don't
>> think there is such a thing as a header's line_table.
>
> What I mean by "serialized header line_table" is the serialized
> version of the line_table as it was when were done parsing the header
> being pph'ed.

OK.  Please note that you don't necessarily need to serialize the entire
line_map at the parsing point of the pph.  I believe that just
serializing the line maps of the pph'ed header should suffice.  How to
determine them is another question. :-)

> I would then de-serialize that and insert it's line_map entries in the
> C file's line_table, doing the necessary offset adjustements in the
> process (and updating all other line_table variables like
> highest_location that would have changed if we had actually called the
> linemap functions)

OK.  We are on the same page now.  Thanks.

-- 
Dodji


Re: Question about SMS scheduling windows

2011-07-27 Thread Ayal Zaks
(sorry for replicated submissions, had to convert to plain text)

>2011/7/27 Revital1 Eres 
>
>Hello Richard,
>
>
>> I ask because in the final range:
>>
>>   start = early_start;
>>   end = MIN (end, early_start + ii);
>>   /* Schedule the node close to it's predecessors.  */
>>   step = 1;
>>
>> END is an exclusive bound.  It seems like we might be double-counting here,
>> and effectively limiting the schedule to SCHED_TIME (v_node) + ii - 2.
>
>
>Yes, I think it indeed should be fixed. Thanks for reporting on this.
>
>Revital

Agreed;

  if (e->data_type == MEM_DEP)
    end = MIN (end, SCHED_TIME (v_node) + ii - 1);

should be replaced with

  if (e->data_type == MEM_DEP)
    end = MIN (end, p_st + ii);

also for the (3rd) case when there are both previously-scheduled
predecessors and previously-scheduled successors. The range is inclusive
of start and exclusive of end: for (c = start; c != end; c += step)...


>This doesn't seem to be in the paper, and the comment suggests
>"count_succs > count_preds" rather than "count_succs >= count_preds".
>Is the ">=" vs ">" important?

I think not: in both cases you'll be trying to minimize the same
number of live ranges. But indeed it's good to be consistent with the
comment.

Thanks,
Ayal.


RFC: PATCH: Require and use int64 for x86 options

2011-07-27 Thread H.J. Lu
On Wed, Jul 13, 2011 at 6:22 AM, Ian Lance Taylor  wrote:
> Igor Zamyatin  writes:
>
>> As you may see pta_flags enum in i386.c is almost full. So there is a
>> risk of overflow in quite near future. Comment in source code advises
>> "widen struct pta flags" which is now defined as unsigned. But it
>> looks not optimal.
>>
>> What will be the most proper solution for this problem?
>
> Why is widening pta_flags "not optimal?"
>
> It's hard for me to believe that we still care about bootstrapping a
> i386-*-* compiler with a compiler which doesn't support any 64-bit type.
> So I don't see any problem with setting need_64bit_hwint=yes in
> config.gcc for i386-*-*, changing pta_flags to be unsigned
> HOST_WIDE_INT, and letting pta_flags go up to (unsigned HOST_WIDE_INT) 1
> << 63.
>
> If anybody doesn't like that idea, we can simply add a flags2 field and
> a pta_flags2 enum with PTA2_xxx constants.
>

Hi,

We are also running out of bits in ix86_isa_flags.  This patch uses
int64 on both ix86_isa_flags and PTA.  I added a new option to opt:

; Maximum number of mask bits in a variable.
MaxMaskBits
ix86_isa_flags = 64

It marks ix86_isa_flags as 64-bit.  Any comments?
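As a small illustration of why the field itself has to be widened, here
is a sketch that uses uint64_t as a stand-in for a 64-bit HOST_WIDE_INT
(an assumption; the patch uses HOST_WIDE_INT).  TOY_FLAG, toy_flag_set_p
and toy_narrow are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Build the mask for bit N in a 64-bit flags word.  The cast matters:
   a plain int 1 cannot be shifted by 32 or more.  */
#define TOY_FLAG(n) ((uint64_t) 1 << (n))

/* Nonzero if bit BIT is set in FLAGS.  */
static int
toy_flag_set_p (uint64_t flags, unsigned int bit)
{
  assert (bit < 64);
  return (flags & TOY_FLAG (bit)) != 0;
}

/* What survives if the flags are stored in a 32-bit field.  */
static uint32_t
toy_narrow (uint64_t flags)
{
  return (uint32_t) flags;
}
```

A flag at bit 40 is visible through toy_flag_set_p but is silently
dropped by toy_narrow, which is the situation a 32-bit flags field
would be in once more than 32 flags exist.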

Thanks.


-- 
H.J.
---
gcc/

2011-07-27  H.J. Lu  

* config.gcc: Set need_64bit_hwint to yes for x86 targets.

* opt-read.awk (BEGIN): Set max_mask_bits[var] and
var_mask_1[var].

* opth-gen.awk: Use var_mask_1[var] instead of 1.  Check
max_mask_bits[var] instead of 31.

* config/i386/i386.c (pta): Use HOST_WIDE_INT on flags.
(builtin_isa): Use HOST_WIDE_INT on isa.
(def_builtin): Use HOST_WIDE_INT on mask.
(def_builtin_const): Likewise.
(builtin_description): Likewise.

* config/i386/i386.opt (MaxMaskBits): New.
(ix86_isa_flags): Replace int with HOST_WIDE_INT.
(ix86_isa_flags_explicit): Likewise.
(x_ix86_isa_flags_explicit): Likewise.

libcpp/

2011-07-27  H.J. Lu  

* configure.ac: Set need_64bit_hwint to yes for x86 targets.
* configure: Regenerated.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d7cf895..54ac985 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -345,6 +345,7 @@ i[34567]86-*-*)
 	cpu_type=i386
 	c_target_objs="i386-c.o"
 	cxx_target_objs="i386-c.o"
+	need_64bit_hwint=yes
 	extra_options="${extra_options} fused-madd.opt"
 	extra_headers="cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
 		   pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 96263ed..c5dd881 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2918,7 +2918,7 @@ ix86_option_override_internal (bool main_args_p)
   PTA_F16C = 1 << 26,
   PTA_BMI = 1 << 27,
   PTA_TBM = 1 << 28
-  /* if this reaches 32, need to widen struct pta flags below */
+  /* if this reaches 64, need to widen struct pta flags below */
 };
 
   static struct pta
@@ -2926,7 +2926,7 @@ ix86_option_override_internal (bool main_args_p)
   const char *const name;		/* processor name or nickname.  */
   const enum processor_type processor;
   const enum attr_cpu schedule;
-  const unsigned /*enum pta_flags*/ flags;
+  const unsigned HOST_WIDE_INT /*enum pta_flags*/ flags;
 }
   const processor_alias_table[] =
 {
@@ -24016,7 +24016,7 @@ static GTY(()) tree ix86_builtins[(int) IX86_BUILTIN_MAX];
 struct builtin_isa {
   const char *name;		/* function name */
   enum ix86_builtin_func_type tcode; /* type to use in the declaration */
-  int isa;			/* isa_flags this builtin is defined for */
+  HOST_WIDE_INT isa;		/* isa_flags this builtin is defined for */
   bool const_p;			/* true if the declaration is constant */
   bool set_and_not_built_p;
 };
@@ -24041,7 +24041,8 @@ static struct builtin_isa ix86_builtins_isa[(int) IX86_BUILTIN_MAX];
errors if a builtin is added in the middle of a function scope.  */
 
 static inline tree
-def_builtin (int mask, const char *name, enum ix86_builtin_func_type tcode,
+def_builtin (HOST_WIDE_INT mask, const char *name,
+	 enum ix86_builtin_func_type tcode,
 	 enum ix86_builtins code)
 {
   tree decl = NULL_TREE;
@@ -24079,7 +24080,7 @@ def_bu

Re: RFC: PATCH: Require and use int64 for x86 options

2011-07-27 Thread Uros Bizjak
On Wed, Jul 27, 2011 at 6:42 PM, H.J. Lu  wrote:

>>> As you may see pta_flags enum in i386.c is almost full. So there is a
>>> risk of overflow in quite near future. Comment in source code advises
>>> "widen struct pta flags" which is now defined as unsigned. But it
>>> looks not optimal.
>>>
>>> What will be the most proper solution for this problem?
>>
>> Why is widening pta_flags "not optimal?"
>>
>> It's hard for me to believe that we still care about bootstrapping a
>> i386-*-* compiler with a compiler which doesn't support any 64-bit type.
>> So I don't see any problem with setting need_64bit_hwint=yes in
>> config.gcc for i386-*-*, changing pta_flags to be unsigned
>> HOST_WIDE_INT, and letting pta_flags go up to (unsigned HOST_WIDE_INT) 1
>> << 63.
>>
>> If anybody doesn't like that idea, we can simply add a flags2 field and
>> a pta_flags2 enum with PTA2_xxx constants.
>>
>
> Hi,
>
> We are also running out of bits in ix86_isa_flags.  This patch uses
> int64 on both ix86_isa_flags and PTA.  I added a new option to opt:
>
> ; Maximum number of mask bits in a variable.
> MaxMaskBits
> ix86_isa_flags = 64
>
> It mark ix86_isa_flags as 64bit.  Any comments?

We should just introduce ix86_isa_flags2.  We shouldn't stop at 128 flags. ;)

Uros.


Re: RFC: PATCH: Require and use int64 for x86 options

2011-07-27 Thread H.J. Lu
On Wed, Jul 27, 2011 at 10:00 AM, Uros Bizjak  wrote:
> On Wed, Jul 27, 2011 at 6:42 PM, H.J. Lu  wrote:
>
>>>> As you may see pta_flags enum in i386.c is almost full. So there is a
>>>> risk of overflow in quite near future. Comment in source code advises
>>>> "widen struct pta flags" which is now defined as unsigned. But it
>>>> looks not optimal.
>>>>
>>>> What will be the most proper solution for this problem?
>>>
>>> Why is widening pta_flags "not optimal?"
>>>
>>> It's hard for me to believe that we still care about bootstrapping a
>>> i386-*-* compiler with a compiler which doesn't support any 64-bit type.
>>> So I don't see any problem with setting need_64bit_hwint=yes in
>>> config.gcc for i386-*-*, changing pta_flags to be unsigned
>>> HOST_WIDE_INT, and letting pta_flags go up to (unsigned HOST_WIDE_INT) 1
>>> << 63.
>>>
>>> If anybody doesn't like that idea, we can simply add a flags2 field and
>>> a pta_flags2 enum with PTA2_xxx constants.
>>>
>>
>> Hi,
>>
>> We are also running out of bits in ix86_isa_flags.  This patch uses
>> int64 on both ix86_isa_flags and PTA.  I added a new option to opt:
>>
>> ; Maximum number of mask bits in a variable.
>> MaxMaskBits
>> ix86_isa_flags = 64
>>
>> It mark ix86_isa_flags as 64bit.  Any comments?
>
> We should just introduce ix86_isa_flags2.  We shouldn't stop at 128 flags. ;)
>

It is used to control which insns are available. See
OPTION_MASK_ISA_XXX_SET and OPTION_MASK_ISA_XXX_UNSET
in common/config/i386/i386-common.c.  Adding ix86_isa_flags2
makes it very complicated:

1. We need to turn on a set of ISAs for -mXXX.
2. We need to turn off a set of ISAs for -mno-XXX.
3. We need to check 2 fields in def_builtin to see if an insn is
available.
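The three points can be sketched by comparing a single 64-bit mask with
a split pair of 32-bit fields.  This is a toy model only: the helper
names are hypothetical, uint64_t stands in for HOST_WIDE_INT, and the
"all requested bits set" test is an assumption, not the actual check in
def_builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Single 64-bit field: each operation is one expression.  */
static void
toy_enable (uint64_t *flags, uint64_t set)
{
  *flags |= set;            /* -mXXX */
}

static void
toy_disable (uint64_t *flags, uint64_t unset)
{
  *flags &= ~unset;         /* -mno-XXX */
}

static int
toy_available_p (uint64_t flags, uint64_t mask)
{
  return (flags & mask) == mask;
}

/* Split fields: every set of ISAs needs two masks, and every check
   has to look at both words.  */
struct toy_isa_two
{
  uint32_t flags;
  uint32_t flags2;
};

static int
toy_available2_p (const struct toy_isa_two *isa, uint32_t mask,
                  uint32_t mask2)
{
  return (isa->flags & mask) == mask && (isa->flags2 & mask2) == mask2;
}
```

With two fields, each SET and UNSET group would also have to be split
into a pair of masks, one per word.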

As a side benefit, need_64bit_hwint=yes will resolve
many 32-bit code generation differences between ia32 and x86-64 hosts.
We can close a bunch of bugs, like

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43226



-- 
H.J.


Re: Romain Geissler copyright assignment

2011-07-27 Thread Romain Geissler
On 26 Jul 2011, at 16:45, Yvan ROUX wrote:
> Hi,
> 
> Romain is doing an internship at STMicroelectronics on GCC plugins, and
> as his mentor, I confirm and/or inform you that he is covered by the
> copyright assignment RT 211150 between ST and FSF.
> 
> Regards.
> 
> -- 
> Yvan ROUX 
> STMicroelectronics

Hi,

As an intern, I'm already covered by a copyright assignment from
STMicroelectronics, but do I also need one from my school (Ensimag: École
nationale supérieure d’informatique et de mathématiques appliquées de
Grenoble (France))? If yes, how do I proceed to get one?

Regards

Romain Geissler.



[gnu.org #702521] Re: Romain Geissler copyright assignment

2011-07-27 Thread Donald R Robertson III via RT
> [romain.geiss...@gmail.com - Wed Jul 27 14:05:48 2011]:
> 
> On 26 Jul 2011, at 16:45, Yvan ROUX wrote:
> > Hi,
> > 
> > Romain is doing an internship at STMicroelectronics on GCC plugins, and
> > as his mentor, I confirm and/or inform you that he is covered by the
> > copyright assignment RT 211150 between ST and FSF.
> > 
> > Regards.
> > 
> > -- 
> > Yvan ROUX 
> > STMicroelectronics
> 
> Hi,
> 
> As an intern, I'm already covered by a copyright assignment from
> STMicroelectronics, but do I also need one from my school (Ensimag: École
> nationale supérieure d’informatique et de mathématiques appliquées de
> Grenoble (France))?  If yes, how do I proceed to get one?

I would assume that at most we would need just a disclaimer from the
university (included as text at the bottom of this email). If the school
would have a claim to owning your work (perhaps because they are paying
for it or it is being done as part of a course) then we would need an
assignment. Otherwise, you can just print out the disclaimer, give it to
the appropriate person at the school to fill out and sign, and then send
a scanned copy back to me. If you are not sure who the proper person
would be to ask to sign it, talk to your department chair; s/he should
know. Thank you so much for contributing, and I hope to hear from you
soon.

> 
> Regards
> 
> Romain Geissler.
> 
> 
> 
> 
-- 
Sincerely,

Donald R. Robertson, III, J.D.
Assignment Administrator
Free Software Foundation
51 Franklin Street, Fifth Floor
Boston, MA 02110
Phone +1-617-542-5942 
Fax +1-617-542-2652

---

If you are a student at a college or university where, due to general
policy or a specific agreement which you signed, a claim might be made
by the institution to your work or to copyrights or patents arising from
it, then you and we need a signed piece of paper from the school
disclaiming rights to your changes.

Here is a disclaimer you can get signed to cover your future
changes to GNU software, as well as your past changes.

The disclaimer should be signed by someone authorized to license
patents and copyrights for the school; you should find out who has
that authorization.  It may be a specific office that deals with
patent, copyright, and trademark issues, it may be the school's lawyer
or president, or (hopefully) some more accessible official.  If you
can't get at them, anyone else authorized to issue licenses for
software produced there will do.

Much of this disclaimer consists of a description of which kinds of
work it applies to.  This description is just a suggestion--it can be
replaced with any description that says clearly which kinds of work
they want to disclaim and which kinds they don't.  Paragraph (b) is
optional; they can delete it if they wish.

If the school says they do have a claim that could conflict with the
use of the program, then please put us in touch with a suitable
representative of the school, so that we can negotiate what to do
about it.  Send a note about the issue to the school representative
and to ass...@gnu.org.

IMPORTANT: When you talk to the school representative, *no matter what
instructions they have given you*, don't fail to show them the sample
disclaimer below, or a disclaimer with the details filled in for your
specific case.  Schools are often willing to sign a disclaimer
without any fuss.  If you make your request less specific, you may
cloud the issue and cause a long and unnecessary delay.

Please keep a copy of the school's signed disclaimer, and snail the
original to us at

Attn: Disclaimer Clerk
Free Software Foundation
51 Franklin Street, Fifth Floor
Boston, MA 02110
USA

DISCLAIMER OF RIGHTS BY A COLLEGE OR UNIVERSITY

   We agree that software and other authored works of the 'Released
Category' (defined below), made by _, a
student or graduate student at this school, prior to the date of this
document, and for _ years thereafter, are freely assignable by
said student to Free Software Foundation (FSF) for distribution and
sharing under its free software policies.  We disclaim any status as
the author or owner of such works; we do not consider them as works 
made for hire for us.

  The Released Category comprises

(a) changes and enhancements to software already (as of the time such
change or enhancement is made) freely circulating under stated terms
permitting public redistribution, whether in the public domain, or
under the FSF's GNU General Public License, or under the FSF's GNU
Lesser General Public License (a.k.a. the GNU Library General Public
License), or under other such terms; and

(b) operating system components for operating systems providing
substantially the same functionality as portions of UNIX, BSD,
Microsoft Windows, or other popular operating systems.
The Released Category excludes __ [if 'none',
please so state; thank you--FSF].

   We affirm that we will do nothing in the future to undermine this
release.  If we have or acquire 

ANN: gcc-python-plugin 0.5

2011-07-27 Thread David Malcolm
gcc-python-plugin is a plugin for GCC 4.6 onwards which embeds the
CPython interpreter within GCC, allowing you to write new compiler
warnings in Python, generate code visualizations, etc.

Tarball releases are available at:
  https://fedorahosted.org/releases/g/c/gcc-python-plugin/

Prebuilt documentation can be seen at:
  http://readthedocs.org/docs/gcc-python-plugin/en/latest/index.html

Project homepage:
  https://fedorahosted.org/gcc-python-plugin/

High level summary of the changes since the initial announcement:

  - new contributors

  - lots of bug fixes and compatibility fixes (e.g. for Python 3): the
    selftest suite now works for me with all eight different
    combinations of:
      - optimized vs debug builds of Python
      - Python 2.7 vs Python 3.2
      - i686 and x86_64
    building against gcc-4.6.1 (also tested with gcc-4.6.0)

  - new example scripts; see:
http://readthedocs.org/docs/gcc-python-plugin/en/latest/examples.html

  - if PLUGIN_PYTHONPATH is defined at build time, hardcode the value
into the plugin's sys.path, allowing multiple builds to be
independently packaged

  - more documentation

  - work-in-progress on detecting reference-count errors in C Python
    extension code.  Although this usage example can now detect errors,
    it isn't yet ready for general use.  It can generate HTML
    visualizations of those errors; see
      http://dmalcolm.livejournal.com/6560.html
    for examples.

  - numerous other improvements (see below)

  - new dependency: the "six" module is required at both build time and
run-time, to smooth over Python 2 vs Python 3 differences:

   http://pypi.python.org/pypi/six/

I've also packaged the plugin in RPM form for Fedora 16 onwards; see:
  https://fedoraproject.org/wiki/Features/GccPythonPlugin

Enjoy!
Dave

Detailed change notes follow

Version 0.5
===
David Malcolm (7):
  Override all locale information with LC_ALL=C when running
selftests
  Revamp support for options in selftests
  Add note about ccache
  Improvements to the example scripts
  Split up examples within the docs
  Fix gcc.Pass.__repr__

Version 0.4
===
David Malcolm (10):
  add explicit BR on gmp-devel (rhbz#725569)
  Make the test suites be locale-independent
  Suppress buffering of output in Python 3
  run-test-suite.py: support excluding tests from a run
  Python 3 fixes to testcpychecker.py
  Add 'str_no_uid' attribute to gcc.Tree and gcc.Gimple; use it to
fix a selftest
  Fix segfault seen on i686 due to erroneous implementation of
'pointer' attribute of gcc.TypeDecl
  Selftest fixes and exclusions for 32-bit builds
  Fix the test for 32/64-bit in selftests so that it works with both
Python 2 and 3

Version 0.3
===
David Malcolm (3):
  If PLUGIN_PYTHONPATH is defined at build time, hardcode it into
the plugin's sys.path
  Python 3 fixes
  Add the beginnings of a manpage for gcc-with-python

Version 0.2
===
Alexandre Lissy (2):
  fix: permerror() misusage
  Using Freedesktop standard for image viewing

David Malcolm (98):
  Introduce gcc.Parameter and gcc.get_parameters()
  Introduce a compatibility header file
  Document the 'basic_blocks' attribute of gcc.Cfg
  Fix a mismatch between gccutils.pformat() and the API docs
  Add note about debugging
  Move the debugging information to be more prominent, and reword
  Automatically supply the correct header search directory for
selftests that #include 
  Set up various things in sys, including sys.path
  Cope with calls to function pointers in the arg checker
  Remove stray import
  Fix issue with PyArg_ParseTuple("K") seen compiling gdb
  Fix erroneous error messages for the various "s" and "z" format
codes
  Format codes "U" and "S" can support several different argument
types
  Add a way of turning of const-correctness for "const char*"
checking
  Fix breakage of the various "es" and "et" format codes introduced
in last commit
  Implement verification of the "O&" format code (converter
callback, followed by appropriate arg)
  Use newlines and indentation to try to make the PyArg_ error
messages more readable
  Add the example from my blog post
( http://dmalcolm.livejournal.com/6364.html )
  Remove redundant (and non-functioning) selftest for "O&" format
code
  Add 'local_decls', 'start', 'end', 'funcdef_no' to gcc.Function
  Add 'arguments' and 'result' to gcc.FunctionDecl
  Fix typos in docs
  Add 'operand' to gcc.Unary; check that the keywords table to
PyArg_ has a NULL terminator
  Add 'location' to more tcc types; fill out more documentation
  Start building out examples of C syntax vs how it's seen by the
Python API
  Fix the behavior of the various "#" format codes.
  Add Alexandre Lissy to contributors
  The various "e" codes can accept NULL as the encoding
  Add s

Re: RFC: PATCH: Require and use int64 for x86 options

2011-07-27 Thread Joseph S. Myers
On Wed, 27 Jul 2011, H.J. Lu wrote:

> ; Maximum number of mask bits in a variable.
> MaxMaskBits
> ix86_isa_flags = 64
> 
> It marks ix86_isa_flags as 64-bit.  Any comments?

The patch won't work as is.  set_option, for example, casts a pointer to 
(int *), and stores a mask that came from option->var_value, which is an 
int, so this won't work with option fields not of type int or values that 
don't fit in int; you'd need to check all uses of CLVC_BIT_CLEAR and 
CLVC_BIT_SET in the source tree to adapt things for the possibility of 
wider mask fields, and track the type of each such field.
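
To illustrate what "track the type of each such field" could mean, here
is a minimal sketch, not the actual option-handling code.  The
mask_width tag, the mask_field struct and update_mask are all
hypothetical names; the sketch also assumes unsigned int is at least 32
bits wide.  The point is that the setter must know the width of the
variable behind the pointer, and must refuse a mask that cannot be
represented in the narrow case instead of silently dropping its high
bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag recording how wide a flag variable is.  */
enum mask_width { MASK_WIDTH_INT, MASK_WIDTH_INT64 };

/* Hypothetical description of one mask field.  */
struct mask_field {
  enum mask_width width;
  void *var;   /* points to an unsigned int or a uint64_t */
};

/* Return 1 if MASK survives a round trip through 32 bits.  */
int mask_fits_in_32_bits(uint64_t mask)
{
  return (uint64_t) (uint32_t) mask == mask;
}

/* Set (SET != 0) or clear (SET == 0) MASK in the variable described
   by F.  Return 0 on success, or -1 if MASK is too wide for a field
   that is only int-sized.  */
int update_mask(struct mask_field f, uint64_t mask, int set)
{
  assert(f.var != NULL);

  if (f.width == MASK_WIDTH_INT64) {
    uint64_t *wide = f.var;
    *wide = set ? (*wide | mask) : (*wide & ~mask);
    return 0;
  }

  if (!mask_fits_in_32_bits(mask))
    return -1;   /* would lose the high bits */

  unsigned int *narrow = f.var;
  *narrow = set ? (*narrow | (unsigned int) mask)
                : (*narrow & ~(unsigned int) mask);
  return 0;
}
```

With this shape, a mask such as (uint64_t) 1 << 40 is accepted for a
64-bit field and rejected for an int-sized one, which is the case an
unconditional cast to (int *) cannot detect.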

Independently, I approve of setting need_64bit_hwint for all x86 targets, 
but your patch doesn't achieve the expected simplification.  In 
config.gcc, there are settings for various individual targets that should 
be removed once it's set in one place for all x86 targets.  In 
libcpp/configure.ac, similarly the cases for i[34567]86-*-darwin* 
i[34567]86-*-solaris2.1[0-9]* x86_64-*-solaris2.1[0-9]* 
i[34567]86-w64-mingw* i[34567]86-*-linux* (the last only if 
--enable-targets=all) should all be removed as obsolete once 
i[34567]86-*-* is there along with x86_64-*-*.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: RFC: PATCH: Require and use int64 for x86 options

2011-07-27 Thread H.J. Lu
On Wed, Jul 27, 2011 at 2:23 PM, Joseph S. Myers wrote:
> On Wed, 27 Jul 2011, H.J. Lu wrote:
>
>> ; Maximum number of mask bits in a variable.
>> MaxMaskBits
>> ix86_isa_flags = 64
>>
>> It marks ix86_isa_flags as 64-bit.  Any comments?
>
> The patch won't work as is.  set_option, for example, casts a pointer to
> (int *), and stores a mask that came from option->var_value, which is an
> int, so this won't work with option fields not of type int or values that
> don't fit in int; you'd need to check all uses of CLVC_BIT_CLEAR and
> CLVC_BIT_SET in the source tree to adapt things for the possibility of
> wider mask fields, and track the type of each such field.

We will prepare a separate patch.

> Independently, I approve of setting need_64bit_hwint for all x86 targets,
> but your patch doesn't achieve the expected simplification.  In
> config.gcc, there are settings for various individual targets that should
> be removed once it's set in one place for all x86 targets.  In
> libcpp/configure.ac, similarly the cases for i[34567]86-*-darwin*
> i[34567]86-*-solaris2.1[0-9]* x86_64-*-solaris2.1[0-9]*
> i[34567]86-w64-mingw* i[34567]86-*-linux* (the last only if
> --enable-targets=all) should all be removed as obsolete once
> i[34567]86-*-* is there along with x86_64-*-*.
>

Is this patch OK for trunk?

Thanks.

H.J.

gcc/

2011-07-27  H.J. Lu  

* config.gcc: Set need_64bit_hwint to yes for x86 targets.

libcpp/

2011-07-27  H.J. Lu  

* configure.ac: Set need_64bit_hwint to yes for x86 targets.
* configure: Regenerated.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d7cf895..02cc556 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -345,6 +345,7 @@ i[34567]86-*-*)
 	cpu_type=i386
 	c_target_objs="i386-c.o"
 	cxx_target_objs="i386-c.o"
+	need_64bit_hwint=yes
 	extra_options="${extra_options} fused-madd.opt"
 	extra_headers="cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
 		   pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
@@ -1211,7 +1212,6 @@ hppa[12]*-*-hpux11*)
 	fi
 	;;
 i[34567]86-*-darwin*)
-	need_64bit_hwint=yes
 	need_64bit_isa=yes
 	# Baseline choice for a machine that allows m64 support.
 	with_cpu=${with_cpu:-core2}
@@ -1293,7 +1293,6 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i
 esac
 			done
 			TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
-			need_64bit_hwint=yes
 			need_64bit_isa=yes
 			case X"${with_cpu}" in
 			Xgeneric|Xatom|Xcore2|Xcorei7|Xcorei7-avx|Xnocona|Xx86-64|Xbdver2|Xbdver1|Xbtver1|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx|Xathlon64-sse3|Xk8-sse3|Xopteron-sse3)
@@ -1415,7 +1414,6 @@ i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*)
 		tm_file="${tm_file} i386/x86-64.h i386/sol2-bi.h sol2-bi.h"
 		tm_defines="${tm_defines} TARGET_BI_ARCH=1"
 		tmake_file="$tmake_file i386/t-sol2-64"
-		need_64bit_hwint=yes
 		need_64bit_isa=yes
 		case X"${with_cpu}" in
 		Xgeneric|Xatom|Xcore2|Xcorei7|Xcorei7-avx|Xnocona|Xx86-64|Xbdver2|Xbdver1|Xbtver1|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx|Xathlon64-sse3|Xk8-sse3|Xopteron-sse3)
@@ -1478,7 +1476,6 @@ i[34567]86-*-mingw* | x86_64-*-mingw*)
 	xm_file=i386/xm-mingw32.h
 	case ${target} in
 		x86_64-*-* | *-w64-*)
-			need_64bit_hwint=yes
 			need_64bit_isa=yes
 			;;
 		*)
diff --git a/libcpp/configure b/libcpp/configure
index b453a7b..c400d23 100755
--- a/libcpp/configure
+++ b/libcpp/configure
@@ -7312,9 +7312,7 @@ case $target in
 	x86_64-*-* | \
 	ia64-*-* | \
 	hppa*64*-*-* | \
-	i[34567]86-*-darwin* | \
-	i[34567]86-*-solaris2.1[0-9]* | x86_64-*-solaris2.1[0-9]* | \
-	i[34567]86-w64-mingw* | \
+	i[34567]86-*-* | x86_64-*-solaris2.1[0-9]* | \
 	mips*-*-* | \
 	mmix-*-* | \
 	powerpc*-*-* | \
@@ -7324,13 +7322,6 @@ case $target in
 	spu-*-* | \
 	sh[123456789lbe]*-*-* | sh-*-*)
 		need_64bit_hwint=yes ;;
-	i[34567]86-*-linux*)
-		if test "x$enable_targets" = xall; then
-			need_64bit_hwint=yes
-		else
-			need_64bit_hwint=no
-		fi
-		;;
 	*)
 		need_64bit_hwint=no ;;
 esac
diff --git a/libcpp/configure.ac b/libcpp/configure.ac
index 170932c..e1d8851 100644
--- a/libcpp/configure.ac
+++ b/libcpp/configure.ac
@@ -150,9 +150,7 @@ case $target in
 	x86_64-*-* | \
 	ia64-*-* | \
 	hppa*64*-*-* | \
-	i[34567]86-*-darwin* | \
-	i[34567]86-*-solaris2.1[0-9]* | x86_64-*-solaris2.1[0-9]* | \
-	i[34567]86-w64-mingw* | \
+	i[34567]86-*-* | x86_64-*-solaris2.1[0-9]* | \
 	mips*-*-* | \
 	mmix-*-* | \
 	powerpc*-*-* | \
@@ -162,13 +160,6 @@ case $target in
 	spu-*-* | \
 	sh[123456789lbe]*-*-* | sh-*-*)
 		need_64bit_hwint=yes ;;
-	i[34567]86-*-linux*)
-		if test "x$enable_targets" = xall; then
-			need_64bit_hwint=yes
-		else
-			need_64bit_hwint=no
-		fi
-		;;
 	*)
 		need_64bit_hwint=no ;;
 esac


IRA vs CANNOT_CHANGE_MODE_CLASS, + 4.7 IRA regressions?

2011-07-27 Thread Sandra Loosemore

Consider this bit of code:

extern double a[20];

double test1 (int n)
{
  double accum = 0.0;
  int i;

  for (i=0; i [...]

mipsisa32r2-sde-elf-gcc -O3 -fno-inline -fno-unroll-loops -march=74kf1_1 -S abstest.c


With a GCC 4.6 compiler, this produces:
...
.L3:
        mtc1    $3,$f2
        ldc1    $f0,0($5)
        addiu   $5,$5,8
        mtc1    $2,$f3
        sub.d   $f2,$f2,$f0
        mfc1    $3,$f2
        bne     $5,$4,.L3
        mfc1    $2,$f3

        ext     $5,$2,0,31
        move    $4,$3
.L2:
        mtc1    $4,$f0
        j       $31
        mtc1    $5,$f1
...

This is terrible code, with all that pointless register-shuffling inside 
the loop -- what's gone wrong?  Well, the bit-twiddling expansion of 
"fabs" produced by optabs.c uses subreg expressions, and on MIPS 
CANNOT_CHANGE_MODE_CLASS disallows use of FP registers for integer 
operations.  And, when IRA sees that, it decides it cannot alloc "accum" 
to a FP reg at all, even if it obviously makes sense to put it there for 
the rest of its lifetime.
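
For readers unfamiliar with the bit-twiddling expansion, the following
is a minimal C sketch of the idea, not the code optabs.c generates.  It
assumes a 64-bit IEEE 754 double, and fabs_by_bits is a hypothetical
name.  The value is reinterpreted as an integer, the sign bit is masked
off, and the result is reinterpreted as a double again; the integer
step is what forces the value out of the FP registers on MIPS.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Absolute value computed with integer operations only.
   Assumes double is a 64-bit IEEE 754 value.  */
double fabs_by_bits(double x)
{
  uint64_t bits;

  assert(sizeof bits == sizeof x);

  memcpy(&bits, &x, sizeof bits);   /* view the double as an integer */
  bits &= ~((uint64_t) 1 << 63);    /* clear the sign bit */
  memcpy(&x, &bits, sizeof x);      /* view the integer as a double */
  return x;
}
```

On a 32-bit target the mask only touches the word holding the sign,
which matches the "ext ...,0,31" instruction in the listings: it keeps
the low 31 bits of that word and drops bit 31.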


On mainline trunk, things are even worse as it's spilling to memory, not 
just shuffling between registers:


.L3:
        ldc1    $f0,0($2)
        addiu   $2,$2,8
        sub.d   $f2,$f2,$f0
        bne     $2,$3,.L3
        sdc1    $f2,0($sp)

        lw      $2,0($sp)
        ext     $3,$2,0,31
        lw      $2,4($sp)
.L2:
        sw      $2,4($sp)
        sw      $3,0($sp)
        lw      $3,4($sp)
        lw      $2,0($sp)
        addiu   $sp,$sp,8
        mtc1    $3,$f0
        j       $31
        mtc1    $2,$f1

I've been experimenting with a patch to the MIPS backend to add 
define_insn_and_split patterns for floating-point abs -- the idea is to 
attach some constraints to the insns to tell IRA it needs a GP reg for 
these operations, so it can apply its usual cost analysis and reload 
logic instead of giving up.  Then the split to introduce the subreg 
expansion happens after reload when we already have the right register 
class.  This seems to work well enough on 4.6; for this particular 
example, I'm getting:


.L3:
        ldc1    $f2,0($2)
        addiu   $2,$2,8
        bne     $2,$4,.L3
        sub.d   $f0,$f0,$f2

        mfc1    $2,$f1
        ext     $2,$2,0,31
        j       $31
        mtc1    $2,$f1

However, the same patch on mainline is still giving spills to memory.  :-(

So, here's my question.  Is it worthwhile for me to continue this
approach of trying to make the MIPS backend smarter?  Or is the way IRA
deals with CANNOT_CHANGE_MODE_CLASS fundamentally broken and in need of
fixing in a target-independent way?  And/or is there some other
regression in IRA on mainline that's causing it to spill to memory when
it didn't in 4.6?


BTW, the unary "neg" operator has the same problem as "abs" on MIPS: it
can't use the hardware instruction because that does the wrong thing with
NaNs, and it can't twiddle the sign bit directly in a FP register.  With
both abs/neg now generating unnecessary memory spills, this seems like a
fairly important performance regression.


-Sandra