macros and arguments

2012-10-16 Thread Mischa Baars

Hi,

Who will be fixing this? Macro arguments without brackets are not 
accepted by the assembler.


If I can be of any help, let me know.

Thanks,
Mischa.
.intel_syntax   noprefix

.global function

.code64

.macro  A   arg1, arg2

mov ax, \arg1
mov bx, \arg2

.endm

.macro  B

.set    i, 0

.rept   4

//  A   i + i, i * i
A   (i + i), (i * i)

.set    i, i + 1

.endr

.endm

function:

B



How to add pass filtering for -fopt-info

2012-10-16 Thread Sharad Singhai
Hi,

I added -fopt-info option in r191883. This option allows one to
quickly see high-level optimization info instead of scanning several
verbose dump files.

However, -fopt-info implementation is not complete as it dumps
info from all the passes. It would be nice to add high level
pass-level filtering to make it more useful. For example, if someone
is interested only in loop optimizations, they could say

  -fopt-info-optimized-loop ==> show me high-level info about
   optimized locations from all the loop passes.

To achieve this I am considering these two alternatives

A) Have an additional field in the 'struct opt_pass'. It is similar to
pass name (which is already used for generating the dump file name),
this new field will indicate the high-level name which can be used for
-fopt-info. For example,

struct gimple_opt_pass pass_loop_distribution =
{
 {
  GIMPLE_PASS,
  "ldist", /* name */
  "loop",  ===> New field indicating it is a "loop" optimization pass
  ...
 }
}

Multiple loop passes would have the same opt info name "loop" under
this scheme.

B) Have an additional method, something like 'register_opt_info ()'
which would associate a high-level name with the current pass.

So for example, the call to register would be inside the actual
execute method

tree_loop_distribution (void)
{
  ...
  register_opt_info ("loop");  ==> associate this pass with "loop"
optimizations
  ...
}

I think there are pros and cons of each.

A) has the benefit that each pass statically declares which high-level
optimization it belongs to, which is quite clear. However, the
disadvantage is that using an extra field would require a global
change as all the opt_pass static definitions would need to be
updated.

B) has the benefit that it is more flexible. The 'register_opt_info
()' needs to be called only on certain interesting passes, the other
passes would remain unchanged. Another advantage of B) is that a pass
could register for multiple high-level opt-info
names. (Though I don't know when it would be useful.)  The downside is
that it is more error prone as any opt info dumps would not be
possible until corresponding registration is done.

I would appreciate comments on either of these alternatives.  Perhaps
something else more appropriate for this purpose?

Thanks,
Sharad


New dump infrastructure

2012-10-16 Thread Sharad Singhai
Hi,

This is a solicitation for help in converting passes to use the new
dump infrastructure. More context below.

I have enhanced the dump infrastructure in r191883, r191884. These
patches updated the tree/rtl dump facility so that passes do not
reference the dump file directly, but instead use a different (and
hopefully cleaner) API.

Instead of this

if (dump_file)
  fprintf (dump_file, ...);

the new style looks like this

if (dump_kind_p (...))
  dump_printf (...)

In addition, I also added a new option -fopt-info. This option allows
one to focus on high-level optimizations without scanning lots of
verbose tree/rtl dump files. Currently, the following categories of
optimization info are defined in dumpfile.c:

MSG_OPTIMIZED_LOCATIONS   /* -fopt-info optimized sources */
MSG_MISSED_OPTIMIZATION   /* missed opportunities */
MSG_NOTE  /* general optimization info */
MSG_ALL   /* Dump all available info */

The same dump API works for both regular dumps as well as -fopt-info
dumps. This is because the dump_kind_p () accepts a union of dump
flags. These flags include all of the TDF_* flags as well as newly
designed MSG_* flags.

For example, one could say

if (dump_kind_p (MSG_OPTIMIZED_LOCATIONS | TDF_BLOCKS))
  dump_printf (...);

This means that the info is dumped if either the -fopt-info-optimized or
the -ftree--blocks option is given. The dump files for these dumps
could be different, but individual passes do not need to worry about
that. It is handled transparently.

Another feature is that this new dump infrastructure allows dumps to
be redirected into command line named files (including stderr/stdout)
instead of auto generated filenames.

Since the number of existing dump call sites is quite large, currently
both old *and* new schemes are in use. The plan is to gradually
transition passes to use the new dump infrastructure and deprecate the
old dump style. This will also provide better optimization reports in
future.

Now I am asking for help. :)

Thus far I have converted the vectorization passes to use the new dump
scheme and output optimization details using -fopt-info. However, all
other passes need help. It would be great if you could help convert
your favorite pass (or two).

Thanks,
Sharad


Re: bounds checking in STL containers

2012-10-16 Thread Florian Weimer

On 10/15/2012 07:14 PM, Ахриев Альберт wrote:


Two-three generations ago g++ was very cautious about consistency checking but 
not now.


Not sure if this is true.  The at() member function performs such 
checking, but not operator[].


We looked at bounds checking for operator[] under -D_FORTIFY_SOURCE, but 
GCC isn't able to hoist the bounds check out of inner loops, so the 
impact is pretty significant:




--
Florian Weimer / Red Hat Product Security Team


Re: How to add pass filtering for -fopt-info

2012-10-16 Thread Richard Biener
On Tue, Oct 16, 2012 at 10:15 AM, Sharad Singhai  wrote:
> Hi,
>
> I added -fopt-info option in r191883. This option allows one to
> quickly see high-level optimization info instead of scanning several
> verbose dump files.
>
> However, -fopt-info implementation is not complete as it dumps
> info from all the passes. It would be nice to add high level
> pass-level filtering to make it more useful. For example, if someone
> is interested only in loop optimizations, they could say
>
>   -fopt-info-optimized-loop ==> show me high-level info about
>optimized locations from all the loop passes.
>
> To achieve this I am considering these two alternatives
>
> A) Have an additional field in the 'struct opt_pass'. It is similar to
> pass name (which is already used for generating the dump file name),
> this new field will indicate the high-level name which can be used for
> -fopt-info. For example,
>
> struct gimple_opt_pass pass_loop_distribution =
> {
>  {
>   GIMPLE_PASS,
>   "ldist", /* name */
>   "loop",  ===> New field indicating it is a "loop" optimization pass
>   ...
>  }
> }
>
> Multiple loop passes would have the same opt info name "loop" under
> this scheme.
>
> B) Have an additional method, something like 'register_opt_info ()'
> which would associate a high-level name with the current pass.
>
> So for example, the call to register would be inside the actual
> execute method
>
> tree_loop_distribution (void)
> {
>   ...
>   register_opt_info ("loop");  ==> associate this pass with "loop"
> optimizations
>   ...
> }
>
> I think there are pros and cons of each.
>
> A) has the benefit that each pass statically declares which high-level
> optimization it belongs to, which is quite clear. However, the
> disadvantage is that using an extra field would require a global
> change as all the opt_pass static definitions would need to be
> updated.
>
> B) has the benefit that it is more flexible. The 'register_opt_info
> ()' needs to be called only on certain interesting passes, the other
> passes would remain unchanged. Another advantage of B) is that a pass
> could register for multiple high-level opt-info
> names. (Though I don't know when it would be useful.)  The downside is
> that it is more error prone as any opt info dumps would not be
> possible until corresponding registration is done.
>
> I would appreciate comments on either of these alternatives.  Perhaps
> something else more appropriate for this purpose?

I don't like B); it is unlike everything else a pass does.  You seem to
use the new field to indicate a group, and a flat hierarchy
might be limiting (for example, 'vect' may include both loop
and scalar vectorization, but would 'loop' also include loop vectorization?).
Using a bitmask and an enum would be my preference for this reason
(similar to how we have TDF_ flags).  Loop vectorization would then
be vect|loop for example.

Richard.

> Thanks,
> Sharad


Questions about the dg-do directive

2012-10-16 Thread Dominique Dhumieres
These questions are motivated by the comments #4 to #15 of pr54407.

The bottom line is that

{ dg-do compile targets1 }
{ dg-do run targets2 }

behaves as

{ dg-do run { targets1 targets2 } }

while

{ dg-do run targets1 }
{ dg-do compile targets2 }

as

{ dg-do compile { targets1 targets2 } }

(1) Is the above correct?
(2) If yes, is it a (undocumented) feature or a bug?

While looking at the gcc.dg files, I have seen several instances of
these constructs. Most of them lack any target, so the first line is
probably ignored, but the tests gcc.dg/vect/vect-(82|83)_64.c
use it in:

/* { dg-do run { target { { powerpc*-*-* && lp64 } && powerpc_altivec_ok } } } */
/* { dg-do compile { target { { powerpc*-*-* && ilp32 } && powerpc_altivec_ok } } } */

They do not seem to work as designed: the tests are not run on
powerpc-apple-darwin9 with -m64.

(3) What should be done for that?

One way of doing a { dg-do run targets1 } and { dg-do compile targets2 } would
be to use the trick in gcc.dg/attr-weakref*, i.e., to duplicate the test: one
file to run and the other to compile only.

(4) Is there a better solution?

TIA

Dominique


Re: New dump infrastructure

2012-10-16 Thread Martin Jambor
Hi,

On Tue, Oct 16, 2012 at 01:21:29AM -0700, Sharad Singhai wrote:
> Hi,
> 
> This is a solicitation for help in converting passes to use the new
> dump infrastructure. More context below.

thanks for the email.  I hoped you'd summarize what the long thread
about this (that I lost track of) led to.  I'll be happy to convert
tree-sra and ipa-cp to the new framework.  Nevertheless, I have two
questions:

> 
> I have enhanced the dump infrastructure in r191883, r191884. These
> patches updated the tree/rtl dump facility so that passes do not
> reference the dump file directly, but instead use a different (and
> hopefully cleaner) API.
> 
> Instead of this
> 
> if (dump_file)
>   fprintf (dump_file, ...);
> 
> the new style looks like this
> 
> if (dump_kind_p (...))
>   dump_printf (...)

1. OK, I understand that e.g.

 if (dump_file && (dump_flags & TDF_DETAILS))

   should be converted into:
 
 if (dump_kind_p (TDF_DETAILS))

   But what about current code that does not care about dump_flags?
   E.g. converting simple

 if (dump_file) 

   to

 if (dump_kind_p (0))

   won't work, dump_kind_p will always return zero in such cases.

2. dump_kind_p seems to always return 0 if current_function_decl is
   NULL.  However, that precludes its use in IPA passes in which this
   can happen regularly.  Why is this restriction necessary?

Thanks,

Martin


> 
> In addition, I also added a new option -fopt-info. This option allows
> one to focus on high-level optimizations without scanning lots of
> verbose tree/rtl dump files. Currently, the following categories of
> optimization info are defined in dumpfile.c:
> 
> MSG_OPTIMIZED_LOCATIONS   /* -fopt-info optimized sources */
> MSG_MISSED_OPTIMIZATION   /* missed opportunities */
> MSG_NOTE  /* general optimization info */
> MSG_ALL   /* Dump all available info */
> 
> The same dump API works for both regular dumps as well as -fopt-info
> dumps. This is because the dump_kind_p () accepts a union of dump
> flags. These flags include all of the TDF_* flags as well as newly
> designed MSG_* flags.
> 
> For example, one could say
> 
> if (dump_kind_p (MSG_OPTIMIZED_LOCATIONS | TDF_BLOCKS))
>   dump_printf (...);
> 
> This means that dump the info if either the -fopt-info-optimized or
> -ftree--blocks options is given. The dump files for these dumps
> could be different, but individual passes do not need to worry about
> that. It is handled transparently.
> 
> Another feature is that this new dump infrastructure allows dumps to
> be redirected into command line named files (including stderr/stdout)
> instead of auto generated filenames.
> 
> Since the number of existing dump call sites is quite large, currently
> both old *and* new schemes are in use. The plan is to gradually
> transition passes to use the new dump infrastructure and deprecate the
> old dump style. This will also provide better optimization reports in
> future.
> 
> Now I am asking for help. :)
> 
> Thus far I have converted the vectorization passes to use the new dump
> scheme and output optimization details using -fopt-info. However, all
> other passes need help. It would be great if you could help convert
> your favorite pass (or two).
> 
> Thanks,
> Sharad


Re: New dump infrastructure

2012-10-16 Thread Richard Biener
On Tue, Oct 16, 2012 at 3:41 PM, Martin Jambor  wrote:
> Hi,
>
> On Tue, Oct 16, 2012 at 01:21:29AM -0700, Sharad Singhai wrote:
>> Hi,
>>
>> This is a solicitation for help in converting passes to use the new
>> dump infrastructure. More context below.
>
> thanks for the email.  I hoped you'd summarize what the long thread
> about this (that I lost track of) led to.  I'll be happy to convert
> tree-sra and ipa-cp to the new framework.  Nevertheless, I have two
> questions:
>
>>
>> I have enhanced the dump infrastructure in r191883, r191884. These
>> patches updated the tree/rtl dump facility so that passes do not
>> reference the dump file directly, but instead use a different (and
>> hopefully cleaner) API.
>>
>> Instead of this
>>
>> if (dump_file)
>>   fprintf (dump_file, ...);
>>
>> the new style looks like this
>>
>> if (dump_kind_p (...))
>>   dump_printf (...)
>
> 1. OK, I understand that e.g.
>
>  if (dump_file && (dump_flags & TDF_DETAILS))
>
>should be converted into:
>
>  if (dump_kind_p (TDF_DETAILS))
>
>But what about current code that does not care about dump_flags?
>E.g. converting simple
>
>  if (dump_file)
>
>to
>
>  if (dump_kind_p (0))
>
>won't work, dump_kind_p will always return zero in such cases.

Indeed.  I also wonder why dump_kind_p does not check whether dumping is
active at all, i.e., check dump_file / alternate_dump_file for NULL internally.

> 2. dump_kind_p seems to always return 0 if current_function_decl is
>NULL.  However, that precludes its use in IPA passes in which this
>can happen regularly.  Why is this restriction necessary?

Arguably a bug.  Not sure why it was done this way.

> Thanks,
>
> Martin
>
>
>>
>> In addition, I also added a new option -fopt-info. This option allows
>> one to focus on high-level optimizations without scanning lots of
>> verbose tree/rtl dump files. Currently, the following categories of
>> optimization info are defined in dumpfile.c:
>>
>> MSG_OPTIMIZED_LOCATIONS   /* -fopt-info optimized sources */
>> MSG_MISSED_OPTIMIZATION   /* missed opportunities */
>> MSG_NOTE  /* general optimization info */
>> MSG_ALL   /* Dump all available info */
>>
>> The same dump API works for both regular dumps as well as -fopt-info
>> dumps. This is because the dump_kind_p () accepts a union of dump
>> flags. These flags include all of the TDF_* flags as well as newly
>> designed MSG_* flags.
>>
>> For example, one could say
>>
>> if (dump_kind_p (MSG_OPTIMIZED_LOCATIONS | TDF_BLOCKS))
>>   dump_printf (...);
>>
>> This means that dump the info if either the -fopt-info-optimized or
>> -ftree--blocks options is given. The dump files for these dumps
>> could be different, but individual passes do not need to worry about
>> that. It is handled transparently.
>>
>> Another feature is that this new dump infrastructure allows dumps to
>> be redirected into command line named files (including stderr/stdout)
>> instead of auto generated filenames.
>>
>> Since the number of existing dump call sites is quite large, currently
>> both old *and* new schemes are in use. The plan is to gradually
>> transition passes to use the new dump infrastructure and deprecate the
>> old dump style. This will also provide better optimization reports in
>> future.
>>
>> Now I am asking for help. :)
>>
>> Thus far I have converted the vectorization passes to use the new dump
>> scheme and output optimization details using -fopt-info. However, all
>> other passes need help. It would be great if you could help convert
>> your favorite pass (or two).
>>
>> Thanks,
>> Sharad


Re: Questions about the dg-do directive

2012-10-16 Thread Andreas Schwab
domi...@lps.ens.fr (Dominique Dhumieres) writes:

> These questions are motivated by the comments #4 to #15 of pr54407.
>
> The bottom line is that
>
> { dg-do compile targets1 }
> { dg-do run targets2 }
>
> behaves as
>
> {dg-do run { targets1 targets2 } }
>
> while
>
> { dg-do run targets1 }
> { dg-do compile targets2 }
>
> as
>
> { dg-do compile { targets1 targets2 } }
>
> (1) Is the above correct?
> (2) If yes, is it a (undocumented) feature or a bug?

From dg.exp:

# Multiple instances are supported (since we don't support target and xfail
# selectors on one line), though it doesn't make much sense to change the
# compile/assemble/link/run field.  Nor does it make any sense to have
# multiple lines of target selectors (use one line).

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Line number information

2012-10-16 Thread Iyer, Balaji V
Hello Everyone,
I am trying to debug the trunk cc1 (revision 192483) and GDB is not
finding the line number information. I am using GDB 7.5. My OS is SuSE (not
sure if that matters). Is anyone else having this issue?

Thanks,

Balaji V. Iyer.




Re: Questions about the dg-do directive

2012-10-16 Thread Janis Johnson
On 10/16/2012 07:14 AM, Andreas Schwab wrote:
> domi...@lps.ens.fr (Dominique Dhumieres) writes:
> 
>> These questions are motivated by the comments #4 to #15 of pr54407.
>>
>> The bottom line is that
>>
>> { dg-do compile targets1 }
>> { dg-do run targets2 }
>>
>> behaves as
>>
>> {dg-do run { targets1 targets2 } }
>>
>> while
>>
>> { dg-do run targets1 }
>> { dg-do compile targets2 }
>>
>> as
>>
>> { dg-do compile { targets1 targets2 } }
>>
>> (1) Is the above correct?
>> (2) If yes, is it a (undocumented) feature or a bug?

That's just the way it works, so I suppose you could call it a feature.
DejaGnu doesn't support having different dg-do-what values for different
targets within a test, although it can be done from outside the test in
the .exp file.  Tests that try to do that are broken.

> From dg.exp:
> 
> # Multiple instances are supported (since we don't support target and xfail
> # selectors on one line), though it doesn't make much sense to change the
> # compile/assemble/link/run field.  Nor does it make any sense to have
> # multiple lines of target selectors (use one line).
> 
> Andreas.

I wouldn't rely on that feature, either.  We have much better ways in
local test directives to skip and xfail tests for different targets.

Janis 



Re: What happened to the IRA interprocedural reg-alloc work? (function_used_regs and friends)

2012-10-16 Thread Andi Kleen
Vladimir Makarov  writes:
>>
> As I remember, the performance improvement from this optimization was
> very small.  There were problems in reviewing IRA and I decided to
> simplify this task.
>
> May be it is worth to return to this work.

... especially if you could make it work with LTO.

-Andi


Re: What happened to the IRA interprocedural reg-alloc work? (function_used_regs and friends)

2012-10-16 Thread Steven Bosscher
On Tue, Oct 16, 2012 at 7:20 PM, Andi Kleen wrote:
> Vladimir Makarov  writes:
>>>
>> As I remember, the performance improvement from this optimization was
>> very small.  There were problems in reviewing IRA and I decided to
>> simplify this task.
>>
>> May be it is worth to return to this work.
>
> ... especially if you could make it work with LTO.

I'm going to (or more accurately: have started looking at) porting
Vlad's patch to the trunk. I'll probably have this finished next
weekend.

I've been thinking a lot about how this IPA-RA could be made more
effective with LTO, but I don't see any easy ways to do this.

Vlad's patch basically looks at what registers are really call-used
and call-clobbered in the 'final' pass where assembly is emitted. This
is the only way to know which registers will be in that set for each
function, you can't already know that when the function is still
represented in GIMPLE or non-strict RTL. LTO works on GIMPLE function
bodies and summaries are written well before RTL is generated,
registers have not been allocated.

With "basic" LTO there's no problem. Functions have to be compiled to
RTL in topological order to make things work, but as far as I know
this already happens (not sure, though). With WHOPR (which is IMHO the
only useful LTO mode in general), the set of call used/clobbered regs
cannot be known and streamed out as a summary for WHOPR because GCC
streams GIMPLE bodies, performs its WHOPR magic on GIMPLE, and
generates RTL only after WHOPR is done.

I suppose it's theoretically possible to make a good initial guess of
what registers might be not-clobbered by a function even if the ABI
says so. For instance, perhaps it's possible to assume that a function
that doesn't touch any variables in a floating point mode also doesn't
use/clobber any floating point registers. This assumption could be
propagated via LTO/WHOPR. If the function turns out to clobber
registers that were assumed to be untouched, you could just save and
restore them in the function ("callee saved" so to speak). But I don't
know how useful that would be.

Another, IMHO more interesting, thing to investigate would be
allocating global variables to registers. This is not part of Vlad's
original patch and I have no real ideas right now how to do that, but
it would be an interesting optimization for single-thread programs
(like GCC itself).

Ciao!
Steven


Re: Questions about the dg-do directive

2012-10-16 Thread Dominique Dhumieres
Thanks for the quick answer.

> That's just the way it works, so I suppose you could call it a feature.

So the answer to (1) is yes and to (2) it is a poorly documented feature.
May be the restriction to one dg-do directive should be added to
http://gcc.gnu.org/wiki/HowToPrepareATestcase .

In gcc/testsuite/* I have found 27 instances of such double directives,
most of them in the powerpc tests (gcc.target/powerpc/altivec*).
I can provide a list if it helps.

> ... We have much better ways in
> local test directives to skip and xfail tests for different targets.

Could you elaborate please? AFAIU skip or xfail do not allow to do
what was intended in the gcc.target/powerpc/altivec* cases for instance:
run for powerpc*-*-* && vmx_hw and compile for
powerpc*-*-* && { ! vmx_hw }.

Dominique


Re: What happened to the IRA interprocedural reg-alloc work? (function_used_regs and friends)

2012-10-16 Thread Andi Kleen
Steven Bosscher  writes:
>
> I suppose it's theoretically possible to make a good initial guess of
> what registers might be not-clobbered by a function even if the ABI
> says so. For instance, perhaps it's possible to assume that a function
> that doesn't touch any variables in a floating point mode also doesn't
> use/clobber any floating point registers. This assumption could be
> propagated via LTO/WHOPR. If the function turns out to clobber
> registers that were assumed to be untouched, you could just save and
> restore them in the function ("callee saved" so to speak). But I don't
> know how useful that would be.
>

There was a discussion on this some time ago. The conclusion was that
the partitioning should help: if the partitioning works right, callers
and callees that commonly call each other end up in the same partition,
and those need RA between themselves. Between partitions you couldn't do RA.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only


RFD: HAVE_* pattern flags

2012-10-16 Thread Joern Rennecke

Sorry for the recent loop-doloop.c breakage.  I did test it, but I didn't take
a day to re-test the two hundred configurations in config-list.mk and
sift out the pre-broken ports; as I had only changed 'target-independent'
code since the last full test, I only tested it on i686-pc-linux-gnu.
Which unfortunately did not cover the piece of code that was affected
by the merge failure.

I remember at some point we said that we wanted fewer #ifdefs and more if ()
tests, so that we get more uniform syntax / warning coverage when testing
one target.  One big blind spot left there is the
HAVE_xxx flags for instruction patterns, like HAVE_doloop_end
or HAVE_nonlocal_goto.  So, three questions:

- Is there consensus that we would like to change this?

- What would a good naming scheme be?
  - Change the semantics of the HAVE_pattern macros for officially named
patterns so that they are defined as 0 when the pattern is not provided?
That choice would actually force people to change #ifdef into if (),
without the possibility of #if, where targets can have non-constant
pattern predicates.
  - Have_pattern?
  - have_pattern?
  - any other preferences?

- how do we get the list of 'official' named patterns?
  - We could have a header file that is maintained by hand, with a string
of #ifdef / #define / #endif .
  - Or we could build the list automatically, something like grep for
@code{..} in md.texi, and check if at least one target defines a
pattern with that name.
  - introduce a special markup for named patterns in md.texi, and  
grep for that.

  - have some markup in the compiler source files that check the value
  - a special case of the previous option, mixed with a special choice for the
previous (naming) question: using a naming scheme that can be picked out
by the generator file, e.g. HAVE_named_pattern_nonlocal_goto_receiver.


Re: What happened to the IRA interprocedural reg-alloc work? (function_used_regs and friends)

2012-10-16 Thread Vladimir Makarov

On 12-10-16 5:49 PM, Steven Bosscher wrote:

On Tue, Oct 16, 2012 at 7:20 PM, Andi Kleen wrote:

Vladimir Makarov  writes:

As I remember, the performance improvement from this optimization was
very small.  There were problems in reviewing IRA and I decided to
simplify this task.

May be it is worth to return to this work.

... especially if you could make it work with LTO.

I'm going to (or more accurately: have started looking at) porting
Vlad's patch to the trunk. I'll probably have this finished next
weekend.

Ok.

I've been thinking a lot about how this IPA-RA could be made more
effective with LTO, but I don't see any easy ways to do this.

Vlad's patch basically looks at what registers are really call-used
and call-clobbered in the 'final' pass where assembly is emitted. This
is the only way to know which registers will be in that set for each
function, you can't already know that when the function is still
represented in GIMPLE or non-strict RTL. LTO works on GIMPLE function
bodies and summaries are written well before RTL is generated,
registers have not been allocated.

With "basic" LTO there's no problem. Functions have to be compiled to
RTL in topological order to make things work, but as far as I know
this already happens (not sure, though). With WHOPR (which is IMHO the
only useful LTO mode in general), the set of call used/clobbered regs
cannot be known and streamed out as a summary for WHOPR because GCC
streams GIMPLE bodies, performs its WHOPR magic on GIMPLE, and
generates RTL only after WHOPR is done.

I suppose it's theoretically possible to make a good initial guess of
what registers might be not-clobbered by a function even if the ABI
says so. For instance, perhaps it's possible to assume that a function
that doesn't touch any variables in a floating point mode also doesn't
use/clobber any floating point registers. This assumption could be
propagated via LTO/WHOPR. If the function turns out to clobber
registers that were assumed to be untouched, you could just save and
restore them in the function ("callee saved" so to speak). But I don't
know how useful that would be.

Another, IMHO more interesting, thing to investigate would be
allocating global variables to registers. This is not part of Vlad's
original patch and I have no real ideas right now how to do that, but
it would be an interesting optimization for single-thread programs
(like GCC itself).

By the way, although classic optimal RA is an NP-complete task, there is a
polynomial algorithm for interprocedural optimal RA (for some problem
formulations).  But still, I guess it is too expensive to use in a
production compiler.  If you are interested, here is the classical article:


Minimum Cost Interprocedural Register Allocation (1996)
by Steven M. Kurlander , Charles N. Fischer


Re: Questions about the dg-do directive

2012-10-16 Thread Janis Johnson
On 10/16/2012 03:31 PM, Dominique Dhumieres wrote:
> Thanks for the quick answer.
> 
>> That's just the way it works, so I suppose you could call it a feature.
> 
> So the answer to (1) is yes and to (2) it is a poorly documented feature.
> May be the restriction to one dg-do directive should be added to
> http://gcc.gnu.org/wiki/HowToPrepareATestcase .

And http://gcc.gnu.org/onlinedocs/gccint/Directives.html#Directives .

> In gcc/testsuite/* I have found 27 instances of such double directives
> most of the in the powerpc tests (gcc.target/powerpc/altivec*).
> I can provide a list if it helps.
> 
>> ... We have much better ways in
>> local test directives to skip and xfail tests for different targets.
> 
> Could you elaborate please? AFAIU skip or xfail do not allow to do
> what was intended in the gcc.target/powerpc/altivec* cases for instance:
> run for powerpc*-*-* && vmx_hw and compile for
> powerpc*-*-* && { ! vmx_hw }.

No, those would need to be done in separate tests, or via a different
.exp file (perhaps in a subdirectory) that sets dg-do-what-default
to "compile" or "run" depending on the target; there are several sets
of tests that already do that.

Janis


Re: macros and arguments

2012-10-16 Thread Andrew Haley
On 10/16/2012 12:45 AM, Mischa Baars wrote:

> Who will be fixing this? Macro arguments without brackets are not 
> accepted by the assembler.
> 
> If I can be of any help, let me know.

We still don't know what your problem is.

Provide us with an example that we can try.

You need to tell us:

What happens.
What you expect to happen.
Why you think the behaviour is wrong.

I have seen your earlier posting.  It did not help.

Andrew.




Re: New dump infrastructure

2012-10-16 Thread Georg-Johann Lay

Sharad Singhai schrieb:


I have enhanced the dump infrastructure in r191883, r191884. These
patches updated the tree/rtl dump facility so that passes do not
reference the dump file directly, but instead use a different (and
hopefully cleaner) API.

Instead of this

if (dump_file)
  fprintf (dump_file, ...);

the new style looks like this

if (dump_kind_p (...))
  dump_printf (...)

[...]

Since the number of existing dump call sites is quite large, currently
both old *and* new schemes are in use. The plan is to gradually
transition passes to use the new dump infrastructure and deprecate the
old dump style. This will also provide better optimization reports in
future.


How are dumps from the backend handled then?

For example, SPU uses fprintf to dump_file.

A backend can easily add additional information to dump files by using 
printf or putc or print_inline_rtx or implement whatever own %-codes to 
neatly print information.


How will that work after the old interface has been deprecated?
Will there be %-Hooks similar to targetm.print_operand?

Johann



Fwd: New dump infrastructure

2012-10-16 Thread Sharad Singhai
> 1. OK, I understand that e.g.
>
>  if (dump_file && (dump_flags & TDF_DETAILS))
>
>should be converted into:
>
>  if (dump_kind_p (TDF_DETAILS))
>
>But what about current code that does not care about dump_flags?
>E.g. converting simple
>
>  if (dump_file)
>
>to
>
>  if (dump_kind_p (0))
>
>won't work, dump_kind_p will always return zero in such cases.


Yes, you are right, the conversion is not completely mechanical and
some care must be taken to preserve the original intent. I think one
of the following might work in the case where a pass doesn't care
about the dump_flags

1. use generic pass type flags, such as TDF_TREE, TDF_IPA, TDF_RTL
which are guaranteed to be set depending on the pass type,
2. this dump statement might be a better fit for MSG_* flags if it
deals with optimizations. Sometimes the "if (dump_file) fprintf
(dump_file, ...)" idiom was used for these situations, and now these
sites might be perfect candidates for conversion to MSG_* flags.

If a cleaner way to handle this is desired, perhaps we can add an
unconditional "dump_printf_always (...)", but currently it seems
unnecessary. Note that for optimization messages which should always
be printed, one can use the MSG_ALL flag. However, no analogous flag
exists for regular dumps. How about adding a corresponding TDF_ALL
flag? Would that work?

>
>
> 2. dump_kind_p seems to always return 0 if current_function_decl is
>NULL.  However, that precludes its use in IPA passes in which this
>can happen regularly.  Why is this restriction necessary?


This is an oversight on my part. Originally, I wanted to be able to
print source location information and this is a remnant of that. I am
testing a patch to fix that.

Thanks,
Sharad


Re: New dump infrastructure

2012-10-16 Thread Sharad Singhai
> Indeed.  I also wonder why dump_kind_p does not check if dumping is
> active at all?  Thus, inside check dump_file / alternate dump_file for NULL.

I am testing a patch which includes a check for
dump_file/alternate_dump_file in dump_kind_p. This is in addition to
checking flags.

>> 2. dump_kind_p seems to always return 0 if current_function_decl is
>>NULL.  However, that precludes its use in IPA passes in which this
>>can happen regularly.  Why is this restriction necessary?
>
> Arguably a bug.  Not sure why it was done this way.

Yes, it is a bug. I am fixing this as well.

Thanks,
Sharad