Question on multiplied address cost computation in ivopt

2013-02-22 Thread Bin.Cheng
Hi,
Function get_address_cost in ivopt computes the cost of a multiplied
(scaled) address with the code below:

First:
  rat = 1;
  for (i = 2; i <= MAX_RATIO; i++)
    if (multiplier_allowed_in_address_p (i, mem_mode, as))
      {
        rat = i;
        break;
      }

Then:
  if (rat_p)
    addr = gen_rtx_fmt_ee (MULT, address_mode, addr,
                           gen_int_mode (rat, address_mode));

What is the purpose of the first loop? It just finds the first allowed
ratio for an address, so the generated ADDR always uses the smallest
allowed ratio. Is that right?

For a target that doesn't support multiplied addresses, the generated
ADDR is (MULT reg, 1), whose cost is generally equal to that of an
address with a plain register. What is the meaning of this cost?
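To make the behaviour in question concrete, here is a minimal standalone sketch of the same search loop. It is not GCC code: the two predicates are hypothetical stand-ins for multiplier_allowed_in_address_p, and the value of MAX_RATIO is assumed for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RATIO 128  /* assumed bound, for illustration only */

/* Hypothetical target that accepts scale factors 2, 4 and 8 in an
   address (stand-in for multiplier_allowed_in_address_p).  */
static bool scaled_target_allows(int ratio)
{
    return ratio == 2 || ratio == 4 || ratio == 8;
}

/* Hypothetical target with no scaled addressing at all.  */
static bool plain_target_allows(int ratio)
{
    (void) ratio;
    return false;
}

/* Same shape as the loop quoted above: return the first (smallest)
   allowed ratio in 2..MAX_RATIO, or 1 when none is allowed.  */
static int first_allowed_ratio(bool (*allowed)(int))
{
    assert(allowed != NULL);
    int rat = 1;
    for (int i = 2; i <= MAX_RATIO; i++)
        if (allowed(i)) {
            rat = i;
            break;
        }
    return rat;
}
```

With these stand-ins, the first target yields a ratio of 2 even though 4 and 8 are also allowed, and the second target falls through with a ratio of 1, which corresponds to the (MULT reg, 1) case described above.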

Thanks very much.


--
Best Regards.


Re: C/C++ Option to Initialize Variables?

2013-02-22 Thread David Brown

On 18/02/13 18:08, Robert Dewar wrote:



Forgive me, but I don't see where anything is guaranteed to be zero'd
before use. I'm likely wrong somewhere since you disagree.


http://en.wikipedia.org/wiki/.bss


This is about what happens to work, and specifically notes that it is
not part of the C standard. There is a big difference between programs
that obey the standard, and those that don't but happen to work on some
systems. The latter programs have latent bugs that can definitely
cause trouble.

A properly written C program should avoid uninitialized variables, just
as a properly written Ada program should avoid them.

In GNAT, we have found the Initialize_Scalars pragma to be very useful
in finding uninitialized variables. It causes all scalars to be
initialized with a bit pattern that can be specified at
link time and modified at run time.

If you run a program with different patterns, it should give the same
result. If it does not, you have an uninitialized variable or other
non-standard aspect in your program which should be tracked down and
fixed.

Note that the BSS-is-always-zero guarantee often does not apply when
embedded programs are restarted, so it is by no means a universal
guarantee.




I believe the standards require that all statically allocated data 
without an explicit initialisation are initialised to a /value/ of zero. 
 The standards do not require values of zero to actually be zero bits - 
it is legal for null pointers, 0 integers, and 0.0 floats and doubles to 
have different representations.  It is also perfectly legal for the 
compiler to put uninitialised data somewhere other than ".bss".  But the 
standards do guarantee that a definition "int x;" has the same effect as 
"int x = 0;" - and similarly for all other statically allocated data.
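As a small illustration of that rule (a sketch, with object names made up for the example), the check below compares several uninitialised objects of static storage duration against zero and null. It tests the /values/, not the bit patterns.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope objects without an initialiser have static storage duration. */
static int counter;
static double ratio;
static char *name;
static struct { int id; void *next; } node;

/* Returns 1 when every object above compares equal to zero or null. */
static int statics_start_as_zero(void)
{
    static long local_static;  /* block-scope static: same rule applies */

    return counter == 0
        && ratio == 0.0
        && name == NULL
        && node.id == 0
        && node.next == NULL
        && local_static == 0;
}
```

On a conforming toolchain, statics_start_as_zero() returns 1 when called before anything writes to those objects.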



In embedded systems, any C startup code (the code that is run before 
main() is called) that does not clear the bss is dangerously broken.  (I 
know of no real-world embedded processors where 0 is /not/ represented 
as zero bits.)  There are a few toolchains that /are/ broken in this way 
- Texas Instruments' "Code Composer" is a prime example.  The manual 
mentions this briefly at one point - to paraphrase, "the C startup code 
does not clear the bss at startup.  We know this is against every C 
standard - but we never claimed to follow any particular C standard anyway".


So if you know that you will be working with broken compilers that don't 
follow such fundamental parts of the C standards, then you should not 
rely on the automatic zero initialisation.  But if you are using real 
working toolchains, then you (as an embedded programmer) /should/ rely 
on it - because it leads to smaller and faster code than explicitly 
initialising them to zero.  (Of course, sometimes you want to be 
explicit in initialising to zero for clarity of the program - this 
trumps efficiency every time.)





Re: Could not identify that register is clobbered already

2013-02-22 Thread Georg-Johann Lay

S, Pitchumani wrote:


From: Georg-Johann Lay

S, Pitchumani wrote:


I was analyzing an issue for avr target (gcc-4.7.2).

The issue is that an already-clobbered register is used after a
transformation in the post-reload pass.

insns after reload pass:

(set (reg:HI r24)
     (const:HI (plus:HI (symbol_ref:HI ("array"))
                        (const_int 4))))
...
(parallel (set (reg:HI r14)
               (and:HI (reg:HI r14)
                       (const_int 3)))
          (clobber (reg:QI r25)))
...
(set (reg:HI r28)
     (const:HI (plus:HI (symbol_ref:HI ("array"))
                        (const_int 4))))

After the post-reload pass, insn-3 is transformed as follows:

(set (reg:HI r28)
     (reg:HI r24))

This transformation happens in the reload_cse_move2add function.

Since r25 is clobbered in insn-2, the above transformation
(r28/29 <= r24/25) becomes incorrect.

You have a test case so that this can be reproduced?


Function 'move2add_use_add3_insn' sets only the r24 info for insn-1
instead of setting the info for both r24/25. Function 'validate_change'
checks only the r24 info for the insn-3 transformation.
Is it possible to identify the clobbered register and avoid the
transformation?
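The bookkeeping problem described above can be sketched with a toy model. This is not the data structure used by reload_cse_move2add; every name below is hypothetical. The point is only that a clobber of any register inside a multi-register value has to invalidate the whole recorded value, not just an entry keyed on the first register.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_REGS 32

/* What is known about a value whose lowest part lives in a hard register.
   nregs > 1 means the value also occupies the following registers. */
struct reg_value {
    bool valid;
    int nregs;      /* e.g. 2 for an HImode value in 8-bit registers */
    int symbol_id;  /* hypothetical id standing in for a symbol_ref */
    int offset;
};

static struct reg_value known[NUM_REGS];

/* Forget every recorded value that overlaps the clobbered register,
   not just a value that starts at it. */
static void record_clobber(int regno)
{
    assert(regno >= 0 && regno < NUM_REGS);
    for (int base = 0; base <= regno; base++)
        if (known[base].valid && base + known[base].nregs > regno)
            known[base].valid = false;
}

/* Record that registers [regno, regno + nregs) hold symbol + offset. */
static void record_set(int regno, int nregs, int symbol_id, int offset)
{
    assert(regno >= 0 && nregs >= 1 && regno + nregs <= NUM_REGS);
    for (int r = regno; r < regno + nregs; r++)
        record_clobber(r);  /* old values in these registers are gone */
    known[regno] = (struct reg_value){ true, nregs, symbol_id, offset };
}

/* Is the value symbol + offset still known to live in regno..? */
static bool holds(int regno, int nregs, int symbol_id, int offset)
{
    assert(regno >= 0 && regno < NUM_REGS);
    const struct reg_value *v = &known[regno];
    return v->valid && v->nregs == nregs
        && v->symbol_id == symbol_id && v->offset == offset;
}
```

In this model, after record_set(24, 2, 1, 4) followed by record_clobber(25), holds(24, 2, 1, 4) is false, so a later copy from r24/r25 would not be considered.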

Some passes assume that the frame pointer only spans one register, which
does not hold for the avr target where FP lives in R28/R29.

The attempt to introduce hard_frame_pointer was dropped because the
resulting code turned out to be unusably inefficient.  I can't find the
patch; AFAIR it is Denis' work, thus CCing him.

But without a test case nobody can tell...


This behavior is observed for the DejaGnu test case "gcc.dg/var-expand2.c"
when gcc-4.7.2 is used.


Options used: -O2 -funroll-loops -ffast-math -mmcu=atxmega128b1


The generated code is wrong.  You can file a PR.  Thanks.



gcc-4.6-20130222 is now available

2013-02-22 Thread gccadmin
Snapshot gcc-4.6-20130222 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.6-20130222/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch 
revision 196233

You'll find:

 gcc-4.6-20130222.tar.bz2 Complete GCC

  MD5=b07ff167ad5fa7f47100c4f35d41c1a3
  SHA1=019215f0e7a16fbab5587c089f895bbeabb423d1

Diffs from 4.6-20130215 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Propagating value ranges in ipa-cp and related ideas

2013-02-22 Thread Easwaran Raman
Hi,

Consider the following function

A *CheckNotNull (A *a_ptr) {
  if (a_ptr == NULL) {
    // Code with non-trivial code size
  }
  return a_ptr;
}

If this is invoked as CheckNotNull(&a), the inliner should be able to
infer that the predicate (a_ptr == 0) is false and estimate the size and
time based on that. For this to happen, there should be a way to
specify that the parameter is non-NULL. Is it reasonable to replace
trees corresponding to IPA_JF_CONST with a value_range and encode
non-NULL as an anti_range?
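A rough sketch of what such an encoding could look like, assuming a simplified range type. The struct, enum and function names here are made up for illustration and are not the ones used inside GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical range representation. */
enum range_kind { RANGE_VARYING, RANGE_RANGE, RANGE_ANTI_RANGE };

struct value_range {
    enum range_kind kind;
    intptr_t min, max;  /* inclusive; excluded interval for an anti-range */
};

enum tristate { KNOWN_FALSE, KNOWN_TRUE, UNKNOWN };

/* "Not NULL" is everything except the single value 0: ~[0, 0]. */
static struct value_range nonnull_range(void)
{
    return (struct value_range){ RANGE_ANTI_RANGE, 0, 0 };
}

/* A propagated constant is the degenerate range [c, c]. */
static struct value_range constant_range(intptr_t c)
{
    return (struct value_range){ RANGE_RANGE, c, c };
}

/* Evaluate the predicate "param == c" given what is known about param. */
static enum tristate equals_constant(struct value_range vr, intptr_t c)
{
    assert(vr.kind == RANGE_VARYING || vr.min <= vr.max);
    switch (vr.kind) {
    case RANGE_RANGE:
        if (c < vr.min || c > vr.max)
            return KNOWN_FALSE;
        return vr.min == vr.max ? KNOWN_TRUE : UNKNOWN;
    case RANGE_ANTI_RANGE:
        return (c >= vr.min && c <= vr.max) ? KNOWN_FALSE : UNKNOWN;
    default:
        return UNKNOWN;
    }
}
```

For the CheckNotNull(&a) call, equals_constant(nonnull_range(), 0) gives KNOWN_FALSE, which is what the size and time estimate would need. A plain constant is still expressible as a one-element range.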

This would mean that ipa-cp versions can not be created by just
initializing the parameter with the propagated constant. This leads to
another thought:  Versioning of functions based on values may not be
ideal in the context of cloning. Consider a similar function:

int foo (int n) {
  if (n == 0)
    {
      // Non-trivial amount of code
    }
  // Some more code, but no/few expressions involving N and a constant.
}

If ipa-cp-clone is on, we might end up with many clones of foo for
different non-zero values of N. Instead, it may be beneficial to
create just a single foo.non_zeroN and reap most of the benefits of
cloning. Cloning could be done primarily based on the number of
predicates of the function that are true at the call site. (Doing so
might also help avoid making clones of comdats file-local, as one
could generate the clone name based on the predicate and put it in its
comdat group.)
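Written out by hand, the idea might look like the sketch below. The function bodies are placeholders invented for the example, and foo_nonzero stands for what a single predicate-based clone could reduce to; it is not output from ipa-cp.

```c
#include <assert.h>

/* Placeholder for the "non-trivial amount of code" on the n == 0 path. */
static int zero_path_work(void)
{
    return 100;
}

/* Original function: every call carries the n == 0 test and its code. */
static int foo(int n)
{
    int acc = 0;
    if (n == 0)
        acc += zero_path_work();
    return acc + n * 2;  /* the rest barely depends on n's exact value */
}

/* One clone shared by all call sites where "n != 0" is known to hold,
   instead of one clone per distinct constant. */
static int foo_nonzero(int n)
{
    assert(n != 0);  /* the predicate the clone was specialised on */
    return n * 2;
}
```

Call sites passing 5, -3 or any other non-zero value could all be redirected to foo_nonzero, while calls with unknown or zero n keep using foo.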

Are these worth pursuing?

Thanks,
Easwaran


Re: Propagating value ranges in ipa-cp and related ideas

2013-02-22 Thread Xinliang David Li
On Fri, Feb 22, 2013 at 3:08 PM, Easwaran Raman wrote:
> Hi,
>
> Consider the following function
>
> A *CheckNotNull (A *a_ptr) {
>   if (a_ptr == NULL) {
> // Code with non-trivial code size
>   }
>   return a_ptr;
> }
>
> If this is invoked as CheckNotNull(&a), the inliner should be able to
> infer that  (a_ptr == 0) predicate is false and estimate the size and
> time based on that. For this to happen, there should be a way to
> specify that the  parameter is non-NULL.  Is it reasonable to replace
> trees corresponding to IPA_JF_CONST with a value_range and encode
> non-NULL as an anti_range?

I think this will be useful. Function splitting/outlining may or may
not happen for this case, so making inline size estimation smarter is a
good way to go. Also consider the following scenario
(interprocedural VRP):

int foo (int a)
{
  if (a > 0)
    {
      ...
    }
  ...
}

int bar ()
{
  if (a > 9)
    {
      foo (a);   // more precise size estimation.
    }
  else
    {
      foo (a + 1);
    }
  ...
}
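For the scenario above, the per-call-site reasoning could be sketched with plain integer intervals. This is an illustration under simplifying assumptions (inclusive bounds, saturating arithmetic); the type and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

struct int_range { int lo, hi; };  /* inclusive bounds */

enum verdict { ALWAYS_FALSE, ALWAYS_TRUE, MAYBE };

/* Evaluate the callee predicate "x > c" for every x in r. */
static enum verdict greater_than(struct int_range r, int c)
{
    assert(r.lo <= r.hi);
    if (r.lo > c)
        return ALWAYS_TRUE;
    if (r.hi <= c)
        return ALWAYS_FALSE;
    return MAYBE;
}

/* Range of x + 1, saturating at INT_MAX instead of overflowing. */
static struct int_range plus_one(struct int_range r)
{
    struct int_range out;
    out.lo = r.lo == INT_MAX ? INT_MAX : r.lo + 1;
    out.hi = r.hi == INT_MAX ? INT_MAX : r.hi + 1;
    return out;
}
```

In the branch guarded by a > 9 the argument range is [10, INT_MAX], so the callee's a > 0 test is always true and its body can be counted in full. In the else branch the argument a + 1 ranges over [INT_MIN + 1, 10], so the test stays undecided.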


More generally, can the following be handled?

int foo (int a, int b)
{
  if (a > b)
    {
      ...
    }
  ...
}

int bar ()
{
  if (a > b)
    {
      foo (a, b);   // more precise size estimation.
    }
  else
    {
      foo (b, a);
    }
  ...
}


>
> This would mean that ipa-cp versions can not be created by just
> initializing the parameter with the propagated constant. This leads to
> another thought:  Versioning of functions based on values may not be
> ideal in the context of cloning. Consider a similar function:
>
> int foo (int n) {
>   if (n == 0)
>   {
> // Non-trivial amount of code
>   }
>   // Some more code, but no/few expressions involving N and a constant.
> }
>
>  If ipa-cp-clone is on, we might end up with many clones for foo for
> different non-zero values for N. Instead, it may be beneficial to
> create just a single  foo.non_zeroN and reap most of the benefits of
> cloning. Cloning could be done primarily based on the number of
> predicates of the function that are true at the call site. (Doing so
> might also help avoid putting clone of comdats as file-locals as one
> could generate the clone name based on the predicate and put in its
> comdat group. )
>
> Are these worth pursuing?
>

IMO, yes. The more context you can pass to the callee for inline/clone
benefit/cost analysis, the better.  Other things include alias info,
alignment info etc.

David



> Thanks,
> Easwaran