Question on multiplied address cost computation in ivopt
Hi,

The function get_address_cost in ivopt computes the cost of a multiplied address with the code below. First:

    rat = 1;
    for (i = 2; i <= MAX_RATIO; i++)
      if (multiplier_allowed_in_address_p (i, mem_mode, as))
        {
          rat = i;
          break;
        }

Then:

    if (rat_p)
      addr = gen_rtx_fmt_ee (MULT, address_mode, addr,
                             gen_int_mode (rat, address_mode));

What is the purpose of the first loop? It just finds the first ratio allowed in an address, so the generated ADDR always has the minimal allowed ratio. Is that right?

For a target that doesn't support multiplied addresses, the generated ADDR is (MULT reg, 1). Its cost is generally equal to that of an address with a plain register. What is the meaning of this cost?

Thanks very much.

--
Best Regards.
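To make the behaviour of the quoted loop concrete, here is a minimal sketch in C. It is not GCC code: the two predicates are hypothetical stand-ins for multiplier_allowed_in_address_p, and the value of MAX_RATIO is an assumption made for the sketch. It only shows that the loop yields the smallest allowed ratio, or 1 when the target allows none.

```c
#include <stdbool.h>

#define MAX_RATIO 128  /* assumed bound, chosen for this sketch */

/* Hypothetical stand-in for multiplier_allowed_in_address_p: a toy
   target that accepts only the scale factors 2, 4 and 8.  */
bool toy_scaled_target (int ratio)
{
  return ratio == 2 || ratio == 4 || ratio == 8;
}

/* Hypothetical target without any scaled addressing mode.  */
bool toy_plain_target (int ratio)
{
  (void) ratio;
  return false;
}

/* Mirror of the quoted loop: return the smallest allowed ratio in
   [2, MAX_RATIO], or 1 when no ratio is allowed.  */
int first_allowed_ratio (bool (*allowed) (int))
{
  int rat = 1;
  for (int i = 2; i <= MAX_RATIO; i++)
    if (allowed (i))
      {
        rat = i;
        break;
      }
  return rat;
}
```

With toy_scaled_target the result is 2, the minimal allowed ratio. With toy_plain_target it stays 1, which corresponds to the (MULT reg, 1) case asked about above.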
Re: C/C++ Option to Initialize Variables?
On 18/02/13 18:08, Robert Dewar wrote:

Forgive me, but I don't see where anything is guaranteed to be zero'd before use. I'm likely wrong somewhere since you disagree.

http://en.wikipedia.org/wiki/.bss

This is about what happens to work, and specifically notes that it is not part of the C standard. There is a big difference between programs that obey the standard, and those that don't but happen to work on some systems. The latter programs have latent bugs that can definitely cause trouble.

A properly written C program should avoid uninitialized variables, just as a properly written Ada program should avoid them.

In GNAT, we have found the Initialize_Scalars pragma to be very useful in finding uninitialized variables. It causes all scalars to be initialized using a specified bit pattern that can be specified at link time, and modified at run-time. If you run a program with different patterns, it should give the same result, if it does not, you have an uninitialized variable or other non-standard aspect in your program which should be tracked down and fixed.

Note that the BSS-is-always-zero guarantee often does not apply when embedded programs are restarted, so it is by no means a universal guarantee.

I believe the standards require that all statically allocated data without an explicit initialisation are initialised to a /value/ of zero. The standards do not require values of zero to actually be zero bits - it is legal for null pointers, 0 integers, and 0.0 floats and doubles to have different representations. It is also perfectly legal for the compiler to put uninitialised data somewhere other than ".bss". But the standards do guarantee that a definition "int x;" has the same effect as "int x = 0;" - and similarly for all other statically allocated data.

In embedded systems, any C startup code (the code that is run before main() is called) that does not clear the bss is dangerously broken. (I know of no real-world embedded processors where 0 is /not/ represented as zero bits.)

There are a few toolchains that /are/ broken in this way - Texas Instruments's "Code Composer" is a prime example. The manual mentions this briefly at one point - to paraphrase, "the C startup code does not clear the bss at startup. We know this is against every C standard - but we never claimed to follow any particular C standard anyway".

So if you know that you will be working with broken compilers that don't follow such fundamental parts of the C standards, then you should not rely on the automatic zero initialisation. But if you are using real working toolchains, then you (as an embedded programmer) /should/ rely on it - because it leads to smaller and faster code than explicitly initialising them to zero. (Of course, sometimes you want to be explicit in initialising to zero for clarity of the program - this trumps efficiency every time.)
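As a small illustration of the guarantee discussed above, the following C sketch declares objects with static storage duration and no initialiser, then checks that each one starts with the value zero. The object and function names are made up for the example. On a conforming toolchain the check succeeds; on a toolchain whose startup code skips clearing this data, it could fail.

```c
#include <stdbool.h>
#include <stddef.h>

/* No initialisers: objects with static storage duration start with the
   value zero (a null pointer for pointers, 0.0 for floating types).  */
int demo_counter;
double demo_scale;
const char *demo_name;
static unsigned char demo_buffer[16];

/* Return true when every object above still holds its initial zero
   value.  Meant to be called before anything writes to them.  */
bool statics_start_as_zero (void)
{
  for (size_t i = 0; i < sizeof demo_buffer; i++)
    if (demo_buffer[i] != 0)
      return false;
  return demo_counter == 0 && demo_scale == 0.0 && demo_name == NULL;
}
```

The comparisons are written against values (0, 0.0, NULL) rather than against raw bytes, which matches the point that the standard promises a zero value, not an all-zero-bits representation.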
Re: Could not identify that register is clobbered already
S, Pitchumani wrote:

From: Georg-Johann Lay

S, Pitchumani wrote:

I was analyzing an issue for the avr target (gcc-4.7.2). The issue is that an already clobbered register is used after a transformation in the post-reload pass.

Insns after the reload pass:

    set (reg:HI r24 (const:HI (plus:HI (symbol_ref:HI ("array")) (const_int 4)) ))
    ...
    parallel
      set (reg:HI r14 (and:HI (reg:HI r14) (const_int 3)))
      clobber:QI r25
    ...
    set (reg:HI r28 (const:HI (plus:HI (symbol_ref:HI ("array")) (const_int 4)) ))

After the post-reload pass, insn-3 is transformed as follows:

    set (reg:HI r28 reg:HI r24)

This transformation happened in the reload_cse_move2add function. Since r25 is clobbered in insn-2, the above transformation (r28/29 <= r24/25) becomes incorrect.

You have a test case so that this can be reproduced?

Function 'move2add_use_add3_insn' sets only r24 info for insn-1 instead of setting info for both r24/25. Function 'validate_change' checks only r24 info for the insn-3 transformation. Is it possible to identify the clobbered register and avoid the transformation?

Some passes assume that the frame pointer only spans one register, which does not hold for the avr target where FP lives in R28/R29. Trying to introduce hard_frame_pointer was dropped because the code turned out to have unusably bad efficiency. I don't find the patch; AFAIR it is Denis' work, thus CCing him.

But without a test case nobody can tell...

This behavior is observed for the dejagnu test case "gcc.dg/var-expand2.c" when gcc-4.7.2 is used. Options used: -O2 -funroll-loops -ffast-math -mmcu=atxmega128b1

The generated code is wrong. You can file a PR.

Thanks.
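The core of the report is that an HImode value on AVR occupies two consecutive 8-bit hard registers, so a value recorded for r24 also depends on r25. A minimal sketch of the overlap test this implies is given below. The function name and its parameters are hypothetical and are not taken from GCC's reload or postreload code.

```c
#include <stdbool.h>

/* Hypothetical helper: each value is described by its first hard
   register number and the count of consecutive registers it covers
   (2 for an HImode value on AVR, 1 for a QImode clobber).
   Return true if the two register spans share at least one register.  */
bool regs_overlap (int first_a, int count_a, int first_b, int count_b)
{
  return first_a < first_b + count_b && first_b < first_a + count_a;
}
```

Treating the source as r24 alone, regs_overlap (24, 1, 25, 1) is false and the clobber of r25 goes unnoticed. Treating it as the pair r24/r25, regs_overlap (24, 2, 25, 1) is true, so reusing the recorded value would be rejected.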
gcc-4.6-20130222 is now available
Snapshot gcc-4.6-20130222 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.6-20130222/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch revision 196233

You'll find:

  gcc-4.6-20130222.tar.bz2    Complete GCC

    MD5=b07ff167ad5fa7f47100c4f35d41c1a3
    SHA1=019215f0e7a16fbab5587c089f895bbeabb423d1

Diffs from 4.6-20130215 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.6
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Propagating value ranges in ipa-cp and related ideas
Hi,

Consider the following function

A *CheckNotNull (A *a_ptr) {
  if (a_ptr == NULL) {
    // Code with non-trivial code size
  }
  return a_ptr;
}

If this is invoked as CheckNotNull(&a), the inliner should be able to infer that (a_ptr == 0) predicate is false and estimate the size and time based on that. For this to happen, there should be a way to specify that the parameter is non-NULL. Is it reasonable to replace trees corresponding to IPA_JF_CONST with a value_range and encode non-NULL as an anti_range?

This would mean that ipa-cp versions can not be created by just initializing the parameter with the propagated constant. This leads to another thought: Versioning of functions based on values may not be ideal in the context of cloning. Consider a similar function:

int foo (int n) {
  if (n == 0)
  {
    // Non-trivial amount of code
  }
  // Some more code, but no/few expressions involving N and a constant.
}

If ipa-cp-clone is on, we might end up with many clones for foo for different non-zero values for N. Instead, it may be beneficial to create just a single foo.non_zeroN and reap most of the benefits of cloning. Cloning could be done primarily based on the number of predicates of the function that are true at the call site. (Doing so might also help avoid putting clone of comdats as file-locals as one could generate the clone name based on the predicate and put in its comdat group.)

Are these worth pursuing?

Thanks,
Easwaran
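The anti-range idea can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: the types and functions below are hypothetical and much simpler than GCC's own value_range. The point is that the anti-range ~[0, 0] is enough to decide that a predicate such as (a_ptr == 0) is false at a call site.

```c
#include <stdint.h>

/* Hypothetical, simplified value range; not GCC's value_range type.  */
enum range_kind { RANGE_VARYING, RANGE_INSIDE, RANGE_ANTI };

struct simple_range
{
  enum range_kind kind;
  intptr_t min;   /* inclusive lower bound */
  intptr_t max;   /* inclusive upper bound */
};

enum tristate { KNOWN_FALSE, KNOWN_TRUE, UNKNOWN };

/* Evaluate the predicate "param == c" given what is known about PARAM.  */
enum tristate eval_eq_const (struct simple_range r, intptr_t c)
{
  switch (r.kind)
    {
    case RANGE_INSIDE:
      if (c < r.min || c > r.max)
        return KNOWN_FALSE;
      if (r.min == r.max)
        return KNOWN_TRUE;   /* singleton range equal to C */
      return UNKNOWN;
    case RANGE_ANTI:
      /* PARAM lies outside [min, max], so it cannot equal a C inside it.  */
      return (c >= r.min && c <= r.max) ? KNOWN_FALSE : UNKNOWN;
    default:
      return UNKNOWN;
    }
}

/* Anti-range ~[0, 0], i.e. the "non-NULL" encoding suggested above.  */
struct simple_range nonnull_range (void)
{
  struct simple_range r = { RANGE_ANTI, 0, 0 };
  return r;
}
```

A propagated constant fits the same structure as a singleton RANGE_INSIDE, so in this sketch one representation covers both the constant case and the non-NULL case.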
Re: Propagating value ranges in ipa-cp and related ideas
On Fri, Feb 22, 2013 at 3:08 PM, Easwaran Raman wrote:
> Hi,
>
> Consider the following function
>
> A *CheckNotNull (A *a_ptr) {
>   if (a_ptr == NULL) {
>     // Code with non-trivial code size
>   }
>   return a_ptr;
> }
>
> If this is invoked as CheckNotNull(&a), the inliner should be able to
> infer that (a_ptr == 0) predicate is false and estimate the size and
> time based on that. For this to happen, there should be a way to
> specify that the parameter is non-NULL. Is it reasonable to replace
> trees corresponding to IPA_JF_CONST with a value_range and encode
> non-NULL as an anti_range?

I think this will be useful. Function splitting/outlining may or may not happen for this case; making inline size estimation smarter is a good way to go. Also consider the following scenario (interprocedural VRP):

int foo (int a)
{
  if (a > 0)
    {
      ...
    }
  ...
}

int bar ()
{
  if (a > 9)
    {
      foo (a);  // more precise size estimation.
    }
  else
    {
      foo (a + 1);
    }
  ...
}

More generally, can the following be handled?

int foo (int a, int b)
{
  if (a > b)
    {
      ...
    }
  ...
}

int bar ()
{
  if (a > b)
    {
      foo (a, b);  // more precise size estimation.
    }
  else
    {
      foo (b, a);
    }
  ...
}

>
> This would mean that ipa-cp versions can not be created by just
> initializing the parameter with the propagated constant. This leads to
> another thought: Versioning of functions based on values may not be
> ideal in the context of cloning. Consider a similar function:
>
> int foo (int n) {
>   if (n == 0)
>   {
>     // Non-trivial amount of code
>   }
>   // Some more code, but no/few expressions involving N and a constant.
> }
>
> If ipa-cp-clone is on, we might end up with many clones for foo for
> different non-zero values for N. Instead, it may be beneficial to
> create just a single foo.non_zeroN and reap most of the benefits of
> cloning. Cloning could be done primarily based on the number of
> predicates of the function that are true at the call site. (Doing so
> might also help avoid putting clone of comdats as file-locals as one
> could generate the clone name based on the predicate and put in its
> comdat group. )
>
> Are these worth pursuing?
>

IMO, yes. The more context you can pass to the callee for inline/clone benefit/cost analysis, the better. Other things include alias info, alignment info etc.

David

> Thanks,
> Easwaran
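For the first foo/bar scenario in this reply, the reasoning a range-aware estimator would need can be sketched as follows. All names are hypothetical and the range arithmetic is deliberately simplified (the increment saturates at INT_MAX, which is an assumption of the sketch, not how GCC models overflow). In the branch guarded by a > 9 the argument lies in [10, INT_MAX], so the callee's test a > 0 is always true. In the else branch the argument a + 1 lies in [INT_MIN + 1, 10], so the test stays undecided.

```c
#include <limits.h>

/* Hypothetical integer range with inclusive bounds.  */
struct int_range { int min; int max; };

enum verdict { ALWAYS_FALSE, ALWAYS_TRUE, MAYBE };

/* Decide the predicate "x > bound" for every x in R.  */
enum verdict eval_greater_than (struct int_range r, int bound)
{
  if (r.min > bound)
    return ALWAYS_TRUE;
  if (r.max <= bound)
    return ALWAYS_FALSE;
  return MAYBE;
}

/* Range of "x + 1" for x in R.  Bounds saturate at INT_MAX so the
   sketch itself never overflows; this is a simplification.  */
struct int_range plus_one (struct int_range r)
{
  struct int_range out;
  out.min = r.min == INT_MAX ? INT_MAX : r.min + 1;
  out.max = r.max == INT_MAX ? INT_MAX : r.max + 1;
  return out;
}
```

The second scenario, which compares two parameters (a > b), needs a relation between two values rather than a range for each one, so this per-parameter sketch does not cover it.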