for getting millisecond resolution during profiling with gprof
Hi all, I want to profile an application running on Linux. I compiled with gcc's -pg option and then used gprof for profiling. However, the CPU time of each function is reported in seconds, and I want to analyse it in milliseconds. Can anybody help me? If any other methods or tools are available, please let me know. Jayaraj.p.b
RE: insns for register-move between general and floating
On 22 March 2006 00:41, Greg McGary wrote: > I'm working on a port that has instructions to move bits between > 64-bit floating-point and 64-bit general-purpose regs. I say "bits" > because there's no conversion between float and int: the bit pattern > is unaltered. Therefore, it's possible to use scratch FPRs for > spilling GPRs & vice-versa, and float<->int conversions need not go > through memory. > > Among all the knobs to turn regarding register classes, reload > classes, and modes+constraints on movM, floatMN2, fixMN2 patterns, > I need some advice on how to do this properly. Off the top of my head, you could take a look at the rs6000, which uses IIRC fp regs to move ints about in at least some circumstances[*] - sorry that I don't remember the full details, but you'll find at least something relevant there I think. The general principle would be that you'd have predicates on some insns that accept both fprs and gprs and you'd have predicates on others that accept only gprs and you teach reload how to move a pseudo between fpr and gpr reg classes and the rest pretty much takes care of itself IIUIC. HARD_REGNO_MODE_OK would come into play here as well. cheers, DaveK [*] ISTR that some of those circumstances (at least /used to/) include "when hard-float is disabled", which used to be a regular cause of surprise and confusion back round 2.95.x time... :) -- Can't think of a witty .sigline today
Re: Results for 4.2.0 20060320 (experimental) testsuite on powerpc-apple-darwin8.5.0 (-m64 results)
On Mar 21, 2006, at 11:39 PM, Shantonu Sen wrote: On Mar 21, 2006, at 12:34 PM, Bradley Lucier wrote: I'm curious about whether any of the changes recently proposed to clean up the x86-darwin port can be applied to the 64-bit PowerPC darwin compiler; Like what? I haven't really seen many cleanups that were x86/darwin- specific I was thinking of this thread http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01073.html But perhaps I misunderstood. Brad
Re: alias time explosion
> > > > I used the attached one with -fpermissive > > > Thanks, i'm looking into it now. > So the alias analysis time increase *is* the result of moving the is_global_var check out of is_call_clobbered. This is easy to fix, i'll have a patch in a few hours. However, there is worse news, AFAICT. If i don't turn off scheduling entirely, this testcase now takes >10 minutes to compile (I gave up after that). With scheduling turned off, it takes 315 seconds, checking enabled. It looks like the scheduler is now trying to schedule some single region with 51,000 instructions in it. Everytime i broke into the debugger, it was busy in ready_sort re-doing qsort on the ready-list (which probably had a ton of instructions), over and over and over again. I imagine the 51k instructions comes from the recent scheduling changes. Maxim, can you please take the testcase Andrew attached earlier in the thread, and make it so the scheduler can deal with it in a reasonable amount of time again? It used to take <20 seconds. --Dan
Re: insns for register-move between general and floating
Greg McGary <[EMAIL PROTECTED]> writes: > I'm working on a port that has instructions to move bits between > 64-bit floating-point and 64-bit general-purpose regs. I say "bits" > because there's no conversion between float and int: the bit pattern > is unaltered. Therefore, it's possible to use scratch FPRs for > spilling GPRs & vice-versa, and float<->int conversions need not go > through memory. > > Among all the knobs to turn regarding register classes, reload > classes, and modes+constraints on movM, floatMN2, fixMN2 patterns, > I need some advice on how to do this properly. The Alpha port supports the "itof" and "ftoi" instructions, which do exactly that. So you might want to look there. -- Falk
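The distinction Greg draws between moving "bits" and converting values can be sketched in C. This is only an illustration under stated assumptions: it assumes an IEEE 754 binary64 `double`, and the function names `double_to_bits`, `bits_to_double`, and `double_to_int_value` are hypothetical, not part of GCC or of any port. Whether a compiler turns the `memcpy` into a direct FPR/GPR move depends on the target and its machine description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit move: copy the 64-bit pattern of a double into an integer.
   No numeric conversion takes place; the pattern is unaltered. */
uint64_t double_to_bits(double d)
{
    uint64_t bits;
    assert(sizeof bits == sizeof d);   /* assumes a 64-bit double */
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* The reverse bit move: integer pattern back into a double. */
double bits_to_double(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Value conversion, for contrast: this changes the representation
   (2.5 becomes the integer 2), unlike the bit moves above. */
int64_t double_to_int_value(double d)
{
    return (int64_t)d;
}
```

For example, `double_to_bits(1.0)` yields the pattern 0x3FF0000000000000 on an IEEE 754 target, while `double_to_int_value(1.0)` yields the integer 1.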
Re: alias time explosion
Hi Daniel, I can't find the testcase attached to any message of the thread. Could it be because of the message size? If so, please send the testcase both to me and Maxim, one of us will look into it. Thanks, Andrey
Re: for getting profiling times in millsecond resolution.
jayaraj wrote:
> Hi, I want to get the profiling data of an application in linux. Now I am using -pg options of gcc for generating the profile data. then used gprof for generating profiles. Here I am getting only in terms of seconds. But I want in millisecond resolution. can anybody help me. Thanks & regards Jayaraj

The sampling with -pg profiling is fairly low resolution: 100 samples a second on Linux, which works out to about 10 milliseconds per sample. The only way you are going to get estimates in the millisecond range for a function is if there are multiple calls to that function. The accumulated time is divided equally between the counted function calls. You might make more runs over the same section of code to accumulate more samples and function calls and so get a better estimate of the time. If you are just looking for flat profiles with higher resolution, you might look at OProfile. Its sampling intervals can be much smaller. However, you need to be careful on some processors because the time for a clock cycle can be changed by power management. If you know which sections of code you are interested in, you might use the timestamp register to read timing information and compute the clock cycles (time) spent in certain regions of code. Alternatively, you might use perfmon or perfctr to access performance counters (assuming that the kernel has the appropriate patches for these). -Will
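As a rough sketch of the manual-timing approach Will describes, the following C code times a region with the POSIX `clock_gettime` call instead of the raw timestamp register, which is a swapped-in technique chosen for portability. The helper names `sample_period_ms`, `elapsed_ms`, `busy_work`, and `average_call_ms` are hypothetical and only for illustration. Averaging over many calls follows the advice above about accumulating more calls to get a better estimate. On older glibc versions linking may also require `-lrt`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* request clock_gettime() from <time.h> */
#endif
#include <assert.h>
#include <time.h>

/* Sampling period in milliseconds: 100 samples/s gives 10 ms. */
double sample_period_ms(int samples_per_second)
{
    assert(samples_per_second > 0);
    return 1000.0 / samples_per_second;
}

/* Milliseconds between two clock readings. */
double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) * 1e3
         + (double)(end->tv_nsec - start->tv_nsec) / 1e6;
}

/* Placeholder workload standing in for the code being measured. */
void busy_work(void)
{
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 1000; i++)
        sink += i;
}

/* Time `calls` invocations of fn and return the average ms per call. */
double average_call_ms(void (*fn)(void), long calls)
{
    struct timespec t0, t1;
    assert(calls > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < calls; i++)
        fn();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return elapsed_ms(&t0, &t1) / (double)calls;
}
```

A call such as `average_call_ms(busy_work, 100000)` then reports a per-call figure well below the 10 ms sampling period, which gprof's sampling alone cannot resolve for a single call.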
Re: alias time explosion
On Wed, 2006-03-22 at 16:35 +0300, Andrey Belevantsev wrote: > Hi Daniel, > > I can't find the testcase attached to any message of the thread. Could > it be because of the message size? If so, please send the testcase both > to me and Maxim, one of us will look into it. > > Thanks, Andrey > Yeah, it bounced from the list due to size. The compressed file is in the newly filed bug 26804 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26804 Andrew
Re: Ada subtypes and base types
On Tuesday 21 March 2006 21:59, Jeffrey A Law wrote:
> On Tue, 2006-03-21 at 10:14 +0100, Duncan Sands wrote:
> >
> > Hi Jeff, on the subject of seeing through typecasts, I was playing around
> > with VRP and noticed that the following "if" statement is not eliminated:
> >
> > int u (unsigned char c) {
> >     int i = c;
> >
> >     if (i < 0 || i > 255)
> >         return -1; /* never taken */
> >     else
> >         return 0;
> > }
> >
> > Is it supposed to be?
> Fixed thusly. Bootstrapped and regression tested on i686-pc-linux-gnu.

Thanks a lot - building now.

Best wishes,

Duncan.
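The claim that the branch is never taken can be checked by brute force, since an `unsigned char` has only a few possible values. The sketch below reuses Duncan's function and adds a hypothetical helper, `count_error_returns`, that tries every argument. It assumes an 8-bit `unsigned char` (`UCHAR_MAX == 255`), which is the case the `i > 255` test relies on. It says nothing about what VRP actually does, only about what a range-aware optimizer is entitled to conclude.

```c
#include <assert.h>
#include <limits.h>

/* The function from the thread. */
int u(unsigned char c)
{
    int i = c;              /* widening keeps the value in [0, UCHAR_MAX] */

    if (i < 0 || i > 255)
        return -1;          /* never taken when UCHAR_MAX is 255 */
    else
        return 0;
}

/* Try every possible argument and count how often -1 comes back. */
int count_error_returns(void)
{
    int taken = 0;

    assert(UCHAR_MAX == 255);   /* assumption: 8-bit unsigned char */
    for (int v = 0; v <= UCHAR_MAX; v++)
        if (u((unsigned char)v) != 0)
            taken++;
    return taken;
}
```

Because `count_error_returns()` is 0 for every input, the whole `if` can legitimately be folded to `return 0;`.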
Re: alias time explosion
Daniel Berlin wrote: ... If i don't turn off scheduling entirely, this testcase now takes >10 minutes to compile (I gave up after that). With scheduling turned off, it takes 315 seconds, checking enabled. It looks like the scheduler is now trying to schedule some single region with 51,000 instructions in it. Everytime i broke into the debugger, it was busy in ready_sort re-doing qsort on the ready-list (which probably had a ton of instructions), over and over and over again. I imagine the 51k instructions comes from the recent scheduling changes. Maxim, can you please take the testcase Andrew attached earlier in the thread, and make it so the scheduler can deal with it in a reasonable amount of time again? It used to take <20 seconds. I've checked the trunk and everything appears ok to me. Both the trunk and the trunk with my patches reverted compile the testcase in 5m30s (they were configured with CFLAGS=-g). My best guess where the >10 minutes came from is that you tried to compile the testcase with the compiler built with profile information - in this case the compilation will last for ~15 minutes. -- Maxim
16 Mar 06 notes from GCC improvement for Itanium conference call
ON THE CALL: Shin-ming Liu (HP), Vladimir Makarov (Red Hat), Mark Smith (Gelato), Bob Kidd (UIUC), Andrey Belevantsev (RAS), Arutyun Avetisyan (RAS), Mark Davis (Intel)

Diego Novillo (Red Hat) was unable to join the call, but supplied an update to include in these notes.

The GCC track at the upcoming Gelato ICE conference is now finalized. Gerolf Hoflehner's talk on SPEC2006 had to be canceled because of a delay in its release. A new addition to the GCC track is Arutyun Avetisyan, who will give an RAS work overview and start soliciting input for the August 2006 GCC meeting in Moscow.

Confirmed topics/speakers for the Gelato ICE GCC track include:
* Russian Academy of Science work overview and plans for August GCC meeting in Moscow - Arutyun Avetisyan
* GCC IP issues - Dan Berlin
* LLVM - Chris Lattner
* LTO - Mark Mitchell
* ORC back end for GCC - Shin-Ming Liu
* Aliasing update - Dan Berlin
* Russian Academy of Science scheduler improvement update - Andrey Belevantsev
* Superblock work - Bob Kidd
* Parallel programming with GCC - Diego Novillo
* Intel micro-architecture talk - Cameron McNairy

For a detailed list of confirmed speakers and topics for Gelato ICE 2006, visit: www.gelato.org/meeting#agenda

Updates from call participants can be found below.

NEXT MEETING: At the Gelato ICE meeting in San Jose, CA, April 24-26, 2006.

Andrey Belevantsev:
-------------------
Testing the aliasing patch with the latest mainline revealed changes in structure aliasing, so we had to rewrite some code that handles variables with structure field tags (SFTs). Small arrays can now also be decomposed into elements for the sake of better aliasing. The other thing we fixed is more accurate propagation of original tree expressions saved with MEMs during expand. We have sent an updated patch to the gcc-patches list.

Vladimir Makarov has approved the speculation patch and provided comments on the ia64 part of the patch. We have fixed all issues pointed to by Vladimir.
After additional regtesting on ia64 and i686, the patch was committed to trunk as rev. 112129. An earlier version of the patch was also bootstrapped and regtested on sparc-solaris. Using the patch on other platforms revealed some bugs (PR26275 and PR26734). The fixes for those PRs have been submitted to the list.

We have tested the basic features of code motion during this month. To accomplish this task, the main scheduling loop was written. A single iteration of the scheduling loop tries to form a group of instructions that can be executed in parallel during one cycle (more or less corresponding to an IA-64 instruction group). At first, code motion of entire instructions inside a basic block was tested. Now we are testing interblock motions, which imply possible creation of bookkeeping code. Code motion of conditional branches is currently disabled. Our next plan is to enable code motion of right-hand sides of expressions.

Last but not least, our paper proposal for GCC Summit 2006 has been accepted. The paper will discuss the new scheduler work, the proposed design, and the current state of the implementation.

Bob Kidd:
---------
(Bob had his paper proposal for the GCC Summit 2006 accepted. The paper will cover the GCC superblock work in detail.)

I checked the Superblock patch into the ia64-improvements tree. This patch has no significant effect on the overall estimated SPEC score for ia64 or ppc, and causes a slight degradation on x86_64. On IA64, some benchmarks run faster while others slow down. The overall score varies by one point. I'm looking into the changed benchmarks to see what causes the speedup or slowdown.

I investigated 300.twolf, which slows down when superblocks are formed at the Tree-SSA level. One function (new_dbox_a) is significantly slower with the superblock patch than without. This function takes a pointer to an integer as an argument and updates the value of that integer inside a hot loop.
The loop is structured along these lines:

    for (hot)
        if (cond)   (biased)
            a = ...
        else
            a = ...
        *arg += a
        ...

Tail duplication generates two copies of the *arg += ... line, which generates two copies of the load and store of arg. When tail duplication is not done, PRE can move the load and store of arg out of the loop, but it is unable to do this in the superblock loop. My suspicion is that superblock formation needs to fix up the alias info so that later optimizers realize these two loads are the same.

Shin-ming Liu:
--------------
- HP has posted the GCC 4.1 release binary on the HP portal for HP-UX: www.hp.com/go/gcc
- HP submitted 11 patches to stock gcc and 3 patches to binutils
- The Alternative backend project has made reasonable progress. The front end for this compiler is still at 3.3.2. Both C and Fortran are functional and achieve performance similar to ORC 2.1. The current focus is to update the backend to support the Itanium C++ ABI.

Vladimir Makarov:
-----------------
Probably Robert Kidd's superblock scheduling in gcc
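Bob Kidd's description of the 300.twolf loop above can be made concrete with a small C sketch. It is an illustration only, not code from 300.twolf or from the superblock patch: the function names are hypothetical, the `cond` array stands in for the biased condition, and the values 1 and 2 are placeholders for the elided `a = ...` assignments. The three functions compute the same result when `arg` does not point into `cond`. They differ in how many loads and stores of `*arg` appear in the loop body.

```c
/* Original shape: both arms join, then one load and one store of *arg. */
void loop_original(int *arg, const int *cond, int n)
{
    for (int i = 0; i < n; i++) {
        int a;
        if (cond[i])
            a = 1;          /* placeholder for "a = ..." */
        else
            a = 2;          /* placeholder for "a = ..." */
        *arg += a;
    }
}

/* After tail duplication: the join block is copied into each arm, so
   the loop body now contains two loads and two stores of *arg. */
void loop_tail_duplicated(int *arg, const int *cond, int n)
{
    for (int i = 0; i < n; i++) {
        if (cond[i]) {
            int a = 1;
            *arg += a;
        } else {
            int a = 2;
            *arg += a;
        }
    }
}

/* What hoisting the load and store out of the loop looks like at source
   level. Only valid if writes through arg cannot change cond[]. */
void loop_hoisted(int *arg, const int *cond, int n)
{
    int acc = *arg;         /* single load before the loop */
    for (int i = 0; i < n; i++)
        acc += cond[i] ? 1 : 2;
    *arg = acc;             /* single store after the loop */
}
```

To hoist the duplicated form, an optimizer must recognize that both copies of `*arg` refer to the same location, which is the alias-information concern raised in the notes.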
Core dump in application on SuSE9 zLinux
Hi All, I have one application (named app) which I have compiled on SuSE8 SP3 zLinux. I have tested my application on SuSE8 SP3 zLinux, but when I try to run the same application on SuSE9 zLinux SP2, I get a core dump. Please find below the gdb and ldd output of the application.

(gdb) where (Back Trace of core)
#0  0x401f0f5c in raise () from /lib/tls/libc.so.6
#1  0x401f251c in abort () from /lib/tls/libc.so.6
#2  0x40118882 in __cxxabiv1::__terminate () from /usr/lib/libstdc++.so.5
#3  0x401188d0 in std::terminate () from /usr/lib/libstdc++.so.5
#4  0x401187b4 in __gxx_personality_v0 () from /usr/lib/libstdc++.so.5
#5  0x401c15dc in _Unwind_ForcedUnwind_Phase2 () from /lib/libgcc_s.so.1
#6  0x401c1dd4 in _Unwind_ForcedUnwind () from /lib/libgcc_s.so.1
#7  0x40063c34 in _Unwind_ForcedUnwind () from /lib/tls/libpthread.so.0
#8  0x40061a5e in __pthread_unwind () from /lib/tls/libpthread.so.0
#9  0x4005d5fc in pthread_exit () from /lib/tls/libpthread.so.0
#10 0x0040d2b8 in ConnectionService ()
#11 0x4005d1ac in start_thread () from /lib/tls/libpthread.so.0
#12 0x402800ba in thread_start () from /lib/tls/libc.so.6
(gdb)

$ ldd app
libappres.so => /work/new_wrksuse/bin/libappres.so (0x40018000)
libapplog.so => /work/new_wrksuse9/bin/libapplog.so (0x4001c000)
libpthread.so.0 => /lib/tls/libpthread.so.0 (0x40057000)
libdl.so.2 => /lib/libdl.so.2 (0x4006a000)
librt.so.1 => /lib/tls/librt.so.1 (0x4006e000)
libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0x40076000)
libm.so.6 => /lib/tls/libm.so.6 (0x4014b000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x401bc000)
libc.so.6 => /lib/tls/libc.so.6 (0x401c7000)
/lib/ld.so.1 => /lib/ld.so.1 (0x4000)

Can anyone please help me resolve the issue? On the face of it, it seems to be a C library issue. Please suggest all the possible solutions. I fail to understand why an application which works fine on SuSE8 zLinux (SP3) fails on SuSE9 zLinux (SP2).
Regards, Harshpreet Singh
Re: Core dump in application on SuSE9 zLinux
On Thu, Mar 23, 2006 at 10:02:22AM +0530, harshpreet_singh wrote: > Can anyone please help me resolve the issue? On facet it seems to be a C > Library issue. Please suggest all the possible solutions. > I fail to understand why an application which is working fine on SuSe8 > zLinux (SP3) is failing on SuSE9 zLinux (SP2). Whatever it is, it's probably not a GCC issue; this is not the appropriate list to ask questions about multithreading. -- Daniel Jacobowitz CodeSourcery