Re: Adding Profiling support - GCC 4.1.1

2007-03-26 Thread William Cohen

Jim Wilson wrote:

Rohit Arul Raj wrote:

1. The function mcount: While building with native gcc, the mcount
function is defined in glibc. Is the same mcount function available in
newlib? or is it that we have to define it in our back-end as SPARC
does (gmon-sol2.c).


Did you try looking at newlib?  Try something like this:
  find . -type f | xargs grep mcount
That will show you all of the mcount support in newlib/libgloss.

sparc-solaris is a special case.  Early versions of Solaris shipped 
without the necessary support files.  (Maybe it still does?  I don't 
know, and don't care to check.)  I think they were part of the 
add-on, extra-cost compiler.  This meant that people using only gcc 
could not use profiling unless gcc provided the mcount library; 
otherwise it never would have been put there.  mcount belongs in the C 
library.



2. Is it possible to reuse the existing mcount definition or is it
customized for every backend?


It must be customized for every backend.


3. Any other existing back-ends that support profiling.


Pretty much all targets do, at least the ones targeting operating systems.  It 
is much harder to make mcount work for an embedded target with no file system.


If you want to learn how mcount works, just pick any existing target 
with mcount support, and study it.


You might take a look at the profiling support in the GNU toolchain for the 
XScale that Intel distributes.  There was some support for using GDB to read 
the required information out of the embedded target even if it didn't have a 
file system.


-Will


Re: how to tweak x86 code generation to instrument certain opcodes with CC trap?

2015-10-23 Thread William Cohen
On 10/23/2015 01:37 AM, Yasser Shalabi wrote:
> Hello,
> 
> I am new to the GCC code. I want to make a simple modification to the
> back end. I want to add a debug exception (int3) to be generated
> before any instance of certain x86 instructions.
> 
> I tried to modify gcc/config/i386/i386.md by adding a "int3" to the
> define_insn for instructions of interest. But that just caused
> configure to fail (cannot run generated C programs).
> 
> Any pointers on how to approach this? Also, suggestions for
> alternative approaches are also welcome.
> 
> Thanks!
> 

Hi,

Do you need the int3 specifically before those instructions?  Or are you just 
looking to collect some information before those instructions are executed?  
Some alternative tools you might look at for instrumenting existing code are:

dyninst http://www.dyninst.org/
Valgrind http://valgrind.org/
Intel's Pin tool 
https://software.intel.com/en-us/articles/pin-a-dynamic-binary-instrumentation-tool

-Will


Re: how to tweak x86 code generation to instrument certain opcodes with CC trap?

2015-10-23 Thread William Cohen
On 10/23/2015 11:37 AM, Yasser Shalabi wrote:
> Hey Will,
> 
> Thanks for the quick reply.  Yeah, I need the int3 instruction to be
> statically included in the binary, so I can't use any dynamic
> instrumentation tool.

Dyninst can do binary rewrites of executables so that might still be suitable.

http://www.dyninst.org/sites/default/files/downloads/w2009/legendre-binrewriter.pdf

-Will



Re: eliminate dead stores across functions

2018-03-06 Thread William Cohen
On 03/06/2018 09:28 AM, Richard Biener wrote:
> On Tue, Mar 6, 2018 at 1:00 PM, Prathamesh Kulkarni
>  wrote:
>> Hi,
>> For the following test-case,
>>
>> int a;
>>
>> __attribute__((noinline))
>> static void foo()
>> {
>>   a = 3;
>> }
>>
>> int main()
>> {
>>   a = 4;
>>   foo ();
>>   return a;
>> }
>>
>> I assume it's safe to remove "a = 4" since 'a' would be overwritten
>> by the call to foo?
>> IIUC, ipa-reference pass does mod/ref analysis to compute side-effects
>> of function call,
>> so could we perhaps use ipa_reference_get_not_written_global() in dse
>> pass to check if a global variable will be killed on call to a
>> function ? If not, I suppose we could write a similar ipa pass that
>> computes the set of killed global variables per function but I am not
>> sure if that's the correct approach.
> 
> Do you think the situation happens often enough to make this worthwhile?
> 
> ipa-reference doesn't compute must-def, only may-def and may-use IIRC.
> 
> Richard.
> 
>> Thanks,
>> Prathamesh

This dead write optimization sounds similar to "DeadSpy: a tool to pinpoint 
program inefficiencies" by Milind Chabbi and John Mellor-Crummey of Rice 
University:

https://dl.acm.org/citation.cfm?id=2259033

The abstract says there were numerous dead writes in the SPEC 2006 gcc 
benchmark, and eliminating them provided an average 15% improvement in 
performance.

-Will


Re: for getting profiling times in millisecond resolution.

2006-03-22 Thread William Cohen

jayaraj wrote:

Hi,

I want to get profiling data for an application on Linux.  Now I am
using gcc's -pg option to generate the profile data, then gprof to
generate the profiles.  Here I am getting times only in terms of
seconds, but I want millisecond resolution.  Can anybody help me?

Thanks & regards

Jayaraj



The sampling with -pg profiling is fairly low resolution: 100 samples 
per second on Linux, which works out to about 10 milliseconds per 
sample.  The only way you are going to get millisecond estimates for a 
function is if there are multiple calls to it; the accumulated time is 
divided equally among the counted function calls.  You might make more 
runs over the same section of code to accumulate more samples and 
function calls and get a better estimate of the time.


If you are just looking for flat profiles with higher resolution, you 
might look at OProfile; its sampling intervals can be much smaller. 
However, you need to be careful on some processors because power 
management can change the length of a clock cycle.


If you know which sections of code you are interested in, you might 
read the timestamp register and compute the clock cycles (time) spent 
in certain regions of code.  Alternatively, you might use perfmon or 
perfctr to access the performance counters (assuming the kernel has 
the appropriate patches for these).


-Will