Re: Data race in PGO profile collection for multi-process program

2015-06-01 Thread Pengfei Yuan
Thank you very much for the advice. I will try __gcov_dump first. Yuan 2015-06-02 12:14 GMT+08:00 Xinliang David Li : > Using AutoFDO is one way. For PGO, you may want to try using > __gcov_dump interface to explicitly control the timing and order of > the profile dump --- i.e., invoke __gcov_
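
A minimal sketch of the approach suggested in the quoted reply, assuming a generic fork-based master/worker layout rather than Nginx's actual code. The names run_worker, dump_profile_if_instrumented and spawn_workers_and_wait are hypothetical. __gcov_dump is the libgcov entry point available when the program is built with -fprofile-generate; it is declared weak here (an assumption made for this sketch, relying on GCC or Clang on an ELF platform) so that the example also links and runs in an uninstrumented build. Each worker dumps its counters explicitly before leaving, and the parent dumps only after it has reaped every worker, so the order of the dumps is fixed.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Provided by libgcov in a -fprofile-generate build. Declared weak
 * (assumption for this sketch) so an uninstrumented build still links. */
extern void __gcov_dump(void) __attribute__((weak));

/* Write the profile counters now, if the binary is instrumented. */
static void dump_profile_if_instrumented(void)
{
    if (__gcov_dump)
        __gcov_dump();
}

/* Hypothetical stand-in for the real work done by a worker process. */
static int run_worker(int id)
{
    volatile int sum = 0;
    for (int i = 0; i < 1000; ++i)
        sum += i * (id + 1);
    return sum;
}

/* Fork n workers, let each dump its own profile before exiting, then
 * reap them all and dump the parent's profile last.
 * Returns the number of workers reaped, or -1 if fork() fails. */
int spawn_workers_and_wait(int n)
{
    assert(n >= 0);

    for (int i = 0; i < n; ++i) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            run_worker(i);
            dump_profile_if_instrumented(); /* explicit dump in the worker */
            _exit(0);                       /* skips atexit handlers */
        }
    }

    int reaped = 0;
    while (wait(NULL) > 0)
        ++reaped;

    dump_profile_if_instrumented();         /* parent dumps after all workers */
    return reaped;
}
```

In an instrumented build, calling spawn_workers_and_wait(4) from main would be one way to try this; whether it resolves the missing worker data in the Nginx case is not confirmed by the thread excerpt.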

Data race in PGO profile collection for multi-process program

2015-06-01 Thread Pengfei Yuan
Hi, I am trying PGO on Nginx, which has a main process and several worker processes. I find that the collected profile data files only contain information for the main process, which is probably caused by a data race (the main process exits immediately after worker processes exit). How can I solve this prob

Re: Branch taken rate of Linux kernel compiled with GCC 4.9

2015-01-13 Thread Pengfei Yuan
Actually GCC does not help reduce branch misprediction rate on modern X86 processors. Reducing branch taken rate is more important. Related discussion: https://gcc.gnu.org/ml/gcc/2014-12/msg0.html Yuan 2015-01-13 22:13 GMT+08:00 : > Depending on what the processor hardware can do, the data y

Re: Branch taken rate of Linux kernel compiled with GCC 4.9

2015-01-13 Thread Pengfei Yuan
Thank you for the explanation! I tried the following simple code: int test(int k) { int x = 0; for (int i = 0; i < k; ++i) x += i; return x; } It was compiled (-O2) to something like: int test(int k) { if (k == 0) goto ret0; int x = 0; int i = 0; loop: x += i; i += 1; if (i

Branch taken rate of Linux kernel compiled with GCC 4.9

2015-01-13 Thread Pengfei Yuan
Hi, I have analyzed the branch taken rate of the Linux kernel compiled with GCC (using localyesconfig from Debian config) and found something strange. Hardware: Intel Core i7-4770, 32G RAM, 10GbE Software: Linux 3.16.7, GCC 4.9.3 20121201, Debian sid I use perf with rbf88:k,rff88:k events (Haswe

Re: Confusing description of GCC option `-freorder-blocks'

2014-12-01 Thread Pengfei Yuan
2014-12-01 17:50 GMT+08:00 Kyrill Tkachov : >> In https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html , the >> description of option `-freorder-blocks' says `in order to reduce >> number of taken branches and improve code locality'. It is confusing. >> When will the `and' condition happen? Tha

Confusing description of GCC option `-freorder-blocks'

2014-12-01 Thread Pengfei Yuan
Hi, In https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html , the description of option `-freorder-blocks' says `in order to reduce number of taken branches and improve code locality'. It is confusing. When will the `and' condition happen? That is, taken branches reduced AND code locality impr