[Bug gcov/profile/28441] New: Need atomic increment of gcov counters for MP programs
The arc instrumentation code generated by -fprofile-arcs increments gcov counters in a manner which is not guaranteed to be atomic. As a result, if two or more cpus enter the same basic block at nearly the same time they can leave an incorrect arc count, which makes it impossible for gcov to solve that function's arc count graph later. Within SGI, this is a significant limitation when doing kernel test coverage studies, as we have code that needs to be exercised over multiple processors. The same problem presumably affects people trying to do coverage studies on multithreaded programs in userspace. I've written a patch to solve this problem. It adds a new option, -fprofile-multithread, which changes the tree profiler code to use the recently-added __sync_add_and_fetch() builtin to update the counters. I've tested it on ia64 and i386; it generates the appropriate object code which runs and produces coverage results. Here's a diffstat: common.opt |4 + doc/invoke.texi| 14 - testsuite/gcc.misc-tests/gcov-12.c | 28 ++ tree-profile.c | 75 + 4 files changed, 119 insertions(+), 2 deletions(-) Reading the documention at gcc.gnu.org, this looks "legally significant" so I'll need to have some paperwork done right? I assume a copyright assignment form and an employer disclaimer are the correct documents. Can someone start that process? Should I just attach the patch now? -- Summary: Need atomic increment of gcov counters for MP programs Product: gcc Version: 4.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: gcov/profile AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: gnb at sgi dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28441
[Bug gcov/profile/28441] Need atomic increment of gcov counters for MP programs
--- Comment #2 from gnb at sgi dot com 2006-07-20 01:51 --- Thanks Andrew, > Also please read http://gcc.gnu.org/contribute.html if you have not already. I have read that, and while it does mention all the requirements it left me confused about I should do to start the process. Should I just submit the patch to the gcc-patches ML and wait for someone to notice that I haven't got paperwork? Or do I start with an email to [EMAIL PROTECTED] then send the patch after completing the paperwork process? Sorry for the newbie questions. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28441
[Bug gcov/profile/28441] Need atomic increment of gcov counters for MP programs
--- Comment #6 from gnb at sgi dot com 2006-07-24 02:23 --- Ian Lance Taylor says: > Please send e-mail to [EMAIL PROTECTED] first. When that process is > complete, send the patch to [EMAIL PROTECTED] Thanks, that was the guidance I needed ;-) The process is underway. Seongbae Park says: > It seems that you didn't change libgcov.c, > which suggests that you didn't address __gcov_{pow2,interval}_profiler. That's correct, I haven't addressed those, for the simple reason that I don't use them and so hadn't noticed an issue. > so it would be very nice if you can fix that as well > (this is just a suggestion). I'm willing to try ;-) but no promises. > I think there are three choices: > > #1 make above profiler routines to use atomic increment *always* I decided not to do this for the -fprofile-arcs case because: a. Atomic increment is a new feature of gcc and so I assume it's not available on all platforms, even all those whose ISA supports it. b. Even when available, atomic increment may incur additional overhead which isn't necessary in a single-threaded context, e.g. the mf (memory fence) instruction on ia64. I expect the same arguments apply to -fprofile-values. > #2 introduce a new environment variable to pick atomic/non-atomic increment The -fprofile-arcs instrumentation would be hard to make behave this way without incurring additional runtime overhead, so I think this approach would just provide an opportunity for the two options to behave inconsistently. > #3 make atomic increment version of those routines and > -fprofile-multithread to generate codes to call them. > > I prefer #3, [...] Agreed. > Another comment is about the name of -fprofile-multithread. > There's an alternative MT-safe profiling scheme of making counters TLS. Yes, actually that was my first approach. However I couldn't figure out a way to make it work in the kernel context (need to gather coverage for dynamically loaded kernel modules), safe (need to aggregate counters from multiple threads/cpus atomically while the code may be running), and efficient (avoid one or two indirections). It would clearly be a better approach if it could be made to work, because it avoid the problem of bouncing counter cachelines in large multiprocessor machines such as SGI makes. Using the atomic increment builtins is a lesser but easier solution. > So I'd prefer if the option for atomic increment is more explicit, > something like -fprofile-atomic-increment. Fair enough, I'll make that change. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28441
[Bug gcov/profile/28441] Need atomic increment of gcov counters for MP programs
--- Comment #10 from gnb at sgi dot com 2006-07-31 01:18 --- Ian: understood and agreed. FYI: the copyright assignment arrived in this morning's mail. I'll need to run it past my lawyer (as I do any legal document) but I don't expect that will take long. The bottleneck is likely to be the employer disclaimer, which will need to go back across the Pacific and up through some layers of corporate bureaucracy. I assume you need both before I can post? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28441