Dear binutils maintainers, I was recently profiling some algorithm improvements in the Bos Wars game, using gcc -pg and gprof on amd64. To my surprise, when I ran the same input through the game several times, gprof displayed different timings _and_ call counts each time, even though section 6.1 in the documentation of gprof in binutils 2.20 states:
"By contrast, the number-of-calls and basic-block figures are derived by counting, not sampling. They are completely accurate and will not vary from run to run if your program is deterministic." In one series of three 95-minute runs, gprof showed that one function had been called 1858749793, 1858746382, and 1858740724 times. The variation there was 0.0005%; not much, but anyway it made me think there was something unexpectedly random going on with my algorithms. To find out at what point the different runs diverge, I then added some debug logging to this function, and reran the game thrice with a shorter input sequence. To my surprise, the debug logging showed that the function was called 224772 times in each run. However, gprof displayed 224769, 224767, and 224763 as the number of calls in these runs. That is, the numbers shown by gprof were 0.0013%, 0.0022%, and 0.0040% too low. I eventually found the cause with gdb. In Debian libc6 2.10.2-6 (based on eglibc) on amd64, the __mcount_internal function http://www.eglibc.org/cgi-bin/viewcvs.cgi/branches/eglibc-2_10/libc/gmon/mcount.c?rev=8421&view=markup uses _gmonparam.state as an atomic flag to detect recursive calls. This detection was triggered when two threads called mcount at the same time. Bos Wars doesn't create threads of its own, but it does use the SDL library, which starts a thread for sound mixing and calls some functions of Bos Wars there. These sound-mixing functions have nothing to do with the algorithms I wanted to profile, but they interfered with the results anyway. When I disabled sounds, I got the correct call counts from gprof. The Bos Wars binaries for the earlier profiling runs had been compiled with just -D_REENTRANT (from sdl-config --cflags) and without -pthread. After finding the __mcount_internal problem, I then added -pthread, to see if that might cause a different version of __mcount_internal to be linked in. However, gdb showed that the same code got called with -pthread too. Now, it would be nice to change __mcount_internal in libc to better support multiple threads. However, it is not clear to me how that can be done: you can't just wrap the code in a pthread mutex because it can be interrupted at any time by a signal and the signal handler can then call mcount again. You could block the signals but that would cost two system calls on each entry to a profiled function and perhaps cause even more problems with threads. Some kind of lock-free list or per-thread accounting should work but that is not so trivial to implement. Therefore, I just ask that you update that gprof documentation to warn about this issue. _______________________________________________ bug-binutils mailing list bug-binutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-binutils