Curious. I ran both g++ variants under oprofile, and then compared the 
generated assembler code for the most critical functions.

The top function in both cases is pointer_set_insert, and there the 
assembler code is 100% identical (modulo one choice between r14 and r15).

The second most critical function in the gcc-in-cxx build is walk_tree_1, 
which ranks only fourth in mainline gcc.
There the code seems to be identical, too, except for code layout: the 
compiler arranges the code in a different order, and apparently uses a 
different branch prediction. The non-branching code is nearly 
identical, too.
The "hottest" assembler instructions in walk_tree_1 are memory accesses, so 
apparently the mainline version causes slightly fewer cache misses or gets 
better prediction? (my interpretation, not measured yet)

I am a bit unsure how to proceed. The gcc-in-cxx assembler code looks ok, as 
it is nearly identical to the mainline code. The main differences are in the 
code/branch layout, and I wouldn't know how to debug this.

Thomas


