http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54402
--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-10-30 18:24:17 UTC ---

IMHO it is really not useful to dup completely unrelated var-tracking PRs together. test-tgmath.i clearly has nothing to do with frame_pointer_needed frame replacements, or with clobber_overlapping_mems speed. The problem there is that we just have a huge number of VALUEs being tracked, e.g. because of the huge result |= this; result |= that; result |= someothertest;.

The reason Alex's recent patch slows it down so much is, I believe, that it doesn't check whether there are already debug stmts for the decls in question; if there are some at the beginning of the bb, it shouldn't try to add them again. Otherwise we end up with, as shown in test-tgmath2.i.*.optimized:

  # DEBUG result => result_3855
  # DEBUG result => result_3855
  # DEBUG result => result_3855
  # DEBUG result => result_3855
  # DEBUG result => result_3855
  # DEBUG result => result_3855
  # DEBUG result => result_3855
  # DEBUG result => result_3855

or

  # DEBUG result => result_3786
  # DEBUG ptype => &texpr
  # DEBUG result => result_3786
  # DEBUG ptype => &texpr
  # DEBUG result => result_3786
  # DEBUG ptype => &texpr
  # DEBUG result => result_3786
  # DEBUG ptype => &texpr

in a pretty big number of the > 9000 basic blocks in the testcase.

So, perhaps the copying of debug stmts should start by checking whether the destination, after its labels, contains any debug stmts. If it does, gather their decls into some pointer set/bitmap etc., then copy only those stmts whose decls don't have debug stmts there yet (and set the pointer set/bitmap entry immediately too, so that we don't copy over more than one debug stmt for each lhs decl).
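
A rough illustration of the suggested dedup step, as a standalone toy rather than actual GCC code: struct debug_bind, MAX_DECLS and copy_debug_binds_once are hypothetical names invented here, and a plain bool array stands in for the pointer set/bitmap. Which of several source binds for the same decl should win (here, the first one met in iteration order) is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "# DEBUG decl => value". */
struct debug_bind {
    int decl_uid;   /* hypothetical id of the lhs decl */
    int value_id;   /* hypothetical id of the bound value */
};

#define MAX_DECLS 64   /* toy limit so a bool array can act as the bitmap */

/* Copy binds from src into dst, which already holds *dst_len binds.
   A bind is skipped if its decl already has one in dst.
   Returns the number of binds actually copied.  */
size_t copy_debug_binds_once(const struct debug_bind *src, size_t src_len,
                             struct debug_bind *dst, size_t *dst_len,
                             size_t dst_cap)
{
    bool seen[MAX_DECLS] = { false };
    size_t copied = 0;

    /* Step 1: gather the decls that already have binds in the destination. */
    for (size_t i = 0; i < *dst_len; i++) {
        assert(dst[i].decl_uid >= 0 && dst[i].decl_uid < MAX_DECLS);
        seen[dst[i].decl_uid] = true;
    }

    /* Step 2: copy only binds for decls not seen yet, marking each decl
       immediately so at most one bind per decl is copied.  */
    for (size_t i = 0; i < src_len; i++) {
        int uid = src[i].decl_uid;
        assert(uid >= 0 && uid < MAX_DECLS);
        if (seen[uid])
            continue;
        assert(*dst_len < dst_cap);
        dst[(*dst_len)++] = src[i];
        seen[uid] = true;
        copied++;
    }
    return copied;
}
```

With a destination that already binds decl 1 and a source that alternates binds for decls 1 and 2, only a single bind for decl 2 is copied, instead of the repeated pairs shown in the dump.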