On Mon, 2017-05-15 at 15:36 +0200, Martin Liška wrote: > Hi. > > I sent this email to David some time ago, but it should be probably > answered > on gcc mailing list.
> I have idea one to improve gcov tool and I'm interested in more > precise locations for gimple > statements. For gcov purpose, we dump location in ipa-profile pass, > which is an early IPA > pass and this data is used by gcov tool to map statements (blocks) to > lines of code. > > I did a small experiment on the place we emit the location data: > inform (gimple_location (stmt), "output_location"); > > and it shows for: > $ cat m2.c > unsigned int > UuT (void) > { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int > ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ > ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ > return ret; } > > int > main (int argc, char **argv) > { > UuT (); > return 0; > } > > $ gcc --coverage m2.c > m2.c: In function ‘main’: > m2.c:8:3: note: output_location > UuT (); > ^~~~~~ > # .MEM_2 = VDEF <.MEM_1(D)> > UuT (); > m2.c:9:10: note: output_location > return 0; > ^ > _3 = 0; > m2.c: In function ‘UuT’: > m2.c:3:16: note: output_location > { unsigned int true_var = 1; unsigned int false_var = 0; unsigned > int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* > count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* > count(#####) */ return ret; } > ^~~~~~~~ > true_var_3 = 1; > m2.c:3:43: note: output_location > { unsigned int true_var = 1; unsigned int false_var = 0; unsigned > int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* > count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* > count(#####) */ return ret; } > ^~~~~~~~~ > false_var_4 = 0; > m2.c:3:71: note: output_location > { unsigned int true_var = 1; unsigned int false_var = 0; unsigned > int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* > count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* > count(#####) */ return ret; } > > ^~~ > ret_5 = 0; > m2.c:3:83: note: output_location > { unsigned int true_var = 1; unsigned int false_var = 0; unsigned > int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* > count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* > count(#####) */ return ret; } > > ^ > if (true_var_3 != 0) > m2.c:3:114: note: output_location > { unsigned int true_var = 1; unsigned int false_var = 0; unsigned > int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* > count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* > count(#####) */ return ret; } > > ^ > if (false_var_4 != 0) > m2.c:3:145: note: output_location > { unsigned int true_var = 1; unsigned int false_var = 0; unsigned > int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* > count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* > count(#####) */ return ret; } > > > ~~~~^~~~~ > ret_7 = 111; > m2.c:3:182: note: output_location > { unsigned int true_var = 1; unsigned int false_var = 0; unsigned > int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* > count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* > count(#####) */ return ret; } > > > ~~~~^~~~~ > ret_6 = 999; > m2.c:3:215: note: output_location > { unsigned int true_var = 1; unsigned int false_var = 0; unsigned > int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* > count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* > count(#####) */ return ret; } > > > > ^~~ > _8 = ret_2; > m2.c:3:215: note: output_location > # VUSE <.MEM_9(D)> > return _8; > > Which is not optimal, for some assignments I see just LHS > (false_var_4 = 0), My first though was: are there assignments for which this isn't the case? The only one I see is the: ret = 999; ~~~~^~~~~ Are the locations for these assignments coming through from the frontend? I believe that in the C frontend these are assignment-expression, and hence handled by c_parser_expr_no_commas; in particular the use of op_location and the call: set_c_expr_source_range (&ret, lhs.get_start (), rhs.get_finish ()); ought to be setting up the caret of the assignment to be on the operator token, and for the start/finish to range from the start of the lhs to the end of the rhs i.e. what we see for: ret = 999; ~~~~^~~~~ > for return statements only a returned value is displayed. Is this running on SSA form? If so, I wonder if you're running into something like this: retval_N = PHI <lots of values>; return retval_N; where it's using the location of that "return retval_N;" for all of the return statements in the function, rather than the individual source locations. > For conditions, only condition beginning is showed. > Is this known behavior or do I miss > something? c_parser_if_statement has: loc = c_parser_peek_token (parser)->location; which is that of the open-paren. Maybe we should build a location covering the range of the "if ( expression )" part of the if-statement? Note that the C++ frontend may do different things than the C frontend for any and all of these. When I added range-based locations to them, I attempted to preserve the caret locations to avoid affecting stepping behavior in the debugger (since the caret location is used when stripping away range information); we could revisit some of the decisions. BTW, in the subject line, you mention a "rich_location", but I think you mean a source_location (aka location_t): this is a caret location along with range information; a rich_location is one or more of them, potentially with fix-it hints. See "Example B" and "Example C" in line -map.h. Admittedly the terminology there is a little muddled (sorry). Dave