On Thu, 2022-11-03 at 15:59 -0400, Jason Merrill via Gcc-patches wrote: > Tested x86_64-pc-linux-gnu, OK for trunk? > > -- >8 -- > > The c++-contracts branch uses this to retrieve the source form of the > contract predicate, to be returned by contract_violation::comment(). > > gcc/ChangeLog: > > * input.cc (get_source_text_between): New fn. > * input.h (get_source_text_between): Declare. > --- > gcc/input.h | 1 + > gcc/input.cc | 76 > ++++++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 77 insertions(+) > > diff --git a/gcc/input.h b/gcc/input.h > index 11c571d076f..f18769950b5 100644 > --- a/gcc/input.h > +++ b/gcc/input.h > @@ -111,6 +111,7 @@ class char_span > }; > > extern char_span location_get_source_line (const char *file_path, > int line); > +extern char *get_source_text_between (location_t, location_t); > > extern bool location_missing_trailing_newline (const char > *file_path); > > diff --git a/gcc/input.cc b/gcc/input.cc > index a28abfac5ac..9b36356338a 100644 > --- a/gcc/input.cc > +++ b/gcc/input.cc > @@ -949,6 +949,82 @@ location_get_source_line (const char *file_path, > int line) > return char_span (buffer, len); > } > > +/* Return a copy of the source text between two locations. The > caller is > + responsible for freeing the return value. */ > + > +char * > +get_source_text_between (location_t start, location_t end) > +{ > + expanded_location expstart = > + expand_location_to_spelling_point (start, > LOCATION_ASPECT_START); > + expanded_location expend = > + expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH); > + > + /* If the locations are in different files or the end comes before > the > + start, abort and return nothing. */
I don't like the use of the term "abort" here, as it suggests to me the use of abort(3). Maybe "bail out" instead? > + if (!expstart.file || !expend.file) > + return NULL; > + if (strcmp (expstart.file, expend.file) != 0) > + return NULL; > + if (expstart.line > expend.line) > + return NULL; > + if (expstart.line == expend.line > + && expstart.column > expend.column) > + return NULL; We occasionally use the convention that (column == 0) means "the whole line". Probably should detect that case and bail out early for it. > + > + /* For a single line we need to trim both edges. */ > + if (expstart.line == expend.line) > + { > + char_span line = location_get_source_line (expstart.file, > expstart.line); > + if (line.length () < 1) > + return NULL; > + int s = expstart.column - 1; > + int l = expend.column - s; Can we please avoid lower-case L "ell" for variable names, as it looks so similar to the numeral for one. "len" would be more descriptive here. > + if (line.length () < (size_t)expend.column) > + return NULL; > + return line.subspan (s, l).xstrdup (); > + } > + > + struct obstack buf_obstack; > + obstack_init (&buf_obstack); > + > + /* Loop through all lines in the range and append each to buf; may > trim > + parts of the start and end lines off depending on column > values. */ > + for (int l = expstart.line; l <= expend.line; ++l) Again, please let's not have a var named "l". Maybe "iter_line" as that's what is being done? > + { > + char_span line = location_get_source_line (expstart.file, l); > + if (line.length () < 1 && (l != expstart.line && l != > expend.line)) ...especially as I *think* the first comparison is against numeral one, whereas comparisons two and three are against lower-case ell, but I'm having to squint at the font in my email client to be sure :-/ > + continue; > + > + /* For the first line in the range, only start at > expstart.column */ > + if (l == expstart.line) > + { > + if (expstart.column == 0) > + return NULL; > + if (line.length () < (size_t)expstart.column - 1) > + return NULL; > + line = line.subspan (expstart.column - 1, > + line.length() - expstart.column + 1); > + } > + /* For the last line, don't go past expend.column */ > + else if (l == expend.line) > + { > + if (line.length () < (size_t)expend.column) > + return NULL; > + line = line.subspan (0, expend.column); > + } > + > + obstack_grow (&buf_obstack, line.get_buffer (), line.length > ()); Is this accumulating the trailing newline characters into the buf_obstack? I *think* it is, but it seems worth a comment for each of the three cases (first line, intermediate line, last line). > + } > + > + /* NUL-terminate and finish the buf obstack. */ > + obstack_1grow (&buf_obstack, 0); > + const char *buf = (const char *) obstack_finish (&buf_obstack); > + > + /* TODO should we collapse/trim newlines and runs of spaces? */ > + return xstrdup (buf); > +} > + Do you have test coverage for this from the DejaGnu side? If not, you could add selftest coverage for this; see input.cc's test_reading_source_line for something similar. OK for trunk otherwise. Hope this is helpful Dave