On Sat, 30 Sept 2023 at 22:12, Joern Rennecke
<joern.renne...@embecosm.com> wrote:

> Also, we might have different directives for not scanning in LTO sections -
> or just ignoring .ascii .  Or maybe the other way round - you have to do
> something special if you want to scan inside strings, and by default we
> don't look inside strings?
> LTO information uses ascii, and ISTR sometimes also a zero-terminated
> variant (asciiz?); There might also be some string constant outputs, or stabs
> information.
> One possible rule I think might work is: if the RE doesn't mention a quote,
> don't scan what's quoted inside double quotes.  Although we might have
> to look out for backslash-escaped quotes to find the proper end of a quoted
> string.

I've thought about this some more, and we need something that's simple for
dejagnu and simple to describe.

So I propose we look at the first character of the regexp, and if it's neither
^ nor \ (neither caret nor backslash), we consider the regexp un-anchored,
and prepend ^[^"]* , so it won't allow a match after a double quote.
Then document this in sourcebuild.texi, with some mention of LTO information
and stabs, and also mentioning that if you really want to match irrespective
of a leading quote, you can prepend ^.* to your regexp.
There are good reasons to be more specific with your regexps in general,
but the matches in LTO are particularly damaging because they appear
semi-random, so often escape a regression test when the test is made,
only to surface during somebody else's regression test.
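
To make the proposed rule concrete, here is a rough sketch of it in Python
rather than in the Tcl that dejagnu would actually use.  The names
guard_against_quotes and scan_lines are made up for illustration, and the
sketch assumes the regexp is applied one assembler line at a time; how the
real scan directives feed text to the regexp is not settled here.

```python
import re

def guard_against_quotes(pattern: str) -> str:
    """Apply the proposed rule (hypothetical helper, not dejagnu code).

    If the regexp starts with ^ or a backslash, leave it alone.
    Otherwise treat it as un-anchored and prepend ^[^"]* so that it
    cannot match anything that follows a double quote on the line.
    """
    if pattern[:1] in ("^", "\\"):
        return pattern
    return '^[^"]*' + pattern

def scan_lines(pattern: str, text: str) -> list[str]:
    """Return the lines matched by the guarded regexp.

    Assumption: matching is done per line, so ^ means start of line.
    """
    rx = re.compile(guard_against_quotes(pattern))
    return [line for line in text.splitlines() if rx.search(line)]

# A real instruction, plus the same text hidden inside an .ascii string,
# as LTO sections or other string output might produce.
asm = '\tcall\tfoo\n\t.ascii "call foo"\n'

plain = scan_lines("call", asm)        # guarded: string contents ignored
explicit = scan_lines("^.*call", asm)  # opt-out: matches after a quote too
```

With this input, the plain regexp only finds the instruction line, while the
explicit ^.* form also finds the line where "call" sits inside the string.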
