On Wed, 2018-02-07 at 17:26 +0000, Nick Clifton wrote:
> Hi Martin, Hi David,
>
> OK - attached is a new patch that:
>
> * Replaces control characters with their escape equivalents.
> * Includes a testcase.
>
> I was not sure what to do about the inconsistencies between the
> behaviour of #warning and #pragma GCC warning, (and the error
> equivalents), so I decided to leave it for now. Fixing them
> will require mucking about in the C preprocessor library, which
> I did not fancy doing at the moment.
>
> So, is this enhanced patch OK now ?
>
> Cheers
> Nick
>
> gcc/ChangeLog
> 2018-02-07 Nick Clifton <[email protected]>
>
> PR 84195
> * tree.c (warn_deprecated_use): Replace control characters in
> deprecation messages with escape sequences.
>
> gcc/testsuite/ChangeLog
> 2018-02-07 Nick Clifton <[email protected]>
>
> * gcc.c-torture/compile/pr84195.c: New test.
> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c (revision 257451)
> +++ gcc/tree.c (working copy)
> @@ -12431,8 +12431,66 @@
> if (attr)
> attr = lookup_attribute ("deprecated", attr);
>
> + char * new_msg = NULL;
> + unsigned int new_i = 0;
> +
> if (attr)
> - msg = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr)));
> + {
> + msg = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr)));
> +
> + if (msg)
> + {
> + /* PR 84195: Replace control characters in the message with their
> + escaped equivalents. Allow newlines if -fmessage-length has
> + been set to a non-zero value.
I'm not quite sure why we allow newlines in this case, sorry.
> This is done here, rather than
> + where the attribute is recorded as the message length can
> + change between these two locations. */
> +
> + /* FIXME: Do we need to check to see if msg is longer than 2^32 ? */
> + unsigned int i;
Please can you split this out into a subroutine, and please add selftests
for it, so that we can unit-test it directly.
Try grepping for e.g. tree_c_tests and other users of selftest.h to see
the idea.
This would be in addition to the DejaGnu-based test (I prefer a "belt and
braces" approach here).
Suggested unittest coverage:
- the empty string
- a non-empty string in which nothing is escaped
- a non-empty string in which everything is escaped
- all of the cases in the switch
- strings with and without newlines, since you're special-casing that
Ownership of the string seems fiddly, so given that gcc is implemented
in C++98, maybe instead of a subroutine, make it a class, with
responsibility for a string that we may or may not own; something like:
class escaped_string
{
public:
escaped_string (const char *unescaped);
~escaped_string () { if (m_owned) free (m_str); }
operator const char *() const { return m_str; }
private:
char *m_str;
bool m_owned;
};
so you can (I hope) simply do:
ASSERT_STREQ ("foo\\nbar", escaped_string ("foo\nbar"));
and similar for the various cases, without needing an explicit "free"
that can get lost if the function ever gains an early return.
(Or, better, have an escape_string function return an instance of a
class that has responsibility for the "if (ownership) free" dtor, but I
can't think of a good name for such a class right now).
Note that "make selftest-valgrind" will run the selftests under
valgrind (sadly, there are a few pre-existing leaks, even if you do
configure gcc with --enable-valgrind-annotations).
> +
> + for (i = 0; i < strlen (msg); i++)
Just get strlen once, rather than each time through the loop?
"size_t unescaped_len", maybe?
> + {
> + char c = msg[i];
> +
> + if (! ISCNTRL (c))
> + {
> + if (new_msg)
> + new_msg[new_i++] = c;
> + continue;
> + }
> +
> + if (c != '\n' || ! pp_is_wrapping_line (global_dc->printer))
Could a "bool needs_escaping_p (char)" subroutine clarify the logic
here?
> + {
Maybe a comment here noting this is the first char needing escaping?
> + if (new_msg == NULL)
> + {
> + new_msg = (char *) xmalloc (strlen (msg) * 2);
Off by one, I think - shouldn't this be
(strlen (msg) * 2) + 1
to allow for every char potentially being escaped, *plus* the NIL-
terminator?
Hence worth having a unittest for the "non-empty and everything is
escaped" case.
(and reuse the one-time strlen from above).
> + strncpy (new_msg, msg, i);
> + new_i = i;
> + }
> + new_msg [new_i++] = '\\';
> + switch (c)
> + {
> + case 7: new_msg[new_i++] = 'a'; break;
> + case 8: new_msg[new_i++] = 'b'; break;
> + case 27: new_msg[new_i++] = 'e'; break;
> + case 12: new_msg[new_i++] = 'f'; break;
> + case 10: new_msg[new_i++] = 'n'; break;
> + case 13: new_msg[new_i++] = 'r'; break;
> + case 9: new_msg[new_i++] = 't'; break;
> + case 11: new_msg[new_i++] = 'v'; break;
Can all these case labels be char constants, so e.g.:
case '\n': new_msg[new_i++] = 'n'; break;
(does that work for every build/host combo?)
> + default: new_msg[new_i++] = '?'; break;
> + }
> + }
> + }
> +
> + if (new_msg)
> + {
> + new_msg[new_i] = 0;
> + msg = new_msg;
> + }
> + }
> + }
> else
> msg = NULL;
>
> @@ -12505,6 +12563,8 @@
> }
> }
> }
> +
> + free (new_msg);
> }
>
> /* Return true if REF has a COMPONENT_REF with a bit-field field declaration
> --- /dev/null 2018-02-07 09:26:20.608663241 +0000
> +++ gcc/testsuite/gcc.c-torture/compile/pr84195.c 2018-02-07
> 17:20:13.494587052 +0000
> @@ -0,0 +1,17 @@
> +/* { dg-options "-Wdeprecated-declarations" } */
> +
> +/* Check that MSG is printed without the escape characters being interpreted.
> + Especially the newlines.
> +
> + Note - gcc's behaviour is inconsistent in this regard as #error and
> + #warning will also display control characters as escape sequences,
> + whereas #pragma GCC error and #pragma GCC warning will perform the
> + control operations of the control characters. */
> +
> +#define MSG "foo\n\t\rbar"
> +
> +int f (int i __attribute__ ((deprecated (MSG))))
> +{
> + return 0 ? i : 0; /* { dg-warning "'i' is deprecated: foo.n.t.rbar" } */
> +}
We want the output to be:
file:line:col: warning: 'i' is deprecated: foo\n\t\rbar [-Wblah-blah]
I think you can use "\\" to signifiy "\" in a DG test, so I *think*
this dg directive can
read:
return 0 ? i : 0; /* { dg-warning "'i' is deprecated: foo\\n\\t\\rbar" } */
Hope this is constructive
Dave