https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120610

            Bug ID: 120610
           Summary: pp_token_lists are not always balanced
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dmalcolm at gcc dot gnu.org
            Blocks: 116792
  Target Milestone: ---

In the experimental-html output sink, html_token_printer::print_tokens takes a
pp_token_list and prints it as html.

Unfortunately this sometimes doesn't fully work; I've seen:

 ./xgcc -B. -S -fanalyzer -fdiagnostics-add-output=experimental-html
../../src/gcc/testsuite/gcc.dg/bad-binary-ops.c

where the BEGIN_QUOTE/END_QUOTE tokens and the highlighting aren't balanced
around the "aka" part of the message.

The token list in question is:
  [TEXT("invalid operands to binary / (have "), BEGIN_QUOTE,
   BEGIN_COLOR("highlight-a"), TEXT("__m128’ {aka ‘float’}"), END_COLOR,
   TEXT(" and "), BEGIN_QUOTE, BEGIN_COLOR("highlight-b"),
   TEXT("const int *"), END_COLOR, END_QUOTE, TEXT(")")]

which hierarchically is:

  TEXT("invalid operands to binary / (have "),
  BEGIN_QUOTE,
    BEGIN_COLOR("highlight-a"),
      TEXT("__m128’ {aka ‘float’}"), <--- note there's no END_QUOTE here
                                          to match the BEGIN_QUOTE
    END_COLOR,
    TEXT(" and "),
    BEGIN_QUOTE,
      BEGIN_COLOR("highlight-b"),
        TEXT("const int *"),
      END_COLOR,
    END_QUOTE,
    TEXT(")")]
  (the first BEGIN_QUOTE is never closed)

and it probably should instead be:

  TEXT("invalid operands to binary / (have "),
  BEGIN_QUOTE,
    BEGIN_COLOR("highlight-a"),
      TEXT("__m128"),
    END_COLOR,
  END_QUOTE, <--- here's the missing END_QUOTE, and more fine-grained nodes
                  for the rest of the type
  TEXT(" {aka "),
  BEGIN_QUOTE,
    BEGIN_COLOR("highlight-a"),
      TEXT("float"),
    END_COLOR,
  END_QUOTE,
  TEXT("} and "),
  BEGIN_QUOTE,
    BEGIN_COLOR("highlight-b"),
      TEXT("const int *"),
    END_COLOR,
  END_QUOTE,
  TEXT(")")]

which is properly nested.

This leads to not-quite-correct generated HTML.

Something's going wrong inside the "aka" handling.

For now I'm going to firewall the html_token_printer so it has its own
xml::printer to restrict the scope of the mismatches.

Ultimately we might want an "is nested" precondition on pp_token_list that
has to hold for calls to the token_printer::print_tokens vfuncs, along with a
validator for pp_token_list that checks it.
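
As a rough illustration of what such a validator could check, here is a
minimal standalone sketch. It does not use GCC's real pp_token classes: the
token_kind enum, the token struct and first_nesting_error are all hypothetical
stand-ins invented for this example, and only the five token kinds that appear
in this report are modelled. The idea is simply a stack of open BEGIN_* tokens.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the token kinds seen in this report; GCC's
// own pp_token classes differ in detail.
enum class token_kind { TEXT, BEGIN_QUOTE, END_QUOTE, BEGIN_COLOR, END_COLOR };

struct token
{
  token_kind kind;
  std::string value; // text or color name; unused for the other kinds
};

// Return the index of the first token that breaks nesting, or -1 if the
// list is properly nested.  A mismatched or stray END_* is reported at its
// own index; if the list ends with tokens still open, the outermost
// unclosed BEGIN_* is reported.
static long
first_nesting_error (const std::vector<token> &tokens)
{
  std::vector<std::size_t> open; // indices of currently open BEGIN_* tokens
  for (std::size_t i = 0; i < tokens.size (); ++i)
    switch (tokens[i].kind)
      {
      case token_kind::TEXT:
        break;
      case token_kind::BEGIN_QUOTE:
      case token_kind::BEGIN_COLOR:
        open.push_back (i);
        break;
      case token_kind::END_QUOTE:
      case token_kind::END_COLOR:
        {
          // Each END_* must close the innermost open BEGIN_* of its kind.
          const token_kind expected
            = (tokens[i].kind == token_kind::END_QUOTE
               ? token_kind::BEGIN_QUOTE
               : token_kind::BEGIN_COLOR);
          if (open.empty () || tokens[open.back ()].kind != expected)
            return static_cast<long> (i);
          open.pop_back ();
        }
        break;
      }
  return open.empty () ? -1 : static_cast<long> (open.front ());
}
```

Fed the token list from this report, the sketch flags index 1 (the first
BEGIN_QUOTE, which is never closed); fed the proposed list, it returns -1.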


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116792
[Bug 116792] RFE: should we be able to generate diagnostics in HTML format?
