https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120610
Bug ID: 120610
Summary: pp_token_lists are not always balanced
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: dmalcolm at gcc dot gnu.org
Blocks: 116792
Target Milestone: ---

In the experimental-html output sink, html_token_printer::print_tokens takes a pp_token_list and prints it as html.

Unfortunately this sometimes doesn't fully work; I've seen:

  ./xgcc -B. -S -fanalyzer -fdiagnostics-add-output=experimental-html ../../src/gcc/testsuite/gcc.dg/bad-binary-ops.c

where the BEGIN_QUOTE END_QUOTE and highlighting aren't balanced, with "aka".

The token list in question is:

  TEXT("invalid operands to binary / (have "),
  BEGIN_QUOTE,
  BEGIN_COLOR("highlight-a"),
  TEXT("__m128’ {aka ‘float’}"),
  END_COLOR,
  TEXT(" and "),
  BEGIN_QUOTE,
  BEGIN_COLOR("highlight-b"),
  TEXT("const int *"),
  END_COLOR,
  END_QUOTE,
  TEXT(")")]

which hierarchically is:

  TEXT("invalid operands to binary / (have "),
  BEGIN_QUOTE,
    BEGIN_COLOR("highlight-a"),
      TEXT("__m128’ {aka ‘float’}"), <--- note there's no END_QUOTE here to match the BEGIN_QUOTE
    END_COLOR,
    TEXT(" and "),
    BEGIN_QUOTE,
      BEGIN_COLOR("highlight-b"),
        TEXT("const int *"),
      END_COLOR,
    END_QUOTE,
    TEXT(")")]
  (unclosed END_QUOTE)

and it probably should instead be:

  TEXT("invalid operands to binary / (have "),
  BEGIN_QUOTE,
    BEGIN_COLOR("highlight-a"),
      TEXT("__m128")
    END_COLOR
  END_QUOTE <--- here's the missing END_QUOTE, and more fine-grained nodes for the rest of the type
  TEXT (" {aka ")
  BEGIN_QUOTE
    BEGIN_COLOR("highlight-a"),
      TEXT("float")
    END_COLOR,
  END_QUOTE
  TEXT("} and "),
  BEGIN_QUOTE,
    BEGIN_COLOR("highlight-b"),
      TEXT("const int *"),
    END_COLOR,
  END_QUOTE,
  TEXT(")")]

which is properly nested.

This leads to not-quite-correct generated HTML. Something's going wrong inside the "aka" handling.

For now I'm going to firewall the html_token_printer so it has its own xml::printer to restrict the scope of the mismatches.

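To make "properly nested" concrete, here is a minimal sketch of a balance check using a stack. The `token_kind` enum, the `token` struct and `is_properly_nested` are hypothetical, simplified stand-ins written for illustration; they are not GCC's actual pp_token classes, and a real check would need to cover the other token kinds as well.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the tokens in a pp_token_list.
// These names are assumptions for illustration, not GCC's real types.
enum class token_kind { text, begin_quote, end_quote, begin_color, end_color };

struct token
{
  token_kind kind;
  std::string payload; // the text, or the color name for begin_color
};

// Return true if every BEGIN_* is closed by the matching END_* and the
// spans nest properly, i.e. the innermost open span is always closed first.
static bool
is_properly_nested (const std::vector<token> &tokens)
{
  std::vector<token_kind> open; // stack of currently open BEGIN_* kinds
  for (const token &tok : tokens)
    switch (tok.kind)
      {
      case token_kind::text:
        break;
      case token_kind::begin_quote:
      case token_kind::begin_color:
        open.push_back (tok.kind);
        break;
      case token_kind::end_quote:
        if (open.empty () || open.back () != token_kind::begin_quote)
          return false;
        open.pop_back ();
        break;
      case token_kind::end_color:
        if (open.empty () || open.back () != token_kind::begin_color)
          return false;
        open.pop_back ();
        break;
      }
  // Anything still open at the end is an unclosed BEGIN_*.
  return open.empty ();
}
```

With this sketch, the token list reported above would be rejected because the first BEGIN_QUOTE is still open at the end, whereas the proposed replacement list would be accepted.
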
Ultimately we might want to have an "is nested" precondition for pp_token_list that might have to hold for calls to token_printer::print_tokens vfuncs. We might want a validator for pp_token_list.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116792
[Bug 116792] RFE: should we be able to generate diagnostics in HTML format?