[Bug cobol/119336] cobol: missing copybooks break parser

jklowden at gcc dot gnu.org via Gcc-bugs Wed, 21 May 2025 08:55:17 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119336


James K. Lowden <jklowden at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
           Assignee|unassigned at gcc dot gnu.org      |jklowden at gcc dot gnu.org
         Resolution|---                         |WONTFIX
                 CC|                            |jklowden at gcc dot gnu.org

--- Comment #2 from James K. Lowden <jklowden at gcc dot gnu.org> ---
This is a comedy of errors, only partly fixable.  

When the COPY statement fails, lexing has moved to the next line.  The reported
error is correct.  But the source range for the caret is wrong on the 2nd
message because of the dance between the CDF parser, the lexer, and the main
parser.  It's not that the caret points to the wrong word ("DIVISION") in a
2-word token.  It's pointing to the column for the '.' on the previous line,
which was abandoned by the CDF and is invalid in the main program ahead of
IDENTIFICATION DIVISION.  

What actually happens by design might be surprising.  We're using the CDF here
only to report errors.   

COPY processing happens in the file-reader (lexio) before the input stream is
directed to the lexer.  If lexio parses the COPY directive successfully (and
reads the file), the COPY directive is erased from the input, and the file's
text is inserted, bracketed by line-directives to keep the parser informed
about the filenames.  If lexio cannot parse the COPY directive or read the
file, it leaves the COPY directive intact, where it is parsed by the CDF
parser, which naturally reports syntax errors and missing-file problems.  
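To make the two paths concrete, here is a rough sketch of a COPY pre-pass of the kind described above.  It is not lexio's code.  The function name expand_copy, the one-line "COPY name." pattern, and the '#line' directive format are all assumptions made for illustration; real COPY syntax (REPLACING, continuation lines, library names) is much richer.

```python
import re
from pathlib import Path

# Hypothetical, simplified pattern: a whole line of the form "COPY name."
COPY_RE = re.compile(r"^\s*COPY\s+([A-Za-z0-9_-]+)\s*\.\s*$", re.IGNORECASE)

def expand_copy(lines, main_name, copy_dir):
    """Replace readable COPY directives with the copybook text.

    A directive that cannot be resolved is left in place, so that a
    later stage (the CDF parser, in the description above) can report it.
    """
    out = []
    for lineno, line in enumerate(lines, start=1):
        m = COPY_RE.match(line)
        path = Path(copy_dir) / m.group(1) if m else None
        if path is None or not path.is_file():
            # Not a COPY directive, or the copybook is missing: keep as-is.
            out.append(line)
            continue
        # Assumed directive format; it brackets the inserted text so the
        # parser can tell which file each line came from.
        out.append(f'#line 1 "{path.name}"')
        out.extend(path.read_text().splitlines())
        out.append(f'#line {lineno + 1} "{main_name}"')
    return out
```

In this sketch a missing copybook is not an error in the pre-pass at all.  The directive simply survives into the token stream, which matches the behavior described above: the diagnostic comes later, from whatever parses the leftover COPY.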

Errors at the very top of the program are at the very top of the LALR parser. 
Error recovery in Bison proceeds (usually) by discarding tokens that would fill
the current production, and moving "up a level".  If that level is the very
top, it quits.  
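A toy model of that recovery, for illustration only.  It is neither Bison's algorithm nor the gcobol grammar: the level names, the sync_tokens table, and the function recover are invented here.  The idea it shows is that recovery unwinds nesting levels until it finds one that has a recovery rule, discards input up to a synchronizing token, and gives up if it unwinds past the top.

```python
def recover(stack, tokens, pos, sync_tokens):
    """Toy panic-mode recovery.

    stack       -- nesting levels, outermost first (hypothetical names)
    tokens      -- the input token list
    pos         -- index of the token where the error was detected
    sync_tokens -- maps a level name to the tokens it can resume on

    Returns (remaining_stack, resume_pos), or None if no level can recover.
    """
    stack = list(stack)
    while stack:
        sync = sync_tokens.get(stack[-1])
        if sync:
            p = pos
            # Discard tokens until one this level knows how to resume on.
            while p < len(tokens) and tokens[p] not in sync:
                p += 1
            if p < len(tokens):
                return stack, p
        # No recovery rule here (or no sync token left): move up a level.
        stack.pop()
    return None  # unwound past the top: the parse is abandoned
```

With a stack of program, paragraph, statement and a recovery rule only at the paragraph level, an error inside a statement resumes at the next period.  With nothing on the stack but the top level and no rule there, the function returns None, which corresponds to the parser quitting.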

IMO that's just fine.  If the problem is the very top of the program, nothing
is gained by parsing, say, 400,000 lines filled with errors due to a missing
copybook.  Most programmers learn early to fix the first error first, and that
lesson is well applied here.
