-------- Original Message --------
Subject: Re: [PATCH] Postpone __LINE__ evaluation to the end of #line
directives
Date: Thu, 28 Nov 2013 17:32:41 -0500
From: Max Woodbury <mtewoodb...@gmail.com>
To: Joseph S. Myers <jos...@codesourcery.com>
On 11/28/2013 11:34 AM, Joseph S. Myers wrote:
On Wed, 27 Nov 2013, Max Woodbury wrote:
There should be a way to change the __FILE__ value without changing the
line number sequencing. Whatever that mechanism is, it should NOT
introduce maintenance problems that involve counting lines of code.
I think that #line is mainly intended for use by code generators that
generate C code, rather than directly by people writing C programs. Such
a code generator can easily manage counting lines of code.
A little Googling quickly turns up examples that make it clear that:
#line __LINE__ "new__FILE__value"
is that expected mechanism,
You'll find any number of examples online based on misconceptions about
the C language, possibly together with what one particular implementation
does. Any recommendation to do things based on an area where the editor
of the standard has said the ambiguity in the standard is deliberate is
clearly a bad recommendation. Recommendations on use of C should be based
on areas where the standard is clear and implementations agree.
Please try not to be deliberately obstructive. While #line is indeed
used extensively by code generators to map generated code back to the
source code used by the generator, other uses are possible, and the
expectations associated with those uses are worthy of serious
consideration. '#line __LINE__' is indeed a common idiom and it is
expected to leave the line numbering sequence unchanged.
As for the sequence of comments you point to, they are discussing the
use of __LINE__ in macros, not directives. The standard is quite a bit
more explicit about token substitution in directives, making it fairly
clear that substitution is not to occur in directives until
specifically called for. The elaboration of three distinct forms for
the '#line' directive with substitution only being called for in the
third and last form, indicates that something special is intended.
The standard was not created in a vacuum. The ideas did not
materialize out of thin air. The elaborate specification was intended
to codify actual usage. That usage included the '#line __LINE__' idiom
with its intent to NOT break line sequencing.
In other words, if you processed the text in multiple phases the way
the standard requires, you would not substitute the value for the
__LINE__ token until after the end of the directive has been seen.
Thus the problem only arises because this implementation folds the
translation phases into a single pass over the text and takes an
improper short-cut as it does so. The standard explicitly warns
against this kind of mistake.
The standard itself mixes up the phases. Recall that the definition of
line number is "one greater than the number of new-line characters read or
introduced in translation phase 1 (5.1.1.2) while processing the source
file to the current token" (where "current token" is never defined). If
the phases were completely separate, by your reasoning every newline has
been processed in phase 1 before any of phases 2, 3 or 4 do anything, and
so all line numbers relate to the end of the file. There is absolutely
nothing to say that the newline at the end of the #line directive has been
read "while processing the source file to the current token" (if __LINE__
in the #line directive is the current token) but that the newline after it
hasn't been read; if anything, the phases imply that all newlines have
been read.
The standard also includes a mechanism for encoding <end-of-line>s seen
in tokens, so that argument falls apart fairly easily.
This case is just as ambiguous as the case of a multi-line macro call,
where __LINE__ gets expanded somewhere in the macro arguments, and the
line number can be that of the macro name, or of the closing parenthesis
of the call, or somewhere in between, and the standard does not make a
conformance distinction between those choices.
So, I don't think we should make complicated changes to implement one
particular choice in an area of deliberate ambiguity without direction
from WG14 to eliminate the ambiguity in the standard. Instead, we can let
the choices be whatever is most natural in the implementation. If you
believe the standard is defective in not defining certain things, I advise
filing a DR (or, when next open for revisions, proposing a paper at a
meeting to change the definition as you think appropriate).
As pointed out above, this case is distinct from the macro CALL case.
The rules are much more explicitly spelled out for directives, and the
case is only ambiguous if you start with the preconceived notion that it is.
The standard is explicit enough as it stands.
Further, the changes are not all that complicated. They amount to one
check in the __LINE__ macro expansion code, a flag that is set and
reset, and two special-case tests in the #line processing code. The
alternative is to actually code to the standard, but that is quite a
bit more work.