RIscRIpt added inline comments.
================
Comment at: clang/include/clang/Basic/TokenKinds.h:87-93
+/// Return true if this token is a predefined macro
+/// unexpandable by MSVC preprocessor.
+inline bool isUnexpandableMsMacro(TokenKind K) {
+ return K == tok::kw___FUNCTION__ || K == tok::kw___FUNCSIG__ ||
+ K == tok::kw_L__FUNCTION__ || K == tok::kw_L__FUNCSIG__ ||
+ K == tok::kw___FUNCDNAME__;
+}
----------------
cor3ntin wrote:
> RIscRIpt wrote:
> > tahonermann wrote:
> > > cor3ntin wrote:
> > > > tahonermann wrote:
> > > > > RIscRIpt wrote:
> > > > > > tahonermann wrote:
> > > > > > >
> > > > > > Thanks, I like the name. But shouldn't we reflect that we are
> > > > > > referring to only Microsoft (unexpandable) macros? How about
> > > > > > `isFunctionLocalPredefinedMsMacro`?
> > > > > I don't think the Microsoft association is meaningful. Sure, some of
> > > > > the predefined macros are only treated differently when compiling in
> > > > > Microsoft mode, but some of those macros aren't Microsoft specific.
> > > > > Also, there might be macros provided by other implementations that
> > > > > we'll want to emulate some day.
> > > > I think it is, there is currently no way to link
> > > > `isFunctionLocalPredefinedMacro` to the MS feature. "MSPredefinedMacro"
> > > > is pretty self explanatory, same reason we most often - but not always
> > > > - use GNU in the name of function related to GNU extensions.
> > > > There are enough similar-sounding features and extensions that we
> > > > should try to make it clear which feature we are talking about.
> > > Perhaps we still haven't found the right name then. With the name that
> > > I've been advocating, this function should also return true for
> > > `tok::kw___PRETTY_FUNCTION__` even though that macro should not expand to
> > > something that can be concatenated with other string literals (whether
> > > compiling in Microsoft mode or not).
> > >
> > > The Microsoft extension is the localized expansion to a string literal
> > > that can be concatenated with other string literals. We probably should
> > > isolate the conditions for that behavior to one place and if we do that,
> > > then I guess naming this function as specific to Microsoft mode is ok; we
> > > can always rename it later if it gets used for non-Microsoft extensions.
> > >
> > > Here is my suggestion then:
> > > // Return true if the token corresponds to a function local predefined
> > > // macro that, in Microsoft modes, expands to a string literal (that can
> > > // then be concatenated with other string literals).
> > > inline bool isMsFunctionLocalStringLiteralMacro(TokenKind K, const
> > > LangOptions &langOpts) {
> > > return langOpts.MicrosoftExt && (
> > > K == tok::kw___FUNCTION__ || K == tok::kw_L__FUNCTION__ ||
> > > K == tok::kw___FUNCSIG__ || K == tok::kw_L__FUNCSIG__ ||
> > > K == tok::kw___FUNCDNAME__);
> > > }
> > >
> > > This will avoid the need for callers to check for `MicrosoftExt`; that is
> > > what I really want to avoid since it is easy to forget to do that check.
> > 1. By requiring user to pass `LangOptions` I think we can remove MS
> > association (implying that there could be other non-msft cases in the
> > future)
> > 2. We would have to include a `LangOptions.h` in `TokenKinds.h`, are we ok
> > with this? Alternatively while this function is for msft cases only, we
> > could pass `bool MicrosoftExt`
> >
> > Personally, I like version with `LangOptions` and removal of `MS`.
> > By requiring user to pass LangOptions I think we can remove MS association
>
> I don't think there is any motivation to future proof against hypotheticals,
> we can always refactor later
>
> > We would have to include a LangOptions.h in TokenKinds.h
>
> This makes me uneasy, but i think we can move the function to
> `LiteralSupport.h` and include that in `ParseExpr.cpp`.
Sounds good to me. Due to other changes I also moved `TokenIsLikeStringLiteral`
there.
> I don't think there is any motivation to future proof against hypotheticals,
> we can always refactor later
Yes, not a strong argument, but I'd like to keep current name if you are not
against.
================
Comment at: clang/lib/Sema/SemaExpr.cpp:2010-2013
+ OS << "R\"EFLPM("
+ << PredefinedExpr::ComputeName(getPredefinedExprKind(Tok.getKind()),
+ currentDecl)
+ << ")EFLPM\"";
----------------
tahonermann wrote:
> cor3ntin wrote:
> > tahonermann wrote:
> > > RIscRIpt wrote:
> > > > tahonermann wrote:
> > > > > "EFLPM"? I'm assuming this is an attempt to avoid an accidental clash
> > > > > with the computed name. Does this suffice to ensure a clash can't
> > > > > happen? If not, can we do better? Per
> > > > > http://eel.is/c++draft/lex.string#nt:r-char and
> > > > > http://eel.is/c++draft/lex.charset#1, we have a lot of flexibility
> > > > > regarding which characters to use.
> > > > At first I thought you were hinting me to use non-ascii characters for
> > > > d-char-sequence. However, when I looked closely d-char-sequence is not
> > > > as flexible as r-chars that you referenced:
> > > > http://eel.is/c++draft/lex.string#nt:d-char and
> > > > http://eel.is/c++draft/lex.charset#2
> > > >
> > > > Here're my thoughts:
> > > > 1. I was afraid that there could be a possibility of a function name
> > > > contain either quote (`"`) or a backslash (`\`) despite it seemed
> > > > impossible.
> > > > 2. At first I used `Lexer::Stringify` to make it bullet-proof. I didn't
> > > > like it, neither did you (reviewers).
> > > > 3. I replaced `Lexer::Stringify` with raw-string-literal
> > > > (d-char-sequence was chosen as acronym of current function name).
> > > > 4. Current `EFLPM` looks weird and cryptic. And we are limited to
> > > > [lex.charset.basic] with exceptions (space and earlier chars are not
> > > > allowed). I think, using a raw-string-literal without d-char-sequence
> > > > would be enough, because problem could be caused only by two
> > > > consecutive characters `)"`, neither of which can appear in a function
> > > > name.
> > > Sorry about that; you are absolutely right that `d-char-sequence` is
> > > (much) more constrained than `r-char-sequence`.
> > >
> > > For `__FUNCSIG__`, the generated text can include arbitrary text. For
> > > example, given this declaration:
> > > void f(decltype(sizeof("()"))*)
> > > the macro value is:
> > > void __cdecl f(decltype(sizeof ("()")) *)
> > > which does contain `)"`.
> > >
> > > I think we should do more to avoid any possibility of the resulting
> > > string being unparseable because the resulting diagnostics will be
> > > completely inscrutable. The best way to do that would be to avoid trying
> > > to produce a parseable string literal in the first place. Based on that
> > > and given that `Sema::ActOnStringLiteral()` immediately uses
> > > `StringLiteralParser` to produce an expanded string literal, I think we
> > > would be better off moving this expansion there such that these macros
> > > can be expanded to already cooked string literals that don't need to be
> > > parsed at all. This will mean having to modify the `StringLiteralParser`
> > > constructor to accept a `Decl*` pointer that will be passed
> > > `getCurLocalScopeDecl()` in `Sema::ActOnStringLiteral()`.
> > > `StringLiteralParser` can then assert if it ever encounters one of these
> > > function local predefined tokens when it hasn't been provided a
> > > corresponding `Decl*` to use with them.
> > I didn't think about that scenario but in that case the original idea to
> > use Lexer::Stringify seems like a cleaner solution to me.
> > Adding that kind of logic to `LiteralSupport` creates a lot of coupling for
> > not much gain that I can see.
> >
> >
> > In that light, `Lexer::Stringify` seems like the least worse option.
> Yeah, that might be the simpler solution. That matches how `__FILE__`,
> `__BASE_FILE__`, and `__FILE_NAME__` are handled in
> `Preprocessor::ExpandBuiltinMacro()`.
Thanks for pointing out good counter-example.
> In that light, Lexer::Stringify seems like the least worse option.
Unfortunately, yes.
In regards other possible implementations, I commented previously here:
https://reviews.llvm.org/D153914#4500807
> to modify the `StringLiteralParser` constructor to accept a `Decl*`
This requires dependency (and linkage) of Lex with AST.
I am going to add your counter-examples to test, and change implementation back
to `Lexer::Stringify`
================
Comment at: clang/test/Sema/ms_predefined_expr.cpp:156
+ static_assert(__FUNCTION__ ""_len == 14); // expected-warning{{string
literal concatenation of predefined identifier '__FUNCTION__' is a Microsoft
extension}}
}
----------------
tahonermann wrote:
> Here is another case to test. MSVC accepts this.
> void test_static_assert() {
> static_assert(true, __FUNCTION__);
> }
Thanks for finding this! I had to do some adjustments to support it. In
addition I changed diagnostics message because it no longer made sense, and
removed `assert` about "single predefined identifiers shall be handled by
ActOnPredefinedExpr". Other code around references of
`err_expected_string_literal` do not seem to require support of this MS ext.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D153914/new/
https://reviews.llvm.org/D153914
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits