RIscRIpt added inline comments.

================
Comment at: clang/include/clang/Basic/TokenKinds.h:87-93
+/// Return true if this token is a predefined macro
+/// unexpandable by MSVC preprocessor.
+inline bool isUnexpandableMsMacro(TokenKind K) {
+  return K == tok::kw___FUNCTION__ || K == tok::kw___FUNCSIG__ ||
+         K == tok::kw_L__FUNCTION__ || K == tok::kw_L__FUNCSIG__ ||
+         K == tok::kw___FUNCDNAME__;
+}
----------------
cor3ntin wrote:
> RIscRIpt wrote:
> > tahonermann wrote:
> > > cor3ntin wrote:
> > > > tahonermann wrote:
> > > > > RIscRIpt wrote:
> > > > > > tahonermann wrote:
> > > > > > > 
> > > > > > Thanks, I like the name. But shouldn't we reflect that we are 
> > > > > > referring to only Microsoft (unexpandable) macros? How about 
> > > > > > `isFunctionLocalPredefinedMsMacro`?
> > > > > I don't think the Microsoft association is meaningful. Sure, some of 
> > > > > the predefined macros are only treated differently when compiling in 
> > > > > Microsoft mode, but some of those macros aren't Microsoft specific. 
> > > > > Also, there might be macros provided by other implementations that 
> > > > > we'll want to emulate some day.
> > > > I think it is, there is currently no way to link 
> > > > `isFunctionLocalPredefinedMacro` to the MS feature. "MSPredefinedMacro" 
> > > > is pretty self explanatory, same reason we most often - but not always 
> > > > - use GNU in the name of function related to GNU extensions.
> > > > There are enough similar-sounding features and extensions that we 
> > > > should try to make it clear which feature we are talking about.
> > > Perhaps we still haven't found the right name then. With the name that 
> > > I've been advocating, this function should also return true for 
> > > `tok::kw___PRETTY_FUNCTION__` even though that macro should not expand to 
> > > something that can be concatenated with other string literals (whether 
> > > compiling in Microsoft mode or not).
> > > 
> > > The Microsoft extension is the localized expansion to a string literal 
> > > that can be concatenated with other string literals. We probably should 
> > > isolate the conditions for that behavior to one place and if we do that, 
> > > then I guess naming this function as specific to Microsoft mode is ok; we 
> > > can always rename it later if it gets used for non-Microsoft extensions.
> > > 
> > > Here is my suggestion then:
> > >   // Return true if the token corresponds to a function local predefined
> > >   // macro that, in Microsoft modes, expands to a string literal (that can
> > >   // then be concatenated with other string literals).
> > >   inline bool isMsFunctionLocalStringLiteralMacro(TokenKind K, const 
> > > LangOptions &langOpts) {
> > >     return langOpts.MicrosoftExt && (
> > >         K == tok::kw___FUNCTION__ || K == tok::kw_L__FUNCTION__ ||
> > >         K == tok::kw___FUNCSIG__ || K == tok::kw_L__FUNCSIG__ ||
> > >         K == tok::kw___FUNCDNAME__);
> > >   }  
> > > 
> > > This will avoid the need for callers to check for `MicrosoftExt`; that is 
> > > what I really want to avoid since it is easy to forget to do that check.
> > 1. By requiring user to pass `LangOptions` I think we can remove MS 
> > association (implying that there could be other non-msft cases in the 
> > future)
> > 2. We would have to include a `LangOptions.h` in `TokenKinds.h`, are we ok 
> > with this? Alternatively while this function is for msft cases only, we 
> > could pass `bool MicrosoftExt`
> > 
> > Personally, I like version with `LangOptions` and removal of `MS`.
> > By requiring user to pass LangOptions I think we can remove MS association
> 
> I don't think there is any motivation to future proof against hypotheticals, 
> we can always refactor later 
> 
> > We would have to include a LangOptions.h in TokenKinds.h
> 
> This makes me uneasy, but i think we can move the function to 
> `LiteralSupport.h` and include that in `ParseExpr.cpp`.
Sounds good to me. Due to other changes I also moved `TokenIsLikeStringLiteral` 
there.

> I don't think there is any motivation to future proof against hypotheticals, 
> we can always refactor later
Yes, not a strong argument, but I'd like to keep current name if you are not 
against.


================
Comment at: clang/lib/Sema/SemaExpr.cpp:2010-2013
+    OS << "R\"EFLPM("
+       << PredefinedExpr::ComputeName(getPredefinedExprKind(Tok.getKind()),
+                                      currentDecl)
+       << ")EFLPM\"";
----------------
tahonermann wrote:
> cor3ntin wrote:
> > tahonermann wrote:
> > > RIscRIpt wrote:
> > > > tahonermann wrote:
> > > > > "EFLPM"? I'm assuming this is an attempt to avoid an accidental clash 
> > > > > with the computed name. Does this suffice to ensure a clash can't 
> > > > > happen? If not, can we do better? Per 
> > > > > http://eel.is/c++draft/lex.string#nt:r-char and 
> > > > > http://eel.is/c++draft/lex.charset#1, we have a lot of flexibility 
> > > > > regarding which characters to use.
> > > > At first I thought you were hinting me to use non-ascii characters for 
> > > > d-char-sequence. However, when I looked closely d-char-sequence is not 
> > > > as flexible as r-chars that you referenced: 
> > > > http://eel.is/c++draft/lex.string#nt:d-char and 
> > > > http://eel.is/c++draft/lex.charset#2
> > > > 
> > > > Here're my thoughts:
> > > > 1. I was afraid that there could be a possibility of a function name 
> > > > contain either quote (`"`) or a backslash (`\`) despite it seemed 
> > > > impossible.
> > > > 2. At first I used `Lexer::Stringify` to make it bullet-proof. I didn't 
> > > > like it, neither did you (reviewers).
> > > > 3. I replaced `Lexer::Stringify` with raw-string-literal 
> > > > (d-char-sequence was chosen as acronym of current function name).
> > > > 4. Current `EFLPM` looks weird and cryptic. And we are limited to 
> > > > [lex.charset.basic] with exceptions (space and earlier chars are not 
> > > > allowed). I think, using a raw-string-literal without d-char-sequence 
> > > > would be enough, because problem could be caused only by two 
> > > > consecutive characters `)"`, neither of which can appear in a function 
> > > > name.
> > > Sorry about that; you are absolutely right that `d-char-sequence` is 
> > > (much) more constrained than `r-char-sequence`.
> > > 
> > > For `__FUNCSIG__`, the generated text can include arbitrary text. For 
> > > example, given this declaration:
> > >   void f(decltype(sizeof("()"))*)
> > > the macro value is:
> > >   void __cdecl f(decltype(sizeof ("()")) *)
> > > which does contain `)"`.
> > > 
> > > I think we should do more to avoid any possibility of the resulting 
> > > string being unparseable because the resulting diagnostics will be 
> > > completely inscrutable. The best way to do that would be to avoid trying 
> > > to produce a parseable string literal in the first place. Based on that 
> > > and given that `Sema::ActOnStringLiteral()` immediately uses 
> > > `StringLiteralParser` to produce an expanded string literal, I think we 
> > > would be better off moving this expansion there such that these macros 
> > > can be expanded to already cooked string literals that don't need to be 
> > > parsed at all. This will mean having to modify the `StringLiteralParser` 
> > > constructor to accept a `Decl*` pointer that will be passed 
> > > `getCurLocalScopeDecl()` in `Sema::ActOnStringLiteral()`. 
> > > `StringLiteralParser` can then assert if it ever encounters one of these 
> > > function local predefined tokens when it hasn't been provided a 
> > > corresponding `Decl*` to use with them.
> > I didn't think about that scenario but in that case the original idea to 
> > use Lexer::Stringify seems like a cleaner solution to me.
> > Adding that kind of logic to `LiteralSupport` creates a lot of coupling for 
> > not much gain that I can see.
> > 
> > 
> > In that light, `Lexer::Stringify` seems like the least worse option.
> Yeah, that might be the simpler solution. That matches how `__FILE__`, 
> `__BASE_FILE__`, and `__FILE_NAME__` are handled in 
> `Preprocessor::ExpandBuiltinMacro()`.
Thanks for pointing out good counter-example.

> In that light, Lexer::Stringify seems like the least worse option.
Unfortunately, yes.

In regards other possible implementations, I commented previously here: 
https://reviews.llvm.org/D153914#4500807

> to modify the `StringLiteralParser` constructor to accept a `Decl*`
This requires dependency (and linkage) of Lex with AST.

I am going to add your counter-examples to test, and change implementation back 
to `Lexer::Stringify`


================
Comment at: clang/test/Sema/ms_predefined_expr.cpp:156
+    static_assert(__FUNCTION__ ""_len == 14); // expected-warning{{string 
literal concatenation of predefined identifier '__FUNCTION__' is a Microsoft 
extension}}
 }
----------------
tahonermann wrote:
> Here is another case to test. MSVC accepts this.
>   void test_static_assert() {
>     static_assert(true, __FUNCTION__);
>   }
Thanks for finding this! I had to do some adjustments to support it. In 
addition I changed diagnostics message because it no longer made sense, and 
removed `assert` about "single predefined identifiers shall be handled by 
ActOnPredefinedExpr". Other code around references of 
`err_expected_string_literal` do not seem to require support of this MS ext.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153914/new/

https://reviews.llvm.org/D153914

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to