Hi!

This is an attempt to implement the https://wg21.link/p3034r1 paper,
but I'm afraid the wording in the paper is problematic for several
reasons.  I think I understand the intent: the module name and the
partition (if any) shouldn't come from macros, so that they can be
scanned for without preprocessing.  On the other hand, the paper doesn't
want to disable macro expansion in pp-module altogether, because e.g. it
is useful for the optional attribute in a module-declaration to come
from a macro, as which exact attribute is needed might have to be
decided based on preprocessor checks.

The paper added https://eel.is/c++draft/cpp.module#2 which partly
reuses the wording from https://eel.is/c++draft/cpp.module#1
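To make the "scanned for without preprocessing" intent concrete, here is a minimal sketch of what a dependency scanner could do with the raw text of a module declaration. It is only an illustration under stated assumptions: the names ModuleDecl and scan_module_decl are hypothetical (they are not from the paper or from GCC), the input is a single already-isolated line, comments and line splicing are ignored, and C++17 is assumed for std::optional.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type for this sketch; not part of any real scanner.
struct ModuleDecl {
  std::string name;       // e.g. "foo.bar"
  std::string partition;  // e.g. "baz"; empty if there is no partition
};

// Skip spaces and tabs.
static void skip_ws(std::string_view s, std::size_t &i) {
  while (i < s.size() && (s[i] == ' ' || s[i] == '\t'))
    ++i;
}

// Read one identifier starting at i; returns "" if there is none.
static std::string read_ident(std::string_view s, std::size_t &i) {
  std::size_t begin = i;
  if (i < s.size()
      && (std::isalpha(static_cast<unsigned char>(s[i])) || s[i] == '_')) {
    ++i;
    while (i < s.size()
           && (std::isalnum(static_cast<unsigned char>(s[i])) || s[i] == '_'))
      ++i;
  }
  return std::string(s.substr(begin, i - begin));
}

// Read identifier ('.' identifier)*; returns "" if there is no identifier.
static std::string read_dotted(std::string_view s, std::size_t &i) {
  std::string out = read_ident(s, i);
  if (out.empty())
    return out;
  for (;;) {
    std::size_t save = i;
    skip_ws(s, i);
    if (i >= s.size() || s[i] != '.') { i = save; break; }
    ++i;
    skip_ws(s, i);
    std::string part = read_ident(s, i);
    if (part.empty()) { i = save; break; }
    out += '.';
    out += part;
  }
  return out;
}

// Extract the module name and partition from raw text, with no macro
// expansion at all.  Returns nullopt for "module;" and "module :private;",
// which name no module.
std::optional<ModuleDecl> scan_module_decl(std::string_view line) {
  std::size_t i = 0;
  skip_ws(line, i);
  std::string kw = read_ident(line, i);
  if (kw == "export") {
    skip_ws(line, i);
    kw = read_ident(line, i);
  }
  if (kw != "module")
    return std::nullopt;
  skip_ws(line, i);
  ModuleDecl d;
  d.name = read_dotted(line, i);
  if (d.name.empty())
    return std::nullopt;
  skip_ws(line, i);
  if (i < line.size() && line[i] == ':') {
    ++i;
    skip_ws(line, i);
    d.partition = read_dotted(line, i);
    if (d.partition.empty())
      return std::nullopt;
  }
  return d;
}
```

With this sketch, scan_module_decl ("export module foo.bar:baz;") yields name "foo.bar" and partition "baz". Such a scanner is only trustworthy if none of those identifiers can be replaced by the preprocessor, which is the guarantee the paper is after.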
The first issue I see is that the phrase "defined as an object-like
macro", borrowed from that first paragraph, IMHO means something very
different in the two paragraphs.  As per
https://eel.is/c++draft/cpp.pre#7.sentence-1 preprocessing tokens in
preprocessing directives aren't subject to macro expansion unless
otherwise stated, so the export and module tokens aren't expanded and
the requirement that they aren't defined as object-like macros makes
perfect sense.  The problem with the new paragraph is that
https://eel.is/c++draft/cpp.module#3.sentence-1 says that the rest of
the tokens are macro expanded, and after macro expansion none of the
tokens can be defined as an object-like macro; if one were, it would
have been expanded already.

So, I think either the wording needs to change such that not all
preprocessing tokens after module are macro expanded, only those after
the pp-module-name and pp-module-partition tokens (if any), or all
tokens after module are macro expanded but none of the tokens in
pp-module-name and pp-module-partition (if any) may come from macro
expansion.  The patch below implements it as if the former were
specified (but see later).  Essentially, it scans the preprocessing
tokens after module without expansion; if the first one is an
identifier, it disables expansion for it, and if that is followed by .
or : it expects another such identifier (again with expansion disabled),
but it stops once a second : is seen.

The second issue is that while the start of a global-module-fragment is
fine, as it matches the syntax of the new paragraph where the
pp-tokens[opt] aren't present, there is also private-module-fragment in
the syntax, where module is followed by : private ; and in that case
the colon doesn't match the pp-module-name grammar and so now appears to
be invalid.
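The token scan described above for the first issue can be modelled outside of libcpp roughly as follows. This is a toy sketch only, not the patch: Kind, Tok, Result and protect_module_name are hypothetical names, the token vector stands in for what the lexer returns after the module keyword, and the object-like macro diagnostics are left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy token model; hypothetical, unrelated to libcpp's cpp_token.
enum class Kind { Name, Dot, Colon, OpenParen, Other };

struct Tok {
  Kind kind;
  std::string text;
  bool no_expand = false;  // set for names that must not be macro expanded
};

struct Result {
  std::size_t protected_names;  // how many identifiers were marked
  bool paren_error;             // name or partition directly followed by "("
};

// Walk the tokens that follow the module keyword.  Identifiers that can be
// part of the module name or partition are marked no_expand; the walk stops
// at the first token that can't belong to that syntax, including a second
// colon.
Result protect_module_name(std::vector<Tok> &toks) {
  Result r{0, false};
  if (toks.empty() || toks[0].kind != Kind::Name)
    return r;  // e.g. "module;" or "module :private;"
  bool seen_colon = false;
  std::size_t i = 0;
  for (;;) {
    Tok &tok = toks[i];
    if (tok.kind == Kind::Name) {
      tok.no_expand = true;
      ++r.protected_names;
    }
    if (++i == toks.size())
      break;
    Kind next = toks[i].kind;
    if (tok.kind == Kind::Name) {
      if (next == Kind::Dot)
        continue;
      if (next == Kind::Colon && !seen_colon) {
        seen_colon = true;
        continue;
      }
      if (next == Kind::OpenParen)
        r.paren_error = true;
      break;
    }
    // tok was "." or ":"; only an identifier may follow it.
    if (next != Kind::Name)
      break;
  }
  return r;
}
```

For the tokens of foo.bar:baz qux ; this marks foo, bar and baz and leaves qux expandable, which is the behaviour the first variant implies.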
I think the https://eel.is/c++draft/cpp.module#2 paragraph needs to
change so that the pp-tokens of a pp-module may also be
: pp-tokens[opt] (and in that case, I think the colon shouldn't come
from a macro, while private and/or ; can).

The third issue is that there are too many pp-tokens in
https://eel.is/c++draft/cpp.module : one is all the tokens between the
module keyword and the semicolon, and the other is the optional extra
tokens after pp-module-partition (or, if that is missing, after
pp-module-name).  Perhaps introducing some other non-terminal would help
when talking about it?  So in "where the pp-tokens (if any) shall not
begin with a ( preprocessing token" it isn't obvious which pp-tokens it
is talking about (my assumption is the latter), nor whether ( can't
appear there just before macro expansion or also after expansion.  The
patch expects only before expansion, so

  #define F ();
  export module foo F

would be valid during preprocessing but obviously invalid during
compilation, while

  #define foo(n) n;
  export module foo (3)

would be invalid already during preprocessing.

The last issue applies only if the first issue is resolved to allow
expansion of tokens after : if it is the first token, or after
pp-module-partition if present, or after pp-module-name if present.
When a non-preprocessing scanner sees
  export module foo.bar:baz.qux;
it knows nothing can come from preprocessing macros and is OK, but if it
sees
  export module foo.bar:baz qux
then it can't know whether it will be
  export module foo.bar:baz;
or
  export module foo.bar:baz [[]];
or
  export module foo.bar:baz.freddy.garply;
because qux could validly be a macro which expands to ; or [[]]; or
.freddy.garply; etc.  So, either the non-preprocessing scanner would
need to note it as a possible export of foo.bar:baz* module partitions
and preprocess if it needs to know the details (or just compile), or,
if that is not OK, the wording would need to require that the expansion
of (the second) pp-tokens, if any, doesn't start with .
or : (the colon would only be problematic if one isn't already present
in the tokens before it).  So if, e.g., defining qux above as
. whatever is invalid, then the scanner can rely on seeing the whole
module name and partition.

The patch below implements what is described above as the first variant
of the first issue's resolution, i.e. it disables expansion of as many
tokens as could be in the valid module name and module partition syntax,
but as soon as it e.g. sees two adjacent identifiers, the second one can
be macro expanded.  So, effectively:

  #define SEMI ;
  export module SEMI

used to be valid and isn't anymore,

  #define FOO bar
  export module FOO;

isn't valid,

  #define COLON :
  export module COLON private;

isn't valid,

  #define BAR baz
  export module foo.bar:baz.qux.BAR;

isn't valid, but

  #define BAZ .qux
  export module foo BAZ;

or

  #define QUX [[]]
  export module foo QUX;

or

  #define FREDDY :garply
  export module foo FREDDY;

or

  #define GARPLY private
  module : GARPLY;

etc. is.

Do you agree with the above, or does the wording look clear to you?
If you agree, shall I file an issue, or will you handle that?

I think the patch is at least a step in the direction of the paper's
intent, but perhaps doesn't go all the way.  If we need to check for an
initial : or . in the expansion of the first identifier after the module
name or module partition, I'm not sure how that would be implemented.
Currently the patch just lexes direct tokens and pushes them back as
backups, possibly with NO_EXPAND added on them, but I'm not sure whether
there is an easy way to peek at the first token from an expansion.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-08-08  Jakub Jelinek  <ja...@redhat.com>

	PR c++/114461
libcpp/
	* lex.cc (cpp_maybe_module_directive): Implement C++26 P3034R1
	- Module Declarations Shouldn’t be Macros.  For pp-module, if
	module keyword is followed by CPP_NAME, ensure all CPP_NAME tokens
	possibly matching module name and module partition syntax aren't
	expanded and aren't defined as object-like macros.
	Verify the first token after that isn't an open paren.
gcc/testsuite/
	* g++.dg/modules/cpp-7.C: New test.
	* g++.dg/modules/cpp-8.C: New test.
	* g++.dg/modules/cpp-9.C: New test.
	* g++.dg/modules/cpp-10.C: New test.
	* g++.dg/modules/cpp-11.C: New test.
	* g++.dg/modules/pmp-4.C: New test.
	* g++.dg/modules/pmp-5.C: New test.
	* g++.dg/modules/pmp-6.C: New test.
	* g++.dg/modules/token-6.C: New test.
	* g++.dg/modules/token-7.C: New test.
	* g++.dg/modules/token-8.C: New test.
	* g++.dg/modules/token-9.C: New test.
	* g++.dg/modules/token-10.C: New test.
	* g++.dg/modules/token-11.C: New test.
	* g++.dg/modules/token-12.C: New test.
	* g++.dg/modules/token-13.C: New test.
	* g++.dg/modules/token-14.C: New test.
	* g++.dg/modules/token-15.C: New test.
	* g++.dg/modules/token-16.C: New test.
	* g++.dg/modules/dir-only-3.C: Expect an error.
	* g++.dg/modules/dir-only-4.C: Replace export module foo; with
	export module baz; and adjust accordingly.
	* g++.dg/modules/atom-preamble-2_a.C: In export module malcolm;
	replace malcolm with kevin.
	* g++.dg/modules/atom-preamble-4.C: Add bob token before
	NAME(bob).
	(NAME): Define to just semicolon.

--- libcpp/lex.cc.jj	2024-08-07 09:38:00.950805926 +0200
+++ libcpp/lex.cc	2024-08-07 17:23:22.987862447 +0200
@@ -3538,6 +3538,72 @@ cpp_maybe_module_directive (cpp_reader *
       /* Maybe tell the tokenizer we expect a header-name down the
          road.  */
       pfile->state.directive_file_token = header_count;
+
+      /* According to P3034R1, pp-module-name and pp-module-partition tokens
+         if any shouldn't be macro expanded and identifiers shouldn't be
+         defined as object-like macro.  */
+      if (!header_count && peek->type == CPP_NAME)
+        {
+          int state = 0;
+          do
+            {
+              cpp_token *tok = peek;
+              if (tok->type == CPP_NAME)
+                {
+                  cpp_hashnode *node = tok->val.node.node;
+                  /* Don't attempt to expand the token.  */
+                  tok->flags |= NO_EXPAND;
+                  if (_cpp_defined_macro_p (node)
+                      && _cpp_maybe_notify_macro_use (pfile, node,
+                                                      tok->src_loc)
+                      && !cpp_fun_like_macro_p (node))
+                    {
+                      if (state == 0)
+                        cpp_error_with_line (pfile, CPP_DL_ERROR,
+                                             tok->src_loc, 0,
+                                             "module name \"%s\" cannot "
+                                             "be an object-like macro",
+                                             NODE_NAME (node));
+                      else
+                        cpp_error_with_line (pfile, CPP_DL_ERROR,
+                                             tok->src_loc, 0,
+                                             "module partition \"%s\" cannot "
+                                             "be an object-like macro",
+                                             NODE_NAME (node));
+                    }
+                }
+              peek = _cpp_lex_direct (pfile);
+              backup++;
+              if (tok->type == CPP_NAME)
+                {
+                  if (peek->type == CPP_DOT)
+                    continue;
+                  else if (peek->type == CPP_COLON && state == 0)
+                    {
+                      ++state;
+                      continue;
+                    }
+                  else if (peek->type == CPP_OPEN_PAREN)
+                    {
+                      if (state == 0)
+                        cpp_error_with_line (pfile, CPP_DL_ERROR,
+                                             peek->src_loc, 0,
+                                             "module name followed by \"(\"");
+                      else
+                        cpp_error_with_line (pfile, CPP_DL_ERROR,
+                                             peek->src_loc, 0,
+                                             "module partition followed by "
+                                             "\"(\"");
+                      break;
+                    }
+                  else
+                    break;
+                }
+              else if (peek->type != CPP_NAME)
+                break;
+            }
+          while (true);
+        }
     }
   else
     {
--- gcc/testsuite/g++.dg/modules/cpp-7.C.jj	2024-08-07 19:22:09.137437301 +0200
+++ gcc/testsuite/g++.dg/modules/cpp-7.C	2024-08-07 19:23:28.172439914 +0200
@@ -0,0 +1,8 @@
+// { dg-do preprocess }
+// { dg-additional-options "-fmodules-ts" }
+
+#define NAME(X) X;
+
+export module NAME(bob) // { dg-error "module name followed by \\\"\\\(\\\"" }
+
+int i;
--- gcc/testsuite/g++.dg/modules/cpp-8.C.jj	2024-08-07 19:59:14.018375544 +0200
+++ gcc/testsuite/g++.dg/modules/cpp-8.C	2024-08-07 20:00:20.538537180 +0200
@@ -0,0 +1,7 @@
+// { dg-do preprocess }
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob fred;
+export module bob; // { dg-error "module name \\\"bob\\\" cannot be an object-like macro" }
+
+int i;
--- gcc/testsuite/g++.dg/modules/cpp-9.C.jj	2024-08-07 20:02:31.160892253 +0200
+++ gcc/testsuite/g++.dg/modules/cpp-9.C	2024-08-07 20:02:40.519774439 +0200
@@ -0,0 +1,7 @@
+// { dg-do preprocess }
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob fred;
+export module foo.bob; // { dg-error "module name \\\"bob\\\" cannot be an object-like macro" }
+
+int i;
--- gcc/testsuite/g++.dg/modules/cpp-10.C.jj	2024-08-07 20:02:50.200652577 +0200
+++ gcc/testsuite/g++.dg/modules/cpp-10.C	2024-08-07 20:03:14.469347069 +0200
@@ -0,0 +1,7 @@
+// { dg-do preprocess }
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob fred;
+export module foo:bob; // { dg-error "module partition \\\"bob\\\" cannot be an object-like macro" }
+
+int i;
--- gcc/testsuite/g++.dg/modules/cpp-11.C.jj	2024-08-07 20:03:23.340235398 +0200
+++ gcc/testsuite/g++.dg/modules/cpp-11.C	2024-08-07 20:04:02.563741631 +0200
@@ -0,0 +1,7 @@
+// { dg-do preprocess }
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob fred;
+export module foo:bar.bob; // { dg-error "module partition \\\"bob\\\" cannot be an object-like macro" }
+
+int i;
--- gcc/testsuite/g++.dg/modules/pmp-4.C.jj	2024-08-07 19:30:11.769346692 +0200
+++ gcc/testsuite/g++.dg/modules/pmp-4.C	2024-08-07 19:30:35.370048861 +0200
@@ -0,0 +1,9 @@
+// { dg-additional-options -fmodules-ts }
+
+export module bob;
+// { dg-module-cmi bob }
+
+#define PRIVATE private
+
+module :PRIVATE; // { dg-message "sorry, unimplemented: private module fragment" }
+int i;
--- gcc/testsuite/g++.dg/modules/pmp-5.C.jj	2024-08-07 19:30:44.894928655 +0200
+++ gcc/testsuite/g++.dg/modules/pmp-5.C	2024-08-07 19:30:57.630767939 +0200
@@ -0,0 +1,9 @@
+// { dg-additional-options -fmodules-ts }
+
+export module bob;
+// { dg-module-cmi bob }
+
+#define PRIVATE_SEMI private ;
+
+module :PRIVATE_SEMI // { dg-message "sorry, unimplemented: private module fragment" }
+int i;
--- gcc/testsuite/g++.dg/modules/pmp-6.C.jj	2024-08-07 19:31:12.812576351 +0200
+++ gcc/testsuite/g++.dg/modules/pmp-6.C	2024-08-07 19:31:24.212432491 +0200
@@ -0,0 +1,9 @@
+// { dg-additional-options -fmodules-ts }
+
+export module bob;
+// { dg-module-cmi bob }
+
+#define SEMI ;
+
+module :private SEMI // { dg-message "sorry, unimplemented: private module fragment" }
+int i;
--- gcc/testsuite/g++.dg/modules/token-6.C.jj	2024-08-07 19:39:42.057149910 +0200
+++ gcc/testsuite/g++.dg/modules/token-6.C	2024-08-07 19:53:59.557338719 +0200
@@ -0,0 +1,9 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob fred
+export module bob;
+// { dg-error "module name \\\"bob\\\" cannot be an object-like macro" "" { target *-*-* } .-1 }
+
+// { dg-module-cmi !bob }
+// { dg-module-cmi !fred }
+// { dg-prune-output "not writing module" }
--- gcc/testsuite/g++.dg/modules/token-7.C.jj	2024-08-07 19:39:50.363045093 +0200
+++ gcc/testsuite/g++.dg/modules/token-7.C	2024-08-07 19:54:08.173230129 +0200
@@ -0,0 +1,9 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob fred
+export module foo.bar.bob;
+// { dg-error "module name \\\"bob\\\" cannot be an object-like macro" "" { target *-*-* } .-1 }
+
+// { dg-module-cmi !foo.bar.bob }
+// { dg-module-cmi !foo.bar.fred }
+// { dg-prune-output "not writing module" }
--- gcc/testsuite/g++.dg/modules/token-8.C.jj	2024-08-07 19:40:16.586714161 +0200
+++ gcc/testsuite/g++.dg/modules/token-8.C	2024-08-07 19:54:57.717605716 +0200
@@ -0,0 +1,9 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob fred
+export module foo.bar:bob;
+// { dg-error "module partition \\\"bob\\\" cannot be an object-like macro" "" { target *-*-* } .-1 }
+
+// { dg-module-cmi !foo.bar:bob }
+// { dg-module-cmi !foo.bar:fred }
+// { dg-prune-output "not writing module" }
--- gcc/testsuite/g++.dg/modules/token-9.C.jj	2024-08-07 19:40:38.464438072 +0200
+++ gcc/testsuite/g++.dg/modules/token-9.C	2024-08-07 19:55:07.246485628 +0200
@@ -0,0 +1,9 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define garply fred
+export module foo.bar:baz.garply;
+// { dg-error "module partition \\\"garply\\\" cannot be an object-like macro" "" { target *-*-* } .-1 }
+
+// { dg-module-cmi !foo.bar:baz.garply }
+// { dg-module-cmi !foo.bar:baz.fred }
+// { dg-prune-output "not writing module" }
--- gcc/testsuite/g++.dg/modules/token-10.C.jj	2024-08-07 19:41:28.988800478 +0200
+++ gcc/testsuite/g++.dg/modules/token-10.C	2024-08-07 19:41:58.822423990 +0200
@@ -0,0 +1,6 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define semi ;
+export module foo.bar:baz.bob semi
+
+// { dg-module-cmi foo.bar:baz.bob }
--- gcc/testsuite/g++.dg/modules/token-11.C.jj	2024-08-07 19:42:09.790285580 +0200
+++ gcc/testsuite/g++.dg/modules/token-11.C	2024-08-07 19:42:23.915107329 +0200
@@ -0,0 +1,6 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define attr [[]]
+export module foo.bar:baz.bob attr ;
+
+// { dg-module-cmi foo.bar:baz.bob }
--- gcc/testsuite/g++.dg/modules/token-12.C.jj	2024-08-07 19:42:53.328736145 +0200
+++ gcc/testsuite/g++.dg/modules/token-12.C	2024-08-07 19:43:05.682580242 +0200
@@ -0,0 +1,6 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob() fred
+export module bob;
+
+// { dg-module-cmi bob }
--- gcc/testsuite/g++.dg/modules/token-13.C.jj	2024-08-07 19:43:15.601455071 +0200
+++ gcc/testsuite/g++.dg/modules/token-13.C	2024-08-07 19:43:36.789187692 +0200
@@ -0,0 +1,6 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob() fred
+export module foo.bar.bob;
+
+// { dg-module-cmi foo.bar.bob }
--- gcc/testsuite/g++.dg/modules/token-14.C.jj	2024-08-07 19:43:48.980033866 +0200
+++ gcc/testsuite/g++.dg/modules/token-14.C	2024-08-07 19:44:07.482800674 +0200
@@ -0,0 +1,6 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob(n) fred
+export module foo.bar:bob;
+
+// { dg-module-cmi foo.bar:bob }
--- gcc/testsuite/g++.dg/modules/token-15.C.jj	2024-08-07 19:44:21.664621940 +0200
+++ gcc/testsuite/g++.dg/modules/token-15.C	2024-08-07 19:44:28.340537798 +0200
@@ -0,0 +1,6 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob() fred
+export module foo.bar:baz.bob;
+
+// { dg-module-cmi foo.bar:baz.bob }
--- gcc/testsuite/g++.dg/modules/token-16.C.jj	2024-08-07 19:57:45.801487348 +0200
+++ gcc/testsuite/g++.dg/modules/token-16.C	2024-08-07 19:58:44.232750934 +0200
@@ -0,0 +1,9 @@
+// { dg-additional-options "-fmodules-ts" }
+
+#define bob() fred
+export module foo.bar:baz.bob ();
+// { dg-error "module partition followed by \\\"\\\(\\\"" "" { target *-*-* } .-1 }
+// { dg-error "expected" "" { target *-*-* } .-2 }
+
+// { dg-module-cmi !foo.bar:baz.bob }
+// { dg-module-cmi !foo.bar:baz.fred }
--- gcc/testsuite/g++.dg/modules/dir-only-3.C.jj	2020-12-22 23:50:17.057972516 +0100
+++ gcc/testsuite/g++.dg/modules/dir-only-3.C	2024-08-08 09:37:54.039836627 +0200
@@ -7,10 +7,12 @@
 # 32 "<command-line>" 2
 # 1 "dir-only-3.C"
 // { dg-additional-options {-fmodules-ts -fpreprocessed -fdirectives-only} }
-// { dg-module-cmi foo }
+// { dg-module-cmi !foo }
 module;
 #define foo baz
 export module foo;
+// { dg-error "module name \\\"foo\\\" cannot be an object-like macro" "" { target *-*-* } 5 }
+// { dg-prune-output "not writing module" }
 
 class import {};
 
--- gcc/testsuite/g++.dg/modules/dir-only-4.C.jj	2020-12-22 23:50:17.057972516 +0100
+++ gcc/testsuite/g++.dg/modules/dir-only-4.C	2024-08-08 09:33:09.454522024 +0200
@@ -1,8 +1,8 @@
 // { dg-additional-options {-fmodules-ts -fpreprocessed -fdirectives-only} }
-// { dg-module-cmi !foo }
+// { dg-module-cmi !baz }
 module;
 #define foo baz
-export module foo;
+export module baz;
 
 class import {};
 
--- gcc/testsuite/g++.dg/modules/atom-preamble-2_a.C.jj	2020-12-22 23:50:17.055972539 +0100
+++ gcc/testsuite/g++.dg/modules/atom-preamble-2_a.C	2024-08-08 09:35:56.093364042 +0200
@@ -1,6 +1,6 @@
 // { dg-additional-options "-fmodules-ts" }
 #define malcolm kevin
-export module malcolm;
+export module kevin;
 // { dg-module-cmi kevin }
 
 export class X;
--- gcc/testsuite/g++.dg/modules/atom-preamble-4.C.jj	2020-12-22 23:50:17.055972539 +0100
+++ gcc/testsuite/g++.dg/modules/atom-preamble-4.C	2024-08-08 09:35:32.463670046 +0200
@@ -1,5 +1,5 @@
 // { dg-additional-options "-fmodules-ts" }
-#define NAME(X) X;
+#define NAME(X) ;
 
-export module NAME(bob)
+export module bob NAME(bob)
 

	Jakub