On Wed, Jun 18, 2025 at 3:28 PM Richard Biener <richard.guent...@gmail.com> wrote: > > On Tue, Jun 17, 2025 at 7:51 PM James K. Lowden > <jklow...@schemamania.org> wrote: > > > > The COBOL FE emits code for a recent ARM VM that is definitely not what > > the user or, ahem, the FE author intended. The observed behavior is > > that the program enters an infinite loop calling the main entry point, > > eventually exhausting the stack. The observed assembler code does or > > does not refer to the GOT and ends up not going where it should. > > > > We think either we're not using GENERIC as intended, or what we're > > doing is tripping up the code generator. Possibly both. > > > > The working VM is > > > > hostname = gcc-cobol > > uname -m = aarch64 > > uname -r = 5.15.0-122-generic > > uname -s = Linux > > uname -v = #132-Ubuntu SMP Thu Aug 29 13:45:17 UTC 2024 > > > > The broken VM is > > > > hostname = potato > > uname -m = aarch64 > > uname -r = 6.8.0-60-generic > > uname -s = Linux > > uname -v = #63-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 15 18:51:58 UTC 2025 > > > > The COBOL is > > > > IDENTIFICATION DIVISION. > > PROGRAM-ID. prog. > > PROCEDURE DIVISION. > > PROG-MAIN. > > DISPLAY "I am prog" > > CALL "prog2" > > STOP RUN. > > > > IDENTIFICATION DIVISION. > > PROGRAM-ID. prog2. > > PROCEDURE DIVISION. > > PROG-PROG2. > > DISPLAY "I am prog2". > > END PROGRAM prog2. > > > > END PROGRAM prog. > > > > The problem is a forward reference to a function without external > > linkage, namely prog2. > > > > In COBOL parlance, prog2 is a "contained program". The containing > > program, prog1, can call contained programs but not vice-versa. There > > is no requirement for a function prototype denoting a forward > > reference. > > > > A COBOL program (top-level) is a function with external linkage and C > > semantics. A contained program is function with "internal linkage" if > > there is such a thing. In C terms, the above might be represented as > > > > void prog() { puts("I am prog"); prog2(); } > > static void prog2() { puts("I am prog2"); } > > > > Names with external linkage are published verbatim. Names with > > internal linkage get an internal name unique to the translation unit, in > > this case, "prog2.62". It is the compiler's job, I think obviously, to > > find prog2; the linker is not involved. > > > > Because a contained program always appears after the containing program, > > the compiler does not know when it encounters CALL whether "prog2" > > names a contained program or is a reference to another module to be > > linked in later. We begin by assuming it's an external reference. At > > EOF we review the CALLs and, for string constants that name contained > > programs, substitute the name of the function representing the contained > > program. For your reference, that touch-up work is done by > > parser_call_target_update(). > > I'm just looking there where you do > > for( auto func : p->second ) > { > func.convention = cbl_call_verbatim_e; > DECL_NAME(func.node) = get_identifier(mangled_name); > > but GCC, when matching a CALL_EXPR and a destination _definition_, > requires the actual FUNCTION_DECL trees to match up, not just > their name. That is, while GCC builds up a symbol table it does that > based on _decls_, not based on decl names.
So, for the testcase I see with -fdump-tree-original-uid (and a fix for that...): signed long prog2.62D.280 () { ... D.292 = __gg__is_canceledD.242 ((unsigned long) prog2.62D.282); ... ..function_handle.1.0D.275 = prog2.62D.276; so I see three different DECLs for prog2.62 where there should be only one. > That means, the adjustment should end up unifying the FUNCTION_DECL > used for all calls. > > Btw, is there any way that the thing 'prog' calls can turn out a "wrong > thing"? > Aka, not a function? How would you emit a diagnostic for that? That is, > in C the called thing could be a variable with the same mangling. > > Traditionally I'd have the parser register a tentative (extern) declaration > at the point of the call in 'prog' and when parsing 'prog2' I'd query the > table of (tentative) declarations, find one for 'prog2' and then rewrite > that to a definition. So, I would have expected the Cobol frontend to have > a symbol table based on its name lookup rules. > > > > > One other data point, as a sidebar. The target of a CALL statement in > > COBOL need not be a literal. In C there's no syntax to "call by name", > > where the name is a mere string determined at runtime. In COBOL for, > > > > CALL P. > > > > P names an alphanumeric variable whose contents are of course resolved > > at runtime with dlsym(3), even if the value of P was established when > > initialized and never changed. If we change the above program to call > > prog2 through a variable, the program works on both architectures. The > > above substitution does not occur (because the compiler doesn't know > > what's being called). dlsym(3) nevertheless finds the internal name, I > > think because of -rdynamic. End sidebar. > > > > The first question, then, is "Are we doing it right?" If not, what are > > the constraints on changing GENERIC as it's being built up? It would > > be nice to support forward references without redesigning the FE. > > > > If we are doing it right, then we want to report mumble something else > > is wrong. We can supply an infinitude of details, including assembly > > listings. > > > > --jkl