> -----Original Message-----
> From: Richard Biener <richard.guent...@gmail.com>
> Sent: Wednesday, June 18, 2025 09:28
> To: James K. Lowden <jklow...@schemamania.org>
> Cc: gcc@gcc.gnu.org; Bob Dubner <rdub...@cobolworx.com>
> Subject: Re: harm in ARM
>
> On Tue, Jun 17, 2025 at 7:51 PM James K. Lowden
> <jklow...@schemamania.org> wrote:
> >
> > The COBOL FE emits code for a recent ARM VM that is definitely not what
> > the user or, ahem, the FE author intended. The observed behavior is
> > that the program enters an infinite loop calling the main entry point,
> > eventually exhausting the stack. The observed assembler code does or
> > does not refer to the GOT and ends up not going where it should.
> >
> > We think either we're not using GENERIC as intended, or what we're
> > doing is tripping up the code generator. Possibly both.
> >
> > The working VM is
> >
> > hostname = gcc-cobol
> > uname -m = aarch64
> > uname -r = 5.15.0-122-generic
> > uname -s = Linux
> > uname -v = #132-Ubuntu SMP Thu Aug 29 13:45:17 UTC 2024
> >
> > The broken VM is
> >
> > hostname = potato
> > uname -m = aarch64
> > uname -r = 6.8.0-60-generic
> > uname -s = Linux
> > uname -v = #63-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 15 18:51:58 UTC 2025
> >
> > The COBOL is
> >
> > IDENTIFICATION DIVISION.
> > PROGRAM-ID. prog.
> > PROCEDURE DIVISION.
> > PROG-MAIN.
> > DISPLAY "I am prog"
> > CALL "prog2"
> > STOP RUN.
> >
> > IDENTIFICATION DIVISION.
> > PROGRAM-ID. prog2.
> > PROCEDURE DIVISION.
> > PROG-PROG2.
> > DISPLAY "I am prog2".
> > END PROGRAM prog2.
> >
> > END PROGRAM prog.
> >
> > The problem is a forward reference to a function without external
> > linkage, namely prog2.
> >
> > In COBOL parlance, prog2 is a "contained program". The containing
> > program, prog, can call contained programs but not vice-versa. There
> > is no requirement for a function prototype denoting a forward
> > reference.
> >
> > A COBOL program (top-level) is a function with external linkage and C
> > semantics. A contained program is a function with "internal linkage" if
> > there is such a thing. In C terms, the above might be represented as
> >
> > void prog() { puts("I am prog"); prog2(); }
> > static void prog2() { puts("I am prog2"); }
> >
> > Names with external linkage are published verbatim. Names with
> > internal linkage get an internal name unique to the translation unit, in
> > this case, "prog2.62". It is the compiler's job, I think obviously, to
> > find prog2; the linker is not involved.
> >
> > Because a contained program always appears after the containing program,
> > the compiler does not know when it encounters CALL whether "prog2"
> > names a contained program or is a reference to another module to be
> > linked in later. We begin by assuming it's an external reference. At
> > EOF we review the CALLs and, for string constants that name contained
> > programs, substitute the name of the function representing the contained
> > program. For your reference, that touch-up work is done by
> > parser_call_target_update().
>
> I'm just looking there where you do
>
> for( auto func : p->second )
> {
> func.convention = cbl_call_verbatim_e;
> DECL_NAME(func.node) = get_identifier(mangled_name);
>
> but GCC, when matching a CALL_EXPR and a destination _definition_,
> requires the actual FUNCTION_DECL trees to match up, not just
> their name. That is, while GCC builds up a symbol table it does that
> based on _decls_, not based on decl names.
>
> That means, the adjustment should end up unifying the FUNCTION_DECL
> used for all calls.

<nodding> Richard, you hit the nail right on the head. <grin> I have been
trying to convey to Jim my growing conviction that CALL_EXPR expressions to
static functions have to take a FUNCTION_DECL. And I have told him that the
fact that our CALL code is working is more-or-less accidental.

I will be revisiting this very soon.

One reason I haven't fixed it yet is that, well, it's COBOL. CALLs had to be
implemented very early, back when I was young and foolish and didn't know
anything about GENERIC.

And because it's COBOL, there are no function prototypes. And we implemented
the front end as a single pass.

Thus, the statement

    CALL "foo"

might be the equivalent, implemented in C, of

    extern <type> foo(...);   // External reference
    foo();

or

    static <type> foo(...);   // Forward reference to a static function
    foo();

At the point of the CALL, we don't know which it's going to be.

This is further complicated by COBOL being able to do something that C can't:

    CALL call_me

If C could do it, it would look like this:

    char call_me[] = "foo";
    call_me();   // I told you C can't do it.

And, as before, foo can be external, or a static function local to the
source code module.

And, of course, the value stored in call_me can change, which means we are
building a run-time table for program "prog" of the addresses of nested
functions that can be accessed by "prog" as "foo". Those functions have names
like "foo.62".
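
A minimal sketch of what such a per-program table lookup might look like, written in plain C rather than GENERIC. Everything here is hypothetical and for illustration only: the names contained_entry, prog_contained, resolve_call_target and prog_foo_62 are not taken from the COBOL front end or its runtime, and the fallback to an external lookup (for example dlsym(3)) is only noted in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature for a compiled COBOL program entry point. */
typedef int (*program_fn)(void);

/* Stands in for the contained program whose internal name is "foo.62". */
static int prog_foo_62(void) { return 62; }

/* One row of the run-time table: the name as written in COBOL source,
   and the address of the function that implements it. */
struct contained_entry {
    const char *cobol_name;
    program_fn  entry;
};

/* The table of contained programs that "prog" may reach by name. */
static const struct contained_entry prog_contained[] = {
    { "foo", prog_foo_62 },
};

/* Resolve the run-time value of a CALL target to a function address.
   Returns NULL when the name is not a contained program; a real runtime
   would then try an external lookup such as dlsym(3). */
program_fn resolve_call_target(const char *name)
{
    assert(name != NULL);
    size_t n = sizeof prog_contained / sizeof prog_contained[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(prog_contained[i].cobol_name, name) == 0)
            return prog_contained[i].entry;
    return NULL;
}
```

With this in place, the C analogue of CALL call_me would be to pass the current contents of call_me to resolve_call_target() and call through the returned pointer if it is not NULL.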

This is all by way of agreeing with you. What I need to do is, as you say,
create a tentative external FUNCTION_DECL for each CALL to a literal name.
Later on, when we figure out that it is actually a call to a local function,
edit that same FUNCTION_DECL (rather than the CALL_EXPR), setting TREE_PUBLIC
to zero and changing the identifier.
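
The idea can be modelled outside of GCC with a toy symbol table. This is only a sketch under stated assumptions: struct toy_decl is a stand-in for a FUNCTION_DECL, its is_public field models TREE_PUBLIC, and lookup_or_declare, make_call and define_contained are made-up helper names, not GCC or gcobol functions. The point it tries to show is that a call site holds a pointer to the declaration, so editing that one declaration later fixes every call at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a FUNCTION_DECL; not GCC's real tree type. */
struct toy_decl {
    char source_name[32];  /* name as written in CALL "..." */
    char asm_name[48];     /* name that would reach the assembler */
    bool is_public;        /* models TREE_PUBLIC */
    bool is_defined;       /* true once the body has been seen */
};

#define MAX_DECLS 16
static struct toy_decl decl_table[MAX_DECLS];
static size_t decl_count;

/* Return the single decl for NAME, creating a tentative extern
   declaration the first time the name is seen. */
struct toy_decl *lookup_or_declare(const char *name)
{
    for (size_t i = 0; i < decl_count; i++)
        if (strcmp(decl_table[i].source_name, name) == 0)
            return &decl_table[i];

    assert(decl_count < MAX_DECLS);
    struct toy_decl *d = &decl_table[decl_count++];
    snprintf(d->source_name, sizeof d->source_name, "%s", name);
    snprintf(d->asm_name, sizeof d->asm_name, "%s", name);
    d->is_public = true;    /* assume external until proven otherwise */
    d->is_defined = false;
    return d;
}

/* A call site records the decl itself, never just its name. */
struct toy_call {
    struct toy_decl *callee;
};

struct toy_call make_call(const char *name)
{
    struct toy_call c = { lookup_or_declare(name) };
    return c;
}

/* On reaching the contained program, turn the SAME decl into a local
   definition: clear the public flag and give it a unique internal name. */
struct toy_decl *define_contained(const char *name, int unique_id)
{
    struct toy_decl *d = lookup_or_declare(name);
    d->is_public = false;
    d->is_defined = true;
    snprintf(d->asm_name, sizeof d->asm_name, "%s.%d", name, unique_id);
    return d;
}
```

In this model, make_call("prog2") followed later by define_contained("prog2", 62) leaves the earlier call pointing at a decl that is now local and named "prog2.62", with no need to revisit the call itself.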
Thank you.
>
> Btw, is there any way that the thing 'prog' calls can turn out a "wrong
> thing"? Aka, not a function? How would you emit a diagnostic for that?
> That is, in C the called thing could be a variable with the same mangling.
>
> Traditionally I'd have the parser register a tentative (extern)
> declaration at the point of the call in 'prog' and when parsing 'prog2'
> I'd query the table of (tentative) declarations, find one for 'prog2'
> and then rewrite that to a definition. So, I would have expected the
> Cobol frontend to have a symbol table based on its name lookup rules.
>
> >
> > One other data point, as a sidebar. The target of a CALL statement in
> > COBOL need not be a literal. In C there's no syntax to "call by name",
> > where the name is a mere string determined at runtime. In COBOL, for
> >
> > CALL P.
> >
> > P names an alphanumeric variable whose contents are of course resolved
> > at runtime with dlsym(3), even if the value of P was established when
> > initialized and never changed. If we change the above program to call
> > prog2 through a variable, the program works on both architectures. The
> > above substitution does not occur (because the compiler doesn't know
> > what's being called). dlsym(3) nevertheless finds the internal name, I
> > think because of -rdynamic. End sidebar.
> >
> > The first question, then, is "Are we doing it right?" If not, what are
> > the constraints on changing GENERIC as it's being built up? It would
> > be nice to support forward references without redesigning the FE.
> >
> > If we are doing it right, then we want to report mumble something else
> > is wrong. We can supply an infinitude of details, including assembly
> > listings.
> >
> > --jkl