On Wed, Jun 18, 2025 at 3:28 PM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Tue, Jun 17, 2025 at 7:51 PM James K. Lowden
> <jklow...@schemamania.org> wrote:
> >
> > The COBOL FE emits code for a recent ARM VM that is definitely not what
> > the user or, ahem, the FE author intended.  The observed behavior is
> > that the program enters an infinite loop calling the main entry point,
> > eventually exhausting the stack. The observed assembler code does or
> > does not refer to the GOT and ends up not going where it should.
> >
> > We think either we're not using GENERIC as intended, or what we're
> > doing is tripping up the code generator.  Possibly both.
> >
> > The working VM is
> >
> > hostname = gcc-cobol
> > uname -m = aarch64
> > uname -r = 5.15.0-122-generic
> > uname -s = Linux
> > uname -v = #132-Ubuntu SMP Thu Aug 29 13:45:17 UTC 2024
> >
> > The broken VM is
> >
> > hostname = potato
> > uname -m = aarch64
> > uname -r = 6.8.0-60-generic
> > uname -s = Linux
> > uname -v = #63-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 15 18:51:58 UTC 2025
> >
> > The COBOL is
> >
> >         IDENTIFICATION DIVISION.
> >         PROGRAM-ID. prog.
> >         PROCEDURE DIVISION.
> >         PROG-MAIN.
> >             DISPLAY "I am prog"
> >             CALL "prog2"
> >             STOP RUN.
> >
> >         IDENTIFICATION DIVISION.
> >         PROGRAM-ID. prog2.
> >         PROCEDURE DIVISION.
> >         PROG-PROG2.
> >             DISPLAY "I am prog2".
> >         END PROGRAM prog2.
> >
> >         END PROGRAM prog.
> >
> > The problem is a forward reference to a function without external
> > linkage, namely prog2.
> >
> > In COBOL parlance, prog2 is a "contained program".  The containing
> > program, prog1, can call contained programs but not vice-versa.  There
> > is no requirement for a function prototype denoting a forward
> > reference.
> >
> > A COBOL program (top-level) is a function with external linkage and C
> > semantics. A contained program is function with "internal linkage" if
> > there is such a thing.  In C terms, the above might be represented as
> >
> >         void prog() { puts("I am prog"); prog2(); }
> >         static void prog2() { puts("I am prog2"); }
> >
> > Names with external linkage are published verbatim.  Names with
> > internal linkage get an internal name unique to the translation unit, in
> > this case, "prog2.62". It is the compiler's job, I think obviously, to
> > find prog2; the linker is not involved.
> >
> > Because a contained program always appears after the containing program,
> > the compiler does not know when it encounters CALL whether "prog2"
> > names a contained program or is a reference to another module to be
> > linked in later.  We begin by assuming it's an external reference.  At
> > EOF we review the CALLs and, for string constants that name contained
> > programs, substitute the name of the function representing the contained
> > program.  For your reference, that touch-up work is done by
> > parser_call_target_update().
>
> I'm just looking there where you do
>
>     for( auto func : p->second )
>       {
>       func.convention = cbl_call_verbatim_e;
>       DECL_NAME(func.node) = get_identifier(mangled_name);
>
> but GCC, when matching a CALL_EXPR and a destination _definition_,
> requires the actual FUNCTION_DECL trees to match up, not just
> their name.  That is, while GCC builds up a symbol table it does that
> based on _decls_, not based on decl names.

So, for the testcase I see with -fdump-tree-original-uid (and a fix
for that...):


signed long prog2.62D.280 ()
{
...
  D.292 = __gg__is_canceledD.242 ((unsigned long) prog2.62D.282);
...
  ..function_handle.1.0D.275 = prog2.62D.276;

so I see three different DECLs for prog2.62 where there should be only one.

> That means, the adjustment should end up unifying the FUNCTION_DECL
> used for all calls.
>
> Btw, is there any way that the thing 'prog' calls can turn out a "wrong 
> thing"?
> Aka, not a function?  How would you emit a diagnostic for that?  That is,
> in C the called thing could be a variable with the same mangling.
>
> Traditionally I'd have the parser register a tentative (extern) declaration
> at the point of the call in 'prog' and when parsing 'prog2' I'd query the
> table of (tentative) declarations, find one for 'prog2' and then rewrite
> that to a definition.  So, I would have expected the Cobol frontend to have
> a symbol table based on its name lookup rules.
>
> >
> > One other data point, as a sidebar.  The target of a CALL statement in
> > COBOL need not be a literal.  In C there's no syntax to "call by name",
> > where the name is a mere string determined at runtime.  In COBOL for,
> >
> >         CALL P.
> >
> > P names an alphanumeric variable whose contents are of course resolved
> > at runtime with dlsym(3), even if the value of P was established when
> > initialized and never changed.  If we change the above program to call
> > prog2 through a variable, the program works on both architectures.  The
> > above substitution does not occur (because the compiler doesn't know
> > what's being called). dlsym(3) nevertheless finds the internal name, I
> > think because of -rdynamic. End sidebar.
> >
> > The first question, then, is "Are we doing it right?"  If not, what are
> > the constraints on changing GENERIC as it's being built up?  It would
> > be nice to support forward references without redesigning the FE.
> >
> > If we are doing it right, then we want to report mumble something else
> > is wrong.  We can supply an infinitude of details, including assembly
> > listings.
> >
> > --jkl

Reply via email to