https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82158

--- Comment #10 from Peter Cordes <peter at cordes dot ca> ---
(In reply to Jakub Jelinek from comment #9)
> None of the above options is IMHO acceptable.
> This is UB like any other.

I still think it's a quality-of-implementation bug that could be fixed without
downsides for conforming programs by emitting an illegal instruction instead of
bx lr.  (Thanks for pointing out some cases I hadn't considered, BTW.  That
narrows down the possible good solutions.)

IMO corrupting registers while otherwise working normally is really dangerous. 
That could appear to work in a unit test but fail sometimes with a different
caller, maybe even causing security problems if the clobbered register was only
used in an error-handling case.

If a fix would take too much work to implement, then I'm fine with leaving this
as WONTFIX; there is a warning enabled by default for cases where gcc is sure
that a noreturn function *does* return.  I disagree about INVALID, though.

I'm imagining a codebase where a stray _Noreturn attribute got attached to a
function that was never intended to be _Noreturn.

A buggy _Noreturn function that has a combination of inputs that does return
would always be problematic regardless of clobbering registers, and I guess
that's why you're thinking about sanitize?

> What we could add is -fsanitize=noreturn that would add runtime
> instrumentation

I'm not proposing anything that heavy, just an illegal instruction instead of a
return, or simply no instruction at all.  In a _Noreturn function (that has
clobbered call-preserved registers), I think gcc should never emit instructions
that return.

As you point out, even if gcc can't prove whether that path is/isn't taken, a
correct program *won't* take it, so an illegal instruction is a good thing. 
(For performance, an illegal instruction can stop incorrect speculation down a
wrong path after a mispredicted branch, potentially saving useless prefetch of
code/data and speculative page walks).

There is precedent for trapping with illegal instructions on UB:

int* address_of_local(int v) { return &v;  }
int foo() {
    int* p = address_of_local(4);
    return *p;
}
   // gcc4.x returns 4
   // x86-64 gcc5 and later.  https://godbolt.org/g/W5vUiA

foo():
        movl    0, %eax       # load from absolute address 0
        ud2                   # undefined 2-byte instruction: future-proof way
to raise #UD

Falling through into whatever's next would also be a better option, since it
saves code size instead of generating instructions that a correct program will
never execute.  (gcc does that for foo() on ARM after trying a NULL-pointer
load.  ARM64 gcc uses brk #1000 after trying a NULL-pointer load.)

If the block that would return isn't at the end of the function, then the
fall-through would be into another block of the current function.  This is
potentially hard to debug, so probably an illegal instruction is a good idea,
maybe with an option to omit it.

Anything in a basic block that returns can be assumed not to be executed, so
the example from the original report could be compiled to:

  push {r4, lr}
     #  mov r5, r1    # optionally optimize these out, because we know
     #  mov r4, r0    # the code that uses them can't be reached.
  bl ext
   # don't emit the store
   # or the pop / bx lr
   # Fall through into padding.
   # Ideally pad with illegal instructions instead of NOPs

Hmm, unless the store faults, and that's what made this function not return. 
But we can definitely replace the pop/bx with an illegal instruction or a
fall-through, because executing them is guaranteed to be the wrong thing.

Making functions that almost works but violate the ABI seems like the worst
possible way to handle this corner case.  Crashing is better than silent
corruption when there's no correct way to continue execution, right?


> Compile time error is undesirable

That's what I thought, too.  -Werror exists for users that want that.

I only mentioned it for completeness, but good point that it's not an option
even when gcc can prove that a function returns, because it might never be
called. 

> , or say the
> return is only conditional and that path doesn't ever happen in a valid
> program, or say there are calls in the noreturn function that just throw
> exceptions or loop forever or abort or don't really return some other way.

That's a good point.  We don't want to save extra registers that a correct
program doesn't need saved, so ignoring the noreturn attribute and emitting
code to save/restore is not a good solution for the general case.

Illegal instruction or fall through into other code is left as the only
solution I think is really good.

Reply via email to