https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110648

            Bug ID: 110648
           Summary: Missed optimization for small returned optional leads to
                    redundant memory accesses
           Product: gcc
           Version: 11.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cfsteefel at arista dot com
  Target Milestone: ---

The following code

    #include <optional>

    std::optional< int > foo( int x ) {
        return 1;
    }

produces x86_64 assembly that performs two stores to the stack followed by a
load into rax, rather than simply operating directly on rax, i.e.:

    foo(int):
            mov     DWORD PTR [rsp-8], 1
            mov     BYTE PTR [rsp-4], 1
            mov     rax, QWORD PTR [rsp-8]
            ret

clang produces much more direct code:

    foo(int):                               # @foo(int)
            movabs  rax, 4294967297
            ret

Because the optional is small enough to be returned in registers (less than
two registers wide), the value is always returned in rax, so there is no
reason for the memory accesses here.

The code can be improved by naming the returned object, but this breaks down
again as soon as there are any conditionals. For example,

    std::optional< int > foo( int x ) {
        std::optional< int > ret = 1;
        return ret;
    }

produces better code, but there is no way to get this better code once a
branch is introduced, e.g.:

    std::optional< int > foo( int x ) {
        std::optional< int > ret = 1;
        if ( x < 1 ) {
            return std::nullopt;
        }
        return ret;
    }

The same applies with the trunk version of gcc on Godbolt, as well as with
gcc 11.3.
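
For reference, the constant 4294967297 in the clang output is 0x100000001: the
int value 1 in the low four bytes and the engaged flag (1) in the fifth byte.
The following is only a sketch to illustrate that packing. It assumes a
little-endian target and a standard library that lays out std::optional< int >
as the 4-byte value followed by a 1-byte flag (this is an implementation
detail, not something the standard guarantees). raw_bits is a hypothetical
helper name, not part of any library.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical helper: copies the object representation of a small optional
// into a 64-bit integer, roughly what ends up in rax when it is returned.
// Assumes little-endian and a layout of { int value; bool engaged; padding }.
std::uint64_t raw_bits( const std::optional< int > & o ) {
    static_assert( sizeof( std::optional< int > ) == 8,
                   "sketch assumes an 8-byte optional< int >" );
    std::uint64_t bits = 0;
    std::memcpy( &bits, &o, sizeof o );
    // Keep only the value (bytes 0-3) and the flag (byte 4); the three
    // trailing padding bytes have unspecified contents.
    return bits & 0xFFFFFFFFFFull;
}
```

Under those assumptions, raw_bits( std::optional< int >( 1 ) ) matches the
immediate that clang loads into rax, which is why a single register move is
enough for the first example.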