https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84013
Bug ID: 84013 Summary: wrong __restrict clique with inline asm operand Product: gcc Version: 7.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: katsunori.kumatani at gmail dot com Target Milestone: --- Using a __restrict parameter in an asm results in the wrong clique assigned to the MEM_REF if the same pointer is used without an asm (where it has the correct clique). This happens since GCC 6.4.0 and up (including GCC 7 and trunk). GCC 6.3.0 assigns the clique differently (and even worse, in some cases, even generates wrong code), for 6.4.0 a fix was backported but the fix still has wrong cliques. Consider this snippet: auto abc(int** __restrict a, int** __restrict b) { *a = 0; asm volatile("":"+m"(*b)); *a = 0; return *b; } Compile with -O2 -fipa-pta (the clique is _always_ wrong, but -fipa-pta is needed for demonstration for some reason, I don't know why). The output on x86_64: GCC 6.3.0: movq $0, (%rdi) movq (%rsi), %rax ret GCC 6.4.0 / GCC 7 / trunk: movq $0, (%rdi) movq $0, (%rdi) # redundant movq (%rsi), %rax ret As you can see, the RTL DSE pass doesn't remove the redundant store *a, because the clique is wrong and thinks the asm can alias with *a, which is not the case, because they're both __restrict. GCC 6.3.0 generates "better" code because it has the other bug 79552 (which is fixed), so that's not a solution. I'll comment with the cliques I got when inspecting after the "optimized" pass (final gimple pass): auto abc(int** __restrict a, int** __restrict b) { *a = 0; // clique 1 base 1 asm volatile("":"+m"(*b)); // clique 0 base 0 (wrong) *a = 0; // clique 1 base 1 return *b; // clique 1 base 2 (what it should be) } The asm should obviously have clique 1 base 2 to match the return, since it's the same pointer. To me, this is clearly a bug. I don't know if it's a missed optimization or can even produce wrong code in certain cases, however. Note that if you remove the return (i.e. any use of the *b pointer other than the asm), you get this: auto abc(int** __restrict a, int** __restrict b) { *a = 0; // clique 1 base 1 asm volatile("":"+m"(*b)); // clique 1 base 0 *a = 0; // clique 1 base 1 } Which is correct because it has the same clique as *a (1), but the base 0 worries me a bit, shouldn't it be base 2 as well just like before? Note that the cliques are _always_ wrong, but the redundant output appears only with -fipa-pta (even though not even one call is involved), not sure why (not sure how it can remove the redundant store in that case though).