https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120022
Bug ID: 120022
Summary: [Optimization opportunity] Related with Bug 119917 and 120020
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: a1343922569 at outlook dot com
Target Milestone: ---

Related to Bug 119917 and Bug 120020.

In Bug 120020 I initially posted the wrong code. I wanted to correct it, but a maintainer replied very quickly and immediately marked the report as invalid. Here I post the correct code again, together with a correct explanation of this optimization opportunity.

Godbolt links (GCC generates suboptimal assembly for myDivMod1, which differs from the code generated for myDivMod2):
64-bit variable: https://gcc.godbolt.org/z/7j9MsKsMr
32-bit variable: https://gcc.godbolt.org/z/EEKzWWx14
16-bit variable: https://gcc.godbolt.org/z/5j9zoxrro
8-bit variable: https://gcc.godbolt.org/z/3sWh9K6r5

Godbolt links (Clang generates optimal assembly for myDivMod1, identical to the code generated for myDivMod2):
64-bit variable: https://gcc.godbolt.org/z/cneMq3eYT
32-bit variable: https://gcc.godbolt.org/z/E1e9x65hj
16-bit variable: https://gcc.godbolt.org/z/3aq9dsa3M
8-bit variable: https://gcc.godbolt.org/z/8EG8TEzGq

Optimization suggestion

I suggest enhancing GCC to recognize cases where multiple non-volatile inline assembly blocks, reached through separate function calls, perform identical or highly similar operations, and to merge those operations when it is semantically safe to do so. For the example above, the assembly generated for myDivMod1 should be no more complex than that of myDivMod2 and should contain at most one div instruction.

Another strong argument for this suggestion is that Clang generates identical code for myDivMod1 and myDivMod2 (see the Godbolt links above). This shows that Clang already optimizes the myDivMod1 case effectively, whereas GCC does not.
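
For readers who do not follow the Godbolt links, the pattern under discussion might look roughly like the sketch below. It is a hypothetical reconstruction, not the code behind the links: the helper names myDiv and myMod, the function signatures, and the operand constraints are assumptions; only the names myDivMod1 and myDivMod2 come from the report. A plain C fallback is included so the sketch also builds on targets other than x86-64.

```c
#include <stdint.h>

/* Hypothetical reconstruction for illustration only. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))

/* Quotient via `div`: RDX:RAX / b, quotient in RAX, remainder in RDX.
   The asm is non-volatile, so the compiler may treat it as a pure
   function of its inputs. */
static inline uint64_t myDiv(uint64_t a, uint64_t b)
{
    uint64_t quot, rem;
    __asm__("divq %[divisor]"
            : "=a"(quot), "=d"(rem)
            : "a"(a), "d"((uint64_t)0), [divisor] "r"(b)
            : "cc");
    (void)rem;              /* remainder is discarded here */
    return quot;
}

/* Remainder via the same `div`, with the same inputs as myDiv. */
static inline uint64_t myMod(uint64_t a, uint64_t b)
{
    uint64_t quot, rem;
    __asm__("divq %[divisor]"
            : "=a"(quot), "=d"(rem)
            : "a"(a), "d"((uint64_t)0), [divisor] "r"(b)
            : "cc");
    (void)quot;             /* quotient is discarded here */
    return rem;
}

/* A single asm block that yields both results: one `div`. */
void myDivMod2(uint64_t a, uint64_t b, uint64_t *q, uint64_t *r)
{
    uint64_t quot, rem;
    __asm__("divq %[divisor]"
            : "=a"(quot), "=d"(rem)
            : "a"(a), "d"((uint64_t)0), [divisor] "r"(b)
            : "cc");
    *q = quot;
    *r = rem;
}

#else  /* Fallback for other targets: plain C, no inline assembly. */

static inline uint64_t myDiv(uint64_t a, uint64_t b) { return a / b; }
static inline uint64_t myMod(uint64_t a, uint64_t b) { return a % b; }

void myDivMod2(uint64_t a, uint64_t b, uint64_t *q, uint64_t *r)
{
    *q = a / b;
    *r = a % b;
}

#endif

/* Two helper calls; after inlining, two identical asm blocks with
   identical inputs sit next to each other. */
void myDivMod1(uint64_t a, uint64_t b, uint64_t *q, uint64_t *r)
{
    *q = myDiv(a, b);
    *r = myMod(a, b);
}
```

Both functions compute the same results (for example, 100 divided by 7 gives quotient 14 and remainder 2). The request is that, once myDiv and myMod are inlined, myDivMod1 compiles to a single div just as myDivMod2 does.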

The code in Bug 119917 is in fact correct, and the latest Clang (trunk) applies this optimization in the 8-, 16-, 32-, and 64-bit cases. A similar issue was reported in Bug 117529 about five months ago, but it remains unresolved and has received no further replies.

If my description contains a mistake, let's discuss it first; please don't mark the report as "invalid" prematurely. Thank you for your understanding!