https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102023
Bug ID: 102023
Summary: Unnecessary duplication of mtcrf instruction
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: christophe.leroy at csgroup dot eu
Target Milestone: ---

When building the Linux kernel with mpc885_ads_defconfig (powerpc), in several places we get multiple mtcrf instructions where a single mtcrf instruction would be enough.

Example: arch/powerpc/perf/8xx-pmu.o, function mpc8xx_pmu_read()

In two places we get:

 258:   7d 82 01 20     mtcrf   32,r12
 25c:   7d 81 01 20     mtcrf   16,r12
 260:   7d 80 81 20     mtcrf   8,r12

The above should be replaced by the following:

        mtcrf   56,r12

56 being the OR of 32|16|8.

The dump of the complete function is below:

000000e0 <mpc8xx_pmu_read>:
  e0:   94 21 ff b0     stwu    r1,-80(r1)
  e4:   7c 08 02 a6     mflr    r0
  e8:   7d 80 00 26     mfcr    r12
  ec:   93 21 00 34     stw     r25,52(r1)
  f0:   90 01 00 54     stw     r0,84(r1)
  f4:   91 81 00 14     stw     r12,20(r1)
  f8:   7c 79 1b 78     mr      r25,r3
  fc:   4b ff ff 05     bl      0 <event_type>
 100:   2c 03 00 00     cmpwi   r3,0
 104:   41 80 01 f4     blt     2f8 <mpc8xx_pmu_read+0x218>
 108:   2c 03 00 04     cmpwi   r3,4
 10c:   2e 03 00 03     cmpwi   cr4,r3,3
 110:   2d 03 00 01     cmpwi   cr2,r3,1
 114:   2d 83 00 02     cmpwi   cr3,r3,2
 118:   93 81 00 40     stw     r28,64(r1)
 11c:   93 a1 00 44     stw     r29,68(r1)
 120:   92 41 00 18     stw     r18,24(r1)
 124:   92 61 00 1c     stw     r19,28(r1)
 128:   92 81 00 20     stw     r20,32(r1)
 12c:   92 a1 00 24     stw     r21,36(r1)
 130:   92 c1 00 28     stw     r22,40(r1)
 134:   92 e1 00 2c     stw     r23,44(r1)
 138:   93 01 00 30     stw     r24,48(r1)
 13c:   93 41 00 38     stw     r26,56(r1)
 140:   93 61 00 3c     stw     r27,60(r1)
 144:   93 c1 00 48     stw     r30,72(r1)
 148:   93 e1 00 4c     stw     r31,76(r1)
 14c:   7e 80 00 26     mfcr    r20
 150:   3b 79 01 88     addi    r27,r25,392
 154:   3a 40 00 00     li      r18,0
 158:   3a 60 00 00     li      r19,0
 15c:   3b c0 00 00     li      r30,0
 160:   3b e0 00 00     li      r31,0
 164:   3e e0 00 00     lis     r23,0
                        166: R_PPC_ADDR16_HA    dtlb_miss_counter
 168:   3a c0 00 00     li      r22,0
 16c:   3f 40 00 00     lis     r26,0
                        16e: R_PPC_ADDR16_HA    instruction_counter
 170:   3e a0 00 01     lis     r21,1
 174:   3f 00 00 00     lis     r24,0
                        176: R_PPC_ADDR16_HA    itlb_miss_counter
 178:   7f 63 db 78     mr      r3,r27
 17c:   48 00 00 01     bl      17c <mpc8xx_pmu_read+0x9c>
                        17c: R_PPC_REL24        generic_atomic64_read
 180:   7c 9c 23 78     mr      r28,r4
 184:   7c 7d 1b 78     mr      r29,r3
 188:   41 92 01 40     beq     cr4,2c8 <mpc8xx_pmu_read+0x1e8>
 18c:   41 91 01 50     bgt     cr4,2dc <mpc8xx_pmu_read+0x1fc>
 190:   41 8a 00 dc     beq     cr2,26c <mpc8xx_pmu_read+0x18c>
 194:   39 3a 00 00     addi    r9,r26,0
                        196: R_PPC_ADDR16_LO    instruction_counter
 198:   40 8e 00 fc     bne     cr3,294 <mpc8xx_pmu_read+0x1b4>
 19c:   81 49 00 00     lwz     r10,0(r9)
 1a0:   7f f6 22 a6     mfspr   r31,150
 1a4:   81 09 00 00     lwz     r8,0(r9)
 1a8:   7c 0a 40 00     cmpw    r10,r8
 1ac:   40 82 ff f0     bne     19c <mpc8xx_pmu_read+0xbc>
 1b0:   57 ff 84 3e     rlwinm  r31,r31,16,16,31
 1b4:   7d 49 fe 70     srawi   r9,r10,31
 1b8:   55 47 84 3e     rlwinm  r7,r10,16,16,31
 1bc:   51 5f 80 1e     rlwimi  r31,r10,16,0,15
 1c0:   51 27 80 1e     rlwimi  r7,r9,16,0,15
 1c4:   7e 7f e0 10     subfc   r19,r31,r28
 1c8:   7e 47 e9 10     subfe   r18,r7,r29
 1cc:   2c 12 00 00     cmpwi   r18,0
 1d0:   7c fe 3b 78     mr      r30,r7
 1d4:   40 a0 00 c0     bge     294 <mpc8xx_pmu_read+0x1b4>
 1d8:   7e 73 b0 14     addc    r19,r19,r22
 1dc:   7f c7 f3 78     mr      r7,r30
 1e0:   7f e8 fb 78     mr      r8,r31
 1e4:   7f a5 eb 78     mr      r5,r29
 1e8:   7f 86 e3 78     mr      r6,r28
 1ec:   7f 63 db 78     mr      r3,r27
 1f0:   7e 52 a9 14     adde    r18,r18,r21
 1f4:   48 00 00 01     bl      1f4 <mpc8xx_pmu_read+0x114>
                        1f4: R_PPC_REL24        generic_atomic64_cmpxchg
 1f8:   7c 1d 18 00     cmpw    r29,r3
 1fc:   40 82 ff 7c     bne     178 <mpc8xx_pmu_read+0x98>
 200:   7c 1c 20 40     cmplw   r28,r4
 204:   40 82 ff 74     bne     178 <mpc8xx_pmu_read+0x98>
 208:   81 81 00 14     lwz     r12,20(r1)
 20c:   80 01 00 54     lwz     r0,84(r1)
 210:   82 81 00 20     lwz     r20,32(r1)
 214:   82 a1 00 24     lwz     r21,36(r1)
 218:   82 c1 00 28     lwz     r22,40(r1)
 21c:   82 e1 00 2c     lwz     r23,44(r1)
 220:   83 01 00 30     lwz     r24,48(r1)
 224:   83 41 00 38     lwz     r26,56(r1)
 228:   83 61 00 3c     lwz     r27,60(r1)
 22c:   83 81 00 40     lwz     r28,64(r1)
 230:   83 a1 00 44     lwz     r29,68(r1)
 234:   83 c1 00 48     lwz     r30,72(r1)
 238:   83 e1 00 4c     lwz     r31,76(r1)
 23c:   38 b9 00 68     addi    r5,r25,104
 240:   7e 43 93 78     mr      r3,r18
 244:   83 21 00 34     lwz     r25,52(r1)
 248:   82 41 00 18     lwz     r18,24(r1)
 24c:   7e 64 9b 78     mr      r4,r19
 250:   82 61 00 1c     lwz     r19,28(r1)
 254:   7c 08 03 a6     mtlr    r0
 258:   7d 82 01 20     mtcrf   32,r12
 25c:   7d 81 01 20     mtcrf   16,r12
 260:   7d 80 81 20     mtcrf   8,r12
 264:   38 21 00 50     addi    r1,r1,80
 268:   48 00 00 00     b       268 <mpc8xx_pmu_read+0x188>
                        268: R_PPC_REL24        generic_atomic64_add
 26c:   7f cd 42 e6     mftbu   r30
 270:   7f ec 42 e6     mftb    r31
 274:   7d 2d 42 e6     mftbu   r9
 278:   7c 1e 48 40     cmplw   r30,r9
 27c:   40 82 ff f0     bne     26c <mpc8xx_pmu_read+0x18c>
 280:   7e 7c f8 10     subfc   r19,r28,r31
 284:   56 72 27 3e     rlwinm  r18,r19,4,28,31
 288:   7d 3d f1 10     subfe   r9,r29,r30
 28c:   51 32 20 36     rlwimi  r18,r9,4,0,27
 290:   56 73 20 36     rlwinm  r19,r19,4,0,27
 294:   7f c7 f3 78     mr      r7,r30
 298:   7f e8 fb 78     mr      r8,r31
 29c:   7f a5 eb 78     mr      r5,r29
 2a0:   7f 86 e3 78     mr      r6,r28
 2a4:   7f 63 db 78     mr      r3,r27
 2a8:   48 00 00 01     bl      2a8 <mpc8xx_pmu_read+0x1c8>
                        2a8: R_PPC_REL24        generic_atomic64_cmpxchg
 2ac:   7c 1d 18 00     cmpw    r29,r3
 2b0:   41 a2 ff 50     beq     200 <mpc8xx_pmu_read+0x120>
 2b4:   7f 63 db 78     mr      r3,r27
 2b8:   48 00 00 01     bl      2b8 <mpc8xx_pmu_read+0x1d8>
                        2b8: R_PPC_REL24        generic_atomic64_read
 2bc:   7c 9c 23 78     mr      r28,r4
 2c0:   7c 7d 1b 78     mr      r29,r3
 2c4:   40 92 fe c8     bne     cr4,18c <mpc8xx_pmu_read+0xac>
 2c8:   83 f8 00 00     lwz     r31,0(r24)
                        2ca: R_PPC_ADDR16_LO    itlb_miss_counter
 2cc:   3b c0 00 00     li      r30,0
 2d0:   7e 64 f8 50     subf    r19,r4,r31
 2d4:   7e 72 fe 70     srawi   r18,r19,31
 2d8:   4b ff ff bc     b       294 <mpc8xx_pmu_read+0x1b4>
 2dc:   7e 88 01 20     mtcrf   128,r20
 2e0:   40 a2 ff b4     bne     294 <mpc8xx_pmu_read+0x1b4>
 2e4:   83 f7 00 00     lwz     r31,0(r23)
                        2e6: R_PPC_ADDR16_LO    dtlb_miss_counter
 2e8:   3b c0 00 00     li      r30,0
 2ec:   7e 64 f8 50     subf    r19,r4,r31
 2f0:   7e 72 fe 70     srawi   r18,r19,31
 2f4:   4b ff ff a0     b       294 <mpc8xx_pmu_read+0x1b4>
 2f8:   81 81 00 14     lwz     r12,20(r1)
 2fc:   80 01 00 54     lwz     r0,84(r1)
 300:   83 21 00 34     lwz     r25,52(r1)
 304:   7c 08 03 a6     mtlr    r0
 308:   7d 82 01 20     mtcrf   32,r12
 30c:   7d 81 01 20     mtcrf   16,r12
 310:   7d 80 81 20     mtcrf   8,r12
 314:   38 21 00 50     addi    r1,r1,80
 318:   4e 80 00 20     blr
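
For illustration only, the arithmetic behind the suggested replacement can be checked against the instruction words in the dump. In the mtcrf encoding the 8-bit FXM mask sits between RS and the extended opcode, and each mask bit selects one CR field (0x80 is cr0, so 32, 16 and 8 select cr2, cr3 and cr4). The helper names below (is_mtcrf, mtcrf_rs, mtcrf_fxm, mtcrf_merge) are hypothetical and not part of GCC or binutils; this is a minimal sketch that assumes the instructions being merged are adjacent and read the same source register, as in the dump above.

```c
#include <assert.h>
#include <stdint.h>

/* mtcrf layout, counting bits from the least significant end:
 *   bits 26..31  primary opcode 31
 *   bits 21..25  RS (source GPR)
 *   bit  20      0 for mtcrf
 *   bits 12..19  FXM (one mask bit per CR field, 0x80 = cr0)
 *   bits  1..10  extended opcode 144
 */
static inline int is_mtcrf(uint32_t insn)
{
    /* Everything except RS and FXM must match the fixed pattern. */
    return (insn & 0xfc100fffu) == 0x7c000120u;
}

static inline unsigned mtcrf_rs(uint32_t insn)
{
    return (insn >> 21) & 0x1f;
}

static inline unsigned mtcrf_fxm(uint32_t insn)
{
    return (insn >> 12) & 0xff;
}

/* Merge two mtcrf instructions that read the same source register.
 * Hypothetical helper: the result updates the union of the CR fields
 * selected by a and b. */
static inline uint32_t mtcrf_merge(uint32_t a, uint32_t b)
{
    assert(is_mtcrf(a) && is_mtcrf(b));
    assert(mtcrf_rs(a) == mtcrf_rs(b));

    uint32_t fxm = mtcrf_fxm(a) | mtcrf_fxm(b);
    return (a & ~(0xffu << 12)) | (fxm << 12);
}
```

Merging the three words from offsets 258, 25c and 260 (0x7d820120, 0x7d810120, 0x7d808120) with these helpers yields a mask of 56 and source register r12, which corresponds to the proposed mtcrf 56,r12.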