https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63635
            Bug ID: 63635
           Summary: Reduce toc relative address computation for multiple data access
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: carrot at google dot com
            Target: powerpc64le

Currently GCC for PowerPC generates two instructions to compute the address of
each non-local data object. If the data layout is known to the compiler, the
second and later address computations can each be done with one instruction
fewer. The following is an example:

#include <stdio.h>

static int a,b,c;

void bar(int x)
{
  a = b = c = x;
}

int foo()
{
  return a+b+c;
}

int aa = 1;
int bb = 2;
int cc = 3;

int asdf()
{
  return aa + bb + cc;
}

int main()
{
  printf("Hello");
  printf(", ");
  printf("world.\n");
}

Compile it with the options -O2 -m64 -mvsx -mcpu=power8.

Function asdf is compiled to:

asdf:
0:      addis 2,12,.TOC.-0b@ha
        addi 2,2,.TOC.-0b@l
        .localentry     asdf,.-asdf
        addis 3,2,.LANCHOR1@toc@ha          // A
        addis 10,2,.LANCHOR1+4@toc@ha       // B
        addis 9,2,.LANCHOR1+8@toc@ha        // C
        lwz 3,.LANCHOR1@toc@l(3)            // D
        lwz 10,.LANCHOR1+4@toc@l(10)        // E
        lwz 9,.LANCHOR1+8@toc@l(9)          // F
        add 3,3,10
        add 3,3,9
        extsw 3,3
        blr
        ...
        .globl cc
        .globl bb
        .globl aa
        .section        ".data"
        .align 2
        .set    .LANCHOR1,. + 0
        .type   aa, @object
        .size   aa, 4
aa:
        .long   1
        .type   bb, @object
        .size   bb, 4
bb:
        .long   2
        .type   cc, @object
        .size   cc, 4
cc:
        .long   3

Since the data layout of aa, bb and cc is known to the compiler and the
distance between them is less than 64k, the code sequence A-F can be
optimized to:

        addis 3,2,.LANCHOR1@toc@ha
        addi 3,3,.LANCHOR1@toc@l
        lwz 10,4(3)
        lwz 9,8(3)
        lwz 3,0(3)

The load into r3 comes last because r3 also holds the base address. Other
functions can be optimized similarly.
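
The arithmetic behind the @ha/@l pair and the proposed shared base can be modelled in a few lines of C. This is only an illustrative sketch, not GCC code: the names split_toc_offset, fits_d_field, insns_separate and insns_shared are hypothetical. It assumes the usual D-form encoding, where the low half is a signed 16-bit displacement and the high half is adjusted to compensate for that sign extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of splitting a TOC-relative offset into the
   @ha (high adjusted) and @l (low, sign-extended) halves. */
typedef struct {
    int64_t ha;  /* value added by addis, in units of 65536 */
    int64_t lo;  /* signed 16-bit value used by addi or as a load displacement */
} toc_split;

static toc_split split_toc_offset(int64_t off)
{
    toc_split s;
    s.lo = off & 0xffff;          /* low 16 bits, 0..65535 */
    if (s.lo >= 0x8000)
        s.lo -= 0x10000;          /* the hardware sign-extends this half */
    s.ha = (off - s.lo) / 65536;  /* exact: off - lo is a multiple of 65536 */
    assert(s.ha * 65536 + s.lo == off);
    return s;
}

/* A D-form load such as lwz can only encode a signed 16-bit displacement,
   so a neighbour is reachable from a shared base only if this holds. */
static int fits_d_field(int64_t disp)
{
    return disp >= -32768 && disp <= 32767;
}

/* Instructions needed to load n objects placed near one anchor. */
static int insns_separate(int n) { return 2 * n; }  /* addis + lwz each */
static int insns_shared(int n)   { return n + 2; }  /* addis + addi, then n lwz */
```

For the three loads in asdf, insns_separate(3) gives 6 (instructions A-F) and insns_shared(3) gives 5, which matches the proposed sequence. The displacements 0, 4 and 8 all pass fits_d_field, so a single base register is enough in this model.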