[PATCH] D60335: Use -fomit-frame-pointer when optimizing PowerPC code
kernigh created this revision. kernigh added reviewers: joerg, brad, cfe-commits. Herald added subscribers: jsji, krytarowski, nemanjai. Herald added a project: clang. This enables -fomit-frame-pointer when optimizing code for all PowerPC targets, instead of only Linux and NetBSD. I mailed this patch to cfe-commits earlier this week. Roman Lebedev and Hal Finkel pointed me to Phabricator; this is my first attempt to use Phabricator. My earlier mail wrote: > The attached patch is for clang to use -fomit-frame-pointer by default > for all PowerPC targets when optimizing code. Right now, clang uses > -fomit-frame-pointer for PowerPC Linux and NetBSD but not for other > targets. I have been running `clang -target powerpc-openbsd`. > > The patch is for llvm-project.git master. I previously posted this > patch to https://bugs.llvm.org/show_bug.cgi?id=41094 , but the patch > in this email is for a newer revision of master. > > In most functions, the frame pointer in r31 is an unnecessary extra > copy of the stack pointer in r1. GCC is using -fomit-frame-pointer by > default (in my PowerPC machine running OpenBSD/macppc); I want Clang > to be at least as good as GCC. Also, this patch helps me to compare > the output of `clang -target powerpc-openbsd -O2 -S` with the output > for Linux or NetBSD. In bug 41094, I showed how -fomit-frame-pointer > simplifies the C function `void nothing(void) {}`. Repository: rC Clang https://reviews.llvm.org/D60335 Files: clang/lib/Driver/ToolChains/Clang.cpp Index: clang/lib/Driver/ToolChains/Clang.cpp === --- clang/lib/Driver/ToolChains/Clang.cpp +++ clang/lib/Driver/ToolChains/Clang.cpp @@ -523,8 +523,12 @@ // XCore never wants frame pointers, regardless of OS. // WebAssembly never wants frame pointers. return false; + case llvm::Triple::ppc: + case llvm::Triple::ppc64: + case llvm::Triple::ppc64le: case llvm::Triple::riscv32: case llvm::Triple::riscv64: +// PowerPC's frame pointer is often an extra copy of the stack pointer. return !areOptimizationsEnabled(Args); default: break; @@ -542,9 +546,6 @@ case llvm::Triple::mips64el: case llvm::Triple::mips: case llvm::Triple::mipsel: -case llvm::Triple::ppc: -case llvm::Triple::ppc64: -case llvm::Triple::ppc64le: case llvm::Triple::systemz: case llvm::Triple::x86: case llvm::Triple::x86_64: Index: clang/lib/Driver/ToolChains/Clang.cpp === --- clang/lib/Driver/ToolChains/Clang.cpp +++ clang/lib/Driver/ToolChains/Clang.cpp @@ -523,8 +523,12 @@ // XCore never wants frame pointers, regardless of OS. // WebAssembly never wants frame pointers. return false; + case llvm::Triple::ppc: + case llvm::Triple::ppc64: + case llvm::Triple::ppc64le: case llvm::Triple::riscv32: case llvm::Triple::riscv64: +// PowerPC's frame pointer is often an extra copy of the stack pointer. return !areOptimizationsEnabled(Args); default: break; @@ -542,9 +546,6 @@ case llvm::Triple::mips64el: case llvm::Triple::mips: case llvm::Triple::mipsel: -case llvm::Triple::ppc: -case llvm::Triple::ppc64: -case llvm::Triple::ppc64le: case llvm::Triple::systemz: case llvm::Triple::x86: case llvm::Triple::x86_64: ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D60335: Use -fomit-frame-pointer when optimizing PowerPC code
kernigh added a comment. I'm stuck. I didn't put a test in this patch because I can't run the tests yet. So far, I can build a clang executable but can't build the rest of the project. I won't run the tests without a complete build, and I won't edit the tests without running them. I had difficulty because llvm is a large C++ project that takes a long time to build. I tried to save time by building it on my fastest machine, an x86 desktop with 2 cores and 4G RAM running OpenBSD 6.4 amd64. I tried a parallel build in my usual way, `cmake -GNinja ...` then `ninja`, but it got stuck near the end when it tried to run 3 linkers in parallel. The linkers used too much RAM, so my machine got stuck in swap. (Do other people have more RAM?) I learned to save RAM by building the project with clang -Oz and without -g (so there is no debug info), and by using `ninja -j2 clang` to link only the clang executable and nothing else. My "running OpenBSD 6.4" is a problem. Code in llvm/lib/Support/Unix/Threading.inc for OpenBSD tries to call pthread_get_name_np(), which will appear in OpenBSD 6.5. I needed to edit that file. Then I built a clang executable, but it didn't run, because of an "Out of memory" error. The kernel of OpenBSD 6.4 seems to refuse to run such a large executable. I worked around it by hacking clang into a large shared library named almost_clang, then building a tiny clang executable that just calls the shared library (by editing clang/tools/driver/{CMakeLists.txt,driver.cpp}). That caused another error, because llvm tried to set the shared library's version to "9svn", which isn't legal. I edited llvm/cmake/modules/AddLLVM.cmake to remove the shared library's version. Now I can run clang. If I try to build the rest of the project, like `ninja -j1` (because I don't want parallel linkers), then I get a linker error. It seems confused about the shared library version. The trick that I used to build clang might have broken something else. Before I try to fix my build, I want to upgrade this machine to OpenBSD 6.5 amd64. (So I am waiting for 6.5 to release; I already have 6.5-beta on my slow PowerPC machine.) That should at least let me undo the pthread_get_name_np() removal, so I don't have too many local changes. If I can build the rest of llvm and clang on OpenBSD 6.5, then I can try to run and edit the tests. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D60335/new/ https://reviews.llvm.org/D60335 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D90329: [PowerPC] Fix va_arg in C++, Objective-C on 32-bit ELF targets
kernigh added a comment. Hi, Eli. I'm missing emails from Phabricator, so I didn't know about your recent post. I will respond to your question about numUsedRegs when I find time. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D90329/new/ https://reviews.llvm.org/D90329 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D90329: [PowerPC] Fix va_arg in Objective-C on 32-bit ELF targets
kernigh added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:4723 + bool isInt = Ty->isIntegerType() || Ty->hasPointerRepresentation() || + Ty->isAggregateType(); bool isF64 = Ty->isFloatingType() && getContext().getTypeSize(Ty) == 64; efriedma wrote: > I suspect this code doesn't handle C++ member pointers correctly. But maybe > we can leave that for a followup. > > Could we simplify this to `bool isInt = !Ty->isFloatingType();`? Yes, C++ member pointers are broken. I declared a struct animal, tried `typedef void (animal::*noise)(); ... va_arg(ap, noise)`, and it segfaulted. In the disassembly, I saw 2 problems: (1) va_arg was looking at the floating-point registers, and (2) va_arg was not indirecting through a pointer. The caller had passed a pointer to a `noise` in a general-purpose register. I'm not sure about `bool isInt = !Ty->isFloatingType();`, because I haven't checked whether it would break other types. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D90329/new/ https://reviews.llvm.org/D90329 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D90329: [PowerPC] Fix va_arg in Objective-C on 32-bit ELF targets
kernigh added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:4723 + bool isInt = Ty->isIntegerType() || Ty->hasPointerRepresentation() || + Ty->isAggregateType(); bool isF64 = Ty->isFloatingType() && getContext().getTypeSize(Ty) == 64; kernigh wrote: > efriedma wrote: > > I suspect this code doesn't handle C++ member pointers correctly. But > > maybe we can leave that for a followup. > > > > Could we simplify this to `bool isInt = !Ty->isFloatingType();`? > Yes, C++ member pointers are broken. I declared a struct animal, tried > `typedef void (animal::*noise)(); ... va_arg(ap, noise)`, and it segfaulted. > In the disassembly, I saw 2 problems: (1) va_arg was looking at the > floating-point registers, and (2) va_arg was not indirecting through a > pointer. The caller had passed a pointer to a `noise` in a general-purpose > register. > > I'm not sure about `bool isInt = !Ty->isFloatingType();`, because I haven't > checked whether it would break other types. `bool isInt = !Ty->isFloatingType();` is working for me. I'm doing more checks; if it continues to work, I will update this diff. Another change in `bool isIndirect = Ty->isAggregateType();` to instead call `isAggregateTypeForABI(Ty)` is fixing va_arg of a C++ member function pointer. I may want to put this change in the same diff because I will be running both changes together. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D90329/new/ https://reviews.llvm.org/D90329 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D90329: [PowerPC] Fix va_arg in C++, Objective-C on 32-bit ELF targets
kernigh updated this revision to Diff 305335. kernigh retitled this revision from "[PowerPC] Fix va_arg in Objective-C on 32-bit ELF targets" to "[PowerPC] Fix va_arg in C++, Objective-C on 32-bit ELF targets". kernigh edited the summary of this revision. kernigh added a comment. I have updated the diff to use `bool isInt = !Ty->isFloatingType();` and `bool isIndirect = isAggregateTypeForABI(Ty);`. This removes both calls to `Ty->isAggregateType()`. This fixes some C++ types, because aren't aggregates in the C++ language, but do become aggregates in the ABI. The new test ppc32-varargs-method.cpp checks a pointer to a member function, and passes only with both the `isInt` and `isIndirect` changes. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D90329/new/ https://reviews.llvm.org/D90329 Files: clang/lib/CodeGen/TargetInfo.cpp clang/test/CodeGenCXX/ppc32-varargs-method.cpp clang/test/CodeGenObjC/ppc32-varargs-id.m Index: clang/test/CodeGenObjC/ppc32-varargs-id.m === --- /dev/null +++ clang/test/CodeGenObjC/ppc32-varargs-id.m @@ -0,0 +1,33 @@ +// REQUIRES: powerpc-registered-target +// RUN: %clang_cc1 -triple powerpc-unknown-openbsd -fblocks -emit-llvm -o - %s | FileCheck %s + +#include + +id testObject(va_list ap) { + return va_arg(ap, id); +} +// CHECK: @testObject +// CHECK: using_regs: +// CHECK-NEXT: getelementptr inbounds %struct.__va_list_tag, %struct.__va_list_tag* %{{[0-9]+}}, i32 0, i32 4 +// CHECK-NEXT: load i8*, i8** %{{[0-9]+}}, align 4 +// CHECK-NEXT: mul i8 %numUsedRegs, 4 +// CHECK-NEXT: getelementptr inbounds i8, i8* %{{[0-9]+}}, i8 %{{[0-9]+}} +// CHECK-NEXT: bitcast i8* %{{[0-9]+}} to i8** +// CHECK-NEXT: add i8 %numUsedRegs, 1 +// CHECK-NEXT: store i8 %{{[0-9]+}}, i8* %gpr, align 4 +// CHECK-NEXT: br label %cont + +typedef void (^block)(void); +block testBlock(va_list ap) { + return va_arg(ap, block); +} +// CHECK: @testBlock +// CHECK: using_regs: +// CHECK-NEXT: getelementptr inbounds %struct.__va_list_tag, %struct.__va_list_tag* %{{[0-9]+}}, i32 0, i32 4 +// CHECK-NEXT: load i8*, i8** %{{[0-9]+}}, align 4 +// CHECK-NEXT: mul i8 %numUsedRegs, 4 +// CHECK-NEXT: getelementptr inbounds i8, i8* %{{[0-9]+}}, i8 %{{[0-9]+}} +// CHECK-NEXT: bitcast i8* %{{[0-9]+}} to void ()** +// CHECK-NEXT: add i8 %numUsedRegs, 1 +// CHECK-NEXT: store i8 %{{[0-9]+}}, i8* %gpr, align 4 +// CHECK-NEXT: br label %cont Index: clang/test/CodeGenCXX/ppc32-varargs-method.cpp === --- /dev/null +++ clang/test/CodeGenCXX/ppc32-varargs-method.cpp @@ -0,0 +1,20 @@ +// REQUIRES: powerpc-registered-target +// RUN: %clang_cc1 -triple powerpc-unknown-openbsd -emit-llvm -o - %s | FileCheck %s + +#include + +class something; +typedef void (something::*method)(); + +method test(va_list ap) { + return va_arg(ap, method); +} +// CHECK: using_regs: +// CHECK-NEXT: getelementptr inbounds %struct.__va_list_tag, %struct.__va_list_tag* %{{[0-9]+}}, i32 0, i32 4 +// CHECK-NEXT: load i8*, i8** %{{[0-9]+}}, align 4 +// CHECK-NEXT: mul i8 %numUsedRegs, 4 +// CHECK-NEXT: getelementptr inbounds i8, i8* %{{[0-9]+}}, i8 %{{[0-9]+}} +// CHECK-NEXT: bitcast i8* %{{[0-9]+}} to { i32, i32 }** +// CHECK-NEXT: add i8 %numUsedRegs, 1 +// CHECK-NEXT: store i8 %{{[0-9]+}}, i8* %gpr, align 4 +// CHECK-NEXT: br label %cont Index: clang/lib/CodeGen/TargetInfo.cpp === --- clang/lib/CodeGen/TargetInfo.cpp +++ clang/lib/CodeGen/TargetInfo.cpp @@ -4719,13 +4719,12 @@ // }; bool isI64 = Ty->isIntegerType() && getContext().getTypeSize(Ty) == 64; - bool isInt = - Ty->isIntegerType() || Ty->isPointerType() || Ty->isAggregateType(); + bool isInt = !Ty->isFloatingType(); bool isF64 = Ty->isFloatingType() && getContext().getTypeSize(Ty) == 64; // All aggregates are passed indirectly? That doesn't seem consistent // with the argument-lowering code. - bool isIndirect = Ty->isAggregateType(); + bool isIndirect = isAggregateTypeForABI(Ty); CGBuilderTy &Builder = CGF.Builder; Index: clang/test/CodeGenObjC/ppc32-varargs-id.m === --- /dev/null +++ clang/test/CodeGenObjC/ppc32-varargs-id.m @@ -0,0 +1,33 @@ +// REQUIRES: powerpc-registered-target +// RUN: %clang_cc1 -triple powerpc-unknown-openbsd -fblocks -emit-llvm -o - %s | FileCheck %s + +#include + +id testObject(va_list ap) { + return va_arg(ap, id); +} +// CHECK: @testObject +// CHECK: using_regs: +// CHECK-NEXT: getelementptr inbounds %struct.__va_list_tag, %struct.__va_list_tag* %{{[0-9]+}}, i32 0, i32 4 +// CHECK-NEXT: load i8*, i8** %{{[0-9]+}}, align 4 +// CHECK-NEXT: mul i8 %numUsedRegs, 4 +// CHECK-NEXT: getelementptr inbounds i8, i8* %{{[0-9]+}}, i8 %{{[0-9]+}} +// CHECK-NEXT: bitcast i8* %{{[0-9]+}} to i8** +// CHECK-NEXT: add i8 %numUsedRegs, 1 +// CHECK-NEXT: store i8 %{{[0-9]+}},
[PATCH] D90329: [PowerPC] Fix va_arg in Objective-C on 32-bit ELF targets
kernigh created this revision. kernigh added reviewers: brad, nemanjai, efriedma. Herald added subscribers: cfe-commits, shchenz, kbarton. Herald added a project: clang. kernigh requested review of this revision. In the PPC32 SVR4 ABI, a va_list has copies of registers from the function call. va_arg looked in the wrong registers for (the pointer representation of) an object in Objective-C. Fix va_arg to look in the general-purpose registers, not the floating-point registers. Anthony Richardby found the problem. Eli Friedman suggested to call hasPointerRepresentation(). Fixes https://bugs.llvm.org/show_bug.cgi?id=47921 Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D90329 Files: clang/lib/CodeGen/TargetInfo.cpp clang/test/CodeGenObjC/ppc32-varargs-id.m Index: clang/test/CodeGenObjC/ppc32-varargs-id.m === --- /dev/null +++ clang/test/CodeGenObjC/ppc32-varargs-id.m @@ -0,0 +1,33 @@ +// REQUIRES: powerpc-registered-target +// RUN: %clang_cc1 -triple powerpc-unknown-openbsd -fblocks -emit-llvm -o - %s | FileCheck %s + +#include + +id testObject(va_list ap) { + return va_arg(ap, id); +} +// CHECK: @testObject +// CHECK: using_regs: +// CHECK-NEXT: getelementptr inbounds %struct.__va_list_tag, %struct.__va_list_tag* %{{[0-9]+}}, i32 0, i32 4 +// CHECK-NEXT: load i8*, i8** %{{[0-9]+}}, align 4 +// CHECK-NEXT: mul i8 %numUsedRegs, 4 +// CHECK-NEXT: getelementptr inbounds i8, i8* %{{[0-9]+}}, i8 %{{[0-9]+}} +// CHECK-NEXT: bitcast i8* %{{[0-9]+}} to i8** +// CHECK-NEXT: add i8 %numUsedRegs, 1 +// CHECK-NEXT: store i8 %{{[0-9]+}}, i8* %gpr, align 4 +// CHECK-NEXT: br label %cont + +typedef void (^block)(void); +block testBlock(va_list ap) { + return va_arg(ap, block); +} +// CHECK: @testBlock +// CHECK: using_regs: +// CHECK-NEXT: getelementptr inbounds %struct.__va_list_tag, %struct.__va_list_tag* %{{[0-9]+}}, i32 0, i32 4 +// CHECK-NEXT: load i8*, i8** %{{[0-9]+}}, align 4 +// CHECK-NEXT: mul i8 %numUsedRegs, 4 +// CHECK-NEXT: getelementptr inbounds i8, i8* %{{[0-9]+}}, i8 %{{[0-9]+}} +// CHECK-NEXT: bitcast i8* %{{[0-9]+}} to void ()** +// CHECK-NEXT: add i8 %numUsedRegs, 1 +// CHECK-NEXT: store i8 %{{[0-9]+}}, i8* %gpr, align 4 +// CHECK-NEXT: br label %cont Index: clang/lib/CodeGen/TargetInfo.cpp === --- clang/lib/CodeGen/TargetInfo.cpp +++ clang/lib/CodeGen/TargetInfo.cpp @@ -4719,8 +4719,8 @@ // }; bool isI64 = Ty->isIntegerType() && getContext().getTypeSize(Ty) == 64; - bool isInt = - Ty->isIntegerType() || Ty->isPointerType() || Ty->isAggregateType(); + bool isInt = Ty->isIntegerType() || Ty->hasPointerRepresentation() || + Ty->isAggregateType(); bool isF64 = Ty->isFloatingType() && getContext().getTypeSize(Ty) == 64; // All aggregates are passed indirectly? That doesn't seem consistent Index: clang/test/CodeGenObjC/ppc32-varargs-id.m === --- /dev/null +++ clang/test/CodeGenObjC/ppc32-varargs-id.m @@ -0,0 +1,33 @@ +// REQUIRES: powerpc-registered-target +// RUN: %clang_cc1 -triple powerpc-unknown-openbsd -fblocks -emit-llvm -o - %s | FileCheck %s + +#include + +id testObject(va_list ap) { + return va_arg(ap, id); +} +// CHECK: @testObject +// CHECK: using_regs: +// CHECK-NEXT: getelementptr inbounds %struct.__va_list_tag, %struct.__va_list_tag* %{{[0-9]+}}, i32 0, i32 4 +// CHECK-NEXT: load i8*, i8** %{{[0-9]+}}, align 4 +// CHECK-NEXT: mul i8 %numUsedRegs, 4 +// CHECK-NEXT: getelementptr inbounds i8, i8* %{{[0-9]+}}, i8 %{{[0-9]+}} +// CHECK-NEXT: bitcast i8* %{{[0-9]+}} to i8** +// CHECK-NEXT: add i8 %numUsedRegs, 1 +// CHECK-NEXT: store i8 %{{[0-9]+}}, i8* %gpr, align 4 +// CHECK-NEXT: br label %cont + +typedef void (^block)(void); +block testBlock(va_list ap) { + return va_arg(ap, block); +} +// CHECK: @testBlock +// CHECK: using_regs: +// CHECK-NEXT: getelementptr inbounds %struct.__va_list_tag, %struct.__va_list_tag* %{{[0-9]+}}, i32 0, i32 4 +// CHECK-NEXT: load i8*, i8** %{{[0-9]+}}, align 4 +// CHECK-NEXT: mul i8 %numUsedRegs, 4 +// CHECK-NEXT: getelementptr inbounds i8, i8* %{{[0-9]+}}, i8 %{{[0-9]+}} +// CHECK-NEXT: bitcast i8* %{{[0-9]+}} to void ()** +// CHECK-NEXT: add i8 %numUsedRegs, 1 +// CHECK-NEXT: store i8 %{{[0-9]+}}, i8* %gpr, align 4 +// CHECK-NEXT: br label %cont Index: clang/lib/CodeGen/TargetInfo.cpp === --- clang/lib/CodeGen/TargetInfo.cpp +++ clang/lib/CodeGen/TargetInfo.cpp @@ -4719,8 +4719,8 @@ // }; bool isI64 = Ty->isIntegerType() && getContext().getTypeSize(Ty) == 64; - bool isInt = - Ty->isIntegerType() || Ty->isPointerType() || Ty->isAggregateType(); + bool isInt = Ty->isIntegerType() || Ty->hasPointerRepresentation() || + Ty->isAggregateType(); bool isF64 = Ty->isFloatingType() && getContext().getTypeSize(Ty)
[PATCH] D90329: [PowerPC] Fix va_arg in Objective-C on 32-bit ELF targets
kernigh added a comment. I'm using this TargetInfo.cpp diff in clang 10.0.1 on OpenBSD/macppc to build GNUstep. For me, it fixes the build of gnustep-gui. I rebuilt libobjc2, gnustep-make, and gnustep-base before building gnustep-gui. This diff doesn't change 64-bit PowerPC, because 64-bit does va_list in a different way. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D90329/new/ https://reviews.llvm.org/D90329 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D90329: [PowerPC] Fix va_arg in C++, Objective-C on 32-bit ELF targets
kernigh added a comment. I forgot about this diff for a month. Comment at: clang/lib/CodeGen/TargetInfo.cpp:4722 bool isI64 = Ty->isIntegerType() && getContext().getTypeSize(Ty) == 64; - bool isInt = - Ty->isIntegerType() || Ty->isPointerType() || Ty->isAggregateType(); + bool isInt = !Ty->isFloatingType(); bool isF64 = Ty->isFloatingType() && getContext().getTypeSize(Ty) == 64; nemanjai wrote: > nit (feel free to ignore): seems like a better name might be something like > `isSingleGPR` since it would appear that this is for types that go into a > (single) general purpose register. Hi. `isInt` is also true for a 64-bit integer, which goes into a pair of GPRs, because each register has 32 bits. Comment at: clang/test/CodeGenCXX/ppc32-varargs-method.cpp:15 +// CHECK-NEXT: load i8*, i8** %{{[0-9]+}}, align 4 +// CHECK-NEXT: mul i8 %numUsedRegs, 4 +// CHECK-NEXT: getelementptr inbounds i8, i8* %{{[0-9]+}}, i8 %{{[0-9]+}} efriedma wrote: > Not sure referring to numUsedRegs like this will work in a non-Asserts build. > Please verify. I did a build with cmake -DLLVM_ENABLE_ASSERTIONS=OFF and verified that the ppc32*varargs* tests still pass. `%numUsedRegs` is still in the llvm output. `%numUsedRegs` //does not// appear in the output of `clang -S -emit-llvm ...` (not cc1), but `%numUsedRegs` //does// appear in the output of `clang -cc1 ...`. The test runs cc1 and sees the `%numUsedRegs`. I am confused by how llvm sometimes erases the name of `%numUsedRegs` and sometimes keeps the name, but I observe that turning LLVM_ENABLE_ASSERTIONS on or off doesn't affect the test. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D90329/new/ https://reviews.llvm.org/D90329 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D73290: [PowerPC] Add clang -msvr4-struct-return for 32-bit ELF
kernigh created this revision. kernigh added reviewers: brad, markmi, chmeee. kernigh added a project: clang. Herald added subscribers: cfe-commits, steven.zhang, shchenz, jsji, kbarton, krytarowski, arichardson, nemanjai, emaste. This is a patch for https://bugs.llvm.org/show_bug.cgi?id=40736 **Beware:** This diff passes its own tests, and writes asm that looks correct to me, but I have not yet tried running the code on PowerPC. I intend to backport the diff to llvm/clang 8.0.1 (the version in OpenBSD), and try it on an old and slow PowerPC Macintosh. I upload the diff now, because other people might want to see it. This is only my 2nd patch on reviews.llvm.org, so I might have filled some fields wrong. I don't have commit access to LLVM. [PowerPC] Add clang -msvr4-struct-return for 32-bit ELF Change the default ABI to be compatible with GCC. For 32-bit ELF targets other than Linux, Clang now returns small structs in registers r3/r4. This affects FreeBSD, NetBSD, OpenBSD. There is no change for 32-bit Linux, where Clang continues to return all structs in memory. Add clang options -maix-struct-return (to return structs in memory) and -msvr4-struct-return (to return structs in registers) to be compatible with gcc. These options are only for PPC32; reject them on PPC64 and other targets. The options are like -fpcc-struct-return and -freg-struct-return for X86_32, and use similar code. To actually return a struct in registers, coerce it to an integer of the same size. LLVM may optimize the code to remove unnecessary accesses to memory, and will return i32 in r3 or i64 in r3:r4. Fixes PR#40736 Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D73290 Files: clang/docs/ClangCommandLineReference.rst clang/include/clang/Driver/Options.td clang/lib/CodeGen/TargetInfo.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/ppc32-struct-return.c clang/test/Driver/ppc-unsupported.c Index: clang/test/Driver/ppc-unsupported.c === --- /dev/null +++ clang/test/Driver/ppc-unsupported.c @@ -0,0 +1,10 @@ +// REQUIRES: powerpc-registered-target +// RUN: not %clang -target powerpc64-unknown-freebsd -maix-struct-return \ +// RUN: -c %s 2>&1 | FileCheck %s +// RUN: not %clang -target powerpc64-unknown-freebsd -msvr4-struct-return \ +// RUN: -c %s 2>&1 | FileCheck %s +// RUN: not %clang -target powerpc64le-unknown-linux -maix-struct-return \ +// RUN: -c %s 2>&1 | FileCheck %s +// RUN: not %clang -target powerpc64le-unknown-linux -msvr4-struct-return \ +// RUN: -c %s 2>&1 | FileCheck %s +// CHECK: unsupported option Index: clang/test/CodeGen/ppc32-struct-return.c === --- /dev/null +++ clang/test/CodeGen/ppc32-struct-return.c @@ -0,0 +1,88 @@ +// REQUIRES: powerpc-registered-target +// RUN: %clang_cc1 -triple powerpc-unknown-freebsd \ +// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-SVR4 +// RUN: %clang_cc1 -triple powerpc-unknown-linux \ +// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-AIX +// RUN: %clang_cc1 -triple powerpc-unknown-linux -maix-struct-return \ +// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-AIX +// RUN: %clang_cc1 -triple powerpc-unknown-linux -msvr4-struct-return \ +// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-SVR4 +// RUN: %clang_cc1 -triple powerpc-unknown-netbsd \ +// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-SVR4 +// RUN: %clang_cc1 -triple powerpc-unknown-openbsd \ +// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-SVR4 +// RUN: %clang_cc1 -triple powerpc-unknown-openbsd -maix-struct-return \ +// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-AIX +// RUN: %clang_cc1 -triple powerpc-unknown-openbsd -msvr4-struct-return \ +// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-SVR4 + +typedef struct { +} Zero; +typedef struct { + char c; +} One; +typedef struct { + short s; +} Two; +typedef struct { + char c[3]; +} Three; +typedef struct { + float f; +} Four; // svr4 to return i32, not float +typedef struct { + char c[5]; +} Five; +typedef struct { + short s[3]; +} Six; +typedef struct { + char c[7]; +} Seven; +typedef struct { + int i; + char c; +} Eight; // padded for alignment +typedef struct { + char c[9]; +} Nine; + +// CHECK-AIX-LABEL: define void @ret0(%struct.Zero* noalias sret {{[^,]*}}) +// CHECK-SVR4-LABEL: define void @ret0() +Zero ret0(void) { return (Zero){}; } + +// CHECK-AIX-LABEL: define void @ret1(%struct.One* noalias sret {{[^,]*}}) +// CHECK-SVR4-LABEL: define i8 @ret1() +One ret1(void) { return (One){'a'}; } + +// CHECK-AIX-LABEL: define void @ret2(%struct.Two* noalias sret {{[^,]*}}) +// CHECK-SVR4-LABEL: define i16 @ret2() +Two ret2(void) { return (Two){123}; } + +// CHECK-AIX-LABEL: define void @ret3(%struct