| Field | Value |
| --- | --- |
| Issue | 144845 |
| Summary | __arm_rsr64 treated as CSE'able on arm32 |
| Labels | backend:ARM |
| Assignees | |
| Reporter | frobtech |

Consider this code:
```
#include <arm_acle.h>
#include <stdint.h>
#ifdef __aarch64__
#define REG "cntvct_el0"
#else
#define REG "cp15:1:c14"
#endif
uint64_t get_cntvct_xor() {
  uint64_t v1 = __arm_rsr64(REG);
  uint64_t v2 = __arm_rsr64(REG);
  return v1 ^ v2;
}
```
When compiled for aarch64, it produces two `mrs` instructions as expected.
For example, `clang++ --target=aarch64-fuchsia -S -o - -O2 rsr.cc` produces (trimmed):
```
_Z14get_cntvct_xorv:                    // @_Z14get_cntvct_xorv
        .cfi_startproc
// %bb.0:
        mrs     x8, CNTVCT_EL0
        mrs     x9, CNTVCT_EL0
        eor     x0, x9, x8
        ret
```
However, when compiled for aarch32, the compiler acts as if the intrinsic has "non-volatile" semantics: it presumes the two calls return the same value, merges them, and folds the XOR to a constant zero.
For example, `clang++ --target=armv7-linux-gnueabihf -S -o - -O2 rsr.cc` produces (trimmed):
```
_Z14get_cntvct_xorv:                    @ @_Z14get_cntvct_xorv
        .fnstart
@ %bb.0:
        mov     r0, #0
        mov     r1, #0
        bx      lr
```
(It's similar with `-mthumb` added.)
The ARM ACLE spec does not say whether the `__arm_rsr64` lowering should have "volatile" (non-CSE'able) or "non-volatile" (CSE'able) semantics. But for aarch64, both LLVM and GCC agree that it has the "volatile" semantics, and users now rely on that.
This example is exercising the aarch64 and aarch32 spellings of the exact same hardware access. IMHO they should definitely be treated consistently between the two backends. (GCC does not support the same intrinsics for aarch32 targets as for aarch64, so we don't have that precedent to refer to here.)
That seems to be the intent of the LLVM code too. To wit, in both cases above with `-emit-llvm` added, the IR is basically the same:
```
define dso_local noundef i64 @_Z14get_cntvct_xorv() local_unnamed_addr #0 {
  %1 = tail call i64 @llvm.read_volatile_register.i64(metadata !5)
  %2 = tail call i64 @llvm.read_volatile_register.i64(metadata !5)
  %3 = xor i64 %2, %1
  ret i64 %3
}
```
It certainly seems wrong that `llvm.read_volatile_register.i64` is being lowered on aarch32 as CSE'able. The "volatile" in the name really suggests the contrary.
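For comparison, here is a minimal sketch of the "volatile" semantics expected above, written with explicit `asm volatile` instead of the intrinsic. This is an illustration only, not a proposed fix: the helper names `read_cntvct_volatile` and `get_cntvct_xor_asm` are hypothetical, and the `std::chrono` fallback is an assumption added solely so the snippet builds on non-ARM hosts.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper (not part of ACLE or LLVM): reads the virtual counter
// through a volatile asm statement, which the compiler must not merge or CSE.
static inline uint64_t read_cntvct_volatile() {
#if defined(__aarch64__)
  uint64_t v;
  __asm__ volatile("mrs %0, cntvct_el0" : "=r"(v));
  return v;
#elif defined(__arm__)
  uint64_t v;
  // Same register as "cp15:1:c14" in the report. %Q0 and %R0 select the
  // low and high 32-bit halves of the 64-bit output operand.
  __asm__ volatile("mrrc p15, 1, %Q0, %R0, c14" : "=r"(v));
  return v;
#else
  // Assumption: stand-in for non-ARM hosts so the sketch still compiles.
  return static_cast<uint64_t>(
      std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Mirrors get_cntvct_xor() from the report, using the helper above.
uint64_t get_cntvct_xor_asm() {
  uint64_t v1 = read_cntvct_volatile();
  uint64_t v2 = read_cntvct_volatile();
  return v1 ^ v2;
}
```

With this form, both reads should survive optimization on either ARM target, which is the behavior the report argues `__arm_rsr64` ought to have on aarch32 as well.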