[PATCH] D119720: [ARM] Pass for Cortex-A57 and Cortex-A72 Fused AES Erratum

Sam Elliott via Phabricator via cfe-commits Thu, 31 Mar 2022 02:13:42 -0700

lenary added a comment.

Thanks for the review. Lots of comments inline, hopefully Phab doesn't mangle 
the large one.

================
Comment at: llvm/lib/Target/ARM/ARMFixCortexA57AES1742098Pass.cpp:145
+  case ARM::VMVNq:
+    return CondCodeIsAL(3);
+  // VMOV of 64-bit value between D registers (when condition = al)
----------------
dmgreen wrote:
> Can/should all these use findFirstPredOperandIdx?
> 
> And is it worth checking for more instruction? Anything that sets a Q 
> register, or is that too broad?
`findFirstPredOperandIdx` doesn't work as lots of these instructions are not 
marked `isPredicable` in the tablegen. I'm not sure if we want to solve that in 
this work, or as a follow-up (I'd lean towards follow-up).

I believe "anything that sets a Q register" is too broad, as we model 
subregister insertion as setting the the whole register in LLVM, but I'm not 
sure that micro-architecturally they are actually doing that. This is why I've 
tried to only add 64- and 128-bit setting instructions rather than ones that 
are less wide. Originally I also included the `VMOVv2f32` instructions that are 
now at the bottom of this switch, but I felt that might have been too risky.

================
Comment at: llvm/lib/Target/ARM/ARMFixCortexA57AES1742098Pass.cpp:250
+    // Early return if no instructions are the start of an AES Pair.
+    if (!llvm::any_of(MBB.instrs(), isFirstAESPairInstr))
+      continue;
----------------
dmgreen wrote:
> This needn't scan through checking for the instruction that the loop below is 
> going to check for too.
Ack. Will remove. I think this is vestigal from a previous (unshared) version 
of the patch which was doing something more complex in the loop.

================
Comment at: llvm/lib/Target/ARM/ARMFixCortexA57AES1742098Pass.cpp:306
+          // fixup at the start of the function.
+          LLVM_DEBUG(dbgs()
+                     << "Fixup Planned: Live-In (with safe defining instrs): "
----------------
dmgreen wrote:
> Can you explain more about the IsLiveIn && UnsafeCount==0 case. Am I 
> understanding that correctly that it would be:
> ```
> function(q0, ...)
>   lotsofcode...
>   q0 = load
>   aes q0
> ```
> Is there a better way to detect that the live-in doesn't matter in cases like 
> that?
I don't believe there is, and this comes down to issues with the 
`RDA.getGlobalReachingDefs` implementation, which I want to fix/enhance, but in 
a follow-up patch.

To start with, this is actually not a problem, as the pass is intended to be 
conservative, and we know the clobbers are a no-op at the architectural level 
(we insert them for their micro-architectural effects). So code will still do 
the right thing, but maybe with a little too much overhead in the case you 
showed.

However, this is necessary in some other cases, such as:

```
function(q0)
   code
   conditional branch to L2
L1:
   q0 = safe_op(…)
   branch to L3
L2:
   code without update to q0
L3:
   aes q0
```

In this case, `AllDefs` is a set containing one single defining instruction for 
Q0, because there is only one within the function (which is all that 
`RDA.getGlobalReachingDefs` can report instructions for).

But in my example, we *need* to protect q0 on the other paths, because the safe 
definitions of q0, when considered as a set, do not entirely dominate the AES 
use of q0 (this is slightly stretching the conventional definition of 
dominance, but think of this as "there exists a path from entry to the aes, 
which does not contain any of the safe instructions". Sadly, MachineDominance 
doesn't allow us to make this kind of query either!).

In this case though, it is safe to insert the protection at function entry, 
because that will (by definition) dominate all the AES uses, and the protection 
doesn't need to be dominated by the safe definitions, as we know they're safe.

I intend to follow-up this initial patch with an enhancement to 
ReachingDefAnalysis which will also provide the information that you have a set 
of defs inside the function, and also you're live-in, as this is required info 
for any conservative pass using the ReachingDefAnalysis. I felt, however, that 
given the pass is safe as-is, it was good to proceed without this enhancement.

================
Comment at: llvm/lib/Target/ARM/ARMTargetMachine.cpp:585

+  addPass(createARMFixCortexA57AES1742098Pass());
+
----------------
dmgreen wrote:
> Passes can't insert new instructions (or move things further apart) after 
> ConstantIslandPass. The branches/constant pools it has created may go out of 
> range of the instructions that use them. Would it be OK to move it before 
> that?
TIL. I'll add a comment about the constant island pass as well.

Should I also look at the restrictions on the Branch Targets pass? I can 
imagine we also don't want to separate instructions once we've calculated their 
targets locally?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119720/new/

https://reviews.llvm.org/D119720

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D119720: [ARM] Pass for Cortex-A57 and Cortex-A72 Fused AES Erratum

Reply via email to