[PATCH] D119720: [ARM] Pass for Cortex-A57 and Cortex-A72 Fused AES Erratum

Sam Elliott via Phabricator via cfe-commits Tue, 05 Apr 2022 08:59:41 -0700

lenary marked 6 inline comments as done.
lenary added a comment.

A few comments before I post the next version of the patch.




================
Comment at: llvm/lib/Target/ARM/ARMFixCortexA57AES1742098Pass.cpp:145
+  case ARM::VMVNq:
+    return CondCodeIsAL(3);
+  // VMOV of 64-bit value between D registers (when condition = al)
----------------
lenary wrote:
> dmgreen wrote:
> > Can/should all these use findFirstPredOperandIdx?
> > 
> > And is it worth checking for more instruction? Anything that sets a Q 
> > register, or is that too broad?
> `findFirstPredOperandIdx` doesn't work as lots of these instructions are not 
> marked `isPredicable` in the tablegen. I'm not sure if we want to solve that 
> in this work, or as a follow-up (I'd lean towards follow-up).
> 
> I believe "anything that sets a Q register" is too broad, as we model 
> subregister insertion as setting the the whole register in LLVM, but I'm not 
> sure that micro-architecturally they are actually doing that. This is why 
> I've tried to only add 64- and 128-bit setting instructions rather than ones 
> that are less wide. Originally I also included the `VMOVv2f32` instructions 
> that are now at the bottom of this switch, but I felt that might have been 
> too risky.
I'm wrong about this. The tablegen `isPredicable` is an override, other code 
might also set `isPredicable` to true, so I think `findFirstPredOperandIdx` 
should work too.


================
Comment at: llvm/lib/Target/ARM/ARMFixCortexA57AES1742098Pass.cpp:200-207
+  case ARM::VMOVv2f32:
+  case ARM::VMOVv4f32:
+  case ARM::VMOVv2i32:
+  case ARM::VMOVv4i32:
+  case ARM::VMOVv4i16:
+  case ARM::VMOVv8i16:
+  case ARM::VMOVv8i8:
----------------
dmgreen wrote:
> Are these vmov of an immediate? Are they not safe?
> 
> I was expecting it to be the lanes sets (VSETLNi8) and other scalar 
> instructions that were unsafe.
I have received the information on what is safe and what is not, and the next 
version of the patch will have this correct.


================
Comment at: llvm/lib/Target/ARM/ARMFixCortexA57AES1742098Pass.cpp:306
+          // fixup at the start of the function.
+          LLVM_DEBUG(dbgs()
+                     << "Fixup Planned: Live-In (with safe defining instrs): "
----------------
dmgreen wrote:
> lenary wrote:
> > dmgreen wrote:
> > > Can you explain more about the IsLiveIn && UnsafeCount==0 case. Am I 
> > > understanding that correctly that it would be:
> > > ```
> > > function(q0, ...)
> > >   lotsofcode...
> > >   q0 = load
> > >   aes q0
> > > ```
> > > Is there a better way to detect that the live-in doesn't matter in cases 
> > > like that?
> > I don't believe there is, and this comes down to issues with the 
> > `RDA.getGlobalReachingDefs` implementation, which I want to fix/enhance, 
> > but in a follow-up patch.
> > 
> > To start with, this is actually not a problem, as the pass is intended to 
> > be conservative, and we know the clobbers are a no-op at the architectural 
> > level (we insert them for their micro-architectural effects). So code will 
> > still do the right thing, but maybe with a little too much overhead in the 
> > case you showed.
> > 
> > However, this is necessary in some other cases, such as:
> > 
> > ```
> > function(q0)
> >    code
> >    conditional branch to L2
> > L1:
> >    q0 = safe_op(…)
> >    branch to L3
> > L2:
> >    code without update to q0
> > L3:
> >    aes q0
> > ```
> > 
> > In this case, `AllDefs` is a set containing one single defining instruction 
> > for Q0, because there is only one within the function (which is all that 
> > `RDA.getGlobalReachingDefs` can report instructions for).
> > 
> > But in my example, we *need* to protect q0 on the other paths, because the 
> > safe definitions of q0, when considered as a set, do not entirely dominate 
> > the AES use of q0 (this is slightly stretching the conventional definition 
> > of dominance, but think of this as "there exists a path from entry to the 
> > aes, which does not contain any of the safe instructions". Sadly, 
> > MachineDominance doesn't allow us to make this kind of query either!).
> > 
> > In this case though, it is safe to insert the protection at function entry, 
> > because that will (by definition) dominate all the AES uses, and the 
> > protection doesn't need to be dominated by the safe definitions, as we know 
> > they're safe.
> > 
> > I intend to follow-up this initial patch with an enhancement to 
> > ReachingDefAnalysis which will also provide the information that you have a 
> > set of defs inside the function, and also you're live-in, as this is 
> > required info for any conservative pass using the ReachingDefAnalysis. I 
> > felt, however, that given the pass is safe as-is, it was good to proceed 
> > without this enhancement.
> > 
> > 
> > 
> > 
> OK sounds good.
One note is that the exact problem you describe does show up in the tests (in 
`aese_set64_via_ptr`, where the `vldr` is "safe"), so when the pass is 
enhanced, we will notice the improvements. 


================
Comment at: llvm/lib/Target/ARM/ARMTargetMachine.cpp:585
 
+  addPass(createARMFixCortexA57AES1742098Pass());
+
----------------
dmgreen wrote:
> lenary wrote:
> > dmgreen wrote:
> > > Passes can't insert new instructions (or move things further apart) after 
> > > ConstantIslandPass. The branches/constant pools it has created may go out 
> > > of range of the instructions that use them. Would it be OK to move it 
> > > before that?
> > TIL. I'll add a comment about the constant island pass as well.
> > 
> > Should I also look at the restrictions on the Branch Targets pass? I can 
> > imagine we also don't want to separate instructions once we've calculated 
> > their targets locally?
> Yeah - It sounds like the BTI would need to remain as the first instruction 
> in the block.
Turns out the AES pass doesn't *have* to come before the BTI pass, because AES 
instructions are only available on A-profile, and BTI is M-profile. I still 
will move it to before all these passes anyway, just so it's clear what is 
going on.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119720/new/

https://reviews.llvm.org/D119720

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D119720: [ARM] Pass for Cortex-A57 and Cortex-A72 Fused AES Erratum

Reply via email to