================
@@ -21248,6 +21297,51 @@ static SDValue foldTruncStoreOfExt(SelectionDAG &DAG, SDNode *N) {
   return SDValue();
 }
 
+// A custom combine to lower load <3 x i8> as the more efficient sequence
+// below:
+//    ldrb wX, [x0, #2]
+//    ldrh wY, [x0]
+//    orr wX, wY, wX, lsl #16
+//    fmov s0, wX
+//
+static SDValue combineV3I8LoadExt(LoadSDNode *LD, SelectionDAG &DAG) {
+  EVT MemVT = LD->getMemoryVT();
+  if (MemVT != EVT::getVectorVT(*DAG.getContext(), MVT::i8, 3) ||
+      LD->getOriginalAlign() >= 4)
+    return SDValue();
+
+  SDLoc DL(LD);
+  MachineFunction &MF = DAG.getMachineFunction();
+  SDValue Chain = LD->getChain();
+  SDValue BasePtr = LD->getBasePtr();
+  MachineMemOperand *MMO = LD->getMemOperand();
+  assert(LD->getOffset().isUndef() && "undef offset expected");
----------------
fhahn wrote:

Could you share more details? I'm not sure how encountering this with a non-AArch64 backend would directly translate to the AArch64 backend. I am also curious how the `<3 x i8>` vectors were introduced before the backend.

> Should we use some if-branch to detect this, and return SDValue()?

We could, but unless we have a test case, I'd prefer to keep it as an assert for now, so that we have a chance of catching a test case if one exists.

https://github.com/llvm/llvm-project/pull/78632
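
For reference, here is a minimal sketch of the value that the `ldrb`/`ldrh`/`orr` sequence in the patch comment would place in `wX` before the `fmov`, assuming a little-endian target. The function name `packV3I8` is hypothetical and not part of the patch; it only models the integer arithmetic, not the DAG combine itself.

```cpp
#include <cstdint>

// Hypothetical model (not from the patch) of the scalar part of the sequence:
//   ldrb wX, [x0, #2]
//   ldrh wY, [x0]
//   orr  wX, wY, wX, lsl #16
// Assumes little-endian byte order for the halfword load.
inline uint32_t packV3I8(const uint8_t *p) {
  // ldrh wY, [x0]: bytes 0 and 1 form the low 16 bits.
  uint32_t lo = static_cast<uint32_t>(p[0]) |
                (static_cast<uint32_t>(p[1]) << 8);
  // ldrb wX, [x0, #2]: byte 2 on its own.
  uint32_t hi = static_cast<uint32_t>(p[2]);
  // orr wX, wY, wX, lsl #16: byte 2 lands in bits 16..23, bits 24..31 stay zero.
  return lo | (hi << 16);
}
```

Only three bytes are ever read, so the model never touches memory past the end of the `<3 x i8>` value.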