craig.topper added a comment.

While X86 does have multiplies that return double-width results, it also has 16/32/64-bit forms of imul that return only the lower half of the result. Those multiplies are typically faster and use fewer uops than the double-width multiplies, so we prefer those instructions when possible. We currently use the double-width multiply for 8*8->8 multiplies because there is no 8-bit single-width multiply.
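To make the distinction concrete, here is a minimal C sketch of the three multiply shapes discussed above. The function names are hypothetical and chosen only for illustration; which instruction a compiler actually selects for each one depends on the target, the optimization level, and the compiler version, so the comments describe what one would typically expect rather than a guarantee.

```c
#include <stdint.h>

/* Single-width multiply: only the low 32 bits of the product are kept.
   On x86 this shape can typically be lowered to the two-operand imul. */
uint32_t mul_lo32(uint32_t a, uint32_t b) {
    return a * b; /* unsigned arithmetic wraps modulo 2^32 */
}

/* Double-width multiply: the full 64-bit product of two 32-bit operands.
   This is the shape that needs the high half of the result as well. */
uint64_t mul_wide32(uint32_t a, uint32_t b) {
    return (uint64_t)a * b; /* widen one operand before multiplying */
}

/* 8*8->8 multiply: the operands are promoted to int, multiplied, and the
   product is truncated back to 8 bits. x86 has no 8-bit single-width
   multiply, which is the case the comment above refers to. */
uint8_t mul_lo8(uint8_t a, uint8_t b) {
    return (uint8_t)(a * b);
}
```

Comparing the generated assembly for these three functions (for example with `-O2 -S`) is one way to see which multiply form the backend picks in each case.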
https://reviews.llvm.org/D44559