pengfei accepted this revision. pengfei added inline comments. This revision is now accepted and ready to land.
================ Comment at: clang/lib/Headers/fmaintrin.h:22 +/// Computes a multiply-add of 128-bit vectors of [4 x float]. +/// For each element, computes <c> (__A * __B) + __C </c>. +/// ---------------- probinson wrote: > pengfei wrote: > > We are using a special format to describute the function in a pseudo code > > to share it with the intrinsic guide, e.g., > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/avx512fintrin.h#L9604-L9610 > > https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm512_i32logather_pd&ig_expand=4077 > > > > There's no strong requirement to follow it, but it would be better to adopt > > a uniform format. > Is a FOR loop with computed bit offsets really clearer than "For each > element" ? Is it valuable to repeat information that can be found in the > instruction reference? > I can accept answers of "yes" and "yes" because I am not someone who ever > deals with vector data, but I would be a little surprised by those answers. > We have internal tools to verify them according to these offsets (but of course not for 100% intrinsics), which means we can trust such information in most time. The suggestion is based on: 1. Correctness. It's easy to be errorprone for tedious documentations and we don't have other verifications but eyes; 2. Unity. Using the same format is not only better for clean code but also distinct for future developers to follow; As I said before, I'm not forcing this way. You decide it. ================ Comment at: clang/lib/Headers/fmaintrin.h:26 +/// +/// This intrinsic corresponds to the <c> VFMADD213PS </c> instruction. +/// ---------------- probinson wrote: > pengfei wrote: > > It would be odd to user given this is just 1/3 instructions the intrinsic > > may generate, but I don't have a good idea here. > I listed the 213 version because that's the one that multiplies the first two > operands, and the intrinsic multiplies the first two operands. So it's the > instruction that most closely corresponds to the intrinsic. > We don't guarantee that the "corresponding" instruction is what is actually > generated, in general. I know this point has come up before regarding > intrinsic descriptions. My thinking is that the "corresponding instruction" > gives the reader a place to look in the instruction reference manual, so > listing only one is again okay. Note, intrinsics are force inlined. It's rare user will wrap it in a simple function with the same arguments order. In reality, each version may be generated. https://godbolt.org/z/q7j8Wxrnb However, I'm not against this approach if we don't have any better way to describe it. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D148021/new/ https://reviews.llvm.org/D148021 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits