pengfei added inline comments.

================
Comment at: clang/include/clang/CodeGen/CGFunctionInfo.h:590
+  /// Log 2 of the maximum vector width.
+  unsigned MaxVectorWidth : 4;
+
----------------
LuoYuanke wrote:
> I notice some code would indicate it is log 2 size with Log2 suffix in the 
> variable name. Do you think it is more readable to add Log2 suffix?
I think it doesn't matter. We have getter and setter for the variable. People 
won't access it directly.


================
Comment at: clang/lib/CodeGen/CGCall.cpp:5238
+  for (unsigned i = 0; i < IRCallArgs.size(); ++i)
+    LargestVectorWidth = std::max(LargestVectorWidth,
+                                  getMaxVectorWidth(IRCallArgs[i]->getType()));
----------------
LuoYuanke wrote:
> Does this also affect other calling convention besides fastcall?
I don't think so. The change here adds for the missing cases like `[2 x <4 x 
double>]` or `{ <2 x double>, <4 x double> }` which should also set 
`min-legal-width-width` to the maximum of the vector length.
There're several reasons why other calling convention won't be affected.
1. If a target has ability to pass arguments like `[2 x <4 x double>]`, it must 
have the ability for `<4 x double>` and have set `min-legal-width-width` to 256 
when passing it. So it makes more sense to set `min-legal-width-width` to 256 
for `[2 x <4 x double>]` rather than keeping it as 0;
2. AFAIK, targets other than X86 simply ignore `min-legal-width-width`. So the 
change won't affect them;
3. On x86, calling conventions other than regcall don't allow arguments size 
larger than 512, see `if (!IsRegCall && Size > 512)`. They will be turned into 
pointers, so they won't be affected by this change;
4. For arguments size no larger than 512 and only contain single vector 
element, we have already turned them into pure vectors. So they have already 
set `min-legal-width-width` to the correct value;
5. For arguments have more then one vector elements. Clang has bug which 
doesn't match with GCC and ICC. I have filed a bug here 
https://github.com/llvm/llvm-project/issues/54582
6. Thus, only regcall can generate arguments type like `[2 x <4 x double>]` on 
X86. So only it will be affected by this.


================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:3031
     // than eight eightbytes, ..., it has class MEMORY.
-    if (Size > 512)
+    if (!IsRegCall && Size > 512)
       return;
----------------
LuoYuanke wrote:
> Would you add a test for non regcall? Pass 1024 bit vector parameter and 
> check if it is well handled both with regcall and without regcall.
> Would you add comments to depict why regcall accept the size which is more 
> than 512?
Added one to non regcall. regcall doesn't specify how to handle 1024 bit 
vector. I'd take it as UB, so we don't need such a test.
https://www.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/compiler-reference/c-c-calling-conventions.html


================
Comment at: clang/test/CodeGen/aarch64-neon-tbl.c:45
 
-// CHECK-LABEL: define{{.*}} <8 x i8> @test_vqtbl2_s8([2 x <16 x i8>] 
%a.coerce, <8 x i8> noundef %b) #0 {
+// CHECK-LABEL: define{{.*}} <8 x i8> @test_vqtbl2_s8([2 x <16 x i8>] 
%a.coerce, <8 x i8> noundef %b) #1 {
 // CHECK:   [[__P0_I:%.*]] = alloca %struct.int8x16x2_t, align 16
----------------
LuoYuanke wrote:
> I'm curious why aarch64 test cases are affected by the patch.
As I explained above, `[2 x <16 x i8>]` should have the same value of 
`min-legal-vector-width` as `<16 x i8>`. The difference between `#0` and `#1` 
is the value of `min-legal-vector-width`.


================
Comment at: clang/test/CodeGen/regcall2.c:2
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -emit-llvm %s -o - -ffreestanding -target-feature +avx512vl 
-triple=x86_64-pc-win32     | FileCheck %s --check-prefix=Win
+// RUN: %clang_cc1 -emit-llvm %s -o - -ffreestanding -target-feature +avx512vl 
-triple=x86_64-pc-linux-gnu | FileCheck %s --check-prefix=Lin
----------------
LuoYuanke wrote:
> Add test case for target that has no avx512 feature?
I'd take it as UB for Clang and GCC using 512 bit vector without avx512 
feature. ICC always promotes to avx512 when it finds 512 bit vector. So we 
don't need such tests.


================
Comment at: clang/test/CodeGen/regcall2.c:9
+  __m512d r1[4];
+  __m512 r2[4];
+} __sVector;
----------------
LuoYuanke wrote:
> May add a test case to show what's the max register we can pass with regcall.
We have tests for it in `clang/test/CodeGen/regcall.c`. This patch doesn't 
affect the capability of regcall.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122104/new/

https://reviews.llvm.org/D122104

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to