================
@@ -95,14 +97,16 @@ gpu.func @gemm(%arg0: memref<1024x1024xbf16>, %arg1: memref<1024x1024xbf16>, %ar
       -> !xegpu.tensor_desc<16x16xbf16, #xegpu.layout<lane_layout = [1, 16], lane_data = [2, 1]>>
     %7 = xegpu.load_nd %5[%0, %arg3]
-      {layout_result_0 = #xegpu.layout<lane_layout = [1, 16], lane_data = [1, 1]>}
+      {layout = #xegpu.layout<lane_layout = [1, 16], lane_data = [1, 1]>}
       : !xegpu.tensor_desc<8x16xbf16, #xegpu.layout<lane_layout = [1, 16], lane_data = [1, 1]>> -> vector<8x16xbf16>
     %8 = xegpu.load_nd %6[%arg3, %1]
-      {layout_result_0 = #xegpu.layout<lane_layout = [1, 16], lane_data = [2, 1]>}
+      {layout = #xegpu.layout<lane_layout = [1, 16], lane_data = [2, 1]>}
       : !xegpu.tensor_desc<16x16xbf16, #xegpu.layout<lane_layout = [1, 16], lane_data = [2, 1]>> -> vector<16x16xbf16>
     %9 = xegpu.dpas %7, %8, %arg4
-      {layout_result_0 = #xegpu.layout<lane_layout = [1, 16], lane_data = [1, 1]>}
+      {layout_a = #xegpu.layout<lane_layout = [1, 16], lane_data = [1, 1]>,
+       layout_b = #xegpu.layout<lane_layout = [1, 16], lane_data = [2, 1]>,
+       layout_cd = #xegpu.layout<lane_layout = [1, 16], lane_data = [1, 1]>}
       : vector<8x16xbf16>, vector<16x16xbf16>, vector<8x16xf32> -> vector<8x16xf32>
----------------
Jianhui-Li wrote:
Here we face a choice about whether to allow duplicate layout info (anchor and local layout) on anchor ops.
The downside of these redundant local layouts is that they make the test cases more verbose and introduce a potential source of inconsistency between the two layouts (they should always be the same).
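To make the redundancy concrete, here is a reduced sketch of one load (value names and shape are hypothetical; the layouts mirror the hunk above) where the same layout would appear twice and has to be kept in sync:

    %v = xegpu.load_nd %desc[%i, %j]
           // local layout, attached as an attribute on the op
           {layout = #xegpu.layout<lane_layout = [1, 16], lane_data = [1, 1]>}
           // anchor layout, encoded in the tensor_desc type; must stay identical
           : !xegpu.tensor_desc<8x16xbf16, #xegpu.layout<lane_layout = [1, 16], lane_data = [1, 1]>>
           -> vector<8x16xbf16>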
https://github.com/llvm/llvm-project/pull/172125