[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
https://github.com/hanhanW edited https://github.com/llvm/llvm-project/pull/149624
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
@@ -150,9 +150,13 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
     `padding_value` specifies a padding value at the boundary on non-perfectly
     divisible dimensions. Padding is optional:
-    - If absent, it is UB if the tile does not perfectly divide the dimension.
+    - If absent, it assumes the tile perfectly divides the dimension.

hanhanW wrote: I see what you meant. Sorry, somehow I thought you were talking about the other case; my brain must have been in a messy state when I replied to the comments. Yes, I found an invalid test here: https://github.com/llvm/llvm-project/blob/45d99c26c3513945a454e90b69d48257886f8284/mlir/test/Dialect/Linalg/invalid.mlir#L1692-L1696

https://github.com/llvm/llvm-project/pull/149624
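For reference, a minimal sketch of the no-padding case the revised wording describes (shapes chosen for illustration, not taken from the patch): `padding_value` may be omitted because the inner tile evenly divides the packed dimension.

```mlir
// 16 is divisible by the inner tile size 8, so no padding_value is needed.
%0 = linalg.pack %src inner_dims_pos = [0] inner_tiles = [8]
    into %dest : tensor<16xf32> -> tensor<2x8xf32>
```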
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
hanhanW wrote:

> I think you should drop @2548809 from this PR, so we could handle #150127 separately :) Or target that branch from this one if you're planning to get that one in first. Otherwise LGTM.

I need to land the other one first; otherwise, the CI is not happy. I recreated the other PR because the branch needs to live upstream and it has to be associated with a PR: https://github.com/llvm/llvm-project/pull/150272

> It is possible to create branches in llvm/llvm-project/ that start with users/<username>/, however this is intended to be able to support "stacked" pull-requests. Do not create any branches in the llvm/llvm-project repository otherwise; please use a fork (see above). User branches that aren't associated with a pull-request will be deleted.

https://llvm.org/docs/GitHub.html#branches

https://github.com/llvm/llvm-project/pull/149624
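For anyone following the stacked-PR workflow mentioned above, a hypothetical sketch of pushing such a user branch (the branch name is illustrative, not from this thread):

```
# Push the local commit stack as a user branch in llvm/llvm-project so a
# stacked pull request can target it (per the LLVM GitHub guide above).
git push https://github.com/llvm/llvm-project.git HEAD:users/hanhanW/linalg-pack-padding
```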
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
https://github.com/hanhanW edited https://github.com/llvm/llvm-project/pull/149624
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
https://github.com/hanhanW closed https://github.com/llvm/llvm-project/pull/149624
[llvm-branch-commits] [mlir] [mlir][linalg] Enable scalable vectorization of linalg.unpack (PR #149293)
@@ -1876,22 +1872,35 @@ static VectorType getCollapsedVecType(VectorType type,
   return VectorType::get(newShape, type.getElementType(), newScalableFlags);
 }

-/// Vectorize a `linalg::UnPackOp` to these 4 Ops:
-///   Vector::TransferReadOp - Reads a vector from the source tensor
-///   vector::TransposeOp - Transpose the Source tensor
-///   ShapeCastOp - Reshape the data based on the target.
-///   vector::TransferWriteOp - Write the result vector back to the destination
-///   tensor.
-/// If the vector sizes are not provided:
-/// Vectorize `linalg.unpack %src into %dest` as:
-///   // Reads a vector from the source tensor
-///   %read = vector.transfer_read %src
-///   // Transpose %read as specified in `outer_dims_perm` attribute
-///   %tr = vector.transpose %read
-///   // Reshape the data based on the target
-///   %sc = vector.shape_cast %tr
-///   // Write the result vector to the destination tensor.
-///   vector.transfer_write %sc into %dest
+/// Vectorize `linalg.unpack` into:
+///   * xfer_read -> vector.transpose -> vector.shape_cast -> xfer_write
+///
+/// The input-vector-sizes specify both the read and the write vector
+/// sizes and are passed as one array covering both operations, i.e.:
+///
+///   input-vector-sizes = [1, 1, 8, [8], 8, [8]]
+///                        \____________/ \_____/
+///                          read-sizes   write-sizes
+///
+/// (for brevity, in the diagram,
+///   * input-vector-sizes = `inputVectorSizes` + `inputScalableDims`
+/// )
+///
+/// If the vector sizes are not provided:
+///   * the vector sizes are determined by the operands,
+///   * the inBounds attribute is used instead of masking.
+///
+/// EXAMPLE (no vector sizes):
+/// ```
+///   %unpack = linalg.unpack %src
+///     inner_dims_pos = [0, 1]
+///     inner_tiles = [8, 8]
+///     into %dest : tensor<1x1x8x8xf32> -> tensor<8x8xf32>
+/// ```
+/// is vectorized as:
+/// ```
+///   vector.transfer_write %sc into %dest : vector<8x8xf32>, tensor<8x8xf32>
+/// ```

hanhanW wrote: You deleted a few ops in https://github.com/llvm/llvm-project/pull/149293/commits/b073854c4e681f2a8c89528ae4108f8bf7c6193f. I think we should add the xfer_read, transpose, and shape_cast back.

https://github.com/llvm/llvm-project/pull/149293
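For context, a sketch of the full lowered sequence the comment asks to restore, based on the `tensor<1x1x8x8xf32> -> tensor<8x8xf32>` example in the hunk (the `%c0`/`%pad` operands and the transpose permutation are assumed from unpack semantics, not quoted from the patch):

```mlir
// Read the packed source as a vector.
%read = vector.transfer_read %src[%c0, %c0, %c0, %c0], %pad
    : tensor<1x1x8x8xf32>, vector<1x1x8x8xf32>
// Interleave the outer dims with their inner tiles.
%tr = vector.transpose %read, [0, 2, 1, 3]
    : vector<1x1x8x8xf32> to vector<1x8x1x8xf32>
// Collapse to the unpacked (destination) shape.
%sc = vector.shape_cast %tr : vector<1x8x1x8xf32> to vector<8x8xf32>
// Write the result into the destination tensor.
%w = vector.transfer_write %sc, %dest[%c0, %c0]
    : vector<8x8xf32>, tensor<8x8xf32>
```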
[llvm-branch-commits] [mlir] [mlir][linalg] Enable scalable vectorization of linalg.unpack (PR #149293)
@@ -1871,115 +1872,98 @@ static VectorType getCollapsedVecType(VectorType type,
   return VectorType::get(newShape, type.getElementType(), newScalableFlags);
 }

-/// Vectorize a `linalg::UnPackOp` to these 4 Ops:
-///   Vector::TransferReadOp - Reads a vector from the source tensor
-///   vector::TransposeOp - Transpose the Source tensor
-///   ShapeCastOp - Reshape the data based on the target.
-///   vector::TransferWriteOp - Write the result vector back to the destination
-///   tensor.
-/// If the vector sizes are not provided:
-///   * the vector sizes are determined by the input operand and attributes,
-///   * update the inBounds attribute instead of masking.
+/// Vectorize `linalg.unpack` as:
+///   * xfer_read -> vector.transpose -> vector.shape_cast -> xfer_write
+///
+/// The input-vector-sizes specify the read vector sizes (i.e. the vector sizes
+/// for the xfer_read operation). This is sufficient to infer the other vector
+/// sizes required here.
+///
+/// If the vector sizes are not provided:
+///   * the vector sizes are determined by the operands,
+///   * the inBounds attribute is used instead of masking.
+///
+/// EXAMPLE (no vector sizes):
+/// ```
+///   %unpack = linalg.unpack %src
+///     inner_dims_pos = [0, 1]
+///     inner_tiles = [8, 8]
+///     into %dest : tensor<1x1x8x8xf32> -> tensor<8x8xf32>
+/// ```
+/// is vectorized as:
+/// ```
+///   vector.transfer_write %sc into %dest : vector<8x8xf32>, tensor<8x8xf32>
+/// ```
 static LogicalResult
 vectorizeAsTensorUnpackOp(RewriterBase &rewriter, linalg::UnPackOp unpackOp,
                           ArrayRef<int64_t> inputVectorSizes,
+                          ArrayRef<bool> inputScalableVecDims,
                           SmallVectorImpl<Value> &newResults) {
+  if (!inputVectorSizes.empty()) {
+    assert(inputVectorSizes.size() == unpackOp.getSourceRank() &&
+           "Invalid number of input vector sizes!");
+    assert(inputVectorSizes.size() == inputScalableVecDims.size() &&
+           "Incompatible number of vector sizes and vector scalable flags!");
+  }
   // TODO: Introduce a parent class that will handle the insertion point update.
   OpBuilder::InsertionGuard g(rewriter);
   rewriter.setInsertionPoint(unpackOp);

   RankedTensorType unpackTensorType = unpackOp.getSourceType();

-  ArrayRef<int64_t> innerDimPos = unpackOp.getInnerDimsPos();
-  ArrayRef<int64_t> innerTiles = unpackOp.getStaticInnerTiles();
   ArrayRef<int64_t> sourceShape = unpackTensorType.getShape();
+  ArrayRef<int64_t> destShape = unpackOp.getDestType().getShape();
   bool useInBoundsInsteadOfMasking = false;
-  ArrayRef<int64_t> outerDimsPerm = unpackOp.getOuterDimsPerm();
-
-  auto destSize = unpackOp.getDestRank();
-
-  if (!inputVectorSizes.empty())
-    assert(inputVectorSizes.size() == destSize &&
-           "Incorrect number of input vector sizes");
-
-  // vectorSizes is the shape of the vector that will be used to do final
-  // write on the destination tensor. It is set like this: Let's say the
-  // source tensor is rank 'M' and the dest tensor rank 'N', where N <= M.
-  // Thus:
-  // 1. vectorSizes = sourceShape.take_front(N)
-  // 2. if outer_dims_perms is present: do that permutation on vectorSizes.
-  // 3. multiply all the locations in vectorSize pointed by innerDimPos by the
-  //    innerTiles attribute value.
-  SmallVector<int64_t> vectorSizes(inputVectorSizes);
-  if (vectorSizes.empty()) {
-    llvm::append_range(vectorSizes, sourceShape.take_front(destSize));
-    if (!outerDimsPerm.empty())
-      applyPermutationToVector(vectorSizes, outerDimsPerm);
-    for (auto [i, pos] : llvm::enumerate(innerDimPos))
-      vectorSizes[pos] *= innerTiles[i];
-    useInBoundsInsteadOfMasking = true;
-  }
+  Location loc = unpackOp->getLoc();

-  // readVectorSizes is the size of tensor used to read and apply mask. It is
-  // set like this: Let's say the vectorSize (VS) array is size 'N' and
-  // the sourceShape(SS) is 'M' where M >= N and InnerTileSizes (IT) of
-  // size M-N
-  // Thus:
-  // - initially: readVectorSizes = vectorInputSizes
-  // - Divide all the readMaskShape locations pointed by innerDimPos
-  //   by the innerTileSize attribute value.
-  // - if outer_dims_perms is present: do that permutation on readVectorSizes.
-  // - Append the remaining shape from SS
-  // E.g. let's say unpackTensorType.getShape() = <8x8x32x16>
-  // inner Dim Pos = [0, 1] and Inner Tiles = [32, 16], vector_sizes are [512,
-  // 128] and outer_dims_perm is [1, 0] then read shape is:
-  //   ReadVectorSizes(initial): [512, 128]
-  //   Final Value(after innerDim Adjustment): [512/32, 128/16]
-  //                                         = [16, 8]
-  //   After applying outer_dims_perm: [8, 16]
-  //   After appending the rest of the sourceShape: [8, 16, 32, 16]
-
-  SmallVector<int64_t> readVectorSizes(vectorSizes.begin(), vectorSizes.end());
-
-  for (auto [index, size] : enumerate(innerTiles)) {
-    readVectorSizes[innerDimPos[index]] =
-        llvm::divideCeil(readVectorSizes[i
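The deleted read-size computation, sketched in C++ from the comment above (an illustrative reconstruction of the described logic, not the verbatim implementation):

```c++
// Start from the write-side vector sizes, shrink the tiled positions down to
// outer-dim granularity, permute, then append the trailing inner-tile dims
// from the source shape: [512, 128] -> [16, 8] -> [8, 16] -> [8, 16, 32, 16].
SmallVector<int64_t> readVectorSizes(vectorSizes.begin(), vectorSizes.end());
for (auto [index, size] : llvm::enumerate(innerTiles))
  readVectorSizes[innerDimPos[index]] =
      llvm::divideCeil(readVectorSizes[innerDimPos[index]], size);
if (!outerDimsPerm.empty())
  applyPermutationToVector(readVectorSizes, outerDimsPerm);
readVectorSizes.append(sourceShape.begin() + vectorSizes.size(),
                       sourceShape.end());
```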
[llvm-branch-commits] [mlir] [mlir][linalg] Enable scalable vectorization of linalg.unpack (PR #149293)
https://github.com/hanhanW edited https://github.com/llvm/llvm-project/pull/149293
[llvm-branch-commits] [mlir] [mlir][linalg] Enable scalable vectorization of linalg.unpack (PR #149293)
https://github.com/hanhanW approved this pull request. LGTM, thanks! Just 3 nits. https://github.com/llvm/llvm-project/pull/149293
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
@@ -150,9 +150,15 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
     `padding_value` specifies a padding value at the boundary on non-perfectly
     divisible dimensions. Padding is optional:
-    - If absent, it is UB if the tile does not perfectly divide the dimension.
+    - If absent, it is assumed that for all inner tiles,
+      `shape(source)[inner_dims_pos[i]] % inner_tiles[i] == 0`, i.e. all inner
+      tiles divide perfectly the corresponding outer dimension in the result
+      tensor.
     - If present, it will pad along high dimensions (high-padding) to make the
-      tile complete.
+      tile complete. Note that it is not allowed to have artificial padding that
+      is not strictly required by linalg.pack (i.e., padding past what is needed
+      to complete the last tile along each packed dimension). It is UB if extra
+      padding is requested.

hanhanW wrote: Enforcing that is more of a dynamic check than a static one, i.e., in most cases it can only be asserted at runtime, IMO. E.g., you don't do any out-of-bounds checks for tensor.extract_slice with dynamic sizes/dims, but you do out-of-bounds checks when the op has static sizes and offsets.

https://github.com/llvm/llvm-project/pull/149624
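To illustrate the static-vs-dynamic distinction (shapes assumed for the example, not quoted from the patch), the first op below can be rejected by the verifier, while the second cannot and is UB if the destination carries extra padding:

```mlir
// Statically detectable: CeilDiv(9, 8) = 2, so tensor<3x8xf32> requests an
// artificial extra tile.
%0 = linalg.pack %a padding_value(%cst : f32) inner_dims_pos = [0]
    inner_tiles = [8] into %d0 : tensor<9xf32> -> tensor<3x8xf32>

// Not statically detectable: whether dim(%d1, 0) == CeilDiv(dim(%b, 0), 8)
// is only known at runtime.
%1 = linalg.pack %b padding_value(%cst : f32) inner_dims_pos = [0]
    inner_tiles = [8] into %d1 : tensor<?xf32> -> tensor<?x8xf32>
```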
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
https://github.com/hanhanW updated https://github.com/llvm/llvm-project/pull/149624
From 6de929abea2ad4fac10605e333050b64eedbb904 Mon Sep 17 00:00:00 2001
From: hanhanW
Date: Tue, 22 Jul 2025 15:15:08 -0700
Subject: [PATCH 1/4] [mlir][linalg] Restrict linalg.pack to not have extra padding sizes.
Signed-off-by: hanhanW
---
 .../Dialect/Linalg/IR/LinalgRelayoutOps.td    | 14 +++-
 mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp      | 28 ++-
 .../Transforms/PackAndUnpackPatterns.cpp      |  1 +
 mlir/test/Dialect/Linalg/canonicalize.mlir    | 25 +++---
 .../Linalg/data-layout-propagation.mlir       | 18 -
 mlir/test/Dialect/Linalg/invalid.mlir         | 17 ++--
 .../Dialect/Linalg/transform-lower-pack.mlir  | 16 ++--
 .../tile-and-fuse-consumer.mlir               | 81 ---
 8 files changed, 50 insertions(+), 150 deletions(-)
diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgRelayoutOps.td b/mlir/include/mlir/Dialect/Linalg/IR/LinalgRelayoutOps.td
index fa572024ff72b..f5dd7ae2c84f3 100644
--- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgRelayoutOps.td
+++ b/mlir/include/mlir/Dialect/Linalg/IR/LinalgRelayoutOps.td
@@ -150,9 +150,10 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
     `padding_value` specifies a padding value at the boundary on non-perfectly
     divisible dimensions. Padding is optional:
-    - If absent, it is UB if the tile does not perfectly divide the dimension.
+    - If absent, it assumes the tile perfectly divides the dimension.
     - If present, it will pad along high dimensions (high-padding) to make the
-      tile complete.
+      tile complete. Note that it is not allowed to have artificial padding that
+      is not strictly required by linalg.pack.

     Example:
     ```mlir
@@ -167,6 +168,15 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
     //
     // Note: Only tiled dimensions can be padded.
     ```
+
+    Invalid example that has artificial padding:
+    ```mlir
+    %0 = linalg.pack %src padding_value(%cst : f32) inner_dims_pos = [0]
+        inner_tiles = [8] into %dest
+        : tensor<9xf32> -> tensor<3x8xf32>
+    //                             \
+    //                    expect tensor<2x8xf32> because CeilDiv(9, 8) = 2
+    ```
   }];

   let arguments = (ins AnyRankedTensor:$source,
                        AnyRankedTensor:$dest,
diff --git a/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp b/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
index 046a73c90f110..2d473daa1e1f4 100644
--- a/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
+++ b/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
@@ -32,6 +32,7 @@
 #include "mlir/IR/OpImplementation.h"
 #include "mlir/IR/OperationSupport.h"
 #include "mlir/IR/PatternMatch.h"
+#include "mlir/IR/TypeUtilities.h"
 #include "mlir/Interfaces/InferTypeOpInterface.h"
 #include "mlir/Interfaces/SideEffectInterfaces.h"

@@ -4622,22 +4623,6 @@ static bool isInvalidPackingPosSpecification(ArrayRef<int64_t> dimsPos,
   });
 }

-/// Returns true if the dimension of `sourceShape` is smaller than the dimension
-/// of the `limitShape`.
-static bool areAllInBound(ArrayRef<int64_t> sourceShape,
-                          ArrayRef<int64_t> limitShape) {
-  assert(
-      sourceShape.size() == limitShape.size() &&
-      "expected source shape rank, and limit of the shape to have same rank");
-  return llvm::all_of(
-      llvm::zip(sourceShape, limitShape), [](std::tuple<int64_t, int64_t> it) {
-        int64_t sourceExtent = std::get<0>(it);
-        int64_t limit = std::get<1>(it);
-        return ShapedType::isDynamic(sourceExtent) ||
-               ShapedType::isDynamic(limit) || sourceExtent <= limit;
-      });
-}
-
 template <typename OpTy>
 static LogicalResult commonVerifierPackAndUnPackOp(OpTy packOrUnPack) {
   static_assert(llvm::is_one_of<OpTy, PackOp, UnPackOp>::value,
@@ -4696,11 +4681,6 @@ static LogicalResult commonVerifierPackAndUnPackOp(OpTy packOrUnPack) {
   // represents full tiles.
   RankedTensorType expectedPackedType = PackOp::inferPackedType(
       unpackedType, packOrUnPack.getStaticTiles(), innerDimsPos, outerDimPerm);
-  if (!areAllInBound(expectedPackedType.getShape(), packedType.getShape())) {
-    return op->emitError("the shape of output is not large enough to hold the "
-                         "packed data. Expected at least ")
-           << expectedPackedType << ", got " << packedType;
-  }
   if (!llvm::all_of(
           llvm::zip(packedType.getShape().take_back(mixedTiles.size()),
                     mixedTiles),
@@ -4717,6 +4697,12 @@ static LogicalResult commonVerifierPackAndUnPackOp(OpTy packOrUnPack) {
     return op->emitError("mismatch in inner tile sizes specified and shaped of "
                          "tiled dimension in the packed type");
   }
+  if (failed(verifyCompatibleShape(expectedPackedType.getShape(),
+                                   packedType.getShape()))) {
+    return op->emitError("the shape of unpacked domain value is not large "
+                         "enough to hold the packed data. Expected at least ")
+           << expectedPackedTy
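For context on the new check: `verifyCompatibleShape` treats a dynamic dimension as compatible with any size, which is what keeps this a static-only verification. A rough sketch of that semantic (illustrative, not the exact MLIR implementation):

```c++
// Shapes are compatible iff they have equal rank and every dimension pair is
// equal or has at least one dynamic side.
static bool shapesCompatible(ArrayRef<int64_t> a, ArrayRef<int64_t> b) {
  if (a.size() != b.size())
    return false;
  return llvm::all_of(llvm::zip_equal(a, b), [](auto dims) {
    auto [lhs, rhs] = dims;
    return ShapedType::isDynamic(lhs) || ShapedType::isDynamic(rhs) ||
           lhs == rhs;
  });
}
```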
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
hanhanW wrote: Yes, updated. Thanks for pointing it out! https://github.com/llvm/llvm-project/pull/149624
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
@@ -4717,6 +4697,12 @@ static LogicalResult commonVerifierPackAndUnPackOp(OpTy packOrUnPack) {
     return op->emitError("mismatch in inner tile sizes specified and shaped of "
                          "tiled dimension in the packed type");
   }
+  if (failed(verifyCompatibleShape(expectedPackedType.getShape(),
+                                   packedType.getShape()))) {
+    return op->emitError("expected ")
+           << expectedPackedType << " for the unpacked domain value, got "
hanhanW wrote:
good catch, thanks!
https://github.com/llvm/llvm-project/pull/149624
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
@@ -1824,27 +1825,47 @@ func.func @unpack_invalid_outer_dims_perm(%source: tensor<128x256xf32>, %dest: t

 // -----

+func.func @pack_with_artificial_padding(%input: tensor<9xf32>, %output: tensor<3x8xf32>) -> tensor<3x8xf32> {
+  %cst = arith.constant 0.0 : f32
+  // expected-error@+1 {{expected 'tensor<2x8xf32>' for the unpacked domain value, got 'tensor<3x8xf32>'}}
+  %0 = linalg.pack %input padding_value(%cst : f32) inner_dims_pos = [0]
+      inner_tiles = [8] into %output
+      : tensor<9xf32> -> tensor<3x8xf32>
+  return %0 : tensor<3x8xf32>
+}
+
+// -----
+
 // The outer dims in the output tensor are incorrectly/unexpectedly transposed.
 // This could be fixed by adding `outer_dims_perm = [1, 0]` (the default value assumes no transpose).
 func.func @pack_invalid_result_shape(%input: tensor<256x128xf32>, %output: tensor<4x16x32x16xf32>) -> tensor<4x16x32x16xf32> {
-  // expected-error@+1 {{the shape of output is not large enough to hold the packed data. Expected at least 'tensor<16x4x32x16xf32>', got 'tensor<4x16x32x16xf32>'}}
+  // expected-error@+1 {{expected 'tensor<16x4x32x16xf32>' for the unpacked domain value, got 'tensor<4x16x32x16xf32>'}}
   %0 = linalg.pack %input inner_dims_pos = [1, 0] inner_tiles = [32, 16] into %output : tensor<256x128xf32> -> tensor<4x16x32x16xf32>
   return %0 : tensor<4x16x32x16xf32>
 }

 // -----

-func.func @pack_invalid(%input: tensor<256x128xf32>, %output: tensor<8x8x32x16xf32>) -> tensor<8x8x32x16xf32> {
-  // expected-error@+1 {{the shape of output is not large enough to hold the packed data. Expected at least 'tensor<8x8x16x32xf32>', got 'tensor<8x8x32x16xf32>'}}
-  %0 = linalg.pack %input inner_dims_pos = [1, 0] inner_tiles = [16, 32] into %output : tensor<256x128xf32> -> tensor<8x8x32x16xf32>
-  return %0 : tensor<8x8x32x16xf32>
+func.func @pack_invalid(%input: tensor<256x128xf32>, %output: tensor<8x7x16x32xf32>) -> tensor<8x7x16x32xf32> {
+  // expected-error@+1 {{expected 'tensor<8x8x16x32xf32>' for the unpacked domain value, got 'tensor<8x7x16x32xf32>'}}
+  %0 = linalg.pack %input inner_dims_pos = [1, 0] inner_tiles = [16, 32] into %output : tensor<256x128xf32> -> tensor<8x7x16x32xf32>
+  return %0 : tensor<8x7x16x32xf32>
+}
+
+// -----
+
+func.func @unpack_with_slicing_tiles(%input: tensor<3x8xf32>, %output: tensor<9xf32>) -> tensor<9xf32> {
hanhanW wrote:
SG, let's use `artificial` for consistency.
https://github.com/llvm/llvm-project/pull/149624
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
@@ -150,9 +150,15 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
     `padding_value` specifies a padding value at the boundary on non-perfectly
     divisible dimensions. Padding is optional:
-    - If absent, it is UB if the tile does not perfectly divide the dimension.
+    - If absent, it is assumed that for all inner tiles,
+      `shape(source)[inner_dims_pos[i]] % inner_tiles[i] == 0`, i.e. all inner
+      tiles divide perfectly the corresponding outer dimension in the result
+      tensor.
     - If present, it will pad along high dimensions (high-padding) to make the
-      tile complete.
+      tile complete. Note that it is not allowed to have artificial padding that
+      is not strictly required by linalg.pack (i.e., padding past what is needed
+      to complete the last tile along each packed dimension). It is UB if extra
+      padding is requested.

hanhanW wrote: Restored the UB wording for the previous point, because it also applies to dynamic shapes, which I had forgotten. I also added one more sentence about why it is UB. Does it look better?

https://github.com/llvm/llvm-project/pull/149624
