The recently merged [CUTLASS BYOC](https://github.com/apache/tvm/pull/9261)
relies on the C-codegen-based BYOC infra to JIT-generate and compile C++
template classes.
Currently it doesn't support constants embedded in an external function;
instead, it requires all weight and bias parameters to be passed in at runtime.
This became a problem when I applied CUTLASS BYOC to a real model: I need to
run constant folding to turn fp32 bias parameters into fp16, both for pattern
matching and for sending fp16 tensors to CUTLASS. To do that, I need to bind
parameters to the module with `bind_params_by_name`, which embeds constants
into the external functions as in the IR below. CUTLASS BYOC does not support
this right now:
```
def @tvmgen_default_cutlass_main_267(%cutlass_267_i0: Tensor[(1024, 1024), float16], %cutlass_267_i1: Tensor[(4096, 1024), float16], Inline=1, Compiler="cutlass", global_symbol="tvmgen_default_cutlass_main_267", Primitive=1) -> Tensor[(1024, 4096), float16] {
  %9 = fn (%FunctionVar_8_0: Tensor[(1024, 1024), float16], %FunctionVar_8_1: Tensor[(4096, 1024), float16], %FunctionVar_8_2: Tensor[(4096), float16], PartitionedFromPattern="nn.dense_add_multiply_cast_erf_cast_multiply_add_multiply_", Composite="cutlass.dense_bias_gelu_fp16") -> Tensor[(1024, 4096), float16] {
    %1 = nn.dense(%FunctionVar_8_0, %FunctionVar_8_1, units=None, out_dtype="float16") /* ty=Tensor[(1024, 4096), float16] */;
    %2 = add(%1, %FunctionVar_8_2) /* ty=Tensor[(1024, 4096), float16] */;
    %3 = multiply(%2, meta[relay.Constant][0] /* ty=float16 */) /* ty=Tensor[(1024, 4096), float16] */;
    %4 = cast(%3, dtype="float32") /* ty=Tensor[(1024, 4096), float32] */;
    %5 = erf(%4) /* ty=Tensor[(1024, 4096), float32] */;
    %6 = cast(%5, dtype="float16") /* ty=Tensor[(1024, 4096), float16] */;
    %7 = multiply(%6, meta[relay.Constant][1] /* ty=float16 */) /* ty=Tensor[(1024, 4096), float16] */;
    %8 = add(%7, meta[relay.Constant][2] /* ty=float16 */) /* ty=Tensor[(1024, 4096), float16] */;
    multiply(%8, %2) /* ty=Tensor[(1024, 4096), float16] */
  };
  // meta[relay.Constant][3] is the bias constant, not supported by CUTLASS BYOC for now
  %9(%cutlass_267_i0, %cutlass_267_i1, meta[relay.Constant][3] /* ty=Tensor[(4096), float16] */) /* ty=Tensor[(1024, 4096), float16] */
}
```
So I now need to deal with constants. I think embedding all constants into
the C source is infeasible for models like `BERT-large`, which I'm working with.
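
To put a rough number on the size concern, here is a back-of-the-envelope estimate. It assumes about 340M parameters (the commonly quoted size of BERT-large) and assumes each fp16 value would be written as a hex literal such as `0x3c00, ` in a C array initializer. That literal format is purely illustrative, not what any TVM codegen is known to emit.

```python
# Back-of-the-envelope estimate; every constant here is an assumption.
NUM_PARAMS = 340_000_000   # approximate BERT-large parameter count
BYTES_PER_FP16 = 2

# Assumed textual form of one fp16 value inside a C array initializer:
# a 6-character hex literal followed by ", ".
CHARS_PER_LITERAL = len("0x3c00, ")

binary_bytes = NUM_PARAMS * BYTES_PER_FP16      # size of the raw fp16 weights
source_bytes = NUM_PARAMS * CHARS_PER_LITERAL   # size of the same data as C text

print(f"raw fp16 weights : {binary_bytes / 1e9:.2f} GB")
print(f"as C source text : {source_bytes / 1e9:.2f} GB")
```

Under these assumptions the generated source is several times larger than the weights themselves, and all of it would have to pass through the C++ compiler.
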
The alternative I can think of is to somehow "unbind" the constants after
constant folding. But this requires modifying the signatures of the external
functions and passing additional arguments to them inside `main`, and I don't
see an easy way to achieve that.
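
To make the "unbind" idea concrete, here is a toy sketch in plain Python. The classes `Const`, `Var`, and `ExternFunc` and the function `unbind_constants` are hypothetical stand-ins invented for illustration; they are not TVM classes or an existing TVM pass. The sketch only shows the shape of the rewrite: each constant passed to the composite call becomes a fresh parameter of the external function, and the lifted constants are returned so the caller can pass them as extra arguments.

```python
from dataclasses import dataclass

# Toy stand-ins for IR nodes; these are NOT TVM classes.
@dataclass
class Const:
    name: str            # e.g. "meta[relay.Constant][3]"

@dataclass
class Var:
    name: str

@dataclass
class ExternFunc:
    name: str
    params: list         # Vars forming the function signature
    call_args: list      # arguments of the inner composite call (Var or Const)

def unbind_constants(func):
    """Replace each Const argument of the inner call with a fresh parameter.

    Returns the rewritten function and the lifted constants, in order.
    The caller (`main`) must then pass those constants as extra arguments.
    """
    new_params = list(func.params)
    new_args, lifted = [], []
    for arg in func.call_args:
        if isinstance(arg, Const):
            fresh = Var(f"{func.name}_i{len(new_params)}")
            new_params.append(fresh)
            new_args.append(fresh)
            lifted.append(arg)
        else:
            new_args.append(arg)
    return ExternFunc(func.name, new_params, new_args), lifted

# Mirror the example above: two runtime inputs plus an embedded bias constant.
i0, i1 = Var("cutlass_267_i0"), Var("cutlass_267_i1")
bias = Const("meta[relay.Constant][3]")
f = ExternFunc("cutlass_267", [i0, i1], [i0, i1, bias])

g, extra = unbind_constants(f)
print([p.name for p in g.params])   # signature now has a third parameter
print([c.name for c in extra])      # constants that main must pass in
```

The hard part in a real implementation is the second half: every call site of the external function inside `main` has to be updated to supply the lifted constants. The sketch also leaves the scalar constants inside the composite body untouched, since those are part of the matched pattern.
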
My questions:
* Is there a good way to deal with constants in C-source-codegen-based BYOC?
Has there been any improvement since last year's discussions, such as
https://discuss.tvm.apache.org/t/external-codegen-constant-tensors-in-c-codegen/5890
and https://github.com/apache/tvm/pull/5310? (also cc @lhutton1 @manupa-arm
@matt-arm)
* Should the CUTLASS codegen switch to the JSON runtime, which I believe has no
issues with constants? If so, how can we compile the generated C source with
JSON-based BYOC? cc @Laurawly @comaniac @zhiics