Hi all,
I would like to contribute to this project by implementing 8-bit quantization
for 3D convolution. Currently, my implementation works fine without auto-tuning.
It is quite similar to what happens in 2D:
1. Reshape the input data and the kernel so that the convolution computation
can be vectorized.
2. Perform the convolution computation in a vectorized fashion via dp4a.
3. Reshape the output.
The 8-bit convolution outputs are relatively close to the standard convolution
ones.
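For intuition, the dp4a instruction computes a 4-lane int8 dot product accumulated into an int32. The sketch below is a NumPy emulation of that inner reduction, not the actual TVM schedule; the array names and sizes are made up for illustration:

```python
import numpy as np

def dp4a(acc, a4, b4):
    # Emulate CUDA's dp4a: 4-lane int8 dot product accumulated into int32.
    return acc + int(np.dot(a4.astype(np.int32), b4.astype(np.int32)))

# Toy reduction axis, standing in for the reshaped conv3d inner loop.
rng = np.random.default_rng(0)
data = rng.integers(-128, 128, size=16, dtype=np.int8)
kern = rng.integers(-128, 128, size=16, dtype=np.int8)

acc = 0
for c in range(0, data.size, 4):  # walk the reduction axis in chunks of 4
    acc = dp4a(acc, data[c:c + 4], kern[c:c + 4])

# Matches a full int32 dot product over the same data (no overflow in int32).
assert acc == int(np.dot(data.astype(np.int32), kern.astype(np.int32)))
```

This is why the reshape steps matter: the innermost channel axis has to be laid out in contiguous groups of four int8 values for dp4a to consume.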
The auto-tuning step runs smoothly (though it takes a while) and outputs a
log file with the optimal configuration for the 3D convolution
(conv3d_ncdhw_int8).
However, during the compilation phase, I sometimes encounter the following
error:
```
[12:11:45] /usr/tvm/src/tir/transforms/loop_partition.cc:548: Cannot prove: ((((((floordiv(((threadIdx.z*2) + 1), 4) + 1) - floordiv(threadIdx.z, 2)) - 1) - (29 - (blockIdx.z*4))) + 1) >= 0), when generating the post doubt loop
Traceback (most recent call last):
  File "tune_relay_cuda_int8.py", line 508, in <module>
    tune_and_evaluate(tuning_option)
  File "tune_relay_cuda_int8.py", line 409, in tune_and_evaluate
    graph, lib, params = relay.build_module.build(mod, target=target, params=params)
  File "/usr/tvm/python/tvm/relay/build_module.py", line 260, in build
    graph_json, mod, params = bld_mod.build(mod, target, target_host, params)
  File "/usr/tvm/python/tvm/relay/build_module.py", line 127, in build
    self._build(mod, target, target_host)
  File "tvm/_ffi/_cython/./packed_func.pxi", line 322, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 257, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./packed_func.pxi", line 246, in tvm._ffi._cy3.core.FuncCall3
  File "tvm/_ffi/_cython/./base.pxi", line 160, in tvm._ffi._cy3.core.CALL
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /usr/tvm/build/libtvm.so(tvm::build(tvm::Map<tvm::runtime::String, tvm::IRModule, void, void> const&, tvm::Target const&)+0x83c) [0x7fd6f772267c]
  [bt] (7) /usr/tvm/build/libtvm.so(tvm::build(tvm::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)+0x2c7) [0x7fd6f7721397]
  [bt] (6) /usr/tvm/build/libtvm.so(tvm::SplitDevHostFuncs(tvm::IRModule, tvm::Target const&, tvm::Target const&, tvm::transform::PassContext const&)+0x488) [0x7fd6f771fff8]
  [bt] (5) /usr/tvm/build/libtvm.so(tvm::transform::Pass::operator()(tvm::IRModule) const+0x6a) [0x7fd6f71d8e7a]
  [bt] (4) /usr/tvm/build/libtvm.so(tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const+0x40e) [0x7fd6f7241d1e]
  [bt] (3) /usr/tvm/build/libtvm.so(tvm::transform::ModulePassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const+0x1e2) [0x7fd6f723fe52]
  [bt] (2) /usr/tvm/build/libtvm.so(+0x8d347c) [0x7fd6f74d147c]
  [bt] (1) /usr/tvm/build/libtvm.so(tvm::tir::MakePackedAPI(tvm::tir::PrimFunc&&, int)+0x2d19) [0x7fd6f74ce7a9]
  [bt] (0) /usr/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x61) [0x7fd6f7138f91]
  File "/usr/tvm/src/tir/transforms/make_packed_api.cc", line 210
TVMError: Not all Vars are passed in api_args: 'threadIdx.z' is not bound to any variables
```
Depending on the optimization found by the auto-tuner, this error may or may
not occur. For instance, by modifying the log that was produced during
auto-tuning, I am able to make an invalid configuration actually work.
Invalid configuration
```
{"input": ["cuda -keys=cuda,gpu -max_num_threads=1024 -model=unknown -thread_warp_size=32", "conv3d_NCDHWc_int8.cuda", [["TENSOR", [1, 128, 18, 56, 56], "int8"], ["TENSOR", [128, 128, 3, 3, 3], "int8"], [1, 1, 1], [1, 1, 1, 1, 1, 1], [1, 1, 1], "NCDHW", "int32"], {}], "config": {"index": 77070610321, "code_hash": null, "entity": [["tile_f", "sp", [-1, 1, 8, 2]], ["tile_d", "sp", [-1, 1, 1, 2]], ["tile_y", "sp", [-1, 1, 7, 2]], ["tile_x", "sp", [-1, 2, 1, 1]], ["fuse_yx", "ot", 0], ["tile_rc", "sp", [-1, 1]], ["tile_rd", "sp", [-1, 1]], ["tile_ry", "sp", [-1, 1]], ["tile_rx", "sp", [-1, 1]], ["reorder_inner", "re", [1, 2, 0, 3]], ["auto_unroll_max_step", "ot", 1500]]}, "result": [[0.0027175069], 0, 11.701743602752686, 1603898087.1376908], "version": 0.2, "tvm_version": "0.8.dev0"}
```
Valid configuration
```
{"input": ["cuda -keys=cuda,gpu -max_num_threads=1024 -model=unknown -thread_warp_size=32", "conv3d_NCDHWc_int8.cuda", [["TENSOR", [1, 128, 18, 56, 56], "int8"], ["TENSOR", [128, 128, 3, 3, 3], "int8"], [1, 1, 1], [1, 1, 1, 1, 1, 1], [1, 1, 1], "NCDHW", "int32"], {}], "config": {"index": 77070610321, "code_hash": null, "entity": [["tile_f", "sp", [-1, 1, 8, 1]], ["tile_d", "sp", [-1, 1, 1, 2]], ["tile_y", "sp", [-1, 1, 7, 2]], ["tile_x", "sp", [-1, 2, 1, 1]], ["fuse_yx", "ot", 0], ["tile_rc", "sp", [-1, 1]], ["tile_rd", "sp", [-1, 1]], ["tile_ry", "sp", [-1, 1]], ["tile_rx", "sp", [-1, 1]], ["reorder_inner", "re", [1, 2, 0, 3]], ["auto_unroll_max_step", "ot", 1500]]}, "result": [[0.0027175069], 0, 11.701743602752686, 1603898087.1376908], "version": 0.2, "tvm_version": "0.8.dev0"}
```
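Diffing the two records shows that only one knob changed. A quick sanity check in Python, with the "entity" knob lists copied from the two log records above:

```python
# "entity" knob lists copied from the invalid and valid log records.
invalid = [["tile_f", "sp", [-1, 1, 8, 2]], ["tile_d", "sp", [-1, 1, 1, 2]],
           ["tile_y", "sp", [-1, 1, 7, 2]], ["tile_x", "sp", [-1, 2, 1, 1]],
           ["fuse_yx", "ot", 0], ["tile_rc", "sp", [-1, 1]],
           ["tile_rd", "sp", [-1, 1]], ["tile_ry", "sp", [-1, 1]],
           ["tile_rx", "sp", [-1, 1]], ["reorder_inner", "re", [1, 2, 0, 3]],
           ["auto_unroll_max_step", "ot", 1500]]
valid = [["tile_f", "sp", [-1, 1, 8, 1]]] + [e[:] for e in invalid[1:]]

# Report every knob whose value differs between the two records.
diff = [(a[0], a[2], b[2]) for a, b in zip(invalid, valid) if a != b]
print(diff)  # [('tile_f', [-1, 1, 8, 2], [-1, 1, 8, 1])]
```

So the last factor of `tile_f` (2 vs. 1) is the only difference between a configuration that fails in MakePackedAPI and one that compiles.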
I am not sure how to solve this problem. What would you advise?
---
[Visit Topic](https://discuss.tvm.apache.org/t/quantization-and-3d-convolution/8338/1) to respond.