[TVM Discuss] [Questions] CUDA_ERROR_INVALID_PTX when trying to run TensorFlow DeeplabV3+ model

wwwwcu via TVM Discuss Mon, 29 Jun 2020 03:00:34 -0700

Hi,


I imported the DeeplabV3+ (Xception) model named 'xception65_coco_voc_trainval', 
downloaded from the TF model zoo 
(https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md).
It runs well on CPU but fails with an error on GPU. The script is:

```
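import numpy as np
import tensorflow as tf  # TF 1.x API (tf.gfile, tf.GraphDef)
from PIL import Image

import tvm
from tvm import relay
# `runtime` is assumed to be the graph runtime, as the traceback suggests
from tvm.contrib import graph_runtime as runtime
import tvm.relay.testing.tf as tf_testing

target = tvm.target.cuda()
ctx = tvm.gpu(0)
model_path = '/tensorflow/deeplabv3_pascal_train_aug/frozen_inference_graph.pb'
image_url = "/deeplab/test_dataset/image3.jpg"  # local file path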

with tf.gfile.GFile(model_path, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    graph = tf.import_graph_def(graph_def, name='')
    graph_def = tf_testing.ProcessGraphDefParam(graph_def)


image = Image.open(image_url)
image_resize = image.convert('RGB').resize((513, 513))
x = np.array(image_resize)
x = np.expand_dims(x, 0)

# `shape` expects a dict mapping input names to shapes, not a bare tuple
mod, params = relay.frontend.from_tensorflow(graph_def, layout='NHWC',
                                             shape={'ImageTensor': x.shape})
with relay.build_config(opt_level=3):
    json, lib, params = relay.build(mod,
                                     target=target,
                                     params=params)

module = runtime.create(json, lib, ctx)
input_name = "ImageTensor"  # avoid shadowing the built-in input()
module.set_input(key=input_name, value=x, **params)
module.run()
out_0 = module.get_output(0).asnumpy()
```

The error is shown below:
```
[17:36:18] /home/incubator-tvm/src/te/schedule/bound.cc:119: not in feed graph consumer = compute(placeholder_red_temp.repl, 0x11f290c0)
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 129, 129, 256), 'float32'), ('TENSOR', (1, 1, 256, 21), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 129, 129, 256), 'float32'), ('TENSOR', (1, 1, 256, 256), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 129, 129, 304), 'float32'), ('TENSOR', (1, 1, 304, 256), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 1280), 'float32'), ('TENSOR', (1, 1, 1280, 256), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 1, 1, 2048), 'float32'), ('TENSOR', (1, 1, 2048, 256), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 1536), 'float32'), ('TENSOR', (1, 1, 1536, 2048), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 1536), 'float32'), ('TENSOR', (1, 1, 1536, 1536), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 1024), 'float32'), ('TENSOR', (1, 1, 1024, 1536), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 1024), 'float32'), ('TENSOR', (1, 1, 1024, 1024), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 728), 'float32'), ('TENSOR', (1, 1, 728, 1024), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 728), 'float32'), ('TENSOR', (1, 1, 728, 728), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 256), 'float32'), ('TENSOR', (1, 1, 256, 728), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 256), 'float32'), ('TENSOR', (1, 1, 256, 256), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 129, 129, 128), 'float32'), ('TENSOR', (1, 1, 128, 256), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 129, 129, 128), 'float32'), ('TENSOR', (1, 1, 128, 128), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 257, 257, 128), 'float32'), ('TENSOR', (1, 1, 128, 128), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 257, 257, 64), 'float32'), ('TENSOR', (1, 1, 64, 128), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 257, 257, 32), 'float32'), ('TENSOR', (3, 3, 32, 64), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc_winograd_direct.cuda', ('TENSOR', (1, 257, 257, 32), 'float32'), ('TENSOR', (3, 3, 32, 64), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 515, 515, 3), 'float32'), ('TENSOR', (3, 3, 3, 32), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 257, 257, 64), 'float32'), ('TENSOR', (1, 1, 64, 128), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 129, 129, 128), 'float32'), ('TENSOR', (1, 1, 128, 256), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 2048), 'float32'), ('TENSOR', (1, 1, 2048, 256), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown, workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 129, 129, 256), 'float32'), ('TENSOR', (1, 1, 256, 48), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
running TVM deeplab on image ...
Traceback (most recent call last):
  File "/home/incubator-tvm/tutorials/test_tensorflow/deeplab/deeplabv3_plus_tvm.py", line 168, in <module>
    module.run()
  File "/home/incubator-tvm/python/tvm/contrib/graph_runtime.py", line 177, in run
    self._run()
  File "/home/incubator-tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 225, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (3) /home/incubator-tvm/build/libtvm.so(TVMFuncCall+0x65) [0x7f5094810c65]
  [bt] (2) /home/incubator-tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::detail::PackFuncVoidAddr_<4, tvm::runtime::CUDAWrappedFunc>(tvm::runtime::CUDAWrappedFunc, std::vector<tvm::runtime::detail::ArgConvertCode, std::allocator<tvm::runtime::detail::ArgConvertCode> > const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0xb6) [0x7f50948aac56]
  [bt] (1) /home/incubator-tvm/build/libtvm.so(tvm::runtime::CUDAWrappedFunc::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, void**) const+0x9df) [0x7f50948aaa5f]
  [bt] (0) /home/incubator-tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x82) [0x7f5093e640b2]
  File "/home/incubator-tvm/src/runtime/cuda/cuda_module.cc", line 105
  File "/home/incubator-tvm/src/runtime/library_module.cc", line 78
CUDAError: Check failed: ret == 0 (-1 vs. 0) : cuModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: CUDA_ERROR_INVALID_PTX

Process finished with exit code 1

```
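
To see at a glance which layers fell back to untuned schedules, the warnings above can be parsed into workload tuples. The following is only a sketch: `parse_workloads` and `conv2d_nhwc_summary` are hypothetical helpers, not part of TVM. It assumes the workload arguments are ordered as (data, kernel, strides, padding, dilation, out_dtype), that the kernel layout is HWIO, and that padding is given as (top, left, bottom, right). Two warnings from the log serve as sample input.

```python
import ast
import re

# Two warnings copied from the log. Line breaks inside a warning are
# tolerated, since mailing-list archives often hard-wrap long log lines.
SAMPLE_LOG = """\
WARNING:autotvm:Cannot find config for target=cuda -model=unknown,
workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 515, 515, 3), 'float32'),
('TENSOR', (3, 3, 3, 32), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32').
A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=cuda -model=unknown,
workload=('conv2d_nhwc.cuda', ('TENSOR', (1, 65, 65, 1536), 'float32'),
('TENSOR', (1, 1, 1536, 2048), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1),
'float32'). A fallback configuration is used, which may bring great performance
regression.
"""

# The workload tuple ends right before ". A fallback configuration ...".
WORKLOAD_RE = re.compile(r"workload=(\(.+?\))\.\s+A fallback")


def parse_workloads(log_text):
    """Return every workload tuple found in the AutoTVM fallback warnings."""
    flat = " ".join(log_text.split())  # undo hard-wrapping
    return [ast.literal_eval(w) for w in WORKLOAD_RE.findall(flat)]


def conv2d_nhwc_summary(workload):
    """Output size and FLOP count for one conv2d workload (assumed layout)."""
    name, data, kernel, strides, padding, dilation, _out_dtype = workload
    batch, in_h, in_w, _ = data[1]            # NHWC input shape
    k_h, k_w, in_c, out_c = kernel[1]         # HWIO kernel shape (assumption)
    pad_t, pad_l, pad_b, pad_r = padding      # (top, left, bottom, right)
    eff_kh = (k_h - 1) * dilation[0] + 1      # dilated kernel extent
    eff_kw = (k_w - 1) * dilation[1] + 1
    out_h = (in_h + pad_t + pad_b - eff_kh) // strides[0] + 1
    out_w = (in_w + pad_l + pad_r - eff_kw) // strides[1] + 1
    macs = batch * out_h * out_w * out_c * k_h * k_w * in_c
    return {"op": name, "out_hw": (out_h, out_w), "flops": 2 * macs}


if __name__ == "__main__":
    for wl in parse_workloads(SAMPLE_LOG):
        print(conv2d_nhwc_summary(wl))
```

For the first sample workload this gives a 257x257 output, which matches the input size of the following layers in the log.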

I also tried tuning the model with AutoTVM, but for some conv ops the reported GFLOPS is always 0.00:
```
[Task 17/26]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (132/1296) | 50.07 s
WARNING:autotvm:Too many errors happen in the tuning. Now is in debug mode
```
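
When many tasks are tuned, it can help to pick out the ones that never produce a valid measurement. The sketch below parses a progress line like the one above and flags a task whose best result is still 0.00 GFLOPS after a number of trials. `parse_progress` and `looks_stuck` are hypothetical helpers, not AutoTVM APIs, and the `min_trials` threshold is an arbitrary assumption.

```python
import re

# Progress line copied from the tuning output.
SAMPLE_LINE = ("[Task 17/26]  Current/Best:    0.00/   0.00 GFLOPS | "
               "Progress: (132/1296) | 50.07 s")

PROGRESS_RE = re.compile(
    r"\[Task\s+(\d+)/(\d+)\]\s+Current/Best:\s*([\d.]+)/\s*([\d.]+)\s+GFLOPS"
    r"\s*\|\s*Progress:\s*\((\d+)/(\d+)\)\s*\|\s*([\d.]+)\s*s"
)


def parse_progress(line):
    """Parse one AutoTVM progress line into a dict, or return None."""
    flat = " ".join(line.split())  # tolerate hard-wrapped input
    match = PROGRESS_RE.search(flat)
    if match is None:
        return None
    task, num_tasks, current, best, done, total, seconds = match.groups()
    return {
        "task": int(task),
        "num_tasks": int(num_tasks),
        "current_gflops": float(current),
        "best_gflops": float(best),
        "trials_done": int(done),
        "trials_total": int(total),
        "elapsed_s": float(seconds),
    }


def looks_stuck(progress, min_trials=64):
    """True if no trial has succeeded after at least `min_trials` attempts."""
    return (progress["trials_done"] >= min_trials
            and progress["best_gflops"] == 0.0)


if __name__ == "__main__":
    info = parse_progress(SAMPLE_LINE)
    print(info, "stuck:", looks_stuck(info))
```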





