https://discuss.tvm.apache.org/t/vta-autotuning-from-tutorial-fails-with-one-pynq-but-succeeds-with-two-pynqs/4265/3?u=hht
I found a workaround for autotuning with one PYNQ and located the problem.
In the VTA autotuning tutorial, there is a handle named `remote`. The `remote` does two things. First, it programs the FPGA:
```
if env.TARGET != "sim":
    # Get remote from fleet node
    remote = autotvm.measure.request_remote(
        env.TARGET, tracker_host, tracker_port, timeout=10000
    )
    # Reconfigure the JIT runtime and FPGA.
    vta.reconfig_runtime(remote)
    vta.program_fpga(remote, bitstream=None)
else:
    # In simulation mode, host the RPC server locally.
    remote = rpc.LocalSession()
```
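This session holds a board for the entire run. Here is a minimal pure-Python sketch of why that starves tuning when only one PYNQ is registered: the tracker is modeled as a capacity-1 semaphore, which is an illustration of the resource contention, not TVM's actual tracker code.

```python
import threading

# Hypothetical model: one PYNQ board behind the RPC tracker is a
# resource with capacity 1.
board = threading.Semaphore(1)

# The tutorial's long-lived `remote`: acquired once, held for the
# whole script.
persistent_session = board.acquire(timeout=0.2)
assert persistent_session  # the first request succeeds

# A tuning trial now asks the tracker for a device and times out,
# because the persistent session never releases the only board.
trial_got_board = board.acquire(timeout=0.2)
print("trial got the board:", trial_got_board)  # trial got the board: False

# Releasing the persistent session (the workaround) unblocks tuning.
board.release()
trial_got_board = board.acquire(timeout=0.2)
print("trial got the board:", trial_got_board)  # trial got the board: True
```

With two boards the semaphore would have capacity 2, so both the persistent session and the tuning trials can be served, which matches the observed behavior in the linked thread.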
Second, it runs the whole network and reports the result after autotuning:
```
# Compile kernels with history best records
with autotvm.tophub.context(target, extra_files=[log_file]):
    # Compile network
    print("Compile...")
    if target.device_name != "vta":
        with tvm.transform.PassContext(opt_level=3, disabled_pass={"AlterOpLayout"}):
            lib = relay.build(
                relay_prog, target=target, params=params, target_host=env.target_host
            )
    else:
        with vta.build_config(opt_level=3, disabled_pass={"AlterOpLayout"}):
            lib = relay.build(
                relay_prog, target=target, params=params, target_host=env.target_host
            )

    # Export library
    print("Upload...")
    temp = util.tempdir()
    lib.save(temp.relpath("graphlib.o"))
    remote.upload(temp.relpath("graphlib.o"))
    lib = remote.load_module("graphlib.o")

    # Generate the graph runtime
    ctx = remote.ext_dev(0) if device == "vta" else remote.cpu(0)
    m = graph_runtime.GraphModule(lib["default"](ctx))

    # Upload parameters to device
    image = tvm.nd.array((np.random.uniform(size=(1, 3, 224, 224))).astype("float32"))
    m.set_input("data", image)

    # Evaluate
    print("Evaluate inference time cost...")
    timer = m.module.time_evaluator("run", ctx, number=1, repeat=10)
    tcost = timer()
    prof_res = np.array(tcost.results) * 1000  # convert to millisecond
    print(
        "Mean inference time (std dev): %.2f ms (%.2f ms)"
        % (np.mean(prof_res), np.std(prof_res))
    )
```
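The statistics printed at the end can be reproduced in isolation. A stdlib-only sketch with hypothetical per-repeat timings (in the real run, `time_evaluator` supplies these values in seconds):

```python
import statistics

# Hypothetical per-repeat wall times in seconds, standing in for
# tcost.results from time_evaluator.
results_s = [0.0312, 0.0305, 0.0318]

prof_res = [t * 1000 for t in results_s]  # convert to milliseconds
mean_ms = statistics.mean(prof_res)
std_ms = statistics.pstdev(prof_res)  # np.std defaults to population std dev

print("Mean inference time (std dev): %.2f ms (%.2f ms)" % (mean_ms, std_ms))
# Mean inference time (std dev): 31.17 ms (0.53 ms)
```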
The `remote` occupies a device the whole time, but it plays no role in the autotuning itself. So my workaround is to comment out the code above so that the `remote` is never held, and autotuning then works with a single board:
```
Extract tasks...
Extracted 10 conv2d tasks:
(1, 14, 14, 256, 512, 1, 1, 0, 0, 2, 2)
(1, 28, 28, 128, 256, 1, 1, 0, 0, 2, 2)
(1, 56, 56, 64, 128, 1, 1, 0, 0, 2, 2)
(1, 56, 56, 64, 64, 3, 3, 1, 1, 1, 1)
(1, 28, 28, 128, 128, 3, 3, 1, 1, 1, 1)
(1, 56, 56, 64, 128, 3, 3, 1, 1, 2, 2)
(1, 14, 14, 256, 256, 3, 3, 1, 1, 1, 1)
(1, 28, 28, 128, 256, 3, 3, 1, 1, 2, 2)
(1, 7, 7, 512, 512, 3, 3, 1, 1, 1, 1)
(1, 14, 14, 256, 512, 3, 3, 1, 1, 2, 2)
Tuning...
[Task  1/10] Current/Best: 0.00/ 28.79 GFLOPS | Progress: (480/480) | 306.61 s Done.
[Task  2/10] Current/Best: 0.00/ 31.41 GFLOPS | Progress: (576/576) | 389.47 s Done.
[Task  3/10] Current/Best: 0.00/ 43.20 GFLOPS | Progress: (1000/1000) | 667.90 s Done.
[Task  4/10] Current/Best: 0.00/ 46.37 GFLOPS | Progress: (1000/1000) | 564.08 s Done.
[Task  5/10] Current/Best: 0.00/ 38.90 GFLOPS | Progress: (1000/1000) | 641.09 s Done.
[Task  6/10] Current/Best: 0.00/ 44.39 GFLOPS | Progress: (1000/1000) | 560.03 s Done.
[Task  7/10] Current/Best: 0.00/ 40.67 GFLOPS | Progress: (1000/1000) | 731.33 s Done.
[Task  8/10] Current/Best: 0.00/  9.58 GFLOPS | Progress: (1000/1000) | 1046.03 s Done.
[Task  9/10] Current/Best: 0.00/ 12.51 GFLOPS | Progress: (1000/1000) | 1276.48 s Done.
[Task 10/10] Current/Best: 0.31/ 11.95 GFLOPS | Progress: (480/480) | 619.91 s Done.
```
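For sanity-checking a run like the one above, the per-task wall times and best throughputs can be pulled out of the log with a short script. A sketch (the two sample lines are abridged copies of the excerpt, and the regexes assume this exact progress-line format):

```python
import re

# Two hypothetical log lines in the format printed by the tuner above.
log = """
[Task  1/10] Current/Best: 0.00/ 28.79 GFLOPS | Progress: (480/480) | 306.61 s Done.
[Task  2/10] Current/Best: 0.00/ 31.41 GFLOPS | Progress: (576/576) | 389.47 s Done.
"""

# Per-task wall time: the number before " s Done."
times = [float(m) for m in re.findall(r"\| ([\d.]+) s Done", log)]
# Best GFLOPS: the number after the slash in "Current/Best: x/ y GFLOPS"
best = [float(m) for m in re.findall(r"/\s*([\d.]+) GFLOPS", log)]

print("total tuning time: %.2f s" % sum(times))    # total tuning time: 696.08 s
print("best throughput: %.2f GFLOPS" % max(best))  # best throughput: 31.41 GFLOPS
```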
---
[Visit Topic](https://discuss.tvm.apache.org/t/vta-workaround-for-autotuning-with-one-pynq-z1-board/8091/1) to respond.