[Apache TVM Discuss] [Questions] How to do heterogeneous execution on cpu and gpu?
Hi~ Can this unit-test case help you? https://github.com/apache/tvm/blob/be03d62e5b0afd607964365bc73e94f72fdfaaef/tests/python/relay/test_vm.py#L1071

---

[Visit Topic](https://discuss.tvm.apache.org/t/how-to-do-heterogeneous-execution-on-cpu-and-gpu/11561/2) to respond.
[Apache TVM Discuss] [Questions] How to do heterogeneous execution on cpu and gpu?
If you are using the `relay.build()` -> `graph_executor.GraphModule` path, the key point, as far as I remember, is to pass a multi-target dict as the `target` argument of `build` and a device list into `GraphModule`, like:

```python
lib = relay.build(relay_mod, target={"cpu": "llvm", "gpu": "cuda"}, params=params)
m = graph_executor.GraphModule(lib["default"](tvm.cpu(), tvm.gpu()))
```

---

[Visit Topic](https://discuss.tvm.apache.org/t/how-to-do-heterogeneous-execution-on-cpu-and-gpu/11561/3) to respond.
[Apache TVM Discuss] [Questions] How to generate a region from for loop iter var and predicate?
```python
def estimate_region_lower_bound(region, var_dom, predicate):
    """Analyze the region with affine map, given the domain of variables and their predicate

    Parameters
    ----------
    region : List[Range]
        The region to be analyzed.
    var_dom : Dict[Var, Range]
        The ranges of the variables.
    predicate : PrimExpr
        The predicate for the affine map.

    Returns
    -------
    region_int_set : Optional[List[IntSet]]
        None if the detection fails, or a list of IntSets as the result of the analysis.
    """
```

Perhaps this interface can serve the purpose.

---

[Visit Topic](https://discuss.tvm.apache.org/t/how-to-generate-a-region-from-for-loop-iter-var-and-predicate/11838/2) to respond.
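For intuition only, here is a brute-force pure-Python sketch of what such an analysis computes. This is not TVM code; `touched_region` and its arguments are hypothetical stand-ins for the interface above: given variable domains and a predicate, it finds the bounding interval of the indices actually touched.

```python
from itertools import product

def touched_region(index_fns, var_dom, predicate):
    """Brute-force stand-in for the affine analysis above (illustration only).

    index_fns : list of functions mapping loop variables to one index each
    var_dom   : dict mapping variable name -> (lo, hi) half-open integer range
    predicate : function over the loop variables; points where it is False
                are excluded, mirroring the `predicate` argument above

    Returns the per-dimension bounding interval of all touched points,
    or None if no point satisfies the predicate.
    """
    names = list(var_dom)
    doms = [range(lo, hi) for lo, hi in var_dom.values()]
    points = []
    for vals in product(*doms):
        env = dict(zip(names, vals))
        if predicate(**env):
            points.append(tuple(f(**env) for f in index_fns))
    if not points:
        return None
    return [(min(p[d] for p in points), max(p[d] for p in points))
            for d in range(len(index_fns))]

# A loop `i` over [0, 8) with predicate `i < 6`, accessing A[i * 2]:
print(touched_region([lambda i: i * 2], {"i": (0, 8)}, lambda i: i < 6))
# → [(0, 10)]
```

The real analysis derives this symbolically from the affine map instead of enumerating points, but the resulting interval set plays the same role.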
[Apache TVM Discuss] [Questions] Why relay.floor_mod allows float64?
Hi, as context, other DL frameworks also allow that, e.g. https://www.tensorflow.org/api_docs/python/tf/math/floormod. For TVM, below is where float `floormod` is lowered to plain arithmetic as `a - floor(a/b) * b`: https://github.com/apache/tvm/blob/7396be5645fa59cb10ae8ee14b718dbf7737390b/src/tir/transforms/lower_intrin.cc#L184-L186

Since the case in the test script computes `floor_mod(e^c1, c2)`, I think it would be great to check which sub-step actually causes the difference (`exp`, `div`, or `floor`?). It is known that CUDA arithmetic does not have exactly the same precision as the C/C++ standard math libraries: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#mathematical-functions-appendix. In my environment, the CPU result matches what I get with raw Python:

```
>>> import math
>>> def f(a, b): return a - math.floor(a/b) * b
...
>>> f(math.exp(415.748715), 787.644532)
4.606887725612233e+164
```

---

[Visit Topic](https://discuss.tvm.apache.org/t/why-relay-floor-mod-allows-float64/12073/2) to respond.
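As a back-of-the-envelope check (my own observation, not from the thread): with inputs this large, one ulp of `a = exp(415.748715)` (~3.6e180) is itself around 4.6e164, the same order of magnitude as the result above. So the lowered expression `a - floor(a/b) * b` is dominated by rounding at that scale, and backend-to-backend differences in `exp`/`div`/`floor` of a few ulps would be unsurprising:

```python
import math

def floor_mod(a, b):
    # The same lowering TVM applies for float floormod: a - floor(a/b) * b
    return a - math.floor(a / b) * b

# For small inputs the result is exact.
assert floor_mod(7.5, 2.0) == 1.5

# For a ~ 3.6e180, the spacing between adjacent float64 values (one ulp)
# is already ~4.6e164, so slightly different exp/div/floor implementations
# (e.g. CUDA vs. libm) can shift the result by that magnitude.
a = math.exp(415.748715)
print(math.ulp(a))  # ~4.6e164, same order as the result quoted above
```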