Thanks for writing this up, Lily. I think standardizing how we handle external datasets is highly valuable.
To comment on some of the points Tianqi raised: I quite like that this approach is fundamentally free of any framework dependencies, since it allows users to wrap any dataset or dataloader they want. I agree with Tianqi that we should consider returning TVM `NDArray`s instead of NumPy arrays, as they are more tightly integrated with TVM. The point about zero-copy through DLPack is quite interesting and could be good follow-up work if we go with the `NDArray` standardization. I also like using `@property` for `batch_size` and `num_batches`, since it'll look a little cleaner.

In terms of naming, I think the proposed `DataLoader` describes the functionality better than `Dataset`, and I lean towards `tvm.utils.data` as the best namespace for this work.

That said, these are all pretty minor points; the work overall is great. Thanks Lily!

---

[Visit Topic](https://discuss.tvm.apache.org/t/dataloader-an-api-to-wrap-datasets-from-other-machine-learning-frameworks/9498/3) to respond.
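P.S. For what it's worth, here's a rough sketch of the `@property` style and the framework-agnostic wrapping I had in mind. All names here (`DataLoaderWrapper`, the `convert` hook) are just illustrative, not the RFC's actual API; in TVM the `convert` hook could be `tvm.nd.array` to produce `NDArray`s, or eventually a DLPack-based zero-copy conversion.

```python
class DataLoaderWrapper:
    """Illustrative sketch: wrap any iterable of batches behind a
    framework-agnostic interface (not the RFC's actual API)."""

    def __init__(self, batches, batch_size, convert=None):
        self._batches = list(batches)
        self._batch_size = batch_size
        # Pluggable conversion hook; in TVM this could be tvm.nd.array
        # so iteration yields NDArrays instead of NumPy arrays.
        self._convert = convert if convert is not None else (lambda x: x)

    @property
    def batch_size(self):
        return self._batch_size

    @property
    def num_batches(self):
        return len(self._batches)

    def __iter__(self):
        # Yield each batch, converted to the target array type.
        for batch in self._batches:
            yield self._convert(batch)


# Example usage with plain Python lists standing in for real batches:
loader = DataLoaderWrapper([[1, 2], [3, 4]], batch_size=2)
print(loader.batch_size, loader.num_batches)  # -> 2 2
print(list(loader))                           # -> [[1, 2], [3, 4]]
```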