Commenting to agree that I like the approach, and I strongly believe this will be useful (e.g. for reducing the boilerplate involved in setting up datasets for TVM training, since common datasets already exist in PyTorch or TF). I also agree with Tianqi about the NDArray/DLPack interfacing, since we want to eliminate any unnecessary data copying, especially in the training workflow.
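For reference, the zero-copy path I have in mind is just the DLPack bridge that already exists in both projects; a minimal sketch (nothing here is from the PR, it only uses the current `torch.utils.dlpack.to_dlpack` and `tvm.nd.from_dlpack` helpers):

```python
import torch
import torch.utils.dlpack
import tvm

# A batch as a PyTorch DataLoader would typically produce it.
torch_batch = torch.randn(32, 3, 224, 224)

# Hand the same buffer to TVM through DLPack; no host-side copy is made.
tvm_batch = tvm.nd.from_dlpack(torch.utils.dlpack.to_dlpack(torch_batch))
assert tvm_batch.shape == (32, 3, 224, 224)
```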
Perhaps this is more specific to the PR, but I'm a bit wary of assuming a specific input and target/label shape (e.g. NCHW data and integer labels) for some of the loaders, since this seems overfit to vision (how would we support a BERT dataset, for example?). Is knowledge of the layout really required? I'm also not sure about `__next__` returning a list of ndarrays, since when batching inputs we want them in a single contiguous array of shape `(batch_size, ...)` (rough sketch at the end of this post). Hope this makes sense, and I'd be happy to formally review the PR once you've had time to incorporate the other feedback!
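To make the `__next__` point concrete, here is the rough shape I had in mind: a wrapper that stays oblivious to layout and label dtype and just passes through the already-batched `(batch_size, ...)` arrays produced by the underlying framework loader. The `TorchWrapper` name and its interface are made up for illustration, not taken from the PR:

```python
import torch
import torch.utils.dlpack
import tvm


class TorchWrapper:
    """Hypothetical wrapper around a torch.utils.data.DataLoader.

    It assumes nothing about layout (NCHW vs. NHWC) or label dtype; whatever
    shapes the underlying loader batches into are passed through unchanged,
    one contiguous (batch_size, ...) NDArray per field rather than a list of
    per-sample arrays.
    """

    def __init__(self, torch_loader):
        self._loader = torch_loader
        self._iter = iter(torch_loader)

    def __iter__(self):
        # Restart iteration over the underlying loader.
        self._iter = iter(self._loader)
        return self

    def __next__(self):
        data, label = next(self._iter)  # raises StopIteration when exhausted
        # Zero-copy handoff through DLPack, per the point above.
        return (
            tvm.nd.from_dlpack(torch.utils.dlpack.to_dlpack(data)),
            tvm.nd.from_dlpack(torch.utils.dlpack.to_dlpack(label)),
        )
```

Usage would then just be `for data, label in TorchWrapper(torch_loader): ...`, and a BERT dataset would work the same way as an image one since the wrapper never inspects the shapes.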