corleyma commented on code in PR #753:
URL: https://github.com/apache/iceberg-python/pull/753#discussion_r1608626491
##########
pyiceberg/table/__init__.py:
##########
@@ -1774,8 +1774,19 @@ def to_duckdb(self, table_name: str, connection: Optional[DuckDBPyConnection] =
     def to_ray(self) -> ray.data.dataset.Dataset:
         import ray
+        from pyiceberg.io.pyarrow import ray_project_table
-        return ray.data.from_arrow(self.to_arrow())
+        tables = ray_project_table(

Review Comment:
   Might folks be interested in using this functionality even when they are not getting Ray Datasets? I think there are better ways to integrate with Ray Datasets (we've seen some MRs for this already), but this could be a useful way to enable concurrency for folks who want to fully utilize their CPUs with a local Ray runner.
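
A minimal sketch of the idea raised in the comment, assuming the per-task read can be expressed as a plain function: the concurrent fan-out is kept separate from the Ray Dataset conversion, so callers can use it on its own. The names `map_concurrently`, `read_fn`, and `use_ray` are hypothetical and not part of PyIceberg, and this does not reproduce `ray_project_table`, whose signature is not shown in the diff.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def map_concurrently(
    read_fn: Callable[[T], R],
    tasks: Iterable[T],
    use_ray: bool = False,
) -> List[R]:
    """Apply read_fn to every task and return the results in task order.

    Hypothetical helper: with use_ray=True the work is fanned out to Ray
    workers; otherwise a thread pool is used, so Ray stays optional.
    """
    task_list = list(tasks)
    if use_ray:
        import ray  # imported lazily; only needed on this code path

        # Starts a local Ray runner if none is active in this process.
        ray.init(ignore_reinit_error=True)
        remote_fn = ray.remote(read_fn)
        # ray.get on a list of object refs preserves the input order.
        return ray.get([remote_fn.remote(task) for task in task_list])
    with ThreadPoolExecutor() as pool:
        return list(pool.map(read_fn, task_list))
```

With a split like this, a `to_ray`-style method could wrap the results in a Ray Dataset, while other callers could combine them directly, for example into a single Arrow table.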