GitHub user avamingli added a comment to the discussion: Extend the gpfdist tool to support SFTP/HDFS protocols for high-performance multi-source data ingestion
With `transform`, you can apply arbitrary processing to the data before it is inserted into the database. For example, the transform can act as an independent tool that integrates and parses data from various sources.

Have you also evaluated [hdfs_fdw](https://github.com/EnterpriseDB/hdfs_fdw) as a potential solution for Cloudberry? Much of its code is already compatible with PostgreSQL and works well, so adapting this FDW for Cloudberry might require significantly less effort than integrating multiple client-side protocols into gpfdist. This approach could also prove more cost-effective in the long run. If the FDW were made MPP-aware, it could use multiple segments for both inserts and selects.

GitHub link: https://github.com/apache/cloudberry/discussions/1205#discussioncomment-13660384

----
This is an automatically sent email for dev@cloudberry.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@cloudberry.apache.org