GitHub user avamingli added a comment to the discussion: Extend the gpfdist 
tool to support SFTP/HDFS protocols for high-performance multi-source data 
ingestion

With `transform`, you can apply arbitrary processing to the data before it is 
inserted into the database. For example, the transform can be an independent 
tool that integrates and parses data from various sources.
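
As a rough illustration of the kind of preprocessing such a tool might do, 
here is a minimal sketch that converts JSON lines into pipe-delimited rows. 
The `transform` function name, the `id`/`name` fields, and the delimiter are 
hypothetical choices for this example; how the script would be registered 
with gpfdist is not shown and would depend on your transform configuration.

```python
import json


def transform(stream_in, stream_out, fields=("id", "name"), sep="|"):
    """Read JSON-lines records and write one delimited row per record.

    `fields` and `sep` are illustrative assumptions, not gpfdist settings.
    """
    for line in stream_in:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the source
        record = json.loads(line)
        # Fields missing from a record are written as empty strings.
        row = sep.join(str(record.get(f, "")) for f in fields)
        stream_out.write(row + "\n")
```

Called as `transform(sys.stdin, sys.stdout)`, a script like this could sit 
between the raw source and the loader, so the database only ever sees rows in 
the delimited format the external table expects.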

Also, have you evaluated [hdfs_fdw](https://github.com/EnterpriseDB/hdfs_fdw) 
as a potential solution for Cloudberry?
Much of its code is already compatible with PostgreSQL and works well, so 
adapting this FDW for Cloudberry might require significantly less effort than 
integrating multiple client-side protocols into gpfdist. This approach could 
also prove more cost-effective in the long run.
If the FDW were made MPP-aware, it could use multiple segments for inserts 
and selects.



GitHub link: 
https://github.com/apache/cloudberry/discussions/1205#discussioncomment-13660384

----
This is an automatically sent email for dev@cloudberry.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@cloudberry.apache.org


