GitHub user ZTE-EBASE added a comment to the discussion: Extend the gpfdist tool to support SFTP/HDFS protocols for high-performance multi-source data ingestion
Yes, our implementation relies on libssh along with the arrow/parquet libraries. It is tailored to specific business requirements: because the business scenario involves large-scale data, we adopt a parallel strategy to achieve high-performance data ingestion and querying.

Regarding the HDFS protocol, we have implemented an FDW (Foreign Data Wrapper) for it. However, this required a significant amount of code modification and changes to the kernel. If there is a need, we can provide this implementation later.

GitHub link: https://github.com/apache/cloudberry/discussions/1205#discussioncomment-13638452

----
This is an automatically sent email for dev@cloudberry.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@cloudberry.apache.org
For additional commands, e-mail: dev-h...@cloudberry.apache.org