GitHub user ZTE-EBASE added a comment to the discussion: Extend the gpfdist 
tool to support SFTP/HDFS protocols for high-performance multi-source data 
ingestion

Yes, our implementation relies on libssh along with the arrow/parquet 
libraries. This approach is tailored to specific business requirements: 
because the business scenario involves large-scale data, we adopt a parallel 
strategy to achieve high-performance data ingestion and querying.
Regarding the HDFS protocol, we have implemented an FDW (Foreign Data Wrapper) 
for it. However, this required a significant amount of code modification and 
changes to the kernel. If there is a need, we can provide this 
implementation later.

GitHub link: 
https://github.com/apache/cloudberry/discussions/1205#discussioncomment-13638452

----
This is an automatically sent email for dev@cloudberry.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@cloudberry.apache.org


