GitHub user gfphoenix78 added a comment to the discussion: Extend the gpfdist tool to support SFTP/HDFS protocols for high-performance multi-source data ingestion
The discussion seems to be about supporting more protocols for external tables, not multiple data sources for a single external table. To be clear, an external table already supports multiple data sources.

The topic has two targets:

1. Support more transfer protocols
2. Support additional file formats

Let's discuss them one by one.

## Transfer Protocol

This looks like supporting more clients to fetch files. BTW, gpfdist supports transforming data on the server, like https://github.com/apache/cloudberry/blob/main/src/bin/gpfdist/regress/input/exttab1.source#L541

## File Format

External tables support the **CUSTOM** format:

```
Syntax:
CREATE [READABLE] EXTERNAL [TEMPORARY | TEMP] TABLE table_name
    ( column_name data_type [, ...] | LIKE other_table )
    LOCATION ('file://seghost[:port]/path/file' [, ...])
      | ('gpfdist://filehost[:port]/file_pattern[#transform]'
      | ('gpfdists://filehost[:port]/file_pattern[#transform]'
          [, ...])
    FORMAT 'TEXT'
          [( [HEADER]
             [DELIMITER [AS] 'delimiter' | 'OFF']
             [NULL [AS] 'null string']
             [ESCAPE [AS] 'escape' | 'OFF']
             [NEWLINE [ AS ] 'LF' | 'CR' | 'CRLF']
             [FILL MISSING FIELDS] )]
         | 'CSV'
          [( [HEADER]
             [QUOTE [AS] 'quote']
             [DELIMITER [AS] 'delimiter']
             [NULL [AS] 'null string']
             [FORCE NOT NULL column [, ...]]
             [ESCAPE [AS] 'escape']
             [NEWLINE [ AS ] 'LF' | 'CR' | 'CRLF']
             [FILL MISSING FIELDS] )]
         | 'CUSTOM' (Formatter=<formatter specifications>)
    [ OPTIONS ( key 'value' [, ...] ) ]
    [ ENCODING 'encoding' ]
    [ [LOG ERRORS] SEGMENT REJECT LIMIT count
      [ROWS | PERCENT] ]
```

You could consider implementing a new file format.

GitHub link: https://github.com/apache/cloudberry/discussions/1205#discussioncomment-13646968

----
This is an automatically sent email for dev@cloudberry.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@cloudberry.apache.org