morningman commented on PR #15839: URL: https://github.com/apache/doris/pull/15839#issuecomment-1401381547
Some questions and suggestions:

1. `builder_scanner` doesn't seem to be used. Is only `builder_scanner_memtable` used?
2. The inputs and outputs need to be unified:
    1. Inputs:
        * A header file in JSON and a data file in Parquet (this can also be ORC or another supported file format).
        * In the code, the reading methods for the different file systems can be unified, so there is no need to branch on `isHDFS`.
    2. Outputs:
        * A new header file in JSON and a Doris segment data file.
3. The final upload logic can be encapsulated so that it is not limited to HDFS.
4. I think we can generate a manifest file that lists all output files, so that the downstream system can read it.
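To make point 4 concrete, here is a rough sketch of what such a manifest could look like. It is written in Python only for brevity and is not taken from this PR: the file names `manifest.json` and `header.json`, the helper names `build_manifest` and `write_manifest`, and the fields `version`, `files`, `path`, `size`, `sha256`, and `type` are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(output_dir, header_name="header.json", manifest_name="manifest.json"):
    """Describe every output file under output_dir (hypothetical layout)."""
    root = Path(output_dir)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        if rel == manifest_name:
            # Never list a manifest left over from a previous run.
            continue
        entries.append({
            "path": rel,
            "size": path.stat().st_size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            # Assumption: anything that is not the header is segment data.
            "type": "header" if rel == header_name else "segment",
        })
    return {"version": 1, "files": entries}


def write_manifest(output_dir, manifest_name="manifest.json"):
    """Write the manifest next to the output files and return its path."""
    manifest = build_manifest(output_dir, manifest_name=manifest_name)
    target = Path(output_dir) / manifest_name
    target.write_text(json.dumps(manifest, indent=2))
    return target
```

A downstream system would then only need to read the one manifest file to find the new header and the segment files, and could use the recorded sizes and checksums to verify an upload regardless of the target file system.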