morningman commented on PR #15839:
URL: https://github.com/apache/doris/pull/15839#issuecomment-1401381547

   Some questions and suggestions:
   
   1. `builder_scanner` doesn't seem to be used. Is only `builder_scanner_memtable` used?
   2. Need to unify the inputs and outputs:
        
        1. Inputs:
        
                * A header file in JSON and a data file in Parquet (or ORC, or any other supported file format).
                * In the code, reading from different file systems can be unified, so there is no need to check `isHDFS`.
        
        2. Outputs
        
                * A new header file in JSON and a Doris segment data file.
   
   3. The final upload logic can be encapsulated so that it is not limited to HDFS.
   4. I think we can generate a manifest file that lists all output files, so that downstream systems can read it.

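   The suggestion to drop the `isHDFS` check (point 2) could look roughly like the sketch below: readers are registered per URI scheme and one entry point picks the reader from the path. This is written in Python for brevity and is an assumption about the shape of the design, not the PR's implementation; `register_reader` and `read_bytes` are hypothetical names, and only a local-file reader is actually implemented here.

```python
from urllib.parse import urlparse

# Maps a URI scheme ("file", "hdfs", "s3", ...) to a reader function.
_READERS = {}


def register_reader(scheme):
    """Decorator that registers a reader function for one URI scheme."""
    def decorator(fn):
        _READERS[scheme] = fn
        return fn
    return decorator


@register_reader("")      # plain POSIX paths such as /tmp/data.parquet
@register_reader("file")  # file:///tmp/data.parquet
def _read_local(parsed):
    with open(parsed.path, "rb") as f:
        return f.read()


def read_bytes(uri):
    """Read a whole file; the caller never branches on the file system type."""
    parsed = urlparse(uri)
    reader = _READERS.get(parsed.scheme)
    if reader is None:
        raise ValueError(f"no reader registered for scheme {parsed.scheme!r}")
    return reader(parsed)
```

   An HDFS backend would register itself under `"hdfs"` in the same way, and the upload logic in point 3 could use a matching writer registry so that it is not tied to HDFS either.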

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

