blackfox1983 opened a new issue #5153:
URL: https://github.com/apache/incubator-doris/issues/5153


   **Is your feature request related to a problem? Please describe.**
   At present, I use the broker to export data from Doris and save query 
outfile results to a file system. I have found several scenarios where this is 
inconvenient and inefficient.
   
   1. Trial and Development
   
   I first try this feature in my development environment. In our 
development environment, we don't have distributed storage systems such as BOS, 
S3, or HDFS. Building these systems is expensive, and it may take a long time to 
build and debug them. What we need is a stand-alone setup: for example, the 
broker could write the data to the local disk (test data is generally very 
small).
   
   But now we have to do a lot of debugging work between the current broker and 
our own system. For example, we have to debug against COS (Tencent Cloud), which 
takes a long time, and this is only the trial phase.
   
   Another problem is that the interfaces of the various cloud environments may 
not be unified: their clients cannot always be fully derived from the base class 
org.apache.hadoop.fs.FileSystem, which is the base class of the file systems 
managed by the current broker. This limits the framework's compatibility.
   
   2. Prod
   
   In the production environment, we want data to be exported reliably. We 
specify the column separator, row separator, etc., but our data may contain the 
same characters as the separators, which leads to incorrectly parsed output. We 
have to be careful never to have a separator inside a data field, and in the 
current mode maintaining this stability has been a terrible experience.
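   
To make the separator problem concrete, here is a small generic sketch using Python's standard `csv` module. It is not Doris's export code and says nothing about what the broker does; the sample rows are made up. It only shows why joining fields with a bare separator breaks, and how quoting lets the same data round-trip.

```python
import csv
import io

# Made-up sample rows; the second column contains the separator and a quote.
rows = [["1", "hello, world"], ["2", 'say "hi"']]

# Naive export: join fields with the separator and apply no quoting.
naive_line = ",".join(rows[0])
# Reading it back splits the embedded comma into an extra field.
naive_fields = naive_line.split(",")

# Quoted export: csv.writer wraps any field that contains the separator
# or a quote character, and doubles embedded quotes.
buf = io.StringIO()
writer = csv.writer(buf, delimiter=",", quoting=csv.QUOTE_MINIMAL,
                    lineterminator="\n")
writer.writerows(rows)
exported = buf.getvalue()

# Reading with the same dialect recovers the original fields.
round_trip = list(csv.reader(io.StringIO(exported), delimiter=","))

print(len(naive_fields), "fields from the naive line")
print(round_trip == rows)
```

The naive line yields more fields than were written, while the quoted export reads back unchanged.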
   
   **Describe the solution you'd like**
   It would be better to use an open design rather than binding export to the broker.
   
   Doris could support a SCAN feature that lets users scan the partitions of a 
table, as in HBase or Elasticsearch.
   
   The scanned data can then be stored in a variety of systems, in whatever 
format the user wants.
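   
One way to picture "stored in a variety of systems" is a small sink interface on the client side. The sketch below is only an illustration: `Sink`, `LocalJsonLinesSink`, `MemorySink`, and `export` are hypothetical names, not part of Doris, and the rows are made-up stand-ins for scan results.

```python
import json
import os
import tempfile
from typing import Dict, Iterable, List, Protocol


class Sink(Protocol):
    """Hypothetical interface: anything that can accept scanned rows."""

    def write(self, rows: Iterable[Dict]) -> int:
        ...


class LocalJsonLinesSink:
    """Appends each row as one JSON object per line to a local file."""

    def __init__(self, path: str) -> None:
        self.path = path

    def write(self, rows: Iterable[Dict]) -> int:
        count = 0
        with open(self.path, "a", encoding="utf-8") as f:
            for row in rows:
                f.write(json.dumps(row) + "\n")
                count += 1
        return count


class MemorySink:
    """Keeps rows in a list; handy for tests."""

    def __init__(self) -> None:
        self.rows: List[Dict] = []

    def write(self, rows: Iterable[Dict]) -> int:
        before = len(self.rows)
        self.rows.extend(rows)
        return len(self.rows) - before


def export(rows: Iterable[Dict], sink: Sink) -> int:
    """Send rows to whichever sink the caller chose; returns rows written."""
    return sink.write(rows)


# Made-up rows standing in for the result of a scan.
scanned = [{"a": 1, "b": 2, "name": "x,y"}, {"a": 1, "b": 2, "name": "z"}]
out_path = os.path.join(tempfile.mkdtemp(), "scan.jsonl")
written = export(scanned, LocalJsonLinesSink(out_path))
print(written, "rows written to", out_path)
```

With this shape, a development setup can write to the local disk while production swaps in a different sink, without either depending on a shared file system base class.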
   
   **Describe alternatives you've considered**
   ```
   scan [partitions] from table where a = 1 and b = 2
   with properties (
       "scan_thread_in_doris" = "1",
       "..."
   )
   ```
   This returns a cursor and a TTL (e.g. 5 minutes).
   
   ```
   scan by cursor
   ```
   
   In this way, a scan can resume from its breakpoint after the application 
restarts.
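   
The cursor flow can be sketched with an in-memory stand-in for the server. Everything here is hypothetical: `FakeScanServer`, `open_scan`, and `scan_by_cursor` are illustrative names, and renewing the TTL on every successful call is an assumption that the proposal does not state.

```python
import time
import uuid


class FakeScanServer:
    """In-memory stand-in for the proposed SCAN endpoint (hypothetical)."""

    def __init__(self, rows, batch_size=2, ttl_seconds=300.0,
                 clock=time.monotonic):
        self._rows = list(rows)
        self._batch_size = batch_size
        self._ttl = ttl_seconds
        self._clock = clock
        self._cursors = {}  # cursor id -> (next offset, expiry time)

    def open_scan(self):
        """Start a scan; returns (cursor, ttl_seconds)."""
        cursor = uuid.uuid4().hex
        self._cursors[cursor] = (0, self._clock() + self._ttl)
        return cursor, self._ttl

    def scan_by_cursor(self, cursor):
        """Return the next batch, or an empty list when the scan is done."""
        offset, expires_at = self._cursors[cursor]
        if self._clock() > expires_at:
            del self._cursors[cursor]
            raise TimeoutError("cursor expired")
        batch = self._rows[offset:offset + self._batch_size]
        # Assumption: each successful call renews the TTL.
        self._cursors[cursor] = (offset + len(batch),
                                 self._clock() + self._ttl)
        return batch


server = FakeScanServer(rows=range(5), batch_size=2)
cursor, ttl = server.open_scan()
first = server.scan_by_cursor(cursor)

# Imagine the application persists the cursor, restarts, and reloads it.
saved_cursor = cursor
rest = []
while True:
    batch = server.scan_by_cursor(saved_cursor)
    if not batch:
        break
    rest.extend(batch)

print(first, rest)
```

In this sketch the server advances the offset as soon as a batch is returned, so a batch fetched but not yet processed before a crash would be skipped on resume. A real design would likely need some form of acknowledgment to avoid that.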
   
   


