Im-Manshushu commented on issue #11258:
URL: https://github.com/apache/doris/issues/11258#issuecomment-1198922608

   Once the feature is developed, we can give users reference thresholds. 
Users can then set the sink parallelism and checkpoint interval to suit 
their scenario, for example data volume and timeliness requirements.
   
   ------------------ Original Message ------------------
   From: "apache/doris" ***@***.***>;
   Sent: Friday, July 29, 2022, 11:47 AM
   ***@***.***>;
   Cc: "I'm ~ ***@***.******@***.***>;
   Subject: Re: [apache/doris] [Feature] JSON data is dynamically written to 
the Doris table (Issue #11258)
     
   Many users put the canal logs for every table in a business database into 
one topic, which has to be split by table before they can use 
doris-flink-connector. The idea here is to write a single job that 
synchronizes the entire database. doris-flink-connector currently uses an 
HTTP input stream: each checkpoint opens one stream, and that stream is 
tightly bound to a single Stream Load URL. The current architecture is 
therefore not suitable for whole-database synchronization, because it would 
require too many long-lived HTTP connections. In this case we can only use 
the old Stream Load batch mode: the Flink side caches data, keeps one buffer 
per table bound to that table's Stream Load URL, and submits a load once a 
threshold such as row count or batch size is reached, just like 
doris-datax-writer.
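
   The per-table buffering described above can be sketched roughly as 
follows. This is only an illustration under assumptions: the class name 
TableBatchBuffer, its thresholds, and the flusher callback are hypothetical 
and are not part of doris-flink-connector. The callback stands in for 
whatever would actually submit the batch to that table's Stream Load URL.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical per-table batch buffer; not an existing connector class. */
class TableBatchBuffer {
    private final int maxRows;     // flush a table once it holds this many rows
    private final long maxBytes;   // or once its buffered rows reach this many bytes
    // Assumed callback: receives (table name, buffered rows) and performs the load.
    private final BiConsumer<String, List<String>> flusher;
    private final Map<String, List<String>> rowsByTable = new HashMap<>();
    private final Map<String, Long> bytesByTable = new HashMap<>();

    TableBatchBuffer(int maxRows, long maxBytes,
                     BiConsumer<String, List<String>> flusher) {
        this.maxRows = maxRows;
        this.maxBytes = maxBytes;
        this.flusher = flusher;
    }

    /** Buffer one JSON row for a table and flush that table if a threshold is hit. */
    void add(String table, String jsonRow) {
        List<String> rows = rowsByTable.computeIfAbsent(table, t -> new ArrayList<>());
        rows.add(jsonRow);
        long rowBytes = jsonRow.getBytes(StandardCharsets.UTF_8).length;
        long bytes = bytesByTable.merge(table, rowBytes, Long::sum);
        if (rows.size() >= maxRows || bytes >= maxBytes) {
            flush(table);
        }
    }

    /** Hand the buffered rows of one table to the flusher and clear its buffer. */
    void flush(String table) {
        List<String> rows = rowsByTable.remove(table);
        bytesByTable.remove(table);
        if (rows != null && !rows.isEmpty()) {
            flusher.accept(table, rows);
        }
    }

    /** Flush every table, for example when a checkpoint is taken or the job stops. */
    void flushAll() {
        for (String table : new ArrayList<>(rowsByTable.keySet())) {
            flush(table);
        }
    }
}
```

   Each table is flushed independently, so a busy table does not force small 
loads for a quiet one. The two thresholds are the values that would need 
tuning per scenario.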
     
   However, the old Stream Load batch-write mode may have several problems:
     
   An unreasonable cached batch size causes a series of problems. If it is 
too small, frequent imports trigger the -235 error; if it is too large, 
Flink memory comes under pressure.
    
   It also does not guarantee exactly-once semantics.
     


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

