vagetablechicken opened a new issue #2780: OlapTableSink::send is low efficient?
URL: https://github.com/apache/incubator-doris/issues/2780
 
 
   **Problem**
   When we use broker load, OlapTableSink::send() takes the longest time, accounting for almost all of the plan fragment's active time.
   Example profile from one BE:
   ```
   Fragment f59d832368a84c94-be109f903cf4698d:(Active: 3h36m, % non-child: 0.00%)
      - AverageThreadTokens: 1.00
      - PeakReservation: 0
      - PeakUsedReservation: 0
      - RowsProduced: 168.61M
      - SizeProduced: 30.25 GB
     BlockMgr:
        - BlockWritesOutstanding: 0
        - BlocksCreated: 0
        - BlocksRecycled: 0
        - BufferedPins: 0
        - BytesWritten: 0
        - MaxBlockSize: 8.00 MB
        - MemoryLimit: 2.00 GB
        - TotalBufferWaitTime: 0.000ns
        - TotalEncryptionTime: 0.000ns
        - TotalIntegrityCheckTime: 0.000ns
        - TotalReadBlockTime: 0.000ns
     OlapTableSink:(Active: 3h35m, % non-child: 0.00%)
        - CloseTime: 102.932ms
        - ConvertBatchTime: 0.000ns
        - OpenTime: 247.194ms
        - RowsFiltered: 0
        - RowsRead: 168.61M
        - RowsReturned: 168.61M
        - SendDataTime: 3h34m
        - SerializeBatchTime: 8m26s
        - ValidateDataTime: 19s554ms
        - WaitInFlightPacketTime: 3h23m
     BROKER_SCAN_NODE (id=0):(Active: 1m8s, % non-child: 0.00%)
        - BytesRead: 0
        - MemoryUsed: 0
        - NumThread: 0
        - PerReadThreadRawHdfsThroughput: 0.00 /sec
        - RowsRead: 168.61M
        - RowsReturned: 168.61M
        - RowsReturnedRate: 2.48 M/sec
        - ScanRangesComplete: 0
        - ScannerThreadsInvoluntaryContextSwitches: 0
        - ScannerThreadsTotalWallClockTime: 0.000ns
          - MaterializeTupleTime(*): 5m37s
          - ScannerThreadsSysTime: 0.000ns
          - ScannerThreadsUserTime: 0.000ns
        - ScannerThreadsVoluntaryContextSwitches: 0
        - TotalRawReadTime(*): 38m58s
        - TotalReadThroughput: 0.00 /sec
        - WaitScannerTime: 1m7s
   ```
   
   As shown above, WaitInFlightPacketTime (3h23m) accounts for most of SendDataTime (3h34m).
   
   **Analysis**
   The whole process is described here.
   
   PlanFragmentExecutor pseudo code:
   ```
   while(1){
       batch=get_one_batch();
       OlapTableSink::send(batch);
   }
   ```
   Then, OlapTableSink::send() pseudo code:
   ```
   for(row in batch){
       channel=get_corresponding_channel(row);
   
       // channel::add_row() explanation:
       ok=channel::add_row_in_cur_batch(row);
       if(!ok){
           if(channel::has_in_flight_packet){
               channel::wait_in_flight_packet(); // (*)
           }
           channel::send_add_batch_req();
           channel::add_row_in_cur_batch(row);
       }
       // channel::add_row() end
   }
   ```
   
   So if channel::wait_in_flight_packet() is triggered, it blocks the whole send loop. But there is no need to block add_row() on the other channels.
   For example, while channel0 is waiting for its in-flight packet, we can still add rows to the other channels.
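
To make the cost of (*) concrete, here is a small C++ model of the loop above running on a simulated clock. It is only a sketch: `SimChannel`, `simulate_blocking_send`, the one-tick-per-row cost, and the fixed `ack_latency` are assumptions made for illustration, not Doris code or measured values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not Doris code: adding a row costs one simulated tick,
// and an add_batch request is acknowledged ack_latency ticks after it is sent.
struct SimChannel {
    int rows_in_batch = 0;       // rows buffered in the current batch
    int64_t in_flight_done = 0;  // tick at which the in-flight packet is acked
};

struct SimResult {
    int64_t total_ticks = 0;  // simulated time for the whole send() loop
    int64_t wait_ticks = 0;   // time spent in wait_in_flight_packet()
};

// Replays the send() loop on a simulated clock.
// row_channels[i] is the index of the channel that row i maps to.
SimResult simulate_blocking_send(const std::vector<int>& row_channels,
                                 int num_channels, int batch_capacity,
                                 int64_t ack_latency) {
    assert(num_channels > 0 && batch_capacity > 0);
    std::vector<SimChannel> channels(static_cast<std::size_t>(num_channels));
    SimResult result;
    int64_t now = 0;
    for (int ch : row_channels) {
        assert(ch >= 0 && ch < num_channels);
        SimChannel& c = channels[static_cast<std::size_t>(ch)];
        if (c.rows_in_batch == batch_capacity) {
            // Batch is full: wait for the previous packet, then send this one.
            if (c.in_flight_done > now) {
                result.wait_ticks += c.in_flight_done - now;
                now = c.in_flight_done;  // the whole loop stalls here (*)
            }
            c.in_flight_done = now + ack_latency;  // send_add_batch_req()
            c.rows_in_batch = 0;
        }
        ++c.rows_in_batch;
        ++now;  // cost of adding one row
    }
    result.total_ticks = now;
    return result;
}
```

In this model, six rows that all map to one channel, with `batch_capacity = 2` and `ack_latency = 10`, spend 8 of their 14 ticks waiting. Rows for any other channel queued behind that wait would be delayed by the same amount.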
   
   **Better solutions (preliminary thoughts)**
   * Make channel::add_row() non-blocking. This might be a massive change.
   * Make channel::add_row() less blocking, e.g. avoid adding rows to channel0 immediately after channel0 sends an add_batch request.
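
One possible shape for the non-blocking idea is sketched below in C++. Everything here is hypothetical: `SketchChannel`, `on_ack`, and the per-channel parked queue are illustrative names, not existing Doris APIs, and clearing the batch merely stands in for sending an add_batch request. Instead of waiting when the batch is full and a packet is still in flight, the channel parks the row and returns, so rows for other channels can proceed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical sketch, not Doris code: a channel whose add_row() never waits.
class SketchChannel {
public:
    explicit SketchChannel(std::size_t batch_capacity)
        : capacity_(batch_capacity) {
        assert(capacity_ > 0);
    }

    // Never blocks. A row that cannot be buffered right now stays parked,
    // in arrival order, until the in-flight packet is acknowledged.
    void add_row(int row) {
        parked_.push_back(row);
        drain();
    }

    // Assumed callback: invoked when the in-flight add_batch request is acked.
    void on_ack() {
        in_flight_ = false;
        drain();
    }

    std::size_t parked_rows() const { return parked_.size(); }
    std::size_t buffered_rows() const { return batch_.size(); }
    std::size_t batches_sent() const { return batches_sent_; }

private:
    // Moves parked rows into the current batch until it would have to wait.
    void drain() {
        while (!parked_.empty()) {
            if (batch_.size() == capacity_) {
                if (in_flight_) return;  // the original code would block here
                batch_.clear();          // stands in for send_add_batch_req()
                in_flight_ = true;
                ++batches_sent_;
            }
            batch_.push_back(parked_.front());
            parked_.pop_front();
        }
    }

    std::size_t capacity_;
    std::vector<int> batch_;   // rows in the current batch
    std::deque<int> parked_;   // rows waiting for the in-flight packet
    bool in_flight_ = false;
    std::size_t batches_sent_ = 0;
};
```

The parked queue in this sketch is unbounded, so a real implementation would still need some back-pressure (for example, blocking once the queue exceeds a limit) to cap memory use. That would reduce blocking rather than remove it entirely.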
   
