vagetablechicken opened a new issue #2780: OlapTableSink::send is low efficient? URL: https://github.com/apache/incubator-doris/issues/2780

**Problem**

When we use broker load, OlapTableSink::send() takes the longest time, accounting for almost all of the plan fragment's active time. Profile from one BE:

```
Fragment f59d832368a84c94-be109f903cf4698d:(Active: 3h36m, % non-child: 0.00%)
   - AverageThreadTokens: 1.00
   - PeakReservation: 0
   - PeakUsedReservation: 0
   - RowsProduced: 168.61M
   - SizeProduced: 30.25 GB
  BlockMgr:
     - BlockWritesOutstanding: 0
     - BlocksCreated: 0
     - BlocksRecycled: 0
     - BufferedPins: 0
     - BytesWritten: 0
     - MaxBlockSize: 8.00 MB
     - MemoryLimit: 2.00 GB
     - TotalBufferWaitTime: 0.000ns
     - TotalEncryptionTime: 0.000ns
     - TotalIntegrityCheckTime: 0.000ns
     - TotalReadBlockTime: 0.000ns
  OlapTableSink:(Active: 3h35m, % non-child: 0.00%)
     - CloseTime: 102.932ms
     - ConvertBatchTime: 0.000ns
     - OpenTime: 247.194ms
     - RowsFiltered: 0
     - RowsRead: 168.61M
     - RowsReturned: 168.61M
     - SendDataTime: 3h34m
     - SerializeBatchTime: 8m26s
     - ValidateDataTime: 19s554ms
     - WaitInFlightPacketTime: 3h23m
  BROKER_SCAN_NODE (id=0):(Active: 1m8s, % non-child: 0.00%)
     - BytesRead: 0
     - MemoryUsed: 0
     - NumThread: 0
     - PerReadThreadRawHdfsThroughput: 0.00 /sec
     - RowsRead: 168.61M
     - RowsReturned: 168.61M
     - RowsReturnedRate: 2.48 M/sec
     - ScanRangesComplete: 0
     - ScannerThreadsInvoluntaryContextSwitches: 0
     - ScannerThreadsTotalWallClockTime: 0.000ns
       - MaterializeTupleTime(*): 5m37s
       - ScannerThreadsSysTime: 0.000ns
       - ScannerThreadsUserTime: 0.000ns
     - ScannerThreadsVoluntaryContextSwitches: 0
     - TotalRawReadTime(*): 38m58s
     - TotalReadThroughput: 0.00 /sec
     - WaitScannerTime: 1m7s
```

As can be seen above, WaitInFlightPacketTime is the most time-consuming portion (3h23m of the 3h34m spent in SendDataTime).

**Analysis**

I describe the whole process here.
PlanFragmentExecutor pseudo code:

```
while (true) {
    batch = get_one_batch();
    OlapTableSink::send(batch);
}
```

Then, OlapTableSink::send() pseudo code:

```
for (row in batch) {
    channel = get_corresponding_channel(row);
    // channel::add_row() begins here:
    ok = channel::add_row_in_cur_batch(row);
    if (!ok) {
        if (channel::has_in_flight_packet) {
            channel::wait_in_flight_packet();  // (*) blocks the whole loop
        }
        channel::send_add_batch_req();
        channel::add_row_in_cur_batch(row);
    }
    // channel::add_row() ends here
}
```

So whenever one channel triggers channel::wait_in_flight_packet() at (*), the entire send loop blocks, even though there is no need to block add_row() on the other channels. For example, while channel0 is waiting for its in-flight packet, we could still add rows to the other channels.

**Better solutions (preliminary thoughts)**

* Make channel::add_row() non-blocking. This might be a massive change.
* Make channel::add_row() less blocking, e.g. avoid adding rows to channel0 immediately after channel0 sends an add_batch request.
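To make the "less blocking" idea concrete, here is a minimal sketch of one possible shape for it. All names here are illustrative, not the real Doris classes: the idea is that `try_add_row()` fails fast instead of waiting, and the sender parks the row in a per-channel pending queue so rows destined for other channels keep flowing.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical sketch only -- names and structure are assumptions,
// not the actual OlapTableSink / NodeChannel implementation.
struct Channel {
    size_t batch_capacity = 0;
    size_t batch_size = 0;
    bool in_flight_packet = false;
    std::deque<int> pending;  // rows deferred while this channel was busy

    // Non-blocking attempt: fails only when the current batch is full
    // AND a previous add_batch RPC is still in flight.
    bool try_add_row(int row) {
        (void)row;  // a real implementation would copy the row in
        if (batch_size < batch_capacity) {
            ++batch_size;
            return true;
        }
        if (in_flight_packet) {
            return false;  // caller defers the row instead of waiting
        }
        // Batch full but no RPC outstanding: "send" it, start a new batch.
        in_flight_packet = true;
        batch_size = 1;
        return true;
    }
};

// One pass over a batch: a row whose channel is busy is parked in that
// channel's pending queue, so the other channels still make progress.
void send_batch(std::vector<Channel>& channels,
                const std::vector<std::pair<int, int>>& rows /* (chan, row) */) {
    for (const auto& [ch, row] : rows) {
        if (!channels[ch].try_add_row(row)) {
            channels[ch].pending.push_back(row);
        }
    }
}
```

When a channel's in-flight RPC completes, its pending queue would be drained into the next batch; the sink only has to block at end-of-batch if pending rows remain everywhere, rather than at the first busy channel.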