[GitHub] [pinot] mqliang opened a new pull request #8083: Timeout if waiting server channel lock takes a long time

GitBox Thu, 27 Jan 2022 20:35:22 -0800


mqliang opened a new pull request #8083:
URL: https://github.com/apache/pinot/pull/8083



   ## Description
   <!-- Add a description of your PR here.
   A good description should include pointers to an issue or design document, 
etc.
   -->
   
   We currently time out a query at several phases:
   * After building the query router, we check for a timeout.
   * While waiting for server responses, we check for a timeout and cancel the wait if needed.
   * While reducing the responses.
   
   But we don't time out while "waiting for the channel lock and sending the 
request", which can cause problems. Say we have QPS = 100 and Latency_SLA = 100ms.
   
   Now the broker receives 100 requests that all need to be sent to the same 
server. For simplicity, let's ignore the time spent building the query router. 
Let's say the 1st request is extremely large (or the server is, for some 
reason, reading the request from the Netty TCP channel extremely slowly), so 
writing the 1st request to the channel takes 200ms.
   
   Since we have only 1 TCP connection between the broker and each server, 
requests must be written to the channel ***sequentially***. Because we 
currently don't time out while "waiting for the channel lock and sending the 
request", all of the following 99 requests will still be written to the 
channel even though they have already exceeded the 100ms SLA.
   
   This PR lets requests time out early while "waiting for the channel 
lock". It also moves the request serialization logic out of the critical 
section.
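
The early timeout described above can be sketched with a timed lock acquisition. This is a minimal illustration, not the code in this PR: `ServerChannelSketch`, `Writer`, and `sendRequest` are hypothetical names, and a plain `ReentrantLock` stands in for whatever lock guards the real channel. The caller passes already-serialized bytes, so serialization stays outside the critical section.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the broker-to-server channel; not Pinot's actual class. */
final class ServerChannelSketch {

  /** Hypothetical byte sink; a real channel would write to the Netty TCP channel. */
  interface Writer {
    void write(byte[] requestBytes) throws Exception;
  }

  private final ReentrantLock channelLock = new ReentrantLock();
  private final Writer writer;

  ServerChannelSketch(Writer writer) {
    this.writer = writer;
  }

  /**
   * Sends already-serialized bytes, giving up if the channel lock cannot be
   * acquired within timeoutMs (the query's remaining time budget).
   */
  void sendRequest(byte[] requestBytes, long timeoutMs) throws Exception {
    // Fail fast if the budget is already spent, otherwise wait for the lock
    // only as long as the budget allows.
    if (timeoutMs <= 0 || !channelLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
      throw new TimeoutException("Timed out waiting for the channel lock");
    }
    try {
      // Once the write starts it runs to completion: the receiver must read a
      // whole request, so there is no timeout check inside the critical section.
      writer.write(requestBytes);
    } finally {
      channelLock.unlock();
    }
  }
}
```

In the 100-request scenario above, each of the 99 queued requests would call `sendRequest` with its remaining budget and fail with a `TimeoutException` once that budget runs out, instead of being written after the SLA has already passed.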
   
   NOTE: Once a thread has acquired the channel lock and started sending a 
request by writing bytes to the TCP channel, we cannot abort it, even if the 
send exceeds the latency SLA. This is because the server cannot read a partial 
request from the channel and then discard it; it must read a complete request. 
Thus the "head-of-line blocking" effect still exists. To solve the issue 
completely, we would need to consider gRPC or HTTP/2, which can split a large 
request into small chunks and multiplex/interleave them on a single TCP 
connection, making it possible to abort a single request without impacting 
others.
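
To illustrate why chunking helps, here is a rough sketch of round-robin interleaving of fixed-size frames. It is only a toy model under stated assumptions: `FrameInterleavingSketch`, `Frame`, and `interleave` are hypothetical names, and the scheduling is far simpler than what HTTP/2 or gRPC actually do.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of frame interleaving; not the actual HTTP/2 or gRPC framing. */
final class FrameInterleavingSketch {

  /** One frame on the wire: owning request, payload length, and end-of-request flag. */
  static final class Frame {
    final int requestId;
    final int length;
    final boolean last;

    Frame(int requestId, int length, boolean last) {
      this.requestId = requestId;
      this.length = length;
      this.last = last;
    }
  }

  /**
   * requestSizes[i] is the size in bytes of request i. Returns the frames in
   * the order they would be written to the single shared connection.
   */
  static List<Frame> interleave(int[] requestSizes, int maxFrameSize) {
    if (maxFrameSize <= 0) {
      throw new IllegalArgumentException("maxFrameSize must be positive");
    }
    int[] remaining = requestSizes.clone();
    List<Frame> wire = new ArrayList<>();
    boolean pending = true;
    while (pending) {
      pending = false;
      // Each round gives every unfinished request one frame, so a small
      // request is never stuck behind all the bytes of a large one.
      for (int id = 0; id < remaining.length; id++) {
        if (remaining[id] <= 0) {
          continue;
        }
        int length = Math.min(maxFrameSize, remaining[id]);
        remaining[id] -= length;
        wire.add(new Frame(id, length, remaining[id] == 0));
        pending = true;
      }
    }
    return wire;
  }
}
```

With a 1000-byte request queued ahead of a 10-byte request and a 100-byte frame limit, the small request completes in the first round. Because frames are independent units, a sender could also stop emitting frames for a request that has timed out without corrupting the other requests sharing the connection.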
   
   ## Upgrade Notes
   Does this PR prevent a zero down-time upgrade? (Assume upgrade order: 
Controller, Broker, Server, Minion)
   * [ ] Yes (Please label as **<code>backward-incompat</code>**, and complete 
the section below on Release Notes)
   
   Does this PR fix a zero-downtime upgrade introduced earlier?
   * [ ] Yes (Please label this as **<code>backward-incompat</code>**, and 
complete the section below on Release Notes)
   
   Does this PR otherwise need attention when creating release notes? Things to 
consider:
   - New configuration options
   - Deprecation of configurations
   - Signature changes to public methods/interfaces
   - New plugins added or old plugins removed
   * [ ] Yes (Please label this PR as **<code>release-notes</code>** and 
complete the section on Release Notes)
   ## Release Notes
   <!-- If you have tagged this as either backward-incompat or release-notes,
   you MUST add text here that you would like to see appear in release notes of 
the
   next release. -->
   
   <!-- If you have a series of commits adding or enabling a feature, then
   add this section only in final commit that marks the feature completed.
   Refer to earlier release notes to see examples of text.
   -->
   ## Documentation
   <!-- If you have introduced a new feature or configuration, please add it to 
the documentation as well.
   See 
https://docs.pinot.apache.org/developers/developers-and-contributors/update-document
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



