[ https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169049#comment-17169049 ]
David Smiley commented on SOLR-14354:
-------------------------------------

I think public API changes (including additions) to the SolrClient hierarchy are important and deserve explicit call-out attention and peer review. The dev list may be the best place to make an API change proposal. It wouldn't need to get into the details of how it _works_ -- that's what JIRA is for -- but would approach it from a developer usability standpoint. In that spirit, I will start such a dev list discussion now on how best for users to use SolrClient with asynchronous semantics.

> HttpShardHandler send requests in async
> ---------------------------------------
>
>                 Key: SOLR-14354
>                 URL: https://issues.apache.org/jira/browse/SOLR-14354
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Cao Manh Dat
>            Assignee: Cao Manh Dat
>            Priority: Major
>             Fix For: master (9.0), 8.7
>
>         Attachments: image-2020-03-23-10-04-08-399.png, image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles a search request submits n requests (n equals the number of shards) to an executor, so each shard request gets its own thread. After sending its request, that thread does nothing but wait for a response from the other side. It gets swapped out, and the CPU moves on to another thread (a context switch: the CPU saves the context of the current thread and switches to another one). When some data (not all of it) comes back, the thread is woken up to parse that data, then waits again until more data arrives. The result is a lot of context switching on the CPU.
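The thread-per-shard model described above can be sketched with plain JDK classes. This is a minimal illustration, not Solr code: `requestShard` and the shard names are hypothetical stand-ins for real shard requests, with the blocking wait simulated by a sleep.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingShardFanout {
    // Hypothetical stand-in for a blocking HTTP call to one shard.
    static String requestShard(String shard) {
        try {
            Thread.sleep(50); // the thread just sits here, waiting for the response
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "response-from-" + shard;
    }

    public static void main(String[] args) throws Exception {
        List<String> shards = List.of("shard1", "shard2", "shard3");
        ExecutorService executor = Executors.newFixedThreadPool(shards.size());
        // One task, and thus one mostly-idle thread, per shard.
        List<Future<String>> futures = shards.stream()
                .map(s -> executor.submit(() -> requestShard(s)))
                .toList();
        for (Future<String> f : futures) {
            System.out.println(f.get()); // the main thread blocks here too
        }
        executor.shutdown();
    }
}
```

Each of the n pool threads spends almost all of its lifetime blocked in the sleep, which is exactly the waste the issue describes.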
> That is quite an inefficient use of threads. Basically we want fewer threads, and we want most of them busy all the time, because threads are not free and neither is context switching. That is the main idea behind constructs like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
>     // Add request hooks
>     .onRequestQueued(request -> { ... })
>     .onRequestBegin(request -> { ... })
>     // Add response hooks
>     .onResponseBegin(response -> { ... })
>     .onResponseHeaders(response -> { ... })
>     .onResponseContent((response, buffer) -> { ... })
>     .send(result -> { ... });
> {code}
> After calling {{send()}}, the thread returns immediately without blocking. When the client receives the headers from the other side, it calls the {{onResponseHeaders()}} listeners. When it receives some {{byte[]}} of content (not the whole response), it calls the {{onResponseContent(response, buffer)}} listeners. When everything has finished, it calls the {{onComplete}} listeners. One important thing to notice is that all listeners should finish quickly: if a listener blocks, no further data for that request is handled until the listener finishes.
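The non-blocking semantics of {{send()}} can be illustrated without Jetty, using only the JDK's CompletableFuture. The names below are illustrative stand-ins, not Jetty APIs: the point is that the caller's thread returns immediately and the content listener later runs on a client-owned thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Minimal stand-in for Jetty's callback style: send() returns at once and
// the response listener fires later on the client's own thread.
public class AsyncSendSketch {
    static final ExecutorService CLIENT_THREADS = Executors.newSingleThreadExecutor();

    static CompletableFuture<String> send(String url, Consumer<String> onContent) {
        return CompletableFuture.supplyAsync(() -> {
            String content = "body-of-" + url; // pretend this arrived from the network
            onContent.accept(content);         // listener runs on the client's thread
            return content;
        }, CLIENT_THREADS);
    }

    public static void main(String[] args) {
        CompletableFuture<String> f = send("http://domain.com/path",
                chunk -> System.out.println("listener got: " + chunk));
        // The calling thread is free immediately; it only blocks here by choice.
        System.out.println("result: " + f.join());
        CLIENT_THREADS.shutdown();
    }
}
```

Note the same caveat the issue mentions: if `onContent` blocks, the single client thread is stuck and no further callbacks for that request can run.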
> This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>     // Obtain the input stream on the response content
>     try (InputStream input = listener.getInputStream()) {
>         // Read the response content
>     }
> }
> {code}
> In this case there will be two threads:
> * one thread reading the response content from the InputStream
> * one (short-lived) thread feeding content into that InputStream whenever some byte[] becomes available. Note that if this thread is unable to feed data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be written into something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>     executor.submit(() -> {
>         try (is) {
>             // Read the content from the InputStream
>         }
>     });
> });
> {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although "sending req to shard1" is drawn wide, it won't take long, since sending a request is a very quick operation. With this model, handling threads are not spun up until the first bytes come back. Notice also that in this approach we still have active threads waiting for more data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty's thread
> Jetty has another listener called BufferingResponseListener. This is how it is used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>     @Override
>     public void onComplete(Result result) {
>         byte[] response = getContent();
>         // handle the response
>     }
> });
> {code}
> On receiving data, Jetty (one of its threads) calls the listener with the given data (here just a byte[] representing part of the response). The listener buffers that byte[] into an internal buffer.
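The two-thread shape of Solution 1 can be sketched with the JDK's piped streams. This is an analogy rather than Jetty code: the feeder thread stands in for Jetty delivering content callbacks, while the reader blocks on the InputStream until bytes are available, just as the issue describes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class PipedResponseSketch {
    static String readAll(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        PipedInputStream content = new PipedInputStream();
        PipedOutputStream feeder = new PipedOutputStream(content);

        // Feeder thread: stands in for Jetty pushing onContent() chunks.
        Thread jettyThread = new Thread(() -> {
            try (feeder) {
                feeder.write("chunk1".getBytes(StandardCharsets.UTF_8));
                feeder.write("chunk2".getBytes(StandardCharsets.UTF_8));
            } catch (IOException ignored) {
            }
        });
        jettyThread.start();

        // Handler thread (here: main) blocks reading until the feeder closes.
        System.out.println(readAll(content)); // prints "chunk1chunk2"
        jettyThread.join();
    }
}
```

The reader thread is exactly the "still actively waiting" thread the issue points out as the remaining cost of Solution 1.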
> When all the data has been received, Jetty calls the listener's onComplete, and inside that method we get the whole response.
> Using this listener, the model of HttpShardHandler can be written into something like this:
> {code:java}
> handler.send(req, (byte[] data) -> {
>     // handle the data here
> });
> {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-12-00-661.png!
> Pros:
> * We don't need an additional thread per request → fewer threads
> * No thread actively waits for data from an InputStream → threads stay busy
> Cons:
> * All data must be buffered before it can be parsed → double the memory is used when parsing a response.
> h2. 5. Solution 3: Why not both?
> Solution 1 is good for parsing very large, or sometimes _unbounded_ (as in StreamingExpression), responses.
> Solution 2 is good for parsing small responses (maybe < 10KB), since its overhead is small.
> Should we combine the two solutions? After all, what HttpSolrClient returns for every request is a NamedList<>, so whether that NamedList<> is produced via Solution 1 or Solution 2 does not matter to users.
> The idea here is therefore based on the "Content-Length" response header: if the response body is smaller than a certain size we go with Solution 2, and otherwise with Solution 1.
> _Note:_ Solr doesn't seem to return Content-Length accurately; this needs more investigation.
> h2. 6. Further improvement
> The best approach to this problem would be, instead of converting an InputStream into a NamedList, to convert the bytes incrementally and make the parsing resumable. Like this:
> {code:java}
> Parser parser = new Parser();
>
> public void onContent(ByteBuffer buffer) {
>     parser.parse(buffer);
> }
>
> public void onComplete() {
>     NamedList<> result = parser.getResult();
> }
> {code}
> There would then be no blocking operation inside the parser, making for a very efficient model.
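The resumable-parser idea can be sketched as follows. This is a toy parser that merely counts newline-delimited records, assumed names throughout; a real ResponseParser would keep full tokenizer state between calls. The key property is that `parse()` consumes whatever bytes are available and returns immediately, so no thread ever blocks inside the parser.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ResumableParserSketch {
    private int records = 0;

    // Called from onContent(): consume the chunk and return at once.
    public void parse(ByteBuffer buffer) {
        while (buffer.hasRemaining()) {
            if (buffer.get() == '\n') {
                records++;
            }
        }
    }

    // Called from onComplete(): the accumulated result.
    public int getResult() {
        return records;
    }

    public static void main(String[] args) {
        ResumableParserSketch parser = new ResumableParserSketch();
        // Chunks arrive in arbitrary pieces, possibly splitting a record.
        parser.parse(ByteBuffer.wrap("a\nb".getBytes(StandardCharsets.UTF_8)));
        parser.parse(ByteBuffer.wrap("\nc\n".getBytes(StandardCharsets.UTF_8)));
        System.out.println(parser.getResult()); // 3 records
    }
}
```

Because the parser carries its own state across chunks, it needs neither a dedicated reader thread (Solution 1) nor a full in-memory copy of the response (Solution 2).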
> But doing this requires tons of changes in Solr -- rewriting all the ResponseParsers in Solr, not to mention that the flow described here must be rewritten. Not sure it is worth doing that.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org