Hi David,

I added the test you suggested here https://github.com/apache/solr/pull/3362

Surprisingly they pass. I think it is because although the idleTimeout set on 
the jetty HttpClient passed into the Http2SolrClient remains unchanged, the 
Http2SolrClient-level timeout is still enforced via Jetty's 
InputStreamResponseListener::get which effectively works like an idle timeout:

https://github.com/apache/solr/blob/b91905a653ead168cea9de2a1f1791c76fb17882/solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java#L517

I got somewhat paranoid and added the SlowStream test servlet to make sure that 
both timeouts are respected when they are overridden. As far as I can tell, 
they are.

I will admit that it still feels weird to pass an idleTimeout into a 
Http2SolrClient and then *not* propagate it to the underlying Jetty HttpClient. 
Also, there could be other properties that don't work as nicely as these two 
when overridden on an Http2SolrClient that takes in a Jetty HttpClient. But I 
haven't given that a deeper look.

Thanks,
Luke


From: dev@solr.apache.org At: 05/19/25 16:32:09 UTC-4:00To:  dev@solr.apache.org
Cc:  Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A ) ,  stilla...@apache.org,  
gus.h...@gmail.com
Subject: Re: SolrJ timeouts

My annotated draft PR:
https://github.com/apache/solr/pull/3357
Doesn't solve the issue but its exception throwing means many tests fail
since those places in Solr were trying to both use an existing HttpClient
_and_ set timeouts.  We can't do that and we *weren't* doing that!  'course
if I'm missing something obvious; let me know!

> I guess my attempt to fix it (https://github.com/apache/solr/pull/3356)
is doomed to fail as it currently stands?

Agreed.  But let's see:  Maybe a test would be: configure a super low idle
timeout on an Http2SolrClient that's constructed manually.  Then create a
new Http2SolrClient with the first using withHttpClient and set a
reasonable timeout.  We should be able to do a request that takes time
(like a Solr request referencing "sleep(5000)" in a function query.  See
features-slow.json with such an excerpt.  I believe we'll sadly have a
timeout because the idle timeout customization isn't honored when
withHttpClient is used.  Such a test would go in Http2SolrClientTest.

On Mon, May 19, 2025 at 9:45 AM David Smiley <dsmi...@apache.org> wrote:

> This weekend I was harmonizing some of the timeouts between JDK & Jetty
> based HTTP SolrClient implementations.  I saw some inconsistencies,
> duplication, and the need for javadocs to explain what these timeouts
> fundamentally mean.  I have some nice WIP code I was looking forward to
> sharing by now.
>
> Then I noticed something insidious:  If an Http2SolrClient (Jetty) is
> created using the builder's withHttpClient(...) method, then the idle (aka
> socket) timeout & connection timeout (and basically any other setting
> that's inside the Jetty HttpClient) cannot *actually* be customized
> henceforth.  This should be pretty obvious, when you think about it (duh).
> But we don't do any checks in construction to alert you, plus we
> redundantly put some of these settings on the Http2SolrClient (incl. base
> class) which fools any onlooker into thinking it might do something.  My
> WIP addresses this by throwing an IllegalArgumentException on build() if
> you try to set something that won't take effect.  This revealed a number of
> prominent places in Solr that both call withHttpClient and also customize
> the timeouts (to no avail):  Overseer, SolrClientCache (for streaming
> expressions), IndexFetcher, and some recovery stuff; maybe I missed
> something.  This solution won't work.
>
> Perhaps withHttpClient should not reuse the underlying HttpClient?  It's a
> shame to not reuse it but it's at least the safest choice.  In my opinion,
> withHttpClient doesn't have an ideal name either; it should be
> withHttpSolrClient.  Its current name is ambiguous; I hate confusing
> HttpSolrClient with HttpClient.  And if we copy configuration from the
> client but don't actually use it, then maybe we should name this method
> copyConfigurationFrom or withConfigurationFrom.
>
> I'll throw up a draft PR today that does the cleanup/harmonization but not
> yet solve the above conundrum.
>
> (CC'ed some folks that worked on timeouts)
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>


Reply via email to