The JavaDoc needs a lot more information. As I remember it, SolrJ started as a 
thin layer over Apache HttpClient, so the authors may have assumed that 
programmers were familiar with that library. HttpClient creates a shared object 
that manages a pool of connections to the target server. HttpClient is 
seriously awesome. I first used it in the late 1990s when I hit the limitations 
of the URL classes written by Sun.

I looked at the JavaDoc and various examples, and none of them make this clear. 
Not your fault; we need a serious upgrade on those docs.

On the plus side, your program should be a lot faster once you reuse the 
client object.
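
A minimal sketch of the reuse pattern, under loud assumptions: StandInSolrClient 
is a hypothetical stand-in that only counts constructions and close() calls. It is 
not the SolrJ class, and the names crawl and send are made up for illustration. 
The point is the shape: build the client once, use it for every pass, and close 
it once when the program is done.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for HttpSolrClient; it only counts usage. */
final class StandInSolrClient implements Closeable {
    static final AtomicInteger created = new AtomicInteger();
    static final AtomicInteger closed = new AtomicInteger();

    private final String baseUrl;
    private int sent;

    StandInSolrClient(String baseUrl) {
        this.baseUrl = baseUrl;
        created.incrementAndGet();
    }

    /** A real client would post the document to baseUrl over a pooled connection. */
    void send(String doc) {
        sent++;
    }

    int sentCount() {
        return sent;
    }

    String baseUrl() {
        return baseUrl;
    }

    @Override
    public void close() {
        closed.incrementAndGet();
    }
}

final class ReuseClientSketch {
    /** Runs a bounded number of crawl passes with a single shared client. */
    static int crawl(String baseUrl, int passes) {
        // The client is created once, outside the loop, and closed once on exit.
        try (StandInSolrClient client = new StandInSolrClient(baseUrl)) {
            for (int i = 0; i < passes; i++) {
                client.send("doc-" + i);
            }
            return client.sentCount();
        }
    }

    public static void main(String[] args) {
        int sent = crawl("http://localhost:8983/solr/core1", 3);
        System.out.println("sent=" + sent
                + " created=" + StandInSolrClient.created.get()
                + " closed=" + StandInSolrClient.closed.get());
    }
}
```

With this shape, close() has a clear job: it runs once at shutdown to release 
the pooled connections, not once per pass through the loop.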

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Jan 31, 2016, at 3:46 PM, Steven White <swhite4...@gmail.com> wrote:
> 
> Thanks Walter.  Yes, I saw your answer and fixed the issue per your
> suggestion.
> 
> The JavaDoc needs to make this clear.  The fact that there is a close() on
> this class and the JavaDoc does not say "your program should have exactly as
> many HttpSolrClient objects as there are servers it talks to" makes the class
> a prime candidate for misuse.
> 
> Steve
> 
> 
> On Sun, Jan 31, 2016 at 5:20 PM, Walter Underwood <wun...@wunderwood.org>
> wrote:
> 
>> I already answered this.
>> 
>> Move the creation of the HttpSolrClient outside the loop. Your code will
>> run much faster, because it will be able to reuse the connections.
>> 
>> Put another way, your program should have exactly as many HttpSolrClient
>> objects as there are servers it talks to. If there is one Solr server, you
>> have one object.
>> 
>> There is no leak in HttpSolrClient; you are misusing the class, massively.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>> 
>>> On Jan 31, 2016, at 2:10 PM, Steven White <swhite4...@gmail.com> wrote:
>>> 
>>> Thank you all for your feedback.
>>> 
>>> This is code that I inherited and the example I gave is intended to
>>> demonstrate the memory leak, which based on YourKit is
>>> on java/util/LinkedHashMap$Entry.  In short, I'm getting core dumps with
>>> "Detail "java/lang/OutOfMemoryError" "Java heap space" received "
>>> 
>>> Here is a more detailed layout of the code.  This is a crawler that runs
>>> 24x7 without any recycle logic in place:
>>> 
>>>   init_data()
>>> 
>>>   while (true)
>>>   {
>>>       HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/core1/");    <<<< this is real code
>>> 
>>>       see_if_we_have_new_data();
>>> 
>>>       send_new_data_to_solr();
>>> 
>>>       client.close();    <<<< this is real code
>>> 
>>>       sleep_for_a_bit(N);    <<<< 'N' can be any positive int
>>>   }
>>> 
>>> By default, our Java program is given 4 GB of RAM ("-Xmx4g") and N is set
>>> to 5 min.  We had a customer set N to 10 seconds and we started seeing core
>>> dumps with OOM.  As I started to debug, I narrowed the OOM to
>>> HttpSolrClient per my original email.
>>> 
>>> The follow-up answers I got suggest that I move the construction of the
>>> HttpSolrClient object outside the while loop, which I did (but I also had
>>> to move "client.close()" outside the loop), and the leak is gone.
>>> 
>>> Given this, is this how HttpSolrClient is supposed to be used?  If so,
>>> what's the point of HttpSolrClient.close()?
>>> 
>>> Another side question.  I noticed HttpSolrClient has a setBaseUrl().  Now,
>>> if I call it and give it "http://localhost:8983/solr/core1/" (notice the
>>> "/" at the end), the next time I use HttpSolrClient to send Solr data, I get
>>> back a 404.  The fix is to remove the ending "/".  This is not how the
>>> constructor of HttpSolrClient behaves; the constructor will take the URL
>>> with or without "/".
>>> 
>>> In summary, it would be good if someone can confirm if we have a memory
>>> leak in HttpSolrClient if used per my example; if so this is a defect.
>>> Also, can someone confirm the fix I used for this issue: move the
>>> constructor of HttpSolrClient outside the loop and reuse the existing
>>> object "client".
>>> 
>>> Again, thank you all for the quick response; it is much appreciated.
>>> 
>>> Steve
>>> 
>>> 
>>> 
>>> On Sat, Jan 30, 2016 at 1:24 PM, Erick Erickson <erickerick...@gmail.com>
>>> wrote:
>>> 
>>>> Assuming you're not really using code like above and it's a test case....
>>>> 
>>>> What's your evidence that memory consumption goes up? Are you sure
>>>> you're not just seeing uncollected garbage?
>>>> 
>>>> When I attached Java Mission Control to this program it looked pretty
>>>> scary at first, but the heap allocated after old generation garbage
>>>> collections leveled out to a steady state.
>>>> 
>>>> 
>>>> On Sat, Jan 30, 2016 at 9:29 AM, Walter Underwood <wun...@wunderwood.org>
>>>> wrote:
>>>>> Create one HttpSolrClient object for each Solr server you are talking
>>>>> to. Reuse it for all requests to that Solr server.
>>>>> 
>>>>> It will manage a pool of connections and keep them alive for faster
>>>>> communication.
>>>>> 
>>>>> I took a look at the JavaDoc and the wiki doc; neither one explains this
>>>>> well. I don’t think they even point out what is thread safe.
>>>>> 
>>>>> wunder
>>>>> Walter Underwood
>>>>> wun...@wunderwood.org
>>>>> http://observer.wunderwood.org/  (my blog)
>>>>> 
>>>>> 
>>>>>> On Jan 30, 2016, at 7:42 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
>>>>>> 
>>>>>> Hi Steve,
>>>>>> 
>>>>>> Can you please elaborate on what error you are getting? I didn't
>>>>>> understand your code above, in particular why the Solr client object is
>>>>>> created in a loop. In general, creating the client instance should
>>>>>> happen outside the loop, as a one-time activity for the whole execution
>>>>>> of the program.
>>>>>> 
>>>>>> Thanks,
>>>>>> Susheel
>>>>>> 
>>>>>> On Sat, Jan 30, 2016 at 8:15 AM, Steven White <swhite4...@gmail.com> wrote:
>>>>>> 
>>>>>>> Hi folks,
>>>>>>> 
>>>>>>> I'm getting a memory leak in my code.  I narrowed the code down to the
>>>>>>> following minimal example that causes the leak.
>>>>>>> 
>>>>>>>  while (true) {
>>>>>>>      HttpSolrClient client = new HttpSolrClient("http://192.168.202.129:8983/solr/core1");
>>>>>>>      client.close();
>>>>>>>  }
>>>>>>> 
>>>>>>> Is this a defect or an issue in the way I'm using HttpSolrClient?
>>>>>>> 
>>>>>>> I'm on Solr 5.2.1
>>>>>>> 
>>>>>>> Thanks.
>>>>>>> 
>>>>>>> Steve
>>>>>>> 
>>>>> 
>>>> 
>> 
>> 
