RE: CoreAdmin STATUS performance

Shahar Davidson Thu, 10 Jan 2013 01:10:51 -0800

Thanks Shawn.

As for your first question, the core info needs to be gathered on every 
search request because cores are created dynamically.
When a user initiates a search request, the system must be aware of all 
available cores in order to execute a distributed search on _all_ relevant 
cores. (The user must get reliable, up-to-date data.)
The reason 800ms seems like a lot to me is that the overall execution time 
is about 2500ms, and a large part of that is due to the STATUS request.

The "minimum interval" concept is a good idea and indeed we've considered it, 
yet it poses a slight problem when building a real-time system that needs to 
return the most up-to-date data.
I am just trying to understand whether there's some other way to speed up the 
STATUS reply (for example, by asking the STATUS request to return just certain 
core attributes, such as the name, instead of collecting everything).

Thanks,

Shahar.

-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Thursday, January 10, 2013 2:52 AM
To: solr-user@lucene.apache.org
Subject: Re: CoreAdmin STATUS performance

On 1/9/2013 8:38 AM, Shahar Davidson wrote:
> I have a client app that uses SolrJ and which requires to collect the names 
> (and just the names) of all loaded cores.
> I have about 380 Solr Cores on a single Solr server (net indices size is 
> about 220GB).
>
> Running the STATUS action takes about 800ms - that seems a bit too long, 
> given my requirements.
>
> So here are my questions:
> 1) Is there any way to get _only_ the core Name of all cores?
> 2) Why does the STATUS request take such a long time and is there a way to 
> improve its performance?

I'm curious why 800 milliseconds isn't fast enough.  How often do you actually 
need to gather this information?

If you are incorporating it into something that will get accessed a lot (such 
as a status servlet page), put a "minimum interval" capability into the part of 
the program that contacts Solr.  If it's been less than that minimum interval 
(5-10 seconds could be a recommended starting point) since the last time the 
information was gathered, just use the previously stored response rather than 
make a new request.
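
A minimal sketch of that "minimum interval" idea, in plain Java so it does not depend on any particular SolrJ version. The class name MinIntervalCache, the loader argument, and the injectable clock are all hypothetical names chosen for illustration; the loader stands in for whatever code actually sends the STATUS request and extracts the core names.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper: returns a stored value and only calls the loader
 * again once at least minIntervalMillis has elapsed since the last load.
 */
final class MinIntervalCache<T> {
    private final Supplier<T> loader;       // assumed: wraps the real STATUS request
    private final long minIntervalMillis;   // e.g. 5000 to 10000
    private final LongSupplier clock;       // millisecond clock, injectable for tests

    private T cached;
    private long lastLoadMillis;
    private boolean loaded = false;

    MinIntervalCache(Supplier<T> loader, long minIntervalMillis, LongSupplier clock) {
        this.loader = loader;
        this.minIntervalMillis = minIntervalMillis;
        this.clock = clock;
    }

    /** Convenience constructor using a monotonic system clock. */
    MinIntervalCache(Supplier<T> loader, long minIntervalMillis) {
        this(loader, minIntervalMillis, () -> System.nanoTime() / 1_000_000L);
    }

    /** Returns the stored value, reloading it first if it is too old. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - lastLoadMillis >= minIntervalMillis) {
            cached = loader.get();
            lastLoadMillis = now;
            loaded = true;
        }
        return cached;
    }
}
```

The caller would build one instance around its own STATUS-fetching code and call get() wherever it currently contacts Solr. The trade-off is that a core created within the interval is not seen until the next reload.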

I have used this approach in a homegrown status servlet written with SolrJ.  I 
have been trying to come up with a way to generalize the paradigm so it can be 
incorporated directly into a future SolrJ version.

Thanks,
Shawn
