Re: Solr3.5 Vs Solr4.1 - Help please

Erick Erickson Tue, 05 Mar 2013 16:08:49 -0800

4.1 turns on stored field compression by default. Perhaps what you're
seeing here is the spike when you fetch your very large document and it
gets decompressed? Just a shot in the dark...


But you could test it by turning off compression...

That said, I shouldn't think that decompressing even a 6M field would take
all that long.

Best
Erick


On Mon, Mar 4, 2013 at 11:52 PM, Sandeep Mirchandani <skmi...@hotmail.com> wrote:

> I work with Aditya, so this continues from where Aditya left off.
>
> Here are some observations based on running a query on a particular
> unique id. The document corresponding to that unique id is fairly large:
> if we were to run a query without an fl list for it, the total size would
> be in the neighborhood of 6MB. However, we use an fl list to get a subset
> of the document. We use a script that calls the server with curl, run from
> a different box, for the same unique ids but with different fl lists. After
> the first few runs of the search (something like q=id:foo), we change the
> fl list to return some other fields, which produces a different, perhaps
> larger, set of fields than the first query for the same id.
>
> 1. The curl client blocks when the fl list changes. VisualVM shows 50% CPU
> utilization.
> 2. This spin continues until the result is returned to the curl client.
> 3. We see the same thing from a browser, which reproduces the problem and
> helps identify that the spin occurs after the server has finished
> searching for the document (we see an entry in the solr log file that
> contains the QTime for this query) and is now trying to return it. The
> browser waits until all the data is received and only then renders the
> page. So what is taking so long for the server to respond to the client?
> 4. Monitoring the sampler in VisualVM, you can see getFields() at the top
> of the list. Since it is at the top of the list, I believe it may be
> spinning there.
> 5. Restart the server running SOLR.
> 6. Run the same query from the browser first, and it returns in a couple
> of seconds.
> 7. Run the same curl script, and it sails through the query as well, with
> the server responding almost immediately.
> 8. Monitoring the sampler this time around, you _don't_ see CPU spinning
> on getFields().
> 9. I change the firstSearcher definition in solrconfig.xml to add the
> uniqueId in the q parameter, and restart.
> 10. This time the curl script runs well.
> 11. If the server is restarted again and we run the curl script with the
> blocking (spinning) query right at the top, the script sails through again.
>
> Just from these observations, it seems that SOLR 4.1 takes a wrong turn
> somewhere for large responses when it comes across the same query again
> with a different fl list. If the spinning query is pre-cached via the
> firstSearcher change in solrconfig.xml, or via the browser, or is run ahead
> of other queries for the same id, it works fine after the first run of the
> command. However, running it after the same search with a different fl
> list does trigger the problem. This did not happen with SOLR 3.5 and seems
> like a regression. The above is repeatable for us.
>
> Question: Why is this happening on SOLR 4.1? It seems the workaround for
> now may be to cache the queries for large documents in solrconfig.xml.
>
> We would appreciate hearing from others facing this issue, which would
> validate what we see as well. Thanks.
>
> Best regards,
> -- Sandeep
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr3-5-Vs-Solr4-1-Help-please-tp4043543p4044742.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
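
The reproduction described in the quoted message (same id, different fl
lists, comparing the QTime in the log against how long the client actually
waits) could be scripted roughly as follows. This is only a sketch under
assumptions: the function names, the base URL, the /select handler path and
the use of wt=json are illustrative and not taken from the original script,
which used curl.

```python
import json
import time
import urllib.parse
import urllib.request


def build_select_url(base_url, unique_id, fl_fields):
    """Build a query URL for one id with a given fl list (hypothetical helper)."""
    params = {
        "q": f"id:{unique_id}",
        "fl": ",".join(fl_fields),
        "wt": "json",  # assumption: JSON response writer is enabled
    }
    # Assumption: the default /select request handler is in use.
    return f"{base_url}/select?{urllib.parse.urlencode(params)}"


def overhead_ms(elapsed_seconds, response):
    """Wall-clock time minus the QTime Solr reports, in milliseconds.

    A large value means the time is spent after the search itself,
    for example while loading stored fields and writing the response.
    """
    return elapsed_seconds * 1000.0 - response["responseHeader"]["QTime"]


def compare_fl_lists(base_url, unique_id, fl_lists, timeout=300):
    """Query the same id once per fl list and report the timings."""
    results = []
    for fl_fields in fl_lists:
        url = build_select_url(base_url, unique_id, fl_fields)
        start = time.perf_counter()
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
        elapsed = time.perf_counter() - start
        response = json.loads(body)
        results.append({
            "fl": fl_fields,
            "bytes": len(body),
            "elapsed_ms": elapsed * 1000.0,
            "qtime_ms": response["responseHeader"]["QTime"],
            "overhead_ms": overhead_ms(elapsed, response),
        })
    return results
```

Calling compare_fl_lists with the server's base URL, one id and two or more
fl lists (all placeholders to be filled in for the actual index) would show
whether the wait grows when the fl list changes while QTime stays small,
which is the pattern reported above.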
