This sounds like a memory-handling problem. The JVM heap could be too
small, forcing a lot of garbage collections during the first search.
Or it could be too big, either starving the OS disk cache of memory or
pushing the machine into paging.
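
To make the "too big and choke off the OS disk cache" case concrete, here
is a back-of-the-envelope sketch in Python. The function name and the
example numbers (8 GB of RAM, 6 GB heap, 5 GB index) are hypothetical and
not taken from this thread, and the estimate is only an upper bound since
the OS and other processes also need memory.

```python
def cacheable_fraction(total_ram_gb, jvm_heap_gb, index_size_gb, other_gb=0.0):
    """Rough upper bound on the share of the index the OS disk cache can hold.

    All arguments are in gigabytes. `other_gb` is memory used by anything
    else on the box (an assumption you have to supply yourself).
    """
    if index_size_gb <= 0:
        raise ValueError("index_size_gb must be positive")
    # Whatever the JVM heap and other processes do not claim is, at best,
    # available to the OS for caching index files.
    free_for_cache = max(total_ram_gb - jvm_heap_gb - other_gb, 0.0)
    return min(free_for_cache / index_size_gb, 1.0)


# Hypothetical box: 8 GB RAM, 6 GB heap, 5 GB index.
# At most 2 GB is left for the disk cache, so under half the index fits.
print(cacheable_fraction(8, 6, 5))
```

If the result is well below 1.0, shrinking the heap or adding RAM is
likely to help more than a faster disk.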

Does this search query include a sort parameter? The first time you sort
on a field, a large data structure (the field cache) is built and then
cached. Even so, for 12 million documents this should not take 50 seconds.

How big are the index files? Not the number of documents, but the
total size in gigabytes in solr/data/index.
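
If it helps, here is a small Python sketch for getting that number. It
just walks the directory and adds up the file sizes; the helper names are
my own, and the path in the usage comment assumes you run it from the
directory that contains solr/.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under `path`, including subdirectories."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip anything that is not a regular file (e.g. a broken symlink).
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def bytes_to_gb(num_bytes):
    """Convert bytes to gigabytes, counting 1 GB as 1024**3 bytes."""
    return num_bytes / float(1024 ** 3)


# Usage (path is an assumption about your layout):
#   print("%.2f GB" % bytes_to_gb(dir_size_bytes("solr/data/index")))
```

On Linux, `du -sh solr/data/index` reports much the same figure.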

On Fri, Sep 11, 2009 at 10:21 AM, Jonathan Ariel <ionat...@gmail.com> wrote:
> OK, thanks. If it's IO and the OS disk cache, what would my options be?
> Changing the disk to a faster one?
>
> On Fri, Sep 11, 2009 at 1:32 PM, Yonik Seeley <
> yonik.see...@lucidimagination.com> wrote:
>
>> At the Lucene level there is the term index and the norms too:
>>
>> http://search.lucidimagination.com/search/document/b5eee1fc75cc454c/caching_in_lucene
>>
>> But 50s? That would seem to indicate it's the OS disk cache and you're
>> waiting for IO.  You should be able to confirm if you're IO bound by
>> simply looking at the CPU utilization during this 50s query.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>>
>>
>> On Fri, Sep 11, 2009 at 8:59 AM, Jonathan Ariel <ionat...@gmail.com>
>> wrote:
>> > yes of course. but in my case I'm not using filter queries nor facets.
>> > it is a really simple query. actually the query params are like this:
>> > ?q=location_country:1 AND category:377 AND location_state:"CA" and
>> > location_city:"Sacramento"
>> >
>> > location_country is an integer
>> > category is an integer
>> > location_state is a string
>> > and location_city is a string
>> >
>> > as you can see no filter query and no facets. and for this query the
>> > first time that I execute it it takes almost 50s to run, while for the
>> > following query:
>> >
>> > ?q=title_search:test
>> >
>> > title_search is a tokenized text field with a bunch of filters
>> >
>> > it takes a couple of ms
>> >
>> > I'm always talking about executing these queries the first time after
>> > restarting solr.
>> >
>> > I just want to understand the cause and be sure I won't have this
>> > behaviour every time I commit or optimize.
>> >
>> > Jonathan
>> >
>> > On Fri, Sep 11, 2009 at 7:28 AM, Uri Boness <ubon...@gmail.com> wrote:
>> >
>> >> "Not having any facet" and "Not using a filter cache" are two different
>> >> things. If you're not using query filters, you can still have facets
>> >> calculated and returned as part of the search result. The facet
>> >> component uses Lucene's field cache to retrieve values for the facet
>> >> field.
>> >>
>> >>
>> >> Jonathan Ariel wrote:
>> >>
>> >>> Yes, but in this case the query that I'm executing doesn't have any
>> >>> facet. I mean, for this query I'm not using any filter cache. What
>> >>> does "operating system cache can be significant" mean? That my first
>> >>> query loads a big chunk of the index into memory (maybe even the
>> >>> entire index)?
>> >>>
>> >>> On Thu, Sep 10, 2009 at 10:07 PM, Yonik Seeley
>> >>> <yo...@lucidimagination.com> wrote:
>> >>>
>> >>>
>> >>>
>> >>>> At 12M documents, operating system cache can be significant.
>> >>>> Also, the first time you sort or facet on a field, a field cache
>> >>>> instance is populated which can take a lot of time.  You can prevent
>> >>>> slow first queries by configuring a static warming query in
>> >>>> solrconfig.xml that includes the common sorts and facets.
>> >>>>
>> >>>> -Yonik
>> >>>> http://www.lucidimagination.com
>> >>>>
>> >>>> On Thu, Sep 10, 2009 at 8:55 PM, Jonathan Ariel <ionat...@gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>
>> >>>>> Hi! Why would the first query that I execute take almost 60 seconds
>> >>>>> to run, and after that no more than 50ms? I disabled all my caching
>> >>>>> to check if it is the reason for the subsequent fast responses, but
>> >>>>> the same happens. I'm using Solr 1.3.
>> >>>>> Something really strange is that it doesn't happen with all the
>> >>>>> queries. It is happening with a query that filters some integer and
>> >>>>> string fields joined by an AND operator. Something like
>> >>>>> A:1 AND B:2 AND (C:3 AND D:"CA") (exact match).
>> >>>>> My index is around 12,000,000 documents.
>> >>>>>
>> >>>>> Thanks,
>> >>>>>
>> >>>>> Jonathan
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>>
>> >>
>> >
>>
>



-- 
Lance Norskog
goks...@gmail.com
