The index is 8 GB and I'm giving it 1.5 GB of RAM.

On Fri, Sep 11, 2009 at 5:09 PM, Lance Norskog <goks...@gmail.com> wrote:
> This sounds like a memory-handling problem. The JVM could be too
> small, forcing a lot of garbage collections during the first search.
> It could be too big and choke off the OS disk cache. It could be too
> big and cause paging.
>
> Does this search query include a sort command? Sorting creates a large
> data structure the first time, then caches it. For 12 million
> documents this should not take 50 seconds.
>
> How big are the index files? Not the number of documents, but the
> total size in gigabytes in solr/data/index.
>
> On Fri, Sep 11, 2009 at 10:21 AM, Jonathan Ariel <ionat...@gmail.com> wrote:
> > Ok thanks. If it's IO and the OS disk cache, what would my options be?
> > Changing the disk to a faster one?
> >
> > On Fri, Sep 11, 2009 at 1:32 PM, Yonik Seeley <yonik.see...@lucidimagination.com> wrote:
> >
> >> At the Lucene level there is the term index and the norms too:
> >>
> >> http://search.lucidimagination.com/search/document/b5eee1fc75cc454c/caching_in_lucene
> >>
> >> But 50s? That would seem to indicate it's the OS disk cache and you're
> >> waiting for IO. You should be able to confirm if you're IO bound by
> >> simply looking at the CPU utilization during this 50s query.
> >>
> >> -Yonik
> >> http://www.lucidimagination.com
> >>
> >> On Fri, Sep 11, 2009 at 8:59 AM, Jonathan Ariel <ionat...@gmail.com> wrote:
> >> > Yes, of course. But in my case I'm not using filter queries or facets.
> >> > It is a really simple query. Actually, the query params are like this:
> >> > ?q=location_country:1 AND category:377 AND location_state:"CA" and location_city:"Sacramento"
> >> >
> >> > location_country is an integer
> >> > category is an integer
> >> > location_state is a string
> >> > and location_city is a string
> >> >
> >> > As you can see, no filter query and no facets. And for this query, the
> >> > first time that I execute it, it takes almost 50s to run, while for the
> >> > following query:
> >> >
> >> > ?q=title_search:test
> >> >
> >> > title_search is a tokenized text field with a bunch of filters
> >> >
> >> > it takes a couple of ms.
> >> >
> >> > I'm always talking about executing these queries the first time after
> >> > restarting solr.
> >> >
> >> > I just want to understand the cause and be sure I won't have this
> >> > behaviour every time I commit or optimize.
> >> >
> >> > Jonathan
> >> >
> >> > On Fri, Sep 11, 2009 at 7:28 AM, Uri Boness <ubon...@gmail.com> wrote:
> >> >
> >> >> "Not having any facet" and "Not using a filter cache" are two different
> >> >> things. If you're not using query filters, you can still have facets
> >> >> calculated and returned as part of the search result. The facet component
> >> >> uses Lucene's field cache to retrieve values for the facet field.
> >> >>
> >> >> Jonathan Ariel wrote:
> >> >>
> >> >>> Yes, but in this case the query that I'm executing doesn't have any facet.
> >> >>> I mean, for this query I'm not using any filter cache. What does
> >> >>> "operating system cache can be significant" mean? That my first query
> >> >>> loads a big chunk of the index into memory (maybe even the entire index)?
> >> >>>
> >> >>> On Thu, Sep 10, 2009 at 10:07 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:
> >> >>>
> >> >>>> At 12M documents, operating system cache can be significant.
> >> >>>> Also, the first time you sort or facet on a field, a field cache
> >> >>>> instance is populated which can take a lot of time. You can prevent
> >> >>>> slow first queries by configuring a static warming query in
> >> >>>> solrconfig.xml that includes the common sorts and facets.
> >> >>>>
> >> >>>> -Yonik
> >> >>>> http://www.lucidimagination.com
> >> >>>>
> >> >>>> On Thu, Sep 10, 2009 at 8:55 PM, Jonathan Ariel <ionat...@gmail.com> wrote:
> >> >>>>
> >> >>>>> Hi! Why would the first query that I execute take almost 60 seconds
> >> >>>>> to run and after that no more than 50ms? I disabled all my caching to
> >> >>>>> check if it is the reason for the subsequent fast responses, but the
> >> >>>>> same happens. I'm using solr 1.3.
> >> >>>>> Something really strange is that it doesn't happen with all the queries.
> >> >>>>> It is happening with a query that filters some integer and string fields
> >> >>>>> joined by an AND operator. Something like A:1 AND B:2 AND (C:3 AND D:"CA")
> >> >>>>> (exact match).
> >> >>>>> My index is around 12,000,000 documents.
> >> >>>>>
> >> >>>>> Thanks,
> >> >>>>>
> >> >>>>> Jonathan
>
> --
> Lance Norskog
> goks...@gmail.com
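
If the slow first query really is the OS disk cache being cold, as Yonik suggests above, one way to test that theory is to read the index files once after a restart and then time the query again. The following is only a rough sketch under that assumption and is not a Solr feature: the function name `warm_page_cache` is made up for illustration, and it simply reads every file under a directory so the operating system has a chance to cache the pages. With an 8 GB index, whether it all stays cached depends on how much RAM is left over after the JVM heap.

```python
import os


def warm_page_cache(index_dir, chunk_size=1 << 20):
    """Read every file under index_dir once and return the bytes read.

    The data is discarded; the only purpose is to touch the files so the
    OS can keep their pages in its disk cache (if enough RAM is free).
    """
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total
```

It could be called as `warm_page_cache("solr/data/index")`, using the index location Lance mentions (adjust the path to the actual installation). If the first query is still slow after this, the disk cache is probably not the whole story.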