Hi Peter,

A few more details about your setup would help list members answer your 
questions.
How large is your index?  
How much memory is on the machine, and how much of it is allocated to the JVM?
Besides the Solr caches, Solr and Lucene depend on the operating system's disk 
cache for caching postings lists, so you need to leave some memory for the OS.  
On the other hand, if you are optimizing and refreshing every 10-15 minutes, 
that will invalidate all the caches, since an optimized index is essentially 
a set of new files.

Can you give us some examples of the slow queries?  Are you using stop words?  

If your slow queries are phrase queries, then you might try either adding the 
most frequent terms in your index to the stopwords list, or using CommonGrams 
and adding those terms to its common words list.  (Details on CommonGrams here: 
http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-2)
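
As a rough sketch of the CommonGrams option, a field type along these lines 
could go in schema.xml.  The type name "text_commongrams" and the file name 
"commonwords.txt" are placeholders I made up for illustration, and the 
tokenizer and lowercase filter are only an assumption about your analysis 
chain, so adapt them to whatever your "text" type does now:

```xml
<!-- Hypothetical field type; name and word-list file are placeholders. -->
<fieldType name="text_commongrams" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index time: emit bigrams for common words alongside the single terms. -->
    <filter class="solr.CommonGramsFilterFactory" words="commonwords.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Query time: use the bigrams in place of the common single terms. -->
    <filter class="solr.CommonGramsQueryFilterFactory" words="commonwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

commonwords.txt would hold one frequent term per line.  Your "txt" field 
would have to use this type, and since the index-time analysis changes, the 
documents would need to be reindexed before it has any effect.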

Tom Burton-West

-----Original Message-----
From: Peter Karich [mailto:peat...@yahoo.de] 
Sent: Tuesday, August 10, 2010 9:54 AM
To: solr-user@lucene.apache.org
Subject: Improve Query Time For Large Index

Hi,

I have 5 million small documents/tweets (=> ~3GB) and the slave index
replicates itself from the master every 10-15 minutes, so the index is
optimized before querying. We are using Solr 1.4.1 (patched with
SOLR-1624) via SolrJ.

Search is slow (>2s) for common terms that hit more than 2 million
docs, and acceptable (<0.5s) for others. Those numbers are without
highlighting or facets. I am using the following schema [1], and from
the Luke handler I know that numTerms is about 20 million. The query for
common terms stays slow even if I retry it again and again (no cache
improvements).

How can I improve the query time for the common terms without using
Distributed Search [2] ?

Regards,
Peter.


[1]
<field name="id" type="tlong" indexed="true" stored="true" required="true" />
<field name="date" type="tdate" indexed="true" stored="true" />
<!-- term* attributes to prepare faster highlighting. -->
<field name="txt" type="text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true" />

[2]
http://wiki.apache.org/solr/DistributedSearch
