> Date: Wed, 31 Oct 2007 17:54:53 -0700
> Subject: Re: Phrase Query Performance Question
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
>
> "hurricane katrina" is a very expensive query against a collection
> focused on Hurricane Katrina. There will be many matches in many
> documents. If you want to measure worst-case, this is fine.
>
> I'd try other things, like:
>
> * ninth ward
> * Ray Nagin
> * Audubon Park
> * Canal Street
> * French Quarter
> * FEMA mistakes
> * storm surge
> * Jackson Square
>
> Of course, real query logs are the only real test.
>
> wunder

These terms are not frequent in my index, so I believe they are going to be fast.
The thing is that I feel 2 million documents is a small index, and
100,000 or 200,000 hits is a small result set that should always get sub-second
query performance. Right now I am querying only one field and the
response takes almost one second. I don't think I can achieve sub-second
performance once I add a bit more complexity to the query.
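
To compare several queries rather than a single one, a small timing harness could be used. This is only a sketch under my own assumptions: `solr_select_url`, `time_queries`, and the base URL are names made up for illustration, and the URL builder assumes the usual `q` and `rows` request parameters on the select handler. The actual HTTP call is left to the caller.

```python
import statistics
import time
from urllib.parse import urlencode


def solr_select_url(base_url, phrase):
    """Build a select URL for a quoted phrase query (base_url is hypothetical)."""
    # rows=0 asks only for the hit count, so document fetching is not timed.
    params = urlencode({"q": '"%s"' % phrase, "rows": 0})
    return "%s/select?%s" % (base_url, params)


def time_queries(queries, run_query, repeats=5):
    """Run each query `repeats` times and return {query: median latency in ms}."""
    medians = {}
    for query in queries:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(query)
            samples.append((time.perf_counter() - start) * 1000.0)
        medians[query] = statistics.median(samples)
    return medians


if __name__ == "__main__":
    # Stand-in for a real HTTP request so the sketch runs without a server.
    def fake_run(query):
        time.sleep(0.001)

    for query, ms in time_queries(["auto repair", "ninth ward"], fake_run).items():
        print("%-12s %.1f ms" % (query, ms))
```

In real use, `run_query` would fetch `solr_select_url(...)` for each phrase. The median is reported instead of the mean so that one slow first (uncached) request does not dominate the result.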

Many of the category terms in my index appear in more than 5% of the
documents, and those category terms are very popular search
terms. So the examples I gave were not extreme cases for my index.
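
To make the numbers concrete, here is a rough sketch of the arithmetic. It assumes query cost scales linearly with index size and hit count, which is my own guess and may well be wrong; the function names are made up for illustration, and the reference figures are the ones quoted from the archive.org example further down in this thread.

```python
def docs_matching(total_docs, fraction):
    """Number of documents containing a term that appears in `fraction` of them."""
    return round(total_docs * fraction)


def linear_estimate_ms(reference_ms, size_ratio):
    """Scale a reference latency by relative index size (assumes linear cost)."""
    return reference_ms * size_ratio


my_index_docs = 2_000_000
# A term present in 5% of a 2 million document index.
print(docs_matching(my_index_docs, 0.05))

reference_ms = 1350.0   # 1.35 s reported for "hurricane katrina" on 10M documents
estimate_ms = linear_estimate_ms(reference_ms, 1 / 5)   # my index is about one fifth
observed_ms = 937.0     # what I measured for "auto repair"
print("estimate %.0f ms, observed %.0f ms, ratio %.1fx"
      % (estimate_ms, observed_ms, observed_ms / estimate_ms))
```

Under that assumption a 5% term matches 100,000 documents, and the scaled estimate is about 270 ms, which is where the "less than 300 ms" expectation comes from.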

When I started Tomcat I saw this message:
The Apache Tomcat Native library which allows optimal performance in production
environments was not found on the java.library.path
Does that mean query performance will be better if I use the Apache Tomcat
Native library? Does anyone have experience with that?

Thanks a lot
-Haishan

> On 10/31/07 3:25 PM, "Mike Klaas" <[EMAIL PROTECTED]> wrote:
>
> > On 31-Oct-07, at 2:40 PM, Haishan Chen wrote:
> >
> >> http://mail-archives.apache.org/mod_mbox/lucene-java-user/200512.mbox/[EMAIL PROTECTED]
> >> It mentioned that http://websearch.archive.org/katrina/ (in nutch)
> >> had 10M documents and a search of "hurricane katrina" was able to
> >> return in 1.35 seconds with 600,867 hits. Although the computer
> >> it was using might be more powerful than mine, I feel 937ms for a
> >> phrase query on a single field is kind of slow. Nutch actually
> >> expands a search to more complex queries. My index and the number of
> >> hits on my query ("auto repair") are about one fifth of
> >> websearch.archive.org and its testing query. So I feel a reasonable
> >> performance for my query should be less than 300 ms. I am not sure
> >> if I am right on that logic.
> >
> > I'm not sure that it is reasonable, but I'm not sure that it isn't.
> > However, have you tried other queries? 937ms seems a little high,
> > even for phrase queries.
> >
> >> Anyway I will collect the statistics on linux first and try out
> >> other options.
> >
> > Have you tried using the performance enhancements present in solr-trunk?
> >
> > -Mike