> From: [EMAIL PROTECTED]
> Subject: Re: Phrase Query Performance Question
> Date: Tue, 30 Oct 2007 11:22:17 -0700
> To: solr-user@lucene.apache.org
>
> On 30-Oct-07, at 6:09 AM, Yonik Seeley wrote:
>
> > On 10/30/07, Haishan Chen <[EMAIL PROTECTED]> wrote:
> >> Thanks a lot for replying Yonik!
> >>
> >> I am running solr on a windows 2003 server (standard version).
> >> intel Xeon CPU 3.00GHz, with 4.00 GB RAM.
> >> The index is locate on Raid5 with 2 million documents. Is there
> >> any way to improve query performance without moving to more
> >> powerful computer?
> >>
> >> I understand that the query performances of phrase query ("auto
> >> repair") has to do with the number of documents containing the two
> >> words. In fact the number of documents that have auto and repair
> >> are about 100000. It is like 5% of the documents containing auto
> >> and repair. It seems to me 937 ms is too slower.
> >
> > Chen, that does seem slow.... I'm not sure why.
> > 1) was this the first search on the index? if so, try running some
> > other searches to warm things up first.
>
> Indeed--phrase matching uses a completely different part of the
> index, so that needs to be warmed too.
>
> One thing to try is solr trunk: it contains some speedups for phrase
> queries (though perhaps not as substantial as you hope for).
>
> -Mike

Thanks for replying.
The statistics I collected were not from the first query, and I believe I was
running the JVM in server mode.
I configured Tomcat to use the server version of JVM.dll. I guess that is the
way to set it on Windows.
I executed the same phrase query ("auto repair") over and over again, and 937 ms
is the best performance I observed.
Also, when I ran the test I disabled all Solr caches, because I wanted to see
the performance without them.
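
For reference, the measurement described above (repeat the same phrase query
and keep the fastest run) can be sketched roughly as follows. This is only an
illustration, not the script actually used: the helper names `build_select_url`
and `best_of`, the run count, and the URL in the comment are all assumptions.

```python
import time
import urllib.parse


def build_select_url(base_url, phrase):
    """Build a select URL for an exact phrase query (hypothetical helper).

    The phrase is wrapped in double quotes so it is parsed as a phrase query.
    """
    params = urllib.parse.urlencode({"q": '"%s"' % phrase})
    return base_url + "?" + params


def best_of(run_once, runs=20):
    """Call run_once() `runs` times and return the fastest wall time in ms."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best


# Example (needs a running Solr instance; the base URL is an assumption):
#   import urllib.request
#   url = build_select_url("http://localhost:8983/solr/select", "auto repair")
#   print(best_of(lambda: urllib.request.urlopen(url).read()))
```

Note that timing the HTTP request from the client includes network and
response-writing overhead, so it will read somewhat higher than the query time
reported by the server itself.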
I am currently trying to test the index on a Linux system with similar hardware.
It will take me some time to set up.
I read a discussion between Doug Cutting and Andrzej Bialecki about Lucene
performance:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200512.mbox/[EMAIL PROTECTED]
It mentioned that http://websearch.archive.org/katrina/ (in Nutch) had 10M
documents, and a search for "hurricane katrina" returned in 1.35 seconds with
600,867 hits. The computer it was running on might be more powerful than mine,
but I still feel 937 ms for a phrase query on a single field is rather slow,
especially since Nutch actually expands a search into more complex queries. My
index (2 million documents) and the number of hits for my query ("auto repair",
about 100,000) are roughly one fifth to one sixth of websearch.archive.org and
its test query. So I feel a reasonable time for my query should be less than
300 ms (1.35 s / 5 is about 270 ms). I am not sure if my logic is right on that.
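
The back-of-the-envelope estimate above can be written out as a small
calculation. It assumes query time scales linearly with index size or with hit
count, which is only a working assumption and may not hold for phrase queries;
the function name `scaled_estimate_ms` is made up for this sketch.

```python
def scaled_estimate_ms(reference_ms, reference_size, my_size):
    """Scale a reference latency linearly by the ratio of sizes.

    Linear scaling is an assumption, not a measured property of Lucene.
    """
    return reference_ms * my_size / reference_size


# Reference: 1.35 s for "hurricane katrina" on 10M documents, 600,867 hits.
by_docs = scaled_estimate_ms(1350, 10_000_000, 2_000_000)  # scale by index size
by_hits = scaled_estimate_ms(1350, 600_867, 100_000)       # scale by hit count
print(round(by_docs), round(by_hits))
```

Scaling by document count gives about 270 ms and scaling by hit count gives
about 225 ms, both under the 300 ms figure and well below the observed 937 ms.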
Anyway, I will collect the statistics on Linux first and then try out other options.
Thanks a lot
Haishan