Andy,

What are the QTimes for the 0fq, 1fq, 2fq, 3fq & 4fq cases with spellcheck 
entirely turned off?  Is it about (or a little more than) half the total when 
maxCollationTries=1?  Also, with the varying # of fq's, how many collation 
tries does it take to get 10 collations?

Possibly, a better way to test this is to set maxCollations = 
maxCollationTries.  The reason is that it quits "trying" once it finds 
"maxCollations" collations.  So with 0 fq's, lots of combinations can generate 
hits and it doesn't need to try very many to get to 10.  But with more fq's, 
fewer collations pan out, so it keeps trying, up to 100 times, before (if 
ever) it gets to 10.
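For instance (same parameter names as in your query below; the value 5 is just an arbitrary example):

```
spellcheck.maxCollations=5&
spellcheck.maxCollationTries=5&
```

With the two set equal, the early-exit on reaching maxCollations can no longer cut the run short, so every request performs the same number of tries regardless of how many fq's are present.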

I would predict that qtime will grow linearly with each "try" it has to do 
(and you can force the number of tries by setting maxCollations = 
maxCollationTries).  (I'm assuming you have all non-search components like 
faceting turned off.)  So say with 2 fq's it takes 10ms for the query to 
complete with spellcheck off, and 20ms with "maxCollations = 
maxCollationTries = 1", then it will take about 110ms with "maxCollations = 
maxCollationTries = 10".
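In other words, a back-of-the-envelope sketch of that linear model (illustrative Python, not Solr code; the function name is made up):

```python
def predicted_qtime(base_ms, one_try_ms, n_tries):
    """Linear model: the cost of one collation try is the difference
    between a 1-try query and a spellcheck-off query."""
    per_try_ms = one_try_ms - base_ms
    return base_ms + n_tries * per_try_ms

# 2 fq's: 10ms with spellcheck off, 20ms at maxCollations = maxCollationTries = 1
print(predicted_qtime(10, 20, 10))  # 110
```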

Now if you find that with a certain # of fq's, qtime with spellcheck off 
is, for instance, 2ms, 1 try is 10ms, 2 tries is 19ms, etc., then this is more 
than linear growth.  In that case we would need to look at how spellcheck 
applies fq's and see if there is a bug in how it uses the cache.
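A quick way to check for that from measured QTimes (an illustrative helper, not part of Solr):

```python
def per_try_increments(qtimes_by_tries):
    """qtimes_by_tries: (n_tries, qtime_ms) pairs sorted by n_tries;
    treat the spellcheck-off measurement as 0 tries.  Returns the
    marginal cost of each additional try: roughly constant values mean
    linear growth, values that keep rising mean more-than-linear."""
    return [(q1 - q0) / (t1 - t0)
            for (t0, q0), (t1, q1) in zip(qtimes_by_tries, qtimes_by_tries[1:])]

# The example above: 2ms off, 10ms at 1 try, 19ms at 2 tries
print(per_try_increments([(0, 2), (1, 10), (2, 19)]))  # [8.0, 9.0]
```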

But I think you're just setting maxCollationTries too high.  You're asking it 
to do too much work in trying tens of combinations.  Really, this feature was 
designed for spellchecking, not suggestions.  But see 
https://issues.apache.org/jira/browse/SOLR-3240 , which is committed to the 4x 
branch for inclusion in an eventual 4.4 release.  This will make the time for 
collation tries grow less than linearly, possibly making it more suitable for 
suggest.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Andy Lester [mailto:a...@petdance.com] 
Sent: Tuesday, May 28, 2013 2:29 PM
To: solr-user@lucene.apache.org
Subject: Why do FQs make my spelling suggestions so slow?

I'm working on using spellcheck for giving suggestions, and collations
are giving me good results, but they turn out to be very slow if
my original query has any FQs in it.  We can do 100 maxCollationTries
in no time at all, but if there are FQs in the query, things slow
down, and as maxCollationTries and the number of FQs increase,
they get very slow very quickly.

         1    10    20    50   100 MaxCollationTries
0FQs     8     9    10    11    10
1FQ     11   160   599  1597  1668
2FQs    20   346  1163  3360  3361
3FQs    29   474  1852  5039  5095
4FQs    36   589  2463  6797  6807

All times are QTimes of ms.

See that top row?  With no FQs, 50 MaxCollationTries comes back
instantly.  Add just one FQ, though, and things go bad, and they
get worse as I add more of the FQs.  Also note that things seem to
level off at 100 MaxCollationTries.
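To quantify that, here's the table above run through a quick script (just my numbers, not part of the test setup) computing the marginal QTime cost per additional try; the 100-try column is excluded since times have leveled off there:

```python
tries = [1, 10, 20, 50]
table = {
    0: [8, 9, 10, 11],
    1: [11, 160, 599, 1597],
    2: [20, 346, 1163, 3360],
    3: [29, 474, 1852, 5039],
    4: [36, 589, 2463, 6797],
}

def per_try_cost(row):
    """Marginal ms per extra try between adjacent columns of one row."""
    return [round((q1 - q0) / (t1 - t0), 1)
            for t0, q0, t1, q1 in zip(tries, row, tries[1:], row[1:])]

for n_fq, row in table.items():
    print(f"{n_fq} FQs: {per_try_cost(row)} ms/try")
```

Each extra FQ adds roughly a fixed amount to the per-try cost, while with 0 FQs a try is essentially free.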

Here's a query that I've been using as a test:

df=title_tracings_t&
fl=flrid,nodeid,title_tracings_t&
q=bagdad+AND+diaries+AND+-parent_tracings:(bagdad+AND+diaries)&
spellcheck.q=bagdad+AND+diaries&
rows=4&
wt=xml&
sort=popular_score+desc,+grouping+asc,+copyrightyear+desc,+flrid+asc&
spellcheck=true&
spellcheck.dictionary=direct&
spellcheck.onlyMorePopular=false&
spellcheck.count=15&
spellcheck.extendedResults=false&
spellcheck.collate=true&
spellcheck.maxCollations=10&
spellcheck.maxCollationTries=50&
spellcheck.collateExtendedResults=true&
spellcheck.alternativeTermCount=5&
spellcheck.maxResultsForSuggest=10&
debugQuery=off&
fq=((grouping:"1"+OR+grouping:"2"+OR+grouping:"3")+OR+solrtype:"N")&
fq=((item_source:"F"+OR+item_source:"B"+OR+item_source:"M")+OR+solrtype:"N")&
fq={!tag%3Dgrouping}((grouping:"1"+OR+grouping:"2")+OR+solrtype:"N")&
fq={!tag%3Dlanguagecode}(languagecode:"eng"+OR+solrtype:"N")&

The only thing that changes between tests is the value of
spellcheck.maxCollationTries and how many FQs are at the end.

Am I doing something wrong?  Do the collation internals not handle
FQs correctly?  The lookup/hit counts on filterCache seem to be
increasing just fine.  It will do N lookups, N hits, so I don't
think caching is the problem.

We'd really like to be able to use the spellchecker, but the results
with only 10-20 maxCollationTries aren't nearly as good as when we
bump that up to 100, and we can't afford the slow response time.
We also can't do without the FQs.

Thanks,
Andy


--
Andy Lester => a...@petdance.com => www.petdance.com => AIM:petdance
