Hi Steve,

Thanks for your kind response. I tried PositionFilterFactory (and re-indexed as well), but that didn't solve the problem. Interestingly, the problem is not reproducible from Solr's Field Analysis page; it manifests only in an actual query.
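
To make the mismatch concrete, here is a rough Python sketch (not Solr or Lucene code) of what I think may be happening. It rests on several assumptions: a lowercase-and-split-on-whitespace step stands in for StandardTokenizer plus LowerCaseFilter; the helper names `shingles` and `phrase_matches` are made up for illustration; the "_" joiner simply mirrors how the shingles appear in parsedquery; and the pf clause is assumed to issue the whole input as a single phrase against title_1. If those assumptions hold, the longer query cannot match a field that holds only one shingle.

```python
def shingles(text, size=2, joiner="_"):
    """Lowercase, split on whitespace, and emit word n-grams of `size` tokens.

    Rough stand-in for the analysis chain with outputUnigrams="false".
    """
    tokens = text.lower().split()
    return [joiner.join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def phrase_matches(indexed, query):
    """True if the query shingles occur as one contiguous run in the indexed shingles.

    This models an exact phrase match (no slop); it is an assumption about
    how the pf clause is evaluated, not a copy of Lucene's PhraseQuery.
    """
    n = len(query)
    return n > 0 and any(
        indexed[i:i + n] == query for i in range(len(indexed) - n + 1)
    )


indexed_title_1 = shingles("Nina Simone")
short_query = shingles("Nina Simone")
long_query = shingles("Nina Simone I put")

print(indexed_title_1)                               # shingles stored for title_1
print(long_query)                                    # shingles sent in the query
print(phrase_matches(indexed_title_1, short_query))  # same phrase as indexed
print(phrase_matches(indexed_title_1, long_query))   # more terms than indexed
```

The Field Analysis page only shows which tokens are produced and highlights the ones that overlap, which may be why "Nina Simone I put" looks fine there while the actual query yields no score from title_1.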

I guess the subject of this post is not quite accurate: it is not that ShingleFilter is failing, but that, when using ShingleFilter, the shingle field contributes no score when I pass more terms than were indexed. I observe this using debugQuery.

I had actually already posted to solr-user but have received no response yet, probably because the problem is not clear at first glance. However, I included an example in that mail for anyone interested in trying it out and checking whether there is a problem. Let's see if I receive any response.

-Ethan

On Tue, Jul 13, 2010 at 9:15 PM, Steven A Rowe <sar...@syr.edu> wrote:
> Hi Ethan,
>
> You'll probably get better answers about Solr specific stuff on the
> solr-u...@a.l.o list.
>
> Check out PositionFilterFactory - it may address your issue:
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory
>
> Steve
>
>> -----Original Message-----
>> From: Ethan Collins [mailto:collins.eth...@gmail.com]
>> Sent: Tuesday, July 13, 2010 3:42 AM
>> To: java-u...@lucene.apache.org
>> Subject: ShingleFilter failing with more terms than index phrase
>>
>> I am using lucene 2.9.3 (via Solr 1.4.1) on windows and am trying to
>> understand ShingleFilter. I wrote the following code and find that if I
>> provide more words than the actual phrase indexed in the field, then the
>> search on that field fails (no score found with debugQuery=true).
>>
>> Here is an example to reproduce, with field names:
>> Id: 1
>> title_1: Nina Simone
>> title_2: I put a spell on you
>>
>> Query (dismax) with:
>> - “Nina Simone I put” <- Fails i.e. no score shown from title_1 search
>> (using debugQuery)
>> - “Nina Simone” <- SUCCESS
>>
>> But, when I used Solr’s Field Analysis with the ‘shingle’ field (given
>> below) and tried “Nina Simone I put”, it succeeds. It’s only during the
>> query that no score is provided. I also checked ‘parsedquery’ and it shows
>> disjunctionMaxQuery issuing the string “Nina_Simone Simone_I I_put” to the
>> title_1 field.
>>
>> title_1 and title_2 fields are of type ‘shingle’, defined as:
>>
>> <fieldType name="shingle" class="solr.TextField"
>>            positionIncrementGap="100" indexed="true" stored="true">
>>   <analyzer type="index">
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.ShingleFilterFactory"
>>             maxShingleSize="2" outputUnigrams="false"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.ShingleFilterFactory"
>>             maxShingleSize="2" outputUnigrams="false"/>
>>   </analyzer>
>> </fieldType>
>>
>> Note that I also have a catchall field which is text. I have qf set
>> to: 'id^2 catchall' and pf set to: 'title_1^1.5 title_2^1.2'
>>
>> If I am missing something or doing something wrong please let me know.
>>
>> -Ethan
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org