I'm not very familiar with shingles, but it seems to me that you should apply
ShingleFilter at index time and issue the query as a phrase query?
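
To make the terms concrete, here is a rough sketch in plain Python (not Lucene's
ShingleFilter, and the function names are made up for illustration) of how
whitespace tokens expand into shingles grouped by start position, and how those
groups could be flattened into the OR query of quoted phrases that you say you
expected. It assumes unigrams are kept and that a shingle is just the tokens
joined with a single space.

```python
def shingles_by_position(text, max_shingle_size=3):
    """Group word n-grams (shingles) by the token position they start at.

    Hypothetical helper for illustration only; it mimics the idea of
    shingling with unigrams kept, not Lucene's actual implementation.
    """
    tokens = text.split()  # whitespace tokenization
    grouped = []
    for start in range(len(tokens)):
        # A shingle may not run past the last token or exceed the size limit.
        end_limit = min(len(tokens), start + max_shingle_size)
        grouped.append(
            [" ".join(tokens[start:end]) for end in range(start + 1, end_limit + 1)]
        )
    return grouped


def as_or_query(grouped):
    """Flatten the groups into one query string, quoting multi-word shingles."""
    terms = []
    for group in grouped:
        for shingle in group:
            terms.append('"%s"' % shingle if " " in shingle else shingle)
    return " ".join(terms)


groups = shingles_by_position("abcd efgh ijkl")
print(groups)
print(as_or_query(groups))
```

The grouped form lines up with the parenthesised groups in the MultiPhraseQuery
you quoted (one group per token position), while the flattened form is the
query you wrote by hand.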

On Mon, Sep 8, 2008 at 1:00 PM, Mck <[EMAIL PROTECTED]> wrote:

> > So then I change type="string" to type="shingleString" along with
> > > [snip]
> > >       <analyzer type="query">
> > >         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >         <filter class="solr.ShingleFilterFactory" outputUnigrams="true"
> > >                 outputUnigramIfNoNgram="true" maxShingleSize="99" />
> > >       </analyzer>
>
> Debugging ShingleFilter I see that without quotes the shingles
> StringBuffer array consists of just the current token.
>
> When the query does have quotes the shingles array fills up with the
> expected shingles.
> And the Query (in fact a MultiPhraseQuery)
>  returned from SolrQueryParser.getFieldQuery()
>  looks like
>
> list_entry_shingle:"(abcd abcd efgh abcd efgh ijkl) (efgh efgh ijkl) ijkl"
>
> I'm struggling to make sense of this.
> How can the shingles be matched if they aren't quoted?
> Why put the parentheses () when the query has default operator OR?
>
> I would be expecting a Query instead like:
> abcd "abcd efgh" "abcd efgh ijkl" efgh "efgh ijkl" ijkl
>
> (This with the ShingleFilter disabled does indeed work perfectly).
>
> Am I barking up the wrong tree?
> Is there a way to get the shingles phrased?
>
> Otis, you mentioned this briefly on your reply on the dev list:
> > Make sure you turn them into phrase queries
>
> Did you mean here something more than just quoting the original query?
>
> ~mck
>
> --
> "Claiming Java is easier than C++ is like saying that K2 is shorter than
> Everest." Larry O'Brien
> | semb.wever.org | sesat.no | sesam.no |
>



-- 
Regards,
Shalin Shekhar Mangar.
