Re: Use case for the Shingle Filter

2017-03-06 Thread Ryan Yacyshyn
The query parser will split on whitespace. I'm not sure how I can use the shingle filter in my query, or what the use cases for it are. For example, if my fieldType looks like this: (fieldType definition not preserved in the archive) and I have a document that has "my babysitter is terrific" in the content_t field, a q
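
To illustrate the concern about whitespace splitting, here is a minimal Python sketch (not Solr code). It assumes a simplified shingle step that emits unigrams plus two-word shingles; the function names `shingle`, `analyze_whole` and `analyze_split` are hypothetical. When each whitespace-separated chunk is analyzed on its own, the shingle step only ever sees one token, so no shingle can be formed.

```python
def shingle(tokens, size=2, sep=" "):
    """Return the unigrams followed by all shingles of `size` adjacent tokens."""
    out = list(tokens)
    out += [sep.join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]
    return out


def analyze_whole(query):
    """Assumed behavior when the whole query string reaches the analyzer."""
    return shingle(query.lower().split())


def analyze_split(query):
    """Assumed behavior when the parser splits on whitespace first:
    each chunk is analyzed separately, so shingles never span chunks."""
    out = []
    for chunk in query.split():
        out.extend(shingle([chunk.lower()]))
    return out
```

With the query `my babysitter`, `analyze_whole` produces the two unigrams and the shingle `my babysitter`, while `analyze_split` produces only the two unigrams.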

Re: Use case for the Shingle Filter

2017-03-05 Thread Ryan Josal
I thought new versions of Solr didn't split on whitespace at the query parser anymore, so this should work? That being said, I think I remember it having a problem coming after a synonym filter. IIRC, if your input is "Foo Bar" and you have a synonym "foo <=> baz" you would get foobaz bazbar inst

RE: Use case for the Shingle Filter

2017-03-05 Thread Markus Jelsma
Hello - we use it for text classification and online near-duplicate document detection/filtering. Using shingles means you want to consider order in the text. It is analogous to using bigrams and trigrams when doing language detection: you cannot distinguish between Danish and Norwegian solely o
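
As a rough illustration of why shingles capture word order for near-duplicate detection, here is a minimal Python sketch. It is not the poster's pipeline: it assumes plain whitespace tokenization and Jaccard similarity over word-shingle sets, and the names `word_shingles` and `jaccard` are hypothetical.

```python
def word_shingles(text, size=2):
    """Return the set of shingles made of `size` adjacent lowercase words."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


doc_a = "the dog bit the man"
doc_b = "the man bit the dog"

# Single words ignore order, so the two texts look identical.
unigram_score = jaccard(word_shingles(doc_a, 1), word_shingles(doc_b, 1))

# Two-word shingles keep local order, so the texts no longer match fully.
bigram_score = jaccard(word_shingles(doc_a, 2), word_shingles(doc_b, 2))
```

Here the two texts share every word but not every adjacent word pair, so the shingle-based score drops below the single-word score.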

Re: Use case for the Shingle Filter

2017-03-04 Thread Walter Underwood
I use the shingle filter to help with the “one word or two” problem. Is it “baby sitter” or “babysitter”? With the shingle filter, searches for “babysitter” will work for content with “baby sitter”, but not the other way around. If you can identify a list of the one/two-word compounds that are
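
The asymmetry described above can be sketched in Python. This is a simplified model, not Solr's analysis chain: it assumes shingles are built only at index time, that adjacent tokens are joined with an empty separator, that unigrams are kept, and that every query term must match. The names `shingles`, `index_terms` and `matches` are hypothetical.

```python
def shingles(tokens, max_size=2, separator=""):
    """Emit every run of 1..max_size adjacent tokens, joined by `separator`."""
    out = []
    for i in range(len(tokens)):
        for n in range(1, max_size + 1):
            if i + n <= len(tokens):
                out.append(separator.join(tokens[i:i + n]))
    return out


def index_terms(text):
    """Terms stored for a document: unigrams plus glued two-word shingles."""
    return set(shingles(text.lower().split()))


def matches(query, text):
    """True if every whitespace-separated query term is an indexed term."""
    terms = index_terms(text)
    return all(term in terms for term in query.lower().split())
```

Under these assumptions, `matches("babysitter", "my baby sitter is terrific")` is true because the indexed shingle `babysitter` exists, while `matches("baby sitter", "my babysitter is terrific")` is false because neither `baby` nor `sitter` was ever indexed on its own.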