Re: Use case for the Shingle Filter

2017-03-06 Thread Ryan Yacyshyn
The query parser will split on whitespace. I'm not sure how I can use the shingle filter in my query, and use-cases for it. For example, if my fieldType looks like this: ** and I have a document that has "my babysitter is terrific" in the con

Re: Use case for the Shingle Filter

2017-03-05 Thread Ryan Josal
:Ryan Yacyshyn > > Sent: Sunday 5th March 2017 5:57 > > To: solr-user@lucene.apache.org > > Subject: Use case for the Shingle Filter > > > > Hi everyone, > > > > I was thinking of using the Shingle Filter to help solve an issue I'm > > facing. I can see this wo

RE: Use case for the Shingle Filter

2017-03-05 Thread Markus Jelsma
on single characters. Markus -Original message- > From:Ryan Yacyshyn > Sent: Sunday 5th March 2017 5:57 > To: solr-user@lucene.apache.org > Subject: Use case for the Shingle Filter > > Hi everyone, > > I was thinking of using the Shingle Filter to help sol

Re: Use case for the Shingle Filter

2017-03-04 Thread Walter Underwood
I use the shingle filter to help with the “one word or two” problem. Is it “baby sitter” or “babysitter”? With the shingle filter, searches for “babysitter” will work for content with “baby sitter”, but not the other way around. If you can identify a list of the one/two-word compounds that are
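The asymmetry described above can be sketched in a few lines. This is only an illustration, assuming shingles are built at index time by joining adjacent words with an empty separator, and that the query side is not shingled. The function names `analyze_with_shingles` and `all_query_words_match` are hypothetical and not part of Solr or Lucene.

```python
def analyze_with_shingles(text):
    """Lowercase, split on whitespace, emit unigrams plus joined word pairs."""
    tokens = text.lower().split()
    # Keep the single words so ordinary one-word matches still work.
    terms = set(tokens)
    # Join each adjacent pair with no separator: "baby sitter" -> "babysitter".
    terms.update(a + b for a, b in zip(tokens, tokens[1:]))
    return terms


def all_query_words_match(query, document_text):
    """True if every whitespace-separated query word is an indexed term."""
    indexed = analyze_with_shingles(document_text)
    return all(word in indexed for word in query.lower().split())


# A one-word query finds the two-word content...
print(all_query_words_match("babysitter", "my baby sitter is terrific"))
# ...but a two-word query does not find the one-word content.
print(all_query_words_match("baby sitter", "my babysitter is terrific"))
```

The second call fails because nothing in this sketch splits "babysitter" back into "baby" and "sitter"; shingling only ever joins words.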

Use case for the Shingle Filter

2017-03-04 Thread Ryan Yacyshyn
Hi everyone, I was thinking of using the Shingle Filter to help solve an issue I'm facing. I can see this working in the analysis panel in the Solr admin, but not when I make my queries. I found out it's because of the query parser splitting up the tokens on whitespace before passing
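A rough sketch of why the analysis panel and a real query can disagree, assuming the behaviour described in this thread: the query parser splits the query on whitespace first and analyzes each piece separately. The helper names (`shingle`, `analyze`, `parse_then_analyze`) are made up for illustration and do not correspond to Solr classes.

```python
def shingle(tokens, size=2, sep=" "):
    """Return every run of `size` adjacent tokens joined by `sep`."""
    return [sep.join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def analyze(text):
    """Stand-in for an analysis chain: tokenize, then add two-word shingles."""
    tokens = text.lower().split()
    return tokens + shingle(tokens)


def parse_then_analyze(query):
    """Stand-in for a parser that splits on whitespace before analysis."""
    terms = []
    for chunk in query.split():
        # Each chunk is a single word, so the shingle step has no pair to join.
        terms.extend(analyze(chunk))
    return terms


# What the analysis panel shows: the whole string goes through the chain.
print(analyze("baby sitter"))             # ['baby', 'sitter', 'baby sitter']
# What the query sees: one word at a time, so no shingle is produced.
print(parse_then_analyze("baby sitter"))  # ['baby', 'sitter']
```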

Re: Configure Shingle Filter to ignore ngrams made of tokens with same start and end

2013-05-03 Thread Steve Rowe
An issue exists for this problem: https://issues.apache.org/jira/browse/LUCENE-3475 On May 3, 2013, at 11:00 AM, Walter Underwood wrote: > The shingle filter should respect positions. If it doesn't, that is worth > filing a bug so we know about it. > > wunder > > On M

Re: Configure Shingle Filter to ignore ngrams made of tokens with same start and end

2013-05-03 Thread Walter Underwood
The shingle filter should respect positions. If it doesn't, that is worth filing a bug so we know about it. wunder On May 3, 2013, at 10:50 AM, Jack Krupansky wrote: > In short, no. I don't think you want to use the shingle filter on a token > stream that has multiple to

Re: Configure Shingle Filter to ignore ngrams made of tokens with same start and end

2013-05-03 Thread Jack Krupansky
In short, no. I don't think you want to use the shingle filter on a token stream that has multiple tokens at the same position, otherwise, you will get confused "suggestions", as you've encountered. -- Jack Krupansky -Original Message- From: Rounak Jain Sent: Fr

Re: using KeywordTokenizer in indexing and StandardTokenizer with shingle filter in query

2012-03-15 Thread Erick Erickson
The best advice I can give is to spend some time on the admin/analysis page. For instance, I believe that your first index analysis chain will do nothing. KeywordTokenizerFactory does not break up the incoming text at all. Since there is only a single token, the shinglefilter isn't doing anything e

Re: Shingle filter factory and the min shingles

2010-09-14 Thread Jason Rutherglen
And here's the issue... https://issues.apache.org/jira/browse/SOLR-1740 On Tue, Sep 14, 2010 at 6:08 PM, Jason Rutherglen wrote: > To answer my own question, and this sucks :)  the minShingleSize isn't > set in at least 1.4.2.  I'm guessing a later version though? > > On Tue, Sep 14, 2010 at 5:49

Re: Shingle filter factory and the min shingles

2010-09-14 Thread Jason Rutherglen
To answer my own question, and this sucks :) the minShingleSize isn't set in at least 1.4.2. I'm guessing a later version though? On Tue, Sep 14, 2010 at 5:49 PM, Jason Rutherglen wrote: > positionIncrementGap="100"> > > > > words="stopwords.txt"/> > maxShingleSize="4" outputUnigrams="fal

Shingle filter factory and the min shingles

2010-09-14 Thread Jason Rutherglen
I'm using for a field, indexing, then looking at the terms component. I'm seeing shingles that consist of only 2 terms, whereas I'm expecting all the shingles to be at least 4 terms... What's up? Thanks.
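To make the expectation concrete, here is a small sketch of shingling with a minimum and maximum size. It assumes a configuration that asks for 4-term shingles only, with unigrams switched off. The function `shingles` and its parameters are illustrative names, not the Solr implementation. If the minimum is ignored and falls back to 2, the 2-term and 3-term shingles reported above appear.

```python
def shingles(tokens, min_size=2, max_size=2, output_unigrams=True):
    """Emit word n-grams from min_size to max_size terms, starting at each token."""
    if not 2 <= min_size <= max_size:
        raise ValueError("need 2 <= min_size <= max_size")
    out = []
    for start in range(len(tokens)):
        if output_unigrams:
            out.append(tokens[start])
        for size in range(min_size, max_size + 1):
            # Skip sizes that would run past the end of the token list.
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    return out


tokens = "a b c d e".split()

# Minimum honoured: only 4-term shingles.
print(shingles(tokens, min_size=4, max_size=4, output_unigrams=False))
# ['a b c d', 'b c d e']

# Minimum ignored (treated as 2): shorter shingles show up as well.
print(shingles(tokens, min_size=2, max_size=4, output_unigrams=False))
```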

Re: shingle filter

2009-08-26 Thread Shalin Shekhar Mangar
On Tue, Aug 25, 2009 at 4:24 AM, Joe Calderon wrote: > hello *, I'm currently faceting on a shingled field to obtain popular > phrases and it's working well, however I'd like to limit the number of > shingles that get created, the solr.ShingleFilterFactory supports > maxShingleSize, can it be made t

shingle filter

2009-08-24 Thread Joe Calderon
hello *, I'm currently faceting on a shingled field to obtain popular phrases and it's working well, however I'd like to limit the number of shingles that get created, the solr.ShingleFilterFactory supports maxShingleSize, can it be made to support a minimum as well? can someone point me in the right

Re: Trouble with Shingle filter and query parsing / expansion

2009-08-11 Thread Mark Bennett
I've got an index building with the shingle filter and I can see the > compound terms with Luke, etc. So far so good. One detail, I did tell it > to not emit unigrams - I've got single words covered in a normal field. > > And a bit of poking around the other day explained why shingle

Trouble with Shingle filter and query parsing / expansion

2009-08-11 Thread Mark Bennett
I've got an index building with the shingle filter and I can see the compound terms with Luke, etc. So far so good. One detail, I did tell it to not emit unigrams - I've got single words covered in a normal field. And a bit of poking around the other day explained why shingle queri