Why not use ShingleFilterFactory, which emits adjacent words as combined tokens (shingles), and then match on the "diet follower" token when it appears?

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
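
A rough sketch of the idea in plain Python, in case it helps. This is not Solr's API; `shingles`, `analyze` and the `special_phrases` set are hypothetical names used only for illustration. It generates two-word shingles from the token stream and keeps only those that appear in a list of special phrases, alongside the ordinary single-word tokens:

```python
def shingles(tokens, min_size=2, max_size=2, sep=" "):
    """Yield word n-grams (shingles) of min_size..max_size tokens."""
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                yield sep.join(tokens[start:start + size])


def analyze(text, special_phrases, max_size=2):
    """Return the unigrams plus only those shingles found in special_phrases."""
    tokens = text.split()  # stand-in for a whitespace tokenizer (assumption)
    special = {p.lower() for p in special_phrases}
    kept = [s for s in shingles(tokens, 2, max_size) if s.lower() in special]
    return tokens + kept


print(analyze("Alice is a diet follower", {"diet follower"}))
# ['Alice', 'is', 'a', 'diet', 'follower', 'diet follower']
```

In Solr itself the shingling step would be done by the filter in the field type's analyzer chain; how the unwanted shingles get dropped afterwards is left open here.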


Jeff Porter
co-founder
email: jpor...@o2ointeractive.com
mobile: +1-303-332-4006

On Aug 19, 2013, at 11:23 AM, Dan Davis wrote:

> This is an interesting topic - my employer is a medical library and there
> are many keywords that may need to be aliased in various ways, and 2 or 3
> word phrases that perhaps should be treated specially. Jack, can you give
> me an example of how to do that sort of thing? Perhaps I need to buy
> your almost released Deep Dive book...
> Sorry to be too tangential - it is my strange way.
> 
> 
> On Mon, Aug 19, 2013 at 12:32 PM, Jack Krupansky
> <j...@basetechnology.com> wrote:
> 
>> Okay, but what is it that you are trying to "prevent"??
>> 
>> And, "diet follower" is a phrase, not a keyword or term.
>> 
>> So, I'm still baffled as to what you are really trying to do. Try
>> explaining it in plain English.
>> 
>> And given this same input, how would it be queried?
>> 
>> 
>> -- Jack Krupansky
>> 
>> -----Original Message----- From: Furkan KAMACI
>> Sent: Monday, August 19, 2013 11:22 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Prevent Some Keywords at Analyzer Step
>> 
>> 
>> Let's assume that my sentence is:
>> 
>> *Alice is a diet follower*
>> 
>> My special keyword => *diet follower*
>> 
>> Tokens will be:
>> 
>> Token 1) Alice
>> Token 2) is
>> Token 3) a
>> Token 4) diet
>> Token 5) follower
>> Token 6) *diet follower*
>> 
>> 
>> 2013/8/19 Jack Krupansky <j...@basetechnology.com>
>> 
>>> Your example doesn't "prevent" any keywords.
>>> 
>>> You need to elaborate the specific requirements with more detail.
>>> 
>>> Given a long stream of text, what tokenization do you expect in the index?
>>> 
>>> -- Jack Krupansky
>>> 
>>> -----Original Message----- From: Furkan KAMACI
>>> Sent: Monday, August 19, 2013 8:07 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Prevent Some Keywords at Analyzer Step
>>> 
>>> Hi;
>>> 
>>> I want to write an analyzer that will prevent some special words. For
>>> example, the sentence to be indexed is:
>>> 
>>> diet follower
>>> 
>>> and it will be tokenized like this:
>>> 
>>> token 1) diet
>>> token 2) follower
>>> token 3) diet follower
>>> 
>>> How can I do that with Solr?
>>> 
>>> 
>> 
