Hi everybody!

Ahmet, do I understand correctly: if I use this text_char_norm field type, then for the input "myName=aaa bbb" I'll index the terms "myName", "aaa" and "bbb"? So I'll match a query like "myName" or a query like "bbb", but not "myName aaa".

I can use the same type for the query value, so that "myName aaa" is split into ("myName" && "aaa"), and it will work. But this approach gives a false positive match for "myName bbb". What do you think, how can I handle this?

One approach is to use KeywordTokenizer+ShingleFilter in this field type instead of WhitespaceTokenizerFactory, so that tokens like "myName", "myName aaa", "myName aaa bbb", "aaa", "aaa bbb" and "bbb" are indexed, but this significantly increases the index size for long values.
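To make the shingle idea above concrete, here is a rough sketch of what such a field type might look like. The type name text_char_shingle and maxShingleSize="3" are placeholders I made up for the three-word example. I also kept WhitespaceTokenizerFactory in the sketch, on the assumption that the shingles should be built from whitespace-separated words: as far as I know KeywordTokenizerFactory emits the whole value as a single token, which would leave ShingleFilter nothing to combine. Treat it as an untested draft, not a working config.

```xml
<!-- Hypothetical variant of text_char_norm; the name is a placeholder. -->
<fieldType name="text_char_shingle" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <!-- Same mappings.txt as in Ahmet's type: ":" and "=" become spaces. -->
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mappings.txt"/>
    <!-- Assumption: split on whitespace so there are words to combine. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emit single words plus word groups of 2 and 3; maxShingleSize="3"
         is chosen only to cover the three-word example. -->
    <filter class="solr.ShingleFilterFactory"
            minShingleSize="2" maxShingleSize="3"
            outputUnigrams="true"/>
  </analyzer>
</fieldType>
```

With this, "myName=aaa bbb" should produce the six terms listed above (lowercased), which is exactly where the index growth comes from: each extra shingle size adds roughly one more term per word. How the query side should be analyzed is left open here.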

26.12.2013, 03:20, "Ahmet Arslan" <iori...@yahoo.com>:
> Hi Haya,
>
> With MappingCharFilter you can have full control over character set that you
> want to split.
>
> in mappings.txt you will have
>
> ":" => " "
> "=" => " "
>
> Use the following type and see if it suits for your needs. Update
> mappings.txt according to your needs.
>
> <fieldType name="text_char_norm" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer>
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="mappings.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> On Sunday, December 22, 2013 9:19 PM, haya.axelrod <haya.axel...@gmail.com>
> wrote:
> I have a text field that can contain very long values (like text files). I
> want to create field type for it (text, not string), in order to have
> something like "Match whole word only" in notepad++, but the delimiter
> should not be only white spaces. If i have:
>
> myName=aaa bbb
>
> I would like to get it for the following search strings "aaa", "bbb", "aaa
> bbb", "myName=aaa bbb", "myName", but not for "aa" or "ame=a" or "a bb".
> Another example is:
>
> <myName>aaa bbb</myName>
> Can i do this somehow?
>
> What should be my field type definition?
>
> The text can contain any character. Before search i'm escaping the search
> string using
> http://lucene.apache.org/solr/4_2_1/solr-solrj/org/apache/solr/client/solrj/util/ClientUtils.html
>
> Thanks
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Match-whole-word-only-in-text-fields-tp4107795.html
> Sent from the Solr - User mailing list archive at Nabble.com.