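Set generateWordParts="1" on your WordDelimiterFilterFactory, or use
PatternTokenizerFactory to split on commas. With generateWordParts="0"
and catenateWords="1", the filter splits John,Mark,Sam at the commas but
emits only the catenated token, which is why you see johnmarksam.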

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternTokenizerFactory
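
As a rough sketch of the two options above (untested against your schema,
so treat it as a starting point): the first keeps your textTight2 chain and
only flips generateWordParts to "1". Because catenateWords is still "1",
I would expect johnmarksam to be indexed alongside john, mark and sam; set
catenateWords="0" if you do not want the joined token.

    <fieldType name="textTight2" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <!-- generateWordParts="1" emits each part (John, Mark, Sam) as its own token -->
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="0" catenateWords="1"
                catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

The second option swaps the tokenizer instead. The pattern below is my
assumption, not something from your schema: it splits on runs of commas
and whitespace. Note that this replaces HTMLStripWhitespaceTokenizerFactory,
so HTML stripping would have to happen elsewhere (for example a
solr.HTMLStripCharFilterFactory charFilter, if your Solr version has it).

    <!-- assumed pattern: split on one or more commas or whitespace characters -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[,\s]+"/>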


You can use the analysis page to see what your filter chains will do to
a sample value before you index:

/admin/analysis.jsp

On Fri, Jun 18, 2010 at 6:41 AM, Vitaliy Avdeev <vavd...@sistyma.net> wrote:
> Hello.
> In the text I am indexing I have a string like John,Mark,Sam. When I look at it in
> TermVectorComponent, it appears as johnmarksam.
>
> I am using this type for storing data
>
>    <fieldType name="textTight2" class="solr.TextField"
> positionIncrementGap="100" >
>      <analyzer>
>    <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
>        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="false"/>
>        <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt"/>
>        <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="0" generateNumberParts="0" catenateWords="1"
> catenateNumbers="1" catenateAll="0"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>      </analyzer>
>    </fieldType>
>
> What filter do I need to use to get John Mark Sam as separate words?
>
