Thanks both for your replies.

Erick,
Yep, I use the Analysis page extensively, but what I was directly looking for was whether all or only the last line of values given by the Analysis page were eventually indexed. I think we've concluded it's only the last line.
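
For the archives, here is a toy Python model of the chain as I now understand it. It is not Solr code: the analyze function and the filter names are made up for illustration, and each token filter here maps one token to one token, which real filters need not do. It only shows the order of the stages and that the last stage is what would be indexed.

```python
import re

def analyze(text, char_filters, tokenizer, token_filters):
    """Toy analysis chain. Returns the output of every stage, in order."""
    stages = []
    # Char filters see the whole raw input, before tokenization.
    for char_filter in char_filters:
        text = char_filter(text)
        stages.append(text)
    # Exactly one tokenizer splits the filtered input into tokens.
    tokens = tokenizer(text)
    stages.append(list(tokens))
    # Token filters are applied to each token emitted by the tokenizer.
    for token_filter in token_filters:
        tokens = [token_filter(token) for token in tokens]
        stages.append(list(tokens))
    return stages

# Hypothetical stand-ins for the factories discussed in this thread.
def strip_hyphens(s):            # whole input, like a pattern-replace char filter
    return re.sub(r"-", "", s)

def strip_trailing_x(token):     # per token, like a pattern-replace token filter
    return re.sub(r"x$", "", token)

stages = analyze(
    "Wi-Fi Router-X",
    char_filters=[strip_hyphens],
    tokenizer=str.split,
    token_filters=[str.lower, strip_trailing_x],
)

# Each entry corresponds to one line on the Analysis page;
# only the last one holds the terms that end up in the index.
indexed_terms = stages[-1]
print(indexed_terms)
```

The intermediate entries in stages are only there for inspection, which is how I read the Analysis page output as well.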
Cheers,
Ben

On Wed, Apr 13, 2011 at 2:41 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> CharFilterFactories are applied to the raw input before tokenization.
> Each token output from the tokenization is then sent through
> the rest of the chain.
>
> The Analysis page available from the Solr admin page is
> invaluable in answering in great detail what each part of
> an analysis chain does.
>
> TokenFilterFactories are applied to each token emitted from
> the tokenizer, and this includes the similar
> PatternReplaceFilterFactory. The difference is that the
> PatternReplaceCharFilterFactory is applied before tokenization
> to the entire input stream and PatternReplaceFilterFactory
> is applied to each token emitted by the tokenizer.
>
> And to make it even more fun, you can do both!
>
> Best
> Erick
>
> On Wed, Apr 13, 2011 at 8:14 AM, Ben Davies <ben.dav...@gmail.com> wrote:
>
> > Hi there,
> >
> > Just a quick question that the wiki page
> > (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters) didn't seem
> > to answer very well.
> >
> > Given an analyzer that has zero or more Char Filter Factories, one
> > Tokenizer Factory, and zero or more Token Filter Factories, which value(s)
> > are indexed?
> >
> > Is every value that is produced from each char filter, tokenizer, and
> > filter indexed?
> > Or is only the final value after completing the whole chain indexed?
> >
> > Cheers,
> > Ben