It's the raw, pre-analysis value that's copied. The type of the field it's copied to determines the analyzer/filters applied to that copy, so nothing is analyzed twice. If you want to check out the code doing it, it's in org.apache.solr.update.DocumentBuilder.
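
As a rough sketch using two of the fields from your schema (the sample value in the comment is made up, and I've written the attribute as indexed, which is the name Solr's schema uses):

```xml
<!-- Hypothetical input document: title = "Die Häuser" -->

<!-- Source field: receives the raw input and runs it through the
     analyzer of its own type ("keyword"). -->
<field name="title" type="keyword" indexed="true" stored="true" required="true" />

<!-- Destination field: receives the same raw string, not the tokens
     produced for "title", and runs it once through the index analyzer
     of text_de. -->
<field name="title_de" type="text_de" indexed="true" stored="false" required="false" />

<copyField source="title" dest="title_de" />
```

So each field sees the original string exactly once, and whatever analysis is configured for title has no effect on what ends up in title_de.
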
--
- Mark

http://www.lucidimagination.com

On Mon, Aug 3, 2009 at 8:12 AM, Chantal Ackermann <chantal.ackerm...@btelligent.de> wrote:

> Dear all,
>
> before searching through the source code - maybe one of you can answer this
> easily:
>
> When and based on what are the tokenizer and filters applied when copying
> fields? Can it happen that fields are analyzed twice (once when creating the
> first field, and a second time when they are copied to the another field)?
>
>
> Here an example from my current setup:
> I have the following types defined, in schema.xml:
>
> <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory" />
>     <filter class="solr.LengthFilterFactory" min="2" max="5000" />
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_de.txt" />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
>     <filter class="solr.LowerCaseFilterFactory" />
>     <filter class="solr.SnowballPorterFilterFactory" language="German" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory" />
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_de.txt" />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" />
>     <filter class="solr.LowerCaseFilterFactory" />
>     <filter class="solr.SnowballPorterFilterFactory" language="German" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>   </analyzer>
> </fieldType>
>
> Used for those fields:
>
> <field name="title" type="keyword" index="true" stored="true" required="true" />
> <field name="title_de" type="text_de" index="true" stored="false" required="false" />
> <field name="subtitle_text_de" type="text_de" index="true" stored="true" required="false" />
> <field name="dtext_de" type="text_de" index="true" stored="false" required="false" />
>
> Which are used to populate this field using the copy field directive:
>
> <field name="all_text_de" type="text_de" indexed="true" stored="false" multiValued="true" />
>
> like that (that is what I do, now, at least):
>
> <copyField source="title" dest="title_de" />
> <copyField source="title" dest="all_text_de" />
> <copyField source="subtitle_text_de" dest="all_text_de" />
> <copyField source="dtext_de" dest="all_text_de" />
>
>
> I am copying fields with different types to all_text_de, e.g. title is
> different from subtitle_text_de. Is the valued copied to the destination
> field the raw (input) value or the already analyzed one?
>
>
> Thanks!
> Chantal
>
>
> --
> Chantal Ackermann
>