Hello all,

The documentation states “Fields are copied before analysis is done, meaning you 
can have two fields with identical original content, but which use different 
analysis chains and are stored in the index differently.”

I have field type definitions for case-insensitive strings (single-valued and 
multi-valued), which I use for querying:

    <fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer type="query">
          <tokenizer class="solr.KeywordTokenizerFactory"/>
          <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <fieldType name="strings_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true" multiValued="true">
      <analyzer type="query">
          <tokenizer class="solr.KeywordTokenizerFactory"/>
          <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

And a regular string without any analyzers:

    <fieldType name="string" class="solr.StrField" sortMissingLast="true" docValues="true"/>

And I have 2 fields, one for searching and one for faceting:

<field name="place.name_orig" type="string" indexed="false" stored="false" docValues="true"/>
<field name="place.name" type="string_ci" indexed="true" stored="true" docValues="false"/>

New documents arrive at Solr with a place.name field, so I’m using a copyField 
to copy the value to the string field:

<copyField source="place.name" dest="place.name_orig" maxChars="1024"/>

My question is: will there be any difference in the resulting indexed documents 
if I switch the source and dest fields in the copyField directive? My 
understanding is that copyField operates on the raw data arriving at Solr as is, 
and the field declarations themselves decide what to do with it, so there 
shouldn’t be any difference. However, I’m currently investigating an issue where 
the same data is indexed in two different collections:

- One uses a copyField directive like the one above.
- The other doesn’t use copyField; instead, the same value is sent in both the 
place.name and place.name_orig fields during indexing.

I’m seeing some slight differences in the resulting documents, mainly in casing 
between i and İ.
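
To make the swapped variant concrete, it would look roughly like the line below. 
This is only a sketch of the alternative I’m asking about, not something I have 
deployed; it assumes documents would then arrive with place.name_orig instead of 
place.name, and it reuses the field names and maxChars value from the directive 
above.

```xml
<!-- Hypothetical swapped directive: the raw incoming value of place.name_orig
     would be copied into place.name before any analysis is applied. -->
<copyField source="place.name_orig" dest="place.name" maxChars="1024"/>
```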

Have a nice weekend
