Faceting is much happier with a single-valued field, but my apps
all require multivalued fields:
<doc>
<arr name="subject">
 <str>aaa</str>
 <str>bbb</str>
 <str>ccc</str>
</arr>
</doc>

I'd like to use copyField to accumulate the values of a multivalued
field into a single field that can be faceted efficiently.  (As written,
copyField adds a new field for each value and throws an error if the
target has multiValued="false".)

The simplest thing I can think of is to check whether the copyField
target is multivalued; if it is not, accumulate the values, separated by
some token that the copyField target will split on.

Perhaps something like:

<fieldtype name="facetable" class="solr.TextField" omitNorms="true">
 <analyzer>
   <tokenizer class="solr.RegexTokenizerFactory">
     <str name="pattern">;</str>  <!-- tokens=input.split( ";" ) -->
   </tokenizer>
   <filter class="solr.TrimFilterFactory" />
 </analyzer>
</fieldtype>

<field name="subject" type="text" indexed="true" stored="true"
       multiValued="true"/>
<field name="subject_facet" type="facetable" indexed="true"
       stored="false" multiValued="false"/>

<copyField source="subject" dest="subject_facet" accumulate=";" />
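
For comparison, here is a minimal sketch of the same field type built only
from analysis classes that ship with Solr.  It assumes a release that
includes solr.PatternTokenizerFactory, which stands in for the hypothetical
RegexTokenizerFactory, and it uses solr.TextField, since a StrField is
indexed verbatim and never runs an analyzer.  The accumulate attribute on
copyField is still only the proposal here and is not an existing option.

```xml
<!-- Sketch only: assumes solr.PatternTokenizerFactory is available. -->
<fieldtype name="facetable" class="solr.TextField" omitNorms="true">
  <analyzer>
    <!-- group="-1" treats the pattern as a delimiter, like input.split(";") -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern=";" group="-1" />
    <!-- strip the whitespace left around each token -->
    <filter class="solr.TrimFilterFactory" />
  </analyzer>
</fieldtype>
```

With this type, an accumulated value such as "aaa; bbb; ccc" would be
indexed as the three terms aaa, bbb and ccc, which are the terms faceting
would count.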

This would work as long as ';' never appears in the input.  Is there some
character guaranteed not to be in any input?  Maybe I should call it
"facet_field" rather than "facetable" - I keep reading it as "face
table".

Any thoughts on this design would be great.

thanks
ryan
