On 7/17/2015 8:23 AM, Bill Au wrote:
> One of my database columns is a varchar containing a comma-delimited list of
> values.  I would like to import these values into a multiValued field.  I
> figure that I will need to write a ScriptTransformer to do that.  Is there
> a better way?

DIH provides the RegexTransformer.

https://wiki.apache.org/solr/DataImportHandler#RegexTransformer

With this transformer enabled on the entity, you can set a "splitBy"
attribute on <field> config elements.  The "mailId" field in the
RegexTransformer example on the wiki shows splitting on commas.
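
As a rough sketch of what that could look like for your case, here is an
entity fragment for data-config.xml.  The table name "item", the source
column "tags_csv", and the Solr field "tags" are made-up names for
illustration, so substitute your own.  The target field also needs
multiValued="true" in the schema.

  <!-- Hypothetical names: item, tags_csv, tags -->
  <entity name="item" transformer="RegexTransformer"
          query="SELECT id, tags_csv FROM item">
    <field column="id" name="id" />
    <!-- Split the comma-delimited column into separate values -->
    <field column="tags" sourceColName="tags_csv" splitBy="," />
  </entity>

As far as I know, the splitBy value is treated as a regular expression,
so something like splitBy="\s*,\s*" should also drop whitespace around
the commas, but test that against your data.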

This splitBy functionality is good if you need the individual values
returned separately in search results.  If your goal is only to make
those individual values searchable, without regard to how they are
returned in search results, you can instead split the string at index
analysis time in your schema.  The stored value is then still the
original delimited string.  I have a fieldType in my schema that
tokenizes on semicolons, with optional whitespace.  Here's the tokenizer:

  <tokenizer class="solr.PatternTokenizerFactory"
             pattern="\p{Space}*;\p{Space}*" />
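
For context, a minimal sketch of a complete fieldType built around that
tokenizer might look like the following.  The name "text_semicolon" is
just a placeholder, and a real definition would likely add filters (for
example lowercasing) after the tokenizer.

  <!-- Hypothetical fieldType name; add filters as needed -->
  <fieldType name="text_semicolon" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.PatternTokenizerFactory"
                 pattern="\p{Space}*;\p{Space}*" />
    </analyzer>
  </fieldType>

For comma-delimited data like yours, the pattern would presumably
become "\p{Space}*,\p{Space}*".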

Thanks,
Shawn
