On 7/17/2015 8:23 AM, Bill Au wrote:
> One of my database columns is a varchar containing a comma-delimited list of
> values. I would like to import these values into a multiValued field. I
> figure that I will need to write a ScriptTransformer to do that. Is there
> a better way?

DIH provides the RegexTransformer.

https://wiki.apache.org/solr/DataImportHandler#RegexTransformer

With this transformer, you can use "splitBy" on <field> config elements. The "mailId" field in the RegexTransformer example on the wiki shows splitting on commas.

This splitBy functionality is good if you need individual values returned in search results. If your goal is to make it possible to search on those individual values without regard to how they are returned in search results, you can break the value apart at index analysis time in your schema instead. I have a fieldType in my schema that tokenizes on semicolons, with optional whitespace. Here's the tokenizer:

    <tokenizer class="solr.PatternTokenizerFactory" pattern="\p{Space}*;\p{Space}*" />

For comma-delimited data, the pattern would use a comma in place of the semicolon.

Thanks,
Shawn
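
As a rough sketch of the splitBy approach described above, a DIH config along these lines might work. The JDBC driver and URL, the entity name, the table name, and the column names ("item", "id", "tags") are all placeholders for illustration, not details from the original question. It also assumes the matching "tags" field is declared multiValued="true" in the schema.

```xml
<dataConfig>
  <!-- Hypothetical JDBC settings; substitute your own driver and URL. -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/exampledb" />
  <document>
    <!-- RegexTransformer must be named on the entity for splitBy to take effect. -->
    <entity name="item"
            transformer="RegexTransformer"
            query="SELECT id, tags FROM item">
      <field column="id" name="id" />
      <!-- splitBy is a regex; each comma-separated piece becomes one value
           of the multiValued "tags" field. -->
      <field column="tags" splitBy="," />
    </entity>
  </document>
</dataConfig>
```

If the column values have spaces after the commas, a splitBy of `\s*,\s*` would likely trim them, since splitBy is treated as a regular expression.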