I'm experimenting with Solr 5 (5.1.0 1672403 - timpotter - 2015-04-09
10:37:54). In my custom DIH, I use a RegexTransformer to load several
columns, which may or may not be present in the source data. If a column
is present, the regex matches and the data loads correctly in both
Solr 4 and Solr 5. If it is not present and the regex does not match,
the column is empty in Solr 4, but in Solr 5 it contains the original
string that was to be matched.
In other words, in Solr 5.1.0, when the regex does not match and the
'replaceWith' result would be empty, the column appears to revert to
the original string.
Example:
Column 'data' contains: column1:xxx,column3:yyy
DIH regexp:
<field column="column1" regex="^.*column1:(.*?),.*$"
replaceWith="$1" sourceColName="data" />
<field column="column2" regex="^.*column2:(.*?),.*$"
replaceWith="$1" sourceColName="data" />
<field column="column3" regex="^.*column3:(.*?),.*$"
replaceWith="$1" sourceColName="data" />
solr4:
column1: xxx
column2:
column3: yyy
solr5:
column1: xxx
column2: column1:xxx,column3:yyy
column3: yyy
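
For comparison, the Solr 5 result for column2 is what plain java.util.regex
produces: replaceAll() returns the input unchanged when the pattern does not
match. Whether RegexTransformer in 5.1.0 now simply calls replaceAll() without
checking for a match is an assumption on my part; I have not confirmed it
against the Solr source. The sketch below is standalone Java, not Solr code,
and the class and method names (RegexFallbackDemo, replaceAllPlain,
replaceOrEmpty) are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFallbackDemo {

    // Hypothetical helper: plain replaceAll. When the regex does not match,
    // java.util.regex returns the input string unchanged (Solr 5-like result).
    static String replaceAllPlain(String regex, String replaceWith, String value) {
        return Pattern.compile(regex).matcher(value).replaceAll(replaceWith);
    }

    // Hypothetical helper: only replace when the regex matches, otherwise
    // return an empty string (Solr 4-like result).
    static String replaceOrEmpty(String regex, String replaceWith, String value) {
        Matcher m = Pattern.compile(regex).matcher(value);
        // replaceAll() resets the matcher, so calling find() first is safe.
        return m.find() ? m.replaceAll(replaceWith) : "";
    }

    public static void main(String[] args) {
        String data = "column1:xxx,column3:yyy";
        String col1 = "^.*column1:(.*?),.*$";
        String col2 = "^.*column2:(.*?),.*$";

        System.out.println("column1 plain:   [" + replaceAllPlain(col1, "$1", data) + "]");
        System.out.println("column2 plain:   [" + replaceAllPlain(col2, "$1", data) + "]");
        System.out.println("column2 orEmpty: [" + replaceOrEmpty(col2, "$1", data) + "]");
    }
}
```

If that assumption holds, the old behaviour would need an explicit match
check before the replacement, as in replaceOrEmpty() above.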