Here is an example of doing this in DIH. Say you have a string field, foobar, whose values are separated by | and that you want to split into a multiValued list. That part is fairly easy with the RegexTransformer in DIH. But say you also want to grab the lowest value from the list and store it in a second field, and the highest value and store it in a third. That would let you sort by the low or high value, since each of those fields is single-valued.
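To make the split and low/high step concrete on its own, here is a minimal sketch in plain JavaScript that runs outside Solr. It assumes the values are numeric. The helper name splitLowHigh and the sample input are made up for illustration and are not part of DIH.

```javascript
// Standalone sketch (plain JavaScript, no Solr classes) of the split + low/high step.
// splitLowHigh is a hypothetical helper name; numeric values are an assumption.
function splitLowHigh(raw) {
  // Split on the literal '|' and drop empty pieces such as a trailing bar.
  var pieces = String(raw).split('|').filter(function (p) {
    return p.trim() !== '';
  });
  if (pieces.length === 0) {
    return null;
  }
  var low = parseFloat(pieces[0]);
  var high = low;
  for (var i = 1; i < pieces.length; i++) {
    var value = parseFloat(pieces[i]);
    if (value < low) { low = value; }
    if (value > high) { high = value; }
  }
  // values keeps the original strings for the multiValued field.
  return { values: pieces, low: low, high: high };
}

var result = splitLowHigh('10|2|33');
console.log(result.low, result.high, result.values.length); // prints "2 33 3"
```

Converting with parseFloat before comparing matters because comparing the raw strings would put "10" before "2".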
You would have:

foobar_bar  - source field (bar delimited)
foobar      - destination, multiValued
foobar_low  - low value
foobar_high - high value

You can do that in DIH using a script. Here is an almost complete example:

<dataConfig>
  <script>
  <![CDATA[
    // Split a bar-delimited field into a multiValued field and record its
    // lowest and highest values in two single-valued fields.
    function generic_extract(row, field, fieldlow, fieldhigh, outfield) {
      if (row.get(field)) {
        var arr = new java.util.ArrayList();
        // Coerce to a JavaScript string so split() treats '|' as a literal
        // separator rather than as a regular expression.
        var pieces = String(row.get(field)).split('|');
        if (pieces.length > 0) {
          var lowvalue = pieces[0];
          var highvalue = pieces[0];
          for (var i = 0; i < pieces.length; i++) {
            // Values are compared as strings here; convert to int or float
            // first if you need numeric ordering (this is not done!)
            if (pieces[i] < lowvalue) {
              lowvalue = pieces[i];
            }
            if (pieces[i] > highvalue) {
              highvalue = pieces[i];
            }
            arr.add(pieces[i]);
          }
          row.put(fieldlow, lowvalue);
          row.put(fieldhigh, highvalue);
          row.put(outfield, arr);
        }
      }
      row.remove(field);
      return row;
    }

    function extract(row) {
      row = generic_extract(row, "foobar_bar", "foobar_low", "foobar_high", "foobar");
      return row;
    }
  ]]>
  </script>
  <dataSource
      driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
      url="jdbc:sqlserver://server.hostname.com:1433;databaseName=db_name;responseBuffering=adaptive;"
      user="xmlsvc"
      password="xxxxxxx"
      readOnly="true"
  />
  <document>
    <entity name="item"
            transformer="RegexTransformer,script:extract"
            pk="id"
            query="select * from table">
    </entity>
  </document>
</dataConfig>


On 3/17/11 9:09 AM, "Bill Bell" <billnb...@gmail.com> wrote:

>Do you use Dih handler? A script can do this easily.
>
>Bill Bell
>Sent from mobile
>
>
>On Mar 17, 2011, at 9:02 AM, Bernd Fehling
><bernd.fehl...@uni-bielefeld.de> wrote:
>
>>
>> Good idea.
>> Was also just looking into this area.
>>
>> Assuming my input record looks like this:
>> <documents>
>>   <document id="foobar">
>>     <element name="author"><value>author_1 ; author_2 ; author_3</value></element>
>>   </document>
>> </documents>
>>
>> Do you know if I can use something like this:
>> ...
>> <entity name="records" processor="XPathEntityProcessor"
>>         transformer="RegexTransformer"
>> ...
>>   <field column="author" xpath="/documents/document/element[@name='author']/value" />
>>   <field column="author_sort" xpath="/documents/document/element[@name='author']/value" />
>>   <field column="author" splitBy=" ; " />
>> ...
>>
>> To just double the input and make author multiValued and author_sort a
>> string field?
>>
>> Regards
>> Bernd
>>
>>
>> On 17.03.2011 15:39, Gora Mohanty wrote:
>>> On Thu, Mar 17, 2011 at 8:04 PM, Bernd Fehling
>>> <bernd.fehl...@uni-bielefeld.de> wrote:
>>>>
>>>> Is there a way to have a kind of "casting" for copyField?
>>>>
>>>> I have author names in multiValued string field and need a sorting on it,
>>>> but sort on field is only for multiValued=false.
>>>>
>>>> I'm trying to get multiValued content from one field to a
>>>> non-multiValued text or string field for sorting.
>>>> And this, if possible, during loading with copyField.
>>>>
>>>> Or any other solution?
>>> [...]
>>>
>>> Not sure about CopyField, but you could use a transformer to
>>> extract values from a multiValued field, and stick them into a
>>> single-valued field.
>>>
>>> Regards,
>>> Gora