Thanks Alex.

The inner entity should have a different name; that was a typo in my question.

Regarding XsltUpdateRequestHandler
<https://wiki.apache.org/solr/XsltUpdateRequestHandler>: it is a good
solution, but I cannot use it in my application because I need to include a
few more transformers and Java manipulators.

Could you please suggest how to use XPath syntax such as
"/RESOURCE/LINK[@ID=${testdata.id}]/TAG/TAG_VALUE" in the data-config XML file?
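
To make the goal concrete, here is a minimal sketch in plain JavaScript (a model of the data, not DIH code) of the grouping I am after: each LINK becomes one document that carries only its own TAG_CODE/TAG_VALUE pairs. The `links` array and the `toDocument` helper are hypothetical names used only for illustration; the code-to-value mapping mirrors what f1 does for a single row.

```javascript
// Plain-JavaScript model of the Resource XML (an assumption for illustration,
// not DIH code): each LINK carries its own list of [TAG_CODE, TAG_VALUE] pairs.
const links = [
  { id: "1000", field_name: "val1",
    tags: [["ABC", "ABC_VALUE"], ["DEF", "DEF_VALUE"]] },
  { id: "2000", field_name: "val2",
    tags: [["GHI", "GHI_VALUE"], ["JKL", "JKL_VALUE"], ["MNO", "MNO_VALUE"]] },
];

// Hypothetical helper: flatten one LINK into one document, using only the
// tags that belong to that LINK (TAG_CODE becomes the field name).
function toDocument(link) {
  const doc = { id: link.id, field_name: link.field_name };
  for (const [code, value] of link.tags) {
    doc[code] = value;
  }
  return doc;
}

const documents = links.map(toDocument);
console.log(JSON.stringify(documents, null, 2));
```

TRY 1 below instead attaches all five tags to both documents, which is the behavior I want to avoid.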

On Tue, Sep 8, 2015 at 6:34 PM, Umang Agrawal <umang.i...@gmail.com> wrote:

> Hi All
>
> I am facing a problem with XPathEntityProcessor.
>
> Objective:
> When I index the Resource XML file using the DIH XPathEntityProcessor, there
> should be 2 Solr documents:
> 01) Link where id is 1000, with 2 tags: ABC and DEF
> 02) Link where id is 2000, with 3 tags: GHI, JKL and MNO
>
> Solr Version: 4.10.2
>
> Problem:
> I am not able to index <TAG/> data properly.
>
> Expected Output:
> {
> "id": "1000",
> "field_name": "val1",
> "ABC": "ABC_VALUE",
> "DEF": "DEF_VALUE"
> },
> {
> "id": "2000",
> "field_name": "val2",
> "GHI": "GHI_VALUE",
> "JKL": "JKL_VALUE",
> "MNO": "MNO_VALUE"
> }
>
> ========================================================================================================
>
> Resource XML:
>
> <RESOURCE>
>     <LINK ID="1000">
>         <FIELD>val1</FIELD>
>         <TAG>
>             <TAG_CODE>ABC</TAG_CODE>
>             <TAG_VALUE>ABC_VALUE</TAG_VALUE>
>         </TAG>
>         <TAG>
>             <TAG_CODE>DEF</TAG_CODE>
>             <TAG_VALUE>DEF_VALUE</TAG_VALUE>
>         </TAG>
>     </LINK>
>     <LINK ID="2000">
>         <FIELD>val2</FIELD>
>         <TAG>
>             <TAG_CODE>GHI</TAG_CODE>
>             <TAG_VALUE>GHI_VALUE</TAG_VALUE>
>         </TAG>
>         <TAG>
>             <TAG_CODE>JKL</TAG_CODE>
>             <TAG_VALUE>JKL_VALUE</TAG_VALUE>
>         </TAG>
>         <TAG>
>             <TAG_CODE>MNO</TAG_CODE>
>             <TAG_VALUE>MNO_VALUE</TAG_VALUE>
>         </TAG>
>     </LINK>
> </RESOURCE>
>
>
> ========================================================================================================
>
> DataConfig XML (TRY 1):
> <dataConfig>
>     <script><![CDATA[
>         function f1(row) {
>             var code = row.get("TAG_CODE");
>             var val = row.get("TAG_VALUE");
>             row.put(code, val);
>             row.remove("TAG_CODE");
>             row.remove("TAG_VALUE");
>             return row;
>         }
>     ]]></script>
>     <dataSource type="URLDataSource" />
>     <document>
>         <entity name="testdata" url="http://host:port/uri"
>                 processor="XPathEntityProcessor" forEach="/RESOURCE/LINK">
>             <field column="id" xpath="/RESOURCE/LINK/@ID" />
>             <field column="field_name" xpath="/RESOURCE/LINK/FIELD" />
>             <entity name="testdata" url="http://host:port/uri"
>                     processor="XPathEntityProcessor"
>                     forEach="/RESOURCE/LINK/TAG" transformer="script:f1">
>                 <field column="TAG_CODE" xpath="/RESOURCE/LINK/TAG/TAG_CODE" />
>                 <field column="TAG_VALUE" xpath="/RESOURCE/LINK/TAG/TAG_VALUE" />
>             </entity>
>         </entity>
>     </document>
> </dataConfig>
>
> Output:
> {
> "id": "1000",
> "field_name": "val1",
> "ABC": "ABC_VALUE",
> "DEF": "DEF_VALUE",
> "GHI": "GHI_VALUE",
> "JKL": "JKL_VALUE",
> "MNO": "MNO_VALUE"
> },
> {
> "id": "2000",
> "field_name": "val2",
> "ABC": "ABC_VALUE",
> "DEF": "DEF_VALUE",
> "GHI": "GHI_VALUE",
> "JKL": "JKL_VALUE",
> "MNO": "MNO_VALUE"
> }
>
>
> ========================================================================================================
>
> DataConfig XML (TRY 2):
> <dataConfig>
>     <script><![CDATA[
>         function f1(row) {
>             var code = row.get("TAG_CODE");
>             var val = row.get("TAG_VALUE");
>             row.put(code, val);
>             row.remove("TAG_CODE");
>             row.remove("TAG_VALUE");
>             return row;
>         }
>     ]]></script>
>     <dataSource type="URLDataSource" />
>     <document>
>         <entity name="testdata" url="http://host:port/uri"
>                 processor="XPathEntityProcessor" forEach="/RESOURCE/LINK">
>             <field column="id" xpath="/RESOURCE/LINK/@ID" />
>             <field column="field_name" xpath="/RESOURCE/LINK/FIELD" />
>             <entity name="testdata" url="http://host:port/uri"
>                     processor="XPathEntityProcessor"
>                     forEach="/RESOURCE/LINK[@ID=${testdata.id}]/TAG" transformer="script:f1">
>                 <field column="TAG_CODE" xpath="/RESOURCE/LINK/TAG/TAG_CODE" />
>                 <field column="TAG_VALUE" xpath="/RESOURCE/LINK/TAG/TAG_VALUE" />
>             </entity>
>         </entity>
>     </document>
> </dataConfig>
>
> Output:
> {
> "id": "1000",
> "field_name": "val1"
> },
> {
> "id": "2000",
> "field_name": "val2"
> }
>
>
> ========================================================================================================
>
> DataConfig XML (TRY 3):
> <dataConfig>
>     <script><![CDATA[
>         function f1(row) {
>             var code = row.get("TAG_CODE");
>             var val = row.get("TAG_VALUE");
>             row.put(code, val);
>             row.remove("TAG_CODE");
>             row.remove("TAG_VALUE");
>             return row;
>         }
>     ]]></script>
>     <dataSource type="URLDataSource" />
>     <document>
>         <entity name="testdata" url="http://host:port/uri"
>                 processor="XPathEntityProcessor" forEach="/RESOURCE/LINK">
>             <field column="id" xpath="/RESOURCE/LINK/@ID" />
>             <field column="field_name" xpath="/RESOURCE/LINK/FIELD" />
>             <entity name="testdata" url="http://host:port/uri"
>                     processor="XPathEntityProcessor"
>                     forEach="/RESOURCE/LINK[@ID=${testdata.id}]/TAG" transformer="script:f1">
>                 <field column="TAG_CODE"
>                        xpath="/RESOURCE/LINK[@ID=${testdata.id}]/TAG/TAG_CODE" />
>                 <field column="TAG_VALUE"
>                        xpath="/RESOURCE/LINK[@ID=${testdata.id}]/TAG/TAG_VALUE" />
>             </entity>
>         </entity>
>     </document>
> </dataConfig>
>
> Output:
> {
> "id": "1000",
> "field_name": "val1"
> },
> {
> "id": "2000",
> "field_name": "val2"
> }
>
>
> --
> Thanx & Regards
> Umang Agrawal



-- 
Thanx & Regards
Umang Agrawal
