My uniqueKey in schema.xml is id. I've tried adding pk="id" to the store
entity, but it makes no difference.

The result is the same if I set rootEntity="false" on the store entity.
However, when I enabled debug and verbose output in the DataImportHandler, I
noticed a slight change in how the nested queries are executed. Below is
the output with rootEntity="true":

<response>
<lst name="responseHeader">...</lst>
<lst name="initArgs">...</lst>
<str name="command">full-import</str>
<str name="mode">debug</str>
<arr name="documents"/>
<lst name="verbose-output">
<lst name="entity:store">
<lst name="document#1">
<str name="query">../../../data/StoresTest.xml</str>
<str name="time-taken">0:0:0.1</str>
<str>----------- row #1-------------</str>
<str name="id">0102</str>
<str name="$forEach">/Stores/Store</str>
<str>---------------------------------------------</str>
<lst name="entity:storearticle">
<str name="query">../../../data/StoreArticlesTest.xml</str>
<str name="query">../../../data/StoreArticlesTest.xml</str>
<str name="time-taken">0:0:0.1</str>
<str name="time-taken">0:0:0.1</str>
<str>----------- row #1-------------</str>
<arr name="store_articles_txt">
<str>18004</str>
</arr>
<str name="$forEach">/StoreArticles</str>
<str>---------------------------------------------</str>
<lst name="transformer:LogTransformer">
<str>---------------------------------------------</str>
<arr name="store_articles_txt">
<str>18004</str>
</arr>
<str name="$forEach">/StoreArticles</str>
<str>---------------------------------------------</str>
</lst>
</lst>
</lst>
<lst name="document#2">
<str>----------- row #1-------------</str>
<str name="id">0104</str>
<str name="$forEach">/Stores/Store</str>
<str>---------------------------------------------</str>
<lst name="entity:storearticle">
<str name="query">../../../data/StoreArticlesTest.xml</str>
<str name="query">../../../data/StoreArticlesTest.xml</str>
<str name="query">../../../data/StoreArticlesTest.xml</str>
<str name="query">../../../data/StoreArticlesTest.xml</str>
<str name="time-taken">0:0:0.0</str>
<str name="time-taken">0:0:0.0</str>
<str name="time-taken">0:0:0.0</str>
<str name="time-taken">0:0:0.0</str>
<str>----------- row #1-------------</str>
<arr name="store_articles_txt">
<str>18004</str>
</arr>
<str name="$forEach">/StoreArticles</str>
<str>---------------------------------------------</str>
<lst name="transformer:LogTransformer">
<str>---------------------------------------------</str>
<arr name="store_articles_txt">
<str>18004</str>
</arr>
<str name="$forEach">/StoreArticles</str>
<str>---------------------------------------------</str>
</lst>
</lst>
</lst>
<lst name="document#3"/>
</lst>
</lst>
<str name="status">idle</str>
<str name="importResponse">Configuration Re-loaded sucessfully</str>
<lst name="statusMessages">...</lst>
<str name="WARNING">...</str>
</response>

And with rootEntity="false":

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">40</int>
</lst>
<lst name="initArgs">
<lst name="defaults">
<str name="config">import-test-articles-config.xml</str>
</lst>
</lst>
<str name="command">full-import</str>
<str name="mode">debug</str>
<arr name="documents"/>
<lst name="verbose-output">
<lst name="entity:store">
<str name="query">../../../data/StoresTest.xml</str>
<str name="query">../../../data/StoresTest.xml</str>
<str name="time-taken">0:0:0.10</str>
<str name="time-taken">0:0:0.10</str>
<str>----------- row #1-------------</str>
<str name="id">0102</str>
<str name="$forEach">/Stores/Store</str>
<str>---------------------------------------------</str>
<lst name="entity:storearticle">
<lst name="document#1">
<str name="query">../../../data/StoreArticlesTest.xml</str>
<str name="time-taken">0:0:0.0</str>
<str>----------- row #1-------------</str>
<arr name="store_articles_txt">
<str>18004</str>
</arr>
<str name="$forEach">/StoreArticles</str>
<str>---------------------------------------------</str>
<lst name="transformer:LogTransformer">
<str>---------------------------------------------</str>
<arr name="store_articles_txt">
<str>18004</str>
</arr>
<str name="$forEach">/StoreArticles</str>
<str>---------------------------------------------</str>
</lst>
</lst>
<lst name="document#2"/>
</lst>
<str>----------- row #2-------------</str>
<str name="id">0104</str>
<str name="$forEach">/Stores/Store</str>
<str>---------------------------------------------</str>
<lst name="entity:storearticle">
<lst name="document#2">
<str name="query">../../../data/StoreArticlesTest.xml</str>
<str name="query">../../../data/StoreArticlesTest.xml</str>
<str name="time-taken">0:0:0.0</str>
<str name="time-taken">0:0:0.0</str>
<str>----------- row #1-------------</str>
<arr name="store_articles_txt">
<str>18004</str>
</arr>
<str name="$forEach">/StoreArticles</str>
<str>---------------------------------------------</str>
<lst name="transformer:LogTransformer">
<str>---------------------------------------------</str>
<arr name="store_articles_txt">
<str>18004</str>
</arr>
<str name="$forEach">/StoreArticles</str>
<str>---------------------------------------------</str>
</lst>
</lst>
<lst name="document#3"/>
</lst>
</lst>
</lst>
<str name="status">idle</str>
<str name="importResponse">Configuration Re-loaded sucessfully</str>
<lst name="statusMessages">...</lst>
<str name="WARNING">...</str>
</response>

I'm not very familiar with the verbose output, but it seems that with
rootEntity="true", one query is made to retrieve the stores, and then two
and four queries are made to the nested storearticle entity for the first
and second store respectively. With rootEntity="false", two queries are
made to retrieve the stores, and then one and two queries are made to the
nested storearticle entity. It seems odd that both cases produce multiple
queries for the second store, but maybe that's expected?
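
To double-check my reading of those counts, here is a rough Python sketch (not part of my Solr setup) that tallies the "query" entries reported for each entity block in a verbose-output dump. The function names and the trimmed SAMPLE are my own; SAMPLE keeps only the query entries from the rootEntity="true" output above, and I'm assuming that queries listed inside a document#N block belong to the enclosing entity.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the rootEntity="true" verbose output: only the
# <str name="query"> entries and the enclosing <lst> blocks are kept.
SAMPLE = """
<lst name="verbose-output">
  <lst name="entity:store">
    <lst name="document#1">
      <str name="query">../../../data/StoresTest.xml</str>
      <lst name="entity:storearticle">
        <str name="query">../../../data/StoreArticlesTest.xml</str>
        <str name="query">../../../data/StoreArticlesTest.xml</str>
      </lst>
    </lst>
    <lst name="document#2">
      <lst name="entity:storearticle">
        <str name="query">../../../data/StoreArticlesTest.xml</str>
        <str name="query">../../../data/StoreArticlesTest.xml</str>
        <str name="query">../../../data/StoreArticlesTest.xml</str>
        <str name="query">../../../data/StoreArticlesTest.xml</str>
      </lst>
    </lst>
    <lst name="document#3"/>
  </lst>
</lst>
"""


def is_query(node):
    """True for a <str name="query"> element."""
    return node.tag == "str" and node.get("name") == "query"


def own_queries(entity):
    """Count queries listed directly under an entity block or under its
    document#N children, without descending into nested entities."""
    total = 0
    for child in entity:
        if is_query(child):
            total += 1
        elif child.tag == "lst" and child.get("name", "").startswith("document#"):
            total += sum(1 for grandchild in child if is_query(grandchild))
    return total


def count_queries(verbose_xml):
    """Return (entity name, query count) pairs in document order."""
    root = ET.fromstring(verbose_xml)
    return [
        (block.get("name"), own_queries(block))
        for block in root.iter("lst")
        if block.get("name", "").startswith("entity:")
    ]


print(count_queries(SAMPLE))
# [('entity:store', 1), ('entity:storearticle', 2), ('entity:storearticle', 4)]
```

Run against the rootEntity="false" dump, the same tally should give two queries for the store entity and then one and two for storearticle, which matches what I described above.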

Anyway, although the queries differ, the result is the same.
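
For reference, this is the per-store result I would expect. It is a minimal Python sketch using the standard library's ElementTree on the sample data from my original message (quoted below). It only illustrates the expected lookup and says nothing about how XPathEntityProcessor evaluates the expression; the helper name articles_for_store is made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Sample data copied from StoreArticlesTest.xml (XML declaration omitted).
STORE_ARTICLES_XML = (
    "<StoreArticles>"
    '<Store StoreId="0102"><ArticleId>18004</ArticleId></Store>'
    '<Store StoreId="0104"><ArticleId>17004</ArticleId>'
    "<ArticleId>10004</ArticleId></Store>"
    "</StoreArticles>"
)


def articles_for_store(xml_text, store_id):
    """Return the ArticleId values listed under the Store with this StoreId."""
    root = ET.fromstring(xml_text)  # root element is <StoreArticles>
    path = "./Store[@StoreId='{}']/ArticleId".format(store_id)
    return [node.text for node in root.findall(path)]


for store_id in ("0102", "0104"):
    print(store_id, articles_for_store(STORE_ARTICLES_XML, store_id))
# 0102 ['18004']
# 0104 ['17004', '10004']
```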

/Tobias

2012/7/22 Ahmet Arslan <iori...@yahoo.com>

> > I'm trying to index a set of stores and their articles. I have two
> > XML files, one that contains the store data and one that contains the
> > articles for each store. I'm using DIH with XPathEntityProcessor to
> > process the file containing the stores, and with a nested entity I try
> > to get all articles that belong to the specific store. The problem I
> > encounter is that every store gets the same articles.
> >
> > For testing purposes I've stripped down the XML files to include only
> > IDs. The store file (StoresTest.xml) looks like this:
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <Stores><Store><Id>0102</Id></Store><Store><Id>0104</Id></Store></Stores>
> >
> > The Store-Articles relations file (StoreArticlesTest.xml) looks like this:
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <StoreArticles>
> > <Store StoreId="0102"><ArticleId>18004</ArticleId></Store>
> > <Store StoreId="0104"><ArticleId>17004</ArticleId><ArticleId>10004</ArticleId></Store>
> > </StoreArticles>
> >
> > And my dih-config file looks like this:
> >
> > <dataConfig>
> >   <dataSource type="FileDataSource" encoding="UTF-8" />
> >   <document>
> >     <entity name="store"
> >             processor="XPathEntityProcessor"
> >             stream="true"
> >             forEach="/Stores/Store"
> >             url="../../../data/StoresTest.xml"
> >             transformer="TemplateTransformer">
> >       <field column="id" xpath="/Stores/Store/Id" />
> >       <entity name="storearticle"
> >               processor="XPathEntityProcessor"
> >               stream="true"
> >               forEach="/StoreArticles"
> >               url="../../../data/StoreArticlesTest.xml"
> >               transformer="LogTransformer"
> >               logTemplate="Processing ${store.id}" logLevel="info"
> >               rootEntity="true">
> >         <field column="store_articles_txt"
> >                xpath="/StoreArticles/Store[@StoreId='${ store.id}']/ArticleId" />
> >       </entity>
> >     </entity>
> >   </document>
> > </dataConfig>
> >
> > The result I get in Solr is this:
> >
> > <response>
> > <lst name="responseHeader">...</lst>
> > <result name="response" numFound="2" start="0">
> > <doc>
> > <str name="id">0102</str>
> > <arr name="store_articles_txt">
> > <str>18004</str>
> > </arr>
> > </doc>
> > <doc>
> > <str name="id">0104</str>
> > <arr name="store_articles_txt">
> > <str>18004</str>
> > </arr>
> > </doc>
> > </result>
> > </response>
> >
> > As you can see, both stores get the article for the first store. I would
> > have expected the second store to have two articles: 17004 and 10004.
> >
> > In the log messages printed by LogTransformer I see that each store.id
> > is processed, but somehow it only picks up the articles for the first
> > store.
> >
> > Any ideas?
>
> What happens when you set <entity name="store" rootEntity="false" ?
> What is your uniqueKey in schema.xml?
>
