I think the JOIN is more performant because, by default, DIH runs the
inner query once for each row returned by the outer one. You can use a
cached source for the inner entity, but the JOIN will still be more
efficient.
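
As a rough sketch of the cached variant, assuming the item and member
tables from the question and the older "where" attribute syntax of
CachedSqlEntityProcessor, the nested config could look something like
this (untested against your schema):

```xml
<document>
    <entity name="item" pk="id" query="SELECT * FROM item">
        <!-- Inner query runs once; its rows are cached and looked up by memberid -->
        <entity name="member" pk="memberid"
                processor="CachedSqlEntityProcessor"
                query="SELECT * FROM member"
                where="memberid=item.memberid">
        </entity>
    </entity>
</document>
```

This avoids one SQL round trip per item row, but the cached member rows
are held in memory by default, so it is mainly suited to lookup tables
that fit comfortably in the heap.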

Nested entities are more useful when the sources are heterogeneous
(e.g. DB and XML) or when you need to apply custom transformers in
between.
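
To illustrate the heterogeneous case, here is a minimal sketch in which
the outer entity reads from a database and the inner one fetches an XML
document per row. The data source names, JDBC settings, URL pattern,
and XPath expressions are all made up for the example:

```xml
<dataConfig>
    <!-- Hypothetical data sources: connection details are placeholders -->
    <dataSource name="db" type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/example"
                user="solr" password="secret"/>
    <dataSource name="xml" type="URLDataSource"/>
    <document>
        <entity name="item" dataSource="db" pk="id"
                query="SELECT * FROM item">
            <!-- Inner entity pulls one XML document for each item row -->
            <entity name="member" dataSource="xml"
                    processor="XPathEntityProcessor"
                    url="http://example.com/members/${item.memberid}.xml"
                    forEach="/member">
                <field column="membername" xpath="/member/name"/>
            </entity>
        </entity>
    </document>
</dataConfig>
```

A SQL JOIN cannot express this, since the second source is not in the
database at all.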

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, Apr 25, 2013 at 10:17 AM, Gustav <xbihy...@sharklasers.com> wrote:
> Hello guys, I saw this thread on Stack Overflow, but I am still not
> satisfied with the answers.
>
> I am trying to index data across multiple tables using Solr's Data Import
> Handler. The official wiki on the DIH suggests using embedded entities to
> link multiple tables like so:
>
> <document>
>     <entity name="item" pk="id" query="SELECT * FROM item">
>         <entity name="member" pk="memberid"
>                 query="SELECT * FROM member WHERE memberid='${item.memberid}'">
>         </entity>
>     </entity>
> </document>
>
> Another way that works is:
>
> <document>
>     <entity name="item" pk="id"
>             query="SELECT * FROM item INNER JOIN member ON item.memberid=member.memberid">
>     </entity>
> </document>
>
> Are these two methods functionally different? Is there a performance
> difference?
>
> Another thought: when using join tables in MySQL, the SQL query method
> with multiple joins could cause multiple documents to be indexed
> instead of one.
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/What-is-the-difference-between-a-Join-Query-and-Embedded-Entities-in-Solr-DIH-tp4058923.html
> Sent from the Solr - User mailing list archive at Nabble.com.
