DIH allows only one <document> tag. You may have multiple root <entity>
tags under it, and you may invoke them by name(s). When no name is passed,
all root entities are invoked one after another.
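
A minimal sketch of that layout, reusing the table and column names from
your config quoted below (field lists abbreviated, and the JDBC URL left
elided as in your mail):

    <dataConfig>
      <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://...." />
      <!-- one document; each content type is a root entity under it -->
      <document>
        <entity name="blog_entries"
                query="select id, title, data from blog_entries">
          <field column="id" name="id" />
          <field column="title" name="text_t" />
          <field column="data" name="text_t" />
        </entity>
        <entity name="mesh_categories"
                query="select id, name, 'MeshCategory' as type from mesh_categories">
          <field column="id" name="id" />
          <field column="name" name="text_t" />
          <field column="type" name="type_t" />
        </entity>
      </document>
    </dataConfig>

With a config like this, a single root entity can be run by passing its
name in the entity request parameter, for example:

    /dataimport?command=full-import&entity=mesh_categories

and a plain /dataimport?command=full-import runs blog_entries and then
mesh_categories.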

On Wed, Sep 9, 2009 at 5:12 AM, Rupert Fiasco<rufia...@gmail.com> wrote:
> Maybe I should be more clear: I have multiple tables in my DB that I
> need to save to my Solr index. In my app code I have logic to persist
> each table, which maps to an application model, to Solr. This is fine.
> I am just trying to speed up indexing time by using DIH instead of
> going through my application. From what I understand of DIH I can
> specify one dataSource element and then a series of document/entity
> sets, for each of my models. But like I said before, DIH only appears
> to want to index the first document declared under the dataSource tag.
>
> -Rupert
>
> On Tue, Sep 8, 2009 at 4:05 PM, Rupert Fiasco<rufia...@gmail.com> wrote:
>> I am using the DataImportHandler with a JDBC datasource. From my
>> understanding of DIH, for each of my "content types" e.g. Blog posts,
>> Mesh Categories, etc I would construct a series of document/entity
>> sets, like
>>
>> <dataConfig>
>> <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://...." />
>>
>>    <!-- BLOG ENTRIES -->
>>    <document name="blog_entries">
>>      <entity name="blog_entries" query="select
>> id,title,keywords,summary,data,title as name_fc,'BlogEntry' as type
>> from blog_entries">
>>        <field column="id" name="pk_i" />
>>        <field column="id" name="id" />
>>        <field column="title" name="text_t" />
>>        <field column="data" name="text_t" />
>>      </entity>
>>    </document>
>>
>>    <!-- MESH CATEGORIES -->
>>    <document name="mesh_category">
>>      <entity name="mesh_categories" query="select
>> id,name,node_key,name as name_fc,'MeshCategory' as type from
>> mesh_categories">
>>        <field column="id" name="pk_i" />
>>        <field column="id" name="id" />
>>        <field column="name" name="text_t" />
>>        <field column="node_key" name="string" />
>>        <field column="name_fc" name="facet_value" />
>>        <field column="type" name="type_t" />
>>      </entity>
>>    </document>
>> </dataConfig>
>>
>>
>> Solr parses this just fine and allows me to issue a
>> /dataimport?command=full-import and it runs, but it only runs against
>> the "first" document (blog_entries). It doesn't run against the 2nd
>> document (mesh_categories).
>>
>> If I remove the 2 document elements and wrap both entity sets in just
>> one document tag, then both sets get indexed, which seemingly achieves
>> my goal. This just doesn't make sense from my understanding of how DIH
>> works. My 2 content types are indeed separate so they logically
>> represent two document types, not one.
>>
>> Is this correct? What am I missing here?
>>
>> Thanks
>> -Rupert
>>
>



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
