A DIH config can have only one <document> element, and all of your
entities must be nested within it.

From the wiki: if you issue a simple "/dataimport?command=full-import",
all top-level entities will be processed.

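As a rough sketch of what that means for the config quoted below (same
queries and field names as in Rupert's mail, JDBC url left elided as in
the original; I have not run this against his schema), both entities
would sit under a single document element:

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://...." />

  <!-- One document element; each table is a top-level entity inside it -->
  <document>

    <!-- BLOG ENTRIES -->
    <entity name="blog_entries"
            query="select id,title,keywords,summary,data,title as name_fc,'BlogEntry' as type from blog_entries">
      <field column="id" name="pk_i" />
      <field column="id" name="id" />
      <field column="title" name="text_t" />
      <field column="data" name="text_t" />
    </entity>

    <!-- MESH CATEGORIES -->
    <entity name="mesh_categories"
            query="select id,name,node_key,name as name_fc,'MeshCategory' as type from mesh_categories">
      <field column="id" name="pk_i" />
      <field column="id" name="id" />
      <field column="name" name="text_t" />
      <field column="node_key" name="string" />
      <field column="name_fc" name="facet_value" />
      <field column="type" name="type_t" />
    </entity>

  </document>
</dataConfig>
```

Each row returned by a top-level entity still becomes its own Solr
document, so the two content types stay separate in the index. If you
want to import just one table at a time, the wiki also describes an
"entity" request parameter, e.g.
"/dataimport?command=full-import&entity=mesh_categories". I believe you
would want clean=false on such a run so the other entity's documents
are not wiped first, but please check that against your Solr version.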

>Maybe I should be more clear: I have multiple tables in my DB that I
>need to save to my Solr index. In my app code I have logic to persist
>each table, which maps to an application model, to Solr. This is fine.
>I am just trying to speed up indexing time by using DIH instead of
>going through my application. From what I understand of DIH I can
>specify one dataSource element and then a series of document/entity
>sets, for each of my models. But like I said before, DIH only appears
>to want to index the first document declared under the dataSource tag.
>
>-Rupert
>
>On Tue, Sep 8, 2009 at 4:05 PM, Rupert Fiasco<rufia...@gmail.com> wrote:
>> I am using the DataImportHandler with a JDBC datasource. From my
>> understanding of DIH, for each of my "content types" e.g. Blog posts,
>> Mesh Categories, etc I would construct a series of document/entity
>> sets, like
>>
>> <dataConfig>
>> <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://...." />
>>
>>    <!-- BLOG ENTRIES -->
>>    <document name="blog_entries">
>>      <entity name="blog_entries" query="select
>> id,title,keywords,summary,data,title as name_fc,'BlogEntry' as type
>> from blog_entries">
>>        <field column="id" name="pk_i" />
>>        <field column="id" name="id" />
>>        <field column="title" name="text_t" />
>>        <field column="data" name="text_t" />
>>      </entity>
>>    </document>
>>
>>    <!-- MESH CATEGORIES -->
>>    <document name="mesh_category">
>>      <entity name="mesh_categories" query="select
>> id,name,node_key,name as name_fc,'MeshCategory' as type from
>> mesh_categories">
>>        <field column="id" name="pk_i" />
>>        <field column="id" name="id" />
>>        <field column="name" name="text_t" />
>>        <field column="node_key" name="string" />
>>        <field column="name_fc" name="facet_value" />
>>        <field column="type" name="type_t" />
>>      </entity>
>>    </document>
>> </dataConfig>
>>
>>
>> Solr parses this just fine and allows me to issue a
>> /dataimport?command=full-import and it runs, but it only runs against
>> the "first" document (blog_entries). It doesn't run against the 2nd
>> document (mesh_categories).
>>
>> If I remove the 2 document elements and wrap both entity sets in just
>> one document tag, then both sets get indexed, which seemingly achieves
>> my goal. This just doesn't make sense from my understanding of how DIH
>> works. My 2 content types are indeed separate so they logically
>> represent two document types, not one.
>>
>> Is this correct? What am I missing here?
>>
>> Thanks
>> -Rupert
>>

-- 

===============================================================
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021

Unix/Mac/Intranets             Analyst Programmer
===============================================================
