You have to specify which one to run. Each DIH request handler runs only the
one XML config it points to (e.g. health-topics-conf.xml).
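
To make "specify which one to run" concrete, here is a minimal Python sketch
that builds the command URL for one named handler at a time. The host, the
collection name "mycollection" and the handler paths are made up for
illustration; use whatever names you registered in solrconfig.xml. Passing
clean=false is my assumption about what you would want when several handlers
feed one collection, since as far as I know a full-import cleans the index by
default.

```python
from urllib.parse import urlencode


def dih_command_url(base_url, collection, handler, command="full-import", **params):
    """Build the URL that sends one command to one named DIH request handler."""
    query = {"command": command}
    query.update(params)  # extra request parameters, e.g. clean="false"
    return f"{base_url.rstrip('/')}/{collection}/{handler.strip('/')}?{urlencode(query)}"


# Hypothetical names: adjust host, collection and handler paths to your setup.
SOLR = "http://localhost:8983/solr"
health_url = dih_command_url(
    SOLR, "mycollection", "/dataimport-health-topics", clean="false")
encyclopedia_url = dih_command_url(
    SOLR, "mycollection", "/dataimport-encyclopedia", clean="false")

print(health_url)
print(encyclopedia_url)
```

Each URL triggers only the handler named in its path, so the two imports can
be started independently (for example with curl or a scheduled job).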

One thing, and please correct me if I am wrong: I have noticed that running
DataImport for a particular config overwrites the existing data for a
document; that is, there is no way to preserve the existing data.
For example, if you have a schema of 5 fields and running the
health-topics-conf.xml DIH loads 3 of those fields for a document (id=XYZ),
then running the encyclopedia-conf.xml DIH will overwrite those 3 fields
for the same document id=XYZ.
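
Here is a toy Python model of the behaviour I am describing. It is not Solr
code; it only assumes that an ordinary add for an existing id replaces the
stored document rather than merging into it, which is my understanding of how
a plain add works. The field names and values are made up.

```python
def add_document(index, doc, key="id"):
    """Toy model of a plain add: the new document replaces any stored one with the same key."""
    index[doc[key]] = dict(doc)
    return index


index = {}

# First import (think health-topics-conf.xml) supplies three fields.
add_document(index, {"id": "XYZ", "title": "Asthma", "summary": "short text"})

# Second import (think encyclopedia-conf.xml) uses the same id.
add_document(index, {"id": "XYZ", "title": "Asthma (encyclopedia)", "body": "long text"})

# Only the fields from the second import remain under this id.
print(index["XYZ"])
```

In this model the "summary" value loaded by the first import is gone after the
second one, which is why different sources need distinct ids if both sets of
data should survive.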

-----Original Message-----
From: Yangrui Guo [mailto:guoyang...@gmail.com] 
Sent: Tuesday, April 05, 2016 2:16 PM
To: solr-user@lucene.apache.org
Subject: Re: Multiple data-config.xml in one collection?

Hi Daniel,

So if I implement multiple dataimporthandlers and do a full import, does Solr 
perform the import for all handlers at once, or can I just specify which 
handler to import? Thank you

Yangrui

On Tuesday, April 5, 2016, Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov>
wrote:

> If Shawn is correct, and you are using DIH, then I have done this by 
> implementing multiple requestHandlers each of them using Data Import 
> Handler, and have each specify a different XML file for the data config.
> Instead of using data-config.xml, I've used a large number of files such as:
>         health-topics-conf.xml
>         encyclopedia-conf.xml
>         ...
> I tend to index a single-valued, required field named "source" that I 
> can use in the delete query, and I use the TemplateTransformer to make this 
> easy:
>
> <entity name="topic"
>     ...
>    transformer="TemplateTransformer">
>    <field column="source" template="health-topics" />
>    ...
>
> Hope this helps,
>
> -Dan
>
> -----Original Message-----
> From: Shawn Heisey [mailto:apa...@elyograg.org]
> Sent: Tuesday, April 05, 2016 10:50 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Multiple data-config.xml in one collection?
>
> On 4/5/2016 8:12 AM, Yangrui Guo wrote:
> > I'm using Solr Cloud to index a number of databases. The problem is 
> > there is an unknown number of databases and each database has its own
> > configuration.
> > If I create a single collection for every database the query would 
> > eventually become insanely long. Is it possible to upload different 
> > config to zookeeper for each node in a single collection?
>
> Every shard replica (core) in a collection shares the same 
> configuration, which it gets from zookeeper.  This is one of 
> SolrCloud's guarantees, to prevent problems found with old-style 
> sharding when the configuration is different on each machine.
>
> If you're using the dataimport handler, which you probably are since 
> you mentioned databases, you can parameterize pretty much everything 
> in the DIH config file so it comes from URL parameters on the 
> full-import or delta-import command.
>
> Below is a link to the DIH config that I'm using, redacted slightly.
> I'm not running SolrCloud, but the same thing should work in cloud.  
> It should give you some idea of how to use variables in your config, 
> set by parameters on the URL.
>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__apaste.info_jtq&d=
> CwIBaQ&c=uGuXJ43KPkPWEl2imVFDmZQlhQUET7pVRA2PDIOxgqw&r=bRfqJEeedEKG5nk
> p5748YxbNMFrUYT3YiNl0Ni2vUBQ&m=ps8KnPZhgym3oVyuWub8JT0eZI39W0FLsBW4fx5
> 61NY&s=k7H8l9XT7yyH_KHFtnIi793EtkLZnUvOz3lZA1mV01s&e=
>
> Thanks,
> Shawn
>
>
