You have choices:

- Use a separate collection for each data import
- Use the same collection for each data import, differentiating them using a field you can query
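
For the second option, here is a rough sketch of what querying by a distinguishing field could look like. The host, the collection name "alldata", and the field name "source" are made up for illustration; the only Solr features it relies on are the standard q and fq parameters of the /select handler.

```python
from urllib.parse import urlencode

# Hypothetical names: adjust host, collection and field to your setup.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "alldata"
SOURCE_FIELD = "source"  # field that each import fills with its own tag


def select_url(user_query, source=None):
    """Build a /select URL, optionally restricted to one data import."""
    params = [("q", user_query)]
    if source is not None:
        # Escape backslashes and double quotes so the tag is a valid phrase.
        escaped = source.replace("\\", "\\\\").replace('"', '\\"')
        # fq narrows the result set without affecting relevance scoring.
        params.append(("fq", '%s:"%s"' % (SOURCE_FIELD, escaped)))
    return "%s/%s/select?%s" % (SOLR_BASE, COLLECTION, urlencode(params))
```

With something like this, the query itself stays short no matter how many databases feed the collection; only the fq value changes.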
The choice depends on the objects and how they will be used, and I trust others on this list to have better advice on how to choose.

-----Original Message-----
From: Yangrui Guo [mailto:guoyang...@gmail.com]
Sent: Tuesday, April 05, 2016 11:27 AM
To: solr-user@lucene.apache.org
Subject: Re: Multiple data-config.xml in one collection?

Hi thanks for the answer. Yes I will be using DIH to import data from different database connections. Do I have to create a collection for each connection?

On Tuesday, April 5, 2016, Shawn Heisey <apa...@elyograg.org> wrote:

> On 4/5/2016 8:12 AM, Yangrui Guo wrote:
> > I'm using Solr Cloud to index a number of databases. The problem is
> > there is unknown number of databases and each database has its own
> > configuration.
> > If I create a single collection for every database the query would
> > eventually become insanely long. Is it possible to upload different
> > config to zookeeper for each node in a single collection?
>
> Every shard replica (core) in a collection shares the same
> configuration, which it gets from zookeeper. This is one of
> SolrCloud's guarantees, to prevent problems found with old-style
> sharding when the configuration is different on each machine.
>
> If you're using the dataimport handler, which you probably are since
> you mentioned databases, you can parameterize pretty much everything
> in the DIH config file so it comes from URL parameters on the
> full-import or delta-import command.
>
> Below is a link to the DIH config that I'm using, redacted slightly.
> I'm not running SolrCloud, but the same thing should work in cloud.
> It should give you some idea of how to use variables in your config,
> set by parameters on the URL.
>
> http://apaste.info/jtq
>
> Thanks,
> Shawn
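
To make the suggestion about URL parameters a bit more concrete, here is a rough sketch of building a full-import request per database. The parameter names (dbHost, dbName, sourceTag), the host, the collection name, and the /dataimport handler path are assumptions for illustration, not taken from the linked config. Inside the DIH config, request parameters are referenced as ${dataimporter.request.dbHost} and so on. clean=false is included because full-import cleans the index by default, which would wipe documents loaded from the other databases.

```python
from urllib.parse import urlencode

# Hypothetical names: adjust host, collection and handler path to your setup.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "alldata"


def full_import_url(db_host, db_name, source_tag):
    """Build a DIH full-import URL carrying per-database settings."""
    params = [
        ("command", "full-import"),
        ("clean", "false"),   # keep documents from earlier imports
        ("commit", "true"),
        # Invented parameter names; the DIH config would read them via
        # ${dataimporter.request.dbHost}, ${dataimporter.request.dbName}, ...
        ("dbHost", db_host),
        ("dbName", db_name),
        ("sourceTag", source_tag),
    ]
    return "%s/%s/dataimport?%s" % (SOLR_BASE, COLLECTION, urlencode(params))
```

One config in zookeeper can then serve any number of databases: each import is just another call with different values, and sourceTag can fill the field used to tell the imports apart at query time.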