Re: SolrCloud and Join Queries

Per Steffensen Fri, 04 Jan 2013 02:28:56 -0800

On 1/4/13 9:21 AM, Hassan wrote:

Hi,
I am considering SolrCloud for our applications but I have run into the limitation of not being able to use Join Queries in distributed searches.
Our requirements are the following:
- SolrCloud will serve many applications where each application "index" is separate from other applications. Each application really is a customer deployment and we need to isolate customers' data from each other.
- Join queries are required. Queries will only look at one customer at a time.
- Since data volume for each customer is small by Solr/Lucene standards (1-2 million documents is small, right?

Yes

), we are really interested in the replication aspect of SolrCloud more than distributed search.
I am considering the following SolrCloud design with questions:
- Start SolrCloud with 1 shard only. This should allow join queries to work correctly since all documents will be available in the same shard (index). Is this a correct assumption?
- Each customer will have its own collection in the SolrCloud.

You can't have only one shard and several collections. A collection consists of a number of shards, but a shard "belongs" to a collection, so two different collections do not use the same shard. Shard is "below" collection in the concept hierarchy, so to speak.
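To make the hierarchy concrete, here is a small toy model in Python. It is only an illustration of the concepts, not Solr code: the class names, the "shard1" label and the customer names are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    name: str
    collection: str  # a shard belongs to exactly one collection


@dataclass
class Collection:
    name: str
    shards: list = field(default_factory=list)

    def add_shard(self, shard_name):
        # Shards are always created inside a collection, never shared.
        shard = Shard(name=shard_name, collection=self.name)
        self.shards.append(shard)
        return shard


def build_customer_collections(customers):
    """One collection per customer, each with a single shard of its own."""
    collections = {}
    for customer in customers:
        coll = Collection(name=customer)
        coll.add_shard("shard1")  # hypothetical shard label
        collections[customer] = coll
    return collections
```

With two customers you get two collections and two distinct shard objects, even though both shards carry the same label.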

Do collections provide me with data isolation between customers?

Yes?

Depends on what you mean by "isolation". Since different collections imply different shards, and each shard basically has its own Lucene index (a set of Lucene indices if you use replication), and distinct Lucene indices typically persist in different disk folders, you will get "isolation" of data in the sense that data for different customers will be stored in different disk folders.

- Adding more nodes as replicas of the single shard to achieve replication and fault tolerance.
Thank you,
Hs

Not sure I understand completely what you want to achieve, but you might want to have a collection per customer. One shard per collection = one shard per customer = (as long as we do not consider replication) one Lucene index per customer = one data-disk-folder per customer. You should be able to do join queries inside the specific customer's shard.
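As a rough sketch of what that could look like over HTTP, the Python below only builds the request URLs and does not send anything. It assumes a Solr node at localhost:8983, uses the Collections API CREATE action with numShards=1, and uses the {!join} query parser. The collection name and the field names parent_id and id are hypothetical, and you would still need a suitable config set in ZooKeeper.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumed host and port


def create_collection_url(customer, replication_factor=2):
    """URL for creating a single-shard collection for one customer."""
    params = {
        "action": "CREATE",
        "name": customer,
        "numShards": 1,  # one shard, so joins see all of the customer's docs
        "replicationFactor": replication_factor,
    }
    return f"{SOLR_BASE}/admin/collections?{urlencode(params)}"


def join_query_url(customer, from_field, to_field, inner_query):
    """URL for a join query sent to the customer's own collection."""
    # Doubled braces produce the literal { and } of the {!join ...} syntax.
    q = f"{{!join from={from_field} to={to_field}}}{inner_query}"
    return f"{SOLR_BASE}/{customer}/select?{urlencode({'q': q, 'wt': 'json'})}"
```

For example, join_query_url("customerA", "parent_id", "id", "type:order") targets only the customerA collection, so the join runs inside that customer's single shard.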


Regards, Per Steffensen
