I'm trying to set up a multi-tenant Solr cluster (v6.5.1). The tenants are different customers with similar types of data. The cluster must meet the following requirements:
* Ability to query per client but also across all clients
* Don't want to hit all shards for all types of requests (per client, across clients)
* Don't want to have everything under a single multi-sharded collection, to avoid a SPOF and maintenance headaches (e.g. a schema change would force a reindex of all clients; a single huge backup/restore)
* Ability to semi-support different schemas

Based on the above I ruled out the following setups:

* Single multi-sharded collection for all clients, and all its variations (e.g. multiple clients in a single shard)
* One collection per client

My preference lies in a setup like the following:

* Create a limited number of collections
* Split the clients across the collections created above based on some criteria (size, content type)
* Client-specific requests will be limited to a single collection
* Cross-client requests will target a limited number of collections (using &collection=col_1,col_2,col_3)

The approach above meets the requirements, but the issue blocking me is that distributed IDF does not work properly across collections. (See comment #3, bullet #2 of http://lucene.472066.n3.nabble.com/Distributed-IDF-in-inter-collections-distributed-queries-td4317519.html)

-> Do you see anything wrong with my assumptions/approach above? Are there any alternatives besides having separate clusters for the search across clients and for the individual clients?
-> Is it safe to go with a single collection? If it is, I still need to handle the possibly different schemas per client somehow.
-> Is there a way to enforce local stats when querying a single collection and use global stats only when querying across collections? (see link above)

Thanks
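
P.S. To make the routing I have in mind concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the client names, the client-to-collection mapping, the base URL and the `client_id` filter field are all hypothetical. Only the `collection=col_1,col_2,col_3` parameter comes from the setup described above. It does not address the distributed IDF problem; it just shows which collections each kind of request would touch.

```python
from urllib.parse import urlencode

# Hypothetical base URL of any node in the cluster.
SOLR_BASE = "http://localhost:8983/solr"

# Hypothetical assignment of clients to a limited number of collections
# (in practice split by size, content type, etc.).
CLIENT_TO_COLLECTION = {
    "client_a": "col_1",
    "client_b": "col_1",
    "client_c": "col_2",
    "client_d": "col_3",
}


def target_collections(clients=None):
    """Return the sorted, de-duplicated collections a request must hit.

    clients=None means "search across all clients".
    """
    if clients is None:
        clients = CLIENT_TO_COLLECTION.keys()
    return sorted({CLIENT_TO_COLLECTION[c] for c in clients})


def build_query_url(q, clients=None):
    """Build a select URL limited to the collections that hold the clients."""
    collections = target_collections(clients)
    params = {"q": q, "collection": ",".join(collections)}
    if clients is not None:
        # Assumes every document carries a (hypothetical) client_id field,
        # since several clients share one collection.
        params["fq"] = "client_id:(" + " OR ".join(sorted(clients)) + ")"
    # Send the request to the first target collection; the collection
    # parameter lists every collection that should be searched.
    return f"{SOLR_BASE}/{collections[0]}/select?" + urlencode(params, safe=",")


if __name__ == "__main__":
    print(build_query_url("title:solr", ["client_a"]))              # one collection
    print(build_query_url("title:solr", ["client_a", "client_c"]))  # two collections
    print(build_query_url("title:solr"))                            # all collections
```

A client-specific request stays within one collection, while a cross-client request lists only the collections that actually hold the requested clients.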