On Wed, 2015-03-25 at 03:46 +0100, Ian Rose wrote:
> Thus theoretically we could actually just use one single collection for
> all of our customers (adding a 'customer:<whatever>' type fq to all
> queries) but since we never need to query across customers it seemed
> more performant (as well as safer - less chance of accidentally
> leaking data across customers) to use separate collections.

If only a few customers are active at a given time, one collection per
customer is the more performant choice. If many of them are active, the
more performant option is to lump them together and filter on a field,
due to the redundancy reduction of larger indexes.
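
To make the two layouts concrete, here is a minimal sketch of how the
request would differ. It only builds Solr select URLs and does not contact
a server. The host, the collection names (customer_<id>, all_customers)
and the helper names are assumptions for illustration; the customer field
is the one from the quoted fq suggestion.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance (illustrative only).
SOLR_BASE = "http://localhost:8983/solr"


def per_customer_url(customer_id, user_query):
    """One collection per customer: the collection name selects the data."""
    collection = f"customer_{customer_id}"  # hypothetical naming scheme
    params = urlencode({"q": user_query})
    return f"{SOLR_BASE}/{collection}/select?{params}"


def shared_collection_url(customer_id, user_query):
    """Single shared collection: an fq restricts results to one customer."""
    params = urlencode({
        "q": user_query,
        "fq": f"customer:{customer_id}",  # filter query on the customer field
    })
    return f"{SOLR_BASE}/all_customers/select?{params}"


if __name__ == "__main__":
    print(per_customer_url("acme", "solr"))
    print(shared_collection_url("acme", "solr"))
```

In the shared layout every query must carry the fq, which is where the
risk of leaking data across customers comes from if it is ever omitted.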

The one-collection-per-customer solution has another edge: ranking will
be calculated based on the corpus of that customer only, not on the
documents of all customers. If the number of customers is low enough for
the individual-collections solution to work, that would be the
preferable solution.
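
The ranking point can be illustrated with a toy calculation. This is a
sketch assuming a classic TF-IDF style inverse document frequency,
idf = 1 + ln(numDocs / (docFreq + 1)); the document counts are made up
and the actual scoring depends on the similarity configured in Solr.

```python
import math


def idf(num_docs, doc_freq):
    """Classic TF-IDF style inverse document frequency (assumed formula)."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical numbers: a term that is rare in one customer's documents
# but common across the documents of all customers combined.
idf_own_collection = idf(num_docs=1_000, doc_freq=5)
idf_shared_collection = idf(num_docs=100_000, doc_freq=20_000)

if __name__ == "__main__":
    print("idf in the customer's own collection:", idf_own_collection)
    print("idf in a shared collection:", idf_shared_collection)
```

With these numbers the term weighs more in the customer's own collection
than in the shared one, so the same query against the same documents
could rank results differently depending on the layout.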

- Toke Eskildsen, State and University Library, Denmark

