Does it all have to be in a single cloud?

On Mon, Jan 28, 2019, 10:34 PM Shawn Heisey <apa...@elyograg.org> wrote:

> On 1/28/2019 8:12 PM, Monica Skidmore wrote:
> > I would have to negotiate with the middleware teams - but, we've used a
> > core per customer in master-slave mode for about 3 years now, with great
> > success.  Our pool of data is very large, so limiting a customer's searches
> > to just their core keeps query times fast (or at least reduces the chances
> > of one customer impacting another with expensive queries).  There is also a
> > little security added - since the customer is required to provide the core
> > to search, there is less chance that they'll see another customer's data in
> > their responses (like they might if they 'forgot' to add a filter to their
> > query).  We were hoping that moving to Cloud would help our management of
> > the largest customers - some of which we'd like to sub-shard with the cloud
> > tooling.  We expected cloud to support as many cores/collections as our
> > 2-versions-old Solr instances - but we didn't count on all the increased
> > network traffic or the extra complications of bringing up a large cloud
> > cluster.
>
> At this time, SolrCloud will not handle what you're trying to throw at
> it.  Without Cloud, Solr can fairly easily handle thousands of indexes,
> because there is no communication between nodes about cluster state.
> The immensity of that communication (handled via ZooKeeper) is why
> SolrCloud can't scale to thousands of shard replicas.
>
> The solution to this problem will be twofold:  1) Reduce the number of
> work items in the Overseer queue.  2) Make the Overseer do its job a lot
> faster.  There have been small incremental improvements towards these
> goals, but as you've noticed, we're definitely not there yet.
>
> On the subject of a customer forgetting to add a filter ... your systems
> should be handling that for them ... if the customer has direct access
> to Solr, then all bets are off... they'll be able to do just about
> anything they want.  It is possible to configure a proxy to limit what
> somebody can get to, but it would be pretty complicated to come up with
> a proxy configuration that fully locks things down.
>
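
To make the "your systems should be handling that" point concrete, here is a
minimal sketch of a middleware helper that always attaches the customer's
filter before a request reaches Solr. The field name customer_id, the
allowlist, and the function name are all assumptions for illustration and are
not taken from anyone's actual setup.

```python
from urllib.parse import urlencode

# Hypothetical allowlist of parameters a customer may pass through.
ALLOWED_PARAMS = {"q", "start", "rows", "sort", "fl"}


def build_customer_query(customer_id, user_params):
    """Return a Solr query string that always carries the customer's filter."""
    # Drop anything not explicitly allowed, so the caller cannot supply
    # their own fq (or any other parameter outside the allowlist).
    safe = [(k, v) for k, v in user_params.items() if k in ALLOWED_PARAMS]
    # Send the ID as a quoted phrase; escape backslashes and double quotes.
    escaped = customer_id.replace("\\", "\\\\").replace('"', '\\"')
    # "customer_id" is an assumed field name, not one from this thread.
    safe.append(("fq", f'customer_id:"{escaped}"'))
    return urlencode(safe)


# The client-supplied fq is discarded and replaced by the enforced one.
query_string = build_customer_query(
    "acme", {"q": "title:report", "fq": "customer_id:other"}
)
```

This only covers the filter itself. It does not attempt the full lock-down
that a proxy in front of Solr would need.
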
> Using shards is completely possible without SolrCloud.  But SolrCloud
> certainly does make it a lot easier.
>
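
For reference, sharding without SolrCloud generally means passing the shards
request parameter yourself. Below is a rough sketch of building such a
request URL. The host names, core names, and helper function are made up for
illustration, and with this approach documents have to be routed to the right
core at index time by your own code.

```python
from urllib.parse import urlencode


def distributed_select_url(coordinator, shard_cores, query):
    """Build a /select URL that fans the query out to the listed cores."""
    # Solr's "shards" parameter takes a comma-separated list of
    # host:port/solr/core entries, without the http:// scheme.
    params = {"q": query, "shards": ",".join(shard_cores)}
    # Keep ':', '/' and ',' unescaped so the URL stays readable.
    return f"http://{coordinator}/select?{urlencode(params, safe=':/,')}"


# Hypothetical hosts and cores for one large customer split in two.
url = distributed_select_url(
    "solr1.example.com:8983/solr/bigcustomer_a",
    [
        "solr1.example.com:8983/solr/bigcustomer_a",
        "solr2.example.com:8983/solr/bigcustomer_b",
    ],
    "title:report",
)
```

Any one of the cores can act as the coordinator. It queries every core listed
in shards and merges the results, so the unique key must not repeat across
cores.
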
> How many records in your largest customer indexes?  How big are those
> indexes on disk?
>
> Thanks,
> Shawn
>
