Re: Better to have lots of smaller cores or one really big core?

Erick Erickson Thu, 02 Jun 2011 18:03:14 -0700

Take another approach <G>? Cores are often used for isolation
purposes. That is, the data in one core may have nothing to do with
another core, the schemas don't have to match etc. They #may# be
both logically and physically separate.

I don't have measurements for this, so I'm guessing a little. But I expect
that using multiple cores will actually use a few more resources than a
single core (e.g. memory). Each core will be keeping a separate
cache, duplicating terms etc. (I may be wrong on this one!).

But if you have a single schema in a logically single core that just grows
too big to serve queries acceptably, the usual approach is to go to
shards. A shard is just a core, but Solr manages the query part over
multiple shards via configuration, which is probably easier. So the answer
in this case is to put stuff on a single machine in a single core until it
grows too big, then go to sharding....
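
To make the sharding option concrete, here is a minimal sketch of building a
distributed query by hand with Solr's `shards` request parameter. The host
names, core names and the helper function are made up for illustration; only
the `q`, `rows` and `shards` parameters are Solr's own.

```python
from urllib.parse import urlencode


def build_sharded_query_url(base_url, query, shards, rows=10):
    """Return a select URL that asks Solr to fan the query out over `shards`.

    `base_url` is the core that receives the request (hypothetical here);
    `shards` is a list of host:port/path entries, one per shard.
    """
    params = {
        "q": query,
        "rows": rows,
        # Solr expects a comma-separated list of shard addresses.
        "shards": ",".join(shards),
    }
    # Keep ':', '/' and ',' unescaped so the shard list stays readable.
    return f"{base_url}/select?{urlencode(params, safe=':/,')}"


# Hypothetical three-shard layout.
shards = [f"solr{i}.example.com:8983/solr/core{i}" for i in range(3)]
url = build_sharded_query_url(
    "http://solr0.example.com:8983/solr/core0", "title:lucene", shards
)
print(url)
print(len(url))  # the URL grows with every shard added
```

The same shard list can also be set as a default in the request handler
configuration, which avoids assembling the long URL on every request.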

So the question is really whether you consider the cores sub-parts of a
single index or distinct units (say one core per customer). In the former,
I'd use one core until it gets too big, then shard. In the latter, multiple
cores are a good solution, largely for administrative/security reasons,
but then you aren't manually constructing a huge URL...

Hope that helps
Erick

On Thu, Jun 2, 2011 at 7:57 PM, JohnRodey <timothydd...@yahoo.com> wrote:
> I am trying to decide what the right approach would be, to have one big core
> or many smaller cores hosted by a solr instance.
>
> I think there may be trade offs either way but wanted to see what others do.
> And by small I mean about 5-10 million documents, large may be 50 million.
>
> It seems like small cores are better because
> - If one server can host say 70 million documents (before memory issues) we
> can get really close with a bunch of small indexes, vs only being able to
> host one 50 million document index.  And when a software update comes out
> that allows us to host 90 million then we could add a few more small
> indexes.
> - It takes less time to build ten 5 million document indexes than one 50
> million document index.
>
> It seems like larger cores are better because
> - Each core returns its result set, so if I want 1000 results and there
> are 100 cores the network is transferring 100000 documents for that search.
> Where if I had only 10 much larger cores only 10000 documents would be sent
> over the network.
> - It would prolong my time until I hit uri length limits being that there
> would be fewer cores in my system.
>
> Any thoughts???  Other trade-offs???
>
> How do you find what the right size for you is?
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Better-to-have-lots-of-smaller-cores-or-one-really-big-core-tp3017973p3017973.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
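
The network-transfer arithmetic in the quoted message can be written out as a
small sketch. It takes the quoted message's assumption at face value, namely
that every core sends back a full result set of the requested size; the
function name is made up for illustration.

```python
def docs_transferred(rows, num_cores):
    """Documents sent over the network if every core returns `rows` results."""
    return rows * num_cores


# The two layouts from the quoted message, each asked for 1000 results.
for num_cores in (100, 10):
    print(num_cores, "cores:", docs_transferred(1000, num_cores), "documents")
```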
