This is beyond my direct area of expertise, but one way to look at
this would be:
1) Create the new collections offline. That could go down to each of
the 6000 clients having its own private collection (embedded
SolrJ/server), or some sort of mini-hubs, e.g. one server per N clients.
2) Bring those collections into the central server.
3) Update the alias that used to point to the previous collection set
so that it points to the new one (rough sketch below):
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-CreateormodifyanAliasforaCollection
4) Delete the old collection set, since nothing points at it anymore.
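
For illustration, here is a rough Python sketch of the alias flip
(steps 3 and 4), with a CREATE call standing in for wherever tonight's
collection set actually comes from. The base URL, alias and collection
names (clients, clients_20150815, clients_20150816), shard/replica
counts and config name are all made up for the example; it just drives
the plain HTTP Collections API that the wiki page above documents:

import requests  # any HTTP client works; the Collections API is plain HTTP

SOLR = "http://localhost:8983/solr"  # assumed central SolrCloud node

def collections_api(**params):
    """Call /admin/collections with the given action and parameters."""
    params["wt"] = "json"  # ask for a JSON response instead of XML
    r = requests.get(SOLR + "/admin/collections", params=params)
    r.raise_for_status()
    return r.json()

# 1) Create tonight's collection (placeholder name, sizing and config set)
collections_api(action="CREATE", name="clients_20150816",
                numShards=2, replicationFactor=2,
                **{"collection.configName": "clients_conf"})

# ... index the night's documents into clients_20150816 here ...

# 3) Re-point the alias the application queries at the new collection
collections_api(action="CREATEALIAS", name="clients",
                collections="clients_20150816")

# 4) Drop yesterday's collection, since nothing references it anymore
collections_api(action="DELETE", name="clients_20150815")

As long as the alias is only flipped after indexing finishes, queries
that go through the alias never see a half-built collection, and the
old one can be dropped as soon as the alias has moved.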

Now, I don't know how that would play with shards/replicas.

Regards,
   Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 15 August 2015 at 16:03, Troy Edwards <tedwards415...@gmail.com> wrote:
> I am using SolrCloud
>
> My initial requirements are:
>
> 1) There are about 6000 clients
> 2) The number of documents from each client is about 500000 (average
> document size is about 400 bytes)
> 3) I have to wipe the index/collection every night and create a new one
>
> Any thoughts/ideas/suggestions on:
>
> 1) How to index such a large number of documents, i.e. do I use an HTTP
> client to send documents, is the Data Import Handler the right tool, or
> should I try uploading CSV files?
>
> 2) How many collections should I use?
>
> 3) How many shards / replicas per collection should I use?
>
> 4) Do I need multiple Solr servers?
>
> Thanks