Re: Index very large number of documents from large number of clients

2015-08-15 Thread Toke Eskildsen
Troy Edwards wrote:
> 1) There are about 6000 clients
> 2) The number of documents from each client are about 500,000 (average
> document size is about 400 bytes)

So roughly 3 billion documents / 1TB index size. So at least 2 shards, due to the 2 billion document limit per Lucene index. If you want more advice t…
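To make Toke's sizing concrete: 6000 clients × ~500,000 documents is about 3 billion documents, well past the ~2.14 billion document ceiling of a single Lucene index, so the collection has to be split across shards. Below is a minimal SolrJ sketch of creating such a multi-shard collection; it assumes SolrJ 8.x, and the collection name, config set, ZooKeeper addresses, and the 200M-docs-per-shard target are placeholders rather than anything recommended in the thread.

import java.util.Arrays;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class CreateShardedCollection {
    public static void main(String[] args) throws Exception {
        // Back-of-the-envelope sizing from the thread:
        // 6000 clients * ~500,000 docs each = ~3 billion documents,
        // while a single Lucene index holds at most ~2.14 billion docs.
        long totalDocs = 6000L * 500_000L;
        // Placeholder target of ~200M docs per shard gives 15 shards here;
        // at minimum 2 are needed just to stay under the Lucene limit.
        int numShards = (int) Math.max(2, totalDocs / 200_000_000L);

        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"),   // placeholder ZK ensemble
                Optional.empty()).build()) {
            CollectionAdminRequest
                .createCollection("clientdocs_20150816", "clientdocs_conf", numShards, 2)
                .process(client);
        }
    }
}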

Re: Index very large number of documents from large number of clients

2015-08-15 Thread Erick Erickson
Piling on here. At the scale you're talking, I suspect you'll not only have a bunch of servers, you'll really have a bunch of completely separate "Solr Clouds", complete with their own ZooKeepers etc. Partly for administration's sake, partly for stability, etc. Not sure that'll be true, mind you, bu…
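Erick's point is architectural rather than about code, but a hypothetical sketch of what "several completely separate Solr Clouds" can look like from the application side is below: each cluster gets its own ZooKeeper ensemble and its own CloudSolrClient, and the application routes each client to one cluster. The cluster addresses and the modulo routing rule are assumptions, not something proposed in the thread.

import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class ClusterRouter {
    private final List<CloudSolrClient> clusters;

    public ClusterRouter(List<List<String>> zkEnsembles) {
        // One CloudSolrClient per independent SolrCloud cluster, each with
        // its own ZooKeeper ensemble.
        this.clusters = zkEnsembles.stream()
                .map(zk -> new CloudSolrClient.Builder(zk, Optional.empty()).build())
                .collect(Collectors.toList());
    }

    // Keep all documents of one client on one cluster; the modulo rule is
    // only an illustration.
    public CloudSolrClient clusterFor(int clientId) {
        return clusters.get(Math.abs(clientId) % clusters.size());
    }

    public static void main(String[] args) {
        ClusterRouter router = new ClusterRouter(Arrays.asList(
                Arrays.asList("zkA1:2181", "zkA2:2181", "zkA3:2181"),
                Arrays.asList("zkB1:2181", "zkB2:2181", "zkB3:2181")));
        CloudSolrClient target = router.clusterFor(4711);
        // index this client's documents against 'target' ...
    }
}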

Re: Index very large number of documents from large number of clients

2015-08-15 Thread Shawn Heisey
On 8/15/2015 2:03 PM, Troy Edwards wrote:
> I am using SolrCloud
>
> My initial requirements are:
>
> 1) There are about 6000 clients
> 2) The number of documents from each client are about 500,000 (average
> document size is about 400 bytes)
> 3) I have to wipe off the index/collection every night…
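Shawn's reply is cut off by the archive, but the "wipe the collection every night" requirement is commonly handled by building a fresh collection, repointing a query alias at it, and then deleting yesterday's collection; the sketch below shows that pattern with SolrJ (assumed 8.x). The alias and collection names are placeholders, and the approach itself is a common practice rather than a quote from Shawn's message.

import java.util.Arrays;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class NightlySwap {
    public static void main(String[] args) throws Exception {
        String fresh = "clientdocs_20150816";     // collection built tonight (placeholder)
        String stale = "clientdocs_20150815";     // last night's collection (placeholder)

        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"),
                Optional.empty()).build()) {
            // Queries always go through the "clientdocs" alias, so repointing
            // it swaps in the new data in one step from the searcher's view...
            CollectionAdminRequest.createAlias("clientdocs", fresh).process(client);
            // ...and the old collection is simply dropped instead of being
            // wiped document by document.
            CollectionAdminRequest.deleteCollection(stale).process(client);
        }
    }
}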

Re: Index very large number of documents from large number of clients

2015-08-15 Thread Alexandre Rafalovitch
This is beyond my direct area of expertise, but one way to look at this would be:
1) Create new collections offline. Down to each of the 6000 clients having its own private collection (embedded SolrJ/server). Or some sort of mini-hubs, e.g. a server per N clients.
2) Bring those collections into ce…
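For step 1, "embedded SolrJ/server" presumably refers to EmbeddedSolrServer, which runs a Solr core inside the JVM without an HTTP endpoint, so each client (or mini-hub) could build its index offline. The sketch below is a minimal, assumed setup: the Solr home path, core name, and field names are placeholders, and the core's configuration is presumed to already exist.

import java.nio.file.Paths;

import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class OfflineBuild {
    public static void main(String[] args) throws Exception {
        // The Solr home below must already contain solr.xml and a configured
        // core named "clientdocs"; both path and name are placeholders.
        try (EmbeddedSolrServer server =
                new EmbeddedSolrServer(Paths.get("/data/offline-solr"), "clientdocs")) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "client42-doc1");
            doc.addField("client_id", 42);
            doc.addField("body", "example payload of roughly 400 bytes");
            server.add(doc);
            server.commit();
            // The resulting index can later be moved to / loaded by the serving
            // cluster -- the "bring those collections in" step.
        }
    }
}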

Index very large number of documents from large number of clients

2015-08-15 Thread Troy Edwards
I am using SolrCloud

My initial requirements are:
1) There are about 6000 clients
2) The number of documents from each client are about 500,000 (average document size is about 400 bytes)
3) I have to wipe off the index/collection every night and create new

Any thoughts/ideas/suggestions on:
1) Ho…
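Since the question is largely about how to push this volume in, here is one minimal SolrJ sketch of batched indexing for a single client's documents through CloudSolrClient. The ZooKeeper addresses, collection name, field names, and batch size of 1000 are illustrative assumptions, not settings suggested anywhere in the thread.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BulkIndexer {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"),  // placeholder ZK ensemble
                Optional.empty()).build()) {
            client.setDefaultCollection("clientdocs");              // placeholder collection

            List<SolrInputDocument> batch = new ArrayList<>();
            for (int i = 0; i < 500_000; i++) {                     // one client's documents
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "client42-doc" + i);
                doc.addField("client_id", 42);
                doc.addField("body", "payload of roughly 400 bytes");
                batch.add(doc);
                if (batch.size() == 1000) {                         // send in batches, not one at a time
                    client.add(batch);
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) {
                client.add(batch);
            }
            client.commit();   // or rely on autoCommit settings in solrconfig.xml
        }
    }
}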