Thanks for the quick reply, Erick. I want to keep my collections until I run out of hardware, which means at least a couple of years' worth of data. I'd like to know more about ageing out collections with aliases; I did a quick search but didn't find much.
On Fri, Apr 25, 2014 at 9:45 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Hmmm, tell us a little more about your use-case. In particular, how
> long do you need to keep the data around? Days? Months? Years?
>
> Because if you only need to keep the data for a specified period, you
> can use the collection aliasing process to age-out collections and
> keep the number of cores from growing too large.
>
> Best,
> Erick
>
> On Fri, Apr 25, 2014 at 6:49 AM, Mukesh Jha <me.mukesh....@gmail.com> wrote:
>
> > Hi Experts,
> >
> > I need to divide my indexes based on hour/day, with each index having
> > ~50-80 GB of data & ~50-80 million docs, so I'm planning to create daily
> > collections with names such as *sample_collection_yyyy_mm_dd_hh*.
> > I'll also create an alias *sample_collection* and update it whenever I
> > create a new collection, so that the entire data set is searchable.
> >
> > I have a couple of questions on the above design:
> > 1) How far can it scale? As my collections increase (and so do the
> > shards & replicas), is there a breaking point at which adding more
> > collections or searching will become an issue?
> > 2) As my cluster grows because of the huge number of collections, the
> > clusterstate.json file in ZooKeeper will grow too. Won't this be a
> > limiting factor? If so, instead of storing all this info in one
> > clusterstate.json file, shouldn't Solr save cluster-specific details in
> > this file & keep collection-specific config files in ZooKeeper?
> > 3) How can I easily manage all these collections? Are Java CoreAdmin
> > APIs available? I cannot find much documentation on them.
> >
> > --
> > Txz,
> >
> > *Mukesh Jha <me.mukesh....@gmail.com>*
>

--
Thanks & Regards,

*Mukesh Jha <me.mukesh....@gmail.com>*
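
For reference, the alias-based ageing discussed in this thread might be sketched roughly as below. This is only an illustration, not a tested setup: it assumes hourly collections named after the *sample_collection_yyyy_mm_dd_hh* pattern from the original post, and it only builds the request URL for the Solr Collections API CREATEALIAS action rather than sending it. The helper names, the retention window and the base URL are all made up for the example.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode


def collection_name(ts, prefix="sample_collection"):
    """Name of the hourly collection covering timestamp ts (assumed pattern)."""
    return f"{prefix}_{ts:%Y_%m_%d_%H}"


def alias_members(now, keep_hours, prefix="sample_collection"):
    """Collections from the last keep_hours hours, newest first."""
    return [collection_name(now - timedelta(hours=h), prefix)
            for h in range(keep_hours)]


def create_alias_url(base_url, alias, collections):
    """Build a CREATEALIAS request URL; base_url is a hypothetical Solr root."""
    query = urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        # Commas are percent-encoded by urlencode when several names are joined.
        "collections": ",".join(collections),
    })
    return f"{base_url}/admin/collections?{query}"


if __name__ == "__main__":
    members = alias_members(datetime(2014, 4, 25, 21), keep_hours=3)
    print(create_alias_url("http://localhost:8983/solr",
                           "sample_collection", members))
```

As far as I understand it, re-issuing CREATEALIAS with the same alias name replaces the previous definition, so collections that fall outside the window simply stop being searched through the alias. They still exist on disk until they are deleted separately, which fits the case where the data must be kept for years.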