Right. You don't get to 99.9% availability by assuming that an 8 hour
outage is OK. Design for continuous uptime, and plan how long it will take
to patch around each single point of failure. For example, if your load
balancer is a single point of failure, make sure that you can redirect
the front end servers to a single Solr server in much less than 8 hours.
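
To put a number on the 99.9% figure above, here is a rough sketch of the downtime budget arithmetic. The function name and the choice of a 365-day year are my own assumptions for illustration, not anything from an actual SLA.

```python
# Assumption: measure availability over a 365-day year (leap years ignored).
HOURS_PER_YEAR = 365 * 24


def downtime_budget_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed in the period at the given availability."""
    return period_hours * (1 - availability_pct / 100)


budget = downtime_budget_hours(99.9)
print(f"99.9% over a year allows {budget:.2f} hours of downtime")
print(f"one 8 hour outage uses {8 / budget:.0%} of that budget")
```

That works out to about 8.76 hours a year, so a single 8 hour outage uses roughly 91% of the whole year's allowance.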
Also, think about your SLA. Can the search index be more than 8 hours
stale? How quickly do you need to be able to replace a failed indexing
server? You might be able to run indexing locally on each search
server if they are lightly loaded.
wunder
On Aug 4, 2009, at 7:11 AM, Norberto Meijome wrote:
On Mon, 3 Aug 2009 13:15:44 -0700
"Robert Petersen" <rober...@buy.com> wrote:
Thanks all, I figured there would be more talk about daemontools if there
were really a need. I appreciate the input and for starters we'll put two
slaves behind a load balancer and grow it from there.
Robert,
Not taking away from daemontools, but daemontools won't help you if your
whole server goes down. Don't put all your eggs in one basket: use several
servers and a load balancer (hardware load balancers x 2, haproxy, etc.),
and sure, use daemontools to keep your services running within each
server...
B
_________________________
{Beto|Norberto|Numard} Meijome
"Why do you sit there looking like an envelope without any address on it?"
Mark Twain
I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.