(cc'ing bobm and genew, hoping for any interesting feedback from them)

> As long as we create and maintain a good abstraction for our data store, I
> think that we can switch if needs change.

…always create and maintain a good abstraction for stuff like this.
That's the most important thing. I don't have any experience with
scaling MongoDB, so I *hope* that's okay. In any case, it's widely
used and easy to install, so it seems like an acceptable option.

Nevertheless, I would love to have some input from services ops about
this, since they're the people who know how to operate these systems
and may have interesting numbers to crunch.

> Looking at database backends for the data we’re talking about storing,
> these are the ranked concerns that I have heard:
>
> 1. availability
> 2. performance
> 3. consistency and partition tolerance
>
> Basically, if the service is up, that’s good.  The data that we hold
> here is either very short lived (that relating to the push) or easily
> repaired (device registrations of push URLs).
>
> It’s a little more difficult when we get to revocation of URIs, or
> storing of state for URIs, but I’m going to leave that for a later
> discussion.
>
> The main concern that I have here is service availability in light of
> data centre failures.  In my experience, this happens far more often
> than is acceptable.  For reference, when I was at Skype, Microsoft had
> numbers that were pretty close to their promise of 99.9% availability.
>
> But I need to emphasise this - 99.9% is not good enough for a whole
> system.  99.99% might be.  And since the cloud platform is only one
> component of the system, the overall availability will be lower.
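
To put a number on that composition point: availabilities of serially
dependent components multiply, so the system can never beat its weakest
layer. A quick back-of-envelope (the figures below are illustrative, not
measurements of anything):

```python
# Availabilities of serially dependent components multiply, so the
# overall system is worse than either component alone.
# Illustrative figures only.
platform = 0.999    # the cloud platform's promised "three nines"
service = 0.9995    # hypothetical availability of our own layer
overall = platform * service

print(overall)  # just under 0.9985 -- below both inputs

downtime_minutes = (1 - overall) * 30 * 24 * 60   # per 30-day month
print(round(downtime_minutes))  # roughly an hour of downtime a month
```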
>
> So geographic distribution of data is critical.  This is managed with
> varying degrees of sophistication by the different storage systems.  The
> biggest part is how they trade off responsiveness and fault recovery.
> If updating a row requires a cross-geography request, that’s going to be
> slow, but it means that you get very good failure characteristics.  At
> the same time, it can also mean that you are unable to perform updates
> in certain types of failure scenario.
>
> The key feature - one that most databases provide - is the ability to
> tune this to application needs.  We’re going to want to tune this
> initially so that updates don’t depend on full replication, since we
> care more about availability and performance than consistency and
> partition tolerance.
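
That tunability is usually expressed as quorum parameters: with N
replicas, a write waits for W acknowledgements and a read consults R
replicas. A sketch of the rule of thumb we'd be tuning against (plain
Python, not any particular database's API):

```python
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """With n replicas, w write acks and r read acks, reads are
    guaranteed fresh only when every read quorum overlaps every
    write quorum, i.e. w + r > n."""
    return w + r > n

# The tuning proposed above: don't block writes on full replication.
print(is_strongly_consistent(3, 1, 1))  # False: fast and available, reads may be stale
print(is_strongly_consistent(3, 2, 2))  # True: consistent, but slower and less available
```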
>
> Run-down of options
>
> Now, I only have a small amount of background with these specific
> options, unfortunately; my experience is with stuff that I believe some
> of you might think to be poisonous, plus some stuff we just can’t use.
> Nonetheless, here are what I believe to be the leading options:
>
> MongoDB
>
> This is very widely used, readily deployed to the cloud platform of
> your choice, and it has a pretty good story when it comes to performance.
>
> Mongo does have a geographic redundancy story.  It’s not covered in
> glory, but nor is it entirely embarrassing.
>
> I’m less encouraged by the characteristics of the geographic
> redundancy features; replica sets can only be statically configured to
> prefer local communication, which means that a failure in the local
> cluster will result in cross-geo operations for all requests that block
> on replication.  I’m also a little concerned about how placement is
> controlled within a cluster.
>
> http://docs.mongodb.org/manual/core/replica-set-architectures/
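
The static-configuration worry can be seen with a toy failover model
(hypothetical host names and a much-simplified election, nothing like
MongoDB's actual protocol): because the preferences are fixed in the
config, losing the local members moves the primary cross-geo.

```python
# Toy model of statically prioritised replica-set members (hypothetical
# hosts; real elections are far more involved). Priorities are fixed in
# the config, so once the local members are down, the primary -- and
# every write that blocks on it -- lives in another region.
members = [
    {"host": "us-east-1a", "region": "us-east", "priority": 2},
    {"host": "us-east-1b", "region": "us-east", "priority": 2},
    {"host": "eu-west-1a", "region": "eu-west", "priority": 1},
]

def elect_primary(reachable):
    """Highest-priority reachable member wins; ties broken by host name."""
    up = [m for m in members if m["host"] in reachable]
    return max(up, key=lambda m: (m["priority"], m["host"]))["host"]

print(elect_primary({"us-east-1a", "us-east-1b", "eu-west-1a"}))  # a local us-east host
print(elect_primary({"eu-west-1a"}))  # eu-west-1a: all writes now cross geography
```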
>
> Cassandra
>
> Cassandra is also widely used, with similar characteristics to Mongo.
>
> This has a larger number of options with respect to geographic
> redundancy.  The topology-aware replication mode offers some pretty good
> opportunities, particularly when deployed with a “snitch” that is aware
> of the deployment layout.
>
> http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architectureDataDistributeReplication_c.html
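
For a feel of what that topology awareness buys, here is a
much-simplified sketch of per-datacentre replica placement, loosely
modelled on NetworkTopologyStrategy (the real thing walks a token ring
and is rack-aware; the node names are made up):

```python
# Per-datacentre replica factors, loosely modelled on Cassandra's
# NetworkTopologyStrategy. Real placement walks a token ring and is
# rack-aware; this sketch just takes the first rf nodes in each DC,
# which is enough to show why local reads never cross geography.
nodes = [
    ("us-east", "n1"), ("us-east", "n2"), ("us-east", "n3"),
    ("eu-west", "n4"), ("eu-west", "n5"),
]

def place_replicas(rf_per_dc):
    """Choose rf_per_dc[dc] replica nodes in each datacentre."""
    placement = {dc: [] for dc in rf_per_dc}
    for dc, node in nodes:
        if dc in placement and len(placement[dc]) < rf_per_dc[dc]:
            placement[dc].append(node)
    return placement

print(place_replicas({"us-east": 2, "eu-west": 1}))
# {'us-east': ['n1', 'n2'], 'eu-west': ['n4']}
```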
>
> Redis
>
> Redis is probably the simplest option here.  Its high availability
> option is still unstable, so I’m not going to recommend it.
>
> Dynamo
>
> This would tie us to AWS, but it seems like a capable DB.  The problem
> is that in the short time I looked, I couldn’t uncover any details on
> their geographic redundancy story.  That is not encouraging.
>
> Roll your own geo redundancy
>
> This remains a viable option…if you really need the
> performance/availability/other characteristic.  Typically you take an
> existing store (pick one, any one) and add your own brand of
> geographic redundancy to suit your needs.  This has the advantage of
> being exactly what you need, but the disadvantage of being a bunch of
> extra work.  I’m not going to recommend this either, but it’s an option
> that may become worth considering later, unless our database friends
> really pick up their collective game.
>
> Others
>
> I could provide info on memcached(b), Azure, Riak, etc.  All of these
> have their merits, but none are really strong contenders for the crown.
>
> Summary
>
> I think that we could make either Mongo or Cassandra work for us.  If
> we were truly serious about storing hard state, then I think Cassandra
> offers more control, but I also think it would be the harder of the two
> to deploy and operate.
>
> At this stage, I’m going to suggest that we pick Mongo, even though I
> think Cassandra might be functionally superior.

_______________________________________________
dev-media mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-media
