I'm curious why you are storing the backups (SSTables and commit logs) in
HDFS instead of something like Lustre. Are your backups using Hadoop's
map/reduce somehow? Or is it for convenience?
On Sat, Mar 20, 2010 at 8:40 AM, Chris Goffinet wrote:
> > 5. Backups: If there is a 4 or 5 TB Cassandra cluster, what do you
> > recommend the backup scenarios could be?
On Mar 20, 2010, at 9:10 AM, Jeremy Dunck wrote:
> On Sat, Mar 20, 2010 at 10:40 AM, Chris Goffinet wrote:
>>> 5. Backups: If there is a 4 or 5 TB Cassandra cluster, what do you
>>> recommend the backup scenarios could be?
>>
>> Worst case scenario (total failure), we opted to do global snapshots every
>> 24 hours. This creates hard links to SSTables on each node. We copy those
>> SSTables to HDFS on a daily basis.
On Sat, Mar 20, 2010 at 10:40 AM, Chris Goffinet wrote:
>> 5. Backups: If there is a 4 or 5 TB Cassandra cluster, what do you
>> recommend the backup scenarios could be?
>
> Worst case scenario (total failure), we opted to do global snapshots every
> 24 hours. This creates hard links to SSTables on each node. We copy those
> SSTables to HDFS on a daily basis.
> 5. Backups: If there is a 4 or 5 TB Cassandra cluster, what do you recommend
> the backup scenarios could be?

Worst case scenario (total failure), we opted to do global snapshots every 24
hours. This creates hard links to SSTables on each node. We copy those SSTables
to HDFS on a daily basis.
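For anyone wanting to script that, here is a minimal sketch of the
snapshot-then-copy flow, assuming a daily cron job on each node. The paths,
the HDFS layout, and the snapshot directory structure are illustrative
assumptions, not Digg's actual tooling:

#!/usr/bin/env python
# Sketch only: take a global snapshot and copy the SSTables into HDFS.
# DATA_DIR, HDFS_DEST and the snapshot directory layout are assumptions
# (the layout differs between Cassandra versions), so adjust to your install.
import glob
import os
import subprocess
import time

DATA_DIR = "/var/lib/cassandra/data"   # assumed Cassandra data file directory
HDFS_DEST = "/backups/cassandra"       # assumed destination directory in HDFS
TAG = time.strftime("%Y%m%d")          # snapshot name, e.g. "20100320"

# 1. Take a snapshot. Cassandra hard-links the current SSTables, so this is
#    fast and uses almost no extra disk space. (Newer nodetool versions take
#    the snapshot name via -t instead of a positional argument.)
subprocess.check_call(["nodetool", "-h", "localhost", "snapshot", TAG])

# 2. Copy the hard-linked SSTables into HDFS, one directory per keyspace.
for snap_dir in glob.glob(os.path.join(DATA_DIR, "*", "snapshots", "*" + TAG)):
    keyspace = os.path.basename(os.path.dirname(os.path.dirname(snap_dir)))
    dest = "%s/%s/%s" % (HDFS_DEST, TAG, keyspace)
    # Older "hadoop fs -mkdir" creates parent directories automatically;
    # newer releases need the -p flag added here.
    subprocess.check_call(["hadoop", "fs", "-mkdir", dest])
    subprocess.check_call(["hadoop", "fs", "-put", snap_dir, dest])

Restore is essentially the reverse: pull a node's SSTables back out of HDFS
into its data directory and restart the node.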
> Also, does Cassandra support counters? Digg's article said they are going to
> contribute their work to open source; any idea when that would be?
>
All of the custom work has been pushed upstream from Digg and continues. We
have a few operational tools we will be releasing that will go into co...
On Mar 20, 2010, at 2:53 AM, Lenin Gali wrote:
> 1. Eventual consistency: Given a volume of 5K writes/sec, where roughly 1500
> writes per sec are updates and the rest are inserts, what kind of latency can
> be expected with eventual consistency?

Depending on the size of the cluster, you're not ...
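As general background on the trade-off being asked about (this is not from the
thread): with a replication factor of N, a write acknowledged by W replicas and
a read that consults R replicas are guaranteed to overlap whenever R + W > N,
so the read sees the latest acknowledged write; a lower W gives lower write
latency at the cost of weaker read-after-write guarantees. A tiny illustration
of that rule:

def read_sees_latest_write(n_replicas, w, r):
    # True when a read at consistency level r must overlap the replicas that
    # acknowledged a write at consistency level w (the R + W > N rule).
    return r + w > n_replicas

# Replication factor 3: QUORUM writes and QUORUM reads overlap, while ONE/ONE
# does not (eventually consistent reads, but the lowest write latency).
print(read_sees_latest_write(3, w=2, r=2))  # True
print(read_sees_latest_write(3, w=1, r=1))  # False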
Hi,
I have several questions. I hope some of you can share your experiences with
each or all of the following. I will be curious about Twitter's and Digg's
experience, as they might be processing ...

1. Eventual consistency: Given a volume of 5K writes/sec, where roughly 1500
writes per sec are updates and the rest are inserts, what kind of latency can
be expected with eventual consistency?
Jeff Hodsdon posted the updated link:
http://about.digg.com/blog/looking-future-cassandra
On Fri, Mar 19, 2010 at 2:49 PM, Nathan McCall wrote:
> Gary,
> Did you see this article linked from the Cassandra wiki?
> http://about.digg.com/node/564
>
> See http://wiki.apache.org/cassandra/ArticlesAndPresentations for more
> examples like the above.
Gary,
Did you see this article linked from the Cassandra wiki?
http://about.digg.com/node/564
See http://wiki.apache.org/cassandra/ArticlesAndPresentations for more
examples like the above. In general, you structure your data according
to how it will be queried. This can lead to duplication, but ... (a minimal
sketch of this pattern follows the quoted question below).
On 2010-03-19 19:16, Gary wrote:
> I am a newbie to the Bigtable-like model and have a question as follows.
> Take Digg as an example: I want to find a list of users who dug a URL and
> also want to find a list of URLs a user dug. What should the data model
> look like for the queries to be efficient? If I use the username and the
> URL for two row ...
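To make the structure-for-your-queries advice above concrete, here is a
minimal, library-free sketch of the denormalized layout the question is asking
about. Plain dicts stand in for the two column families, and the names
URLDiggers/UserDiggs are invented for the example; this is not Digg's actual
schema:

# Two "column families", one per query path:
#   URLDiggers: row key = url,  columns = usernames who dug it
#   UserDiggs:  row key = user, columns = urls that user dug
url_diggers = {}   # url  -> {username: timestamp}
user_diggs = {}    # user -> {url: timestamp}

def record_digg(user, url, ts):
    # Every digg is written twice, once per query path. The duplication is
    # deliberate: each lookup below is then a single-row read.
    url_diggers.setdefault(url, {})[user] = ts
    user_diggs.setdefault(user, {})[url] = ts

def users_who_dug(url):
    return sorted(url_diggers.get(url, {}))

def urls_dug_by(user):
    return sorted(user_diggs.get(user, {}))

record_digg("alice", "http://example.com/story", 1269100800)
record_digg("bob", "http://example.com/story", 1269100900)
print(users_who_dug("http://example.com/story"))  # ['alice', 'bob']
print(urls_dug_by("alice"))                       # ['http://example.com/story']

With a real client you would issue the same two inserts, one per column
family, on every digg.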
On Mar 19, 2010, at 1:16 PM, Gary wrote:
> I am a newbie to the Bigtable-like model and have a question as follows.
> Take Digg as an example: I want to find a list of users who dug a URL and
> also want to find a list of URLs a user dug. What should the data model
> look like for the queries to be efficient? If I use the username and the
> URL for two row ...
I am a newbie to the Bigtable-like model and have a question as follows. Take
Digg as an example: I want to find a list of users who dug a URL and also want
to find a list of URLs a user dug. What should the data model look like for
the queries to be efficient? If I use the username and the URL for two row ...