Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 2:39 PM, Edward Capriolo wrote:
> @Todd. Good catch about caching HFile blocks.
>
> My point still applies though. Caching HFile blocks on a single node
> vs individual "datums" on N nodes may not be more efficient. Thus
> terms like "Slower" and "Less Efficient" could be

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
Seems accurate to me. One small correction - the daemon in HBase that serves regions is known as a "region server" rather than a region master. The RS is the equivalent of the tablet server in Bigtable terminology.
-Todd
On Mon, Nov 22, 2010 at 4:50 PM, David Jeske wrote:
> This is my second at

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 1:58 PM, David Jeske wrote:
> On Mon, Nov 22, 2010 at 11:52 AM, Todd Lipcon wrote:
>
>> Not quite. The replica synchronization code is pretty messy, but basically
>> it will take the longest replica that may have been synced, not a quorum.
>>
>

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 1:26 PM, Edward Capriolo wrote:
> For Cassandra all writes must be transmitted to all replicas.
> CASSANDRA-1314 does not change how writes happen. Write operations
> will still affect cache (possibly evicting things if cache is full).
> Reads however will prefer a single n

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 12:03 PM, Edward Capriolo wrote:
> What of reads that are not in the cache?
> Cassandra can use memory-mapped I/O for its data and index files. HBase
> has a very expensive read path for things that are not in cache. HDFS
> random read performance is historically poor.
> Ye

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 10:01 AM, David Jeske wrote:
> I haven't used either Cassandra or HBase, so please don't take any part of
> this message as me attempting to state facts about either system. However,
> I'm very familiar with data-storage design details, and I've worked
> extensively optimiz

Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Todd Lipcon
On Sun, Nov 21, 2010 at 6:25 PM, Jonathan Ellis wrote:
> On Sun, Nov 21, 2010 at 6:16 PM, Todd Lipcon wrote:
> > [only jumping in because info was requested - those who know me know that I
> > think Cassandra is a very interesting architecture and a better fit for many

Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Todd Lipcon
On Sun, Nov 21, 2010 at 2:06 PM, Edward Ribeiro wrote:
>
>> Also I believe saying HBase is consistent is not true. This can happen:
>> Write to region server -> region server acknowledges client -> write
>> to WAL -> region server fails = write lost
>>
>> I wonder how Facebook will reconcile that.