Re: EC2 storage options for C*

2016-02-03 Thread Jack Krupansky
> … or not they're viable is a decision for each user to make. They're very, very commonly used for C*, though. At a time when EBS was not sufficiently robust or reliable, a cluster of m1 instances was the de facto …

Re: EC2 storage options for C*

2016-02-03 Thread Jeff Jirsa
To: "user@cassandra.apache.org" Subject: Re: EC2 storage options for C* Just curious here ... when did EBS become OK for C*? Didn't they always push towards using ephemeral disks? On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead wrote: For what it's worth we've tried d2 in

Re: EC2 storage options for C*

2016-02-03 Thread Sebastian Estevez
> … they're viable is a decision for each user to make. They're very, very commonly used for C*, though. At a time when EBS was not sufficiently robust or reliable, a cluster of m1 instances was the de facto …

Re: EC2 storage options for C*

2016-02-03 Thread Sebastian Estevez
> … m1 instances was the de facto standard. The canonical "best practice" in 2015 was i2. We believe we've made a compelling argument to use m4 or c4 instead of i2. There exists a company …

Re: EC2 storage options for C*

2016-02-03 Thread Bryan Cheng
> … The canonical "best practice" in 2015 was i2. We believe we've made a compelling argument to use m4 or c4 instead of i2. There exists a company we know currently testing d2 at scale, though I'm not sure they have much …

Re: EC2 storage options for C*

2016-02-03 Thread James Rothering
> … compelling argument to use m4 or c4 instead of i2. There exists a company we know currently testing d2 at scale, though I'm not sure they have much in terms of concrete results at this time. - Jeff From: J…

Re: EC2 storage options for C*

2016-02-03 Thread Will Hayworth
> … we've made a compelling argument to use m4 or c4 instead of i2. There exists a company we know currently testing d2 at scale, though I'm not sure they have much in terms of concrete results at this time. - Jeff From: …

Re: EC2 storage options for C*

2016-02-03 Thread Ben Bromhead
> … From: Jack Krupansky Reply-To: "user@cassandra.apache.org" Date: Monday, February 1, 2016 at 1:55 PM To: "user@cassandra.apache.org" Subject: Re: EC2 storage options for C* Thanks. My typo - I referenced "…

Re: EC2 storage options for C*

2016-02-01 Thread Jack Krupansky
g" > Date: Monday, February 1, 2016 at 1:55 PM > > To: "user@cassandra.apache.org" > Subject: Re: EC2 storage options for C* > > Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2 > Dense Storage". > > The remaining que

Re: EC2 storage options for C*

2016-02-01 Thread Steve Robenalt
> … There exists a company we know currently testing d2 at scale, though I'm not sure they have much in terms of concrete results at this time. - Jeff From: Jack Krupansky Reply-To: "user@cassandra.apache.org" Date: Monday, February 1, 2016 at 1:55 PM …

Re: EC2 storage options for C*

2016-02-01 Thread Jeff Jirsa
… in terms of concrete results at this time. - Jeff From: Jack Krupansky Reply-To: "user@cassandra.apache.org" Date: Monday, February 1, 2016 at 1:55 PM To: "user@cassandra.apache.org" Subject: Re: EC2 storage options for C* Thanks. My typo - I referenced "C2 Dense …

Re: EC2 storage options for C*

2016-02-01 Thread Steve Robenalt
> … In terms of syncing data for the commit log, if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside …

Re: EC2 storage options for C*

2016-02-01 Thread Jack Krupansky
> … level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data. As well, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the c…
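The durability question above comes down to an append-then-fsync pattern. A minimal Python sketch of that pattern (hypothetical file name and record, not Cassandra's actual commit log code); whether the data then survives a power loss is exactly the EBS-level guarantee being asked about:

    import os

    def durable_append(path, record):
        # Append a record and return only after the OS reports it has been
        # pushed to the device. Whether the device itself survives a power
        # failure is the volume-level guarantee questioned in this thread.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        try:
            os.write(fd, record)
            os.fsync(fd)
        finally:
            os.close(fd)

    durable_append("commitlog-segment.log", b"mutation-bytes\n")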

Re: EC2 storage options for C*

2016-02-01 Thread Steve Robenalt
> … flushing memtables, but for the fsync at the end a solid guarantee is needed. Most of the answers in this block are "probably not 100%, you should be writing to more than one host/AZ/DC/vendor to protect your organization from failures". AWS targets …

Re: EC2 storage options for C*

2016-02-01 Thread Jack Krupansky
> … Most of the answers in this block are "probably not 100%, you should be writing to more than one host/AZ/DC/vendor to protect your organization from failures". AWS targets something like 0.1% annual failure rate per volume and 99.999% availability (slide 66). We believe they're exceeding those goals (at least …
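The quoted 0.1% annual failure rate is per volume; across a fleet it compounds, which is why the "write to more than one host/AZ/DC" advice matters. A back-of-the-envelope sketch, assuming independent failures and borrowing the 60-node figure from later in the thread purely for illustration:

    # Chance that at least one EBS volume in a fleet fails within a year,
    # assuming independent failures at the quoted 0.1% annual failure rate.
    afr = 0.001        # annual failure rate per volume (figure quoted above)
    volumes = 60       # illustrative fleet size

    p_any = 1 - (1 - afr) ** volumes
    print(f"P(at least one volume failure in a year) ~ {p_any:.1%}")  # roughly 5.8%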

Re: EC2 storage options for C*

2016-02-01 Thread Jeff Jirsa
… (slide 66). We believe they're exceeding those goals (at least based on the petabytes of data we have on gp2 volumes). From: Jack Krupansky Reply-To: "user@cassandra.apache.org" Date: Monday, February 1, 2016 at 5:51 AM To: "user@cassandra.apache.org" Subject: Re: EC2 …

Re: EC2 storage options for C*

2016-02-01 Thread Jack Krupansky
> … configuration. If you don't, though, EBS GP2 will save a _lot_ of headache. Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by th…

Re: EC2 storage options for C*

2016-02-01 Thread Jack Krupansky
> … between read-intensive and write-intensive workloads? -- Jack Krupansky On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa wrote: Hi John, We run using 4T GP2 volum…

Re: EC2 storage options for C*

2016-01-31 Thread Eric Plowe
> … On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa wrote: Hi John, We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes per second on 60 nodes, we didn't come close to hitting even 50% utilization (10k is …

Re: EC2 storage options for C*

2016-01-31 Thread Jeff Jirsa
> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa wrote: Hi John, We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes per second on …

Re: EC2 storage options for C*

2016-01-31 Thread Eric Plowe
> … 60 nodes, we didn't come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. From: John Wong Reply-To: "user@cassandra.apache.org" Date: …

Re: EC2 storage options for C*

2016-01-31 Thread Jeff Jirsa
> … Hi John, We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes per second on 60 nodes, we didn't come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not …
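The "4T GP2 volumes guarantee 10k iops" figure follows from gp2's published baseline at the time of roughly 3 IOPS per GiB, with a floor of 100 and a per-volume cap of 10,000 (the cap has since been raised). A small sketch of that arithmetic, illustrative only:

    def gp2_baseline_iops(size_gib, iops_per_gib=3, floor=100, cap=10000):
        # Baseline IOPS for a gp2 volume under the 2016-era limits discussed here.
        return max(floor, min(size_gib * iops_per_gib, cap))

    print(gp2_baseline_iops(4096))  # 4 TiB volume: 12288 uncapped, so the 10000 cap applies
    print(gp2_baseline_iops(250))   # 250 GiB volume: 750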

Re: EC2 storage options for C*

2016-01-31 Thread Eric Plowe
> … we didn't come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. From: John Wong Reply-To: "user@cassandra.apache.org" Date: Saturday, January …

Re: EC2 storage options for C*

2016-01-31 Thread Jeff Jirsa
> … per second on 60 nodes, we didn't come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. From: John Wong Reply-To: "user@cassandra.apache.org" Date: Saturday, January …

Re: EC2 storage options for C*

2016-01-31 Thread Jack Krupansky
> … come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary. From: John Wong Reply-To: "user@cassandra.apache.org" Date: Saturday, January 30, 2016 at 3:07 PM To: "user@cassandra.apache.org" …

Re: EC2 storage options for C*

2016-01-31 Thread Jeff Jirsa
… Date: Saturday, January 30, 2016 at 3:07 PM To: "user@cassandra.apache.org" Subject: Re: EC2 storage options for C* For production I'd stick with ephemeral disks (aka instance storage) if you are running a lot of transactions. However, for regular small testing/qa clusters, …

Re: EC2 storage options for C*

2016-01-30 Thread John Wong
> … From: Eric Plowe Reply-To: "user@cassandra.apache.org" Date: Friday, January 29, 2016 at 4:33 PM To: "user@cassandra.apache.org" Subject: EC2 storage options for C* My company is planning on rolling out a C* cluster in EC2. We …

Re: EC2 storage options for C*

2016-01-30 Thread Bryan Cheng
a.apache.org" > Date: Friday, January 29, 2016 at 4:33 PM > To: "user@cassandra.apache.org" > Subject: EC2 storage options for C* > > My company is planning on rolling out a C* cluster in EC2. We are thinking > about going with ephemeral SSDs. The question is this: Sho

Re: EC2 storage options for C*

2016-01-29 Thread Jeff Jirsa
To: "user@cassandra.apache.org" Subject: EC2 storage options for C* My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center wi

Re: EC2 storage options for C*

2016-01-29 Thread Eric Plowe
*RAID 0 regardless of instance type* On Friday, January 29, 2016, Eric Plowe wrote: > Bryan, Correct, I should have clarified that. I'm evaluating instance types based on one SSD or two in RAID 0. I think it's going to be two in RAID 0, but as I've had no experience running a production …

Re: EC2 storage options for C*

2016-01-29 Thread Eric Plowe
Bryan, Correct, I should have clarified that. I'm evaluating instance types based on one SSD or two in RAID 0. I think it's going to be two in RAID 0, but as I've had no experience running a production C* cluster in EC2, I wanted to reach out to the list. Sorry for the half-baked question :) E…

Re: EC2 storage options for C*

2016-01-29 Thread Bryan Cheng
Do you have any idea what kind of disk performance you need? Cassandra with RAID 0 is a fairly common configuration (Al's awesome tuning guide has a blurb on it https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html), so if you feel comfortable with the operational overhead it seems lik…

EC2 storage options for C*

2016-01-29 Thread Eric Plowe
My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: Should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0 and we are happy with the performance …
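Rough arithmetic behind the "one SSD or two in RAID 0" question: striping two ephemeral disks roughly doubles capacity and peak throughput, but any single-disk failure loses the whole array, a risk Cassandra deployments normally absorb through replication rather than local redundancy. A sketch with placeholder numbers (not benchmarks of the drives mentioned):

    # RAID 0 back-of-the-envelope: capacity and throughput add, survival multiplies.
    disks = 2
    capacity_gb = 250          # per disk, matching the 250 GB drives in the thread
    seq_mb_s = 500             # per disk, placeholder sequential throughput
    survival = 0.98            # per disk annual survival probability, placeholder only

    print("usable capacity :", disks * capacity_gb, "GB")
    print("peak sequential :", disks * seq_mb_s, "MB/s (best case, striped)")
    print("array survives the year with probability ~", round(survival ** disks, 4))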