I didn't launch these commands when I was in trouble; next time I will. For
now, here is what I have (the array is working properly at the moment).
$ mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sun Mar 17 01:46:05 2013
Raid Level : raid0
Array Size : 1761459200 (1679.86 GiB
I've seen the same behaviour (SLOW ephemeral disk) a few times.
There isn't much you can do about a single slow disk except stop using it.
Our solution has always been to replace the m1.xlarge instance ASAP; after
that, everything is fine.
-Rudolf.
On 31.03.2013, at 18:58, Alexis Lê-Quôc wrote:
Alain,
Can you post your mdadm --detail /dev/md0 output here, as well as your
iostat -x -d output from when that happens? A bad ephemeral drive on EC2 is not
unheard of.
Alexis | @alq | http://datadog.com
P.S. Also, disk utilization is not a reliable metric; iostat's await and
svctm are more useful, imho.
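For example, something like this would show it (exact column names vary a bit
with the sysstat version):
$ iostat -x -d 5
Compare await (time a request spends queued plus being serviced), svctm (device
service time) and %util across the xvd* devices; one member sitting at a much
higher await than its siblings usually points at the bad drive.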
> Ok, if you're going to look into it, please keep me/us posted.
It's now on my radar.
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
On 28/03/2013, at 2:43 PM, Alain RODRIGUEZ wrote:
Ok, if you're going to look into it, please keep me/us posted.
It happened twice for me, the same day, within a few hours, on the same node.
It only happened to 1 node out of 12, and it made that node almost unreachable.
2013/3/28 aaron morton
I noticed this on an m1.xlarge (Cassandra 1.1.10) instance today as well: 1 or
2 disks in a RAID 0 running at 85 to 100% utilization while the others sat at
roughly 35 to 50%.
Have not looked into it.
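If anyone wants to poke at it, the standard md tooling should be enough to spot
the hot member. A rough sketch (device names per Alain's layout; they may need
a /dev/ prefix depending on the sysstat version):
$ cat /proc/mdstat
$ iostat -x -d 5 xvdb xvdc xvdd xvde
The first shows the array members and their state; the second limits the report
to the RAID members so the one running hot stands out.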
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
We use C* on m1.xlarge AWS EC2 servers, with 4 disks (xvdb, xvdc, xvdd, xvde)
that are part of a logical RAID 0 (md0).
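(The array was created the usual way; if I remember correctly it was something
along these lines, so treat the exact options as approximate:
$ mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/xvdb /dev/xvdc /dev/xvdd /dev/xvde
i.e. a plain 4-way stripe over the ephemeral disks.)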
I usually see their utilization increase in the same way across all four. This
morning there was a normal minor compaction, followed by dropped messages on
one node (out of 12).
Looking closely at this node I saw