That looks a lot like what I've seen from machines with bad RAM.
2011/7/8 Héctor Izquierdo Seliva :
> Hi everyone,
>
> I'm having thousands of these errors:
>
> WARN [CompactionExecutor:1] 2011-07-08 16:36:45,705
> CompactionManager.java (line 737) Non-fatal error reading row
> (stacktrace follow
it has already run about 20 hours...
On Mon, Jul 11, 2011 at 1:36 AM, aaron morton wrote:
> 1) Do I need to treat every node as failed and do a rolling replacement,
> since there might be some inconsistency in the cluster even though I have
> no way to find out?
>
> see
> http://wiki.apache.org/cassand
Oh, the error seems to come from JMX.
Sorry, but I don't seem to have any more error messages; the node repair just never
ends... and running strace on the process finds nothing, it is not doing anything.
Is there any way to get more information about this? Do I need to do a major
compaction on every column family? thank
> 1) Do I need to treat every node as failed and do a rolling replacement,
> since there might be some inconsistency in the cluster even though I have
> no way to find out?
see
http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSecond
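As a rough illustration of the rule that wiki page describes (repair should complete within GCGraceSeconds), here is a minimal sketch of the check. It assumes the default gc_grace_seconds of 864000 (10 days); the function name and the dates are hypothetical, so substitute your column family's actual setting and your own repair history.

```python
from datetime import datetime, timedelta

# Assumption: the default gc_grace_seconds (10 days). Check your CF settings.
DEFAULT_GC_GRACE_SECONDS = 864000


def repair_overdue(last_repair, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return True if more than gc_grace_seconds have passed since the
    last completed repair, i.e. deleted data may come back to life."""
    return (now - last_repair) > timedelta(seconds=gc_grace_seconds)


# Hypothetical dates, for illustration only.
print(repair_overdue(datetime(2011, 6, 28), datetime(2011, 7, 11)))  # True (13 days)
print(repair_overdue(datetime(2011, 7, 5), datetime(2011, 7, 11)))   # False (6 days)
```

If this returns True for any node, the procedure on the wiki page above applies.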
I am running RF=2 (I changed it from 2 to 3 and back to 2) on 3 nodes, and
I had not run node repair for more than 10 days; I was not aware that this is
critical. I ran node repair recently and one of the nodes always hangs...
From the log it seems to be doing nothing related to the repair.
So I have two problems:
All the important stuff is using QUORUM. Normal operation uses around
3-4 GB of heap out of 6. I've also tried running repair on a per CF
basis, and still no luck. I've found it's faster to bootstrap a node
again than repairing it.
Once I have the cluster in a sane state I'll try running a repair
Sounds like your non-repair workload is using too much of the heap.
Alternatively, you could have a very large supercolumn that causes the
OOM when it is read.
2011/7/9 Héctor Izquierdo Seliva :
> Hi Peter.
>
> I have a problem with repair, and it's that it always brings the node
> doing the rep
> Nope, only when something breaks
Unless you've been working at QUORUM, life is about to get trickier. Repair is
an essential part of running a Cassandra cluster; without it you risk data loss
and dead data coming back to life.
If you have been writing at QUORUM, so have a reasonable expectatio
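The QUORUM point above comes down to replica-overlap arithmetic: a quorum is floor(RF/2) + 1 replicas, and a read is guaranteed to touch at least one replica that took the write when R + W > RF. A minimal sketch of that arithmetic, with hypothetical function names:

```python
def quorum(rf):
    """Number of replicas that make up a quorum for replication factor rf."""
    return rf // 2 + 1


def reads_see_writes(rf, read_replicas, write_replicas):
    """True if every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


rf = 3
q = quorum(rf)
print(q, reads_see_writes(rf, q, q))  # 2 True
print(reads_see_writes(rf, 1, 1))     # False: reads and writes at ONE can miss each other
```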
Hi Peter.
I have a problem with repair, and it's that it always brings the node
doing the repairs down. I've tried setting index_interval to 5000, and
it still dies with OutOfMemory errors, or even worse, it generates
thousands of tiny sstables before dying.
I've tried like 20 repairs during thi
>> - Have you been running repair consistently ?
>
> Nope, only when something breaks
This is unrelated to the problem you were asking about, but if you
never run delete, make sure you are aware of:
http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair
http://wiki.apache.org/cas
Hi Aaron,
On Fri, 08-07-2011 at 14:47 -0700, aaron morton wrote:
> You may not lose data.
>
> - What version and what's the upgrade history?
All versions from 0.7.1 to 0.8.1. All CFs were in 0.8.1 format, though.
> - What RF / node count / CL ?
RF=3, node count = 6
> - Have you been runni
You may not lose data.
- What version and what's the upgrade history?
- What RF / node count / CL ?
- Have you been running repair consistently ?
- Is this on a single node or all nodes ?
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.c
Hi everyone,
I'm having thousands of these errors:
WARN [CompactionExecutor:1] 2011-07-08 16:36:45,705
CompactionManager.java (line 737) Non-fatal error reading row
(stacktrace follows)
java.io.IOError: java.io.IOException: Impossible row size
6292724931198053
at
org.apache.cassandra.db.