RE: Cassandra 2.0.x OOM during startup - schema version inconsistency after reboot
Hi all,

We haven't heard any responses so far, and this issue has troubled us for quite some time. Here is another update:

We have noticed several times that the schema version may change after migration and reboot. Here is the scenario:

1. Two-node cluster (node1 & node2).

2. There are some schema changes, e.g. creating a few new column families. The cluster waits until both nodes have their schema versions in sync (via describe cluster) before moving on.

3. Right before node2 is rebooted, the schema version is consistent; however, after node2 reboots and starts servicing, the MigrationManager gossips a different schema version.

4. Afterwards, both nodes start exchanging schema messages indefinitely until one of the nodes dies.

We currently suspect the schema change is caused by replaying an old entry in the commit log. We wish to keep digging, but we need expert help on this.

I don't know if anyone has seen this before, or if there is anything wrong with our migration flow.

Thanks in advance.

Best regards,

Michael Fong

From: Michael Fong [mailto:michael.f...@ruckuswireless.com]
Sent: Thursday, April 21, 2016 6:41 PM
To: u...@cassandra.apache.org; dev@cassandra.apache.org
Subject: RE: Cassandra 2.0.x OOM during bootstrap

Hi all,

Here is some more information on what happened before the OOM on the rebooted node in a 2-node test cluster:

1. It seems the schema version changed on the rebooted node after reboot, i.e.

Before reboot,
Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 MigrationManager.java (line 328) Gossiping my schema version 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 MigrationManager.java (line 328) Gossiping my schema version 4cb463f8-5376-3baf-8e88-a5cc6a94f58f

After rebooting node 2,
Node 2: DEBUG [main] 2016-04-19 11:18:18,016 MigrationManager.java (line 328) Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b

2. After reboot, both nodes repeatedly send MigrationTasks to each other - we suspect this is related to the schema version (digest) mismatch after node 2 rebooted. Node 2 keeps submitting the migration task over 100+ times to the other node:

INFO [GossipStage:1] 2016-04-19 11:18:18,261 Gossiper.java (line 1011) Node /192.168.88.33 has restarted, now UP
INFO [GossipStage:1] 2016-04-19 11:18:18,262 TokenMetadata.java (line 414) Updating topology for /192.168.88.33
INFO [GossipStage:1] 2016-04-19 11:18:18,263 StorageService.java (line 1544) Node /192.168.88.33 state jump to normal
INFO [GossipStage:1] 2016-04-19 11:18:18,264 TokenMetadata.java (line 414) Updating topology for /192.168.88.33
DEBUG [GossipStage:1] 2016-04-19 11:18:18,265 MigrationManager.java (line 102) Submitting migration task for /192.168.88.33
DEBUG [GossipStage:1] 2016-04-19 11:18:18,265 MigrationManager.java (line 102) Submitting migration task for /192.168.88.33
DEBUG [MigrationStage:1] 2016-04-19 11:18:18,268 MigrationTask.java (line 62) Can't send schema pull request: node /192.168.88.33 is down.
DEBUG [MigrationStage:1] 2016-04-19 11:18:18,268 MigrationTask.java (line 62) Can't send schema pull request: node /192.168.88.33 is down.
DEBUG [RequestResponseStage:1] 2016-04-19 11:18:18,353 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.33
INFO [RequestResponseStage:1] 2016-04-19 11:18:18,353 Gossiper.java (line 978) InetAddress /192.168.88.33 is now UP
DEBUG [RequestResponseStage:1] 2016-04-19 11:18:18,353 MigrationManager.java (line 102) Submitting migration task for /192.168.88.33
DEBUG [RequestResponseStage:1] 2016-04-19 11:18:18,355 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.33
INFO [RequestResponseStage:1] 2016-04-19 11:18:18,355 Gossiper.java (line 978) InetAddress /192.168.88.33 is now UP
DEBUG [RequestResponseStage:1] 2016-04-19 11:18:18,355 MigrationManager.java (line 102) Submitting migration task for /192.168.88.33
DEBUG [RequestResponseStage:2] 2016-04-19 11:18:18,355 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.33
INFO [RequestResponseStage:2] 2016-04-19 11:18:18,355 Gossiper.java (line 978) InetAddress /192.168.88.33 is now UP
DEBUG [RequestResponseStage:2] 2016-04-19 11:18:18,356 MigrationManager.java (line 102) Submitting migration task for /192.168.88.33
...

On the other hand, Node 1 keeps updating its gossip information, followed by receiving and submitting MigrationTasks afterwards:

DEBUG [RequestResponseStage:3] 2016-04-19 11:18:18,332 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.34
INFO [RequestResponseStage:3] 2016-04-19 11:18:18,333 Gossiper.java (line 978) InetAddress /192.168.88.34 is now UP
DEBUG [RequestResponseStage:4] 2016-04-19 11:18:18,335 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.34
INFO [RequestResponseStage:4] 2016-04-19 11:18:18,335 Gossiper.java (line 978) InetAddress /192.168.88.3
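Aside: a quick way to confirm the divergence described above from outside the logs is to compare the schema_version each node reports in its system tables. Below is a minimal sketch, not the reporter's actual tooling, assuming the DataStax Java driver 3.x API and the standard system.local / system.peers columns; the contact point is a placeholder:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class SchemaVersionCheck {
    public static void main(String[] args) {
        // Placeholder contact point - use one of the cluster's nodes.
        try (Cluster cluster = Cluster.builder().addContactPoint("192.168.88.33").build();
             Session session = cluster.connect()) {
            // True only when every reachable node currently reports the same schema version.
            System.out.println("Schema in agreement: " + cluster.getMetadata().checkSchemaAgreement());
            // schema_version as reported by the node we are connected to ...
            Row local = session.execute("SELECT schema_version FROM system.local").one();
            System.out.println("local: " + local.getUUID("schema_version"));
            // ... and as reported for its peers.
            for (Row peer : session.execute("SELECT peer, schema_version FROM system.peers")) {
                System.out.println(peer.getInet("peer") + ": " + peer.getUUID("schema_version"));
            }
        }
    }
}

If the reported UUIDs differ and never converge, that matches the endless MigrationTask exchange shown in the logs above.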
Re: Cassandra 2.0.x OOM during startup - schema version inconsistency after reboot
I'd recommend you create a JIRA! That way you can get some traction on the issue. Obviously an OOM is never correct, even if your process is wrong in some way!

Best,
kjellman

Sent from my iPhone

> On May 8, 2016, at 8:48 PM, Michael Fong wrote:
Nodetool Cleanup Problem
Hi All,

I use Cassandra 3.4. When running the 'nodetool cleanup' command, I see this error:

error: Expecting URI in variable: [cassandra.config]. Found[cassandra.yaml]. Please prefix the file with [file:///] for local files and [file://<server>/] for remote files. If you are executing this from an external tool, it needs to set Config.setClientMode(true) to avoid loading configuration.
-- StackTrace --
org.apache.cassandra.exceptions.ConfigurationException: Expecting URI in variable: [cassandra.config]. Found[cassandra.yaml]. Please prefix the file with [file:///] for local files and [file://<server>/] for remote files. If you are executing this from an external tool, it needs to set Config.setClientMode(true) to avoid loading configuration.
at org.apache.cassandra.config.YamlConfigurationLoader.getStorageConfigURL(YamlConfigurationLoader.java:78)
at org.apache.cassandra.config.YamlConfigurationLoader.(YamlConfigurationLoader.java:92)
at org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:134)
at org.apache.cassandra.config.DatabaseDescriptor.(DatabaseDescriptor.java:121)
at org.apache.cassandra.config.CFMetaData$Builder.(CFMetaData.java:1160)
at org.apache.cassandra.config.CFMetaData$Builder.create(CFMetaData.java:1175)
at org.apache.cassandra.config.CFMetaData$Builder.create(CFMetaData.java:1170)
at org.apache.cassandra.cql3.statements.CreateTableStatement.metadataBuilder(CreateTableStatement.java:118)
at org.apache.cassandra.config.CFMetaData.compile(CFMetaData.java:413)
at org.apache.cassandra.schema.SchemaKeyspace.compile(SchemaKeyspace.java:238)
at org.apache.cassandra.schema.SchemaKeyspace.(SchemaKeyspace.java:88)
at org.apache.cassandra.config.Schema.(Schema.java:96)
at org.apache.cassandra.config.Schema.(Schema.java:50)
at org.apache.cassandra.tools.nodetool.Cleanup.execute(Cleanup.java:45)
at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:248)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:162)

Can anyone help me?

Best regards,
Jan Ali
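For what it's worth, the error message itself points at two possible workarounds, sketched below. This is only an illustration, assuming Cassandra 3.x's org.apache.cassandra.config.Config API as shown in the stack trace; the yaml path and class name are placeholders, not something from the original report:

import org.apache.cassandra.config.Config;

public class ConfigUriExample {
    public static void main(String[] args) {
        // YamlConfigurationLoader expects the cassandra.config property to be a URI,
        // not a bare file name, hence the "Please prefix the file with [file:///]" hint.
        // Placeholder path - point it at the yaml the node actually uses.
        System.setProperty("cassandra.config", "file:///etc/cassandra/cassandra.yaml");

        // Alternatively, an external tool that does not need the full server
        // configuration can declare itself a client before touching schema classes:
        Config.setClientMode(true);
    }
}

The same property can also be supplied to the JVM as a -Dcassandra.config=file:///... option instead of being set in code.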