Hi Michael,

Thanks for your investigation. I have created https://issues.apache.org/jira/browse/CASSANDRA-4432 to cover the related changes and am going to start on that ASAP.
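
To illustrate the problem reported below: System.nanoTime() counts from an arbitrary origin that differs per JVM instance, whereas a timestamp derived from the wall clock is comparable across NTP-synced nodes and across restarts. The following is only a minimal sketch of that difference. The class and the wallClockMicros() helper are made up for illustration; they are not existing Cassandra code and not necessarily what the ticket will end up doing.

```java
import java.util.concurrent.TimeUnit;

// Illustration only: contrasts the two time sources discussed in this thread.
final class TimestampSourceDemo {

    /** Hypothetical helper: wall-clock time since the Unix epoch, in microseconds. */
    static long wallClockMicros() {
        return TimeUnit.MILLISECONDS.toMicros(System.currentTimeMillis());
    }

    public static void main(String[] args) {
        // nanoTime() is only meaningful for measuring elapsed time inside one JVM;
        // its origin is arbitrary and can change between machines and restarts.
        System.out.println("System.nanoTime(): " + System.nanoTime());

        // Epoch-based value: tied to the system clock, so it keeps increasing
        // across restarts as long as the clock itself is correct.
        System.out.println("wall-clock micros: " + wallClockMicros());
    }
}
```

With clocks kept in sync via NTP, values from such a helper would be comparable between nodes, which is not true of nanoTime() readings.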
Best Regards
--
Pavel Yaskevich

On Tuesday 10 July 2012 at 19:36, Michael Theroux wrote:

> Hello,
>
> We are in the process of upgrading our Cassandra installation from a single
> instance to a 6 node cluster with a replication factor of 3. We are using
> Cassandra 1.1.2. This is something I've done before in other environments,
> but now I've hit an interesting issue. The cluster has been set up and all the
> nodes have joined. I was about to update the replication factor to 3 via
> cassandra-cli:
>
> [open@unknown] use open;
> Authenticated to keyspace: open
>
> [default@open] update keyspace open with
>     placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy' and
>     strategy_options={us-east:3};
> 4698e471-5a1d-30f2-aa11-761d204581ff
> Waiting for schema agreement...
> ... schemas agree across the cluster
>
> The above looks normal, but when I look at the schema, the replication factor
> is unchanged:
>
> [default@open] describe open;
> Keyspace: open:
> Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
> Durable Writes: true
> Options: [us-east:1]
> Column Families:
> ...
>
> I couldn't figure out why this was, but then I saw this thread:
>
> http://www.datastax.com/support-forums/topic/cassandra-111-update-keyspace-not-working
>
> So I tried creating a new keyspace "ks" and looked at the results:
>
> [default@open] use system;
> Authenticated to keyspace: system
> [default@system] list schema_keyspace;
> schema_keyspace not found in current keyspace.
> [default@system] list schema_keyspaces;
> Using default limit of 100
> Using default column limit of 100
> -------------------
> RowKey: open
> => (column=durable_writes, value=true, timestamp=530617107329814)
> => (column=name, value=open, timestamp=530617107329814)
> => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=530617107329814)
> => (column=strategy_options, value={"us-east":"1"}, timestamp=530617107329814)
> -------------------
> RowKey: ks
> => (column=durable_writes, value=true, timestamp=42396175198913)
> => (column=name, value=ky, timestamp=42396175198913)
> => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=42396175198913)
> => (column=strategy_options, value={"datacenter1":"1"}, timestamp=42396175198913)
>
>
> Notice the "timestamp" on the new keyspace is MUCH smaller than the one on
> "open" (by more than a factor of 10).
>
> I didn't understand how this could be, as time has always been in sync.
>
> I decided to look at the code to see if I could spot anything. When
> cassandra-cli attempts to update a keyspace, it uses thrift, and ends up
> here (in CassandraServer.java):
>
> public String system_update_keyspace(KsDef ks_def)
> throws InvalidRequestException, SchemaDisagreementException, TException
> {
>     logger.debug("update_keyspace");
>     ThriftValidation.validateKeyspaceNotSystem(ks_def.name);
>     ...
>     MigrationManager.announceKeyspaceUpdate(KSMetaData.fromThrift(ks_def));
>     return Schema.instance.getVersion().toString();
>     ...
> }
>
>
> Which then calls:
>
> public static void announceKeyspaceUpdate(KSMetaData ksm) throws ConfigurationException
> {
>     ksm.validate();
>
>     KSMetaData oldKsm = Schema.instance.getKSMetaData(ksm.name);
>     if (oldKsm == null)
>         throw new ConfigurationException(String.format("Cannot update non existing keyspace '%s'.", ksm.name));
>
>     announce(oldKsm.toSchemaUpdate(ksm, System.nanoTime()));
> }
>
> It then uses the result of System.nanoTime() as the timestamp.
>
> I wrote a simple java program to output System.nanoTime() on the system on
> which I attempted to add the new keyspace, and the output was:
>
> 46627528340034
>
> Which is in the realm of the timestamp on the keyspace I added above.
> System.nanoTime() is dependent on the JVM instance. You will get different
> values depending on what machine you run on, and it is not necessarily
> associated with your system clock. I ran this on several different machines,
> all verified to be in sync with NTP, and got massively different results.
> In fact, when I stopped and started my instance, my nanoTime became:
>
> 97234377869
>
> I then created another keyspace "kw":
>
> [default@system] list schema_keyspaces;
> Using default limit of 100
> Using default column limit of 100
> -------------------
> RowKey: open
> => (column=durable_writes, value=true, timestamp=530617107329814)
> => (column=name, value=open, timestamp=530617107329814)
> => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=530617107329814)
> => (column=strategy_options, value={"us-east":"1"}, timestamp=530617107329814)
> -------------------
> RowKey: ks
> => (column=durable_writes, value=true, timestamp=42396175198913)
> => (column=name, value=ky, timestamp=42396175198913)
> => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=42396175198913)
> => (column=strategy_options, value={"datacenter1":"1"}, timestamp=42396175198913)
> -------------------
> RowKey: kw
> => (column=durable_writes, value=true, timestamp=236211433609)
> => (column=name, value=kw, timestamp=236211433609)
> => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=236211433609)
> => (column=strategy_options, value={"datacenter1":"1"}, timestamp=236211433609)
>
>
> What I believe is happening is updates are not working because, as the thread
> I linked above indicated, Cassandra is seeing my update as older than the
> current entries, and is not honoring it. However, this is because it is using
> System.nanoTime in thrift, which has no relation to the system clock time.
>
> I tried to find something in JIRA, but I couldn't really find any issue that
> matched (and wasn't fixed for other reasons in earlier releases). Is there
> something simpler going on?
>
> Thanks,
> -Mike
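
The behaviour described in the quoted message can be sketched with a toy model of last-write-wins reconciliation, assuming (as the message suggests) that the column with the higher timestamp survives. This is a simplified illustration, not Cassandra's actual code: the class, the Column type and the reconcile() method are hypothetical names, and letting the existing column win ties is an arbitrary choice made here. The timestamps are the ones from the listings above.

```java
// Toy model of last-write-wins reconciliation; all names are hypothetical.
final class LastWriteWinsDemo {

    /** A column value paired with the timestamp it was written with. */
    static final class Column {
        final String value;
        final long timestamp;

        Column(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Returns the column with the higher timestamp (the existing one wins ties). */
    static Column reconcile(Column existing, Column update) {
        return update.timestamp > existing.timestamp ? update : existing;
    }

    public static void main(String[] args) {
        // strategy_options for "open" as currently stored, with its timestamp.
        Column stored = new Column("{\"us-east\":\"1\"}", 530617107329814L);

        // The attempted update, stamped with a System.nanoTime() reading like
        // the one measured on the node (46627528340034).
        Column update = new Column("{\"us-east\":\"3\"}", 46627528340034L);

        // The update carries the smaller timestamp, so the stored value survives.
        Column winner = reconcile(stored, update);
        System.out.println("surviving value: " + winner.value);
    }
}
```

Running this prints the old {"us-east":"1"} options, which matches what "describe open" showed after the update appeared to succeed.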