Update: I have posted a patch that would change nanoTime() to currentTimeInMillis() that actually uses the system timer to CASSANDRA-4432, you can try to apply and test if you would like, Michael.
On Tuesday 10 July 2012 at 19:53, Pavel Yaskevich wrote: > Hi Michael, > > Thanks for you investigation, I have created > https://issues.apache.org/jira/browse/CASSANDRA-4432 to cover related changes > and going to start with that ASAP. > > Best Regards > -- > Pavel Yaskevich > > > On Tuesday 10 July 2012 at 19:36, Michael Theroux wrote: > > > Hello, > > > > We are in the process of upgrading out cassandra installation from a single > > instance to a 6 node cluster with a replication factor of 3. We are using > > Cassandra 1.1.2. This is something I've done before in other environments, > > but now I've hit an interesting issue. The cluster has been setup and all > > the nodes have joined. I was about to update the replication factor to 3 > > via cassandra-cli: > > > > [open@unknown] use open; > > Authenticated to keyspace: open > > > > [default@open] update keyspace open with > > placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy' > > and strategy_options={us-east:3}; > > 4698e471-5a1d-30f2-aa11-761d204581ff > > Waiting for schema agreement... > > ... schemas agree across the cluster > > > > The above looks normal, but when I look at the schema, the replication > > factor is unchanged: > > > > [default@open] describe open; > > Keyspace: open: > > Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy > > Durable Writes: true > > Options: [us-east:1] > > Column Families: > > ... > > > > I couldn't figure out why this was, but then I saw this thread: > > > > http://www.datastax.com/support-forums/topic/cassandra-111-update-keyspace-not-working > > > > So I tried creating a new keyspace "ks" and looked at the results: > > > > [default@open] use system; > > Authenticated to keyspace: system > > [default@system] list schema_keyspace; > > schema_keyspace not found in current keyspace. > > [default@system] list schema_keyspaces; > > Using default limit of 100 > > Using default column limit of 100 > > ------------------- > > RowKey: open > > => (column=durable_writes, value=true, timestamp=530617107329814) > > => (column=name, value=open, timestamp=530617107329814) > > => (column=strategy_class, > > value=org.apache.cassandra.locator.NetworkTopologyStrategy, > > timestamp=530617107329814) > > => (column=strategy_options, value={"us-east":"1"}, > > timestamp=530617107329814) > > ------------------- > > RowKey: ks > > => (column=durable_writes, value=true, timestamp=42396175198913) > > => (column=name, value=ky, timestamp=42396175198913) > > => (column=strategy_class, > > value=org.apache.cassandra.locator.NetworkTopologyStrategy, > > timestamp=42396175198913) > > => (column=strategy_options, value={"datacenter1":"1"}, > > timestamp=42396175198913) > > > > > > Notice the "timestamp" on the new keyspace is MUCH younger than "open" (by > > more than a factor of 10). > > > > I didn't understand how this could be, as time has always been in sync. > > > > I decided to look at the code to see if I could spot anything. When > > cassandra-cli attempts to create a new keyspace, it uses thrift, and ends > > up here (in CassandraServer.java): > > > > public String system_update_keyspace(KsDef ks_def) > > throws InvalidRequestException, SchemaDisagreementException, TException > > { > > logger.debug("update_keyspace"); > > ThriftValidation.validateKeyspaceNotSystem(ks_def.name > > (http://ks_def.name)); > > ... > > MigrationManager.announceKeyspaceUpdate(KSMetaData.fromThrift(ks_def)); > > return Schema.instance.getVersion().toString(); > > ... > > } > > > > > > Which then calls: > > > > public static void announceKeyspaceUpdate(KSMetaData ksm) throws > > ConfigurationException > > { > > ksm.validate(); > > > > KSMetaData oldKsm = Schema.instance.getKSMetaData(ksm.name > > (http://ksm.name)); > > if (oldKsm == null) > > throw new ConfigurationException(String.format("Cannot update non existing > > keyspace '%s'.", ksm.name (http://ksm.name))); > > > > announce(oldKsm.toSchemaUpdate(ksm, System.nanoTime())); > > } > > > > It then uses the results of System.nanoTime in the timestamp. > > > > I wrote a simple java program to output System.nanoTime on the system in > > which I attempted to add the new keyspace, and the output was: > > > > 46627528340034 > > > > Which is in the realm of the keyspace I added above. System.nanoTime() is > > java instance dependent (nanoTime). You will get different values depending > > on what machine you run on and is not necessarily associated with you > > system clock. I ran this on several different machines, all verified to be > > in sync with NTP, and got massively different results. In fact, when I > > stopped and started my instance, my nanoTime became: > > > > 97234377869 > > > > I then created another keyspace "kw": > > > > [default@system] list schema_keyspaces; > > Using default limit of 100 > > Using default column limit of 100 > > ------------------- > > RowKey: open > > => (column=durable_writes, value=true, timestamp=530617107329814) > > => (column=name, value=open, timestamp=530617107329814) > > => (column=strategy_class, > > value=org.apache.cassandra.locator.NetworkTopologyStrategy, > > timestamp=530617107329814) > > => (column=strategy_options, value={"us-east":"1"}, > > timestamp=530617107329814) > > ------------------- > > RowKey: ks > > => (column=durable_writes, value=true, timestamp=42396175198913) > > => (column=name, value=ky, timestamp=42396175198913) > > => (column=strategy_class, > > value=org.apache.cassandra.locator.NetworkTopologyStrategy, > > timestamp=42396175198913) > > => (column=strategy_options, value={"datacenter1":"1"}, > > timestamp=42396175198913) > > ------------------- > > RowKey: kw > > => (column=durable_writes, value=true, timestamp=236211433609) > > => (column=name, value=kw, timestamp=236211433609) > > => (column=strategy_class, > > value=org.apache.cassandra.locator.NetworkTopologyStrategy, > > timestamp=236211433609) > > => (column=strategy_options, value={"datacenter1":"1"}, > > timestamp=236211433609) > > > > > > What I believe is happening is updates are not working because, as the > > thread I linked above indicated, Cassandra is seeing my update as older > > than the current entries, and is not honoring it. However, this is because > > it is using System.nanoTime in thrift, which has no relation to the system > > clock time. > > > > I tried to find something in JIRA, but I couldn't really find any issue > > that matched (and wasn't fixed for other reasons in earlier releases). Is > > there something simpler going on? > > > > Thanks, > > -Mike > > > > > > > >