Hi Michael,
Thanks for you investigation, I have created
https://issues.apache.org/jira/browse/CASSANDRA-4432 to cover related changes
and going to start with that ASAP.
Best Regards
--
Pavel Yaskevich
On Tuesday 10 July 2012 at 19:36, Michael Theroux wrote:
> Hello,
>
> We are in the process of upgrading out cassandra installation from a single
> instance to a 6 node cluster with a replication factor of 3. We are using
> Cassandra 1.1.2. This is something I've done before in other environments,
> but now I've hit an interesting issue. The cluster has been setup and all the
> nodes have joined. I was about to update the replication factor to 3 via
> cassandra-cli:
>
> [open@unknown] use open;
> Authenticated to keyspace: open
>
> [default@open] update keyspace open with
> placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy' and
> strategy_options={us-east:3};
> 4698e471-5a1d-30f2-aa11-761d204581ff
> Waiting for schema agreement...
> ... schemas agree across the cluster
>
> The above looks normal, but when I look at the schema, the replication factor
> is unchanged:
>
> [default@open] describe open;
> Keyspace: open:
> Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
> Durable Writes: true
> Options: [us-east:1]
> Column Families:
> ...
>
> I couldn't figure out why this was, but then I saw this thread:
>
> http://www.datastax.com/support-forums/topic/cassandra-111-update-keyspace-not-working
>
> So I tried creating a new keyspace "ks" and looked at the results:
>
> [default@open] use system;
> Authenticated to keyspace: system
> [default@system] list schema_keyspace;
> schema_keyspace not found in current keyspace.
> [default@system] list schema_keyspaces;
> Using default limit of 100
> Using default column limit of 100
> -------------------
> RowKey: open
> => (column=durable_writes, value=true, timestamp=530617107329814)
> => (column=name, value=open, timestamp=530617107329814)
> => (column=strategy_class,
> value=org.apache.cassandra.locator.NetworkTopologyStrategy,
> timestamp=530617107329814)
> => (column=strategy_options, value={"us-east":"1"}, timestamp=530617107329814)
> -------------------
> RowKey: ks
> => (column=durable_writes, value=true, timestamp=42396175198913)
> => (column=name, value=ky, timestamp=42396175198913)
> => (column=strategy_class,
> value=org.apache.cassandra.locator.NetworkTopologyStrategy,
> timestamp=42396175198913)
> => (column=strategy_options, value={"datacenter1":"1"},
> timestamp=42396175198913)
>
>
> Notice the "timestamp" on the new keyspace is MUCH younger than "open" (by
> more than a factor of 10).
>
> I didn't understand how this could be, as time has always been in sync.
>
> I decided to look at the code to see if I could spot anything. When
> cassandra-cli attempts to create a new keyspace, it uses thrift, and ends up
> here (in CassandraServer.java):
>
> public String system_update_keyspace(KsDef ks_def)
> throws InvalidRequestException, SchemaDisagreementException, TException
> {
> logger.debug("update_keyspace");
> ThriftValidation.validateKeyspaceNotSystem(ks_def.name (http://ks_def.name));
> ...
> MigrationManager.announceKeyspaceUpdate(KSMetaData.fromThrift(ks_def));
> return Schema.instance.getVersion().toString();
> ...
> }
>
>
> Which then calls:
>
> public static void announceKeyspaceUpdate(KSMetaData ksm) throws
> ConfigurationException
> {
> ksm.validate();
>
> KSMetaData oldKsm = Schema.instance.getKSMetaData(ksm.name (http://ksm.name));
> if (oldKsm == null)
> throw new ConfigurationException(String.format("Cannot update non existing
> keyspace '%s'.", ksm.name (http://ksm.name)));
>
> announce(oldKsm.toSchemaUpdate(ksm, System.nanoTime()));
> }
>
> It then uses the results of System.nanoTime in the timestamp.
>
> I wrote a simple java program to output System.nanoTime on the system in
> which I attempted to add the new keyspace, and the output was:
>
> 46627528340034
>
> Which is in the realm of the keyspace I added above. System.nanoTime() is
> java instance dependent (nanoTime). You will get different values depending
> on what machine you run on and is not necessarily associated with you system
> clock. I ran this on several different machines, all verified to be in sync
> with NTP, and got massively different results. In fact, when I stopped and
> started my instance, my nanoTime became:
>
> 97234377869
>
> I then created another keyspace "kw":
>
> [default@system] list schema_keyspaces;
> Using default limit of 100
> Using default column limit of 100
> -------------------
> RowKey: open
> => (column=durable_writes, value=true, timestamp=530617107329814)
> => (column=name, value=open, timestamp=530617107329814)
> => (column=strategy_class,
> value=org.apache.cassandra.locator.NetworkTopologyStrategy,
> timestamp=530617107329814)
> => (column=strategy_options, value={"us-east":"1"}, timestamp=530617107329814)
> -------------------
> RowKey: ks
> => (column=durable_writes, value=true, timestamp=42396175198913)
> => (column=name, value=ky, timestamp=42396175198913)
> => (column=strategy_class,
> value=org.apache.cassandra.locator.NetworkTopologyStrategy,
> timestamp=42396175198913)
> => (column=strategy_options, value={"datacenter1":"1"},
> timestamp=42396175198913)
> -------------------
> RowKey: kw
> => (column=durable_writes, value=true, timestamp=236211433609)
> => (column=name, value=kw, timestamp=236211433609)
> => (column=strategy_class,
> value=org.apache.cassandra.locator.NetworkTopologyStrategy,
> timestamp=236211433609)
> => (column=strategy_options, value={"datacenter1":"1"},
> timestamp=236211433609)
>
>
> What I believe is happening is updates are not working because, as the thread
> I linked above indicated, Cassandra is seeing my update as older than the
> current entries, and is not honoring it. However, this is because it is using
> System.nanoTime in thrift, which has no relation to the system clock time.
>
> I tried to find something in JIRA, but I couldn't really find any issue that
> matched (and wasn't fixed for other reasons in earlier releases). Is there
> something simpler going on?
>
> Thanks,
> -Mike
>
>