This is an automated email from the ASF dual-hosted git repository.
patrick pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git
The following commit(s) were added to refs/heads/trunk by this push:
new 9aa3a0ba6c Revert "Merge branch 'codex/operator-docs-remediation' into trunk"
9aa3a0ba6c is described below
commit 9aa3a0ba6cbc8d7ba29209974efc5d3696d75812
Author: Patrick McFadin <[email protected]>
AuthorDate: Wed Apr 1 15:12:24 2026 -0700
Revert "Merge branch 'codex/operator-docs-remediation' into trunk"
This reverts commit 93894a630c717e71fde9e4b0362d4c2d9e05df4d, reversing
changes made to a100911d3a78a79afc53002e9eb6a499d1b5e3c4.
---
doc/modules/cassandra/nav.adoc | 1 +
.../getting-started/cassandra-quickstart.adoc | 80 ++++----
.../pages/getting-started/mtlsauthenticators.adoc | 130 +++++-------
.../cassandra/pages/installing/installing.adoc | 4 +-
.../managing/configuration/cass_env_sh_file.adoc | 153 ++++++++++----
.../configuration/cass_jvm_options_file.adoc | 25 +--
.../managing/configuration/cass_rackdc_file.adoc | 68 ++++---
.../managing/configuration/cass_topo_file.adoc | 25 ++-
.../managing/configuration/configuration.adoc | 141 ++++++++++---
.../pages/managing/configuration/index.adoc | 28 +--
.../pages/managing/operating/audit_logging.adoc | 226 +++++++++++++++++++++
.../pages/managing/operating/auditlogging.adoc | 2 +-
.../pages/managing/operating/auto_repair.adoc | 13 +-
.../pages/managing/operating/backups.adoc | 4 -
.../pages/managing/operating/bulk_loading.adoc | 15 --
.../managing/operating/compaction/overview.adoc | 32 ---
.../managing/operating/compaction/tombstones.adoc | 6 +-
.../pages/managing/operating/compaction/ucs.adoc | 27 +--
.../cassandra/pages/managing/operating/hints.adoc | 6 +-
.../pages/managing/operating/logging.adoc | 3 +-
.../managing/operating/onboarding-to-accord.adoc | 153 +++++++-------
.../cassandra/pages/managing/operating/repair.adoc | 22 +-
.../managing/operating/role_name_generation.adoc | 19 +-
.../pages/managing/operating/security.adoc | 105 ++++------
.../cassandra/pages/managing/operating/snitch.adoc | 15 +-
25 files changed, 758 insertions(+), 545 deletions(-)
diff --git a/doc/modules/cassandra/nav.adoc b/doc/modules/cassandra/nav.adoc
index ef80101c9f..09fcdd8843 100644
--- a/doc/modules/cassandra/nav.adoc
+++ b/doc/modules/cassandra/nav.adoc
@@ -98,6 +98,7 @@
**** xref:cassandra:managing/operating/hints.adoc[Hints]
**** xref:cassandra:managing/operating/logging.adoc[Logging]
***** xref:cassandra:managing/operating/auditlogging.adoc[Audit logging]
+***** xref:cassandra:managing/operating/audit_logging.adoc[Audit logging 2]
***** xref:cassandra:managing/operating/fqllogging.adoc[Full query logging]
**** xref:cassandra:managing/operating/metrics.adoc[Monitoring metrics]
**** xref:cassandra:managing/operating/repair.adoc[Repair]
diff --git a/doc/modules/cassandra/pages/getting-started/cassandra-quickstart.adoc b/doc/modules/cassandra/pages/getting-started/cassandra-quickstart.adoc
index 30d153a7f7..c0eb9070b5 100644
--- a/doc/modules/cassandra/pages/getting-started/cassandra-quickstart.adoc
+++ b/doc/modules/cassandra/pages/getting-started/cassandra-quickstart.adoc
@@ -1,34 +1,32 @@
-= Cassandra Quickstart
+= Cassandra Quickstart
== STEP 1: GET CASSANDRA USING DOCKER
-You'll need Docker Desktop for Mac, Docker Desktop for Windows, or an
-equivalent Docker installation on Linux.
+You'll need to have Docker Desktop for Mac, Docker Desktop for Windows, or similar software installed on your computer.
-Apache Cassandra is also available as a tarball or package; see the
-xref:installing/installing.adoc[installation guide].
+Apache Cassandra is also available as a tarball or package xref:_/download.adoc[download].
-[source,console]
+[source, console]
----
include::cassandra:example$BASH/docker_pull.sh[]
----
== STEP 2: START CASSANDRA
-A Docker network lets you reach the container ports without publishing
-them on the host.
+A Docker network allows us to access the container's ports without exposing them on the host.
-[source,console]
+[source, console]
----
include::cassandra:example$BASH/docker-network-run.sh[]
----
== STEP 3: CREATE FILES
-Create a file named `data.cql` and paste the following CQL script in it.
-The script creates a keyspace, a table, and sample rows:
+The Cassandra Query Language (CQL) is very similar to SQL but suited for the JOINless structure of Cassandra.
-[source,cql]
+Create a file named data.cql and paste the following CQL script in it. This script will create a keyspace, the layer at which Cassandra replicates its data, a table to hold the data, and insert some data into that table:
+
+[source, cql]
----
include::cassandra:example$CQL/create-keyspace-store.cql[]
@@ -39,76 +37,66 @@
include::cassandra:example$CQL/insert-shopping-cart-data.cql[]
== STEP 4: LOAD DATA WITH CQLSH
-Use `cqlsh` to load the script into the running container.
+The CQL shell, or `cqlsh`, is one tool to use in interacting with the database.
+We'll use it to load some data into the database using the script you just saved.
-[source,console]
+[source, console]
----
-include::cassandra:example$BASH/docker-run-cqlsh-load-data.sh[]
+include::cassandra:example$BASH/docker-run-cqlsh-load-data.sh[]
----
[NOTE]
====
-The Cassandra server can take a few seconds to finish starting. If the
-load step fails immediately, wait for the node to finish init and retry.
+The cassandra server itself (the first docker run command you ran) takes a few seconds to start up.
+The above command will throw an error if the server hasn't finished its init sequence yet, so give it a few seconds to spin up.
====
== STEP 5: INTERACTIVE CQLSH
-You can also use `cqlsh` interactively:
+Much like an SQL shell, you can also of course use `cqlsh` to run CQL commands interactively.
-[source,console]
+[source, console]
----
include::cassandra:example$BASH/docker-run-cqlsh-quickstart.sh[]
----
-This should get you a prompt like this:
+This should get you a prompt like so:
-[source,console]
+[source, console]
----
include::cassandra:example$RESULTS/docker-run-cqlsh-quickstart.result[]
----
== STEP 6: READ SOME DATA
-[source,cql]
+[source, cql]
----
-include::cassandra:example$CQL/select-data-from-shopping-cart.cql[]
+include::cassandra:example$CQL/select-data-from-shopping-cart.cql[]
----
== STEP 7: WRITE SOME MORE DATA
-[source,cql]
+[source, cql]
----
include::cassandra:example$CQL/insert-more-data-shopping-cart.cql[]
----
-== STEP 8: CHECK STATUS
-
-Before you clean up, confirm the node is healthy. `nodetool status`
-should show the node as `UN`:
+== STEP 8: CLEAN UP
-[source,console]
+[source, console]
----
-$ nodetool status
-Datacenter: dc1
-=======================
-Status=Up/Down
-|/ State=Normal/Leaving/Joining/Moving
--- Address Load Tokens Owns (effective) Host ID Rack
-UN 127.0.0.1 123.45 KiB 1 100.0% 01234567-89ab-cdef-0123-456789abcdef rack1
+include::cassandra:example$BASH/docker-kill-and-remove.sh[]
----
-== STEP 9: CLEAN UP
+== CONGRATULATIONS!
+
+Hey, that wasn't so hard, was it?
+
+To learn more, we suggest the following next steps:
+
+* Read through the xref:master@_:ROOT:cassandra-basics.adoc[Cassandra Basics] to learn main concepts and how Cassandra works at a high level.
+* Browse through the xref:master@_:ROOT:case-studies.adoc[Case Studies] to learn how other users in our worldwide community are getting value out of Cassandra.
-[source,console]
-----
-include::cassandra:example$BASH/docker-kill-and-remove.sh[]
-----
-This removes the container and the `cassandra` Docker network created
-for the quickstart.
-== CONGRATULATIONS!
-To learn more, read the xref:master@_:ROOT:cassandra-basics.adoc[Cassandra
-Basics] and xref:master@_:ROOT:case-studies.adoc[Case Studies].
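The quickstart hunks above revolve around three Docker steps: create a network, start a node on it, then run `cqlsh` against it. As a dry-run sketch only, with assumed names (network `cassandra`, image `cassandra:latest`) that are not taken from the referenced include files:

```shell
#!/bin/sh
# Dry-run sketch of the quickstart's Docker flow. The network name and image
# tag below are illustrative assumptions, not the contents of the include files.
quickstart_commands() {
  net=cassandra
  echo "docker network create $net"
  echo "docker run --rm -d --name cassandra --network $net cassandra:latest"
  echo "docker run --rm --network $net -it cassandra:latest cqlsh cassandra"
}

quickstart_commands
```

Printing the commands instead of running them keeps the sketch verifiable without a Docker daemon.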
diff --git a/doc/modules/cassandra/pages/getting-started/mtlsauthenticators.adoc b/doc/modules/cassandra/pages/getting-started/mtlsauthenticators.adoc
index 4ab76926f9..719f069739 100644
--- a/doc/modules/cassandra/pages/getting-started/mtlsauthenticators.adoc
+++ b/doc/modules/cassandra/pages/getting-started/mtlsauthenticators.adoc
@@ -1,27 +1,12 @@
= Getting started with mTLS authenticators
-When certificate-based authentication such as TLS is used for client and
-internode connections, `MutualTlsAuthenticator` and
-`MutualTlsInternodeAuthenticator` can authenticate clients by using the
-client certificate from the SSL handshake.
+When a certificate based authentication protocol like TLS is used for client and
+Internode connections, `MutualTlsAuthenticator` & `MutualTlsInternodeAuthenticator`
+can be used for the authentication by leveraging the client certificates from the
+SSL handshake.
-After the SSL handshake, the identity from the client certificate is
-extracted and only authorized users are granted access.
-
-== Certificate prerequisites
-
-Before enabling either authenticator, generate the certificate material
-you intend to use:
-
-* one CA certificate that issues the node and client certificates
-* one server certificate and key for each node
-* one client certificate and key for each user or service account
-* a truststore that contains the issuing CA
-* a keystore for each node
-
-If you use the default SPIFFE validator, the SPIFFE ID must be present
-in the certificate SAN. If you use a custom CN-based validator, the
-subject CN must match the identity you want to map to a role.
+After SSL handshake, identity from the client certificates is extracted and only
+authorized users will be granted access.
== What is an Identity
@@ -33,7 +18,7 @@ certificate conventions used in the deployment environment.
There is a default implementation of `MutualTlsCertificateValidator` with
https://spiffe.io/docs/latest/spiffe-about/spiffe-concepts/[SPIFFE] as the
identity
-of the certificates. This requires SPIFFE to be present in the SAN of the certificate.
+of the certificates.This requires spiffe to be present in the SAN of the certificate.
Instead of using `SPIFFE` based validator, a custom `CN` based validator that implements `MutualTlsCertificateValidator`
could be configured by the operator if required.
@@ -43,10 +28,10 @@ could be configured by the operator if required.
Note that the following steps uses SPIFFE identity as an example, If you are using
a custom validator, use appropriate identity in place of `spiffe://testdomain.com/testIdentifier/testValue`.
-*STEP 1: Add authorized users to `system_auth.identity_to_roles` table*
+*STEP 1: Add authorized users to system_auth.identity_to_roles table*
Note that only users with permissions to create/modify roles can add/remove identities.
-Client certificates with the identities in this table will be trusted by Cassandra.
+Client certificates with the identities in this table will be trusted by C*.
[source, plaintext]
----
ADD IDENTITY 'spiffe://testdomain.com/testIdentifier/testValue' TO ROLE 'read_only_user'
@@ -54,7 +39,7 @@ ADD IDENTITY 'spiffe://testdomain.com/testIdentifier/testValue' TO ROLE 'read_on
*STEP 2: Configure Cassandra.yaml with right properties*
-Configure `client_encryption_options` for mTLS connections.
+`client_encryption_options` configuration for mTLS connections
[source, plaintext]
----
client_encryption_options:
@@ -66,9 +51,8 @@ client_encryption_options:
truststore_password: cassandra
require_client_auth: true // to enable mTLS
----
-Configure the mTLS authenticator and validator for client connections.
-If you are implementing a custom validator, use that instead of the
-SPIFFE validator.
+Configure mTLS authenticator and the validator for client connections . If you are
+implementing a custom validator, use that instead of Spiffe validator
[source, plaintext]
----
authenticator:
@@ -79,37 +63,26 @@ authenticator:
*STEP 3: Bounce the cluster*
-After the bounce, Cassandra accepts mTLS connections from clients. If
-the identity is present in the `identity_to_roles` table, access is
-granted.
-
-== Verifying mTLS
-
-Verify the setup with the same client toolchain you will use in
-production. Connect with the client certificate and key, run a simple
-query, and confirm that the mapped role can access the cluster. Repeat
-the connection attempt with a certificate whose identity is not present
-in `identity_to_roles`; the connection should be rejected.
+After the bounce, C* will accept mTLS connections from the clients and if their
+identity is present in the `identity_to_roles` table, access will be granted.
== Configuring mTLS with password fallback authenticator for client connections
-Operators that want to migrate cannot immediately require mTLS
-authentication because that would break existing non-mTLS clients. To
-make a smooth transition, run Cassandra in optional mTLS mode and
-configure the authenticator to be
-`MutualTlsWithPasswordFallbackAuthenticator`, which accepts both
-certificate-based and password-based connections.
+Operators that wish to migrate cannot immediately change the configuration to require
+mTLS authentication as it will break existing non-mTLS based clients of the cluster.
+In order to make a smooth transition from non-mTLS based authentication to mTLS authentication,
+the operator can run Cassandra in optional mTLS mode and configure authenticator to be
+`MutualTlsWithPasswordFallbackAuthenticator` which can accept both certificate based
+and password based connections.
-Below are the steps to configure Cassandra in optional mTLS mode with
-the fallback authenticator. Note that the following steps use SPIFFE
-identity as an example. If you are using a custom validator, use the
-appropriate identity in place of
-`spiffe://testdomain.com/testIdentifier/testValue`.
+Below are the steps to configure C* in optional mTLS mode with fallback authenticator.
+Note that the following steps uses SPIFFE identity as an example, If you are using
+a custom validator, use appropriate identity in place of `spiffe://testdomain.com/testIdentifier/testValue`.
-*STEP 1: Add authorized users to `system_auth.identity_to_roles` table*
+*STEP 1: Add authorized users to system_auth.identity_to_roles table*
Note that only users with permissions to create/modify roles can add/remove identities.
-Client certificates with the identities in this table will be trusted by Cassandra.
+Client certificates with the identities in this table will be trusted by C*.
[source, plaintext]
----
ADD IDENTITY 'spiffe://testdomain.com/testIdentifier/testValue' TO ROLE 'read_only_user'
@@ -117,8 +90,8 @@ ADD IDENTITY 'spiffe://testdomain.com/testIdentifier/testValue' TO ROLE 'read_on
*STEP 2: Configure Cassandra.yaml with right properties*
-Configure `client_encryption_options` for mTLS connections. Note that
-`require_client_auth` is optional here.
+`client_encryption_options` configuration for mTLS connections, Note that require_client_auth configuration
+is optional.
[source, plaintext]
----
client_encryption_options:
@@ -130,9 +103,8 @@ client_encryption_options:
truststore_password: cassandra
require_client_auth: optional // to enable mTLS in optional mode
----
-Configure the fallback authenticator and validator for client
-connections. If you are implementing a custom validator, use that
-instead of the SPIFFE validator.
+Configure fallback authenticator and the validator for client connections . If you are
+implementing a custom validator, use that instead of Spiffe validator
[source, plaintext]
----
authenticator:
@@ -143,13 +115,13 @@ authenticator:
*STEP 3: Bounce the cluster*
-After the bounce, Cassandra accepts both mTLS and password-based
-connections. Use this configuration only during the transition phase.
-Set `require_client_auth` to `true` after all clients use mTLS.
+After the bounce, C* will accept both mTLS connections and password based connections from
+the clients. This configuration should be used during transition phase and the require_client_auth
+configuration should be set to true when all the clients start making mTLS connections to the cluster.
== Configuring mTLS authenticator for Internode connections
-The internode authenticator trusts certificates whose identities are present in
+Internode authenticator trusts certificates whose identities are present in
`internode_authenticator.parameters.trusted_peer_identities` if configured.
Otherwise, it trusts connections which have the same identity as the node.
@@ -160,13 +132,13 @@ connections from other nodes who have the same identity will be trusted if
`trusted_peer_identities` is not configured.
For example, if a node has `testIdentity` embedded in the certificate in
-the outbound keystore, it trusts connections from other nodes when
-their certificates have `testIdentity` embedded in them.
+outbound keystore, It trusts connections from other nodes when their certificates
+have `testIdentity` embedded in them.
There is an optional configuration `node_identity` that can be used to verify identity
extracted from the keystore to avoid any configuration errors.
-*STEP 1: Configure `server_encryption_options` in `cassandra.yaml`*
+*STEP 1: Configure server_encryption_options in cassandra.yaml*
[source, plaintext]
----
@@ -184,9 +156,8 @@ server_encryption_options:
*STEP 2: Configure Internode Authenticator and Validator*
-Configure the mTLS internode authenticator and validator. If you are
-implementing a custom validator, use that instead of the SPIFFE
-validator.
+Configure mTLS Internode authenticator and validator. If you are
+implementing a custom validator, use that instead of Spiffe validator
[source, plaintext]
----
internode_authenticator:
@@ -197,22 +168,19 @@ internode_authenticator:
----
*STEP 3: Bounce the cluster*
-Once all nodes in the cluster are restarted, all internode
-communications are authenticated by mTLS.
+Once all nodes in the cluster are restarted, all internode communications will be authenticated by mTLS.
== Migration from existing password based authentication
* For client connections, since the migration will not happen overnight,
- operators can run Cassandra in optional mTLS mode and use
- `MutualTlsWithPasswordFallbackAuthenticator`, which accepts both mTLS
- and password-based connections. Configure the settings in
- `cassandra.yaml`. Once all clients migrate to mTLS, turn off optional
- mode and set the authenticator to `MutualTlsAuthenticator`. From that
- point only mTLS client connections are accepted.
+the operators can run cassandra in optional mTLS mode and use
+`MutualTlsWithPasswordFallbackAuthenticator` which will accept both mTLS & password
+based connections, based on the type of connection client is making. These settings
+can be configured in `cassandra.yaml`. Once all the clients migrate to using mTLS,
+turn off optional mode and set the authenticator to be `MutualTlsAuthenticator`. From
+that point only mTLS client connections will be accepted.
* For Internode connections, while doing rolling upgrades from non-mTLS based configuration
- to mTLS based configuration, set `server_encryption_options.optional: true`
- for the new nodes so they can connect to old nodes that are still
- using non-mTLS based configuration during upgrade. After this, change
- the internode authenticator to `MutualTlsInternodeAuthenticator` and
- turn off the optional mode by setting
- `server_encryption_options.optional: false`.
+to mTLS based configuration, set `server_encryption_options.optional:true` for the new nodes to
+be able to connect to old nodes which are still using non-mTLS based configuration during upgrade.
+After this, change the internode authenticator to be `MutualTlsInternodeAuthenticator` and turn off the optional
+mode by setting `server_encryption_options.optional:false`.
\ No newline at end of file
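Both authenticator hunks above add identities shaped like `spiffe://testdomain.com/testIdentifier/testValue` to `system_auth.identity_to_roles`. As a rough illustration only, this hypothetical helper (not part of Cassandra or of any SPIFFE library) checks that a string has the basic SPIFFE shape of scheme, trust domain, and non-empty path:

```shell
#!/bin/sh
# Hypothetical helper, not part of Cassandra: accept strings shaped like a
# SPIFFE ID ("spiffe://" scheme, a trust domain, and a non-empty path).
is_spiffe_id() {
  case "$1" in
    spiffe://*/?*) return 0 ;;
    *)             return 1 ;;
  esac
}
```

A real deployment would rely on the configured `MutualTlsCertificateValidator`, which extracts the identity from the certificate SAN rather than from a plain string.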
diff --git a/doc/modules/cassandra/pages/installing/installing.adoc b/doc/modules/cassandra/pages/installing/installing.adoc
index 3cdfc90e13..e5043914e0 100644
--- a/doc/modules/cassandra/pages/installing/installing.adoc
+++ b/doc/modules/cassandra/pages/installing/installing.adoc
@@ -144,7 +144,7 @@ Result::
--
[source,console]
----
-include::cassandra:example$RESULTS/curl_verify_sha.result[]
+include::example$RESULTS/curl_verify_sha.result[]
----
--
====
@@ -155,7 +155,7 @@ include::cassandra:example$RESULTS/curl_verify_sha.result[]
include::cassandra:example$BASH/tarball.sh[]
----
+
-The files are extracted to an `apache-cassandra-<version>/` directory.
+The files will be extracted to the `apache-cassandra-4.0.0/` directory.
This is the tarball installation location.
. Located in the tarball installation location are the directories for the scripts, binaries, utilities, configuration, data and log files:
+
diff --git a/doc/modules/cassandra/pages/managing/configuration/cass_env_sh_file.adoc b/doc/modules/cassandra/pages/managing/configuration/cass_env_sh_file.adoc
index 52a492e6ea..309b15b17d 100644
--- a/doc/modules/cassandra/pages/managing/configuration/cass_env_sh_file.adoc
+++ b/doc/modules/cassandra/pages/managing/configuration/cass_env_sh_file.adoc
@@ -1,72 +1,157 @@
= cassandra-env.sh file
-Use `cassandra-env.sh` for JVM settings that are calculated at startup,
-such as heap sizing derived from the host. If a setting is static, use
-the `jvm-*` files instead.
-
-The practical risk in this file is not syntax. It is starting a node
-with the wrong startup behavior. A bad token, ring delay, or ring-state
-choice can change when the node joins the cluster and how it serves
-traffic.
-
-The most common settings in this file are:
+The `cassandra-env.sh` bash script file can be used to pass additional
+options to the Java virtual machine (JVM), such as maximum and minimum
+heap size, rather than setting them in the environment. If the JVM
+settings are static and do not need to be computed from the node
+characteristics, the `cassandra-jvm-options` files should be used
+instead. For example, commonly computed values are the heap sizes, using
+the system values.
+
+For example, add
+`JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"` to the
+`cassandra_env.sh` file and run the command-line `cassandra` to start.
+The option is set from the `cassandra-env.sh` file, and is equivalent to
+starting Cassandra with the command-line option
+`cassandra -Dcassandra.load_ring_state=false`.
+
+The `-D` option specifies the start-up parameters in both the command
+line and `cassandra-env.sh` file. The following options are available:
== `cassandra.auto_bootstrap=false`
-Disable bootstrap on the first start of a new node. Use it only when you
-know the node should not join the ring immediately.
+Facilitates setting auto_bootstrap to false on initial set-up of the
+cluster. The next time you start the cluster, you do not need to change
+the `cassandra.yaml` file on each node to revert to true, the default
+value.
== `cassandra.available_processors=<number_of_processors>`
-Limit the processors Cassandra sees on a host that runs multiple
-instances.
+In a multi-instance deployment, multiple Cassandra instances will
+independently assume that all CPU processors are available to it. This
+setting allows you to specify a smaller set of processors.
== `cassandra.boot_without_jna=true`
-Start Cassandra without JNA only when you accept the loss of JNA-backed
-integrations.
+If JNA fails to initialize, Cassandra fails to boot. Use this command to
+boot Cassandra without JNA.
== `cassandra.config=<directory>`
-Set the location of `cassandra.yaml`.
+The directory location of the `cassandra.yaml file`. The default
+location depends on the type of installation.
+
+== `cassandra.ignore_dynamic_snitch_severity=true|false`
+
+Setting this property to true causes the dynamic snitch to ignore the
+severity indicator from gossip when scoring nodes. Explore failure
+detection and recovery and dynamic snitching for more information.
+
+*Default:* false
== `cassandra.initial_token=<token>`
-Use this only when virtual nodes are disabled. With vnodes enabled,
-Cassandra chooses tokens automatically.
+Use when virtual nodes (vnodes) are not used. Sets the initial
+partitioner token for a node the first time the node is started. Note:
+Vnodes are highly recommended as they automatically select tokens.
+
+*Default:* disabled
== `cassandra.join_ring=true|false`
-Set to false when the node should start but stay out of the ring until a
-later `nodetool join`.
+Set to false to start Cassandra on a node but not have the node join the
+cluster. You can use `nodetool join` and a JMX call to join the ring
+afterwards.
+
+*Default:* true
== `cassandra.load_ring_state=true|false`
-Set to false to clear the node's saved gossip state on restart.
+Set to false to clear all gossip state for the node on restart.
+
+*Default:* true
+
+== `cassandra.partitioner=<partitioner>`
+
+Set the partitioner.
+
+*Default:* org.apache.cassandra.dht.Murmur3Partitioner
+
+== `cassandra.prepared_statements_cache_size_in_bytes=<cache_size>`
+
+Set the cache size for prepared statements.
== `cassandra.replace_address=<listen_address of dead node>|<broadcast_address of dead node>`
-Use this during node replacement. The replacement node must start with
-an empty data directory.
+To replace a node that has died, restart a new node in its place
+specifying the `listen_address` or `broadcast_address` that the new node
+is assuming. The new node must not have any data in its data directory,
+the same state as before bootstrapping. Note: The `broadcast_address`
+defaults to the `listen_address` except when using the
+`Ec2MultiRegionSnitch`.
+
+== `cassandra.replayList=<table>`
+
+Allow restoring specific tables from an archived commit log.
== `cassandra.ring_delay_ms=<number_of_ms>`
-Controls how long a joining node waits before it announces itself as
-part of the ring.
+Defines the amount of time a node waits to hear from other nodes before
+formally joining the ring.
*Default:* 1000ms
-*Inference:* if you shorten this too much, the node can join before it
-has heard enough cluster state to make a stable decision about the ring.
+== `cassandra.native_transport_port=<port>`
+
+Set the port on which the CQL native transport listens for clients.
+
+*Default:* 9042
+
+== `cassandra.rpc_port=<port>`
+
+Set the port for the Thrift RPC service, which is used for client
+connections.
+
+*Default:* 9160
+
+== `cassandra.storage_port=<port>`
+
+Set the port for inter-node communication.
+
+*Default:* 7000
+
+== `cassandra.ssl_storage_port=<port>`
+
+Set the SSL port for encrypted communication.
+
+*Default:* 7001
+
+== `cassandra.start_native_transport=true|false`
+
+Enable or disable the native transport server. See
+`start_native_transport` in `cassandra.yaml`.
+
+*Default:* true
+
+== `cassandra.start_rpc=true|false`
+
+Enable or disable the Thrift RPC server.
+
+*Default:* true
+
+== `cassandra.triggers_dir=<directory>`
+
+Set the default location for the trigger JARs.
+
+*Default:* conf/triggers
== `cassandra.write_survey=true`
-Use only for write-performance experiments.
+For testing new compaction and compression strategies. It allows you to
+experiment with different strategies and benchmark write performance
+differences without affecting the production workload.
== `consistent.rangemovement=true|false`
-Keep this enabled for normal bootstraps. Disabling it skips the
-consistency check that protects token movement during bootstrap.
-
-If you need a one-line rule, keep host-specific startup calculations in
-`cassandra-env.sh` and static JVM settings in the `jvm-*` files.
+Set to true makes Cassandra perform bootstrap safely without violating
+consistency. False disables this.
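The `-D` mechanism this file documents amounts to appending flags to the `JVM_OPTS` variable before the launcher starts the JVM. A minimal sketch, using two flags from the list above purely as examples:

```shell
#!/bin/sh
# Sketch of how cassandra-env.sh accumulates startup flags: each -D option is
# appended to JVM_OPTS, which the launcher later passes to the JVM. The final
# echo stands in for actually starting Cassandra.
JVM_OPTS=""
JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"
JVM_OPTS="$JVM_OPTS -Dcassandra.join_ring=false"
echo "would start: cassandra$JVM_OPTS"
```

Because the value is built up incrementally, operators can append site-specific flags at the end of the file without disturbing the defaults above it.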
diff --git a/doc/modules/cassandra/pages/managing/configuration/cass_jvm_options_file.adoc b/doc/modules/cassandra/pages/managing/configuration/cass_jvm_options_file.adoc
index 560d02469f..132669f643 100644
--- a/doc/modules/cassandra/pages/managing/configuration/cass_jvm_options_file.adoc
+++ b/doc/modules/cassandra/pages/managing/configuration/cass_jvm_options_file.adoc
@@ -1,19 +1,20 @@
= jvm-* files
Several files for JVM configuration are included in Cassandra. The
-`jvm-server.options` file and the JDK-specific `jvmN-server.options`
-files are the main place for static JVM settings on cluster nodes.
-Likewise, the `jvm-clients.options` and corresponding
-`jvmN-clients.options` files configure JVM settings for tools like
-`nodetool` and `sstableloader`.
+`jvm-server.options` file, and corresponding JDK specific files
+`jvmN-server.options` are the main file for settings that affect
+the operation of the Cassandra JVM on cluster nodes. The file includes
+startup parameters, general JVM settings such as garbage collection, and
+heap settings. Likewise, the `jvm-clients.options` and corresponding
+`jvmN-clients.options` files can be used to configure JVM settings for
+clients like `nodetool` and the `sstable` tools.
-Use these files for settings that do not need to be computed from the
-host at startup. If a value depends on node state or host-specific
-resources, keep it in `cassandra-env.sh` instead.
+See each file for examples of settings.
[NOTE]
====
-The `jvm-*` files replaced `cassandra-env.sh` for static JVM settings in
-Cassandra 3.0 and later. `cassandra-env.sh` still matters when the JVM
-options must be derived from the machine.
-====
+The `jvm-\*` files replace the `cassandra-env.sh` file used in Cassandra
+versions prior to Cassandra 3.0. The `cassandra-env.sh` bash script file
+is still useful if JVM settings must be dynamically calculated based on
+system settings. The `jvm-*` files only store static JVM settings.
+====
\ No newline at end of file
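In practice the `jvm-*` files hold one flag per line, with `#` marking comments. As a sketch of how such a file is consumed (hypothetical filter, not Cassandra's actual startup code), only uncommented lines beginning with `-` are treated as active flags:

```shell
#!/bin/sh
# Hypothetical filter, not Cassandra's real launcher: given jvm-*.options
# style content on stdin, emit only the active JVM flags (lines that begin
# with "-"); comment lines starting with "#" fall through.
active_jvm_flags() {
  grep '^-'
}

sample='# heap settings
-Xms4G
-Xmx4G
# -XX:+UseG1GC (commented out)'
printf '%s\n' "$sample" | active_jvm_flags
```

This is why a commented-out flag in `jvm-server.options` is inert: the launcher never sees lines that do not start with `-`.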
diff --git a/doc/modules/cassandra/pages/managing/configuration/cass_rackdc_file.adoc b/doc/modules/cassandra/pages/managing/configuration/cass_rackdc_file.adoc
index 2f173641db..bb79cb5271 100644
--- a/doc/modules/cassandra/pages/managing/configuration/cass_rackdc_file.adoc
+++ b/doc/modules/cassandra/pages/managing/configuration/cass_rackdc_file.adoc
@@ -1,30 +1,50 @@
= cassandra-rackdc.properties file
-Use `cassandra-rackdc.properties` when the snitch needs to know the
-local rack and datacenter. These values are part of replica placement.
-If the datacenter name here does not match the keyspace replication map,
-Cassandra can place replicas in the wrong failure domain without an
-obvious error.
-
-Choose the snitch that matches your environment:
-
-[cols="1,2,3",options="header"]
-|===
-|Environment |Recommended snitch |Why
-|On-premises or other self-managed infrastructure |GossipingPropertyFileSnitch |Put the rack and datacenter in one local file and gossip them to the rest of the cluster.
-|Single-AZ or single-region AWS |Ec2Snitch |Read the rack and datacenter from AWS metadata instead of managing them by hand.
-|Multi-region AWS |Ec2MultiRegionSnitch |Use when nodes need public-address awareness across regions.
-|Custom or mixed topology metadata |PropertyFileSnitch |Use `cassandra-topologies.properties` when every node needs an explicit, shared topology map.
-|===
-
-The file is case-sensitive. Use the same datacenter spelling that appears
-in your keyspace replication settings.
+Several `snitch` options use the `cassandra-rackdc.properties`
+configuration file to determine which `datacenters` and racks cluster
+nodes belong to. Information about the network topology allows requests
+to be routed efficiently and to distribute replicas evenly. The
+following snitches can be configured here:
+
+* GossipingPropertyFileSnitch
+* AWS EC2 single-region snitch
+* AWS EC2 multi-region snitch
+
+The GossipingPropertyFileSnitch is recommended for production. This
+snitch uses the datacenter and rack information configured in a local
+node's `cassandra-rackdc.properties` file and propagates the information
+to other nodes using `gossip`. It is the default snitch and the settings
+in this properties file are enabled.
+
+The AWS EC2 snitches are configured for clusters in AWS. This snitch
+uses the `cassandra-rackdc.properties` options to designate one of two
+AWS EC2 datacenter and rack naming conventions:
+
+* legacy: Datacenter name is the portion of the availability zone name
+preceding the last "-" when the zone ends in -1, and includes the number
+otherwise. Rack name is the portion of the availability zone name
+following the last "-".
++
+____
+Examples: us-west-1a => dc: us-west, rack: 1a; us-west-2b => dc:
+us-west-2, rack: 2b;
+____
+* standard: Datacenter name is the standard AWS region name, including
+the number. Rack name is the region plus the availability zone letter.
++
+____
+Examples: us-west-1a => dc: us-west-1, rack: us-west-1a; us-west-2b =>
+dc: us-west-2, rack: us-west-2b;
+____
+
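The two naming conventions can be sketched in Python. This is an illustrative reading of the rules above, not Cassandra's actual Ec2Snitch code:

```python
# Illustrative sketch of the two EC2 naming conventions described above;
# not Cassandra's actual Ec2Snitch implementation.
def ec2_names(az, convention="standard"):
    region, zone = az.rsplit("-", 1)      # "us-west-2b" -> ("us-west", "2b")
    if convention == "standard":
        # dc: full region name including the number; rack: whole AZ name
        return region + "-" + zone[:-1], az
    # legacy: drop a trailing "-1" from the dc name, keep other numbers
    number = zone[:-1]
    dc = region if number == "1" else region + "-" + number
    return dc, zone
```

For example, `ec2_names("us-west-1a", "legacy")` gives `("us-west", "1a")`, matching the examples above.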
+Either snitch can be set to use the local or internal IP address when
+communication is not across different datacenters.
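For GossipingPropertyFileSnitch, a minimal `cassandra-rackdc.properties` might look like the following sketch; the names are illustrative and must match the datacenter names used in your keyspace replication settings:

[source,properties]
----
dc=DC1
rack=RAC1
# prefer_local=true
----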
== GossipingPropertyFileSnitch
=== `dc`
-Datacenter name. The value is case-sensitive.
+Name of the datacenter. The value is case-sensitive.
*Default value:* DC1
@@ -45,12 +65,14 @@ Datacenter and rack naming convention. Options are `legacy`
or
[NOTE]
====
-Use the `legacy` value if you are upgrading a pre-4.0 cluster.
+You must use the `legacy` value if you are upgrading a pre-4.0 cluster.
====
+== Either snitch
+
=== `prefer_local`
-Use the local or internal IP address when communication is not across
-different datacenters. *This option is commented out by default.*
+Option to use the local or internal IP address when communication is not
+across different datacenters. *This option is commented out by default.*
*Default value:* true
diff --git
a/doc/modules/cassandra/pages/managing/configuration/cass_topo_file.adoc
b/doc/modules/cassandra/pages/managing/configuration/cass_topo_file.adoc
index ee6d2c09e2..ed874e7672 100644
--- a/doc/modules/cassandra/pages/managing/configuration/cass_topo_file.adoc
+++ b/doc/modules/cassandra/pages/managing/configuration/cass_topo_file.adoc
@@ -1,16 +1,19 @@
= cassandra-topologies.properties file
-Use `cassandra-topologies.properties` with `PropertyFileSnitch` when you
-want every node's datacenter and rack to come from an explicit file. The
-file must be identical on every node. If any node disagrees, Cassandra
-can make inconsistent replica-placement decisions.
-
-The datacenter and rack names are case-sensitive and should match the
-names used in your keyspace replication settings.
-
-If you use a snitch other than `PropertyFileSnitch`, use
-xref:cassandra:managing/configuration/cass_rackdc_file.adoc[cassandra-rackdc.properties]
-instead.
+The `PropertyFileSnitch` `snitch` option uses the
+`cassandra-topologies.properties` configuration file to determine which
+datacenters and racks cluster nodes belong to. If another snitch is
+used, the
xref:cassandra:managing/configuration/cass_rackdc_file.adoc[cassandra-rackdc.properties]
file must be used instead. The snitch determines
+network topology (proximity by rack and datacenter) so that requests are
+routed efficiently and the database can distribute replicas evenly.
+
+Include every node in the cluster in the properties file, defining your
+datacenter names as in the keyspace definition. The datacenter and rack
+names are case-sensitive.
+
+The `cassandra-topologies.properties` file must be copied identically to
+every node in the cluster.
== Example
diff --git
a/doc/modules/cassandra/pages/managing/configuration/configuration.adoc
b/doc/modules/cassandra/pages/managing/configuration/configuration.adoc
index 2453ca640d..9b4df248b2 100644
--- a/doc/modules/cassandra/pages/managing/configuration/configuration.adoc
+++ b/doc/modules/cassandra/pages/managing/configuration/configuration.adoc
@@ -1,20 +1,21 @@
-= Parameter renaming and units
-:navtitle: Parameter renaming
+= Liberating cassandra.yaml Parameters' Names from Their Units
-This page covers the `cassandra.yaml` renaming work from
-CASSANDRA-15234. It is a reference for translating older examples and
-for checking upgrade compatibility when you touch a renamed setting.
+== Objective
-== What changed
+Three big things happened as part of
https://issues.apache.org/jira/browse/CASSANDRA-15234[CASSANDRA-15234]:
-* Many `cassandra.yaml` parameters were renamed to follow the
- `noun_verb` pattern.
-* Duration, data storage, and data rate values now accept explicit
- units.
-* Cassandra still accepts the old names for backward compatibility.
+1) Renaming of parameters in `cassandra.yaml` to follow the form `noun_verb`.
+
+2) Liberating `cassandra.yaml` parameters from their units (DataStorage,
DataRate and Duration) and introducing a temporary smallest accepted unit per
parameter (only for DataStorage and Duration ones).
+
+3) A backward compatibility framework that supports the old names and the old
unitless value format until at least the next major release.
-Supported units:
+== Renamed Parameters
+
+The community has decided to allow operators to specify units for Cassandra
parameters of types duration, data storage, and data rate.
+All parameters which had a particular unit (most of the time added as a suffix
to their name) can now be set using the format [value][unit]. The unit
suffix has been removed from their names.
+Supported units:
[cols=",",options="header",]
|===
|Parameter Type |Units Supported
@@ -23,6 +24,36 @@ Supported units:
|Data Rate | B/s, MiB/s, KiB/s
|===
+
+*Example*:
+
+Old name and value format:
+....
+permissions_update_interval_ms: 0
+....
+New name and possible value formats:
+....
+permissions_update_interval: 0ms
+permissions_update_interval: 0s
+permissions_update_interval: 0d
+permissions_update_interval: 0us
+permissions_update_interval: 0µs
+....
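The [value][unit] format can be illustrated with a small Python sketch. The unit table here is abridged to ASCII units, and this is not Cassandra's actual `DurationSpec` parser:

```python
import re

# Abridged unit table (ASCII units only); illustrative, not DurationSpec.
UNIT_TO_MS = {"d": 86_400_000, "h": 3_600_000, "m": 60_000,
              "s": 1_000, "ms": 1, "us": 0.001, "ns": 0.000001}

def parse_duration_ms(value):
    """Parse a '[value][unit]' duration string into milliseconds."""
    m = re.fullmatch(r"(\d+)\s*(d|h|ms|m|s|us|ns)", value.strip())
    if m is None:
        raise ValueError("Invalid duration: %r" % value)
    return int(m.group(1)) * UNIT_TO_MS[m.group(2)]
```

Under this reading, `permissions_update_interval: 10s` and `permissions_update_interval: 10000ms` describe the same interval.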
+
+The work in
https://issues.apache.org/jira/browse/CASSANDRA-15234[CASSANDRA-15234] was
already quite big, so we decided
+to introduce the notion of the smallest allowed unit per parameter for
duration and data storage parameters. What does this mean?
+Cassandra's internals still use the old units for parameters. If, for example,
seconds are used internally, but you want
+to add a value in nanoseconds in `cassandra.yaml`, you will get a
configuration exception that contains the following information:
+....
+Accepted units: seconds, minutes, hours, days.
+....
+
+Why was this needed?
+Because we can run into precision issues. The full solution to the problem is
to internally convert all parameters' values
+to the smallest unit supported by Cassandra. A series of
tickets to assess and, where appropriate, migrate
+our parameters to the smallest unit (incrementally, post
https://issues.apache.org/jira/browse/CASSANDRA-15234[CASSANDRA-15234]) will be
opened in the future.
+
+
[cols=",,",options="header",]
|===
|Old Name |New Name |The Smallest Supported Unit
@@ -113,31 +144,77 @@ Supported units:
|cache_load_timeout_seconds |cache_load_timeout |s
|===
-== Example
+Another to-do is to add JMX methods supporting the new format. However, we may
abandon this if virtual tables support
+configuration changes in the near future.
-If you change this:
+*Notes for Cassandra Developers*:
-....
-permissions_update_interval_ms: 0
-....
+- Most of our parameters have already been moved to the new framework as part
of https://issues.apache.org/jira/browse/CASSANDRA-15234[CASSANDRA-15234].
+`@Replaces` is the annotation to be used when you make changes to any
configuration parameters in the `Config` class and `cassandra.yaml`, and you
want to add backward
+compatibility with previous Cassandra versions. The `Converters` class
enumerates the different methods used for backward compatibility.
+`IDENTITY` is the one used for a name change only. For more information about
the other Converters, please check the JavaDoc in the class.
+For backward compatibility the virtual table `Settings` contains both the old
and the new
+parameters with the old and the new value format. The only exceptions at the
moment are the following three parameters: `key_cache_save_period`,
+`row_cache_save_period` and `counter_cache_save_period`, which appear only
once with the new value format.
+The old names and value format can still be used at least until the next major
release. A deprecation warning is emitted on startup.
+If the parameter is of type duration, data rate or data storage, its value
should be accompanied by a unit when the new name is used.
-use this instead:
+- Please follow the new format `noun_verb` when adding new configuration
parameters.
-....
-permissions_update_interval: 0ms
-....
+- Please consider adding any new parameters with the lowest unit supported by
Cassandra when possible. Our new types also
+support long and integer upper bounds, depending on your needs. All options
for configuration parameters' types are nested
+classes in our three main abstract classes - `DurationSpec`,
`DataStorageSpec`, `DataRateSpec`.
+
+- If for some reason you consider the smallest unit for a new parameter
shouldn’t be the one that is supported as such in
+Cassandra, you can use the rest of the nested classes in `DurationSpec`,
`DataStorageSpec`. The smallest allowed unit is
+the one we use internally for the property, so we don't have to do conversions
to bigger units, which would lead to precision
+problems. This is a problem only with `DurationSpec` and `DataStorageSpec`.
`DataRateSpec` is handled internally as a double.
+
+- New parameters should be added as non-negative numbers. For parameters where
you would have set -1 to disable in the past, you might
+want to consider a separate flag parameter or a null value. If you use the
null value, please ensure that any default value
+introduced in the DatabaseDescriptor to handle it is also duplicated in any
related setters.
+
+- Parameters of type data storage, duration and data rate cannot be set to
Long.MAX_VALUE (former parameters of long type)
+and Integer.MAX_VALUE (former parameters of int type). Those numbers are used
during conversion between units to prevent
+an overflow from happening.
+
+- Any time you add @Replaces with a name change, we need to add an entry in
this
https://github.com/riptano/ccm/blob/808b6ca13526785b0fddfe1ead2383c060c4b8b6/ccmlib/common.py#L62[Python
dictionary in CCM] to support the same backward compatibility as SnakeYAML.
+
+Please follow the instructions in requirements.txt in the DTest repo on how to
retag CCM after committing any changes.
+You might also want to test with tagging in your own repo to ensure that there
will be no surprises after retagging the official CCM.
+Please be sure to run a full CI after any changes, as CCM affects a few of our
testing suites.
-If you choose a unit smaller than Cassandra stores internally, the
-configuration fails with an accepted-units message rather than silently
-truncating the value. That is intentional and avoids precision loss.
+- Some configuration parameters are not announced in cassandra.yaml, but they
are present in the Config class for advanced users.
+Those should also use the new framework and naming conventions.
-== Compatibility
+- As we have backward compatibility, we didn't have to rework all Python
DTests to set config in the new format, and we exercise
+backward compatibility while testing. Please consider adding any new tests
using the new names and value format, though.
-Cassandra keeps the old names and old value format for backward
-compatibility. When both the old and new names are present, the last one
-wins. That behavior is useful during upgrades, but it also means that
-duplicate or conflicting keys can hide mistakes.
+- In-JVM upgrade tests do not support per-version configuration at the moment,
so we have to keep the old names and value format.
+Currently, if we try to use the new config for a newer version, it will be
silently ignored and the default config will be used.
+
+- SnakeYAML supports overloading of parameters. This means that if you add a
configuration parameter more than once in your `cassandra.yaml`,
+the last occurrence is the one loaded into Config during Cassandra
startup. To make upgrades as non-disruptive as possible,
+we continue to support that behavior when both the old and the new name of a
parameter appear in `cassandra.yaml`.
+
+- Please ensure that any JMX setters/getters update the Config class
properties and not local copies. The Settings virtual table
+reports the configuration loaded at any time from the Config class.
+
+*Example*:
+
+If you add the following to `cassandra.yaml`:
+....
+hinted_handoff_enabled: true
+enabled_hinted_handoff: false
+....
+
+the following will be loaded into `Config`:
+....
+hinted_handoff_enabled: false
+....
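The last-occurrence-wins loading described above can be mimicked with a toy flat-YAML reader in Python. This is a sketch of the behavior only, not Cassandra's SnakeYAML-based loader:

```python
# Toy flat "key: value" reader that mimics SnakeYAML's overloading:
# when a key appears more than once, the last occurrence wins.
def load_flat_config(text):
    config = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments/blank lines
        if not line:
            continue
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()   # later entries overwrite
    return config
```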
-If you touch one of these settings, use the new name and an explicit
-unit. Keep any compatibility exceptions local to the upgrade procedure
-you are following.
+https://issues.apache.org/jira/browse/CASSANDRA-17379[CASSANDRA-17379] was
opened to improve the user experience and deprecate the overloading.
+By default, we refuse to start Cassandra with a config containing both old and
new config keys for the same parameter. Start
+Cassandra with `-Dcassandra.allow_new_old_config_keys=true` to override. For
historical reasons duplicate config keys
+in `cassandra.yaml` are allowed by default; start Cassandra with
`-Dcassandra.allow_duplicate_config_keys=false` to disallow this.
+Please note that `key_cache_save_period`, `row_cache_save_period`,
`counter_cache_save_period` will be affected only by
`-Dcassandra.allow_duplicate_config_keys`.
\ No newline at end of file
diff --git a/doc/modules/cassandra/pages/managing/configuration/index.adoc
b/doc/modules/cassandra/pages/managing/configuration/index.adoc
index 04b01a3ef5..6c40a8ce86 100644
--- a/doc/modules/cassandra/pages/managing/configuration/index.adoc
+++ b/doc/modules/cassandra/pages/managing/configuration/index.adoc
@@ -1,22 +1,12 @@
= Configuring Cassandra
:navtitle: Configuring
-This section explains where Cassandra configuration lives, which file to
-edit for a given change, and what to verify after the change.
-
-Start with the file that matches the setting you need:
-
-* xref:cassandra:managing/configuration/cass_yaml_file.adoc[cassandra.yaml]
for cluster behavior and table defaults
-* xref:cassandra:managing/configuration/cass_jvm_options_file.adoc[jvm-*
files] for static JVM settings
-*
xref:cassandra:managing/configuration/cass_env_sh_file.adoc[cassandra-env.sh]
for startup values that depend on the host
-*
xref:cassandra:managing/configuration/cass_rackdc_file.adoc[cassandra-rackdc.properties]
for rack and datacenter metadata
-*
xref:cassandra:managing/configuration/cass_topo_file.adoc[cassandra-topologies.properties]
for explicit per-node topology mapping
-*
xref:cassandra:managing/configuration/cass_cl_archive_file.adoc[commitlog-archiving.properties]
for commitlog archiving
-*
xref:cassandra:managing/configuration/cass_logback_xml_file.adoc[logback.xml]
for logging configuration
-
-If you are changing `cassandra.yaml` parameter names or units, see
-xref:cassandra:managing/configuration/configuration.adoc[Parameter renaming
and units].
-
-After a change, restart or reload only if the target page says it is
-safe to do so. Some settings are read once at startup, while others take
-effect only after a full node restart.
+This section describes how to configure Apache Cassandra.
+
+* xref:cassandra:managing/configuration/cass_yaml_file.adoc[cassandra.yaml]
+*
xref:cassandra:managing/configuration/cass_rackdc_file.adoc[cassandra-rackdc.properties]
+* xref:cassandra:managing/configuration/cass_env_sh_file.adoc[cassandra-env.sh]
+*
xref:cassandra:managing/configuration/cass_topo_file.adoc[cassandra-topologies.properties]
+*
xref:cassandra:managing/configuration/cass_cl_archive_file.adoc[commitlog-archiving.properties]
+* xref:cassandra:managing/configuration/cass_logback_xml_file.adoc[logback.xml]
+* xref:cassandra:managing/configuration/cass_jvm_options_file.adoc[jvm-* files]
diff --git a/doc/modules/cassandra/pages/managing/operating/audit_logging.adoc
b/doc/modules/cassandra/pages/managing/operating/audit_logging.adoc
new file mode 100644
index 0000000000..89387d043d
--- /dev/null
+++ b/doc/modules/cassandra/pages/managing/operating/audit_logging.adoc
@@ -0,0 +1,226 @@
+= Audit Logging
+
+Audit logging in Cassandra logs every incoming CQL command request,
+as well as authentication (successful/unsuccessful login) to a Cassandra node.
+Currently, two implementations are provided.
+A custom logger can also be implemented and injected with its class name as a
parameter in
+the `cassandra.yaml` file.
+
+* `BinAuditLogger`: an efficient way to log events to file in a binary
+format (community-recommended logger for performance)
+* `FileAuditLogger`: logs events to the `audit/audit.log` file using the
+slf4j logger
+
+== What does audit logging capture
+
+Audit logging captures the following events:
+
+* Successful as well as unsuccessful login attempts
+* All database commands attempted or successfully executed via the
+native CQL protocol
+
+== Limitations
+
+Executing prepared statements will log the query as provided by the
+client in the prepare call, along with the execution timestamp and all
+other attributes (see below).
+Actual values bound for prepared statement execution will not show up in the
audit log.
+
+== What does audit logging log
+
+Each audit log implementation has access to the following attributes,
+and for the default text based logger these fields are concatenated with
+pipes to yield the final message.
+
+* `user`: User name (if available)
+* `host`: Host IP where the command is being executed
+* `source ip address`: Source IP address from where the request initiated
+* `source port`: Source port number from where the request initiated
+* `timestamp`: Unix timestamp
+* `type`: Type of the request (SELECT, INSERT, etc.)
+* `category`: Category of the request (DDL, DML, etc.)
+* `keyspace`: Keyspace (if applicable) on which the request is targeted to
+be executed
+* `scope`: Table/aggregate name/function name/trigger name etc., as
+applicable
+* `operation`: CQL command being executed
+
+== How to configure
+
+The audit log can be configured using the `cassandra.yaml` file.
+To use audit logging on one node, either edit that file or enable and
configure using `nodetool`.
+
+=== cassandra.yaml configurations for AuditLog
+
+The following options are supported:
+
+* `enabled`: Enables/disables the audit log
+* `logger`: Class name of the logger/custom logger.
+* `audit_logs_dir`: Audit logs directory location; if not set, defaults to
+[.title-ref]#cassandra.logdir.audit# or [.title-ref]#cassandra.logdir# +
+/audit/
+* `included_keyspaces`: Comma separated list of keyspaces to be included
+in audit log, default - includes all keyspaces
+* `excluded_keyspaces`: Comma separated list of keyspaces to be excluded
+from audit log, default - excludes no keyspace except
+[.title-ref]#system#, [.title-ref]#system_schema# and
+[.title-ref]#system_virtual_schema#
+* `included_categories`: Comma separated list of Audit Log Categories to
+be included in audit log, default - includes all categories
+* `excluded_categories`: Comma separated list of Audit Log Categories to
+be excluded from audit log, default - excludes no category
+* `included_users`: Comma separated list of users to be included in
+audit log, default - includes all users
+* `excluded_users`: Comma separated list of users to be excluded from
+audit log, default - excludes no user
+
+The list of available audit log categories is: QUERY, DML, DDL, DCL,
+OTHER, AUTH, ERROR, PREPARE
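Putting these options together, a `cassandra.yaml` fragment might look like the following sketch; the keyspace, user, and category filters shown are illustrative, and the exact nesting of the `logger` entry may vary by version:

[source,yaml]
----
audit_logging_options:
    enabled: true
    logger:
      - class_name: BinAuditLogger
    # audit_logs_dir: /var/log/cassandra/audit/
    included_keyspaces: ks1, ks2
    excluded_categories: QUERY, PREPARE
    excluded_users: batch_user
----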
+
+=== NodeTool command to enable AuditLog
+
+The `nodetool enableauditlog` command enables AuditLog with the
`cassandra.yaml` file defaults.
+Those defaults can be overridden using options with this nodetool command.
+
+[source,none]
+----
+nodetool enableauditlog
+----
+
+==== Options
+
+`--excluded-categories`::
+ Comma separated list of Audit Log Categories to be excluded for audit
+ log. If not set the value from cassandra.yaml will be used
+`--excluded-keyspaces`::
+ Comma separated list of keyspaces to be excluded for audit log. If not
+  set the value from cassandra.yaml will be used. Please remember that
+  [.title-ref]#system#, [.title-ref]#system_schema# and
+  [.title-ref]#system_virtual_schema# are excluded by default; if you
+  are overwriting this option via nodetool, remember to add these
+  keyspaces back if you don't want them in audit logs
+`--excluded-users`::
+ Comma separated list of users to be excluded for audit log. If not set
+ the value from cassandra.yaml will be used
+`--included-categories`::
+ Comma separated list of Audit Log Categories to be included for audit
+ log. If not set the value from cassandra.yaml will be used
+`--included-keyspaces`::
+ Comma separated list of keyspaces to be included for audit log. If not
+ set the value from cassandra.yaml will be used
+`--included-users`::
+ Comma separated list of users to be included for audit log. If not set
+ the value from cassandra.yaml will be used
+`--logger`::
+ Logger name to be used for AuditLogging. Default BinAuditLogger. If
+ not set the value from cassandra.yaml will be used
+
+=== NodeTool command to disable AuditLog
+
+The `nodetool disableauditlog` command disables AuditLog.
+
+[source,none]
+----
+nodetool disableauditlog
+----
+
+=== NodeTool command to reload AuditLog filters
+
+The `nodetool enableauditlog` command can be used to reload auditlog filters
with either defaults or previous `loggername` and
+updated filters:
+
+[source,none]
+----
+nodetool enableauditlog --loggername <Default/ existing loggerName>
--included-keyspaces <New Filter values>
+----
+
+== View the contents of AuditLog Files
+
+The `auditlogviewer` is used to view the contents of the audit binlog file in
human readable text format.
+
+[source,none]
+----
+auditlogviewer <path1> [<path2>...<pathN>] [options]
+----
+
+=== Options
+
+`-f,--follow`::
+  Upon reaching the end of the log, continue indefinitely,
+  waiting for more records
+`-r,--roll_cycle`::
+  How often the log file was rolled. May be
+  necessary for Chronicle to correctly parse file names. (MINUTELY,
+  HOURLY, DAILY). Default HOURLY.
+`-h,--help`::
+ display this help message
+
+For example, to dump the contents of audit log files to the console:
+
+[source,none]
+----
+auditlogviewer /logs/cassandra/audit
+----
+
+results in
+
+[source,none]
+----
+LogMessage:
user:anonymous|host:localhost/X.X.X.X|source:/X.X.X.X|port:60878|timestamp:1521158923615|type:USE_KS|category:DDL|ks:dev1|operation:USE
"dev1"
+----
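A message in this pipe-delimited form can be split into fields with a short sketch. This is illustrative Python, not part of Cassandra, and the field set may vary by version:

```python
def parse_audit_message(line):
    """Split a pipe-delimited audit log message into a field dict."""
    prefix = "LogMessage: "
    if line.startswith(prefix):
        line = line[len(prefix):]
    fields = {}
    for part in line.split("|"):
        key, _, value = part.partition(":")   # split on the first ':' only
        fields[key] = value
    return fields
```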
+
+== Configuring BinAuditLogger
+
+To use `BinAuditLogger` as a logger in AuditLogging, set the logger to
`BinAuditLogger` in the `cassandra.yaml` file
+under the `audit_logging_options` section.
+`BinAuditLogger` can be further configured using its advanced options in
`cassandra.yaml`.
+
+=== Advanced Options for BinAuditLogger
+
+`block`::
+  Indicates if the AuditLog should block if it falls behind or
+  should drop audit log records. Default is set to `true` so that
+  AuditLog records won't be lost
+`max_queue_weight`::
+ Maximum weight of in memory queue for records waiting to be written to
+ the audit log file before blocking or dropping the log records.
+ Default is set to `256 * 1024 * 1024`
+`max_log_size`::
+ Maximum size of the rolled files to retain on disk before deleting the
+ oldest file. Default is set to `16L * 1024L * 1024L * 1024L`
+`roll_cycle`::
+ How often to roll Audit log segments so they can potentially be
+ reclaimed. Some available options are:
+ `FIVE_MINUTELY, FAST_HOURLY, FAST_DAILY,
+ LargeRollCycles.LARGE_DAILY, LargeRollCycles.XLARGE_DAILY,
LargeRollCycles.HUGE_DAILY.`
+ For more options, refer: net.openhft.chronicle.queue.RollCycles.
+ Default is set to `"FAST_HOURLY"`
+
+== Configuring FileAuditLogger
+
+To use `FileAuditLogger` as a logger in AuditLogging, set the class name in
the `cassandra.yaml` file and configure
+the audit log events to flow through a separate log file instead of system.log.
+
+[source,xml]
+----
+<!-- Audit Logging (FileAuditLogger) rolling file appender to audit.log -->
+<appender name="AUDIT" class="ch.qos.logback.core.rolling.RollingFileAppender">
+ <file>${cassandra.logdir}/audit/audit.log</file>
+ <rollingPolicy
class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
+ <!-- rollover daily -->
+
<fileNamePattern>${cassandra.logdir}/audit/audit.log.%d{yyyy-MM-dd}.%i.zip</fileNamePattern>
+ <!-- each file should be at most 50MB, keep 30 days worth of history, but
at most 5GB -->
+ <maxFileSize>50MB</maxFileSize>
+ <maxHistory>30</maxHistory>
+ <totalSizeCap>5GB</totalSizeCap>
+ </rollingPolicy>
+ <encoder>
+ <pattern>%-5level [%thread] %date{"yyyy-MM-dd'T'HH:mm:ss,SSS", UTC} %F:%L
- %msg%n</pattern>
+ </encoder>
+</appender>
+
+<!-- Audit Logging additivity to redirect audit logging events to
audit/audit.log -->
+<logger name="org.apache.cassandra.audit" additivity="false" level="INFO">
+ <appender-ref ref="AUDIT"/>
+</logger>
+----
diff --git a/doc/modules/cassandra/pages/managing/operating/auditlogging.adoc
b/doc/modules/cassandra/pages/managing/operating/auditlogging.adoc
index 94488f8f62..2a542c256e 100644
--- a/doc/modules/cassandra/pages/managing/operating/auditlogging.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/auditlogging.adoc
@@ -260,7 +260,7 @@ In order to change ``roll_cycle`` on a node, you have to:
The ``block`` option specifies whether audit logging should block writing or
drop log records if the audit logging falls behind. Supported boolean values
are ``true`` (default) or ``false``.
-For example: ``block: false`` to drop records (e.g. if audit is used for
troubleshooting)
+For example: ``block: false`` to drop records (e.g. if audit is used for
troubleshooting)
For regulatory compliance purposes, it's a good practice to explicitly set
``block: true`` to prevent any regression in case of future default value
change.
diff --git a/doc/modules/cassandra/pages/managing/operating/auto_repair.adoc
b/doc/modules/cassandra/pages/managing/operating/auto_repair.adoc
index 98afa3075c..e989c49d2a 100644
--- a/doc/modules/cassandra/pages/managing/operating/auto_repair.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/auto_repair.adoc
@@ -7,9 +7,6 @@ Auto Repair is a fully automated scheduler that provides repair
orchestration wi
significantly reduces operational overhead by eliminating the need for
operators to deploy external tools to submit and
manage repairs.
-Read xref:managing/operating/repair.adoc[Repair] first if you want the
underlying repair behavior before you tune the
-scheduler.
-
At a high level, a dedicated thread pool is assigned to the repair scheduler.
The repair scheduler in Cassandra
maintains a new replicated table, `system_distributed.auto_repair_history`,
which stores the repair history for all
nodes, including details such as the last repair time. The scheduler selects
the node(s) to begin repairs and
@@ -65,8 +62,8 @@ a smaller set of data, so a shorter `min_repair_interval`
such as `1h` is recomm
==== Enabling Incremental Repair on existing clusters with a large amount of
data
[#enabling-ir]
One should be careful when enabling incremental repair on a cluster for the
first time. While
-xref:#repair-token-range-splitter[RepairTokenRangeSplitter] includes a default
configuration to attempt to
-gracefully migrate to incremental repair over time, failure to take proper
precaution could overwhelm the cluster with
+xref:#repair-token-range-splitter[RepairTokenRangeSplitter] includes a default
configuration to attempt to gracefully
+migrate to incremental repair over time, failure to take proper precaution
could overwhelm the cluster with
xref:managing/operating/compaction/overview.adoc#types-of-compaction[anticompactions].
No matter how one goes about enabling and running incremental repair, it is
recommended to run a cycle of full repairs
@@ -105,10 +102,6 @@ configuration that attempts to conservatively migrate
100GiB of compressed data
requirements, data set and capability of a cluster's hardware, one may
consider tuning these values to be more
aggressive or conservative.
-For example, on a large cluster with wide tables or STCS data that will need a
lot of anticompaction, start with the
-default splitter and full repairs first, then keep the incremental schedule
conservative until you have observed the
-impact on compaction, repair duration, and disk usage. That ordering is an
operator choice, not a Cassandra guarantee.
-
=== Previewing Repaired Data
The `preview_repaired` repair type executes repairs over the repaired data set
to detect possible data inconsistencies.
@@ -244,7 +237,7 @@ disturbances or failures. However, making the repairs too
short can lead to over
the main bottleneck.
- *Minimize the impact on hosts*: Repairs should not heavily affect the host
systems. For incremental repairs, this
- might involve anticompaction work. In full repairs, streaming large amounts
of data—especially with wide partitions
+might involve anti-compaction work. In full repairs, streaming large amounts
of data—especially with wide partitions
can lead to issues with disk usage and higher compaction costs.
- *Reduce overstreaming*: The Merkle tree, which represents data within each
partition and range, has a maximum size.
diff --git a/doc/modules/cassandra/pages/managing/operating/backups.adoc
b/doc/modules/cassandra/pages/managing/operating/backups.adoc
index c1cf0f4603..1bc949d051 100644
--- a/doc/modules/cassandra/pages/managing/operating/backups.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/backups.adoc
@@ -21,10 +21,6 @@ Apache Cassandra supports two kinds of backup strategies.
A _snapshot_ is a copy of a table’s SSTable files at a given time,
created via hard links.
The DDL to create the table is stored as well.
-Because snapshots are hard links to the existing SSTable files, they are
-lightweight, but they are not a separate physical copy on different
-storage. If the underlying SSTable files are removed or the disk is
-lost, the snapshot goes with them.
Snapshots may be created by a user or created automatically.
The setting `snapshot_before_compaction` in the `cassandra.yaml` file
determines if
snapshots are created before each compaction.
diff --git a/doc/modules/cassandra/pages/managing/operating/bulk_loading.adoc
b/doc/modules/cassandra/pages/managing/operating/bulk_loading.adoc
index 3307988037..939d0fc58c 100644
--- a/doc/modules/cassandra/pages/managing/operating/bulk_loading.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/bulk_loading.adoc
@@ -34,14 +34,6 @@ The `sstableloader` is the main tool for bulk uploading data.
conforming to the replication strategy and replication factor.
The table to upload data to does need not to be empty.
-Before you run `sstableloader`, verify these prerequisites:
-
-* the target keyspace and table already exist
-* the SSTables were produced by a compatible Cassandra version
-* the directory path matches the target `keyspace/table` name
-* you know which live nodes to use with `--nodes`
-* there is enough disk and network capacity for the stream and the follow-up
compaction work
-
The only requirements to run `sstableloader` are:
* One or more comma separated initial hosts to connect to and get ring
@@ -199,13 +191,6 @@ id | name | publisher
(2 rows)
----
-Also verify cluster-side effects after the load:
-
-* `nodetool status` still shows the target nodes as `UN`
-* `nodetool netstats` or server logs show the streaming finished cleanly
-* the expected row counts or sample queries succeed from client traffic
-* disk usage and pending compactions are within the range you expect for the
imported SSTables
-
==== Bulk Loading from a Snapshot
Restoring a snapshot of a table to the same table can be easily accomplished:
diff --git
a/doc/modules/cassandra/pages/managing/operating/compaction/overview.adoc
b/doc/modules/cassandra/pages/managing/operating/compaction/overview.adoc
index 9223b4ac2a..3fdbb6d5a7 100644
--- a/doc/modules/cassandra/pages/managing/operating/compaction/overview.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/compaction/overview.adoc
@@ -16,26 +16,6 @@ As SSTables accumulate, the distribution of data can require
accessing more and
To keep the database healthy, Cassandra periodically merges SSTables and
discards old data.
This process is called
https://cassandra.apache.org/_/glossary.html#compaction[compaction].
-For operators, the practical effect is simple: fewer SSTables means fewer
files to consult on reads, less disk spent on
-obsolete versions, and fewer tombstones waiting to be purged by later
compactions.
-
-[source,text]
-----
-memtables flush
- |
- v
- SSTables on disk
- |
- v
-compaction selects overlapping SSTables
- |
- v
-merge newest rows and drop shadowed data when safe
- |
- v
- fewer, cleaner SSTables
-----
-
== Why must compaction be run?
Since SSTables are consulted during read operations, it is important to keep
the number of SSTables small.
@@ -100,18 +80,6 @@ With LCS the resulting SSTable will end up in L0.
Different compaction strategies are available to optimize for different
workloads.
Picking the right compaction strategy for your workload will ensure the best
performance for both querying and for compaction itself.
-The table below is a starting point, not a hard rule. Use it to pick the first
strategy for a table, then tune only after
-observing read amplification, write amplification, and tombstone pressure in
your cluster.
-
-[cols="1,2,2",options="header"]
-|===
-| Strategy | Good starting point | Watch for
-| UCS | New tables and mixed workloads | Tune `scaling_parameters`,
`target_sstable_size`, and `base_shard_count` after observing real read/write
pressure.
-| STCS | Simple fallback when another strategy does not fit | Higher read
amplification on read-heavy workloads and more overlap between SSTables.
-| LCS | Read-heavy workloads with lots of updates and deletes | Extra write
amplification and higher compaction cost.
-| TWCS | TTL-heavy, mostly immutable time-series data | Data with mixed
lifetimes can make expired SSTables harder to drop.
-|===
-
xref:cassandra:managing/operating/compaction/ucs.adoc[`Unified Compaction
Strategy (UCS)`]::
UCS is a good choice for most workloads and is recommended for new workloads.
This compaction strategy is designed to handle a wide variety of workloads.
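As a rough illustration of the strategy guidance above, a hypothetical picker; the workload labels are ours, not Cassandra configuration values, and the mapping only paraphrases this page's rules of thumb.

```python
# Rule-of-thumb starting points only; tune after observing real
# read/write amplification and tombstone pressure.
STARTING_STRATEGY = {
    "new_or_mixed": "UnifiedCompactionStrategy",      # UCS: recommended default
    "read_heavy_with_updates": "LeveledCompactionStrategy",
    "ttl_time_series": "TimeWindowCompactionStrategy",
    "fallback": "SizeTieredCompactionStrategy",
}

def pick_strategy(workload: str) -> str:
    # Default to UCS when the workload is unknown, matching the page's
    # recommendation for new workloads.
    return STARTING_STRATEGY.get(workload, STARTING_STRATEGY["new_or_mixed"])

print(pick_strategy("ttl_time_series"))  # TimeWindowCompactionStrategy
print(pick_strategy("unknown"))          # UnifiedCompactionStrategy
```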
diff --git
a/doc/modules/cassandra/pages/managing/operating/compaction/tombstones.adoc
b/doc/modules/cassandra/pages/managing/operating/compaction/tombstones.adoc
index 0b46dd215e..780ec31856 100644
--- a/doc/modules/cassandra/pages/managing/operating/compaction/tombstones.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/compaction/tombstones.adoc
@@ -19,12 +19,10 @@ After `gc_grace_seconds` has elapsed, the data is eligible
for permanent removal
====
== Why tombstones?
-
+
The tombstone represents the deletion of an object, either a row or column
value.
This approach is used instead of removing values because of the distributed
nature of {cassandra}.
Once an object is marked as a tombstone, queries will ignore all values that
are time-stamped previous to the tombstone insertion.
-The operator consequence is that deletes only stay safe if replicas see the
tombstone before `gc_grace_seconds` expires.
-That is why repair frequency and `gc_grace_seconds` must be planned together.
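The interplay between repair cadence and `gc_grace_seconds` can be sanity-checked numerically. A sketch assuming the default 10-day grace period; the safety margin is an illustrative choice, not a Cassandra setting.

```python
DEFAULT_GC_GRACE_SECONDS = 864_000  # 10 days, the cassandra.yaml default

def repair_cadence_is_safe(repair_interval_s: int,
                           gc_grace_s: int = DEFAULT_GC_GRACE_SECONDS,
                           margin_s: int = 86_400) -> bool:
    """True if a full repair cycle finishes at least `margin_s` before
    tombstones become eligible for purging."""
    return repair_interval_s + margin_s <= gc_grace_s

print(repair_cadence_is_safe(7 * 86_400))   # True: weekly repair fits in 10 days
print(repair_cadence_is_safe(14 * 86_400))  # False: biweekly risks resurrection
```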
== Preventing Data Resurrection
@@ -190,4 +188,4 @@ Below is the output from sstabledump, showing how a delete
operation generates a
"deletion_info" : { "marked_deleted" : "2025-12-02T21:58:26.185187Z",
"local_delete_time" : "2025-12-02T21:58:26Z" }
}
}
-----
+----
\ No newline at end of file
diff --git a/doc/modules/cassandra/pages/managing/operating/compaction/ucs.adoc
b/doc/modules/cassandra/pages/managing/operating/compaction/ucs.adoc
index e54d30e7eb..a41c979ef9 100644
--- a/doc/modules/cassandra/pages/managing/operating/compaction/ucs.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/compaction/ucs.adoc
@@ -5,13 +5,8 @@
The `UnifiedCompactionStrategy (UCS)` is recommended for most workloads,
whether read-heavy, write-heavy, mixed read-write, or time-series.
There is no need to use legacy compaction strategies, because UCS can be
configured to behave like any of them.
-For operators, the useful mental model is: UCS starts with flushed SSTables,
groups them into sharded levels, and then
-keeps compacting them toward a target size and density. That lets the same
table behave more like STCS, LCS, or TWCS
-without requiring a full strategy swap.
-
UCS is a compaction strategy that combines the best of the other strategies
plus new features.
-UCS has been designed to maximize the speed of compactions, which is crucial
for high-density nodes, using a unique
-sharding mechanism that compacts partitioned data in parallel.
+UCS has been designed to maximize the speed of compactions, which is crucial for high-density nodes, using a unique sharding mechanism that compacts partitioned data in parallel.
And whereas STCS, LCS, or TWCS will require full compaction of the data if the
compaction strategy is changed, UCS can change parameters in flight to switch
from one strategy to another.
In fact, a combination of different compaction strategies can be used at the
same time, with different parameters for each level of the hierarchy.
Finally, UCS is stateless, so it does not rely on any metadata to make
compaction decisions.
@@ -23,20 +18,6 @@ Thus, a compaction is triggered when more than a given
number of SSTables are pr
* *size* can be replaced by *density*, allowing SSTables to be split at
arbitrary points when the output of a compaction is written, while still
producing a leveled hierarchy.
Density is defined as the size of an SSTable divided by the width of the token
range it covers.
-[source,text]
-----
-memtable flushes
- |
- v
-sharded SSTables in L0
- |
- v
-density-based leveling and splitting
- |
- v
-non-overlapping SSTables at higher levels
-----
-
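The density definition above can be illustrated in a few lines; the sizes and range fractions are made up.

```python
def sstable_density(size_bytes: int, token_range_fraction: float) -> float:
    """Density as defined for UCS: SSTable size divided by the width of
    the token range it covers (as a fraction of the whole ring)."""
    if not 0 < token_range_fraction <= 1:
        raise ValueError("token_range_fraction must be in (0, 1]")
    return size_bytes / token_range_fraction

# Two SSTables of equal size: the one covering a narrower slice of the
# ring is denser, so it sorts into a higher level of the hierarchy.
whole_ring = sstable_density(1_000_000_000, 1.0)   # covers the full ring
one_tenth  = sstable_density(1_000_000_000, 0.1)   # covers 10% of the ring
print(one_tenth > whole_ring)  # True
```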
== Migration from Other Strategies
The Unified Compaction Strategy (UCS) can be configured to behave like other
compaction strategies, making migration straightforward. It also provides
advanced options for optimizing specific workload patterns.
@@ -442,6 +423,7 @@ There are two extensions currently implemented, SSTable
growth and a minimum SST
First, let's examine the case when the size of the data set is expected to
grow very large.
To avoid pre-specifying a sufficiently large target size to avoid problems
with per-SSTable overhead, an `SSTable growth` parameter has been implemented.
+// LLP: I don't know what this means: determines what part of the density
growth should be assigned to increased SSTable size
This parameter determines what part of the density growth should be assigned
to increased SSTable size, reducing the growth of the number of shards, and
hence, non-overlapping SSTables.
The second extension is a mode of operation with a fixed number of shards that
splits conditionally on reaching a minimum size.
@@ -541,7 +523,8 @@ In addition to `TRANSITIVE`, "overlap inclusion methods" of
`NONE` and `SINGLE`
In normal operation, we compact all SSTables in the compaction bucket.
If compaction is very late, we may apply a limit on the number of overlapping
sources we compact.
-In that case, we use the collection of oldest SSTables that keeps every
included overlap set under the configured overlap limit, making sure that if an
SSTable is included in this compaction, all older ones are also included to
maintain time order.
+// LLP: What does limit-many mean in the next sentence?
+In that case, we use the collection of oldest SSTables that would select at
most limit-many in any included overlap set, making sure that if an SSTable is
included in this compaction, all older ones are also included to maintain time
order.
=== Selecting the compaction to run
@@ -719,4 +702,4 @@ In `cassandra.yaml`, there is also one parameter that
affects compaction:
concurrent_compactors::
Number of simultaneous compactions to allow, NOT including validation
"compactions" for anti-entropy repair.
-Higher values increase compaction performance but may increase read and write
latencies.
+Higher values increase compaction performance but may increase read and write
latencies.
\ No newline at end of file
diff --git a/doc/modules/cassandra/pages/managing/operating/hints.adoc
b/doc/modules/cassandra/pages/managing/operating/hints.adoc
index 0a4950a224..06f0fc97bb 100644
--- a/doc/modules/cassandra/pages/managing/operating/hints.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/hints.adoc
@@ -26,10 +26,6 @@ by all replicas. Hints, like read-repair, are best effort
and not an
alternative to performing full repair, but they do help reduce the
duration of inconsistency between replicas in practice.
-Use hints for short outages and routine maintenance. If a node stays
-down long enough for hints to expire, repair is the mechanism that
-brings the replica back into agreement with the rest of the cluster.
-
== Hinted Handoff
Hinted handoff is the process by which Cassandra applies hints to
@@ -43,7 +39,7 @@ replicas acknowledge the mutation the coordinator responds
successfully
to the client. If a replica node is unavailable, however, the
coordinator stores a hint locally to the filesystem for later
application. New hints will be retained for up to
-`max_hint_window_in_ms` of downtime (defaults to `3 h`). If the
+`max_hint_window_in_ms` of downtime (defaults to `3 h`). If the
unavailable replica does return to the cluster before the window
expires, the coordinator applies any pending hinted mutations against
the replica to ensure that eventual consistency is maintained.
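A small sketch of the hint-window arithmetic described above; the timestamps are illustrative.

```python
from datetime import datetime, timedelta

MAX_HINT_WINDOW = timedelta(hours=3)  # default hint window

def still_hinting(went_down_at: datetime, now: datetime,
                  window: timedelta = MAX_HINT_WINDOW) -> bool:
    """True while new hints are still being stored for the down replica;
    once False, only repair can reconcile the missed writes."""
    return now - went_down_at <= window

down = datetime(2026, 4, 1, 12, 0)
print(still_hinting(down, datetime(2026, 4, 1, 14, 30)))  # True  (2.5 h down)
print(still_hinting(down, datetime(2026, 4, 1, 16, 30)))  # False (4.5 h down)
```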
diff --git a/doc/modules/cassandra/pages/managing/operating/logging.adoc
b/doc/modules/cassandra/pages/managing/operating/logging.adoc
index a9965d2ce2..d740aef82d 100644
--- a/doc/modules/cassandra/pages/managing/operating/logging.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/logging.adoc
@@ -1,4 +1,5 @@
= Logging
* xref:cassandra:managing/operating/auditlogging.adoc[Audit logging]
-* xref:cassandra:managing/operating/fqllogging.adoc[Full query logging]
+* xref:cassandra:managing/operating/audit_logging.adoc[Audit logging 2]
+* xref:cassandra:managing/operating/fqllogging.adoc[Full query logging]
\ No newline at end of file
diff --git
a/doc/modules/cassandra/pages/managing/operating/onboarding-to-accord.adoc
b/doc/modules/cassandra/pages/managing/operating/onboarding-to-accord.adoc
index 3eedc73a65..e9fe5e386f 100644
--- a/doc/modules/cassandra/pages/managing/operating/onboarding-to-accord.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/onboarding-to-accord.adoc
@@ -2,14 +2,11 @@
== Intro
-Accord is Cassandra's transaction protocol for migrated tables and
-token ranges. It supports all existing CQL and can be enabled on a per
-table and per token range within that table basis.
-
-Enabling Accord on existing tables requires a migration process that
-can be done on this same per table and per range basis that safely
-transitions data from being managed by Cassandra {plus} Paxos to
-Cassandra {plus} Accord without downtime.
+Accord supports all existing CQL and can be enabled on a per table and
+per token range within that table basis. Enabling Accord on existing tables
requires a
+migration process that can be done on this same per table and per range
+basis that safely transitions data from being managed by Cassandra
+{plus} Paxos to Cassandra {plus} Accord without downtime.
A migration is required because Accord can't safely read data written by
non-SERIAL writes. Accord requires deterministic reads in order to have
@@ -28,64 +25,64 @@ This guide does not cover the new transaction syntax.
You need to set `accord.enabled` to true for Accord to be initialized at
startup.
-`accord.default_transactional_mode` allows you to set a default
-transactional mode for newly created tables which will be used in
-create table statements when no `transactional_mode` is specified. This
+`accord.default++_++transactional++_++mode` allows you to set a default
+transactional mode for newly created tables which will be used in create
+table statements when no `transactional++_++mode` is specified. This
prevents accidentally creating non-Accord tables that will need
migration to Accord.
-`accord.range_migration` configures the behavior of altering the
-`transactional_mode` of a table. When set to `auto` the entire ring
-will be marked as migrating when the `transactional_mode` of a table is
-altered. When set to `explicit` no ranges will be marked as migrating
-when the `transactional_mode` of a table is altered.
+`accord.range++_++migration` configures the behavior of altering the
+`transactional++_++mode` of a table. When set to `auto` the entire ring
+will be marked as migrating when the `transactional++_++mode` of a table
+is altered. When set to `explicit` no ranges will be marked as migrating
+when the `transactional++_++mode` of a table is altered.
=== Table parameters
-`transactional_mode` can be set when a table is created
-`CREATE TABLE foo WITH transactional_mode = 'full'` or it can be set by
-altering an existing table `ALTER TABLE foo WITH transactional_mode =
-'full'`.
-`transactional_mode` designates the target or intended transaction
+`transactional++_++mode` can be set when a table is created
+`CREATE TABLE foo WITH transactional++_++mode = 'full'` or it can be set
+by altering an existing table
+`ALTER TABLE foo WITH transactional++_++mode = 'full'`.
+`transactional++_++mode` designates the target or intended transaction
system for the table and for a newly created table this will be the
transaction system that is used, but for existing tables that are being
altered the table will still need to be migrated to the target system.
-`transactional_mode` can be set to `full`, `mixed_reads`, and `off`.
-`off` means that Paxos will be used and transaction statements will be
-rejected. `full` means that all reads and writes will execute on
-Accord. `mixed_reads` means that all writes will execute on Accord
+`transactional++_++mode` can be set to `full`, `mixed++_++reads`, and
+`off`. `off` means that Paxos will be used and transaction statements
+will be rejected. `full` means that all reads and writes will execute on
+Accord. `mixed++_++reads` means that all writes will execute on Accord
along with `SERIAL` reads/writes, but non-SERIAL reads/writes will
execute on the existing eventually consistent path. Applying the
mutations for blocking read repair will always be done through Accord in
-`full` and `mixed_reads`.
+`full` and `mixed++_++reads`.
-`transactional_migration_from` indicates whether a migration is
+`transactional++_++migration++_++from` indicates whether a migration is
currently in progress although it does not indicate which ranges are
actively being migrated. This is set automatically when you create a
-table or alter `transactional_mode` and should not be set manually.
-It's possible to manually set `transactional_migration_from` to
+table or alter `transactional++_++mode` and should not be set manually.
+It's possible to manually set `transactional++_++migration++_++from` to
force the completion of migration without actually running the necessary
migration steps.
-`transactional_migration_from` can be set to `none`, `off`, `full`, and
-`mixed_reads`. `off`, `full`, and `mixed_reads` correspond to the
-`transactional_mode` being migrated away from and
+`transactional++_++migration++_++from` can be set to `none`, `off`,
+`full`, and `mixed++_++reads`. `off`, `full`, and `mixed++_++reads`
+correspond to the `transactional++_++mode` being migrated away from and
`none` indicates that no migration is in progress either because the
migration has completed or because the table was created with its
-current `transactional_mode`.
+current `transactional++_++mode`.
-=== `mixed_reads` vs `full`
+=== mixed++_++reads vs full
-When Accord is running with `transactional_mode` `full` it will be
+When Accord is running with `transactional++_++mode` `full` it will be
able to perform asynchronous commit saving a WAN roundtrip.
-`mixed_reads` allows non-SERIAL reads to continue to execute using the
-original eventually consistent read path. `mixed_reads`, unlike `full`,
-always requires Accord to synchronously commit at the requested
-consistency level in order to make acknowledged Accord writes visible
-to non-SERIAL reads.
+`mixed++_++reads` allows non-SERIAL reads to continue to execute using
+the original eventually consistent read path. `mixed++_++reads`, unlikes
+`full`, always requires Accord to always synchronously commit at the
+requested consistency level in order to make acknowledged Accord writes
+visible to non-SERIAL reads.
-There is no `transactional_mode` that allows non-SERIAL writes
+There is no `transactional++_++mode` that allows non-SERIAL writes
because they break Accord's transaction recovery resulting in
transactions appearing to have different outcomes at different nodes.
@@ -110,9 +107,9 @@ table:
ALTER TABLE foo WITH transactional_mode = 'full'
....
-After the table is altered it is required to run `nodetool
-consensus_admin begin-migration` on ranges in the table unless
-`accord.range_migration=auto`.
+After the table is altered it is required to run
+`nodetool consensus++_++admin begin-migration` on ranges in the table
+unless `accord.range++_++migration=auto`.
When a range is initially marked migrating to Accord all non-SERIAL
writes will execute on Accord while `SERIAL` writes will continue to
@@ -120,25 +117,24 @@ execute on Paxos. non-SERIAL writes include regular
writes, logged and
unlogged batches, hints, and read repair. Accord will perform
synchronous commit the specified consistency level requiring 2x WAN RTT.
-Tables that are migrating or are partially migrated to Accord (or back
-to Paxos) can be listed using `nodetool consensus_admin list` or the
-system table `system_accord_debug.migration_state`.
+Tables that are migrating or are partially migrated to Accord (or back to Paxos) can be listed using
+`nodetool consensus_admin list` or the system table `system_accord_debug.migration_state`.
Migration to Accord consists of two phases with the first phase starting
when a range is marked migrating, and the second phase starting after a
-full or incremental data repair, and then the migration completing
-after a second repair which must be a full data repair {plus} Paxos
-repair. While marking the range as migrating can be done automatically
-with `accord.range_migration=auto`, there is not automation for
-triggering the repairs. If you regularly run compatible repairs then
-the migration will eventually complete, but if you don't run them or
-want the migration to complete sooner then you will need to either
-trigger them manually or invoke `nodetool consensus_admin finish-migration`
+full or incremental data repair, and then the migration completing after
+a second repair which must be a full data repair {plus} Paxos repair.
+While marking the range as migrating can be done automatically with
+`accord.range++_++migration=auto`, there is no automation for
+triggering the repairs. If you regularly run compatible repairs then the
+migration will eventually complete, but if you don't run them or want
+the migration to complete sooner then you will need to either trigger
+them manually or invoke `nodetool consensus++_++admin finish-migration`
to trigger them.
Any repair that is compatible will drive migration forward whether it
-only covers part of the migrating range or whether it is started via
-`nodetool consensus_admin finish-migration` or some other external
+only covers part of the migrating range or whether it is started via
+`nodetool consensus++_++admin finish-migration` or some other external
process that initiates repair. Force repair with down nodes will not be
eligible to drive any type or phase of migration forward. Force repair
with all nodes up will still work.
@@ -149,10 +145,8 @@ In the first phase of migration Accord is unable to safely
read
non-SERIAL writes so Paxos continues to be used for `SERIAL` operations
and Accord executes all writes and synchronously commits at the
requested consistency level in order to allow Paxos to safely read
-Accord writes. Accord's read and write metrics are all counted towards
-the existing `Read` and `Write` scope along with the eventually
-consistent operations, but you should also start to see writes also
-being counted in the `AccordWrite` scope.
+Accord writes. Accord's read and write metrics are all counted towards the existing `Read` and `Write` scope
+along with the eventually consistent operations, but you should also start to see writes being counted in the `AccordWrite` scope.
A data repair either incremental or full replicates all non-SERIAL
writes at `ALL` making it safe for Accord to read non-SERIAL writes that
@@ -163,13 +157,12 @@ safely read them.
=== Second phase
In the second phase all reads and writes execute through Accord
-(assuming `transactional_mode="full"`). Before an operation can execute
-on Accord it is necessary to run a Paxos key repair in order to ensure
-that any uncommitted Paxos transactions are committed and this check
-will take at least one extra WAN RTT. Additionally Accord has to read
-at `QUORUM` (where it would normally only read from a single replica in
-`transactional_mode="full"` and migration completed) because Paxos
-writes are only visible at `QUORUM`.
+(assuming `transactional++_++mode="full"`). Before an operation can execute on
+Accord it is necessary to run a Paxos key repair in order to ensure that
+any uncommitted Paxos transactions are committed and this check will
+take at least one extra WAN RTT. Additionally Accord has to read at `QUORUM`
+(where it would normally only read from a single replica in
`transactional++_++mode="full"` and migration completed) because
+Paxos writes are only visible at `QUORUM`.
All reads and CAS operations in the range should start showing up in the
Accord metrics and not the existing metrics.
@@ -190,8 +183,8 @@ the repair before running the Paxos repair.
== Migration from Accord
Migration from Accord to Paxos occurs in a single phase and begins by
-altering the table's `transactional_mode` to `off` and then optionally
-marking ranges as migrating as discussed above.
+altering the table's `transactional++_++mode` to `off` and then
+optionally marking ranges as migrating as discussed above.
Once a range is marked migrating all operations in the migrating range
will stop executing on Accord. Before each operation occurs they will
@@ -214,10 +207,10 @@ manage consensus migration. The existing methods for
starting repairs
can also be used to start the repairs that are needed to complete
migration.
-=== `nodetool consensus_admin list`
+=== nodetool consensus++_++admin list
Invoking `nodetool` with
-`consensus_admin list [<keyspace> <tables>...]`
+`consensus++_++admin list ++[<++keyspace++>++ ++<++tables++>++...++]++`
will connect to the specified node and retrieve that node's view of what
tables are currently being migrated from transactional cluster metadata.
Tables that are not being migrated are not listed.
@@ -226,10 +219,10 @@ The results can be printed out in several different
formats using the
`format` parameter which supports `json`, `minified-json`, `yaml`, and
`minified-yaml`.
-=== `nodetool consensus_admin begin-migration`
+=== nodetool consensus++_++admin begin-migration
Invoking `nodetool` with
-`consensus_admin begin-migration [<keyspace> <tables>...]`
+`consensus++_++admin begin-migration ++[<++keyspace++>++ ++<++tables++>++...++]++`
can be used to mark ranges on a table as migrating. This can only be
done after the migration has been started by altering the tables.
Marking ranges as migrating is a lightweight operation and does not
@@ -241,7 +234,7 @@ specified keyspace and tables. If the entire range is
marked migrating it is
only necessary to invoke `begin-migration` on one node.
This is only needed if
-`accord.default_transactional_mode=explicit` is set in
+`accord.range++_++migration=explicit` is set in
`cassandra.yaml` otherwise all the ranges will already have been marked
migrating when the alter occurred.
@@ -249,10 +242,10 @@ Ranges that are migrating will require at least an extra
WAN roundtrip
for each request that touches a migrating range because both transaction
systems may need to be used to execute the request.
-=== `nodetool consensus_admin finish-migration`
+=== nodetool consensus++_++admin finish-migration
Invoking `nodetool` with
-`consensus_admin finish-migration [<keyspace> <tables>...]`
+`consensus++_++admin finish-migration ++[<++keyspace++>++ ++<++tables++>++...++]++`
will run the repairs needed to complete the migration for the specified
ranges. If no range is specified it will default to the primary range of
the node that `nodetool` is connecting to so you can call it once on
@@ -271,9 +264,9 @@ used for `SERIAL` writes and thus needs to perform
synchronous commit at
the requested consistency level.
Once migration is complete the read and write consistency levels will be
-ignored with transactional mode `full`. With transactional mode
-`mixed_reads` Accord will continue to do synchronous commit and honor
-the requested commit/write consistency level.
+ignored with transactional mode `full`. With transactional mode
+`mixed++_++reads` Accord will continue to do synchronous commit and
+honor the requested commit/write consistency level.
Accord will always reject any requests to execute at unsupported
consistency levels to ensure that migration to/from Accord is always
diff --git a/doc/modules/cassandra/pages/managing/operating/repair.adoc
b/doc/modules/cassandra/pages/managing/operating/repair.adoc
index 4bee0d0294..d7eaba1711 100644
--- a/doc/modules/cassandra/pages/managing/operating/repair.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/repair.adoc
@@ -1,13 +1,12 @@
= Repair
-Cassandra is designed to remain available if one of its nodes is down or
-unreachable. However, when a node is down or unreachable, it needs to
+Cassandra is designed to remain available if one of its nodes is down
+or unreachable. However, when a node is down or unreachable, it needs to
eventually discover the writes it missed. Hints attempt to inform a node
-of missed writes, but are a best effort and are not guaranteed to deliver
-every missed write. If a node stays unavailable long enough for hints to
-expire, or if tombstones are not repaired before `gc_grace_seconds`
-expires, deleted data can reappear and replicas can remain permanently
-divergent.
+of missed writes, but are a best effort, and aren't guaranteed to inform
+a node of 100% of the writes it missed. These inconsistencies can
+eventually result in data loss as nodes are replaced or tombstones
+expire.
These inconsistencies are fixed with the repair process. Repair
synchronizes the data between nodes by comparing their respective
@@ -15,10 +14,6 @@ datasets for their common token ranges, and streaming the
differences
for any out of sync sections between the nodes. It compares the data
with merkle trees, which are a hierarchy of hashes.
-If you only need the short version: read this page first, then read
-xref:managing/operating/auto_repair.adoc[Auto Repair] if you want
-Cassandra to schedule the same underlying work automatically.
-
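The merkle-tree comparison can be sketched as a toy: hash leaf ranges, roll them up, and stream only the ranges whose hashes differ. Real repair trees cover token ranges and are far deeper; this is only an illustration of the idea.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle(leaves: list[bytes]) -> list[list[bytes]]:
    """Return all tree levels, leaves first, root last.
    Assumes the leaf count is a power of two."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def diverging_leaves(a: list[bytes], b: list[bytes]) -> list[int]:
    """Indices of leaf ranges whose hashes disagree between two replicas."""
    ta, tb = merkle(a), merkle(b)
    if ta[-1] == tb[-1]:          # roots match: nothing to stream
        return []
    return [i for i, (x, y) in enumerate(zip(ta[0], tb[0])) if x != y]

node1 = [b"range0", b"range1", b"range2", b"range3"]
node2 = [b"range0", b"range1", b"range2-stale", b"range3"]
print(diverging_leaves(node1, node2))  # [2] -> only this range is streamed
```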
== Incremental and Full Repairs
There are 2 types of repairs: full repairs, and incremental repairs.
@@ -34,11 +29,6 @@ for syncing up missed writes, but it doesn't protect against
things like
disk corruption, data loss by operator error, or bugs in Cassandra. For
this reason, full repairs should still be run occasionally.
-In practice, repair is the mechanism that keeps hints from being the
-only line of defense. If a replica is down too long, repair is what
-brings the cluster back to a common view of deletes and missed writes
-before those differences turn into zombies.
-
== Automated Repair Scheduling
Since repair can result in a lot of disk and network io, it has
diff --git
a/doc/modules/cassandra/pages/managing/operating/role_name_generation.adoc
b/doc/modules/cassandra/pages/managing/operating/role_name_generation.adoc
index 1b5ea6185f..1029695bf5 100644
--- a/doc/modules/cassandra/pages/managing/operating/role_name_generation.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/role_name_generation.adoc
@@ -27,8 +27,7 @@ role_name_policy:
`UUIDRoleNameGenerator` is the only built-in implementation of a generator
which generates role names. As
its name suggests, generated role names will be UUIDs (without hyphens) and
their first character will always be a letter.
-Once enabled together with `CassandraRoleManager` as `role_manager`,
-an operator or application can use these statements:
+Once enabled (together with `CassandraRoleManager` as `role_manager`), a user
might execute these queries:
The most simple variation is:
@@ -113,20 +112,6 @@ The configuration of `role_name_policy` is configurable in
runtime via JMX. You
calling `GuardrailsMBean#getRoleNamePolicy()` which returns JSON
representation of such a policy as a string. Similarly,
you can use `GuardrailsMBean#setRoleNamePolicy(String)` to configure it at
runtime.
-== Operational workflow
-
-A common provisioning flow is:
-
-. Enable the generator and create the account with `CREATE GENERATED
-ROLE WITH GENERATED PASSWORD AND LOGIN = true;`.
-. Capture the returned generated role name and password in your secret
-manager.
-. Grant the new role the permissions it needs for the application.
-. Point the application at the returned credentials and verify that it
-can authenticate.
-. When rotating access, create a new generated role, update the stored
-secret, then revoke and drop the old role.
-
If you do not want to be able to configure this at runtime, set
`role_name_policy_reconfiguration_enabled`
in `cassandra.yaml` to `false`.
@@ -134,4 +119,4 @@ The fact that you see a "table" returned after generated
role name / password is
overriding `createRoleWithResult` and `alterRoleWithResult` methods and
populating the response message with
generated values. If you happen to have a completely custom implementation of
`IRoleManager`, e.g. integrating with
an external service, this is a way to hook in your own "credentials" to be
returned to a user.
-E.g. an `IRoleManager` integrating with some vault-like solution might return
there custom credentials-like values for end-user to act on and similar.
+E.g. an `IRoleManager` integrating with some vault-like solution might return its own custom credential-like values for the end user to act on.
\ No newline at end of file
diff --git a/doc/modules/cassandra/pages/managing/operating/security.adoc
b/doc/modules/cassandra/pages/managing/operating/security.adoc
index 8aa4542f15..0052eab713 100644
--- a/doc/modules/cassandra/pages/managing/operating/security.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/security.adoc
@@ -1,20 +1,5 @@
= Security
-== Security checklist
-
-Before turning on authentication or encryption, do these first:
-
-* Enable client-to-node and node-to-node TLS.
-* Increase `system_auth` replication before enabling password auth.
- Inference: if `system_auth` stays at RF=1, a single node outage can
- block logins.
-* Preconfigure client applications with the credentials and certificates
- they will use after the rollout.
-* Decide whether JMX will stay local-only, use standard JMX auth, or use
- Cassandra integrated auth.
-* Create a non-default superuser and disable the default `cassandra`
- login after the cluster is stable.
-
There are three main components to the security features provided by
Cassandra:
@@ -35,7 +20,7 @@ access internode communication and JMX ports can still:
* Attach to the cluster directly to capture write traffic
Correct configuration of all three security components should negate
-these vectors. Therefore, understanding Cassandra's security features
+these vectors. Therefore, understanding Cassandra's security features
is crucial to configuring your cluster to meet your security needs.
== TLS/SSL Encryption
@@ -46,12 +31,8 @@ ensures that data in flight is not compromised and is
transferred
securely. The options for client-to-node and node-to-node encryption are
managed separately and may be configured independently.
-Plan encryption rollouts carefully. If one side is switched to TLS
-before matching clients or peers are ready, existing connections will
-drop until the rollout catches up.
-
In both cases, the JVM defaults for supported protocols and cipher
-suites are used when encryption is enabled. These can be overridden using
+suites are used when encryption is enabled. These can be overridden using
the settings in `cassandra.yaml`, but this is not recommended unless
there are policies in place which dictate certain settings or a need to
disable vulnerable ciphers or protocols in cases where the JVM cannot be
@@ -66,7 +47,7 @@ Cassandra provides flexibility of using Java based key
material or
completely customizing the SSL context. You can choose any keystore
format supported by Java (JKS, PKCS12 etc) as well as other standards
like PEM. You can even customize the SSL context creation to use Cloud
-Native technologies like Kubernetes Secrets for storing the key
+Native technologies like Kubernetes Secrets for storing the key
material or to integrate with your in-house Key Management System.
For information on generating the keystore and truststore files
@@ -76,7 +57,7 @@
http://download.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefG
documentation on creating keystores].
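As an illustrative sketch (the alias, hostname, file names, and passwords below are placeholder assumptions, not values from this document), a self-signed node certificate can be generated and exchanged with the JDK's `keytool`:

[source,shell]
----
# Generate a self-signed keypair for this node (placeholder values).
keytool -genkeypair -keyalg RSA -keysize 2048 -validity 365 \
    -alias node1 -dname "CN=node1.example.com" \
    -keystore keystore.jks -storepass changeit -keypass changeit

# Export the public certificate so peers and clients can trust it.
keytool -exportcert -alias node1 -keystore keystore.jks \
    -storepass changeit -file node1.cer

# Import the certificate into a truststore.
keytool -importcert -noprompt -alias node1 -file node1.cer \
    -keystore truststore.jks -storepass changeit
----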
For customizing the SSL context creation you can implement
-https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/security/ISslContextFactory.java[ISslContextFactory]
+https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/security/ISslContextFactory.java[ISslContextFactory]
interface or extend one of its public subclasses appropriately. You
can then use the `ssl_context_factory` setting for
`server_encryption_options` or `client_encryption_options` sections
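As a minimal sketch of that wiring in `cassandra.yaml` (the class name shown is the built-in default factory; a custom implementation would substitute its own fully qualified class name, and the keystore path and password here are placeholders):

[source,yaml]
----
client_encryption_options:
  enabled: true
  ssl_context_factory:
    class_name: org.apache.cassandra.security.DefaultSslContextFactory
  keystore: conf/.keystore
  keystore_password: changeit
----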
@@ -233,15 +214,6 @@ using the `role_manager` setting in `cassandra.yaml`. The
default
setting uses `CassandraRoleManager`, an implementation which stores role
information in the tables of the `system_auth` keyspace.
-For this page:
-
-* `role` is the Cassandra auth object stored in `system_auth`.
-* `user` is a role with `LOGIN = true`.
-* `principal` is the authenticated client identity presented to
- Cassandra.
-* `identity` is the value extracted from an mTLS certificate and mapped
- to a role.
-
See also the xref:cassandra:developing/cql/security.adoc#database-roles[`CQL
documentation on roles`].
== Authentication
@@ -271,27 +243,29 @@ once authentication is enabled, so setting up the client
side config in
advance is safe. In contrast, as soon as a server has authentication
enabled, any connection attempt without proper credentials will be
rejected which may cause availability problems for client applications.
+Once clients are set up and ready for authentication to be enabled,
+follow this procedure to enable it on the cluster.
-Follow this order to enable password authentication safely:
+Pick a single node in the cluster on which to perform the initial
+configuration. Ideally, no clients should connect to this node during
+the setup process, so you may want to remove it from client config,
+block it at the network level or possibly add a new temporary node to
+the cluster for this purpose. On that node, perform the following steps:
[arabic]
-. On a single node, open a `cqlsh` session and change the replication
-factor of the `system_auth` keyspace before any node restarts with
-`PasswordAuthenticator` enabled. By default, this keyspace uses
-`SimpleReplicationStrategy` and a `replication_factor` of 1. For any
-non-trivial deployment, increase this so login remains possible when a
-node is unavailable. Best practice is to configure a replication factor
-of 3 to 5 per DC.
+. Open a `cqlsh` session and change the replication factor of the
+`system_auth` keyspace. By default, this keyspace uses
+`SimpleReplicationStrategy` and a `replication_factor` of 1. It is
+recommended to change this for any non-trivial deployment to ensure that
+should nodes become unavailable, login is still possible. Best practice
+is to configure a replication factor of 3 to 5 per DC.
[source,cql]
----
ALTER KEYSPACE system_auth WITH replication = {'class':
'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};
----
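After raising the replication factor, the existing `system_auth` data must actually reach the new replicas; a repair of the keyspace is the usual way to ensure that. A sketch, assuming `nodetool` is on the PATH and run on each node:

[source,shell]
----
# Stream existing system_auth data to the newly added replicas.
nodetool repair system_auth
----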
-. Wait for the new replication settings to be in place across the
-cluster. Inference: if you keep `system_auth` at RF=1, a single node
-outage can prevent logins.
-
+[arabic, start=2]
. Edit `cassandra.yaml` to change the `authenticator` option like so:
[source,yaml]
@@ -299,8 +273,8 @@ outage can prevent logins.
authenticator: PasswordAuthenticator
----
+[arabic, start=3]
. Restart the node.
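How the restart is performed depends on the installation; as a sketch, assuming a package install managed by systemd:

[source,shell]
----
# Flush memtables and stop accepting requests before restarting.
nodetool drain
sudo systemctl restart cassandra
----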
-
. Open a new `cqlsh` session using the credentials of the default
superuser:
@@ -309,6 +283,7 @@ superuser:
$ cqlsh -u cassandra -p cassandra
----
+[arabic, start=5]
. During login, the credentials for the default superuser are read with
a consistency level of `QUORUM`, whereas those for all other users
(including superusers) are read at `LOCAL_ONE`. In the interests of
@@ -324,7 +299,8 @@ further configuration.
CREATE ROLE dba WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'super';
----
-. Start a new `cqlsh` session, this time logging in as the new_superuser
+[arabic, start=6]
+. Start a new `cqlsh` session, this time logging in as the new superuser
and disable the default superuser.
[source,cql]
@@ -332,12 +308,14 @@ and disable the default superuser.
ALTER ROLE cassandra WITH SUPERUSER = false AND LOGIN = false;
----
+[arabic, start=7]
. Finally, set up the roles and credentials for your application users
with xref:cassandra:developing/cql/security.adoc#create-role[`CREATE ROLE`]
statements.
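For example (the role name, password, and keyspace below are purely illustrative):

[source,cql]
----
CREATE ROLE app_user WITH LOGIN = true AND PASSWORD = 'app_password';
GRANT SELECT ON KEYSPACE my_keyspace TO app_user;
GRANT MODIFY ON KEYSPACE my_keyspace TO app_user;
----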
-At the end of these steps, that node is configured to use password
-authentication. To roll that out across the cluster, repeat the
-authenticator change and restart on each remaining node.
+At the end of these steps, that node is configured to use password
+authentication. To roll that out across the cluster, repeat steps 2 and
+3 on each node in the cluster. Once all nodes have been restarted,
+authentication will be fully enabled throughout the cluster.
Note that using `PasswordAuthenticator` also requires the use of
xref:cassandra:developing/cql/security.adoc#operation-roles[`CassandraRoleManager`].
@@ -345,7 +323,7 @@
xref:cassandra:developing/cql/security.adoc#operation-roles[`CassandraRoleManage
See also: `setting-credentials-for-internal-authentication`,
xref:cassandra:developing/cql/security.adoc#create-role[`CREATE ROLE`],
xref:cassandra:developing/cql/security.adoc#alter-role[`ALTER ROLE`],
-xref:cassandra:developing/cql/security.adoc#alter-keyspace[`ALTER KEYSPACE`]
and
+xref:cassandra:developing/cql/security.adoc#alter-keyspace[`ALTER
KEYSPACE`] and
xref:cassandra:developing/cql/security.adoc#grant-permission[`GRANT
PERMISSION`].
== Authorization
@@ -473,14 +451,6 @@ For both authentication and authorization, two providers
are available;
the first based on standard JMX security and the second which integrates
more closely with Cassandra's own auth subsystem.
-=== JMX access decision tree
-
-* Keep JMX local-only if you only need bootstrap-time tooling.
-* Use standard JMX auth if you need remote access before Cassandra auth
- is ready.
-* Use Cassandra integrated auth if you want JMX access to follow CQL
- roles after the node has joined the ring.
-
The default settings for Cassandra make JMX accessible only from
localhost. To enable remote JMX connections, edit `cassandra-env.sh`
to change the `LOCAL_JMX` setting to
@@ -551,19 +521,22 @@ File-Based Password Authentication In JMX]
=== Cassandra Integrated Auth
-An alternative to the out-of-the-box JMX auth is to use Cassandra's own
-authentication and authorization providers for JMX clients. This is
-potentially more flexible and secure but it comes with one major
-caveat. It is not available until after a node has joined the ring,
-because the auth subsystem is not fully configured until that point.
-That means bootstrap-time JMX access should stay local-only or use
-standard JMX auth until initial setup is complete.
+An alternative to the out-of-the-box JMX auth is to use Cassandra's own
+authentication and/or authorization providers for JMX clients. This is
+potentially more flexible and secure, but it comes with one major caveat:
+it is not available until [.title-ref]#after# a node has
+joined the ring, because the auth subsystem is not fully configured
+until that point. However, it is often critical for monitoring purposes
+to have JMX access particularly during bootstrap. So it is recommended,
+where possible, to use local only JMX auth during bootstrap and then, if
+remote connectivity is required, to switch to integrated auth once the
+node has joined the ring and initial setup is complete.
With this option, the same database roles used for CQL authentication
can be used to control access to JMX, so updates can be managed
centrally using just `cqlsh`. Furthermore, fine grained control over
exactly which operations are permitted on particular MBeans can be
-achieved via
xref:cassandra:developing/cql/security.adoc#grant-permission[`GRANT
PERMISSION`].
+achieved via
xref:cassandra:developing/cql/security.adoc#grant-permission[`GRANT
PERMISSION`].
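For instance (the role names here are illustrative), MBean access can be granted with statements along these lines:

[source,cql]
----
GRANT SELECT ON ALL MBEANS TO monitor_role;
GRANT EXECUTE ON MBEAN 'org.apache.cassandra.db:type=StorageService' TO ops_role;
----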
To enable integrated authentication, edit `cassandra-env.sh` to
uncomment these lines:
diff --git a/doc/modules/cassandra/pages/managing/operating/snitch.adoc
b/doc/modules/cassandra/pages/managing/operating/snitch.adoc
index 7364c259b4..cd59f98f9d 100644
--- a/doc/modules/cassandra/pages/managing/operating/snitch.adoc
+++ b/doc/modules/cassandra/pages/managing/operating/snitch.adoc
@@ -1,23 +1,15 @@
= Snitch
-In Cassandra, the snitch has two functions:
+In Cassandra, the snitch has two functions:
* it teaches Cassandra enough about your network topology to route
requests efficiently.
-* it allows Cassandra to place replicas around your cluster to avoid
+* it allows Cassandra to spread replicas around your cluster to avoid
correlated failures. It does this by grouping machines into
"datacenters" and "racks." Cassandra will do its best not to have more
than one replica on the same "rack" (which may not actually be a
physical location).
-The snitch does not move data; it only informs routing and replica
-placement.
-
-Use the same datacenter and rack names here as you use in
-`cassandra-rackdc.properties` or `cassandra-topologies.properties` and
-as your keyspace replication expects. If those names do not line up,
-replica placement can be wrong.
-
== Dynamic snitching
The dynamic snitch monitors read latencies to avoid reading from hosts
@@ -49,8 +41,7 @@ implementations:
GossipingPropertyFileSnitch::
This should be your go-to snitch for production use. The rack and
datacenter for the local node are defined in
-
xref:cassandra:managing/configuration/cass_rackdc_file.adoc[cassandra-rackdc.properties]
- and propagated to other nodes via gossip.
+ cassandra-rackdc.properties and propagated to other nodes via gossip.
If `cassandra-topology.properties` exists, it is used as a fallback,
allowing migration from the PropertyFileSnitch.
SimpleSnitch::
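The rack and datacenter definition consumed by GossipingPropertyFileSnitch is a plain properties file; a minimal sketch of `cassandra-rackdc.properties` (the names are examples, not values from this document):

[source,properties]
----
dc=DC1
rack=RAC1
----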
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]