[
https://issues.apache.org/jira/browse/CASSANDRA-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cyl updated CASSANDRA-21228:
----------------------------
Component/s: Feature/Authorization, Feature/Rate Limiting
Labels: dos performance security (was: )
> ALTER ROLE Password Hash DoS Vulnerability
> ------------------------------------------
>
> Key: CASSANDRA-21228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21228
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Feature/Authorization, Feature/Rate Limiting
> Reporter: Cyl
> Priority: Normal
> Labels: dos, performance, security
>
> # ALTER ROLE Password Hash DoS Vulnerability
> ## 1. Vulnerability Description
> **Name**: Authenticated DoS via `ALTER ROLE` Password Hashing
> **Overview**:
> In current Cassandra builds, `ALTER ROLE ... WITH PASSWORD` executes
> `BCrypt.hashpw` synchronously on the standard request executor
> (`Dispatcher.requestExecutor`). When an authenticated user issues many
> password changes, the expensive bcrypt work monopolizes that pool, starving
> all other CQL requests and producing an authenticated denial of service. This
> is the same root problem addressed by CASSANDRA-17812 for `AUTH_RESPONSE`,
> but the trigger has moved to `ALTER ROLE`.
> **Affected Configurations**:
> - Clusters running `PasswordAuthenticator`.
> - Any authenticated account that may alter its own password (default behavior
> for non-superusers).
> - Attackers that can reach the native CQL port.
> **Impact**:
> - Legitimate query latency inflates dramatically (observed increase from ~2
> ms to >1 s).
> - The attack threads themselves hit numerous `OperationTimedOut` errors,
> which is consistent with exhaustion of the request thread pool.
> - Service recovers immediately once the attack stops, indicating a classic
> CPU-starvation DoS.
> ## 2. Proof-of-Concept Steps
> The file `poc_dos.py` automates the scenario:
> 1. Start a single-node Cassandra instance with `PasswordAuthenticator` and
> `CassandraAuthorizer`.
> 2. With the superuser, create a victim role named `target_role`.
> 3. Launch 200 concurrent threads that run `ALTER ROLE target_role WITH
> PASSWORD '<random>'` in a tight loop.
> 4. Start a monitor thread executing `SELECT now()` once per second to record
> latency.
> Run the following command:
> ```bash
> python3 poc_dos.py
> ```
> **Observed Output**:
> ```
> Starting attack with 200 threads...
> [Victim] Query latency: 0.3743s
> [Victim] Query latency: 0.9145s
> Worker failed: ('Unable to connect ... OperationTimedOut ...')
> [Victim] Query latency: 1.0181s
> ...
> ```
> Immediately after the attack begins, the monitor reports 300 ms–1 s latency
> along with repeated `OperationTimedOut` errors. Once the attack stops,
> latency returns to ~2 ms, which ties the slowdown directly to the
> `ALTER ROLE` load.
> ## 3. Problematic Code Reference
> The vulnerable path sits in `CassandraRoleManager.optionsToAssignments(...)`
> and ultimately in `hashpw(...)`, both under
> `src/java/org/apache/cassandra/auth/`:
> ```java
> private String optionsToAssignments(Map<Option, Object> options)
> {
>     return options.entrySet()
>                   .stream()
>                   .map(entry ->
>                   {
>                       switch (entry.getKey())
>                       {
>                           case PASSWORD:
>                               // bcrypt runs on Dispatcher.requestExecutor
>                               return String.format("salted_hash = '%s'",
>                                                    escape(hashpw((String) entry.getValue())));
>                           // other options elided
>                       }
>                   })
>                   .filter(Objects::nonNull)
>                   .collect(Collectors.joining(","));
> }
>
> private static String hashpw(String password)
> {
>     return BCrypt.hashpw(password, PasswordSaltSupplier.get());
> }
> ```
> Because every `ALTER ROLE ... WITH PASSWORD` is processed on the shared
> `Dispatcher.requestExecutor`, each invocation above performs bcrypt hashing
> on threads that also handle standard queries, leading to starvation.
> ## 4. Related Issue and Root Cause
> - **Related Fix**: [CASSANDRA-17812] “Rate-limit new client connection auth
> setup to avoid overwhelming bcrypt”.
>   - Mitigation: route `AUTH_RESPONSE` (and similar) to `authExecutor`.
>   - Gap: `ALTER ROLE` / `CREATE ROLE` continue to run on `requestExecutor`.
> - **Shared Root Cause**: heavyweight bcrypt hashing without rate limiting or
> pool isolation leads to CPU starvation.
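
The shared root cause can be illustrated outside Cassandra. The following is a minimal sketch, not Cassandra code: `PoolIsolationSketch` and `cheapQueryCompletes` are hypothetical names, and a latch-blocked task stands in for a slow `BCrypt.hashpw` call. It assumes only that each hashing task holds a pool thread for its full duration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; none of these names come from Cassandra.
class PoolIsolationSketch {

    /**
     * Occupies `hashers` threads of hashPool with blocking tasks (a stand-in for
     * bcrypt), then reports whether a trivial task on queryPool finishes in time.
     * `hashers` must not exceed the number of threads in hashPool.
     */
    static boolean cheapQueryCompletes(ExecutorService queryPool, ExecutorService hashPool,
                                       int hashers, long timeoutMillis) {
        CountDownLatch allBusy = new CountDownLatch(hashers);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < hashers; i++) {
            hashPool.submit(() -> {
                allBusy.countDown();
                try {
                    release.await(); // simulated "hashing" until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            allBusy.await(); // every hashing thread is now occupied
            Future<String> query = queryPool.submit(() -> "ok");
            query.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the query never got a thread: starvation
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the simulated hashing finish
        }
    }

    public static void main(String[] args) {
        // One shared pool: hashing occupies all 4 threads, so the query waits.
        ExecutorService shared = Executors.newFixedThreadPool(4);
        boolean sharedOk = cheapQueryCompletes(shared, shared, 4, 300);
        shared.shutdown();

        // Separate pools: hashing saturates only its own 2 threads.
        ExecutorService requests = Executors.newFixedThreadPool(4);
        ExecutorService hashing = Executors.newFixedThreadPool(2);
        boolean isolatedOk = cheapQueryCompletes(requests, hashing, 2, 5000);
        requests.shutdown();
        hashing.shutdown();

        System.out.println("shared pool, query completed:    " + sharedOk);   // false
        System.out.println("isolated pools, query completed: " + isolatedOk); // true
    }
}
```

The latch models thread occupancy only, not CPU burn. With real bcrypt, a small dedicated pool would also cap the number of cores that hashing can consume at once, which is presumably why a bounded executor is preferable to an unbounded one.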
> ## 5. Recommended Fixes
> 1. **Execution Isolation**: Dispatch password hashing work (`ALTER ROLE ...
> PASSWORD`, `CREATE ROLE ... PASSWORD`, etc.) to a constrained executor
> similar to `authExecutor`.
> 2. **Rate Limiting**: Enforce per-role, per-connection, or global throttles
> (e.g., token bucket) on password modifications.
> 3. **Asynchronous Hashing**: Optionally compute bcrypt off-thread and update
> the system tables once ready, returning an “operation queued” response
> (requires protocol changes, higher complexity).
> 4. **Operational Mitigations** (until a code fix ships):
>    - Monitor CPU saturation closely; adjusting
>      `auth_bcrypt_gensalt_log2_rounds` does not solve the issue but may highlight
>      abuse sooner.
>    - Tighten credential/role cache TTLs (`roles_validity_in_ms`,
>      `credentials_validity_in_ms`) though this cannot block an active attacker.
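
The token-bucket throttle mentioned under Rate Limiting could look roughly like the sketch below. This is an illustration under stated assumptions, not a proposed patch: `PasswordChangeLimiter` is a hypothetical class, the capacity and refill rate are arbitrary, and the injected nanosecond clock exists only so the behavior can be checked deterministically.

```java
import java.util.function.LongSupplier;

// Hypothetical token bucket for password-change requests; not part of Cassandra.
class PasswordChangeLimiter {
    private final double capacity;        // maximum burst size, in requests
    private final double refillPerSecond; // sustained rate, in requests per second
    private final LongSupplier nanoClock; // e.g. System::nanoTime in production use
    private double tokens;
    private long lastRefillNanos;

    PasswordChangeLimiter(int capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start with a full bucket
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Returns true if the caller may hash a password now, false if throttled. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        long[] fakeNow = {0L}; // manually advanced clock for a repeatable demo
        PasswordChangeLimiter limiter = new PasswordChangeLimiter(2, 1.0, () -> fakeNow[0]);

        System.out.println(limiter.tryAcquire()); // true  (burst of 2 allowed)
        System.out.println(limiter.tryAcquire()); // true
        System.out.println(limiter.tryAcquire()); // false (bucket empty)

        fakeNow[0] += 1_500_000_000L;             // 1.5 s later: 1.5 tokens refilled
        System.out.println(limiter.tryAcquire()); // true
        System.out.println(limiter.tryAcquire()); // false (only 0.5 token left)
    }
}
```

A per-role or per-connection variant would keep one such bucket per key (for example in a `ConcurrentHashMap`), while a global throttle would share a single instance. A rejected request would need to be mapped to an appropriate client-facing error, which this sketch leaves open.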
> ## 6. Conclusion
> This vulnerability belongs to the same family as CASSANDRA-17812: bcrypt
> computations starving the main request pool. Because any authenticated
> account can trigger it with repeated `ALTER ROLE` statements, the risk is
> high. We recommend extending the rate limiting / dedicated executor strategy
> to all password-hashing pathways as soon as possible.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)