[ 
https://issues.apache.org/jira/browse/CASSANDRA-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-21228:
------------------------------------------
    Resolution: Duplicate
        Status: Resolved  (was: Triage Needed)

> ALTER ROLE Password Hash DoS Vulnerability
> ------------------------------------------
>
>                 Key: CASSANDRA-21228
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21228
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Feature/Authorization, Feature/Rate Limiting
>            Reporter: Cyl
>            Priority: Normal
>              Labels: dos, performance, security
>
> h2. Vulnerability Description
> *Name*: Authenticated DoS via {{ALTER ROLE}} Password Hashing
> *Overview*:
> In current Cassandra builds, {{ALTER ROLE ... WITH PASSWORD}} executes 
> {{BCrypt.hashpw}} synchronously on the standard request executor 
> ({{Dispatcher.requestExecutor}}). When an authenticated user issues many 
> password changes, the expensive bcrypt work monopolizes that pool, starving 
> all other CQL requests and producing an authenticated denial of service. This 
> is the same root problem addressed by CASSANDRA-17812 for {{AUTH_RESPONSE}}, 
> but the trigger has moved to {{ALTER ROLE}}.
> *Affected Configurations*:
> * Clusters running {{PasswordAuthenticator}}.
> * Any authenticated account that may alter its own password (default behavior 
> for non-superusers).
> * Attackers that can reach the native CQL port.
> *Impact*:
> * Legitimate query latency inflates dramatically (observed increase from ~2 
> ms to >1 s).
> * Attack threads hit numerous {{OperationTimedOut}} errors, demonstrating 
> thread-pool exhaustion.
> * Service recovers immediately once the attack stops, indicating a classic 
> CPU-starvation DoS.
> h2. Proof-of-Concept Steps
> The file {{poc_dos.py}} automates the scenario:
> # Start a single-node Cassandra instance with {{PasswordAuthenticator}} and 
> {{CassandraAuthorizer}}.
> # With the superuser, create a victim role named {{target_role}}.
> # Launch 200 concurrent threads that run {{ALTER ROLE target_role WITH 
> PASSWORD '<random>'}} in a tight loop.
> # Start a monitor thread executing {{SELECT now()}} once per second to record 
> latency.
> Run the following command:
> {code:bash}
> python3 poc_dos.py
> {code}
> *Observed Output*:
> {code}
> Starting attack with 200 threads...
> [Victim] Query latency: 0.3743s
> [Victim] Query latency: 0.9145s
> Worker failed: ('Unable to connect ... OperationTimedOut ...')
> [Victim] Query latency: 1.0181s
> ...
> {code}
> Immediately after the attack begins, the monitor reports latencies of 300 ms to 1 s 
> along with repeated {{OperationTimedOut}} errors. Once the attack stops, 
> latency returns to ~2 ms, which ties the slowdown directly to the attack load.
> h2. Problematic Code Reference
> The vulnerable path sits in 
> {{CassandraRoleManager.optionsToAssignments(...)}} and ultimately in 
> {{hashpw(...)}}, both under {{src/java/org/apache/cassandra/auth/}}:
> {code:java}
> private String optionsToAssignments(Map<Option, Object> options)
> {
>   return options.entrySet()
>           .stream()
>           .map(entry ->
>           {
>             switch (entry.getKey())
>             {
>               case PASSWORD:
>                 // bcrypt runs on Dispatcher.requestExecutor
>                 return String.format("salted_hash = '%s'",
>                                      escape(hashpw((String) entry.getValue())));
>               // other options elided
>               default:
>                 return null; // dropped by filter(Objects::nonNull) below
>             }
>           })
>           .filter(Objects::nonNull)
>           .collect(Collectors.joining(","));
> }
> private static String hashpw(String password)
> {
>   return BCrypt.hashpw(password, PasswordSaltSupplier.get());
> }
> {code}
> Because every {{ALTER ROLE ... WITH PASSWORD}} is processed on the shared 
> {{Dispatcher.requestExecutor}}, each invocation above performs bcrypt hashing 
> on threads that also handle standard queries, leading to starvation.
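
The starvation described above can be reproduced in miniature without Cassandra. The following is a minimal sketch, not Cassandra code: the class and method names are hypothetical, and a latch stands in for the expensive bcrypt call so the outcome does not depend on timing. It fills every thread of a shared fixed-size pool with slow tasks and then checks that a cheap probe task, the analogue of {{SELECT now()}}, cannot start until a worker is freed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only models the shared-pool behaviour.
final class SharedPoolStarvationDemo
{
    /**
     * Returns true if a cheap probe task was still queued while every
     * worker thread of the shared pool was occupied by a slow task.
     */
    static boolean probeWaitsWhilePoolIsBusy(int poolSize)
    {
        ExecutorService sharedPool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allWorkersBusy = new CountDownLatch(poolSize);
        CountDownLatch releaseWorkers = new CountDownLatch(1);
        try
        {
            // Occupy every worker; the latch stands in for slow hashing work.
            for (int i = 0; i < poolSize; i++)
            {
                sharedPool.submit(() -> {
                    allWorkersBusy.countDown();
                    try
                    {
                        releaseWorkers.await();
                    }
                    catch (InterruptedException e)
                    {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allWorkersBusy.await();

            // The probe is cheap, but no thread is free to run it.
            Future<String> probe = sharedPool.submit(() -> "ok");
            boolean stillQueued = !probe.isDone();

            // Once the slow tasks finish, the probe completes normally.
            releaseWorkers.countDown();
            String result = probe.get(5, TimeUnit.SECONDS);
            return stillQueued && "ok".equals(result);
        }
        catch (Exception e)
        {
            throw new IllegalStateException("demo failed", e);
        }
        finally
        {
            releaseWorkers.countDown();
            sharedPool.shutdownNow();
        }
    }
}
```

Calling {{SharedPoolStarvationDemo.probeWaitsWhilePoolIsBusy(4)}} returns true: the probe's latency is bounded below by how long the slow tasks hold the workers, which matches the latency pattern in the observed output.
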
> h2. Related Issue and Root Cause
> * *Related Fix*: [CASSANDRA-17812] “Rate-limit new client connection auth 
> setup to avoid overwhelming bcrypt”.
>   * Mitigation: route {{AUTH_RESPONSE}} (and similar) to {{authExecutor}}.
>   * Gap: {{ALTER ROLE}} / {{CREATE ROLE}} continue to run on 
> {{requestExecutor}}.
> * *Shared Root Cause*: heavyweight bcrypt hashing without rate limiting or 
> pool isolation leads to CPU starvation.
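
As an illustration of the pool-isolation idea, here is a minimal sketch that runs hashing on its own small, bounded executor. It is not how Cassandra's {{authExecutor}} is implemented: the class name, the thread name, and the pluggable {{hashFunction}} parameter are all assumptions made for the example, and a real caller would pass a bcrypt call as the hash function.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical helper; names and sizing are illustrative only.
final class IsolatedPasswordHasher
{
    private final ThreadPoolExecutor hashingPool;
    private final Function<String, String> hashFunction;

    IsolatedPasswordHasher(int threads, int queueCapacity, Function<String, String> hashFunction)
    {
        // Fixed thread count caps the CPU spent on hashing; the bounded
        // queue plus AbortPolicy sheds load instead of queueing without limit.
        this.hashingPool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                                                  new ArrayBlockingQueue<>(queueCapacity),
                                                  r -> {
                                                      Thread t = new Thread(r, "password-hashing");
                                                      t.setDaemon(true);
                                                      return t;
                                                  },
                                                  new ThreadPoolExecutor.AbortPolicy());
        this.hashFunction = hashFunction;
    }

    /** Runs the hash on the dedicated pool, never on the calling thread. */
    CompletableFuture<String> hashAsync(String password)
    {
        try
        {
            return CompletableFuture.supplyAsync(() -> hashFunction.apply(password), hashingPool);
        }
        catch (RejectedExecutionException e)
        {
            // Pool and queue are full: report overload to the caller.
            CompletableFuture<String> rejected = new CompletableFuture<>();
            rejected.completeExceptionally(e);
            return rejected;
        }
    }

    void shutdown()
    {
        hashingPool.shutdown();
    }
}
```

The design choice here is to fail fast when the hashing pool is saturated, so a flood of password changes produces rejected password changes rather than slow reads and writes for everyone else.
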
> h2. Recommended Fixes
> # *Execution Isolation*: Dispatch password hashing work ({{ALTER ROLE ... 
> PASSWORD}}, {{CREATE ROLE ... PASSWORD}}, etc.) to a constrained executor 
> similar to {{authExecutor}}.
> # *Rate Limiting*: Enforce per-role, per-connection, or global throttles 
> (e.g., token bucket) on password modifications.
> # *Asynchronous Hashing*: Optionally compute bcrypt off-thread and update the 
> system tables once ready, returning an “operation queued” response (requires 
> protocol changes, higher complexity).
> # *Operational Mitigations* (until a code fix ships):
>    * Monitor CPU saturation closely; adjusting 
> {{auth_bcrypt_gensalt_log2_rounds}} does not solve the issue but may 
> highlight abuse sooner.
>    * Tighten credential/role cache TTLs ({{roles_validity_in_ms}}, 
> {{credentials_validity_in_ms}}) though this cannot block an active attacker.
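
The token-bucket throttle mentioned under Rate Limiting could look roughly like the sketch below. This is an assumption-laden illustration rather than an existing Cassandra API: the class name, the constructor parameters, and the injectable clock are hypothetical, and the choice of one bucket per role, per connection, or global is left to the caller.

```java
import java.util.function.LongSupplier;

// Hypothetical throttle; not part of Cassandra.
final class PasswordChangeThrottle
{
    private final long capacity;          // maximum burst size
    private final double refillPerNano;   // tokens added per nanosecond
    private final LongSupplier nanoClock; // injectable, e.g. System::nanoTime
    private double tokens;
    private long lastRefillNanos;

    PasswordChangeThrottle(long capacity, double refillPerSecond, LongSupplier nanoClock)
    {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity;
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Returns true if a password change may proceed now, false if throttled. */
    synchronized boolean tryAcquire()
    {
        // Refill in proportion to elapsed time, never above capacity.
        long now = nanoClock.getAsLong();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;

        if (tokens >= 1.0)
        {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

With a capacity of 2 and a refill rate of one token per second, two password changes succeed back to back, a third is refused, and another is allowed after roughly a second. The check would have to run before the hash is computed, since the hash is the expensive step.
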
> h2. Conclusion
> This vulnerability belongs to the same family as CASSANDRA-17812: bcrypt 
> computations starving the main request pool. Because any authenticated 
> account can trigger it with repeated {{ALTER ROLE}} statements, the risk is 
> high. We recommend extending the rate limiting / dedicated executor strategy 
> to all password-hashing pathways as soon as possible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
