[ 
https://issues.apache.org/jira/browse/CASSANDRA-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cyl updated CASSANDRA-21228:
----------------------------
    Description: 
h2. Vulnerability Description

*Name*: Authenticated DoS via {{ALTER ROLE}} Password Hashing

*Overview*:
In current Cassandra builds, {{ALTER ROLE ... WITH PASSWORD}} executes 
{{BCrypt.hashpw}} synchronously on the standard request executor 
({{Dispatcher.requestExecutor}}). When an authenticated user issues many 
password changes, the expensive bcrypt work monopolizes that pool, starving all 
other CQL requests and producing an authenticated denial of service. This is 
the same root problem that CASSANDRA-17812 addressed for {{AUTH_RESPONSE}}, 
reached here through {{ALTER ROLE}} instead.

*Affected Configurations*:
* Clusters running {{PasswordAuthenticator}}.
* Any authenticated account that may alter its own password (default behavior 
for non-superusers).
* Attackers that can reach the native CQL port.

*Impact*:
* Legitimate query latency rises sharply (observed increase from ~2 ms 
to >1 s).
* Attack threads hit numerous {{OperationTimedOut}} errors, which is consistent 
with thread-pool exhaustion.
* Service recovers as soon as the attack stops, which points to CPU starvation 
rather than a persistent fault.

h2. Proof-of-Concept Steps

The file {{poc_dos.py}} automates the scenario:

# Start a single-node Cassandra instance with {{PasswordAuthenticator}} and 
{{CassandraAuthorizer}}.
# With the superuser, create a victim role named {{target_role}}.
# Launch 200 concurrent threads that run {{ALTER ROLE target_role WITH PASSWORD 
'<random>'}} in a tight loop.
# Start a monitor thread executing {{SELECT now()}} once per second to record 
latency.

Run the following command:

{code:bash}
python3 poc_dos.py
{code}

*Observed Output*:

{code}
Starting attack with 200 threads...
[Victim] Query latency: 0.3743s
[Victim] Query latency: 0.9145s
Worker failed: ('Unable to connect ... OperationTimedOut ...')
[Victim] Query latency: 1.0181s
...
{code}

Immediately after the attack begins, the monitor reports 300 ms–1 s latency 
along with repeated {{OperationTimedOut}} errors. Once the attack stops, 
latency returns to ~2 ms, confirming that the slowdown is caused by the attack 
load.

h2. Problematic Code Reference

The vulnerable path sits in {{CassandraRoleManager.optionsToAssignments(...)}} 
and ultimately in {{hashpw(...)}}, both under 
{{src/java/org/apache/cassandra/auth/}}:

{code:java}
private String optionsToAssignments(Map<Option, Object> options)
{
  return options.entrySet()
          .stream()
          .map(entry ->
          {
            switch (entry.getKey())
            {
              case PASSWORD:
                // bcrypt runs on Dispatcher.requestExecutor
                return String.format("salted_hash = '%s'",
                                     escape(hashpw((String) entry.getValue())));
              // other options elided
            }
          })
          .filter(Objects::nonNull)
          .collect(Collectors.joining(","));
}

private static String hashpw(String password)
{
  return BCrypt.hashpw(password, PasswordSaltSupplier.get());
}
{code}

Because every {{ALTER ROLE ... WITH PASSWORD}} is processed on the shared 
{{Dispatcher.requestExecutor}}, each invocation above performs bcrypt hashing 
on threads that also handle standard queries, leading to starvation.
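
The starvation mechanism can be illustrated without Cassandra. The sketch below is only a model under stated assumptions: {{SharedPoolStarvationDemo}} and {{cheapTaskStarved}} are hypothetical names, a latch-blocked task stands in for a bcrypt computation, and a trivial task stands in for a cheap query such as {{SELECT now()}}. It does not reproduce {{Dispatcher.requestExecutor}} itself.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical demo: slow tasks on a shared fixed pool delay an unrelated cheap task. */
final class SharedPoolStarvationDemo
{
    /** Returns true if the cheap task cannot finish within waitMillis while slow tasks hold every worker. */
    static boolean cheapTaskStarved(int poolSize, long waitMillis)
    {
        ExecutorService sharedPool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch slowWorkDone = new CountDownLatch(1);
        try
        {
            // Stand-in for password hashing: each slow task occupies one worker until released.
            for (int i = 0; i < poolSize; i++)
                sharedPool.submit(() -> { slowWorkDone.await(); return null; });

            // Stand-in for a cheap query; it can only wait in the queue behind the slow tasks.
            Future<String> cheap = sharedPool.submit(() -> "ok");
            try
            {
                cheap.get(waitMillis, TimeUnit.MILLISECONDS);
                return false;
            }
            catch (TimeoutException e)
            {
                return true;
            }
            catch (InterruptedException | ExecutionException e)
            {
                throw new IllegalStateException(e);
            }
        }
        finally
        {
            // Release the slow tasks so the pool can drain and shut down.
            slowWorkDone.countDown();
            sharedPool.shutdown();
        }
    }
}
```

In this model the cheap task is delayed for as long as the slow tasks hold every worker, which matches the latency pattern reported by the monitor thread.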

h2. Related Issue and Root Cause

* *Related Fix*: [CASSANDRA-17812] “Rate-limit new client connection auth setup 
to avoid overwhelming bcrypt”.
  * Mitigation: route {{AUTH_RESPONSE}} (and similar) to {{authExecutor}}.
  * Gap: {{ALTER ROLE}} / {{CREATE ROLE}} continue to run on 
{{requestExecutor}}.
* *Shared Root Cause*: heavyweight bcrypt hashing without rate limiting or pool 
isolation leads to CPU starvation.
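
As a rough illustration of pool isolation for password hashing, the sketch below hands the hash computation to a small bounded executor. Everything in it is an assumption for illustration: {{IsolatedPasswordHasher}} and {{hashAsync}} are hypothetical names, the hash function is passed in by the caller (in Cassandra it would wrap {{BCrypt.hashpw}}), and this is not how {{authExecutor}} is actually wired.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical helper: runs an expensive hash function on its own small, bounded pool. */
final class IsolatedPasswordHasher implements AutoCloseable
{
    private final ThreadPoolExecutor hashExecutor;
    private final Function<String, String> hashFunction;

    IsolatedPasswordHasher(int threads, int queueCapacity, Function<String, String> hashFunction)
    {
        // Fixed-size pool with a bounded queue; AbortPolicy rejects work once both are full,
        // so a flood of hash requests cannot grow without limit.
        this.hashExecutor = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                                                   new ArrayBlockingQueue<>(queueCapacity),
                                                   r -> {
                                                       Thread t = new Thread(r, "password-hash");
                                                       t.setDaemon(true);
                                                       return t;
                                                   },
                                                   new ThreadPoolExecutor.AbortPolicy());
        this.hashFunction = hashFunction;
    }

    /** Schedules the hash on the isolated pool; the future fails if the pool is saturated. */
    CompletableFuture<String> hashAsync(String password)
    {
        try
        {
            return CompletableFuture.supplyAsync(() -> hashFunction.apply(password), hashExecutor);
        }
        catch (RejectedExecutionException e)
        {
            CompletableFuture<String> failed = new CompletableFuture<>();
            failed.completeExceptionally(e);
            return failed;
        }
    }

    @Override
    public void close()
    {
        hashExecutor.shutdown();
    }
}
```

With this shape, a burst of password changes saturates only the hashing pool. Excess requests fail fast instead of queueing behind ordinary queries.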

h2. Recommended Fixes

# *Execution Isolation*: Dispatch password hashing work ({{ALTER ROLE ... 
PASSWORD}}, {{CREATE ROLE ... PASSWORD}}, etc.) to a constrained executor 
similar to {{authExecutor}}.
# *Rate Limiting*: Enforce per-role, per-connection, or global throttles (e.g., 
token bucket) on password modifications.
# *Asynchronous Hashing*: Optionally compute bcrypt off-thread and update the 
system tables once ready, returning an “operation queued” response (requires 
protocol changes, higher complexity).
# *Operational Mitigations* (until a code fix ships):
   * Monitor CPU saturation closely; adjusting 
{{auth_bcrypt_gensalt_log2_rounds}} does not solve the issue but may highlight 
abuse sooner.
   * Tighten credential/role cache TTLs ({{roles_validity_in_ms}}, 
{{credentials_validity_in_ms}}), though this cannot block an active attacker.
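
The token-bucket throttle mentioned under *Rate Limiting* could be sketched as follows. This is a minimal sketch under assumptions: {{PasswordChangeLimiter}} is a hypothetical class, not an existing Cassandra API, and the injected clock exists only to make the behaviour testable. A real fix would also need to decide whether the bucket is kept per role, per connection, or globally.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical token bucket: allows bursts up to capacity, then one operation per refill interval. */
final class PasswordChangeLimiter
{
    private final long capacity;
    private final long nanosPerToken;
    private final LongSupplier nanoClock;
    private long tokens;
    private long lastRefillNanos;

    PasswordChangeLimiter(long capacity, long refillInterval, TimeUnit unit, LongSupplier nanoClock)
    {
        this.capacity = capacity;
        this.nanosPerToken = unit.toNanos(refillInterval);
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start with a full bucket
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Returns true if the caller may hash a password now, false if it should be throttled. */
    synchronized boolean tryAcquire()
    {
        long now = nanoClock.getAsLong();
        long newTokens = (now - lastRefillNanos) / nanosPerToken;
        if (newTokens > 0)
        {
            tokens = Math.min(capacity, tokens + newTokens);
            // Advance by whole intervals only, so partial progress toward the next token is kept.
            lastRefillNanos += newTokens * nanosPerToken;
        }
        if (tokens == 0)
            return false;
        tokens--;
        return true;
    }
}
```

A caller would check {{tryAcquire()}} before hashing and reject or delay the statement when it returns false. In production the clock argument would typically be {{System::nanoTime}}.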

h2. Conclusion

This vulnerability belongs to the same family as CASSANDRA-17812: bcrypt 
computations starving the main request pool. Because any authenticated account 
can trigger it with repeated {{ALTER ROLE}} statements, the risk is high. We 
recommend extending the rate-limiting and dedicated-executor strategy to all 
password-hashing pathways as soon as possible.

  was:
h2. 2. Vulnerability Description

*Name*: Authenticated DoS via {{ALTER ROLE}} Password Hashing

*Overview*:
In current Cassandra builds, {{ALTER ROLE ... WITH PASSWORD}} executes 
{{BCrypt.hashpw}} synchronously on the standard request executor 
({{Dispatcher.requestExecutor}}). When an authenticated user issues many 
password changes, the expensive bcrypt work monopolizes that pool, starving all 
other CQL requests and producing an authenticated denial of service. This is 
the same root problem addressed by CASSANDRA-17812 for {{AUTH_RESPONSE}}, but 
the trigger has moved to {{ALTER ROLE}}.

*Affected Configurations*:
* Clusters running {{PasswordAuthenticator}}.
* Any authenticated account that may alter its own password (default behavior 
for non-superusers).
* Attackers that can reach the native CQL port.

*Impact*:
* Legitimate query latency inflates dramatically (observed increase from ~2 ms 
to >1 s).
* Attack threads hit numerous {{OperationTimedOut}} errors, demonstrating 
thread-pool exhaustion.
* Service recovers immediately once the attack stops, indicating a classic 
CPU-starvation DoS.

h2. 2. Proof-of-Concept Steps

The file {{poc_dos.py}} automates the scenario:

# Start a single-node Cassandra instance with {{PasswordAuthenticator}} and 
{{CassandraAuthorizer}}.
# With the superuser, create a victim role named {{target_role}}.
# Launch 200 concurrent threads that run {{ALTER ROLE target_role WITH PASSWORD 
'<random>'}} in a tight loop.
# Start a monitor thread executing {{SELECT now()}} once per second to record 
latency.

Run the following command:

{code:bash}
python3 poc_dos.py
{code}

*Observed Output*:

{code}
Starting attack with 200 threads...
[Victim] Query latency: 0.3743s
[Victim] Query latency: 0.9145s
Worker failed: ('Unable to connect ... OperationTimedOut ...')
[Victim] Query latency: 1.0181s
...
{code}

Immediately after the attack begins, the monitor reports 300 ms–1 s latency 
along with repeated {{OperationTimedOut}} errors. Once the attack stops, 
latency returns to ~2 ms, proving the DoS is reproducible.

h2. 3. Problematic Code Reference

The vulnerable path sits in {{CassandraRoleManager.optionsToAssignments(...)}} 
and ultimately in {{hashpw(...)}}, both under 
{{src/java/org/apache/cassandra/auth/}}:

{code:java}
private String optionsToAssignments(Map<Option, Object> options)
{
  return options.entrySet()
          .stream()
          .map(entry ->
          {
            switch (entry.getKey())
            {
              case PASSWORD:
                // bcrypt runs on Dispatcher.requestExecutor
                return String.format("salted_hash = '%s'", 
escape(hashpw((String) entry.getValue())));
              // other options elided
            }
          })
          .filter(Objects::nonNull)
          .collect(Collectors.joining(","));
}

private static String hashpw(String password)
{
  return BCrypt.hashpw(password, PasswordSaltSupplier.get());
}
{code}

Because every {{ALTER ROLE ... WITH PASSWORD}} is processed on the shared 
{{Dispatcher.requestExecutor}}, each invocation above performs bcrypt hashing 
on threads that also handle standard queries, leading to starvation.

h2. 4. Related Issue and Root Cause

* *Related Fix*: [CASSANDRA-17812] “Rate-limit new client connection auth setup 
to avoid overwhelming bcrypt”.
  * Mitigation: route {{AUTH_RESPONSE}} (and similar) to {{authExecutor}}.
  * Gap: {{ALTER ROLE}} / {{CREATE ROLE}} continue to run on 
{{requestExecutor}}.
* *Shared Root Cause*: heavyweight bcrypt hashing without rate limiting or pool 
isolation leads to CPU starvation.

h2. 5. Recommended Fixes

# *Execution Isolation*: Dispatch password hashing work ({{ALTER ROLE ... 
PASSWORD}}, {{CREATE ROLE ... PASSWORD}}, etc.) to a constrained executor 
similar to {{authExecutor}}.
# *Rate Limiting*: Enforce per-role, per-connection, or global throttles (e.g., 
token bucket) on password modifications.
# *Asynchronous Hashing*: Optionally compute bcrypt off-thread and update the 
system tables once ready, returning an “operation queued” response (requires 
protocol changes, higher complexity).
# *Operational Mitigations* (until a code fix ships):
   * Monitor CPU saturation closely; adjusting 
{{auth_bcrypt_gensalt_log2_rounds}} does not solve the issue but may highlight 
abuse sooner.
   * Tighten credential/role cache TTLs ({{roles_validity_in_ms}}, 
{{credentials_validity_in_ms}}) though this cannot block an active attacker.

h2. 6. Conclusion

This vulnerability belongs to the same family as CASSANDRA-17812—bcrypt 
computations starving the main request pool. Because any authenticated account 
can trigger it with repeated {{ALTER ROLE}} statements, the risk is high. We 
recommend extending the rate limiting / dedicated executor strategy to all 
password-hashing pathways as soon as possible.


> ALTER ROLE Password Hash DoS Vulnerability
> ------------------------------------------
>
>                 Key: CASSANDRA-21228
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21228
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Feature/Authorization, Feature/Rate Limiting
>            Reporter: Cyl
>            Priority: Normal
>              Labels: dos, performance, security



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
