[
https://issues.apache.org/jira/browse/CASSANDRA-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cyl updated CASSANDRA-21228:
----------------------------
Description:
h2. Vulnerability Description
*Name*: Authenticated DoS via {{ALTER ROLE}} Password Hashing
*Overview*:
In current Cassandra builds, {{ALTER ROLE ... WITH PASSWORD}} executes
{{BCrypt.hashpw}} synchronously on the standard request executor
({{Dispatcher.requestExecutor}}). When an authenticated user issues many
password changes, the expensive bcrypt work monopolizes that pool, starving all
other CQL requests and producing an authenticated denial of service. This is
the same root problem addressed by CASSANDRA-17812 for {{AUTH_RESPONSE}}, but
the trigger has moved to {{ALTER ROLE}}.
*Affected Configurations*:
* Clusters running {{PasswordAuthenticator}}.
* Any authenticated account that may alter its own password (default behavior
for non-superusers).
* Attackers that can reach the native CQL port.
*Impact*:
* Legitimate query latency inflates dramatically (observed increase from ~2 ms
to >1 s).
* Attack threads hit numerous {{OperationTimedOut}} errors, demonstrating
thread-pool exhaustion.
* Service recovers immediately once the attack stops, indicating a classic
CPU-starvation DoS.
h2. Proof-of-Concept Steps
The file {{poc_dos.py}} automates the scenario:
# Start a single-node Cassandra instance with {{PasswordAuthenticator}} and
{{CassandraAuthorizer}}.
# With the superuser, create a victim role named {{target_role}}.
# Launch 200 concurrent threads that run {{ALTER ROLE target_role WITH PASSWORD
'<random>'}} in a tight loop.
# Start a monitor thread executing {{SELECT now()}} once per second to record
latency.
Run the following command:
{code:bash}
python3 poc_dos.py
{code}
*Observed Output*:
{code}
Starting attack with 200 threads...
[Victim] Query latency: 0.3743s
[Victim] Query latency: 0.9145s
Worker failed: ('Unable to connect ... OperationTimedOut ...')
[Victim] Query latency: 1.0181s
...
{code}
Immediately after the attack begins, the monitor reports 300 ms–1 s latency
along with repeated {{OperationTimedOut}} errors. Once the attack stops,
latency returns to ~2 ms, which ties the degradation directly to the
password-change load.
h2. Problematic Code Reference
The vulnerable path sits in {{CassandraRoleManager.optionsToAssignments(...)}}
and ultimately in {{hashpw(...)}}, both under
{{src/java/org/apache/cassandra/auth/}}:
{code:java}
private String optionsToAssignments(Map<Option, Object> options)
{
    return options.entrySet()
                  .stream()
                  .map(entry ->
                  {
                      switch (entry.getKey())
                      {
                          case PASSWORD:
                              // bcrypt runs on Dispatcher.requestExecutor
                              return String.format("salted_hash = '%s'",
                                                   escape(hashpw((String) entry.getValue())));
                          // other options elided
                      }
                  })
                  .filter(Objects::nonNull)
                  .collect(Collectors.joining(","));
}

private static String hashpw(String password)
{
    return BCrypt.hashpw(password, PasswordSaltSupplier.get());
}
{code}
Because every {{ALTER ROLE ... WITH PASSWORD}} is processed on the shared
{{Dispatcher.requestExecutor}}, each invocation above performs bcrypt hashing
on threads that also handle standard queries, leading to starvation.
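The queueing effect described above can be illustrated with a small, self-contained sketch. This is not Cassandra code: {{slowHash}} is a hypothetical stand-in that sleeps instead of running bcrypt, and the class name, pool sizes, job count, and 50 ms cost are all assumptions chosen for illustration. It measures how long a trivial "probe" task waits when slow hashing jobs share its pool, compared with when they are sent to a separate, bounded pool.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative only: models thread occupancy, not real bcrypt CPU cost.
final class HashingIsolationSketch
{
    // Hypothetical stand-in for BCrypt.hashpw: holds the calling thread for `millis`.
    static void slowHash(long millis)
    {
        try
        {
            Thread.sleep(millis);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
    }

    // Queues `hashJobs` slow hashes on hashPool, then submits one trivial task to
    // requestPool and returns how long that task took to complete, in milliseconds.
    static long probeLatencyMillis(ExecutorService requestPool, ExecutorService hashPool,
                                   int hashJobs, long hashMillis) throws Exception
    {
        for (int i = 0; i < hashJobs; i++)
            hashPool.submit(() -> slowHash(hashMillis));

        long start = System.nanoTime();
        requestPool.submit(() -> { }).get();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Hashing and ordinary requests compete for the same four threads.
    static long sharedPoolLatency() throws Exception
    {
        ExecutorService shared = Executors.newFixedThreadPool(4);
        try
        {
            return probeLatencyMillis(shared, shared, 16, 50);
        }
        finally
        {
            shared.shutdownNow();
        }
    }

    // Hashing is confined to its own two-thread pool; requests keep their four threads.
    static long isolatedPoolLatency() throws Exception
    {
        ExecutorService requests = Executors.newFixedThreadPool(4);
        ExecutorService hashing = Executors.newFixedThreadPool(2);
        try
        {
            return probeLatencyMillis(requests, hashing, 16, 50);
        }
        finally
        {
            requests.shutdownNow();
            hashing.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception
    {
        System.out.println("shared pool probe latency (ms):   " + sharedPoolLatency());
        System.out.println("isolated pool probe latency (ms): " + isolatedPoolLatency());
    }
}
```

In the shared case the probe sits behind four rounds of queued hashes, while in the isolated case it runs almost immediately. Because the stand-in sleeps rather than burning CPU, the sketch only shows the thread-occupancy half of the problem. Real bcrypt work on a dedicated pool still competes for cores, so the size of that pool matters.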
h2. Related Issue and Root Cause
* *Related Fix*: [CASSANDRA-17812] “Rate-limit new client connection auth setup
to avoid overwhelming bcrypt”.
** Mitigation: route {{AUTH_RESPONSE}} (and similar) to {{authExecutor}}.
** Gap: {{ALTER ROLE}} / {{CREATE ROLE}} continue to run on
{{requestExecutor}}.
* *Shared Root Cause*: heavyweight bcrypt hashing without rate limiting or pool
isolation leads to CPU starvation.
h2. Recommended Fixes
# *Execution Isolation*: Dispatch password hashing work ({{ALTER ROLE ...
PASSWORD}}, {{CREATE ROLE ... PASSWORD}}, etc.) to a constrained executor
similar to {{authExecutor}}.
# *Rate Limiting*: Enforce per-role, per-connection, or global throttles (e.g.,
token bucket) on password modifications.
# *Asynchronous Hashing*: Optionally compute bcrypt off-thread and update the
system tables once ready, returning an “operation queued” response (requires
protocol changes, higher complexity).
# *Operational Mitigations* (until a code fix ships):
#* Monitor CPU saturation closely. Adjusting
{{auth_bcrypt_gensalt_log2_rounds}} changes the cost of each hash but does not
solve the issue, although it may highlight abuse sooner.
#* Tighten credential/role cache TTLs ({{roles_validity_in_ms}},
{{credentials_validity_in_ms}}), though this cannot block an active attacker.
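As a rough illustration of the rate-limiting recommendation, the sketch below implements a per-role token bucket. Everything in it is an assumption made for illustration: the class name {{PasswordChangeThrottle}}, the choice to key buckets by the role issuing the statement, and the capacity and refill values. It is not an existing Cassandra API. A caller would check {{tryAcquire}} before hashing and reject the statement when it returns false.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-role token bucket for password-change statements.
final class PasswordChangeThrottle
{
    private static final class Bucket
    {
        double tokens;
        long lastRefillNanos;

        Bucket(double tokens, long nowNanos)
        {
            this.tokens = tokens;
            this.lastRefillNanos = nowNanos;
        }
    }

    private final double capacity;        // maximum burst size per role
    private final double refillPerSecond; // sustained rate per role
    private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

    PasswordChangeThrottle(double capacity, double refillPerSecond)
    {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
    }

    // Returns true if `role` may perform one password change at time `nowNanos`.
    // The clock is passed in so the behaviour can be checked deterministically.
    boolean tryAcquire(String role, long nowNanos)
    {
        Bucket bucket = buckets.computeIfAbsent(role, r -> new Bucket(capacity, nowNanos));
        synchronized (bucket)
        {
            // Refill in proportion to elapsed time, never exceeding capacity.
            double elapsedSeconds = Math.max(0L, nowNanos - bucket.lastRefillNanos) / 1e9;
            bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSeconds * refillPerSecond);
            bucket.lastRefillNanos = nowNanos;

            if (bucket.tokens < 1.0)
                return false;

            bucket.tokens -= 1.0;
            return true;
        }
    }

    // Convenience overload using the monotonic system clock.
    boolean tryAcquire(String role)
    {
        return tryAcquire(role, System.nanoTime());
    }

    public static void main(String[] args)
    {
        PasswordChangeThrottle throttle = new PasswordChangeThrottle(2, 1.0);
        for (int i = 0; i < 4; i++)
            System.out.println("attempt " + i + " allowed: " + throttle.tryAcquire("target_role"));
    }
}
```

The map in this sketch grows with the number of distinct roles and is never pruned; a real implementation would need eviction, and would also have to decide whether a global limit is needed in addition to the per-role one.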
h2. Conclusion
This vulnerability belongs to the same family as CASSANDRA-17812—bcrypt
computations starving the main request pool. Because any authenticated account
can trigger it with repeated {{ALTER ROLE}} statements, the risk is high. We
recommend extending the rate limiting / dedicated executor strategy to all
password-hashing pathways as soon as possible.
> ALTER ROLE Password Hash DoS Vulnerability
> ------------------------------------------
>
> Key: CASSANDRA-21228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21228
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Feature/Authorization, Feature/Rate Limiting
> Reporter: Cyl
> Priority: Normal
> Labels: dos, performance, security