[jira] [Commented] (CASSANDRA-21165) Query read timeout potentially due to altered query on server side

Brandon Williams (Jira) Thu, 02 Apr 2026 13:52:56 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-21165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18070699#comment-18070699
 ]


Brandon Williams commented on CASSANDRA-21165:
----------------------------------------------

I looked into WRITETIME: while the blob value is conceptually not needed to 
compute a writetime, it is still fetched and carried in the row input path.  
WritetimeOrTTLSelector adds the input 
[here|https://github.com/apache/cassandra/blob/cassandra-5.0/src/java/org/apache/cassandra/cql3/selection/WritetimeOrTTLSelector.java#L132]
 and SimpleSelector pulls the value 
[here|https://github.com/apache/cassandra/blob/cassandra-5.0/src/java/org/apache/cassandra/cql3/selection/SimpleSelector.java#L155].
  So WRITETIME may not be a good test, since it will still read (but not 
necessarily deserialize) the value.

To get a more definitive answer, I added dtests in [this 
branch|https://github.com/driftx/cassandra-dtest/tree/CASSANDRA-21165] that 
select either the writetime or a non-blob column through a non-replica 
coordinator and then measure traffic via system_views.internode_outbound.  
Both tests fail against the versions I tested (4.0, 4.1, and 5.0), so the blob 
column is indeed being sent over the wire when it is not requested.  However, 
I think this is to be expected under the 
ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS strategy that normal CQL reads use: 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/filter/ColumnFilter.java#L118
 which would also explain why it occurs in all versions.
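
As a rough illustration of the measurement described above (this is not the 
code in the linked branch), the check can be sketched as a before/after 
comparison of rows read from system_views.internode_outbound on a replica.  
The column names (address, port, sent_bytes), the query string, the helper 
names, and the "delta at least as large as the blob" heuristic are all 
assumptions made for this sketch; background traffic such as gossip also adds 
to the counters, so the threshold only makes sense for blobs much larger than 
that noise.

```python
from typing import Dict, Iterable, Mapping, Tuple

# Assumed query; run it on a replica before and after the read under test.
SNAPSHOT_CQL = "SELECT address, port, sent_bytes FROM system_views.internode_outbound"

Peer = Tuple[str, int]


def sent_bytes_by_peer(rows: Iterable[Mapping[str, object]]) -> Dict[Peer, int]:
    """Index snapshot rows by (address, port) -> cumulative sent_bytes."""
    return {(str(r["address"]), int(r["port"])): int(r["sent_bytes"]) for r in rows}


def outbound_delta(before: Iterable[Mapping[str, object]],
                   after: Iterable[Mapping[str, object]]) -> Dict[Peer, int]:
    """Bytes sent to each peer between the two snapshots."""
    b = sent_bytes_by_peer(before)
    a = sent_bytes_by_peer(after)
    # A peer missing from the first snapshot is treated as having sent 0 bytes.
    return {peer: a[peer] - b.get(peer, 0) for peer in a}


def blob_was_shipped(delta: Mapping[Peer, int], blob_size: int) -> bool:
    """Heuristic: some peer received at least blob_size bytes during the read."""
    return max(delta.values(), default=0) >= blob_size
```

With rows represented as dicts, a test would take one snapshot, run the SELECT 
that omits the blob column through a non-replica coordinator, take a second 
snapshot, and expect blob_was_shipped(outbound_delta(before, after), 
blob_size) to be False.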

> Query read timeout potentially due to altered query on server side
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-21165
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21165
>             Project: Apache Cassandra
>          Issue Type: Bug
>            Reporter: Harsh Desai
>            Assignee: Brad Schoening
>            Priority: Urgent
>         Attachments: CQL_TRACING_Output.txt
>
>
> During load testing of Cassandra 5.0.6 cluster, we came across an unusual 
> issue wherein a lightweight CQL query times out.
> Upon further analysis, it was found that the query being executed on the 
> server side does not seem to be the same as the one sent by the driver.
>  
> {+}Client side code{+}:
> this.statement = session.prepare(SimpleStatement.newInstance("SELECT column1 
> from \"kspace\".\"tsTable\" WHERE key = ? AND key2 = ? ORDER BY column1 DESC 
> LIMIT 1").setIdempotent(true));
>  
> {+}Cassandra server audit logs{+}:
> FileAuditLogger.java:51 - 
> ...|type:REQUEST_FAILURE|category:ERROR|ks:kspace|scope:tsTable|operation:SELECT
>  column1 from "kspace"."tsTable" WHERE key = ? AND key2 = ? ORDER BY column1 
> DESC LIMIT 1; Operation timed out - received only 1 responses.
>  
> {+}Cassandra server logs{+}:
> NoSpamLogger.java:104 - ...ReadTimeoutException "Operation timed out - 
> received only 1 responses." while executing SELECT {color:#ff0000}*{color} 
> FROM "kspace"."tsTable" WHERE key = c001c5c2-f0a7-1046-115d-edb4b67ab0d9 AND 
> key2 = '2026-02' ORDER BY column1 DESC, {color:#ff0000}*column2 ASC, column3 
> DESC, column4 DESC*{color} LIMIT 1 {color:#ff0000}*ALLOW FILTERING*{color}
>  
> {+}Replica node logs{+}:
> .. [WARN ] [ReadStage-68] cluster_id=1 ip_address=1.1.1.1  
> NoSpamLogger.java:107 - /2.2.2.2:7000->/3.3.3.3:7000-LARGE_MESSAGES-2acb4e9d 
> overloaded; dropping 1.779MiB message (queue: 131.653MiB local, 127.653MiB 
> endpoint, 127.653MiB global)
>  
> {+}Table Schema{+}:
>  
> ||Column||Type||Key type||
> |key|TIMEUUID|Partition Key|
> |key2|TEXT|Partition Key|
> |column1|BIGINT|Clustering Column ASC|
> |column2|TIMEUUID|Clustering Column DESC|
> |column3|BOOLEAN|Clustering Column ASC|
> |column4|TEXT|Clustering Column ASC|
> |value|BLOB| |
>  
> Attached is the CQL query TRACING output (executed separately), which shows 
> that the message transmitted from the replica node is a large one.
> Evidently, the query sent by the driver is quite lightweight while the one 
> executed on the server is not, as it tries to fetch all the columns, 
> including the blob, which was not requested. This appears to be supported by 
> the fact that the message is a large one and is therefore dropped. Besides, 
> the query unexpectedly runs with “ALLOW FILTERING”, which is detrimental to 
> query performance.
>  



