I have a keyspace that is using the NetworkTopologyStrategy. In one of the data centers there are 3 nodes, and the replication factor is 3. I would expect that when I make a request to any of the 3 nodes, that node would answer the read itself instead of forwarding it to another replica. But that doesn't always happen, and I can't find anything that would trigger the forwarding. For example, look at the following traces:

 session_id                           | client    | command | coordinator | coordinator_port | duration | parameters                                                                                                                                                                                       | request                     | started_at
--------------------------------------+-----------+---------+-------------+------------------+----------+--------------------------------------------------------------------------------------------------+-----------------------------+---------------------------------
 fea76550-3775-11f0-88c7-67d23f0ba311 | 10.0.4.68 |   QUERY |   10.0.4.63 |             7000 |     5024 | {'bound_var_0_slugs': '[''rules'']', 'consistency_level': 'LOCAL_ONE', 'query': 'SELECT title, content, "tableOfContents" FROM "CmsPage" WHERE slugs = ?', 'serial_consistency_level': 'SERIAL'} | Execute CQL3 prepared query | 2025-05-23 01:33:57.157000+0000
 008c2ae0-3776-11f0-88c7-67d23f0ba311 | 10.0.4.68 |   QUERY |   10.0.4.63 |             7000 |     1681 | {'bound_var_0_slugs': '[''rules'']', 'consistency_level': 'LOCAL_ONE', 'query': 'SELECT title, content, "tableOfContents" FROM "CmsPage" WHERE slugs = ?', 'serial_consistency_level': 'SERIAL'} | Execute CQL3 prepared query | 2025-05-23 01:34:00.334000+0000

Trace Events:
 session_id                           | event_id                             | activity                                                                 | source     | source_elapsed | source_port | thread
--------------------------------------+--------------------------------------+--------------------------------------------------------------------------+------------+----------------+-------------+-----------------------------
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea78c60-3775-11f0-88c7-67d23f0ba311 |                                       reading data from /10.0.3.184:7000 |  10.0.4.63 |            674 |        7000 | Native-Transport-Requests-1
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea78c6a-3775-11f0-88c7-67d23f0ba311 |      Sending READ_REQ message to /10.0.3.184:7000 message size 182 bytes |  10.0.4.63 |            994 |        7000 |     Messaging-EventLoop-3-1
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea828a0-3775-11f0-88c7-67d23f0ba311 |                          READ_RSP message received from /10.0.3.184:7000 |  10.0.4.63 |           4558 |        7000 |     Messaging-EventLoop-3-3
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea828a0-3775-11f0-accb-b7ca6218b7fc |                           READ_REQ message received from /10.0.4.63:7000 | 10.0.3.184 |             70 |        7000 |     Messaging-EventLoop-3-3
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea828aa-3775-11f0-88c7-67d23f0ba311 |                                Processing response from /10.0.3.184:7000 |  10.0.4.63 |           4683 |        7000 |      RequestResponseStage-2
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea84fb0-3775-11f0-accb-b7ca6218b7fc |                              Executing single-partition query on CmsPage | 10.0.3.184 |            718 |        7000 |                 ReadStage-2
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea84fba-3775-11f0-accb-b7ca6218b7fc |                                             Acquiring sstable references | 10.0.3.184 |            836 |        7000 |                 ReadStage-2
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea84fc4-3775-11f0-accb-b7ca6218b7fc |                                                Merging memtable contents | 10.0.3.184 |            871 |        7000 |                 ReadStage-2
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea84fce-3775-11f0-accb-b7ca6218b7fc |        Bloom filter allows skipping sstable 3gqh_0etw_07i282pmrgsayqybc9 | 10.0.3.184 |            932 |        7000 |                 ReadStage-2
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea84fd8-3775-11f0-accb-b7ca6218b7fc | Partition index found for sstable 3gqh_0etv_3lghs2pmrgsayqybc9, size = 0 | 10.0.3.184 |           1068 |        7000 |                 ReadStage-2
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea876c0-3775-11f0-accb-b7ca6218b7fc |                                   Read 1 live rows and 0 tombstone cells | 10.0.3.184 |           1437 |        7000 |                 ReadStage-2
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea876ca-3775-11f0-accb-b7ca6218b7fc |                                    Enqueuing response to /10.0.4.63:7000 | 10.0.3.184 |           1493 |        7000 |                 ReadStage-2
 fea76550-3775-11f0-88c7-67d23f0ba311 | fea876d4-3775-11f0-accb-b7ca6218b7fc |      Sending READ_RSP message to /10.0.4.63:7000 message size 2322 bytes | 10.0.3.184 |           1843 |        7000 |     Messaging-EventLoop-3-4


 session_id                           | event_id                             | activity | source    | source_elapsed | source_port | thread
--------------------------------------+--------------------------------------+--------------------------------------------------------------------------+-----------+----------------+-------------+-------------
 008c2ae0-3776-11f0-88c7-67d23f0ba311 | 008c51f0-3776-11f0-88c7-67d23f0ba311 |                              Executing single-partition query on CmsPage | 10.0.4.63 |            843 |        7000 | ReadStage-2
 008c2ae0-3776-11f0-88c7-67d23f0ba311 | 008c51fa-3776-11f0-88c7-67d23f0ba311 |                                             Acquiring sstable references | 10.0.4.63 |            948 |        7000 | ReadStage-2
 008c2ae0-3776-11f0-88c7-67d23f0ba311 | 008c5204-3776-11f0-88c7-67d23f0ba311 |                                                Merging memtable contents | 10.0.4.63 |            989 |        7000 | ReadStage-2
 008c2ae0-3776-11f0-88c7-67d23f0ba311 | 008c520e-3776-11f0-88c7-67d23f0ba311 | Partition index found for sstable 3go5_02qz_3ttfe2hrx6r82xf1x4, size = 0 | 10.0.4.63 |           1129 |        7000 | ReadStage-2
 008c2ae0-3776-11f0-88c7-67d23f0ba311 | 008c7900-3776-11f0-88c7-67d23f0ba311 |                                   Read 1 live rows and 0 tombstone cells | 10.0.4.63 |           1417 |        7000 | ReadStage-2

The client is the same, with the same coordinator, the same prepared query, and
the same query parameters.

Initially, I was worried that it might be a mismatch between prepared queries
and the host they were prepared on. I'm working on improving the Haskell
Cassandra driver, adding a token-aware routing policy.

The library currently makes the following assumption:

    The spec scopes the 'QueryId' to the node the query has
    been prepared with. The spec does not state anything
    about the format of the 'QueryId'. However the official
    Java driver assumes that any given 'QueryString' yields
    the same 'QueryId' on every node. This client makes the
    same assumption.
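In code, that assumption amounts to keying the prepared-statement cache by the
query string alone and ignoring which host produced the id. This is a
hypothetical sketch, not the library's actual types:

```haskell
import qualified Data.Map.Strict as Map

type QueryString = String
type QueryId     = String
type Host        = String

-- Cache keyed by query text alone.
type PrepCache = Map.Map QueryString QueryId

-- The host argument is deliberately ignored, so an id obtained by
-- preparing on one node is sent as-is when executing on any other node.
lookupPrepared :: PrepCache -> Host -> QueryString -> Maybe QueryId
lookupPrepared cache _host q = Map.lookup q cache
```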

But if that broken assumption were causing the behavior I'm seeing, I would
expect the behavior to be consistent for a given client IP, and it isn't.
Perhaps the coordinator transparently prepares the query while forwarding it,
which might explain what I'm seeing.

I've tried running a full repair on all of the nodes in the cluster for this
keyspace, and it didn't seem to change anything.

I'm happy to gather more information. If there are details that I've left
out that would help diagnose this issue, I'd be happy to provide them.

In a related question: I'm not sure if I should be spreading out read requests
across the replicas, or if they should instead go to the primary replica,
assuming that it is up.
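For the "spread across replicas" option, what I had in mind is as simple as
rotating the replica list by a per-request counter, so each replica takes its
turn as the first candidate (purely illustrative):

```haskell
-- Rotate a replica list by a per-request counter; the head of the
-- result is the node the request would be sent to first.
rotateBy :: Int -> [a] -> [a]
rotateBy _ [] = []
rotateBy n xs = drop k xs ++ take k xs
  where k = n `mod` length xs
```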



Hopefully someone can explain what I'm missing.

Thank You,
Kyle
