Josh McKenzie created CASSANALYTICS-133:
-------------------------------------------
Summary: Persisting data to C* in the StatePersister is async and
has unhandled exceptions
Key: CASSANALYTICS-133
URL: https://issues.apache.org/jira/browse/CASSANALYTICS-133
Project: Apache Cassandra Analytics
Issue Type: Bug
Components: CDC
Reporter: Josh McKenzie
The code path to persist data to C* has some interesting properties.
# In our {{SidecarStatePersister#flushActive}} call, we wait on the futures
without setting our own timeout on them. In effect, this means persisting
data to C* is bounded only by the standard default write timeout in C*.
Blocking operations that ask the {{StatePersister}} to persist when C* isn't
healthy might behave in... rather surprising ways.
# We handle {{ExecutionException}} and {{InterruptedException}} in
{{SidecarStatePersister#flushActiveSafe}} but do not actually handle any of the
{{CassandraException}}-shaped things we could get back from Cassandra. I could
see arguments either way on doing this, but the intent isn't documented and
the name of the method calls into question what exactly "safe" means in this
context. :)
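To make the first point concrete, here is a minimal sketch of a bounded wait over a batch of pending writes. It is not the actual {{SidecarStatePersister}} code: the class name {{BoundedFlushSketch}}, the helper {{awaitAll}}, and the {{maxWait}} parameter are all hypothetical, and plain {{java.util.concurrent}} futures stand in for whatever the driver returns.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; names here are illustrative and not taken from the project.
final class BoundedFlushSketch {

    // Waits for every pending write, but never longer than maxWait in total.
    // Returns true only if all futures completed normally before the deadline.
    static boolean awaitAll(List<? extends Future<?>> pending, Duration maxWait) {
        long deadline = System.nanoTime() + maxWait.toNanos();
        for (Future<?> write : pending) {
            // Share one deadline across the batch instead of granting each
            // future its own full timeout.
            long remaining = Math.max(deadline - System.nanoTime(), 0L);
            try {
                write.get(remaining, TimeUnit.NANOSECONDS);
            } catch (TimeoutException | ExecutionException | CancellationException e) {
                return false;
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers can still observe it.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Future<?>> done =
                Collections.singletonList(CompletableFuture.completedFuture("ok"));
        List<Future<?>> stuck =
                Collections.singletonList(new CompletableFuture<String>());
        System.out.println(awaitAll(done, Duration.ofMillis(50)));
        System.out.println(awaitAll(stuck, Duration.ofMillis(50)));
    }
}
```

The single shared deadline is the design choice worth noting: with a per-future timeout, a batch of N slow writes could still block for N times the timeout.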
We should probably do something about both of these: set our own more
aggressive timeouts if we're going to rely on the {{flush()}} path in the
{{StatePersister}} in any kind of performance-sensitive or functionally blocking
path (shutdown, etc.), and either rename some of the methods to denote what
they are and aren't safe in the face of, or change / augment the logic to make
deliberate choices when we fail to persist CDC state back to the DB.
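One possible shape for "deliberate choices" is to have the flush path return an explicit outcome instead of swallowing or leaking exceptions. The sketch below is only an illustration under stated assumptions: {{FlushOutcomeSketch}}, {{Outcome}}, {{Result}}, and {{flush}} are hypothetical names, and it assumes a driver-side failure reaches the caller as the cause of an {{ExecutionException}}, which would need to be checked against the real driver.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; none of these names come from the project.
final class FlushOutcomeSketch {

    enum Outcome { PERSISTED, TIMED_OUT, WRITE_FAILED, INTERRUPTED }

    // Pairs the outcome with the underlying cause (null on success).
    static final class Result {
        final Outcome outcome;
        final Throwable cause;

        Result(Outcome outcome, Throwable cause) {
            this.outcome = outcome;
            this.cause = cause;
        }
    }

    // Waits for one write with an explicit timeout and reports what happened,
    // leaving the decision (retry, log, fail shutdown) to the caller.
    static Result flush(Future<?> write, long timeoutMillis) {
        try {
            write.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return new Result(Outcome.PERSISTED, null);
        } catch (TimeoutException e) {
            // Best-effort cancel so an abandoned write does not linger.
            write.cancel(true);
            return new Result(Outcome.TIMED_OUT, e);
        } catch (ExecutionException e) {
            // Assumption: the driver's exception arrives as the cause here.
            return new Result(Outcome.WRITE_FAILED, e.getCause());
        } catch (CancellationException e) {
            return new Result(Outcome.WRITE_FAILED, e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new Result(Outcome.INTERRUPTED, e);
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("simulated write failure"));
        System.out.println(flush(failed, 50).outcome);
    }
}
```

With a return type like this, a method name such as {{flushActiveSafe}} could be documented in terms of which outcomes it absorbs and which it surfaces.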