Re: [DISCUSS] CASSANDRA-18940 SAI post-filtering reads don't update local table latency metrics

Jeremiah Jordan Fri, 01 Dec 2023 11:09:37 -0800

Again, I am coming at this from the operator/end-user perspective. I am
creating a metrics dashboard and then looking at those metrics to
understand what my queries are doing.  We have coordinator query-level
metrics, and then we have lower-level table metrics on the replicas.  I
want to be able to draw a line from this set of coordinator query metrics
to that set of table metrics, and be able to understand how they are
affecting each other for a given query.


The best would be for SAI / indexes to have their very own sets of all the
metrics, to understand how many rows are read by a given SAI query, how
that turns into the overall time for the query, how long those
individual reads were taking, etc.

But at the very least I want all of that separate from the metrics for my
regular point reads.

And yes, putting the individual point read metrics into the range metrics
would be strange.  But rolling up the time to get all the rows into the
Range metrics could possibly make sense.  Still strange.  So again,
SAI-specific metrics seem best to me, rather than shoehorning
them into the existing metrics.
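
To make the counting arithmetic in this thread concrete, here is a minimal
sketch of the two options from the original proposal, next to a separate
SAI count. It is plain Java with no Cassandra dependencies; the class and
method names are hypothetical and do not correspond to anything in
TableMetrics.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch only: these names are illustrative assumptions,
// not part of Cassandra's TableMetrics.
final class LocalReadCounts {
    private final LongAdder regularReads = new LongAdder(); // non-SAI single-partition reads
    private final LongAdder saiReads = new LongAdder();     // SAI post-filtering reads

    void recordRegularRead() { regularReads.increment(); }
    void recordSaiRead() { saiReads.increment(); }

    // The separate SAI count exists under both options.
    long saiCount() { return saiReads.sum(); }

    // Option 1: the global count includes SAI reads, so a dashboard
    // has to subtract to isolate the non-SAI reads.
    long option1GlobalCount() { return regularReads.sum() + saiReads.sum(); }
    long option1NonSaiReads() { return option1GlobalCount() - saiCount(); }

    // Option 2: the global count excludes SAI reads, so a dashboard
    // adds the two counts to get the total read load.
    long option2GlobalCount() { return regularReads.sum(); }
    long option2TotalReads() { return option2GlobalCount() + saiCount(); }
}
```

With 7 regular reads and 3 SAI reads recorded, option 1 reports a global
count of 10 and needs a subtraction to recover the 7, while option 2
reports 7 directly and needs an addition to recover the 10.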

-Jeremiah

On Dec 1, 2023 at 1:04:47 PM, Caleb Rackliffe <[email protected]>
wrote:

> Right. SAI queries are distributed range queries that produce local
> single-partition reads. They should absolutely not be recorded in the local
> range read latency metric. I'm fine ultimately with a new metric or the
> existing local single-partition read metric.
>
> On Fri, Dec 1, 2023 at 1:02 PM J. D. Jordan <[email protected]>
> wrote:
>
>> At the coordinator level SAI queries fall under Range metrics. I would
>> either put them under the same at the lower level or in a new SAI metric.
>>
>> It would be confusing to have the top level coordinator query metrics in
>> Range and the lower level in Read.
>>
>> On Dec 1, 2023, at 12:50 PM, Caleb Rackliffe <[email protected]>
>> wrote:
>>
>> So the plan would be to have local "Read" and "Range" remain unchanged in
>> TableMetrics, but have a third "SAIRead" (?) just for SAI post-filtering
>> read SinglePartitionReadCommands? I won't complain too much if that's what
>> we settle on, but it just depends on how much this is a metric for
>> ReadCommand subclasses operating at the node-local level versus something
>> we think we should link conceptually to a user query. SAI queries will
>> produce a SinglePartitionReadCommand per matching primary key, so that
>> definitely won't work for the latter.
>>
>> @Mike On a related note, we now have "PartitionReads" and "RowsFiltered"
>> in TableQueryMetrics. Should the former just be removed, given a.) it
>> actually counts rows now, not partitions, and b.) "RowsFiltered" seems like
>> it'll be almost the same thing now? (I guess if we ever try batching row
>> reads per partition, it would come in handy again...)
>>
>> On Fri, Dec 1, 2023 at 12:30 PM J. D. Jordan <[email protected]>
>> wrote:
>>
>>> I prefer option 2. It is much easier to understand and roll up two
>>> metrics than to do subtractive dashboards.
>>>
>>> SAI reads are already “range reads” for the client level metrics, not
>>> regular reads. So grouping them into the regular read metrics at the lower
>>> level seems confusing to me in that sense as well.
>>>
>>> As an operator I want to know how my SAI reads and normal reads are
>>> performing latency wise separately.
>>>
>>> -Jeremiah
>>>
>>> On Dec 1, 2023, at 11:15 AM, Caleb Rackliffe <[email protected]>
>>> wrote:
>>>
>>> Option 1 would be my preference. It seems useful to have both a single
>>> metric for read load against the table and a way to break out SAI reads
>>> specifically.
>>>
>>> On Fri, Dec 1, 2023 at 11:00 AM Mike Adamson <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are looking at adding SAI post-filtering reads to the local table
>>>> metrics and would like some feedback on the best approach.
>>>>
>>>> We don't think that SAI reads are that special so they can be included
>>>> in the table latencies, but how do we handle the global counts and the SAI
>>>> counts? Do we need to maintain a separate count of SAI reads? We feel the
>>>> answer to this is yes so how do we do the counting? There are two options
>>>> (others welcome):
>>>>
>>>> 1. All reads go into the current global count and we have a separate
>>>> count for SAI specific reads. So non-SAI reads = global count - SAI count
>>>> 2. We exclude the SAI reads from the current global count so
>>>> total reads = global count + SAI count
>>>>
>>>> Our preference is for option 1 above. Does anyone have any strong views
>>>> / opinions on this?
>>>>
>>>>
>>>>
>>>> --
>>>> Mike Adamson
>>>> Engineering
