Marton Greber has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23246 )

Change subject: KUDU-3662 [5/n] Add metrics support to replication
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23246/4/java/kudu-replication/src/main/java/org/apache/kudu/replication/wrappedsource/MetricWrappedKuduEnumerator.java
File 
java/kudu-replication/src/main/java/org/apache/kudu/replication/wrappedsource/MetricWrappedKuduEnumerator.java:

http://gerrit.cloudera.org:8080/#/c/23246/4/java/kudu-replication/src/main/java/org/apache/kudu/replication/wrappedsource/MetricWrappedKuduEnumerator.java@127
PS4, Line 127:   private long getLastEndTimestamp() {
             :     long hybridTime = 
ReflectionSecurityUtils.getLongFieldValue(lastEndTimestampField, delegate);
             :     long[] parsed = 
HybridTimeUtil.HTTimestampToPhysicalAndLogical(hybridTime);
             :     long epochMicros = parsed[0];
             :     return epochMicros / 1_000;
             :   }
> > The physical timestamp alone provides all the necessary information to un
The replication in the source connector uses hybrid time for the initial t_0 
and hybrid timestamps for diff scanning t_{n} to t_{n+1}.

Since metrics are not yet implemented in the source connector, we wrap some 
classes in this patch to achieve metrics exposure. In a sense lastEndTimestamp 
( t_{n} )is a state of the enumerator. From an observability point of view what 
we want to get out of this metric is a nice monotonic increase to imply any lag 
that can occur. This is why I thought the physical is fine.

When we generate a set of splits (the splits basically wrap KuduScanTokens) we 
save the ending timestamp. While the readers are executing the splits, they are 
tracked: pending splits (currently reading), unassigned splits (waiting for 
reader)
The discover interval tells the enumerator the periodicity how new splits 
should be discovered, but only if all the current splits are finished: pending 
and unassigned are empty.

"this approach assumes it's not necessary to track changes that happened 
withing the same physical timestamp": not sure if I understand, is the question 
towards the general replication mechanism or just about what is exposed as 
metrics?

If it is about exposing the logical part: i wasn't planning to expose that as a 
metric, but if you think it would be helpful, I can add it for sure. Just let 
me know.

If the question is about how the diff scan boundaries cover for example fast 
updates which can end up on the same physical timestamp: we use system clock to 
get the physical part for a given timestamp, and we always use 0 for the 
logical part when we set t_{n}, t_{n+1}, then it is just handled by diffscan. 
We use the same apis and mechanism on this front that backup and restore uses.


http://gerrit.cloudera.org:8080/#/c/23246/4/java/kudu-replication/src/test/java/org/apache/kudu/replication/TestReplicationMetrics.java
File 
java/kudu-replication/src/test/java/org/apache/kudu/replication/TestReplicationMetrics.java:

http://gerrit.cloudera.org:8080/#/c/23246/4/java/kudu-replication/src/test/java/org/apache/kudu/replication/TestReplicationMetrics.java@129
PS4, Line 129:
> OK, that sounds reasonable.  To avoid introducing new flaky test, consider
Did run TSAN through dist-test a 1000 times all successful. [1]
One thing I noticed while experimenting is that it's advantageous to add @Test 
timeouts just in case. Added those in the latest patch set.

[1] http://dist-test.cloudera.org/job?job_id=root.1757148805.14824



--
To view, visit http://gerrit.cloudera.org:8080/23246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibfbd34c707e7539ee88863399ae3061683f8bb3b
Gerrit-Change-Number: 23246
Gerrit-PatchSet: 6
Gerrit-Owner: Marton Greber <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Gabriella Lotz <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Comment-Date: Sat, 06 Sep 2025 09:54:04 +0000
Gerrit-HasComments: Yes

Reply via email to