Marton Greber has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23246 )

Change subject: KUDU-3662 [5/n] Add metrics support to replication
......................................................................


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/23246/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23246/3//COMMIT_MSG@9
PS3, Line 9: three key metrics
> Question - are these metrics always enabled i.e. do these metrics get popul
Yes if they are registered as we do here, these metrics will be always 
available.


http://gerrit.cloudera.org:8080/#/c/23246/3//COMMIT_MSG@15
PS3, Line 15: upstream
            : Flink Kudu connector
> Just out of curiosity if the Flink connector changes its internal structure
Good question: so if a commit lands in the Flink connector, nothing happens. 
There needs to be a new release that we can pick up. As long as we don't 
explicitly bump our Flink Kudu connector version we are good.


http://gerrit.cloudera.org:8080/#/c/23246/3/java/kudu-replication/src/main/java/org/apache/kudu/replication/wrappedsource/MetricWrappedKuduEnumerator.java
File 
java/kudu-replication/src/main/java/org/apache/kudu/replication/wrappedsource/MetricWrappedKuduEnumerator.java:

http://gerrit.cloudera.org:8080/#/c/23246/3/java/kudu-replication/src/main/java/org/apache/kudu/replication/wrappedsource/MetricWrappedKuduEnumerator.java@75
PS3, Line 75: pending
> nit: Can this be renamed to pendingSplits for consistency?
ah sorry my bad the naming here is correct according to the wrapped class:
https://github.com/apache/flink-connector-kudu/blob/eb13082fc75a0061a1605c92f536157b4ac8a5b0/flink-connector-kudu/src/main/java/org/apache/flink/connector/kudu/source/enumerator/KuduSourceEnumerator.java#L84-L86
(we have to use the matching field names)

But the getters and metric names could probably use a better name:
pendingSplits -> pendingCount
unassignedSplits -> unassignedCount.

Thanks for pointing this out.


http://gerrit.cloudera.org:8080/#/c/23246/3/java/kudu-replication/src/main/java/org/apache/kudu/replication/wrappedsource/MetricWrappedKuduEnumerator.java@76
PS3, Line 76: unassigned
> ditto - unassignedSplits?
ditto.


http://gerrit.cloudera.org:8080/#/c/23246/3/java/kudu-replication/src/main/java/org/apache/kudu/replication/wrappedsource/MetricWrappedKuduEnumerator.java@129
PS3, Line 129: long[] parsed = 
HybridTimeUtil.HTTimestampToPhysicalAndLogical(hybridTime);
             :     long epochMicros = parsed[0];
             :     return epochMicros / 1_000;
> HybridTimeUtil.HTTimestampToPhysicalAndLogical() doesn't throw any exceptio
In terms of sanity checking, that is done in the enumerator:
https://github.com/apache/flink-connector-kudu/blob/eb13082fc75a0061a1605c92f536157b4ac8a5b0/flink-connector-kudu/src/main/java/org/apache/flink/connector/kudu/source/utils/KuduSplitGenerator.java#L65-L79

Meaning that the enumerator fails first if we encounter an invalid timestamp.


http://gerrit.cloudera.org:8080/#/c/23246/3/java/kudu-replication/src/test/java/org/apache/kudu/replication/TestReplicationMetrics.java
File 
java/kudu-replication/src/test/java/org/apache/kudu/replication/TestReplicationMetrics.java:

http://gerrit.cloudera.org:8080/#/c/23246/3/java/kudu-replication/src/test/java/org/apache/kudu/replication/TestReplicationMetrics.java@51
PS3, Line 51: timeoutMi
> nit: Here and wherever applicable, make it "timeoutMillis" to be unambiguou
Done


http://gerrit.cloudera.org:8080/#/c/23246/3/java/kudu-replication/src/test/java/org/apache/kudu/replication/TestReplicationMetrics.java@98
PS3, Line 98: waitForTimestampAdvancement
> Is there a likelihood to hit timeout for timestamp advancement on slow syst
The discoveryIntervalSeconds in createDefaultJobConfig in the test base class 
is configured for 2 seconds. That means that the source will try to generate 
new splits every 2 seconds. The rationale behind the 15 seconds timeout was 
that even if it gets delayed with a couple cycles on really slow systems we 
should be fine.


http://gerrit.cloudera.org:8080/#/c/23246/3/java/kudu-replication/src/test/java/org/apache/kudu/replication/TestReplicationMetrics.java@125
PS3, Line 125: insertRowsIntoAllTypesTable
> question - I am not quite clear on "replication for table metadata", does r
pure DDL changes are not replicated. (we do not handle schema drift yet) For 
the initial version in DDL changes one has to stop the replication and complete 
the DDL on both source and sink and then resume it. This will be documented.



--
To view, visit http://gerrit.cloudera.org:8080/23246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibfbd34c707e7539ee88863399ae3061683f8bb3b
Gerrit-Change-Number: 23246
Gerrit-PatchSet: 4
Gerrit-Owner: Marton Greber <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Gabriella Lotz <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Comment-Date: Thu, 28 Aug 2025 12:56:45 +0000
Gerrit-HasComments: Yes

Reply via email to