github-actions[bot] commented on code in PR #63490:
URL: https://github.com/apache/doris/pull/63490#discussion_r3280689193
##########
fe/fe-core/src/main/java/org/apache/doris/job/extensions/insert/streaming/DataSourceConfigValidator.java:
##########
@@ -53,7 +53,8 @@ public class DataSourceConfigValidator {
DataSourceConfigKeys.SSL_MODE,
DataSourceConfigKeys.SSL_ROOTCERT,
DataSourceConfigKeys.SLOT_NAME,
- DataSourceConfigKeys.PUBLICATION_NAME
+ DataSourceConfigKeys.PUBLICATION_NAME,
+ DataSourceConfigKeys.SERVER_ID
);
Review Comment:
Adding `server_id` to the allowed key set without validating its
syntax or range lets jobs with an invalid value pass FE validation and be persisted. The reader-side
check in `ConfigUtil.resolveServerIdRange()` is only reached when MySQL source
config is built; `/api/initReader` only calls `Env.getReader()` ->
`MySqlSourceReader.initialize()`, which initializes the serializer/thread pool
and does not parse `server_id`, and initial/snapshot jobs skip reader
initialization at create time entirely. For example, `CREATE JOB ... FROM MYSQL
(... "offset" = "latest", "server_id" = "abc")` passes this validator and
`/api/initReader`, then fails later when a task calls
`getSourceSplits()`/`prepareBinlogSplit()`. That also means the new regression
cases that expect `CREATE JOB` to fail synchronously on a malformed `server_id`
will not observe the error at the point where they assert it. Please validate
`server_id` here for MySQL sources: single-value or range format, every value
`>= 1`, `start <= end`, and range width checked against `snapshot_parallelism`.
For non-MySQL sources, explicitly reject or ignore it.
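
A minimal sketch of what such a check could look like, to make the request concrete. The class and method names (`ServerIdCheck`, `validate`) are hypothetical and not part of this PR. The exact relation between range width and `snapshot_parallelism` is also an assumption: the sketch requires the width to be at least the parallelism, and the real rule should match whatever `ConfigUtil.resolveServerIdRange()` enforces.

```java
// Hypothetical helper, not existing Doris code: validates a MySQL server_id
// given either as a single value ("5400") or as a range ("5400-5408").
final class ServerIdCheck {
    private ServerIdCheck() {
    }

    static void validate(String raw, int snapshotParallelism) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("server_id must not be empty");
        }
        // Keep trailing empty strings so inputs like "5400-" are rejected.
        String[] parts = raw.trim().split("-", -1);
        if (parts.length > 2) {
            throw new IllegalArgumentException("server_id must be 'N' or 'N-M': " + raw);
        }
        long start = parsePositive(parts[0], raw);
        long end = parts.length == 2 ? parsePositive(parts[1], raw) : start;
        if (start > end) {
            throw new IllegalArgumentException("server_id range start exceeds end: " + raw);
        }
        // Assumption: each parallel snapshot reader needs its own server id,
        // so the range must contain at least snapshotParallelism values.
        long width = end - start + 1;
        if (width < snapshotParallelism) {
            throw new IllegalArgumentException("server_id range " + raw
                    + " is narrower than snapshot_parallelism " + snapshotParallelism);
        }
    }

    private static long parsePositive(String text, String raw) {
        String trimmed = text.trim();
        // Digits only: rejects "abc", "", and signed values.
        if (!trimmed.matches("\\d+")) {
            throw new IllegalArgumentException("server_id is not a positive integer: " + raw);
        }
        long value;
        try {
            value = Long.parseLong(trimmed);
        } catch (NumberFormatException e) {
            // Digit strings too long to fit in a long end up here.
            throw new IllegalArgumentException("server_id is out of range: " + raw, e);
        }
        if (value < 1) {
            throw new IllegalArgumentException("server_id must be >= 1: " + raw);
        }
        return value;
    }
}
```

With a check like this called from the validator for MySQL sources only, `"server_id" = "abc"` would be rejected at `CREATE JOB` time instead of when the first task runs.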
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]