This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/spark-kubernetes-operator.git
The following commit(s) were added to refs/heads/main by this push:
new 45cd118 [SPARK-54073] Improve `ConfOptionDocGenerator` to generate a sorted doc by config key
45cd118 is described below
commit 45cd1181b575a1644e7597484168f113b0f0839c
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Tue Oct 28 23:01:10 2025 -0700
[SPARK-54073] Improve `ConfOptionDocGenerator` to generate a sorted doc by config key
### What changes were proposed in this pull request?
This PR aims to improve `ConfOptionDocGenerator` to generate a sorted
config doc by key.
### Why are the changes needed?
To keep `docs/config_properties.md` in alphabetical order at all times.
### Does this PR introduce _any_ user-facing change?
No `Spark Operator` behavior change.
### How was this patch tested?
Pass the CIs and check the generated doc manually.
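For illustration only, a small standalone check (hypothetical helper, not part of this patch) captures what the manual inspection verifies, assuming data rows of the generated table start with `| spark.`:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical verifier, not part of this patch: asserts that the Key column of
// docs/config_properties.md is in alphabetical order after regeneration.
public class CheckConfigDocSorted {
  public static void main(String[] args) throws Exception {
    List<String> keys = Files.readAllLines(Path.of("docs/config_properties.md")).stream()
        .filter(line -> line.startsWith("| spark."))   // keep only data rows (assumed prefix)
        .map(line -> line.split("\\|")[1].trim())      // extract the Key column
        .collect(Collectors.toList());
    List<String> sorted = keys.stream().sorted().collect(Collectors.toList());
    if (!keys.equals(sorted)) {
      throw new IllegalStateException("config_properties.md is not sorted by config key");
    }
    System.out.println("OK: " + keys.size() + " config keys are in alphabetical order");
  }
}
```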
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #407 from dongjoon-hyun/SPARK-54073.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.../k8s/operator/utils/ConfOptionDocGenerator.java | 1 +
.../apache/spark/k8s/operator/utils/DocTable.java | 5 ++
docs/config_properties.md | 58 +++++++++++-----------
3 files changed, 35 insertions(+), 29 deletions(-)
diff --git a/build-tools/docs-utils/src/main/java/org/apache/spark/k8s/operator/utils/ConfOptionDocGenerator.java b/build-tools/docs-utils/src/main/java/org/apache/spark/k8s/operator/utils/ConfOptionDocGenerator.java
index b2c3c63..d8f5fff 100644
--- a/build-tools/docs-utils/src/main/java/org/apache/spark/k8s/operator/utils/ConfOptionDocGenerator.java
+++ b/build-tools/docs-utils/src/main/java/org/apache/spark/k8s/operator/utils/ConfOptionDocGenerator.java
@@ -68,6 +68,7 @@ public class ConfOptionDocGenerator {
conf.getDescription()));
}
}
+ table.sort();
table.flush(printWriter);
printWriter.close();
}
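For context on the one-line change above, here is a self-contained sketch of the add-rows / sort / flush pattern; the rows are made up, while the real generator derives them from the operator's config option definitions:

```java
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: rows are collected in declaration order, sorted by the
// first column (the config key), then emitted as markdown table rows.
public class SortedDocSketch {
  public static void main(String[] args) {
    List<List<String>> rows = new ArrayList<>();
    rows.add(List.of("spark.logConf", "Boolean", "false"));
    rows.add(List.of("spark.kubernetes.operator.name", "String", "spark-kubernetes-operator"));
    rows.add(List.of("spark.kubernetes.operator.metrics.port", "Integer", "19090"));

    // Equivalent of table.sort(): order rows alphabetically by the config key.
    rows.sort(Comparator.comparing(row -> row.get(0)));

    // Equivalent of table.flush(printWriter): one markdown row per entry;
    // spark.logConf now comes last, as in the regenerated docs/config_properties.md.
    PrintWriter writer = new PrintWriter(System.out);
    rows.forEach(row -> writer.println("| " + String.join(" | ", row) + " |"));
    writer.flush();
  }
}
```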
diff --git a/build-tools/docs-utils/src/main/java/org/apache/spark/k8s/operator/utils/DocTable.java b/build-tools/docs-utils/src/main/java/org/apache/spark/k8s/operator/utils/DocTable.java
index 569f012..bb511d9 100644
--- a/build-tools/docs-utils/src/main/java/org/apache/spark/k8s/operator/utils/DocTable.java
+++ b/build-tools/docs-utils/src/main/java/org/apache/spark/k8s/operator/utils/DocTable.java
@@ -22,6 +22,7 @@ package org.apache.spark.k8s.operator.utils;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Collections;
+import java.util.Comparator;
import java.util.List;
import lombok.Builder;
@@ -45,6 +46,10 @@ public class DocTable {
rows.add(row);
}
+ public void sort() {
+ rows.sort(Comparator.comparing(list -> list.get(0)));
+ }
+
public void flush(PrintWriter writer) {
writer.println(joinRow(headers));
writer.println(joinRow(Collections.nCopies(columns, HEADER_SEPARATOR)));
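For reference, the new `DocTable#sort` orders rows by the natural `String` order of the first cell, and `List#sort` is stable, so rows with identical keys keep their relative order. A tiny illustrative check, assuming rows hold `List<String>` entries as in `DocTable`:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: demonstrates the ordering semantics behind DocTable.sort().
public class SortSemanticsDemo {
  public static void main(String[] args) {
    List<List<String>> rows = new ArrayList<>(List.of(
        List.of("spark.b", "declared first"),
        List.of("spark.a", "declared second"),
        List.of("spark.b", "declared third")));

    rows.sort(Comparator.comparing(list -> list.get(0)));

    // Prints spark.a first, then the two spark.b rows in their original relative order,
    // because List.sort is guaranteed to be a stable sort.
    rows.forEach(System.out::println);
  }
}
```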
diff --git a/docs/config_properties.md b/docs/config_properties.md
index e0552bd..534d946 100644
--- a/docs/config_properties.md
+++ b/docs/config_properties.md
@@ -2,42 +2,42 @@
# Spark Operator Config Properties
| Key | Type | Default Value | Allow Hot Reloading | Description |
| --- | --- | --- | --- | --- |
- | spark.logConf | Boolean | false | true | When enabled, operator will print configurations |
- | spark.kubernetes.operator.name | String | spark-kubernetes-operator | false | Name of the operator. |
- | spark.kubernetes.operator.namespace | String | default | false | Namespace that operator is deployed within. |
- | spark.kubernetes.operator.watchedNamespaces | String | default | true | Comma-separated list of namespaces that the operator would be watching for Spark resources. If set to '*', operator would watch all namespaces. |
- | spark.kubernetes.operator.terminateOnInformerFailureEnabled | Boolean | false | false | Enable to indicate informer errors should stop operator startup. If disabled, operator startup will ignore recoverable errors, caused for example by RBAC issues and will retry periodically. |
- | spark.kubernetes.operator.reconciler.terminationTimeoutSeconds | Integer | 30 | false | Grace period for operator shutdown before reconciliation threads are killed. |
- | spark.kubernetes.operator.reconciler.parallelism | Integer | 50 | false | Thread pool size for Spark Operator reconcilers. Unbounded pool would be used if set to non-positive number. |
- | spark.kubernetes.operator.reconciler.foregroundRequestTimeoutSeconds | Long | 60 | true | Timeout (in seconds) for requests made to API server. This applies only to foreground requests. |
- | spark.kubernetes.operator.reconciler.intervalSeconds | Long | 120 | true | Interval (in seconds, non-negative) to reconcile Spark applications. Note that reconciliation is always expected to be triggered when app spec / status is updated. This interval controls the reconcile behavior of operator reconciliation even when there's no update on SparkApplication, e.g. to determine whether a hanging app needs to be proactively terminated. Thus this is recommended to set to above 2 minutes t [...]
- | spark.kubernetes.operator.reconciler.trimStateTransitionHistoryEnabled | Boolean | true | true | When enabled, operator would trim state transition history when a new attempt starts, keeping previous attempt summary only. |
- | spark.kubernetes.operator.reconciler.appStatusListenerClassNames | String | | false | Comma-separated names of SparkAppStatusListener class implementations |
- | spark.kubernetes.operator.reconciler.clusterStatusListenerClassNames | String | | false | Comma-separated names of SparkClusterStatusListener class implementations |
- | spark.kubernetes.operator.dynamicConfig.enabled | Boolean | false | false | When enabled, operator would use config map as source of truth for config property override. The config map need to be created in spark.kubernetes.operator.namespace, and labeled with operator name. |
- | spark.kubernetes.operator.dynamicConfig.selector | String | app.kubernetes.io/name=spark-kubernetes-operator,app.kubernetes.io/component=operator-dynamic-config-overrides | false | The selector str applied to dynamic config map. |
- | spark.kubernetes.operator.dynamicConfig.reconcilerParallelism | Integer | 1 | false | Parallelism for dynamic config reconciler. Unbounded pool would be used if set to non-positive number. |
- | spark.kubernetes.operator.reconciler.rateLimiter.refreshPeriodSeconds | Integer | 15 | false | Operator rate limiter refresh period(in seconds) for each resource. |
- | spark.kubernetes.operator.reconciler.rateLimiter.maxLoopForPeriod | Integer | 5 | false | Max number of reconcile loops triggered within the rate limiter refresh period for each resource. Setting the limit <= 0 disables the limiter. |
- | spark.kubernetes.operator.reconciler.retry.initialIntervalSeconds | Integer | 5 | false | Initial interval(in seconds) of retries on unhandled controller errors. |
- | spark.kubernetes.operator.reconciler.retry.intervalMultiplier | Double | 1.5 | false | Interval multiplier of retries on unhandled controller errors. Setting this to 1 for linear retry. |
- | spark.kubernetes.operator.reconciler.retry.maxIntervalSeconds | Integer | -1 | false | Max interval(in seconds) of retries on unhandled controller errors. Set to non-positive for unlimited. |
- | spark.kubernetes.operator.api.retryMaxAttempts | Integer | 15 | false | Max attempts of retries on unhandled controller errors. Setting this to non-positive value means no retry. |
| spark.kubernetes.operator.api.retryAttemptAfterSeconds | Long | 1 | false | Default time (in seconds) to wait till next request. This would be used if server does not set Retry-After in response. Setting this to non-positive number means immediate retry. |
- | spark.kubernetes.operator.api.statusPatchMaxAttempts | Long | 3 | false | Maximal number of retry attempts of requests to k8s server for resource status update. This would be performed on top of k8s client spark.kubernetes.operator.retry.maxAttempts to overcome potential conflicting update on the same SparkApplication. This should be positive number. |
+ | spark.kubernetes.operator.api.retryMaxAttempts | Integer | 15 | false | Max attempts of retries on unhandled controller errors. Setting this to non-positive value means no retry. |
| spark.kubernetes.operator.api.secondaryResourceCreateMaxAttempts | Long | 3 | false | Maximal number of retry attempts of requesting secondary resource for Spark application. This would be performed on top of k8s client spark.kubernetes.operator.retry.maxAttempts to overcome potential conflicting reconcile on the same SparkApplication. This should be positive number |
- | spark.kubernetes.operator.metrics.josdkMetricsEnabled | Boolean | true | false | When enabled, the josdk metrics will be added in metrics source and configured for operator. |
- | spark.kubernetes.operator.metrics.clientMetricsEnabled | Boolean | true | false | Enable KubernetesClient metrics for measuring the HTTP traffic to the Kubernetes API Server. Since the metrics is collected via interceptors, can be disabled when opt in customized interceptors. |
- | spark.kubernetes.operator.metrics.clientMetricsGroupByResponseCodeEnabled | Boolean | true | false | When enabled, additional metrics group by http response code group(1xx, 2xx, 3xx, 4xx, 5xx) received from API server will be added. Users can disable it when their monitoring system can combine lower level kubernetes.client.http.response.<3-digit-response-code> metrics. |
- | spark.kubernetes.operator.metrics.port | Integer | 19090 | false | The port used for checking metrics |
- | spark.kubernetes.operator.metrics.prometheusTextBasedFormatEnabled | Boolean | true | false | Whether or not to enable text-based format for Prometheus 2.0, as recommended by https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format |
- | spark.kubernetes.operator.metrics.sanitizePrometheusMetricsNameEnabled | Boolean | true | false | Whether or not to enable automatic name sanitizing for all metrics based on best-practice guide from Prometheus https://prometheus.io/docs/practices/naming/ |
+ | spark.kubernetes.operator.api.statusPatchMaxAttempts | Long | 3 | false | Maximal number of retry attempts of requests to k8s server for resource status update. This would be performed on top of k8s client spark.kubernetes.operator.retry.maxAttempts to overcome potential conflicting update on the same SparkApplication. This should be positive number. |
+ | spark.kubernetes.operator.dynamicConfig.enabled | Boolean | false | false | When enabled, operator would use config map as source of truth for config property override. The config map need to be created in spark.kubernetes.operator.namespace, and labeled with operator name. |
+ | spark.kubernetes.operator.dynamicConfig.reconcilerParallelism | Integer | 1 | false | Parallelism for dynamic config reconciler. Unbounded pool would be used if set to non-positive number. |
+ | spark.kubernetes.operator.dynamicConfig.selector | String | app.kubernetes.io/name=spark-kubernetes-operator,app.kubernetes.io/component=operator-dynamic-config-overrides | false | The selector str applied to dynamic config map. |
| spark.kubernetes.operator.health.probePort | Integer | 19091 | false | The port used for health/readiness check probe status. |
| spark.kubernetes.operator.health.sentinelExecutorPoolSize | Integer | 3 | false | Size of executor service in Sentinel Managers to check the health of sentinel resources. |
| spark.kubernetes.operator.health.sentinelResourceReconciliationDelaySeconds | Integer | 60 | true | Allowed max time(seconds) between spec update and reconciliation for sentinel resources. |
| spark.kubernetes.operator.leaderElection.enabled | Boolean | false | false | Enable leader election for the operator to allow running standby instances. When this is disabled, only one operator instance is expected to be up and running at any time (replica = 1) to avoid race condition. |
- | spark.kubernetes.operator.leaderElection.leaseName | String | spark-operator-lease | false | Leader election lease name, must be unique for leases in the same namespace. |
| spark.kubernetes.operator.leaderElection.leaseDurationSeconds | Integer | 180 | false | Leader election lease duration in seconds, non-negative. |
+ | spark.kubernetes.operator.leaderElection.leaseName | String | spark-operator-lease | false | Leader election lease name, must be unique for leases in the same namespace. |
| spark.kubernetes.operator.leaderElection.renewDeadlineSeconds | Integer | 120 | false | Leader election renew deadline in seconds, non-negative. This needs to be smaller than the lease duration to allow current leader renew the lease before lease expires. |
| spark.kubernetes.operator.leaderElection.retryPeriodSeconds | Integer | 5 | false | Leader election retry period in seconds, non-negative. |
+ | spark.kubernetes.operator.metrics.clientMetricsEnabled | Boolean | true | false | Enable KubernetesClient metrics for measuring the HTTP traffic to the Kubernetes API Server. Since the metrics is collected via interceptors, can be disabled when opt in customized interceptors. |
+ | spark.kubernetes.operator.metrics.clientMetricsGroupByResponseCodeEnabled | Boolean | true | false | When enabled, additional metrics group by http response code group(1xx, 2xx, 3xx, 4xx, 5xx) received from API server will be added. Users can disable it when their monitoring system can combine lower level kubernetes.client.http.response.<3-digit-response-code> metrics. |
+ | spark.kubernetes.operator.metrics.josdkMetricsEnabled | Boolean | true | false | When enabled, the josdk metrics will be added in metrics source and configured for operator. |
+ | spark.kubernetes.operator.metrics.port | Integer | 19090 | false | The port used for checking metrics |
+ | spark.kubernetes.operator.metrics.prometheusTextBasedFormatEnabled | Boolean | true | false | Whether or not to enable text-based format for Prometheus 2.0, as recommended by https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format |
+ | spark.kubernetes.operator.metrics.sanitizePrometheusMetricsNameEnabled | Boolean | true | false | Whether or not to enable automatic name sanitizing for all metrics based on best-practice guide from Prometheus https://prometheus.io/docs/practices/naming/ |
+ | spark.kubernetes.operator.name | String | spark-kubernetes-operator | false | Name of the operator. |
+ | spark.kubernetes.operator.namespace | String | default | false | Namespace that operator is deployed within. |
+ | spark.kubernetes.operator.reconciler.appStatusListenerClassNames | String | | false | Comma-separated names of SparkAppStatusListener class implementations |
+ | spark.kubernetes.operator.reconciler.clusterStatusListenerClassNames | String | | false | Comma-separated names of SparkClusterStatusListener class implementations |
+ | spark.kubernetes.operator.reconciler.foregroundRequestTimeoutSeconds | Long | 60 | true | Timeout (in seconds) for requests made to API server. This applies only to foreground requests. |
+ | spark.kubernetes.operator.reconciler.intervalSeconds | Long | 120 | true | Interval (in seconds, non-negative) to reconcile Spark applications. Note that reconciliation is always expected to be triggered when app spec / status is updated. This interval controls the reconcile behavior of operator reconciliation even when there's no update on SparkApplication, e.g. to determine whether a hanging app needs to be proactively terminated. Thus this is recommended to set to above 2 minutes t [...]
+ | spark.kubernetes.operator.reconciler.parallelism | Integer | 50 | false | Thread pool size for Spark Operator reconcilers. Unbounded pool would be used if set to non-positive number. |
+ | spark.kubernetes.operator.reconciler.rateLimiter.maxLoopForPeriod | Integer | 5 | false | Max number of reconcile loops triggered within the rate limiter refresh period for each resource. Setting the limit <= 0 disables the limiter. |
+ | spark.kubernetes.operator.reconciler.rateLimiter.refreshPeriodSeconds | Integer | 15 | false | Operator rate limiter refresh period(in seconds) for each resource. |
+ | spark.kubernetes.operator.reconciler.retry.initialIntervalSeconds | Integer | 5 | false | Initial interval(in seconds) of retries on unhandled controller errors. |
+ | spark.kubernetes.operator.reconciler.retry.intervalMultiplier | Double | 1.5 | false | Interval multiplier of retries on unhandled controller errors. Setting this to 1 for linear retry. |
+ | spark.kubernetes.operator.reconciler.retry.maxIntervalSeconds | Integer | -1 | false | Max interval(in seconds) of retries on unhandled controller errors. Set to non-positive for unlimited. |
+ | spark.kubernetes.operator.reconciler.terminationTimeoutSeconds | Integer | 30 | false | Grace period for operator shutdown before reconciliation threads are killed. |
+ | spark.kubernetes.operator.reconciler.trimStateTransitionHistoryEnabled | Boolean | true | true | When enabled, operator would trim state transition history when a new attempt starts, keeping previous attempt summary only. |
+ | spark.kubernetes.operator.terminateOnInformerFailureEnabled | Boolean | false | false | Enable to indicate informer errors should stop operator startup. If disabled, operator startup will ignore recoverable errors, caused for example by RBAC issues and will retry periodically. |
+ | spark.kubernetes.operator.watchedNamespaces | String | default | true | Comma-separated list of namespaces that the operator would be watching for Spark resources. If set to '*', operator would watch all namespaces. |
+ | spark.logConf | Boolean | false | true | When enabled, operator will print configurations |
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]