jerryshao commented on code in PR #10203:
URL: https://github.com/apache/gravitino/pull/10203#discussion_r2904889105
##########
docs/table-maintenance-service/optimizer-cli-reference.md:
##########
@@ -0,0 +1,211 @@
+---
+title: "Optimizer CLI Reference"
+slug: /table-maintenance-service/cli-reference
Review Comment:
Slug `"/table-maintenance-service/cli-reference"` doesn't match filename
`optimizer-cli-reference.md`. Consider
`/table-maintenance-service/optimizer-cli-reference`.
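With that change, the front matter would read (sketch only, assuming the rest of the header stays as-is):
```yaml
---
title: "Optimizer CLI Reference"
slug: /table-maintenance-service/optimizer-cli-reference
---
```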
##########
docs/index.md:
##########
@@ -92,6 +92,8 @@ Gravitino currently supports the following catalogs:
If you want to operate table and partition statistics, you can refer to the
[document](./manage-statistics-in-gravitino.md).
+If you want an operations guide for automated maintenance workflows
(statistics, metrics, monitoring, and strategy jobs), see [Table Maintenance
Service (Optimizer)](./table-maintenance-service/optimizer.md). You can start
with Gravitino built-in policies and built-in job templates first, and use
custom extension interfaces when built-ins do not meet your requirements.
Review Comment:
The new sentence is quite long compared to other entries on this page.
Consider trimming to match the surrounding one-liner style, e.g.:
> If you want to automate table maintenance workflows, see [Table
Maintenance Service (Optimizer)](./table-maintenance-service/optimizer.md).
##########
docs/table-maintenance-service/optimizer.md:
##########
@@ -0,0 +1,115 @@
+---
+title: "Table Maintenance Service (Optimizer)"
+slug: /table-maintenance-service
Review Comment:
Slug `"/table-maintenance-service"` doesn't match the filename
`optimizer.md`. Other Gravitino docs use slugs that mirror their filenames.
Consider `/table-maintenance-service/optimizer` to stay consistent, or rename
the file to `table-maintenance-service.md`.
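With the first option, the front matter would read (sketch, rest of the header unchanged):
```yaml
---
title: "Table Maintenance Service (Optimizer)"
slug: /table-maintenance-service/optimizer
---
```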
##########
docs/table-maintenance-service/optimizer.md:
##########
@@ -0,0 +1,115 @@
+---
+title: "Table Maintenance Service (Optimizer)"
+slug: /table-maintenance-service
+keyword: table maintenance, optimizer, statistics, metrics, monitor
+license: This software is licensed under the Apache License version 2.
+---
+
+## What is this service?
+
+The Table Maintenance Service (Optimizer) automates table maintenance by
connecting:
+
+- Statistics and metrics collection
+- Rule evaluation and strategy recommendation
+- Job-template-based execution
+
+The CLI commands and configuration keys use the `optimizer` name.
+
+## Architecture overview
+
+The optimizer workflow consists of six parts:
+
+1. Metadata objects: catalog/schema/table in a metalake.
+2. Statistics and metrics: table/partition signals used for decision making.
+3. Policies: strategy intent, for example `system_iceberg_compaction`.
+4. Job templates: executable contracts, for example built-in Spark templates.
+5. Job executor: local or custom backend that runs submitted jobs.
+6. Status and logs: REST job state plus local staging logs.
+
+Typical data flow:
+
+1. Collect statistics and metrics for target tables.
+2. Evaluate rules and produce candidate actions.
+3. Submit jobs using a concrete template and `jobConf`.
+4. Track status and verify results on table metadata and logs.
+
+## Execution modes
+
+| Mode | Main entry | Best for | Output |
+| --- | --- | --- | --- |
+| Built-in maintenance workflow | Gravitino REST + built-in templates |
Server-side operational runs | Submitted Spark jobs and updated metadata |
+| Optimizer CLI local calculator | `gravitino-optimizer.sh` | Local
file-driven testing and batch scripts | Statistics/metrics updates and optional
submissions |
+
+Use the built-in maintenance workflow when you want policy-driven server execution.
+Use the CLI local calculator when you want to feed JSONL input directly.
+
+## Start here
+
+- Configuration first: read [Optimizer
Configuration](./optimizer-configuration.md).
+- Need custom integrations: read [Optimizer Extension
Guide](./optimizer-extension-guide.md).
+- First-time enablement: run [Optimizer Quick Start and
Verification](./optimizer-quick-start.md).
+- CLI-only usage: read [Optimizer CLI Reference](./optimizer-cli-reference.md).
+- Runtime failures or mismatched results: check [Optimizer
Troubleshooting](./optimizer-troubleshooting.md).
+
+## Lifecycle
+
+### 1. Collect
+
+Generate or ingest table and partition statistics/metrics.
+
+### 2. Evaluate
+
+Apply policies and rules to decide whether maintenance should run.
+
+### 3. Submit
+
+Pick a job template and submit a job with a concrete `jobConf`.
+
+### 4. Observe
+
+Check REST job status and validate resulting statistics, metrics, or rewritten
data files.
+
+## Configuration model
+
+| Layer | Scope | Typical keys |
+| --- | --- | --- |
+| Gravitino server config | Runtime for job manager and executor |
`gravitino.job.executor`, `gravitino.job.statusPullIntervalInMs`,
`gravitino.jobExecutor.local.sparkHome` |
+| Job submission `jobConf` | Per job run | `catalog_name`, `table_identifier`,
`spark_*`, template-specific args |
+| Optimizer CLI config | CLI commands | `gravitino.optimizer.*` in
`conf/gravitino-optimizer.conf` |
+
+## Terminology mapping
+
+| Term | Example value | Used in |
+| --- | --- | --- |
+| Policy name | `iceberg_compaction_default` | Policy identity and CLI
`--strategy-name` |
+| Policy type | `system_iceberg_compaction` | REST policy creation field
`policyType` |
+| Strategy type | `iceberg-data-compaction` | Policy content field
`strategy.type` and strategy handler config key |
+
+For strategy submission, `--strategy-name` must use the policy name, not the
policy type or strategy type.
+
+## Before you start
+
+- Prepare a running Gravitino server.
+- Ensure target metalake exists (examples use `test`).
+- Configure `SPARK_HOME` or `gravitino.jobExecutor.local.sparkHome` for Spark
templates.
+- For CLI mode, prepare `conf/gravitino-optimizer.conf` from template.
+- Use fully qualified identifiers where possible, for example
`catalog.schema.table`.
+- If your Iceberg REST backend is in-memory, metadata is reset after restart.
+
+## Success criteria
+
+- Update-stats job finishes and statistics include `custom-data-file-mse` and
`custom-delete-file-number`.
+- `submit-strategy-jobs` prints `SUBMIT` with a rewrite job ID.
+- Rewrite job log shows `Rewritten data files: <N>` where `N > 0` for
non-empty tables.
+
+## Related docs
+
+- [Optimizer Configuration](./optimizer-configuration.md)
+- [Optimizer Extension Guide](./optimizer-extension-guide.md)
+- [Optimizer Quick Start and Verification](./optimizer-quick-start.md)
+- [Optimizer CLI Reference](./optimizer-cli-reference.md)
+- [Optimizer Troubleshooting](./optimizer-troubleshooting.md)
+- [Manage policies in Gravitino](./manage-policies-in-gravitino.md)
Review Comment:
Broken relative link. This file lives at
`docs/table-maintenance-service/optimizer.md`, so
`./manage-policies-in-gravitino.md` resolves to
`docs/table-maintenance-service/manage-policies-in-gravitino.md` which doesn't
exist.
The same issue affects the three links below
(`iceberg-compaction-policy.md`, `manage-jobs-in-gravitino.md`,
`manage-statistics-in-gravitino.md`). All four should use `../` prefix:
```
- [Manage policies in Gravitino](../manage-policies-in-gravitino.md)
- [Iceberg compaction policy](../iceberg-compaction-policy.md)
- [Manage jobs in Gravitino](../manage-jobs-in-gravitino.md)
- [Manage statistics in Gravitino](../manage-statistics-in-gravitino.md)
```
Run `./gradlew :docs:build` to catch these.
##########
maintenance/optimizer/build.gradle.kts:
##########
@@ -117,6 +117,7 @@ tasks {
register("copyConfigs", Copy::class) {
from("src/main/resources")
+ include("**/*.conf", "**/*.template")
Review Comment:
Adding `**/*.template` to the include filter implies a
`gravitino-optimizer.conf.template` file exists under `src/main/resources/`.
However, no `.template` file is added in this PR.
Either:
1. Add `src/main/resources/gravitino-optimizer.conf.template` (recommended —
it's referenced in the docs), or
2. Remove the `**/*.template` glob until the template file is ready.
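If you take option 2, the register block would shrink to something like this (sketch; only the two lines visible in the diff are shown, the rest of the block is assumed unchanged):
```kotlin
register("copyConfigs", Copy::class) {
    from("src/main/resources")
    include("**/*.conf")
    // ... rest of the block unchanged
}
```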
##########
docs/table-maintenance-service/optimizer-quick-start.md:
##########
@@ -0,0 +1,237 @@
+---
+title: "Optimizer Quick Start and Verification"
+slug: /table-maintenance-service/quick-start
+keyword: table maintenance, optimizer, quick start, compaction, update stats
+license: This software is licensed under the Apache License version 2.
+---
+
+## Before running quick start
+
+- Prepare a running Gravitino server.
+- Ensure target metalake exists (examples use `test`).
+- Configure `SPARK_HOME` or `gravitino.jobExecutor.local.sparkHome` for Spark
templates.
+- If your Iceberg REST backend is in-memory, metadata is reset after restart.
+
+For full config details, see [Optimizer
Configuration](./optimizer-configuration.md).
+
+## Success criteria
+
+- Update-stats job finishes and statistics include `custom-data-file-mse` and
`custom-delete-file-number`.
+- `submit-strategy-jobs` prints `SUBMIT` with a rewrite job ID.
+- Rewrite job log shows `Rewritten data files: <N>` where `N > 0` for
non-empty tables.
+
+## Quick start A: built-in table maintenance workflow
+
+This workflow uses:
+
+- Built-in policy type: `system_iceberg_compaction`
+- Built-in update stats job template: `builtin-iceberg-update-stats`
+- Built-in rewrite data files job template:
`builtin-iceberg-rewrite-data-files`
+
+### 1. Preflight checks
+
+```bash
+# Check metalake
+curl -sS "http://localhost:8090/api/metalakes/test" | jq
+
+# Check built-in templates
+curl -sS
"http://localhost:8090/api/metalakes/test/jobs/templates?details=true" | jq
'.jobTemplates[].name'
+```
+
+Expected names include:
+
+- `builtin-iceberg-update-stats`
+- `builtin-iceberg-rewrite-data-files`
+
+If these are missing, verify that the `gravitino-jobs` JAR is in `auxlib`, then restart Gravitino.
+
+### 2. Prepare demo metadata objects
+
+Create a REST Iceberg catalog, schema, and table:
+
+```bash
+# Create catalog (ignore "already exists" errors)
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "rest_catalog",
+ "type": "RELATIONAL",
+ "comment": "Iceberg REST catalog",
+ "provider": "lakehouse-iceberg",
+ "properties": {
+ "catalog-backend": "rest",
+ "uri": "http://localhost:9001/iceberg"
+ }
+ }' \
+ http://localhost:8090/api/metalakes/test/catalogs
+
+# Create schema
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "db",
+ "comment": "optimizer demo schema",
+ "properties": {}
+ }' \
+ http://localhost:8090/api/metalakes/test/catalogs/rest_catalog/schemas
+
+# Create table
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "t1",
+ "comment": "optimizer demo table",
+ "columns": [
+ {"name": "id", "type": "integer", "nullable": true},
+ {"name": "name", "type": "string", "nullable": true}
+ ],
+ "properties": {}
+ }' \
+
http://localhost:8090/api/metalakes/test/catalogs/rest_catalog/schemas/db/tables
+```
+
+### 3. Seed demo data (recommended)
+
+Use Spark SQL to create enough small files so compaction has a visible effect:
+
+```bash
+${SPARK_HOME}/bin/spark-sql \
+ --conf spark.hadoop.fs.defaultFS=file:/// \
+ --conf spark.sql.catalog.rest_demo=org.apache.iceberg.spark.SparkCatalog \
+ --conf spark.sql.catalog.rest_demo.type=rest \
+ --conf spark.sql.catalog.rest_demo.uri=http://localhost:9001/iceberg \
+ -e "CREATE NAMESPACE IF NOT EXISTS rest_demo.db; \
+ SET spark.sql.files.maxRecordsPerFile=1000; \
+ INSERT INTO rest_demo.db.t1 \
+ SELECT id, concat('name_', CAST(id AS STRING)) FROM range(0, 100000);"
+```
+
+### 4. Create and attach built-in compaction policy
+
+```bash
+# Create policy
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "iceberg_compaction_default",
+ "comment": "Built-in iceberg compaction policy",
+ "policyType": "system_iceberg_compaction",
+ "enabled": true,
+ "content": {}
+ }' \
+ http://localhost:8090/api/metalakes/test/policies
+
+# Attach policy to table
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "policiesToAdd": ["iceberg_compaction_default"]
+ }' \
+
http://localhost:8090/api/metalakes/test/objects/table/rest_catalog.db.t1/policies
+```
+
+Verify association:
+
+```bash
+curl -sS
"http://localhost:8090/api/metalakes/test/objects/table/rest_catalog.db.t1/policies?details=true"
| jq
+```
+
+### 5. Submit built-in update stats job
+
+```bash
+update_stats_job_id=$(curl -sS -X POST -H "Accept:
application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "jobTemplateName": "builtin-iceberg-update-stats",
+ "jobConf": {
+ "catalog_name": "rest_catalog",
+ "table_identifier": "db.t1",
+ "update_mode": "all",
+ "updater_options":
"{\"gravitino_uri\":\"http://localhost:8090\",\"metalake\":\"test\",\"statistics_updater\":\"gravitino-statistics-updater\",\"metrics_updater\":\"gravitino-metrics-updater\"}",
+ "spark_conf":
"{\"spark.master\":\"local[2]\",\"spark.hadoop.fs.defaultFS\":\"file:///\"}",
+ "spark_master": "local[2]",
+ "spark_executor_instances": "1",
+ "spark_executor_cores": "1",
+ "spark_executor_memory": "1g",
+ "spark_driver_memory": "1g",
+ "catalog_type": "rest",
+ "catalog_uri": "http://localhost:9001/iceberg",
+ "warehouse_location": ""
+ }
+ }' \
+ http://localhost:8090/api/metalakes/test/jobs/runs | jq -r '.job.jobId')
+
+echo "update-stats job id: ${update_stats_job_id}"
+```
+
+### 6. Trigger rewrite submission with `submit-strategy-jobs`
+
+```bash
+# Required optimizer CLI config for strategy submission.
+# Note: --strategy-name is policy name, not strategy.type.
+cat > /tmp/gravitino-optimizer-submit.conf <<'EOF_CONF'
+gravitino.optimizer.gravitinoUri = http://localhost:8090
+gravitino.optimizer.gravitinoMetalake = test
+gravitino.optimizer.gravitinoDefaultCatalog = rest_catalog
+gravitino.optimizer.recommender.statisticsProvider =
gravitino-statistics-provider
+gravitino.optimizer.recommender.strategyProvider = gravitino-strategy-provider
+gravitino.optimizer.recommender.tableMetaProvider =
gravitino-table-metadata-provider
+gravitino.optimizer.recommender.jobSubmitter = gravitino-job-submitter
+gravitino.optimizer.strategyHandler.iceberg-data-compaction.className =
org.apache.gravitino.maintenance.optimizer.recommender.handler.compaction.CompactionStrategyHandler
+gravitino.optimizer.jobSubmitterConfig.catalog_name = rest_catalog
+gravitino.optimizer.jobSubmitterConfig.spark_master = local[2]
+gravitino.optimizer.jobSubmitterConfig.spark_executor_instances = 1
+gravitino.optimizer.jobSubmitterConfig.spark_executor_cores = 1
+gravitino.optimizer.jobSubmitterConfig.spark_executor_memory = 1g
+gravitino.optimizer.jobSubmitterConfig.spark_driver_memory = 1g
+gravitino.optimizer.jobSubmitterConfig.catalog_type = rest
+gravitino.optimizer.jobSubmitterConfig.catalog_uri =
http://localhost:9001/iceberg
+gravitino.optimizer.jobSubmitterConfig.warehouse_location =
+gravitino.optimizer.jobSubmitterConfig.spark_conf =
{"spark.master":"local[2]","spark.hadoop.fs.defaultFS":"file:///"}
+EOF_CONF
+
+# Optional: preview recommendations without submitting jobs.
+./bin/gravitino-optimizer.sh \
+ --type submit-strategy-jobs \
+ --identifiers rest_catalog.db.t1 \
+ --strategy-name iceberg_compaction_default \
+ --dry-run \
+ --limit 10 \
+ --conf-path /tmp/gravitino-optimizer-submit.conf
+
+# Submit rewrite job through strategy evaluation.
+submit_output=$(./bin/gravitino-optimizer.sh \
+ --type submit-strategy-jobs \
+ --identifiers rest_catalog.db.t1 \
+ --strategy-name iceberg_compaction_default \
+ --limit 10 \
+ --conf-path /tmp/gravitino-optimizer-submit.conf)
+echo "${submit_output}"
+
+strategy_job_id=$(echo "${submit_output}" | sed -n
's/.*jobId=\([^[:space:]]*\).*/\1/p')
Review Comment:
This `sed` extraction is fragile — if the submission fails or output format
changes, `strategy_job_id` will silently be empty, and the subsequent `curl`
status check will hit a malformed URL with no visible error.
Add a guard after this line:
```bash
[[ -z "${strategy_job_id}" ]] && echo 'ERROR: failed to extract strategy job
ID' && exit 1
```
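Putting the extraction and the guard together, a self-contained sketch (the `SUBMIT ... jobId=...` line below is a hypothetical sample; the real format comes from `gravitino-optimizer.sh` output):

```shell
# Hypothetical submit output; substitute the real gravitino-optimizer.sh output.
submit_output='SUBMIT rest_catalog.db.t1 jobId=job-42 strategy=iceberg_compaction_default'

# Extract the job ID; sed prints nothing if the pattern does not match.
strategy_job_id=$(printf '%s\n' "${submit_output}" | sed -n 's/.*jobId=\([^[:space:]]*\).*/\1/p')

# Guard: fail loudly instead of hitting a malformed URL in the later status check.
if [ -z "${strategy_job_id}" ]; then
  echo 'ERROR: failed to extract strategy job ID' >&2
  exit 1
fi
echo "extracted: ${strategy_job_id}"
```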
##########
docs/table-maintenance-service/optimizer-cli-reference.md:
##########
@@ -0,0 +1,211 @@
+---
+title: "Optimizer CLI Reference"
+slug: /table-maintenance-service/cli-reference
+keyword: table maintenance, optimizer, cli, commands, metrics, statistics
+license: This software is licensed under the Apache License version 2.
+---
+
+Use `--help` to list all commands, or `--help --type <command>` for
command-specific help.
+
+By default, the optimizer CLI loads `conf/gravitino-optimizer.conf` from the
current working
+directory. Use `--conf-path` only when you need a custom config file.
+
+## Command quick reference
+
+| Command (`--type`) | Required options | Optional options | Purpose |
+| --- | --- | --- | --- |
+| `submit-strategy-jobs` | `--identifiers`, `--strategy-name` (policy name) |
`--dry-run`, `--limit` | Recommend and optionally submit jobs |
Review Comment:
The `(policy name)` annotation in this Required options cell is inconsistent
— other rows in this table have no such annotation. The "Option field meanings"
table below already covers this. Remove the annotation here to keep the command
table clean.
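The row would then read:
```markdown
| `submit-strategy-jobs` | `--identifiers`, `--strategy-name` | `--dry-run`, `--limit` | Recommend and optionally submit jobs |
```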
##########
docs/table-maintenance-service/optimizer-configuration.md:
##########
@@ -0,0 +1,102 @@
+---
+title: "Optimizer Configuration"
+slug: /table-maintenance-service/configuration
+keyword: table maintenance, optimizer, configuration, job template, spark
+license: This software is licensed under the Apache License version 2.
+---
+
+## Configuration layers
+
+Use these layers together:
+
+| Layer | Scope | Typical keys |
+| --- | --- | --- |
+| Gravitino server config | Runtime for job manager and executor |
`gravitino.job.executor`, `gravitino.job.statusPullIntervalInMs`,
`gravitino.jobExecutor.local.sparkHome` |
+| Job submission `jobConf` | Per job run | `catalog_name`, `table_identifier`,
`spark_*`, template-specific args |
+| Optimizer CLI config | CLI commands | `gravitino.optimizer.*` in
`conf/gravitino-optimizer.conf` |
+
+## Server-side configuration
+
+Set server-level runtime behavior in `gravitino.conf`.
+
+```properties
+gravitino.job.executor=local
+gravitino.job.statusPullIntervalInMs=300000
+gravitino.jobExecutor.local.sparkHome=/path/to/spark
+```
+
+For local demo environments, you can reduce
`gravitino.job.statusPullIntervalInMs` to get faster status updates.
+
+## Built-in update stats `jobConf`
+
+Use `builtin-iceberg-update-stats` with at least these keys:
+
+```json
+{
+ "catalog_name": "rest_catalog",
+ "table_identifier": "db.t1",
+ "update_mode": "all",
+ "updater_options":
"{\"gravitino_uri\":\"http://localhost:8090\",\"metalake\":\"test\",\"statistics_updater\":\"gravitino-statistics-updater\",\"metrics_updater\":\"gravitino-metrics-updater\"}",
+ "spark_conf":
"{\"spark.master\":\"local[2]\",\"spark.hadoop.fs.defaultFS\":\"file:///\"}",
+ "spark_master": "local[2]",
+ "spark_executor_instances": "1",
+ "spark_executor_cores": "1",
+ "spark_executor_memory": "1g",
+ "spark_driver_memory": "1g",
+ "catalog_type": "rest",
+ "catalog_uri": "http://localhost:9001/iceberg",
+ "warehouse_location": ""
+}
+```
+
+## Strategy submission configuration
+
+`submit-strategy-jobs` needs the optimizer CLI config. This is a minimal
working example:
+
+```properties
+gravitino.optimizer.gravitinoUri = http://localhost:8090
+gravitino.optimizer.gravitinoMetalake = test
+gravitino.optimizer.gravitinoDefaultCatalog = rest_catalog
+gravitino.optimizer.recommender.statisticsProvider =
gravitino-statistics-provider
+gravitino.optimizer.recommender.strategyProvider = gravitino-strategy-provider
+gravitino.optimizer.recommender.tableMetaProvider =
gravitino-table-metadata-provider
+gravitino.optimizer.recommender.jobSubmitter = gravitino-job-submitter
+gravitino.optimizer.strategyHandler.iceberg-data-compaction.className =
org.apache.gravitino.maintenance.optimizer.recommender.handler.compaction.CompactionStrategyHandler
+gravitino.optimizer.jobSubmitterConfig.catalog_name = rest_catalog
+gravitino.optimizer.jobSubmitterConfig.spark_master = local[2]
+gravitino.optimizer.jobSubmitterConfig.spark_executor_instances = 1
+gravitino.optimizer.jobSubmitterConfig.spark_executor_cores = 1
+gravitino.optimizer.jobSubmitterConfig.spark_executor_memory = 1g
+gravitino.optimizer.jobSubmitterConfig.spark_driver_memory = 1g
+gravitino.optimizer.jobSubmitterConfig.catalog_type = rest
+gravitino.optimizer.jobSubmitterConfig.catalog_uri =
http://localhost:9001/iceberg
+gravitino.optimizer.jobSubmitterConfig.warehouse_location =
Review Comment:
`warehouse_location =` with no value will work in Java `.properties` files
but may confuse users. Add a brief inline comment explaining the intent:
```properties
# Leave empty for local filesystem; set to your warehouse URI for cloud/HDFS
storage
gravitino.optimizer.jobSubmitterConfig.warehouse_location =
```
##########
docs/table-maintenance-service/optimizer.md:
##########
@@ -0,0 +1,115 @@
+---
+title: "Table Maintenance Service (Optimizer)"
+slug: /table-maintenance-service
+keyword: table maintenance, optimizer, statistics, metrics, monitor
+license: This software is licensed under the Apache License version 2.
+---
+
+## What is this service?
+
+The Table Maintenance Service (Optimizer) automates table maintenance by
connecting:
+
+- Statistics and metrics collection
+- Rule evaluation and strategy recommendation
+- Job-template-based execution
+
+The CLI commands and configuration keys use the `optimizer` name.
+
+## Architecture overview
+
+The optimizer workflow consists of six parts:
+
+1. Metadata objects: catalog/schema/table in a metalake.
+2. Statistics and metrics: table/partition signals used for decision making.
+3. Policies: strategy intent, for example `system_iceberg_compaction`.
+4. Job templates: executable contracts, for example built-in Spark templates.
+5. Job executor: local or custom backend that runs submitted jobs.
+6. Status and logs: REST job state plus local staging logs.
+
+Typical data flow:
+
+1. Collect statistics and metrics for target tables.
+2. Evaluate rules and produce candidate actions.
+3. Submit jobs using a concrete template and `jobConf`.
+4. Track status and verify results on table metadata and logs.
+
+## Execution modes
+
+| Mode | Main entry | Best for | Output |
+| --- | --- | --- | --- |
+| Built-in maintenance workflow | Gravitino REST + built-in templates |
Server-side operational runs | Submitted Spark jobs and updated metadata |
+| Optimizer CLI local calculator | `gravitino-optimizer.sh` | Local
file-driven testing and batch scripts | Statistics/metrics updates and optional
submissions |
+
+Use the built-in maintenance workflow when you want policy-driven server execution.
+Use the CLI local calculator when you want to feed JSONL input directly.
+
+## Start here
+
+- Configuration first: read [Optimizer
Configuration](./optimizer-configuration.md).
+- Need custom integrations: read [Optimizer Extension
Guide](./optimizer-extension-guide.md).
+- First-time enablement: run [Optimizer Quick Start and
Verification](./optimizer-quick-start.md).
+- CLI-only usage: read [Optimizer CLI Reference](./optimizer-cli-reference.md).
+- Runtime failures or mismatched results: check [Optimizer
Troubleshooting](./optimizer-troubleshooting.md).
+
+## Lifecycle
+
+### 1. Collect
+
+Generate or ingest table and partition statistics/metrics.
+
+### 2. Evaluate
+
+Apply policies and rules to decide whether maintenance should run.
+
+### 3. Submit
+
+Pick a job template and submit a job with a concrete `jobConf`.
+
+### 4. Observe
+
+Check REST job status and validate resulting statistics, metrics, or rewritten
data files.
+
+## Configuration model
+
+| Layer | Scope | Typical keys |
+| --- | --- | --- |
+| Gravitino server config | Runtime for job manager and executor |
`gravitino.job.executor`, `gravitino.job.statusPullIntervalInMs`,
`gravitino.jobExecutor.local.sparkHome` |
+| Job submission `jobConf` | Per job run | `catalog_name`, `table_identifier`,
`spark_*`, template-specific args |
+| Optimizer CLI config | CLI commands | `gravitino.optimizer.*` in
`conf/gravitino-optimizer.conf` |
+
+## Terminology mapping
+
+| Term | Example value | Used in |
+| --- | --- | --- |
+| Policy name | `iceberg_compaction_default` | Policy identity and CLI
`--strategy-name` |
+| Policy type | `system_iceberg_compaction` | REST policy creation field
`policyType` |
+| Strategy type | `iceberg-data-compaction` | Policy content field
`strategy.type` and strategy handler config key |
+
+For strategy submission, `--strategy-name` must use the policy name, not the
policy type or strategy type.
+
+## Before you start
Review Comment:
This "Before you start" section and the "Success criteria" section below
duplicate the same blocks in `optimizer-quick-start.md`. If either copy is
updated independently, they'll diverge. Consider removing these two sections
from this overview page and linking to the quick-start instead.
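For example, the two sections could collapse to a single pointer such as (heading text is a suggestion):
```markdown
## Prerequisites and success criteria

See [Optimizer Quick Start and Verification](./optimizer-quick-start.md) for
the pre-flight checklist and success criteria.
```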
##########
docs/table-maintenance-service/optimizer-quick-start.md:
##########
@@ -0,0 +1,237 @@
+---
+title: "Optimizer Quick Start and Verification"
+slug: /table-maintenance-service/quick-start
+keyword: table maintenance, optimizer, quick start, compaction, update stats
+license: This software is licensed under the Apache License version 2.
+---
+
+## Before running quick start
+
+- Prepare a running Gravitino server.
+- Ensure target metalake exists (examples use `test`).
+- Configure `SPARK_HOME` or `gravitino.jobExecutor.local.sparkHome` for Spark
templates.
+- If your Iceberg REST backend is in-memory, metadata is reset after restart.
+
+For full config details, see [Optimizer
Configuration](./optimizer-configuration.md).
+
+## Success criteria
+
+- Update-stats job finishes and statistics include `custom-data-file-mse` and
`custom-delete-file-number`.
+- `submit-strategy-jobs` prints `SUBMIT` with a rewrite job ID.
+- Rewrite job log shows `Rewritten data files: <N>` where `N > 0` for
non-empty tables.
+
+## Quick start A: built-in table maintenance workflow
+
+This workflow uses:
+
+- Built-in policy type: `system_iceberg_compaction`
+- Built-in update stats job template: `builtin-iceberg-update-stats`
+- Built-in rewrite data files job template:
`builtin-iceberg-rewrite-data-files`
+
+### 1. Preflight checks
+
+```bash
+# Check metalake
+curl -sS "http://localhost:8090/api/metalakes/test" | jq
+
+# Check built-in templates
+curl -sS
"http://localhost:8090/api/metalakes/test/jobs/templates?details=true" | jq
'.jobTemplates[].name'
Review Comment:
Is `?details=true` a real, supported query parameter for the job templates
API? If it's not, the `jq` filter `.jobTemplates[].name` may still work, but
the unsupported parameter will be silently ignored and could confuse users.
Please verify against the OpenAPI spec.
##########
docs/table-maintenance-service/optimizer-configuration.md:
##########
@@ -0,0 +1,102 @@
+---
+title: "Optimizer Configuration"
+slug: /table-maintenance-service/configuration
Review Comment:
Slug `"/table-maintenance-service/configuration"` doesn't match filename
`optimizer-configuration.md`. Consider
`/table-maintenance-service/optimizer-configuration`.
##########
docs/table-maintenance-service/optimizer-quick-start.md:
##########
@@ -0,0 +1,237 @@
+---
+title: "Optimizer Quick Start and Verification"
+slug: /table-maintenance-service/quick-start
+keyword: table maintenance, optimizer, quick start, compaction, update stats
+license: This software is licensed under the Apache License version 2.
+---
+
+## Before running quick start
+
+- Prepare a running Gravitino server.
+- Ensure target metalake exists (examples use `test`).
+- Configure `SPARK_HOME` or `gravitino.jobExecutor.local.sparkHome` for Spark
templates.
+- If your Iceberg REST backend is in-memory, metadata is reset after restart.
+
+For full config details, see [Optimizer
Configuration](./optimizer-configuration.md).
+
+## Success criteria
+
+- Update-stats job finishes and statistics include `custom-data-file-mse` and
`custom-delete-file-number`.
+- `submit-strategy-jobs` prints `SUBMIT` with a rewrite job ID.
+- Rewrite job log shows `Rewritten data files: <N>` where `N > 0` for
non-empty tables.
+
+## Quick start A: built-in table maintenance workflow
+
+This workflow uses:
+
+- Built-in policy type: `system_iceberg_compaction`
+- Built-in update stats job template: `builtin-iceberg-update-stats`
+- Built-in rewrite data files job template:
`builtin-iceberg-rewrite-data-files`
+
+### 1. Preflight checks
+
+```bash
+# Check metalake
+curl -sS "http://localhost:8090/api/metalakes/test" | jq
+
+# Check built-in templates
+curl -sS
"http://localhost:8090/api/metalakes/test/jobs/templates?details=true" | jq
'.jobTemplates[].name'
+```
+
+Expected names include:
+
+- `builtin-iceberg-update-stats`
+- `builtin-iceberg-rewrite-data-files`
+
+If these are missing, verify that the `gravitino-jobs` JAR is in `auxlib`, then restart Gravitino.
+
+### 2. Prepare demo metadata objects
+
+Create a REST Iceberg catalog, schema, and table:
+
+```bash
+# Create catalog (ignore "already exists" errors)
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "rest_catalog",
+ "type": "RELATIONAL",
+ "comment": "Iceberg REST catalog",
+ "provider": "lakehouse-iceberg",
+ "properties": {
+ "catalog-backend": "rest",
+ "uri": "http://localhost:9001/iceberg"
+ }
+ }' \
+ http://localhost:8090/api/metalakes/test/catalogs
+
+# Create schema
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "db",
+ "comment": "optimizer demo schema",
+ "properties": {}
+ }' \
+ http://localhost:8090/api/metalakes/test/catalogs/rest_catalog/schemas
+
+# Create table
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "t1",
+ "comment": "optimizer demo table",
+ "columns": [
+ {"name": "id", "type": "integer", "nullable": true},
+ {"name": "name", "type": "string", "nullable": true}
+ ],
+ "properties": {}
+ }' \
+
http://localhost:8090/api/metalakes/test/catalogs/rest_catalog/schemas/db/tables
+```
+
+### 3. Seed demo data (recommended)
+
+Use Spark SQL to create enough small files so compaction has a visible effect:
+
+```bash
+${SPARK_HOME}/bin/spark-sql \
+ --conf spark.hadoop.fs.defaultFS=file:/// \
+ --conf spark.sql.catalog.rest_demo=org.apache.iceberg.spark.SparkCatalog \
+ --conf spark.sql.catalog.rest_demo.type=rest \
+ --conf spark.sql.catalog.rest_demo.uri=http://localhost:9001/iceberg \
+ -e "CREATE NAMESPACE IF NOT EXISTS rest_demo.db; \
+ SET spark.sql.files.maxRecordsPerFile=1000; \
+ INSERT INTO rest_demo.db.t1 \
+ SELECT id, concat('name_', CAST(id AS STRING)) FROM range(0, 100000);"
+```
+
+### 4. Create and attach built-in compaction policy
+
+```bash
+# Create policy
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "iceberg_compaction_default",
+ "comment": "Built-in iceberg compaction policy",
+ "policyType": "system_iceberg_compaction",
+ "enabled": true,
+ "content": {}
+ }' \
+ http://localhost:8090/api/metalakes/test/policies
+
+# Attach policy to table
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "policiesToAdd": ["iceberg_compaction_default"]
+ }' \
+
http://localhost:8090/api/metalakes/test/objects/table/rest_catalog.db.t1/policies
+```
+
+Verify association:
+
+```bash
+curl -sS
"http://localhost:8090/api/metalakes/test/objects/table/rest_catalog.db.t1/policies?details=true"
| jq
+```
+
+### 5. Submit built-in update stats job
+
+```bash
+update_stats_job_id=$(curl -sS -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "jobTemplateName": "builtin-iceberg-update-stats",
+ "jobConf": {
+ "catalog_name": "rest_catalog",
+ "table_identifier": "db.t1",
+ "update_mode": "all",
+ "updater_options": "{\"gravitino_uri\":\"http://localhost:8090\",\"metalake\":\"test\",\"statistics_updater\":\"gravitino-statistics-updater\",\"metrics_updater\":\"gravitino-metrics-updater\"}",
+ "spark_conf": "{\"spark.master\":\"local[2]\",\"spark.hadoop.fs.defaultFS\":\"file:///\"}",
+ "spark_master": "local[2]",
+ "spark_executor_instances": "1",
+ "spark_executor_cores": "1",
+ "spark_executor_memory": "1g",
+ "spark_driver_memory": "1g",
+ "catalog_type": "rest",
+ "catalog_uri": "http://localhost:9001/iceberg",
+ "warehouse_location": ""
+ }
+ }' \
+ http://localhost:8090/api/metalakes/test/jobs/runs | jq -r '.job.jobId')
+
+echo "update-stats job id: ${update_stats_job_id}"
+```
+
+### 6. Trigger rewrite submission with `submit-strategy-jobs`
+
+```bash
+# Required optimizer CLI config for strategy submission.
+# Note: --strategy-name is the policy name, not the strategy.type.
+cat > /tmp/gravitino-optimizer-submit.conf <<'EOF_CONF'
+gravitino.optimizer.gravitinoUri = http://localhost:8090
+gravitino.optimizer.gravitinoMetalake = test
+gravitino.optimizer.gravitinoDefaultCatalog = rest_catalog
+gravitino.optimizer.recommender.statisticsProvider = gravitino-statistics-provider
+gravitino.optimizer.recommender.strategyProvider = gravitino-strategy-provider
+gravitino.optimizer.recommender.tableMetaProvider = gravitino-table-metadata-provider
+gravitino.optimizer.recommender.jobSubmitter = gravitino-job-submitter
+gravitino.optimizer.strategyHandler.iceberg-data-compaction.className = org.apache.gravitino.maintenance.optimizer.recommender.handler.compaction.CompactionStrategyHandler
+gravitino.optimizer.jobSubmitterConfig.catalog_name = rest_catalog
+gravitino.optimizer.jobSubmitterConfig.spark_master = local[2]
+gravitino.optimizer.jobSubmitterConfig.spark_executor_instances = 1
+gravitino.optimizer.jobSubmitterConfig.spark_executor_cores = 1
+gravitino.optimizer.jobSubmitterConfig.spark_executor_memory = 1g
+gravitino.optimizer.jobSubmitterConfig.spark_driver_memory = 1g
+gravitino.optimizer.jobSubmitterConfig.catalog_type = rest
+gravitino.optimizer.jobSubmitterConfig.catalog_uri = http://localhost:9001/iceberg
+gravitino.optimizer.jobSubmitterConfig.warehouse_location =
+gravitino.optimizer.jobSubmitterConfig.spark_conf = {"spark.master":"local[2]","spark.hadoop.fs.defaultFS":"file:///"}
+EOF_CONF
+
+# Optional: preview recommendations without submitting jobs.
+./bin/gravitino-optimizer.sh \
+ --type submit-strategy-jobs \
+ --identifiers rest_catalog.db.t1 \
+ --strategy-name iceberg_compaction_default \
+ --dry-run \
+ --limit 10 \
+ --conf-path /tmp/gravitino-optimizer-submit.conf
+
+# Submit rewrite job through strategy evaluation.
+submit_output=$(./bin/gravitino-optimizer.sh \
+ --type submit-strategy-jobs \
+ --identifiers rest_catalog.db.t1 \
+ --strategy-name iceberg_compaction_default \
+ --limit 10 \
+ --conf-path /tmp/gravitino-optimizer-submit.conf)
+echo "${submit_output}"
+
+strategy_job_id=$(echo "${submit_output}" | sed -n 's/.*jobId=\([^[:space:]]*\).*/\1/p')
+echo "strategy rewrite job id: ${strategy_job_id}"
+```
+
+### 7. Track status and verify results
+
+```bash
+# Check job status by id
+curl -sS "http://localhost:8090/api/metalakes/test/jobs/runs/${update_stats_job_id}" | jq
+curl -sS "http://localhost:8090/api/metalakes/test/jobs/runs/${strategy_job_id}" | jq
+
+# Verify table statistics after update-stats
+curl -sS "http://localhost:8090/api/metalakes/test/objects/table/rest_catalog.db.t1/statistics" | jq
+
+# Verify the rewrite actually rewrote files (the reported counts should be > 0 for a non-empty table)
+grep -E "Rewritten data files|Added data files|completed successfully" \
+  "/tmp/gravitino/jobs/staging/test/builtin-iceberg-rewrite-data-files/${strategy_job_id}/error.log"
Review Comment:
The staging log path `/tmp/gravitino/jobs/staging/...` is hardcoded. This
path is controlled by a Gravitino config key — please document which key sets
it and what the default is, so users know how to find logs if they've
customised the staging directory.
##########
docs/table-maintenance-service/optimizer-extension-guide.md:
##########
@@ -0,0 +1,128 @@
+---
+title: "Optimizer Extension Guide"
+slug: /table-maintenance-service/extension-guide
+keyword: table maintenance, optimizer, extension, provider, ServiceLoader
+license: This software is licensed under the Apache License version 2.
+---
+
+Use this guide when built-in optimizer components do not match your environment and you need custom implementations.
+
+## Extension model
+
+Optimizer supports three loading patterns:
+
+1. `Provider` SPI (`name()` + `initialize()`): loaded by `ServiceLoader` and selected by config value.
+2. Class-name mapping for strategy handlers and job adapters.
+3. Typed SPI for `StatisticsCalculator` and `MetricsEvaluator`.
+
+## Extension points and config keys
+
+| Area | Interface / type | Config key | Loading mode |
+| --- | --- | --- | --- |
+| Recommender statistics | `StatisticsProvider` | `gravitino.optimizer.recommender.statisticsProvider` | `Provider` SPI by `name()` |
+| Recommender strategy source | `StrategyProvider` | `gravitino.optimizer.recommender.strategyProvider` | `Provider` SPI by `name()` |
+| Recommender table metadata | `TableMetadataProvider` | `gravitino.optimizer.recommender.tableMetaProvider` | `Provider` SPI by `name()` |
+| Recommender job submission | `JobSubmitter` | `gravitino.optimizer.recommender.jobSubmitter` | `Provider` SPI by `name()` |
+| Strategy evaluation logic | `StrategyHandler` | `gravitino.optimizer.strategyHandler.<strategyType>.className` | Reflection by class name |
+| Job template adaptation | `GravitinoJobAdapter` | `gravitino.optimizer.jobAdapter.<jobTemplate>.className` | Reflection by class name |
+| Update statistics sink | `StatisticsUpdater` | `gravitino.optimizer.updater.statisticsUpdater` | `Provider` SPI by `name()` |
+| Update metrics sink | `MetricsUpdater` | `gravitino.optimizer.updater.metricsUpdater` | `Provider` SPI by `name()` |
+| Monitor metrics source | `MetricsProvider` | `gravitino.optimizer.monitor.metricsProvider` | `Provider` SPI by `name()` |
+| Monitor table-job relation | `TableJobRelationProvider` | `gravitino.optimizer.monitor.tableJobRelationProvider` | `Provider` SPI by `name()` |
+| Monitor evaluator | `MetricsEvaluator` | `gravitino.optimizer.monitor.metricsEvaluator` | Typed SPI (`ServiceLoader<MetricsEvaluator>`) |
+| Monitor callbacks | `MonitorCallback` | `gravitino.optimizer.monitor.callbacks` | `Provider` SPI by `name()` (comma-separated) |
+| CLI calculator | `StatisticsCalculator` | CLI `--calculator-name` | Typed SPI (`ServiceLoader<StatisticsCalculator>`) |
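
For example, a custom statistics provider would be selected by setting the corresponding config key to the provider's `name()` value (here `my-statistics-provider`, taken from the sketch later in this guide, not a built-in):

```properties
gravitino.optimizer.recommender.statisticsProvider = my-statistics-provider
```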
+
+## Implement a custom provider
+
+Most extension points use `Provider`:
+
+```java
+public class MyStatisticsProvider implements StatisticsProvider {
+ @Override
+ public String name() {
+ return "my-statistics-provider";
+ }
+
+ @Override
+ public void initialize(OptimizerEnv optimizerEnv) {
+ // Initialize clients/resources from optimizer config.
+ }
+
+ @Override
+ public void close() throws Exception {}
+}
+```
+
+Requirements:
+
+- Keep a stable `name()` value; config resolves by this name (case-insensitive).
+- Provide a public no-arg constructor.
+- Implement `initialize(OptimizerEnv)` and `close()` lifecycle correctly.
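
The name-based selection described above can be sketched as follows. This is an illustrative, self-contained example, not the actual Gravitino loading code: the `Provider` interface is reduced to `name()` only, and `findProvider` is a hypothetical helper standing in for the real resolution logic.

```java
import java.util.List;
import java.util.NoSuchElementException;

public class ProviderResolutionSketch {

  // Reduced stand-in for the optimizer's Provider SPI (name() only).
  interface Provider {
    String name();
  }

  // Example implementation mirroring the MyStatisticsProvider sketch above.
  static class MyStatisticsProvider implements Provider {
    @Override
    public String name() {
      return "my-statistics-provider";
    }
  }

  // In the real service the candidates would come from
  // ServiceLoader.load(Provider.class); a fixed list keeps the sketch self-contained.
  static Provider findProvider(List<Provider> candidates, String configuredName) {
    return candidates.stream()
        .filter(p -> p.name().equalsIgnoreCase(configuredName)) // case-insensitive match
        .findFirst()
        .orElseThrow(
            () -> new NoSuchElementException("No provider named: " + configuredName));
  }

  public static void main(String[] args) {
    // Config values are matched case-insensitively against name().
    Provider chosen =
        findProvider(List.of(new MyStatisticsProvider()), "My-Statistics-Provider");
    System.out.println(chosen.name()); // prints "my-statistics-provider"
  }
}
```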
+
+## Register with ServiceLoader
+
+### For `Provider` implementations
+
+Create file:
+
+`META-INF/services/org.apache.gravitino.maintenance.optimizer.api.common.Provider`
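
For illustration, if the `MyStatisticsProvider` sketch above lived in a hypothetical package such as `com.example.optimizer`, the services file would list one implementation class per line:

```text
com.example.optimizer.MyStatisticsProvider
```

(The package name here is an assumption for the example; use your implementation's fully qualified class name.)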
Review Comment:
Please verify all three `META-INF/services` fully-qualified interface names
against the actual source:
- `org.apache.gravitino.maintenance.optimizer.api.common.Provider`
-
`org.apache.gravitino.maintenance.optimizer.api.updater.StatisticsCalculator`
- `org.apache.gravitino.maintenance.optimizer.api.monitor.MetricsEvaluator`
A wrong package path here causes silent `ServiceLoader` failures for users
trying to extend the optimizer.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]