FANNG1 commented on code in PR #10203:
URL: https://github.com/apache/gravitino/pull/10203#discussion_r2905165957


##########
docs/index.md:
##########
@@ -92,6 +92,8 @@ Gravitino currently supports the following catalogs:
 
 If you want to operate table and partition statistics, you can refer to the 
[document](./manage-statistics-in-gravitino.md).
 
+If you want an operations guide for automated maintenance workflows (statistics, metrics, monitoring, and strategy jobs), see [Table Maintenance Service (Optimizer)](./table-maintenance-service/optimizer.md). Start with Gravitino's built-in policies and job templates, and use the custom extension interfaces when the built-ins do not meet your requirements.

Review Comment:
   Updated. I shortened the index entry to a concise two-line style and kept 
the extension guidance brief.



##########
docs/table-maintenance-service/optimizer.md:
##########
@@ -0,0 +1,115 @@
+---
+title: "Table Maintenance Service (Optimizer)"
+slug: /table-maintenance-service

Review Comment:
   Fixed. The overview page slug is now  to match .



##########
docs/table-maintenance-service/optimizer.md:
##########
@@ -0,0 +1,115 @@
+---
+title: "Table Maintenance Service (Optimizer)"
+slug: /table-maintenance-service
+keyword: table maintenance, optimizer, statistics, metrics, monitor
+license: This software is licensed under the Apache License version 2.
+---
+
+## What is this service
+
+The Table Maintenance Service (Optimizer) automates table maintenance by 
connecting:
+
+- Statistics and metrics collection
+- Rule evaluation and strategy recommendation
+- Job-template-based execution
+
+The CLI commands and configuration keys use the `optimizer` name.
+
+## Architecture overview
+
+The optimizer workflow consists of six parts:
+
+1. Metadata objects: catalog/schema/table in a metalake.
+2. Statistics and metrics: table/partition signals used for decision making.
+3. Policies: strategy intent, for example `system_iceberg_compaction`.
+4. Job templates: executable contracts, for example built-in Spark templates.
+5. Job executor: local or custom backend that runs submitted jobs.
+6. Status and logs: REST job state plus local staging logs.
+
+Typical data flow:
+
+1. Collect statistics and metrics for target tables.
+2. Evaluate rules and produce candidate actions.
+3. Submit jobs using a concrete template and `jobConf`.
+4. Track status and verify results on table metadata and logs.
+
+## Execution modes
+
+| Mode | Main entry | Best for | Output |
+| --- | --- | --- | --- |
+| Built-in maintenance workflow | Gravitino REST + built-in templates | Server-side operational runs | Submitted Spark jobs and updated metadata |
+| Optimizer CLI local calculator | `gravitino-optimizer.sh` | Local file-driven testing and batch scripts | Statistics/metrics updates and optional submissions |
+
+Use the built-in maintenance workflow when you want policy-driven server-side execution.
+Use the CLI local calculator when you want to feed JSONL input directly.
+
+## Start here
+
+- Configuration first: read [Optimizer 
Configuration](./optimizer-configuration.md).
+- Need custom integrations: read [Optimizer Extension 
Guide](./optimizer-extension-guide.md).
+- First-time enablement: run [Optimizer Quick Start and 
Verification](./optimizer-quick-start.md).
+- CLI-only usage: read [Optimizer CLI Reference](./optimizer-cli-reference.md).
+- Runtime failures or mismatched results: check [Optimizer 
Troubleshooting](./optimizer-troubleshooting.md).
+
+## Lifecycle
+
+### 1. Collect
+
+Generate or ingest table and partition statistics/metrics.
+
+### 2. Evaluate
+
+Apply policies and rules to decide whether maintenance should run.
+
+### 3. Submit
+
+Pick a job template and submit a job with a concrete `jobConf`.
+
+### 4. Observe
+
+Check REST job status and validate resulting statistics, metrics, or rewritten 
data files.
+
+## Configuration model
+
+| Layer | Scope | Typical keys |
+| --- | --- | --- |
+| Gravitino server config | Runtime for job manager and executor | `gravitino.job.executor`, `gravitino.job.statusPullIntervalInMs`, `gravitino.jobExecutor.local.sparkHome` |
+| Job submission `jobConf` | Per job run | `catalog_name`, `table_identifier`, `spark_*`, template-specific args |
+| Optimizer CLI config | CLI commands | `gravitino.optimizer.*` in `conf/gravitino-optimizer.conf` |
+
+## Terminology mapping
+
+| Term | Example value | Used in |
+| --- | --- | --- |
+| Policy name | `iceberg_compaction_default` | Policy identity and CLI `--strategy-name` |
+| Policy type | `system_iceberg_compaction` | REST policy creation field `policyType` |
+| Strategy type | `iceberg-data-compaction` | Policy content field `strategy.type` and strategy handler config key |
+
+For strategy submission, `--strategy-name` must be the policy name, not the policy type or the strategy type.
+
+## Before you start

Review Comment:
   Good point. I removed the duplicated blocks from the overview page and now 
point readers to the quick-start page for prerequisites and success checks.



##########
docs/table-maintenance-service/optimizer.md:
##########
@@ -0,0 +1,115 @@
+---
+title: "Table Maintenance Service (Optimizer)"
+slug: /table-maintenance-service
+keyword: table maintenance, optimizer, statistics, metrics, monitor
+license: This software is licensed under the Apache License version 2.
+---
+
+## What is this service
+
+The Table Maintenance Service (Optimizer) automates table maintenance by 
connecting:
+
+- Statistics and metrics collection
+- Rule evaluation and strategy recommendation
+- Job-template-based execution
+
+The CLI commands and configuration keys use the `optimizer` name.
+
+## Architecture overview
+
+The optimizer workflow consists of six parts:
+
+1. Metadata objects: catalog/schema/table in a metalake.
+2. Statistics and metrics: table/partition signals used for decision making.
+3. Policies: strategy intent, for example `system_iceberg_compaction`.
+4. Job templates: executable contracts, for example built-in Spark templates.
+5. Job executor: local or custom backend that runs submitted jobs.
+6. Status and logs: REST job state plus local staging logs.
+
+Typical data flow:
+
+1. Collect statistics and metrics for target tables.
+2. Evaluate rules and produce candidate actions.
+3. Submit jobs using a concrete template and `jobConf`.
+4. Track status and verify results on table metadata and logs.
+
+## Execution modes
+
+| Mode | Main entry | Best for | Output |
+| --- | --- | --- | --- |
+| Built-in maintenance workflow | Gravitino REST + built-in templates | Server-side operational runs | Submitted Spark jobs and updated metadata |
+| Optimizer CLI local calculator | `gravitino-optimizer.sh` | Local file-driven testing and batch scripts | Statistics/metrics updates and optional submissions |
+
+Use the built-in maintenance workflow when you want policy-driven server-side execution.
+Use the CLI local calculator when you want to feed JSONL input directly.
+
+## Start here
+
+- Configuration first: read [Optimizer 
Configuration](./optimizer-configuration.md).
+- Need custom integrations: read [Optimizer Extension 
Guide](./optimizer-extension-guide.md).
+- First-time enablement: run [Optimizer Quick Start and 
Verification](./optimizer-quick-start.md).
+- CLI-only usage: read [Optimizer CLI Reference](./optimizer-cli-reference.md).
+- Runtime failures or mismatched results: check [Optimizer 
Troubleshooting](./optimizer-troubleshooting.md).
+
+## Lifecycle
+
+### 1. Collect
+
+Generate or ingest table and partition statistics/metrics.
+
+### 2. Evaluate
+
+Apply policies and rules to decide whether maintenance should run.
+
+### 3. Submit
+
+Pick a job template and submit a job with a concrete `jobConf`.
+
+### 4. Observe
+
+Check REST job status and validate resulting statistics, metrics, or rewritten 
data files.
+
+## Configuration model
+
+| Layer | Scope | Typical keys |
+| --- | --- | --- |
+| Gravitino server config | Runtime for job manager and executor | `gravitino.job.executor`, `gravitino.job.statusPullIntervalInMs`, `gravitino.jobExecutor.local.sparkHome` |
+| Job submission `jobConf` | Per job run | `catalog_name`, `table_identifier`, `spark_*`, template-specific args |
+| Optimizer CLI config | CLI commands | `gravitino.optimizer.*` in `conf/gravitino-optimizer.conf` |
+
+## Terminology mapping
+
+| Term | Example value | Used in |
+| --- | --- | --- |
+| Policy name | `iceberg_compaction_default` | Policy identity and CLI `--strategy-name` |
+| Policy type | `system_iceberg_compaction` | REST policy creation field `policyType` |
+| Strategy type | `iceberg-data-compaction` | Policy content field `strategy.type` and strategy handler config key |
+
+For strategy submission, `--strategy-name` must be the policy name, not the policy type or the strategy type.
+
+## Before you start
+
+- Prepare a running Gravitino server.
+- Ensure the target metalake exists (examples use `test`).
+- Configure `SPARK_HOME` or `gravitino.jobExecutor.local.sparkHome` for Spark 
templates.
+- For CLI mode, prepare `conf/gravitino-optimizer.conf` from the template.
+- Use fully qualified identifiers where possible, for example 
`catalog.schema.table`.
+- If your Iceberg REST backend is in-memory, metadata is reset after restart.
+
+## Success criteria
+
+- The update-stats job finishes and its statistics include `custom-data-file-mse` and `custom-delete-file-number`.
+- `submit-strategy-jobs` prints `SUBMIT` with a rewrite job ID.
+- The rewrite job log shows `Rewritten data files: <N>` with `N > 0` for non-empty tables.
+
+## Related docs
+
+- [Optimizer Configuration](./optimizer-configuration.md)
+- [Optimizer Extension Guide](./optimizer-extension-guide.md)
+- [Optimizer Quick Start and Verification](./optimizer-quick-start.md)
+- [Optimizer CLI Reference](./optimizer-cli-reference.md)
+- [Optimizer Troubleshooting](./optimizer-troubleshooting.md)
+- [Manage policies in Gravitino](./manage-policies-in-gravitino.md)

Review Comment:
   Fixed earlier: all four cross-directory links in  now use the  prefix.
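Editorial note: the success criteria in the hunk above check the rewrite job log for a `Rewritten data files: <N>` line. A minimal sketch of that check, assuming the log line appears verbatim as quoted in the doc:

```python
import re

# Sketch: extract N from the rewrite job's "Rewritten data files: <N>" log line.
# The exact log format is assumed from the success criteria in the doc above.
def rewritten_file_count(log_text):
    match = re.search(r"Rewritten data files: (\d+)", log_text)
    return int(match.group(1)) if match else None

# A non-empty table should report N > 0; no match means the line is absent.
assert rewritten_file_count("Rewritten data files: 3") == 3
assert rewritten_file_count("no rewrite happened") is None
```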



##########
docs/table-maintenance-service/optimizer-cli-reference.md:
##########
@@ -0,0 +1,211 @@
+---
+title: "Optimizer CLI Reference"
+slug: /table-maintenance-service/cli-reference

Review Comment:
   Fixed. CLI reference slug is now .



##########
docs/table-maintenance-service/optimizer-cli-reference.md:
##########
@@ -0,0 +1,211 @@
+---
+title: "Optimizer CLI Reference"
+slug: /table-maintenance-service/cli-reference
+keyword: table maintenance, optimizer, cli, commands, metrics, statistics
+license: This software is licensed under the Apache License version 2.
+---
+
+Use `--help` to list all commands, or `--help --type <command>` for 
command-specific help.
+
+By default, the optimizer CLI loads `conf/gravitino-optimizer.conf` from the current working
+directory. Use `--conf-path` only when you need a custom config file.
+
+## Command quick reference
+
+| Command (`--type`) | Required options | Optional options | Purpose |
+| --- | --- | --- | --- |
+| `submit-strategy-jobs` | `--identifiers`, `--strategy-name` (policy name) | `--dry-run`, `--limit` | Recommend and optionally submit jobs |

Review Comment:
   Fixed. I removed the  annotation from the command quick reference row and 
kept the explanation in the option meanings table.
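Editorial note: since the CLI reads `conf/gravitino-optimizer.conf` from the working directory, a sketch of the assumed `key = value` file shape may help readers sanity-check their config (this is not the CLI's actual loader, only an illustration):

```python
# Sketch: parse "key = value" lines as shown in the optimizer conf examples.
# The real CLI loader may differ; this only illustrates the assumed file shape.
def load_optimizer_conf(text):
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf

sample = """
gravitino.optimizer.gravitinoUri = http://localhost:8090
gravitino.optimizer.gravitinoMetalake = test
"""
conf = load_optimizer_conf(sample)
assert conf["gravitino.optimizer.gravitinoMetalake"] == "test"
```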



##########
docs/table-maintenance-service/optimizer-configuration.md:
##########
@@ -0,0 +1,102 @@
+---
+title: "Optimizer Configuration"
+slug: /table-maintenance-service/configuration

Review Comment:
   Fixed. Configuration page slug is now .



##########
docs/table-maintenance-service/optimizer-configuration.md:
##########
@@ -0,0 +1,102 @@
+---
+title: "Optimizer Configuration"
+slug: /table-maintenance-service/configuration
+keyword: table maintenance, optimizer, configuration, job template, spark
+license: This software is licensed under the Apache License version 2.
+---
+
+## Configuration layers
+
+Use these layers together:
+
+| Layer | Scope | Typical keys |
+| --- | --- | --- |
+| Gravitino server config | Runtime for job manager and executor | `gravitino.job.executor`, `gravitino.job.statusPullIntervalInMs`, `gravitino.jobExecutor.local.sparkHome` |
+| Job submission `jobConf` | Per job run | `catalog_name`, `table_identifier`, `spark_*`, template-specific args |
+| Optimizer CLI config | CLI commands | `gravitino.optimizer.*` in `conf/gravitino-optimizer.conf` |
+
+## Server-side configuration
+
+Set server-level runtime behavior in `gravitino.conf`.
+
+```properties
+gravitino.job.executor=local
+gravitino.job.statusPullIntervalInMs=300000
+gravitino.jobExecutor.local.sparkHome=/path/to/spark
+```
+
+For local demo environments, you can reduce 
`gravitino.job.statusPullIntervalInMs` to get faster status updates.
+
+## Built-in update stats `jobConf`
+
+Use `builtin-iceberg-update-stats` with at least these keys:
+
+```json
+{
+  "catalog_name": "rest_catalog",
+  "table_identifier": "db.t1",
+  "update_mode": "all",
+  "updater_options": "{\"gravitino_uri\":\"http://localhost:8090\",\"metalake\":\"test\",\"statistics_updater\":\"gravitino-statistics-updater\",\"metrics_updater\":\"gravitino-metrics-updater\"}",
+  "spark_conf": "{\"spark.master\":\"local[2]\",\"spark.hadoop.fs.defaultFS\":\"file:///\"}",
+  "spark_master": "local[2]",
+  "spark_executor_instances": "1",
+  "spark_executor_cores": "1",
+  "spark_executor_memory": "1g",
+  "spark_driver_memory": "1g",
+  "catalog_type": "rest",
+  "catalog_uri": "http://localhost:9001/iceberg",
+  "warehouse_location": ""
+}
+```
+
+## Strategy submission configuration
+
+`submit-strategy-jobs` needs optimizer CLI config. This is a minimal working 
example:
+
+```properties
+gravitino.optimizer.gravitinoUri = http://localhost:8090
+gravitino.optimizer.gravitinoMetalake = test
+gravitino.optimizer.gravitinoDefaultCatalog = rest_catalog
+gravitino.optimizer.recommender.statisticsProvider = gravitino-statistics-provider
+gravitino.optimizer.recommender.strategyProvider = gravitino-strategy-provider
+gravitino.optimizer.recommender.tableMetaProvider = gravitino-table-metadata-provider
+gravitino.optimizer.recommender.jobSubmitter = gravitino-job-submitter
+gravitino.optimizer.strategyHandler.iceberg-data-compaction.className = org.apache.gravitino.maintenance.optimizer.recommender.handler.compaction.CompactionStrategyHandler
+gravitino.optimizer.jobSubmitterConfig.catalog_name = rest_catalog
+gravitino.optimizer.jobSubmitterConfig.spark_master = local[2]
+gravitino.optimizer.jobSubmitterConfig.spark_executor_instances = 1
+gravitino.optimizer.jobSubmitterConfig.spark_executor_cores = 1
+gravitino.optimizer.jobSubmitterConfig.spark_executor_memory = 1g
+gravitino.optimizer.jobSubmitterConfig.spark_driver_memory = 1g
+gravitino.optimizer.jobSubmitterConfig.catalog_type = rest
+gravitino.optimizer.jobSubmitterConfig.catalog_uri = http://localhost:9001/iceberg
+gravitino.optimizer.jobSubmitterConfig.warehouse_location =

Review Comment:
   Updated. I added explicit notes that  can be empty for local filesystem 
testing and should be set for HDFS/cloud warehouses.
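Editorial note: in the `jobConf` example above, `updater_options` and `spark_conf` are JSON documents serialized into strings. Building them with `json.dumps` avoids hand-escaping mistakes; this sketch uses only keys shown in the hunk:

```python
import json

# Sketch: build the nested JSON-in-JSON values of the update-stats jobConf.
# All keys and values are taken from the example in the hunk above.
updater_options = {
    "gravitino_uri": "http://localhost:8090",
    "metalake": "test",
    "statistics_updater": "gravitino-statistics-updater",
    "metrics_updater": "gravitino-metrics-updater",
}
spark_conf = {
    "spark.master": "local[2]",
    "spark.hadoop.fs.defaultFS": "file:///",
}
job_conf = {
    "catalog_name": "rest_catalog",
    "table_identifier": "db.t1",
    "update_mode": "all",
    # json.dumps produces the escaped string form required by jobConf.
    "updater_options": json.dumps(updater_options),
    "spark_conf": json.dumps(spark_conf),
}

# Round-trip check: the nested strings decode back to the original mappings.
assert json.loads(job_conf["updater_options"])["metalake"] == "test"
assert json.loads(job_conf["spark_conf"])["spark.master"] == "local[2]"
```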



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
