moomindani opened a new pull request, #16574: URL: https://github.com/apache/iceberg/pull/16574
Closes #16573.

Adds an optional filtering layer above any `MetricsReporter` implementation that drops `ScanReport` and `CommitReport` instances whose `tableName()` does not pass the configured include / exclude regex. The filter applies uniformly to `LoggingMetricsReporter`, `RESTMetricsReporter`, and custom user-supplied reporters.

The proposal surfaced in the dev@ DISCUSS thread for #16250 (per-table cardinality of the OTel reporter) and is intentionally scoped as cross-reporter, not OTel-specific.

## Design

`CatalogUtil.loadMetricsReporter` wraps the resolved reporter in a `FilteringMetricsReporter` when either of the new properties is set. When neither is set, the resolved reporter is returned unchanged: no wrapper is instantiated and there is no runtime overhead on the default path.

`MetricsReport` subtypes that do not expose a table name (anything other than `ScanReport` / `CommitReport`) are forwarded without filtering.

## Configuration

Two new catalog properties:

```
metrics-reporter.table-name.include=prod_db\..*
metrics-reporter.table-name.exclude=.*\.tmp_.*
```

Values are Java regex patterns matched against the table name. When both are set, `exclude` wins over `include` (an explicit deny overrides an include). Empty values are treated as not set to avoid accidentally silencing all metrics on misconfiguration. Invalid regex values fail fast at catalog initialization with a clear error pointing at the offending property.

Behavior:

- `include` only: forward reports whose table name matches; drop others.
- `exclude` only: drop reports whose table name matches; forward others.
- Both set: drop if `exclude` matches; otherwise forward only if `include` matches.
- Neither set: forward everything (current behavior).

This mirrors the existing `route-regex` pattern used in `iceberg-kafka-connect` (`IcebergSinkConfig`), where a user-supplied regex from configuration is compiled via `Pattern.compile()` and matched against incoming data.
Same trust model: catalog property = admin-controlled.

## Disclosure

Per the project's [AI-assisted contribution guidelines](https://iceberg.apache.org/contribute/#guidelines-for-ai-assisted-contributions), I used Claude Code to help draft this work. I reviewed every change by hand and ran the full test/lint loop locally before opening this PR.

The design and motivation discussion is in #16573.

cc @ebyhr @jbonofre, happy to address any feedback.
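For reviewers who want the include / exclude rules from the Configuration section in executable form, here is a minimal sketch. It is not the code in this PR: the class name `TableNameFilter` and the method `shouldForward` are hypothetical, and the sketch assumes full-string matching via `Matcher.matches()` and treats only `null` or `""` as "not set". The actual `FilteringMetricsReporter` may differ on both points.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical helper illustrating the decision rules; not the PR's implementation. */
final class TableNameFilter {
  private final Pattern include; // null means the property is not set
  private final Pattern exclude; // null means the property is not set

  TableNameFilter(String includeRegex, String excludeRegex) {
    this.include = compile("metrics-reporter.table-name.include", includeRegex);
    this.exclude = compile("metrics-reporter.table-name.exclude", excludeRegex);
  }

  // Empty values count as "not set"; invalid patterns fail fast and name the property.
  private static Pattern compile(String property, String regex) {
    if (regex == null || regex.isEmpty()) {
      return null;
    }
    try {
      return Pattern.compile(regex);
    } catch (PatternSyntaxException e) {
      throw new IllegalArgumentException(
          "Invalid regex for " + property + ": " + regex, e);
    }
  }

  // Exclude is checked first so that an explicit deny overrides an include.
  // Assumption: the whole table name must match the pattern.
  boolean shouldForward(String tableName) {
    if (exclude != null && exclude.matcher(tableName).matches()) {
      return false;
    }
    return include == null || include.matcher(tableName).matches();
  }
}
```

With the two example property values above, a table named `prod_db.orders` would be forwarded, `prod_db.tmp_orders` would be dropped by the exclude rule, and `dev_db.orders` would be dropped because it does not match the include pattern. The table names here are illustrative; the exact string returned by `tableName()` depends on the catalog.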
