MichaelRBlack opened a new pull request, #64691: URL: https://github.com/apache/airflow/pull/64691
## Summary

Task-level OTel metrics (e.g. `ti.finish`) are silently dropped in forked task subprocesses because the OTel Python SDK's `Once()` guard on `set_meter_provider()` survives `fork()`.

**Root cause:** `stats.py` correctly detects PID mismatches after fork and calls `otel_logger.get_otel_logger()` to re-initialize. This creates a fresh `MeterProvider` and calls `metrics.set_meter_provider()`, but the SDK's `_METER_PROVIDER_SET_ONCE._done = True` flag, inherited from the parent, blocks the call. The child ends up with the parent's stale provider, whose `PeriodicExportingMetricReader` export thread is dead after fork.

**Fix:** Reset the SDK's provider state in `get_otel_logger()` before calling `set_meter_provider()`. Since `stats.py` only calls the factory after detecting a PID mismatch, this reset only runs in forked children that need a fresh provider.

Closes #64690

## Test plan

- [x] Added unit test that simulates `Once._done = True` (forked child state) and verifies `get_otel_logger()` successfully sets a new `MeterProvider`
- [ ] Manual: Deploy and confirm `ti.finish` metrics appear in Grafana
- [ ] Manual: Confirm "Overriding of current MeterProvider is not allowed" warning no longer appears in task logs

🤖 Generated with [Claude Code](https://claude.com/claude-code)
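
For reviewers who want to see the failure mode in isolation, here is a minimal sketch of a set-once guard blocking re-initialization and of a reset unblocking it. It is a simplified model, not the SDK's code and not the patch in this PR: `Once`, `ProviderRegistry`, `set_provider`, and `reset_for_child` are hypothetical names, and plain strings stand in for `MeterProvider` instances. Because `fork()` copies the parent's memory, a child starts with the guard already tripped, which the sketch imitates by calling `set_provider` twice in one process.

```python
import threading
from typing import Callable, Optional


class Once:
    """Simplified run-once guard (a stand-in, not the SDK's class)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._done = False

    def do_once(self, func: Callable[[], None]) -> bool:
        # Run func only on the first call; report whether it ran.
        with self._lock:
            if self._done:
                return False
            func()
            self._done = True
            return True


class ProviderRegistry:
    """Hypothetical holder for a process-wide provider."""

    def __init__(self) -> None:
        self._set_once = Once()
        self.provider: Optional[str] = None

    def set_provider(self, provider: str) -> bool:
        def _assign() -> None:
            self.provider = provider

        return self._set_once.do_once(_assign)

    def reset_for_child(self) -> None:
        # Assumed shape of the fix: clear the guard and the stale provider
        # so the next set_provider() call takes effect.
        self._set_once = Once()
        self.provider = None


registry = ProviderRegistry()

# Parent process sets its provider; the guard is now tripped.
parent_ok = registry.set_provider("parent-provider")

# A forked child inherits the tripped guard, so its call is rejected
# and the stale parent provider stays in place.
blocked = registry.set_provider("child-provider")
stale = registry.provider

# After the reset, the child's provider is accepted.
registry.reset_for_child()
child_ok = registry.set_provider("child-provider")
```

The reset is only safe here because it runs where the old provider is known to be unusable, which matches the PR's reliance on the PID-mismatch check in `stats.py`.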
