markhoerth opened a new pull request, #10840: URL: https://github.com/apache/gravitino/pull/10840
### What changes were proposed in this pull request?

Adds three health check endpoints following MicroProfile Health semantics:

- `GET /api/health/live`: liveness, returns 200 when the HTTP thread is responsive
- `GET /api/health/ready`: readiness, returns 200 when the entity store is reachable, 503 otherwise
- `GET /api/health`: aggregate, returns 200 when both checks pass

Also exempts `/api/health*` from `AuthenticationFilter` so Kubernetes probes and load balancers can reach the endpoints without credentials under any configured authenticator.

### Why are the changes needed?

Modern Java services (Apache Polaris, Spring Boot, Quarkus, Micronaut) ship these endpoints by default. Gravitino runs on raw Jetty and does not, which blocks standard Kubernetes probe configuration, load balancer health checks, and enterprise GTM integration. This is a parity gap that surfaces on day one of enterprise deployments.

Fix: #10839

### Does this PR introduce _any_ user-facing change?

Yes. Adds three new public REST endpoints under `/api/health`. No existing endpoint behavior is changed. No property keys added or removed.

### How was this patch tested?

- New unit tests in `TestHealthOperations` covering liveness, readiness (happy path, entity store uninitialized, entity store throws), and aggregate status: 6 test cases.
- New unit tests in `TestAuthenticationFilter` verifying that health paths bypass authentication and that non-health paths (including paths that merely contain "health", e.g. `/api/metalakes/health_metalake`) continue to require authentication: 2 test cases.
- Manual verification against a running Gravitino instance: all three endpoints return 200 with expected JSON bodies; `/api/version` continues to work unchanged.

### Notes for reviewers

- **Auth filter exemption is hardcoded** to `/api/health*`, matching how Spring Boot and Quarkus hardcode their well-known health paths. Happy to make this config-driven in a follow-up if preferred.
- **Bounded timeout on entity store probe.** The readiness check runs `EntityStore.exists()` with a 2-second ceiling via `CompletableFuture` to prevent a hanging JDBC connection from tying up Jetty worker threads.
- **Response body format.** `HealthResponse` extends `BaseResponse` and keeps `code: 0` even in 503 responses: the HTTP status is the probe signal, the body is diagnostic. This is intentional and differs from `ErrorResponse` usage.
- **`/api/version` is conventionally also unauthenticated** across most Java services (used for client discovery before auth negotiation). Not changed in this PR but worth considering as a follow-up.
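
For reviewers who want a concrete picture of two of the notes above (the hardcoded path exemption and the bounded readiness probe), here is a minimal standalone sketch. It is not the code in this PR. `HealthSketch`, `isHealthPath`, and `readinessStatus` are hypothetical names, the `Runnable` is a stand-in for the `EntityStore.exists()` call, and the plain prefix match is only an assumed, literal reading of `/api/health*`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names come from the PR.
final class HealthSketch {
  // Assumption: `/api/health*` is treated as a simple prefix match.
  static final String HEALTH_PREFIX = "/api/health";

  /** Returns true when the request path should skip authentication. */
  static boolean isHealthPath(String path) {
    return path != null && path.startsWith(HEALTH_PREFIX);
  }

  /**
   * Runs the store probe on a separate pool and waits at most timeoutMillis.
   * Returns 200 if the probe finishes without throwing, 503 otherwise.
   */
  static int readinessStatus(Runnable storeProbe, long timeoutMillis, ExecutorService pool) {
    CompletableFuture<Void> probe = CompletableFuture.runAsync(storeProbe, pool);
    try {
      probe.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return 200;
    } catch (TimeoutException e) {
      // cancel() only marks the future as cancelled; it does not interrupt
      // the thread that is still running the probe.
      probe.cancel(true);
      return 503;
    } catch (ExecutionException e) {
      // The probe itself threw, e.g. the store is unreachable.
      return 503;
    } catch (InterruptedException e) {
      // Restore the interrupt flag for the calling (request) thread.
      Thread.currentThread().interrupt();
      return 503;
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newCachedThreadPool();
    try {
      System.out.println(isHealthPath("/api/health/ready"));
      System.out.println(isHealthPath("/api/metalakes/health_metalake"));
      System.out.println(readinessStatus(() -> {}, 2000, pool));
      System.out.println(
          readinessStatus(() -> { throw new IllegalStateException("store down"); }, 2000, pool));
    } finally {
      pool.shutdownNow();
    }
  }
}
```

One caveat with this shape: the timeout frees the request thread, but a probe that is truly hung keeps occupying a thread in the probe pool until the underlying call returns, so the pool used for probes is worth bounding separately. A prefix match also exempts any future path that happens to start with `/api/health`, which may be an argument for the config-driven follow-up.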
