markhoerth opened a new pull request, #10840:
URL: https://github.com/apache/gravitino/pull/10840

   ### What changes were proposed in this pull request?
   
   Adds three health check endpoints following MicroProfile Health semantics:
   
   - `GET /api/health/live` — liveness, returns 200 when the HTTP thread is 
responsive
   - `GET /api/health/ready` — readiness, returns 200 when the entity store is 
reachable, 503 otherwise
   - `GET /api/health` — aggregate, returns 200 when both checks pass
   
   Also exempts `/api/health*` from `AuthenticationFilter` so Kubernetes probes 
and load balancers can reach the endpoints without credentials under any 
configured authenticator.
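
   The exemption described above can be sketched as a simple path predicate. This is only an illustration, not the code in this PR: the class `HealthPathMatcher` and the method `isHealthPath` are hypothetical names, and the sketch assumes the exemption should cover exactly `/api/health` and anything nested under `/api/health/`, so that a path which merely contains "health" still goes through authentication.

```java
// Hypothetical helper; the class and method names are illustrative
// and are not taken from the PR.
final class HealthPathMatcher {
  private static final String HEALTH_ROOT = "/api/health";

  private HealthPathMatcher() {}

  /** Returns true only for the health root or a path nested under it. */
  static boolean isHealthPath(String path) {
    if (path == null) {
      return false;
    }
    // Assumption: match the exact root or a sub-path. A bare prefix match
    // would also exempt unrelated paths that happen to start the same way.
    return path.equals(HEALTH_ROOT) || path.startsWith(HEALTH_ROOT + "/");
  }
}
```

   Under these assumptions, `/api/health/live` and `/api/health/ready` bypass the filter, while `/api/metalakes/health_metalake` and `/api/version` do not.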
   
   ### Why are the changes needed?
   
   Modern Java services (Apache Polaris, Spring Boot, Quarkus, Micronaut) ship 
these endpoints by default. Gravitino runs on raw Jetty and does not, which 
blocks standard Kubernetes probe configuration, load balancer health checks, 
and enterprise GTM integration. This is a parity gap that surfaces on day one 
of enterprise deployments.
   
   Fix: #10839
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes — adds three new public REST endpoints under `/api/health`. No existing 
endpoint behavior is changed. No property keys added or removed.
   
   ### How was this patch tested?
   
   - New unit tests in `TestHealthOperations` covering liveness, readiness 
(happy path, entity store uninitialized, entity store throws), and aggregate 
status — 6 test cases.
   - New unit tests in `TestAuthenticationFilter` verifying health paths bypass 
authentication and non-health paths (including paths that merely contain 
"health", e.g. `/api/metalakes/health_metalake`) continue to require 
authentication — 2 test cases.
   - Manual verification against a running Gravitino instance: all three 
endpoints return 200 with expected JSON bodies; `/api/version` continues to 
work unchanged.
   
   ### Notes for reviewers
   
   - **Auth filter exemption is hardcoded** to `/api/health*`, matching how 
Spring Boot and Quarkus hardcode their well-known health paths. Happy to make 
this config-driven in a follow-up if preferred.
   - **Bounded timeout on entity store probe.** The readiness check runs 
`EntityStore.exists()` with a 2-second ceiling via `CompletableFuture` to 
prevent a hanging JDBC connection from tying up Jetty worker threads.
   - **Response body format.** `HealthResponse` extends `BaseResponse` and 
keeps `code: 0` even in 503 responses. The HTTP status is the probe signal and 
the body is diagnostic only. This is intentional and differs from `ErrorResponse` 
usage.
   - **`/api/version` is also conventionally unauthenticated** in most Java 
services, where clients use it for discovery before auth negotiation. It is not 
changed in this PR, but is worth considering as a follow-up.
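
   The bounded timeout on the entity store probe noted above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the implementation in this PR: `BoundedReadinessProbe` is a hypothetical class, a plain `Callable<Boolean>` stands in for the `EntityStore.exists()` call, and any call that completes without throwing is treated as "reachable" regardless of its return value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; names are illustrative and not taken from the PR.
final class BoundedReadinessProbe {
  static final int OK = 200;
  static final int SERVICE_UNAVAILABLE = 503;

  private final ExecutorService executor;
  private final long timeoutMillis;

  BoundedReadinessProbe(ExecutorService executor, long timeoutMillis) {
    this.executor = executor;
    this.timeoutMillis = timeoutMillis;
  }

  /** Runs the store call off the request thread and maps the outcome to an HTTP status. */
  int check(Callable<Boolean> storeCall) {
    CompletableFuture<Boolean> future =
        CompletableFuture.supplyAsync(
            () -> {
              try {
                return storeCall.call();
              } catch (Exception e) {
                throw new CompletionException(e);
              }
            },
            executor);
    try {
      // Assumption: completing without an exception means the store is reachable.
      future.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return OK;
    } catch (TimeoutException e) {
      // cancel() only marks the future as cancelled; it does not interrupt the
      // running task, so the probe thread may stay blocked until the call returns.
      future.cancel(true);
      return SERVICE_UNAVAILABLE;
    } catch (ExecutionException e) {
      return SERVICE_UNAVAILABLE;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return SERVICE_UNAVAILABLE;
    }
  }
}
```

   The point of the design is that the thread serving the request is released after the ceiling elapses, even if the underlying store call is still hung. A dedicated executor keeps such stuck calls away from the request-handling pool.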
   

