The GitHub Actions job "Required Checks" on texera.git/gh-readonly-queue/main/pr-5629-3dab771a2fe3ea5bf97c4c69cfbd761f9cd01e54 has failed. Run started by GitHub user aicam (triggered by aicam).
Head commit for run:
2cdb1fe20c2fef594e0379e31b349fb4f2899475 / ali risheh <[email protected]>
fix(access-control-service): include port in computing unit pod URI and use Envoy Gateway for distributed CUs (#5629)

### What changes were proposed in this PR?

Make the in-cluster address of a computing unit come from a single source of truth — the URI recorded when its pod is created — and ensure that URI is complete (includes the port). This lets the gateway route a user to a computing unit located **anywhere it can reach** (in the local cluster, another cluster, or an external host), instead of being limited to a reconstructed in-cluster address. See #5630.

Two related changes:

**1. Include the port in the generated pod URI** (`computing-unit-managing-service`)

`KubernetesClient.generatePodURI` builds the address stored as the computing unit's `uri` (via `setUri` in `ComputingUnitManagingResource`) and returned to clients as `nodeAddresses`. The pod's container listens on `KubernetesConfig.computeUnitPortNumber` (declared with `withContainerPort(...)` in the same file), but the generated URI omitted the port, so the persisted address was not directly connectable. The port is now appended:

```scala
s"...svc.cluster.local:${KubernetesConfig.computeUnitPortNumber}"
```

**2. Route using the recorded URI** (`access-control-service`)

`AccessControlResource` rebuilt the computing unit's address from `KubernetesConfig` on every authorization request, duplicating the construction logic in `generatePodURI` and pinning every CU to the local cluster. It now reads the URI recorded for the unit and returns it as the `Host` for the gateway to route to. If no URI has been recorded, the unit is not routable and the request is **refused with `403`** (no in-cluster fallback, per review).

### Routing flow

The access-control service is the gateway's external authorizer; the `Host` it returns is the upstream Envoy forwards the (upgraded) connection to. Because that host comes from the unit's recorded URI, the same gateway can reach computing units in different locations:

```mermaid
flowchart LR
    FE["Frontend<br/>(/wsapi?cuid=N)"] --> GW["Envoy Gateway"]
    GW -. "ext-auth: authorize + get Host" .-> ACS["access-control-service"]
    ACS -- "read recorded uri for CU N" --> DB[("workflow_computing_unit")]
    ACS -- "Host = recorded uri<br/>(or 403 if none)" --> GW
    GW == "dynamic forward proxy<br/>to returned Host" ==> R{Where the CU lives}
    R --> CU1["In-cluster CU pod<br/>computing-unit-N...svc.cluster.local:port"]
    R --> CU2["CU in another cluster"]
    R --> CU3["External / remote CU host:port"]
```

### Any related issues, documentation, discussions?

- Closes #5630.
- Builds on the Envoy Gateway / ext-auth routing introduced in #4191 (unified Envoy Gateway) and #3598 (access-control-service as the ext-auth service for computing-unit traffic).

### How was this PR tested?

On live deployment.

<img width="1835" height="960" alt="Screenshot from 2026-06-13 13-31-00" src="https://github.com/user-attachments/assets/d56a48f9-b99d-4d36-827a-0a4ce54995fd" />

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

---------

Co-authored-by: Claude Opus 4.8 (1M context) <[email protected]>

Report URL: https://github.com/apache/texera/actions/runs/27579201394

With regards,
GitHub Actions via GitBox
