The GitHub Actions job "Benchmarks PR Comment" on texera.git/main has succeeded. Run started by GitHub user aglinxinyuan (triggered by aglinxinyuan).
Head commit for run: 01e070fd83dd7fd9a5574964880fe67d7f8f1257 / Matthew B. <[email protected]> perf(auth): collapse N+1 in getComputingUnitAccess into one join (#5648) ### What changes were proposed in this PR? - Replace the two-DAO lookup in `ComputingUnitAccess.getComputingUnitAccess` (fetch the unit by `cuid`, then fetch all of the user's access rows by `uid` and filter them in memory) with a single `LEFT JOIN` of `WORKFLOW_COMPUTING_UNIT` to `COMPUTING_UNIT_USER_ACCESS` on `(cuid, uid)`, returning the owner uid and the caller's privilege in one round-trip. - Resolve the privilege from that single row: the owner gets `WRITE`, a present access row yields its privilege, and a null join (no grant) yields `NONE`. - A missing computing unit now resolves to `NONE` instead of throwing; the sole caller (`AccessControlResource`) already maps both `NONE` and a thrown exception to `403 FORBIDDEN`, so the externally observable behavior is unchanged. ### Performance comparison (previous vs current logic) Measured with a throwaway benchmark against the same embedded Postgres the unit tests use (`MockTexeraDB`), exercising the non-owner path (the only path that previously issued a second query). For each grant count the benchmark seeds a user holding that many grants, then times 2000 calls per implementation after a 200-call warmup. Microseconds per call, lower is better: | Grants held by the user | Previous (2 queries) | Current (1 join) | Speedup | | ---: | ---: | ---: | ---: | | 1 | 4382.5 us | 2395.1 us | 1.83x | | 10 | 3892.7 us | 1786.2 us | 2.18x | | 100 | 3260.0 us | 1690.4 us | 1.93x | | 1000 | 3650.8 us | 1630.0 us | 2.24x | So the rewrite is roughly 2x faster across the board. The speedup stays flat rather than growing with the grant count, which is itself informative: on this loopback setup the dominant cost is the extra database round-trip (two statements vs one), not the in-memory transfer-and-filter of the user's grant rows. Against a real database over a network, where per-statement latency is higher, halving the round-trips should matter at least as much. This addresses the fair concern that "more SQL queries do not necessarily mean slower": here the previous logic issued fewer total joins but paid for an extra round-trip, and measuring confirms the single join wins. These numbers come from a local micro-benchmark, not committed to the suite, and are absolute timings on embedded Postgres; treat the ratio as the signal, not the raw microseconds. ### Any related issues, documentation, discussions? Closes: #5645 ### How was this PR tested? - Added `ComputingUnitAccessSpec` covering the owner (`WRITE`), shared-grant (granted privilege), no-grant (`NONE`), and missing-unit (`NONE`) cases. - `sbt Auth/compile` and `sbt Auth/test`, expect a clean compile of the rewritten jOOQ query and a passing spec. - Manual, via the access-control route backed by `AccessControlResource`: request a computing unit as its owner, expect `WRITE`; as a user holding a shared grant, expect that grant's privilege; as a user with no grant, or for a non-existent `cuid`, expect `403 FORBIDDEN`. - Behavior-preservation check: confirm the only caller maps `NONE` to `403`, so the missing-unit path (previously a thrown NPE, now `NONE`) still yields `403`. ### Was this PR authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.8 (Claude Code), in compliance with the ASF Generative Tooling Guidance Report URL: https://github.com/apache/texera/actions/runs/27489767501 With regards, GitHub Actions via GitBox
