andrewmusselman opened a new issue, #944:
URL: https://github.com/apache/tooling-trusted-releases/issues/944

   Discussion point that came up during key revokation in #922 
   
   From @dave2wave:
   ```
   I think we need to be clear on ldap state in development/debug/test modes 
and users. ASVS L2 finds our inconsistencies. We should probably discuss the 
rules we really need. This way you can properly handle edge cases.
   ```
   
   # LDAP State in Dev/Debug/Test Modes — Discussion Points
   
   ## Context
   
   ASVS L2 compliance review has surfaced inconsistencies in how ATR handles 
LDAP-dependent authentication and authorization across its operating modes: 
**Production**, **Debug**, and **Debug+ALLOW_TESTS**. The core tension is that 
LDAP is the source of truth for user identity, account status, committee 
membership, and admin roles — but in development scenarios it may be 
unavailable, partially available, or entirely bypassed.
   
   This document maps the specific inconsistencies found in the codebase and 
proposes rules to discuss.
   
   ---
   
   ## 1. The Three Operating Modes (and what actually changes)
   
   | Behavior | Production | Debug (no ALLOW_TESTS) | Debug + ALLOW_TESTS |
   |---|---|---|---|
   | LDAP credentials required? | Yes (startup fails without them in 
`principal.py`) | No (graceful degradation) | No |
   | `ldap.is_active()` when no credentials | N/A | **Returns `True` for 
everyone** | Returns `True` for everyone |
   | `"test"` user gets admin? | No | No | **Yes** |
   | `"test"` user bypasses LDAP entirely? | No | No | **Yes** (hardcoded in 
`principal.py`, `ldap.py`, `user.py`) |
   | Rate limiting enabled? | Yes | Yes | **No** |
   | Session cache file used for auth? | No | **Yes** (fallback in 
`AuthoriserLDAP`) | Yes |
   | `_augment_test_membership()` adds "test" to everyone? | No | No | **Yes** |
   | Test login endpoint (`/test/login`) available? | No (404) | No (404) | 
**Yes** (creates session with no LDAP) |
   
   ### Discussion Point 1a: `is_active()` silently returns `True` when LDAP is 
unconfigured
   
   In `atr/ldap.py` line 248:
   
   ```python
   async def is_active(asf_uid: str) -> bool:
       if get_bind_credentials() is None:
           return True  # ← Any UID is treated as active
   ```
   
   This means if a developer accidentally runs without LDAP credentials, 
**every user is considered active including banned accounts**. This is the 
primary path for checking banned/deleted accounts in `blueprints/common.py`, 
`jwtoken.py`, and `ssh.py`.
   
   **Question**: Should this be `return False` (fail-closed) when LDAP is 
unavailable in non-Debug modes? Or should it raise an error rather than 
silently succeeding?
   
   ### Discussion Point 1b: `fetch_admin_users()` and `fetch_tooling_users()` 
degrade differently
   
   When LDAP is unavailable:
   - `fetch_admin_users()` returns `frozenset()` — no admins (fail-closed)
   - `fetch_tooling_users(extra)` returns `extra` — the additional users still 
get tooling access
   
   This is inconsistent. One fails closed, the other partially open.
   
   ---
   
   ## 2. The "test" User — A Synthetic Identity with Real Privileges
   
   When `ALLOW_TESTS=True`, a synthetic `"test"` user is created across at 
least **seven different code paths**, each with its own hardcoded bypass:
   
   | Location | What it does |
   |---|---|
   | `ldap.is_active()` | Returns `True` without checking LDAP |
   | `user.is_admin()` / `is_admin_async()` | Returns `True` — test is a full 
admin |
   | `principal.AuthoriserLDAP.cache_refresh()` | Hardcodes 
`committees={"test"}, projects={"test"}` |
   | `principal._augment_test_membership()` | Adds "test" committee/project to 
**every authenticated user** |
   | `server._initialise_test_environment()` | Creates DB records: test 
committee, project, with `"test"` as sole member |
   | `get/test.py::test_login` | Creates a session cookie with `uid="test"` and 
no LDAP verification |
   | `user.projects()` | Makes the test project visible to all users |
   
   ### Discussion Point 2a: The test user is an admin everywhere, not just for 
the test project
   
   `user.is_admin("test")` returns `True` globally. This means in ALLOW_TESTS 
mode, the test user can access admin functions like "browse as", LDAP lookup, 
release deletion, token revocation, and project management. If ALLOW_TESTS 
accidentally leaks into a production-like environment, this is a privilege 
escalation path.
   
   **Question**: Should the test user be admin of the test project only, or is 
full admin necessary for testing? Could we scope it?
   
   ### Discussion Point 2b: `_augment_test_membership()` modifies real users' 
memberships
   
   In `principal.py` line 412:
   
   ```python
   def _augment_test_membership(committees, projects):
       if config.get().ALLOW_TESTS:
           committees = committees.union({"test"})
           projects = projects.union({"test"})
       return committees, projects
   ```
   
   This is called for **every user**, not just the test user. Every real 
developer who authenticates via LDAP in Debug+ALLOW_TESTS mode is silently 
granted membership in the "test" committee and project. This is intentional (so 
devs can interact with the test project) but is never documented or surfaced to 
the user.
   
   **Question**: Is this the right approach, or should developers be explicitly 
added to the test committee in the database instead?
   
   ### Discussion Point 2c: The `/test/login` endpoint bypasses all 
authentication
   
   ```python
   session_data = atr.models.session.CookieData(
       uid="test", fullname="Test User",
       pmcs=["test"], projects=["test"],
       isMember=False, isChair=False, ...
   )
   util.write_quart_session_cookie(session_data)
   ```
   
   This creates a valid session with no password, no LDAP check, and no OAuth 
flow. The guard is only `config.get().ALLOW_TESTS`. There is also a config 
check that `ALLOW_TESTS` can only be enabled in Debug mode, but Debug mode 
itself is only restricted via `util.is_dev_environment()`.
   
   **Question**: What are the exact conditions for `is_dev_environment()` and 
are they sufficient? If it only checks for the absence of `PRODUCTION=true`, a 
misconfigured deployment could end up in Debug mode.
   
   ---
   
   ## 3. Session Cache File — A Parallel Auth Path
   
   In `principal.py::AuthoriserLDAP.cache_refresh()`, when running in Debug 
mode (even without ALLOW_TESTS), there's a **session cache file** fallback:
   
   ```python
   if config.get_mode() == config.Mode.Debug:
       session_cache = await util.session_cache_read()
       if asf_uid in session_cache:
           # Use cached data instead of querying LDAP
   ```
   
   This reads from `state/cache/user_session_cache.json` and bypasses LDAP 
entirely for committee/project membership data. If this file contains stale 
data (e.g., a user was removed from a committee), the application won't know.
   
   ### Discussion Point 3: The session cache has no TTL or integrity check
   
   The cache file is a plain JSON file with no expiration timestamp, no 
signature, and no validation. If it exists and contains an entry for a UID, 
that entry is trusted completely. Compared to the in-memory `principal.Cache` 
which has a 600-second TTL, this file-based cache has infinite TTL.
   
   **Question**: Should this file-based cache be removed, given a TTL, or at 
minimum documented as a development-only convenience that will never be enabled 
in production?
   
   ---
   
   ## 4. SSH Authentication — Different LDAP Failure Semantics
   
   The SSH server in `ssh.py` checks LDAP directly during `begin_auth()`:
   
   ```python
   account = await ldap.account_lookup(username)
   if account is None or ldap.is_banned(account):
       log.failed_authentication("account_banned_or_deleted")
       return True  # ← Still requires auth, but won't match keys
   ```
   
   But `ldap.account_lookup()` calls `get_bind_credentials()`, which returns 
`None` when LDAP isn't configured. When credentials are `None`, 
`account_lookup` returns `None`. This means the SSH path treats "LDAP 
unavailable" the same as "account doesn't exist" — which means **SSH auth fails 
closed** when LDAP is down.
   
   Meanwhile, the web path (`blueprints/common.py`) calls `ldap.is_active()`, 
which returns `True` when LDAP is unavailable — **web auth fails open**.
   
   ### Discussion Point 4: SSH fails closed, Web fails open — same system, 
opposite behavior
   
   | Auth Path | LDAP unavailable → | LDAP says user is banned → |
   |---|---|---|
   | Web (`is_active`) | **Allows access** | Denies access |
   | SSH (`account_lookup`) | **Denies access** | Denies access |
   | JWT (`is_active`) | **Allows access** | Denies access |
   | PAT validation (`is_active`) | **Allows access** | Denies access |
   
   **Question**: Should all paths have the same failure mode? The safe default 
for a security-sensitive application is fail-closed.
   
   ---
   
   ## 5. JWT/PAT Authentication and LDAP Coupling
   
   In `jwtoken.py`, the `verify()` function checks `ldap.is_active()` after 
validating the JWT signature:
   
   ```python
   if not await ldap.is_active(asf_uid):
       raise base.ASFQuartException("Account is disabled", errorcode=401)
   ```
   
   But since `is_active()` returns `True` when LDAP is unconfigured, a valid 
JWT for a banned user would be accepted if LDAP credentials are missing.
   
   Additionally, PAT hash validation is skipped when the `atr_th` claim is 
absent — this is documented as intentional ("not all JWTs come from PATs"), but 
it means there's a class of JWTs that have no revocation check at all.
   
   ### Discussion Point 5: Should JWT verification require LDAP to be available?
   
   If LDAP is down in production, the current behavior is to accept all JWTs 
regardless of account status. This could be acceptable (short-lived tokens, 
30-minute TTL), but it contradicts the security model where LDAP is 
authoritative for account status.
   
   ---
   
   ## 6. Proposed Rules for Discussion
   
   Based on the above findings, here are candidate rules for the team to decide 
on:
   
   1. **Fail-closed by default**: `ldap.is_active()` should return `False` (or 
raise) when LDAP is unavailable, unless Debug mode is active AND ALLOW_TESTS is 
enabled. The current `return True` is dangerous.
   
   2. **Consistent LDAP failure behavior**: All auth paths (web, SSH, JWT, PAT) 
should behave the same way when LDAP is down. Pick one: fail-open or 
fail-closed.
   
   3. **Scope the test user**: The `"test"` user should have minimal privileges 
needed for end-to-end testing — member of test committee/project only, not a 
global admin. Create a separate `TEST_ADMIN` flag if admin testing is needed.
   
   4. **Remove or TTL the session cache file**: The `user_session_cache.json` 
fallback is an undocumented auth bypass in Debug mode. Either remove it, add a 
TTL, or clearly mark it in code comments and developer docs.
   
   5. **Audit `_augment_test_membership()`**: Decide whether silently adding 
"test" committee/project membership to every real user is the right approach, 
or if this should be database-driven.
   
   6. **Document the mode matrix**: Create a clear, authoritative table (like 
the one in §1) that lives in the codebase and is tested. When behavior changes 
based on `ALLOW_TESTS` or `Mode.Debug`, there should be a single place to 
understand the full picture.
   
   7. **Guard Debug mode more strictly**: Verify that `is_dev_environment()` is 
sufficient to prevent Debug mode in production-like deployments. Consider 
requiring an explicit opt-in (e.g., a file on disk) rather than just the 
absence of a `PRODUCTION` env var.
   
   8. **Rate limiting in test mode**: Rate limiting is entirely disabled when 
`ALLOW_TESTS` is on. If integration tests need higher limits, consider raising 
the limits rather than removing them entirely, so tests still exercise the 
rate-limiting code paths.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to