asf-tooling opened a new issue, #986:
URL: https://github.com/apache/tooling-trusted-releases/issues/986
**ASVS Level(s):** [L1]
**Description:**
### Summary
URLs from third-party API responses (NPM, ArtifactHub, PyPI) are rendered as
clickable HTML links without protocol validation. The `distribution_web_url()`
function extracts URLs directly from API responses and stores them in the
database. These URLs are later rendered via `html_tr_a()` as `<a href>`
elements without validating the protocol scheme. An attacker could publish a
package with a `javascript:` or `data:` URL in the homepage field, which would
be stored and later execute in users' browsers when they view the distribution
page, resulting in stored XSS. Jinja2 auto-escaping prevents breaking out of
HTML attributes but does NOT prevent `javascript:` protocol execution in href
attributes.
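The last point can be illustrated with a minimal sketch. It uses the standard library's `html.escape` as a stand-in for Jinja2 auto-escaping (an assumption: both escape `&`, `<`, `>`, and quote characters, but this is not the ATR rendering code), and the payload strings are hypothetical:

```python
import html

# Hypothetical homepage value an attacker might publish in package metadata.
payload = "javascript:alert(document.domain)"

# HTML escaping only rewrites &, <, >, " and '. The payload contains none of
# them, so it passes through unchanged and the scheme stays active in href.
escaped = html.escape(payload, quote=True)
anchor = f'<a href="{escaped}">homepage</a>'

# By contrast, an attribute-breakout attempt IS neutralised by escaping.
breakout = '" onmouseover="alert(1)'
escaped_breakout = html.escape(breakout, quote=True)

print(anchor)
print(escaped_breakout)
```

Escaping and scheme validation address different problems, which is why the remediation adds an explicit allowlist instead of relying on the template engine.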
### Details
**Affected Files and Lines:**
- `atr/shared/distribution.py:161-202` - URL extraction without validation
- `atr/shared/distribution.py:248` - URL rendering
- `atr/get/distribution.py:105` - Distribution display
URLs are extracted from third-party APIs and rendered without protocol
validation, allowing dangerous protocols.
### Recommended Remediation
Create a centralized URL protocol validation function and apply it to all
third-party URLs:
```python
import urllib.parse

_SAFE_URL_SCHEMES = frozenset({'http', 'https'})


def validate_url_protocol(url: str) -> str | None:
    """Validate URL has safe protocol scheme."""
    try:
        parsed = urllib.parse.urlparse(url)
        if parsed.scheme.lower() not in _SAFE_URL_SCHEMES:
            return None
        return url
    except Exception:
        # Fail closed: anything that cannot be parsed is treated as unsafe
        return None


# Apply in distribution_web_url() for all cases
# (excerpt: these lines belong inside the function body)
web_url = validate_url_protocol(raw_url)
if not web_url:
    return None


# Defense-in-depth at render layer
def html_tr_a(url: str, text: str) -> htm.Element:
    """Render link with protocol validation."""
    safe_url = validate_url_protocol(url)
    if not safe_url:
        return htm.td[text]  # Render as text if unsafe
    return htm.td[htm.a(href=safe_url)[text]]
```
Apply in `distribution_web_url()` for all cases (NPM, ArtifactHub, PyPI).
Add defense-in-depth at render layer in `html_tr_a()` to validate URLs again
before rendering.
### Acceptance Criteria
- [ ] URL validation function created
- [ ] Validation applied at storage
- [ ] Validation applied at rendering
- [ ] Dangerous protocols rejected
- [ ] Integration test verifies rejection
- [ ] Unit test verifying the fix
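
The unit-test criterion could be covered by something like the following sketch. It restates `validate_url_protocol()` from the remediation so the block runs on its own; the test name and the URL lists are illustrative assumptions, not existing ATR tests:

```python
import urllib.parse
from typing import Optional

_SAFE_URL_SCHEMES = frozenset({"http", "https"})


def validate_url_protocol(url: str) -> Optional[str]:
    """Restated from the remediation: return url only for http(s) schemes."""
    try:
        parsed = urllib.parse.urlparse(url)
        if parsed.scheme.lower() not in _SAFE_URL_SCHEMES:
            return None
        return url
    except Exception:
        return None


# Illustrative inputs (assumptions): dangerous schemes, a scheme-relative URL,
# and an empty string must all be rejected.
REJECTED = [
    "javascript:alert(1)",
    "JavaScript:alert(1)",
    "data:text/html,<script>alert(1)</script>",
    "file:///etc/passwd",
    "//example.org/path",
    "",
]
# http(s) URLs are returned unchanged, regardless of scheme case.
ACCEPTED = ["https://example.org/pkg", "HTTP://example.org/"]


def test_validate_url_protocol() -> None:
    for url in REJECTED:
        assert validate_url_protocol(url) is None, url
    for url in ACCEPTED:
        assert validate_url_protocol(url) == url, url
```

Running the same input lists against `html_tr_a()` would exercise the render-layer check as well.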
### References
- Source reports: L1:1.2.2.md
- Related findings: FINDING-070
- ASVS sections: 1.2.2
### Priority
Medium
---
### Consolidated: FINDING-070 - Missing URL Protocol Validation for SBOM Supplier URLs
**ASVS Level(s):** [L1]
**Description:**
### Summary
The `supplier_op_from_url()` function in SBOM conformance processing accepts
URLs from deps.dev API responses without protocol validation. When processing
SBOM documents, the system queries the deps.dev API for Maven package homepage
URLs and extracts the URL from the 'HOMEPAGE' link label. The fallback case
accepts ANY URL as both the supplier name and URL without validating the
protocol scheme. A `javascript:` or `data:` URL from the deps.dev API would be
stored in the SBOM supplier URL field. If this data is later rendered in a web
context with the URL as a clickable link, it could enable stored XSS.
### Details
**Affected Files and Lines:**
- `atr/sbom/conformance.py:104-115` - `supplier_op_from_url()` without validation
- `atr/sbom/conformance.py:124-132` - URL extraction from API
The function accepts any URL without protocol validation, allowing dangerous
protocols to be stored.
### Recommended Remediation
Add protocol validation to `supplier_op_from_url()`:
```python
import urllib.parse


def supplier_op_from_url(url: str) -> tuple[str, str] | None:
    """Extract supplier from URL with protocol validation."""
    try:
        parsed = urllib.parse.urlparse(url)
        # Validate protocol: only http and https are accepted
        if parsed.scheme.lower() not in ('http', 'https'):
            return None
        # ... rest of function
    except Exception:
        return None
```
Check `parsed.scheme.lower() in ('http', 'https')` and return None for
non-HTTP(S) URLs. This prevents `javascript:`, `data:`, `file:`, and other
dangerous protocols from being stored and potentially rendered.
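
A side effect of this check is that values with no scheme at all are rejected too, not only the dangerous ones. A minimal sketch of how the scheme test classifies a few inputs (the candidate URLs and the helper name `scheme_of` are illustrative assumptions, not taken from deps.dev responses or from `conformance.py`):

```python
import urllib.parse

ALLOWED_SCHEMES = ("http", "https")  # mirrors the tuple used in the remediation


def scheme_of(url: str) -> str:
    """Return the lower-cased scheme, or '' when the URL has none."""
    return urllib.parse.urlparse(url).scheme.lower()


# Hypothetical homepage values, for illustration only.
candidates = [
    "https://maven.apache.org/",
    "javascript:alert(1)",
    "data:text/html;base64,PHNjcmlwdD4=",
    "file:///etc/passwd",
    "maven.apache.org",  # bare hostname: no scheme, so it is rejected as well
]

decisions = {url: scheme_of(url) in ALLOWED_SCHEMES for url in candidates}

for url, ok in decisions.items():
    verdict = "accept" if ok else "reject"
    print(f"{verdict}  {scheme_of(url) or '(none)':<12} {url}")
```

If scheme-less homepages turn out to be common in upstream data, whether to drop them or normalize them first is a separate decision from the XSS fix.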
### Acceptance Criteria
- [ ] Protocol validation added
- [ ] Only HTTP(S) URLs accepted
- [ ] Dangerous protocols rejected
- [ ] None returned for invalid URLs
- [ ] Integration test verifies rejection
- [ ] Unit test verifying the fix
### References
- Source reports: L1:1.2.2.md
- Related findings: FINDING-069
- ASVS sections: 1.2.2
### Priority
Medium
---
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]