GitHub user rajeshbulleddula-svg created a discussion: Security Concern: Vended Credentials as Bearer Tokens — Credential Delegation Violation & Workload Identity Binding
**Environment:** Polaris on AWS EKS | Compute engine (e.g., Spark) on-prem or on AWS | Entra ID as IdP | S3 object storage

---

### The Core Security Violation: Vended Credential Delegation

Our organization's security team has flagged a fundamental violation in the vended credentials model as it stands today, one that goes beyond the typical bearer token concern:

The entity that requests the STS token (Polaris) is not the entity that uses it (Spark). Polaris requests vended credentials under its own identity, then hands them to another party, the compute engine in this case.

This is a **credential delegation violation**. The standard security principle for STS tokens, and bearer tokens generally, is that the requesting entity is the consuming entity. That assumption is broken here by design.

The flow today is:

```
Spark authenticates via external IdP
  → calls Polaris REST API with JWT
  → Polaris validates JWT, calls AWS STS AssumeRole
  → Polaris receives STS token
  → Polaris hands STS token back to Spark
  → Spark uses vended credentials directly against S3
  → S3 validates that the token is valid; no validation of who is holding it
```

**This is not just a theoretical risk. It means:**

- The credential requestor (Polaris) and the credential consumer (Spark) are different entities.
- Any actor who obtains the token (a compromised executor, a malicious insider, an accidental log exposure) can use it against S3 as if they were the original authorized requestor.
- AWS CloudTrail will show S3 access under Polaris's assumed role, not under Spark's or the end user's identity, creating an audit gap.
- S3 has no way to enforce that the token is only used by the intended consumer.
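To make the bearer property concrete, here is a minimal, self-contained simulation of the flow above. It is only a sketch: `ToySts`, `ToyObjectStore`, the role name `polaris-table-access`, and the object key are hypothetical stand-ins invented for illustration, not the Polaris, AWS STS, or S3 APIs. It shows that a store which checks only token validity cannot tell the intended consumer from anyone else holding the token, and that its audit trail records only the assumed role.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class VendedCredential:
    token: str         # opaque bearer secret
    role: str          # role assumed by the requester (the catalog)
    expires_at: float  # expiry as epoch seconds


class ToySts:
    """Hypothetical token service; NOT the real AWS STS API."""

    def __init__(self):
        self._issued = {}

    def assume_role(self, requester, role, ttl_seconds):
        cred = VendedCredential(secrets.token_hex(16), role, time.time() + ttl_seconds)
        # The service knows who requested the token, not who will use it.
        self._issued[cred.token] = (requester, cred)
        return cred

    def validate(self, token):
        entry = self._issued.get(token)
        if entry is None:
            return None
        _requester, cred = entry
        return cred if time.time() < cred.expires_at else None


class ToyObjectStore:
    """Hypothetical object store that accepts any valid bearer token."""

    def __init__(self, sts):
        self._sts = sts
        self.audit_log = []

    def get_object(self, token, key):
        cred = self._sts.validate(token)
        if cred is None:
            return False
        # Only the assumed role is visible here; the holder's identity is not.
        self.audit_log.append((cred.role, key))
        return True


sts = ToySts()
store = ToyObjectStore(sts)

# The catalog requests the credential under its own identity ...
cred = sts.assume_role(requester="polaris", role="polaris-table-access", ttl_seconds=900)

# ... and hands it to the compute engine, which uses it as intended.
spark_ok = store.get_object(cred.token, "warehouse/t1/data.parquet")

# Anyone else who copies the token within its TTL gets the same result.
attacker_ok = store.get_object(cred.token, "warehouse/t1/data.parquet")

print(spark_ok, attacker_ok)
print(store.audit_log)
```

In this toy model both reads succeed and both audit entries carry the same role, which mirrors the delegation and audit-gap concerns listed above.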
Concrete threat vectors our security team has identified:

- A compromised Spark executor node exfiltrates the token and uses it outside the cluster
- A malicious insider on the compute cluster copies the token and accesses S3 independently
- The token is accidentally logged and replayed within its TTL window
- A man-in-the-middle intercepts the token in transit from Polaris to Spark

This is a compliance and audit concern for us, particularly because our compute spans on-premises and multiple clouds, making network perimeter controls (VPC endpoints, IP allowlisting) operationally unscalable and insufficient.

---

### What We Are Looking For

We want to understand:

- Whether the credential delegation pattern is a known, accepted limitation of the current architecture
- Whether there are community-endorsed patterns for hardening credential security in multi-cloud environments
- Whether there is roadmap intent to address workload identity binding and credential delegation natively in Polaris

Any input from the maintainers or organizations running Polaris in production at scale would be greatly appreciated. Happy to contribute to a design discussion if there is community interest in tackling this more formally.

GitHub link: https://github.com/apache/polaris/discussions/3972
