[ 
https://issues.apache.org/jira/browse/HADOOP-19906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18085046#comment-18085046
 ] 

ASF GitHub Bot commented on HADOOP-19906:
-----------------------------------------

pan3793 opened a new pull request, #8522:
URL: https://github.com/apache/hadoop/pull/8522

   
   ### Description of PR
   
   This is an alternative to HADOOP-19668 and HADOOP-19670 
(SubjectInheritingThread approach) to restore Subject propagation semantics on 
JDK 22+.
   
   Issues with the HADOOP-19668 and HADOOP-19670 approach:
   
   - It requires invasive modification of all downstream projects. Downstream 
projects like Spark must replace every Thread with SubjectInheritingThread, 
which is a lot of work and mostly impossible in practice, since third-party 
libraries may not allow setting a custom ThreadFactory.
   
   - semantics are not fully aligned
   
     The Subject should be captured at Thread construction time rather than 
when start() is called: a Thread constructed in context A and started in 
context B must observe the Subject of A. This mismatch breaks a typical thread 
usage pattern, the thread pool.
   
     There is no cascading Subject propagation. This follows directly from 
the design, because `SubjectInheritingThread` must be declared explicitly 
everywhere to get Subject propagation.
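   
   The capture-at-construction semantics described above can be illustrated 
with a plain `InheritableThreadLocal`. This is a minimal sketch, not the code 
in this patch: the class name, the `CURRENT` field, and the use of a String in 
place of a real Subject are all assumptions made for illustration. It shows 
that a platform thread constructed in context "A" and started in context "B" 
observes "A", and that a pooled worker keeps whatever value it inherited when 
the worker thread was constructed.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class CaptureAtConstructionDemo {
    // Hypothetical stand-in for "the current Subject"; a String keeps the sketch small.
    static final InheritableThreadLocal<String> CURRENT = new InheritableThreadLocal<>();

    /** Constructs a thread under "A", starts it under "B", returns what the child saw. */
    public static String constructInAStartInB() {
        AtomicReference<String> seen = new AtomicReference<>();
        try {
            CURRENT.set("A");
            // Inheritable values are copied when the Thread object is constructed.
            Thread child = new Thread(() -> seen.set(CURRENT.get()));
            CURRENT.set("B");
            child.start();
            child.join();
            return seen.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            CURRENT.remove();
        }
    }

    /** The pool's only worker is created under "A"; a later task is submitted under "B". */
    public static String pooledTaskSubmittedInB() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            CURRENT.set("A");
            Runnable warmUp = () -> { };
            pool.submit(warmUp).get();      // worker thread is constructed here
            CURRENT.set("B");
            Callable<String> read = CURRENT::get;
            return pool.submit(read).get(); // runs on the worker created under "A"
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            CURRENT.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(constructInAStartInB());
        System.out.println(pooledTaskSubmittedInB());
    }
}
```

   Both methods return "A" on platform threads. The second case is the 
trade-off of inheriting at construction: a task submitted later from a 
different context still sees the value the worker inherited.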
   
   This approach addresses the issues above but has limitations of its own:
   
   - The approach should work on platform threads, but 
InheritableThreadLocal behaves differently on virtual threads.
   
   - The approach works for UGI.doAs but does not apply to Subject.doAs; 
e.g., users of JAAS-based Kerberos auth still need to maintain Subject 
propagation semantics themselves.
   
   ### How was this patch tested?
   
   Unit tests were added. The patch was also applied to our internal branch 
and integrated with Spark, in three configurations:
   
   - Without HADOOP-19668 and HADOOP-19670
   
     Spark on YARN works correctly at first: YARN prepares the credentials 
before launching the container, so every thread sees a null Subject, falls 
back to the login user, and picks up those credentials. Subsequent delegation 
token (DT) updates are broken, however. On K8s, everything on the executor 
side is broken.
   
   - With HADOOP-19668 and HADOOP-19670 applied to the Hadoop client, and 
Spark's threads changed to use SubjectInheritingThread
   
     A few threads see the same Subject, but most do not, because 
SubjectInheritingThread does not work in the thread pool case (explained in 
the section above). Nothing changes from the user's perspective.
   
   - With HADOOP-19668 and HADOOP-19670 reverted and this patch applied
   
     Changes are needed only in the Hadoop client; both Spark on YARN and 
Spark on K8s work as expected.
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   ### AI Tooling
   
   Contains content generated by Claude Opus 4.7.




> Alternative to SubjectInheritingThread to restore Subject propagation
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-19906
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19906
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 3.5.0, 3.4.3
>            Reporter: Cheng Pan
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
