mbutrovich commented on PR #4335:
URL: 
https://github.com/apache/datafusion-comet/pull/4335#issuecomment-4462040690

   Thanks for putting this together @karuppayya! We're definitely thinking 
about the same problem right now. :)
   
   One thing I'd like to push on before we go further: a fair amount of this PR 
is distribution and lifecycle wiring (the per-catalog broadcast cache, 
threading the broadcast through `CometExecRDD` and the operator, the 
`IcebergReflection.getCatalogName` lookup that keys the cache, and the 
`initialize(catalogProperties)` step that exists so the broadcast carries 
config). That feels like vendor territory rather than Comet territory. A 
vendor credential manager can already integrate with Spark's own credential 
distribution mechanisms, run its own broadcast, or stash state in a lazy 
singleton, and Comet has to know about none of that. If the SPI is 
`ServiceLoader`-discovered, every executor finds the provider on its own 
classpath and the broadcast layer goes away.
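   To make the `ServiceLoader` point concrete, here is a minimal sketch of what 
discovery could look like. The `CredentialProvider` interface and all names are 
hypothetical, not Comet's or #4309's actual API; the point is only that each 
JVM (driver or executor) resolves the vendor implementation from its own 
classpath, so no broadcast is needed:

```java
import java.util.ServiceLoader;

// Hypothetical SPI for illustration only; not the actual Comet interface.
interface CredentialProvider {
    String credentialFor(String table);
}

public class ProviderLookup {
    // A vendor registers its implementation by shipping a
    // META-INF/services/CredentialProvider file in its jar. Every executor
    // runs this same lookup locally, so caching, refresh, and config
    // loading stay inside the vendor's implementation.
    static CredentialProvider discover() {
        for (CredentialProvider p : ServiceLoader.load(CredentialProvider.class)) {
            return p; // first registered provider wins
        }
        return null; // no vendor jar on the classpath
    }

    public static void main(String[] args) {
        CredentialProvider p = discover();
        System.out.println(p == null ? "no provider registered"
                                     : p.getClass().getName());
    }
}
```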
   
   My concern is scope. Comet's value is accelerating compute-intensive 
operations in native code, and I'd rather Comet not take on the responsibility 
of reimplementing Spark's credential plumbing when vendors are already 
equipped to handle it. #4309 takes the narrower position of owning only the 
JNI shape and the per-call context, and leaves caching, refresh, and 
distribution to the vendor implementation.
   
   Is there a scenario you have in mind that #4309's shape cannot cover?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
