cbalci opened a new pull request, #10321:
URL: https://github.com/apache/pinot/pull/10321

   This is the first of two PRs that will add Spark3 support to 
`pinot-spark-connector`.
   
   **Background**
   Apache Spark 
[changed](https://blog.madhukaraphatak.com/spark-3-datasource-v2-part-3) the 
Datasource interface significantly between Spark2 and Spark3, so 
`pinot-spark-connector` doesn't work with Spark3. We can implement a new 
connector for Spark3 as a separate module; however, about half of the code 
under the existing connector is independent of the interface and can 
potentially be reused across the Spark2 and Spark3 connectors. To enable this, 
I'm restructuring the packages, similar to what was done for batch ingestion in 
https://github.com/apache/pinot/pull/8560.
   
   **Change**
   In this PR, I'm refactoring the Spark Connector into two packages:
   `pinot-spark-connector` --> ( `pinot-spark-common` +  
`pinot-spark-2-connector` )
   
   This is mostly a mechanical refactoring that moves packages around and 
renames fields/classes for clarity. The only addition is 
`CaseInsensitiveStringMap`, backported from Spark, which makes 
`PinotDataSourceReadOptions` reusable across implementations (see comment below).
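   
   To illustrate why a case-insensitive map helps here, below is a minimal 
sketch in Java of the general idea: option keys are normalized to lower case 
once, so lookups work regardless of how the caller capitalized them. This is 
not the backported class itself. The class name `CaseInsensitiveOptions`, its 
methods, and the option keys in the usage note are hypothetical and chosen 
only for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical, simplified stand-in for a case-insensitive options map.
 * Not the class added in this PR; a sketch of the technique only.
 */
final class CaseInsensitiveOptions {
  // Keys are stored lower-cased so that lookups ignore the caller's casing.
  private final Map<String, String> lowerCased = new HashMap<>();

  CaseInsensitiveOptions(Map<String, String> original) {
    for (Map.Entry<String, String> entry : original.entrySet()) {
      lowerCased.put(normalize(entry.getKey()), entry.getValue());
    }
  }

  // Locale.ROOT avoids locale-specific surprises (e.g. the Turkish dotless i).
  private static String normalize(String key) {
    return key.toLowerCase(Locale.ROOT);
  }

  boolean containsKey(String key) {
    return lowerCased.containsKey(normalize(key));
  }

  /** Returns the value for the key, or null if the key is absent. */
  String get(String key) {
    return lowerCased.get(normalize(key));
  }

  /** Returns the value parsed as an int, or defaultValue if the key is absent. */
  int getInt(String key, int defaultValue) {
    String value = get(key);
    return value == null ? defaultValue : Integer.parseInt(value.trim());
  }
}
```

   With such a map, an options reader can ask for `"tabletype"` even if the 
user supplied `"tableType"` (an illustrative key), and the same parsing code 
can be shared by connectors whose Spark APIs hand over options in different 
container types.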
   
   **Testing**
   Usage and functionality of the Spark2 connector should be completely 
unchanged except for the renaming of the Maven module. All the unit tests are 
preserved to confirm that previous assumptions still hold. I also ran the 
integration tests under `ExampleSparkPinotConnectorTest` to verify the expected 
behavior.
   
   
   To preview the full changes, including the Spark3 Connector implementation, 
you can check [this 
diff](https://github.com/apache/pinot/compare/master...cbalci:pinot:pinot-spark3-connector#diff-8733e0d7481c08afa1005dfcebcf93aae07cfdc996c82f809bef22fcb2061e0eR41).
   
   `refactor` `cleanup`
   `release-notes` ('Pinot Spark Connector' module is renamed to 'Pinot Spark 2 
Connector' for clarity)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

