cbalci opened a new pull request, #10321: URL: https://github.com/apache/pinot/pull/10321
This is the first of two PRs that will add Spark3 support for `pinot-spark-connector`.

**Background**

Apache Spark has [changed](https://blog.madhukaraphatak.com/spark-3-datasource-v2-part-3) the Datasource interface significantly between Spark2 and Spark3, so `pinot-spark-connector` doesn't work with Spark3. We can implement a new connector for Spark3 as a separate module; however, about half of the logic/code in the existing connector is independent of the interface and can potentially be reused across the Spark2 and Spark3 connectors. For this, I'm restructuring the packages similar to what was done for batch ingestion in https://github.com/apache/pinot/pull/8560.

**Change**

In this PR, I'm refactoring the Spark Connector into two packages:

`pinot-spark-connector` --> ( `pinot-spark-common` + `pinot-spark-2-connector` )

This is mostly a mechanical refactoring which moves packages around and renames fields/classes for clarity. The only addition is `CaseInsensitiveStringMap`, backported from Spark, to make `PinotDataSourceReadOptions` reusable across implementations (see comment below).

**Testing**

Usage and functionality of the Spark2 connector should be completely unchanged except for the renaming of the Maven module. All the unit tests are preserved to ensure the previous assumptions still hold. I also ran the integration tests under `ExampleSparkPinotConnectorTest` to verify expected behavior.

To preview the full changes, including the Spark3 connector implementation, you can check [this diff](https://github.com/apache/pinot/compare/master...cbalci:pinot:pinot-spark3-connector#diff-8733e0d7481c08afa1005dfcebcf93aae07cfdc996c82f809bef22fcb2061e0eR41).

`refactor` `cleanup` `release-notes` ('Pinot Spark Connector' module is renamed to 'Pinot Spark 2 Connector' for clarity)
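To illustrate why a case-insensitive option map helps share read options between connector implementations, here is a minimal sketch of the idea in Java. It is not the backported Spark class or the code in this PR: the class name `CaseInsensitiveOptions`, its methods, and the option keys used in `main` are assumptions chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a case-insensitive options map.
// Not the actual Spark or Pinot source; names are illustrative.
final class CaseInsensitiveOptions {
  // Keys are stored lower-cased so lookups can ignore the caller's casing.
  private final Map<String, String> lowerCased = new HashMap<>();

  CaseInsensitiveOptions(Map<String, String> original) {
    for (Map.Entry<String, String> e : original.entrySet()) {
      // Normalize each key once, at construction time.
      lowerCased.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
    }
  }

  // Returns the value for the key regardless of casing, or null if absent.
  String get(String key) {
    return lowerCased.get(key.toLowerCase(Locale.ROOT));
  }

  boolean containsKey(String key) {
    return lowerCased.containsKey(key.toLowerCase(Locale.ROOT));
  }

  // Parses the value as an int, falling back to the default when the key is absent.
  int getInt(String key, int defaultValue) {
    String value = get(key);
    return value == null ? defaultValue : Integer.parseInt(value.trim());
  }

  public static void main(String[] args) {
    // Example option keys; assumed here purely for demonstration.
    Map<String, String> raw = new HashMap<>();
    raw.put("Table", "airlineStats");
    raw.put("segmentsPerSplit", "3");

    CaseInsensitiveOptions options = new CaseInsensitiveOptions(raw);
    System.out.println(options.get("table"));
    System.out.println(options.getInt("SEGMENTSPERSPLIT", 1));
  }
}
```

With a wrapper like this, option parsing can be written once against a single map type, whichever Spark version's API supplied the original options.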