chinnaraolalam opened a new issue, #9113:
URL: https://github.com/apache/iceberg/issues/9113

   ### Apache Iceberg version
   
   1.3.0
   
   ### Query engine
   
   Hive
   
   ### Please describe the bug 🐞
   
   
   Iceberg supports both Hive 2 and Hive 3. To do so, it has the following modules:
   
   1. mr
   2. hive-metastore
   3. hive3
   4. hive3-orc
   5. hive-runtime
   
   The “mr” module contains all common classes and test classes, plus the Hive 2 classes.
   The “hive3” module contains the Hive 3-specific classes and test classes.
   
   The Hive modules produce a single runtime jar that is shared by Hive 2 and Hive 3.
   These modules are very tightly coupled: if a user wants only Hive 3, there is currently no way to disable Hive 2. Hive 2 depends on Hadoop 2, and because Hive 2 cannot be disabled, Hadoop 2 must be configured in order to compile it. The Hadoop version is configured in a common file, so all other modules, such as Spark and Flink, end up using Hadoop 2 as well.
   
   **There are 2 issues here:**
   
       1. Currently all modules use Hadoop 2 because it is specified in the common file, while actual deployments run on Hadoop 3.x. So compilation and testing happen on Hadoop 2 and deployment happens on Hadoop 3.x. This is the major issue and needs to be fixed.
          
       2. Hive 3 should be configured by default, as it is the widely used and stable version, with a provision to compile and build Hive 2 whenever it is needed. That way the Hive 2.x files are not included in the hive-runtime jar every time, and the Hadoop 2 dependency can be avoided.
   
   **Proposed solution:**
   
   Spark has proper version management (Spark 3.2, Spark 3.3, etc.), where every module builds its own spark_runtime.jar. Hive should have version management like Spark, but Hive does not have as many versions and Hive 3 is the more stable one. So instead of sophisticated version management, make hive3 the default and make hive2 configurable (provide an option to build it when it is needed), as below.
   
   **Introduce a new module, hive2:**
   
   Introduce a new module named hive2 and move all Hive 2 classes and test classes into it.
   The “mr” module should contain all common classes and test classes. (“mr” could be renamed to hive-common.)
   Configure Hive 3 by default and, since no other component depends on Hadoop 2, make Hadoop 3 the default in the common version file, which applies to all other components.
   
   The Hive modules will then look like this:
   
   1. mr (rename to hive-common)
   2. hive-metastore
   3. hive2
   4. hive3
   5. hive3-orc
   6. hive-runtime 
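   
   As a rough sketch, the module layout above could be wired into `settings.gradle` so that each version-specific module is only included when requested. This assumes the Hive versions are passed as a comma-separated `hiveVersions` system property; the default of `"3"` and the `hive2` directory are hypothetical and reflect the proposal, not the current build.

   ```groovy
   // settings.gradle (sketch; the hive2 module and the "3" default are hypothetical)
   // Resolve the requested Hive versions, e.g. -DhiveVersions=2,3, defaulting to Hive 3.
   List<String> hiveVersions = (System.getProperty("hiveVersions") ?: "3").split(",") as List<String>

   // Include the Hive 2 module only when it was explicitly requested.
   if (hiveVersions.contains("2")) {
     include ':iceberg-hive2'
     project(':iceberg-hive2').projectDir = file('hive2')
   }

   // Hive 3 is part of the build by default.
   if (hiveVersions.contains("3")) {
     include ':iceberg-hive3'
     project(':iceberg-hive3').projectDir = file('hive3')
   }
   ```

   With this shape, a plain build would only see the common and Hive 3 modules, and something like `-DhiveVersions=2` would swap in the Hive 2 module instead.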
   
   **Currently the hive-runtime jar is built as below**
   
   dependencies {
     implementation project(':iceberg-mr')
     if (jdkVersion == '8' && hiveVersions.contains("3")) {
       implementation project(':iceberg-hive3')
     }
   }
   
   **This can be changed as below**
   
   dependencies {
     implementation project(':iceberg-mr')
     if (hiveVersions.contains("2")) {
       implementation project(':iceberg-hive2')
     }
     if (jdkVersion == '8' && hiveVersions.contains("3")) {
       implementation project(':iceberg-hive3')
     }
   }
   
   Make Hive 3 and Hadoop 3 the default, as the major components support Hadoop 3. With this change there is still a provision to build Hive 2.
   
   After the hive2 module is introduced, it is built only when hive2 is specified. It requires Hadoop 2, which then also needs to be specified.
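   
   Since Hive 2 would need Hadoop 2 to be selected alongside it, the build could fail fast on a mismatched combination. The following is only a sketch: the `hadoopVersion` system property and the `"3"` defaults are hypothetical names chosen for illustration, not existing Iceberg build options.

   ```groovy
   // build.gradle (sketch; 'hadoopVersion' and both defaults are hypothetical)
   def hiveVersions = (System.getProperty("hiveVersions") ?: "3").split(",") as List<String>
   def hadoopVersion = System.getProperty("hadoopVersion") ?: "3"

   // Hive 2 depends on Hadoop 2, so reject a mismatched combination early
   // instead of failing later during compilation.
   if (hiveVersions.contains("2") && !hadoopVersion.startsWith("2")) {
     throw new GradleException(
         "Building iceberg-hive2 requires Hadoop 2; pass -DhadoopVersion=2 together with -DhiveVersions=2")
   }
   ```

   A check like this keeps the default path (Hive 3 on Hadoop 3) untouched and makes the extra Hadoop 2 requirement explicit for anyone opting into Hive 2.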
   
   Now hive_runtime.jar will contain only the version-specific classes, for either Hive 2 or Hive 3.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

