[DISCUSSION] Hive and Presto Write support + Performance improvement

Kunal Kapoor Wed, 08 Jan 2020 22:21:00 -0800

Hi All,
As you all know, carbon has supported reading carbon tables from
presto and hive for a long time now, and it is high time that we start
supporting writes from presto and hive in version 2.0.0.


The development would be divided into 2 phases.

*Phase1 (Hive):*
*1. Support an OutputFormat (MapredCarbonOutputFormat) that allows the user
to write data in carbondata format from hive.*
    - Tables would be created in spark until a solution for creating the
schema file in hive is found.
    - Tables would use the same folder structure as a transactional
table.
    - Carbon-specific DDL/DML would not be supported.

*2. Read performance should be equivalent to or better than ORC.*

*Phase2 (Presto): To be done later*
The tasks are the same as for Hive, and the task list would be updated
after analysis.

Any suggestions from the community are appreciated.

Thanks
Kunal Kapoor
