asfimport opened a new issue, #279:
URL: https://github.com/apache/arrow-java/issues/279

   We added a new Java interface to support reading and writing Parquet files 
from HDFS or the local file system.
   
   The motivation is that when loading and dumping Parquet data in Java, we 
can currently only use row-based put and get methods. Since Arrow already has a 
C++ implementation for loading and dumping Parquet, we wrapped that code as 
Java APIs.
   
   In testing on our workload, performance improved by more than 2x compared 
with row-based load and dump, so we would like to contribute the code to Arrow.
   
   Since this is a completely independent change, it does not modify any 
existing Arrow code. We added two folders: java/adapter/parquet and 
cpp/src/jni/parquet
   
   **Reporter**: [Chendi.Xue](https://issues.apache.org/jira/browse/ARROW-6720)
   #### Related issues:
   - [[Java][Dataset] Implement Datasets Java API 
](https://github.com/apache/arrow/issues/17055) (incorporates)
   - [[Java][Dataset] Support writing to files within dataset scanner via 
JNI](https://github.com/apache/arrow/issues/27628) (incorporates)
   #### PRs and other links:
   - [GitHub Pull Request 
apache/arrow#5522](https://github.com/apache/arrow/pull/5522)
   - [GitHub Pull Request 
apache/arrow#5717](https://github.com/apache/arrow/pull/5717)
   - [GitHub Pull Request 
apache/arrow#5719](https://github.com/apache/arrow/pull/5719)
   
   <sub>**Note**: *This issue was originally created as 
[ARROW-6720](https://issues.apache.org/jira/browse/ARROW-6720). Please see the 
[migration documentation](https://github.com/apache/arrow/issues/14542) for 
further details.*</sub>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
