wgtmac commented on code in PR #1993:
URL: https://github.com/apache/orc/pull/1993#discussion_r1713005130


##########
c++/include/orc/Common.hh:
##########
@@ -224,6 +224,26 @@ namespace orc {
      * Get the writer timezone.
      */
     virtual const std::string& getWriterTimezone() const = 0;
+    /**
+     *Get the list of Localkeys for all columns in the Stripe, with each 
encrypted column
+     *corresponding to a Localkey.
+     * @return
+     */
+    virtual std::shared_ptr<std::vector<std::vector<unsigned char>>>
+    getEncryptedLocalKeys() const = 0;
+    /**
+     *Get the Localkey for a specific column in this Stripe.
+     * @param col
+     * @return

Review Comment:
   ```suggestion
   ```
   
   If the param docstring does not provide additional information, please 
remove it. Same for other comments.



##########
c++/include/orc/Common.hh:
##########
@@ -224,6 +224,26 @@ namespace orc {
      * Get the writer timezone.
      */
     virtual const std::string& getWriterTimezone() const = 0;
+    /**
+     *Get the list of Localkeys for all columns in the Stripe, with each 
encrypted column
+     *corresponding to a Localkey.
+     * @return
+     */
+    virtual std::shared_ptr<std::vector<std::vector<unsigned char>>>
+    getEncryptedLocalKeys() const = 0;
+    /**
+     *Get the Localkey for a specific column in this Stripe.
+     * @param col
+     * @return
+     */
+    virtual std::vector<unsigned char>& getEncryptedLocalKeyByVariantId(int 
col) const = 0;

Review Comment:
   ```suggestion
       virtual std::vector<unsigned char>& 
getEncryptedLocalKeyByVariantId(int32_t col) const = 0;
   ```
   
   It would be good to explicitly use fixed width type instead of `int` or 
`long` for portability.



##########
c++/src/security/InMemoryKeystore.cc:
##########
@@ -0,0 +1,247 @@
+// Copyright 2010-present vivo, Inc. All rights reserved.

Review Comment:
   Please note that we have a license check which does not allow custom 
modification to the license header in this repo. I'm not sure if this violates 
any ASF rule to add something like this. @dongjoon-hyun Do you have any 
concern? 



##########
c++/src/CMakeLists.txt:
##########
@@ -211,6 +213,8 @@ target_link_libraries (orc
     $<BUILD_INTERFACE:orc::lz4>
     $<BUILD_INTERFACE:orc::zstd>
     $<BUILD_INTERFACE:${LIBHDFSPP_LIBRARIES}>
+    ${OPENSSL_CRYPTO_LIBRARY}
+    ${OPENSSL_SSL_LIBRARY}

Review Comment:
   To give more context, we support two modes for each 3rd-party library:
   - download & build static library (by default)
   - find locally installed library using FindXXX CMake config file
   
   For this PR, I think we need to at least support one of them and make sure 
that CIs on the decryption are running happily.



##########
c++/src/security/ReaderEncryption.hh:
##########
@@ -0,0 +1,215 @@
+// Copyright 2010-present vivo, Inc. All rights reserved.
+//
+//Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#ifndef ORC_READERENCRYPTION_H
+#define ORC_READERENCRYPTION_H
+
+#include <memory>
+#include "InMemoryKeystore.hh"
+#include "orc/Common.hh"
+#include "orc/Statistics.hh"
+#include "orc/Type.hh"
+#include "orc_proto.pb.h"
+namespace orc {
+  class ReaderEncryptionKey;
+  struct FileContents;
+  class EncryptionVariant {
+   public:
+    std::shared_ptr<ReaderEncryptionKey> getKeyDescription();
+
+    Type* getRoot();
+
+    int getVariantId();
+
+    std::shared_ptr<SecretKeySpec> getFileFooterKey();
+
+    std::shared_ptr<SecretKeySpec> getStripeKey(long stripe);
+  };
+
+  // 加密变体,每个列一个

Review Comment:
   Please use English for comment.



##########
c++/include/orc/Common.hh:
##########
@@ -224,6 +224,26 @@ namespace orc {
      * Get the writer timezone.
      */
     virtual const std::string& getWriterTimezone() const = 0;
+    /**
+     *Get the list of Localkeys for all columns in the Stripe, with each 
encrypted column
+     *corresponding to a Localkey.
+     * @return
+     */
+    virtual std::shared_ptr<std::vector<std::vector<unsigned char>>>
+    getEncryptedLocalKeys() const = 0;
+    /**
+     *Get the Localkey for a specific column in this Stripe.
+     * @param col
+     * @return
+     */
+    virtual std::vector<unsigned char>& getEncryptedLocalKeyByVariantId(int 
col) const = 0;
+    /**
+     * In general, only the first stripe in an ORC file will store the 
LocalKey.In this case, the
+     * stripeId and originalStripeId are equal. If an ORC file has multiple 
stripes storing the
+     * LocalKey, the values of stripeId and originalStripeId may not be equal.
+     * @return
+     */
+    virtual long getOriginalStripeId() const = 0;

Review Comment:
   ```suggestion
       virtual int64_t getOriginalStripeId() const = 0;
   ```
   Same as above.



##########
c++/src/CMakeLists.txt:
##########
@@ -211,6 +213,8 @@ target_link_libraries (orc
     $<BUILD_INTERFACE:orc::lz4>
     $<BUILD_INTERFACE:orc::zstd>
     $<BUILD_INTERFACE:${LIBHDFSPP_LIBRARIES}>
+    ${OPENSSL_CRYPTO_LIBRARY}
+    ${OPENSSL_SSL_LIBRARY}

Review Comment:
   In the meanwhile, it would be better if we can add a CMake option to 
enable/disable decryption. This is an example: 
https://github.com/apache/orc/blob/main/CMakeLists.txt#L76.
   
   This is important to keep backward compatibility for downstream projects 
which may not have depended on openssl. Otherwise these projects may be 
reluctant to upgrade due to a lot of work to adopt openssl.



##########
c++/include/orc/Reader.hh:
##########
@@ -39,7 +39,7 @@ namespace orc {
   // classes that hold data members so we can maintain binary compatibility
   struct ReaderOptionsPrivate;
   struct RowReaderOptionsPrivate;
-
+  class KeyProvider;

Review Comment:
   The `c++/include/orc` folder is public and files in it will be installed. If 
the `KeyProvider` is also public, please move its definition to the include 
folder as well. Otherwise, we should remove the related function from here.



##########
c++/src/TypeImpl.cc:
##########
@@ -837,5 +837,4 @@ namespace orc {
     }
     return nullptr;
   }
-

Review Comment:
   Please revert the unnecessary change.



##########
c++/src/CMakeLists.txt:
##########
@@ -211,6 +213,8 @@ target_link_libraries (orc
     $<BUILD_INTERFACE:orc::lz4>
     $<BUILD_INTERFACE:orc::zstd>
     $<BUILD_INTERFACE:${LIBHDFSPP_LIBRARIES}>
+    ${OPENSSL_CRYPTO_LIBRARY}
+    ${OPENSSL_SSL_LIBRARY}

Review Comment:
   Please follow https://github.com/apache/orc/tree/main/cmake_modules to add 
similar support for openssl.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to