wgtmac commented on code in PR #112:
URL: https://github.com/apache/iceberg-cpp/pull/112#discussion_r2177045928


##########
src/iceberg/table_scan.h:
##########
@@ -0,0 +1,210 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#pragma once
+
+#include <string>
+#include <vector>
+
+#include "iceberg/manifest_entry.h"
+#include "iceberg/type_fwd.h"
+
+namespace iceberg {
+
+/// \brief Represents a task to scan a table or a portion of it.

Review Comment:
   ```suggestion
   /// \brief An abstract scan task.
   ```



##########
src/iceberg/table_scan.h:
##########
@@ -0,0 +1,210 @@
+/// \brief Represents a task to scan a table or a portion of it.
+class ICEBERG_EXPORT ScanTask {
+ public:
+  virtual ~ScanTask() = default;
+
+  /// \brief The number of bytes that should be read by this scan task.
+  virtual int64_t size_bytes() const = 0;
+
+  /// \brief The number of files that should be read by this scan task.
+  virtual int32_t files_count() const = 0;
+
+  /// \brief The number of rows that should be read by this scan task.
+  virtual int64_t estimated_row_count() const = 0;
+};
+
+/// \brief Represents a task to scan a portion of a data file.
+class ICEBERG_EXPORT FileScanTask : public ScanTask {

Review Comment:
   Some thoughts about `FileScanTask`:
   
   1. Should we remove the `ScanTask` abstraction above? Without it, we can 
create a task directly with aggregate initialization. Otherwise we may need to 
expand the constructor every time a new parameter is required.
   2. If we do (1), could we also make it a simple struct by removing all 
member functions? They are all trivial accessors.
   3. Should we add the fields from Java `PartitionScanTask` (i.e. `spec` and 
`partition_value`) to support partitioning? We can add them later, but a TODO 
comment would be good to have now.
   4. Should we combine `start` and `length` and wrap them in a single 
`std::optional`? I believe they are not always required.
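   
   To make points 1, 2 and 4 concrete, here is a rough sketch of what a 
plain-struct `FileScanTask` might look like. The `DataFile` stand-in, the 
`ByteRange` name and the `SizeBytes` helper are assumptions made for 
illustration only; they are not taken from this PR or from the existing 
iceberg-cpp headers.
   
   ```cpp
   #include <cassert>
   #include <cstdint>
   #include <memory>
   #include <optional>
   #include <string>
   #include <vector>
   
   // Hypothetical stand-in for the real DataFile type, reduced to the
   // fields this sketch needs.
   struct DataFile {
     std::string file_path;
     int64_t file_size_in_bytes = 0;
     int64_t record_count = 0;
   };
   
   // Hypothetical type combining `start` and `length` into one value.
   struct ByteRange {
     int64_t start = 0;
     int64_t length = 0;
   };
   
   // FileScanTask as a plain aggregate: no base class, no accessors.
   struct FileScanTask {
     std::shared_ptr<DataFile> data_file;
     std::vector<std::shared_ptr<DataFile>> delete_files;
     // An empty optional means "scan the whole file".
     std::optional<ByteRange> range;
     // TODO: add partition spec and partition value
     // (see Java PartitionScanTask).
   };
   
   // Free function replacing the former size_bytes() virtual accessor.
   inline int64_t SizeBytes(const FileScanTask& task) {
     assert(task.data_file != nullptr);
     return task.range ? task.range->length
                       : task.data_file->file_size_in_bytes;
   }
   ```
   
   With this shape, a task can be built as `FileScanTask{file, {}, 
std::nullopt}` for a whole file, or `FileScanTask{file, {}, ByteRange{128, 
256}}` for a split. Adding a field later does not require touching a 
constructor.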



##########
src/iceberg/table_scan.h:
##########
@@ -0,0 +1,210 @@
+/// \brief Represents a task to scan a portion of a data file.

Review Comment:
   ```suggestion
   /// \brief Task representing a data file and its corresponding delete files.
   ```
   
   Copied from `iceberg-python`



##########
src/iceberg/manifest_reader.h:
##########
@@ -43,10 +44,27 @@ class ICEBERG_EXPORT ManifestReader {
 /// \brief Read manifest files from a manifest list file.
 class ICEBERG_EXPORT ManifestListReader {
  public:
+  virtual ~ManifestListReader() = default;
   virtual Result<std::span<std::unique_ptr<ManifestFile>>> Files() const = 0;
 
  private:
   std::unique_ptr<StructLikeReader> reader_;
 };
 
+/// \brief Creates a reader for the manifest list.
+/// \param file_path Path to the manifest list file.
+/// \return A Result containing the reader or an error.
+Result<std::unique_ptr<ManifestListReader>> CreateManifestListReader(
+    const std::string& file_path) {

Review Comment:
   I don't know yet. It depends on how we will use them. BTW @dongxiao1198 will 
work on manifest reading.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

