wgtmac commented on code in PR #180:
URL: https://github.com/apache/iceberg-cpp/pull/180#discussion_r2280924769
##########
src/iceberg/type.h:
##########
@@ -24,7 +24,9 @@
/// iceberg/type_fwd.h for the enum defining the list of types.
#include <array>
+#include <cctype>
Review Comment:
Are these new includes really necessary? Or they were added by clangd by
accident?
##########
test/type_test.cc:
##########
@@ -21,11 +21,13 @@
#include <format>
#include <memory>
+#include <optional>
Review Comment:
```suggestion
```
##########
src/iceberg/type.cc:
##########
@@ -19,9 +19,17 @@
#include "iceberg/type.h"
+#include <algorithm>
+#include <cctype>
#include <format>
+#include <functional>
#include <iterator>
#include <memory>
+#include <optional>
+#include <ranges>
+#include <string_view>
+
+#include <iceberg/schema_field.h>
Review Comment:
```suggestion
```
This is redundant too so we can remove it. BTW, including header files from
this project usually use `#include "foo.h"` instead of `#include <foo.h>`.
##########
src/iceberg/type.h:
##########
@@ -115,12 +122,18 @@ class ICEBERG_EXPORT StructType : public NestedType {
int32_t index) const override;
std::optional<std::reference_wrapper<const SchemaField>> GetFieldByName(
std::string_view name) const override;
+ std::optional<std::reference_wrapper<const SchemaField>>
GetFieldByNameCaseInsensitive(
+ std::string_view name) const override;
+ void InitNameToIdMap() const;
+ void InitNameToIdMapCaseInsensitive() const;
protected:
bool Equals(const Type& other) const override;
std::vector<SchemaField> fields_;
std::unordered_map<int32_t, size_t> field_id_to_index_;
+ mutable std::unordered_map<std::string, size_t> field_name_to_index_;
+ mutable std::unordered_map<std::string, size_t>
caseinsensitive_field_name_to_index_;
Review Comment:
Should we also move these definitions to `NestedType` so List and Map types
can leverage this?
##########
test/type_test.cc:
##########
@@ -21,11 +21,13 @@
#include <format>
#include <memory>
+#include <optional>
#include <string>
#include <gmock/gmock.h>
#include <gtest/gtest.h>
+#include "gmock/gmock.h"
Review Comment:
```suggestion
```
##########
src/iceberg/type.h:
##########
@@ -81,17 +83,22 @@ class ICEBERG_EXPORT NestedType : public Type {
[[nodiscard]] virtual std::optional<std::reference_wrapper<const
SchemaField>>
GetFieldById(int32_t field_id) const = 0;
/// \brief Get a field by index.
+
///
/// \note This is O(1) complexity.
[[nodiscard]] virtual std::optional<std::reference_wrapper<const
SchemaField>>
GetFieldByIndex(int32_t index) const = 0;
/// \brief Get a field by name (case-sensitive). Behavior is undefined if
/// the field name is not unique; prefer GetFieldById or GetFieldByIndex
/// when possible.
+
///
- /// \note This is currently O(n) complexity.
+ /// \note This is currently O(1) complexity.
[[nodiscard]] virtual std::optional<std::reference_wrapper<const
SchemaField>>
GetFieldByName(std::string_view name) const = 0;
+
+ [[nodiscard]] virtual std::optional<std::reference_wrapper<const
SchemaField>>
+ GetFieldByNameCaseInsensitive(std::string_view name) const = 0;
Review Comment:
Is it better to directly extend the existing function like
`GetFieldByName(std::string_view name, bool case_sensitive = true)`?
##########
src/iceberg/type.h:
##########
@@ -115,12 +122,18 @@ class ICEBERG_EXPORT StructType : public NestedType {
int32_t index) const override;
std::optional<std::reference_wrapper<const SchemaField>> GetFieldByName(
std::string_view name) const override;
+ std::optional<std::reference_wrapper<const SchemaField>>
GetFieldByNameCaseInsensitive(
+ std::string_view name) const override;
+ void InitNameToIdMap() const;
+ void InitNameToIdMapCaseInsensitive() const;
protected:
bool Equals(const Type& other) const override;
std::vector<SchemaField> fields_;
std::unordered_map<int32_t, size_t> field_id_to_index_;
+ mutable std::unordered_map<std::string, size_t> field_name_to_index_;
+ mutable std::unordered_map<std::string, size_t>
caseinsensitive_field_name_to_index_;
Review Comment:
BTW, can we use `std::string_view` as the key for better performance?
##########
src/iceberg/type.cc:
##########
@@ -19,9 +19,17 @@
#include "iceberg/type.h"
+#include <algorithm>
Review Comment:
Please clean up these includes to leave those non-duplicated.
##########
src/iceberg/type.h:
##########
@@ -115,12 +122,18 @@ class ICEBERG_EXPORT StructType : public NestedType {
int32_t index) const override;
std::optional<std::reference_wrapper<const SchemaField>> GetFieldByName(
std::string_view name) const override;
+ std::optional<std::reference_wrapper<const SchemaField>>
GetFieldByNameCaseInsensitive(
+ std::string_view name) const override;
+ void InitNameToIdMap() const;
+ void InitNameToIdMapCaseInsensitive() const;
protected:
bool Equals(const Type& other) const override;
std::vector<SchemaField> fields_;
std::unordered_map<int32_t, size_t> field_id_to_index_;
+ mutable std::unordered_map<std::string, size_t> field_name_to_index_;
+ mutable std::unordered_map<std::string, size_t>
caseinsensitive_field_name_to_index_;
Review Comment:
```suggestion
mutable std::unordered_map<std::string, size_t> name_to_id_;
mutable std::unordered_map<std::string, size_t> lowercase_name_to_id_;
```
Let's make them concise.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]