Xuanwo commented on code in PR #475:
URL: https://github.com/apache/iceberg-rust/pull/475#discussion_r1690027985


##########
crates/catalog/inmemory/README.md:
##########
@@ -0,0 +1,27 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+-->
+
+# Apache Iceberg In-Memory Catalog Official Native Rust Implementation
+
+[![crates.io](https://img.shields.io/crates/v/iceberg.svg)](https://crates.io/crates/iceberg-catalog-inmemory)
+[![docs.rs](https://img.shields.io/docsrs/iceberg.svg)](https://docs.rs/iceberg/latest/iceberg-catalog-inmemory/)

Review Comment:
   ```suggestion
   
[![docs.rs](https://img.shields.io/docsrs/iceberg-catalog-inmemory.svg)](https://docs.rs/iceberg/latest/iceberg-catalog-inmemory/)
   ```



##########
crates/catalog/inmemory/Cargo.toml:
##########
@@ -0,0 +1,41 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+[package]
+name = "iceberg-catalog-inmemory"

Review Comment:
   nit: is using `memory` a good idea? cc @Fokko @liurenjie1024 



##########
crates/catalog/inmemory/src/catalog.rs:
##########
@@ -0,0 +1,1511 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! This module contains in-memory catalog implementation.
+
+use async_lock::Mutex;
+use iceberg::io::{FileIO, FileIOBuilder};
+use iceberg::spec::{TableMetadata, TableMetadataBuilder};
+use itertools::Itertools;
+use std::collections::HashMap;
+use uuid::Uuid;
+
+use async_trait::async_trait;
+
+use iceberg::table::Table;
+use iceberg::Result;
+use iceberg::{
+    Catalog, Error, ErrorKind, Namespace, NamespaceIdent, TableCommit, 
TableCreation, TableIdent,
+};
+use tempfile::TempDir;
+
+use crate::namespace_state::NamespaceState;
+
+/// In-memory catalog implementation.
+#[derive(Debug)]
+pub struct InMemoryCatalog {
+    root_namespace_state: Mutex<NamespaceState>,
+    file_io: FileIO,
+}
+
+impl InMemoryCatalog {
+    /// Creates an in-memory catalog.
+    pub fn new() -> Result<Self> {

Review Comment:
   We can accept a `FileIO` as input for the storage they want to connect. 
Also, we can provide a `Default` implementation which uses in-memory file_io.



##########
crates/catalog/inmemory/src/catalog.rs:
##########
@@ -0,0 +1,1511 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! This module contains in-memory catalog implementation.
+
+use async_lock::Mutex;
+use iceberg::io::{FileIO, FileIOBuilder};
+use iceberg::spec::{TableMetadata, TableMetadataBuilder};
+use itertools::Itertools;
+use std::collections::HashMap;
+use uuid::Uuid;
+
+use async_trait::async_trait;
+
+use iceberg::table::Table;
+use iceberg::Result;
+use iceberg::{
+    Catalog, Error, ErrorKind, Namespace, NamespaceIdent, TableCommit, 
TableCreation, TableIdent,
+};
+use tempfile::TempDir;
+
+use crate::namespace_state::NamespaceState;
+
+/// In-memory catalog implementation.
+#[derive(Debug)]
+pub struct InMemoryCatalog {
+    root_namespace_state: Mutex<NamespaceState>,
+    file_io: FileIO,
+}
+
+impl InMemoryCatalog {
+    /// Creates an in-memory catalog.
+    pub fn new() -> Result<Self> {
+        let root_namespace_state = NamespaceState::new();
+        let file_io = FileIOBuilder::new_fs_io().build()?;

Review Comment:
   We can use an in-memory filesystem by default, eliminating the need to 
change the environment. 
   
   I will build one for file_io.



##########
crates/catalog/inmemory/src/catalog.rs:
##########
@@ -0,0 +1,1511 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! This module contains in-memory catalog implementation.
+
+use async_lock::Mutex;

Review Comment:
   Hi, is there a specific reason we're using `async_lock`? Could we switch to 
`futures::lock::Mutex` since we already depend on `futures`?



##########
crates/catalog/inmemory/src/lib.rs:
##########
@@ -0,0 +1,26 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Iceberg in-memory Catalog API implementation.
+
+#![deny(missing_docs)]
+#![feature(map_try_insert)]

Review Comment:
   All crates built by iceberg should be available in stable rust. So it's 
better to not enable nightly features.



##########
crates/catalog/inmemory/src/catalog.rs:
##########
@@ -0,0 +1,1511 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! This module contains in-memory catalog implementation.
+
+use async_lock::Mutex;
+use iceberg::io::{FileIO, FileIOBuilder};
+use iceberg::spec::{TableMetadata, TableMetadataBuilder};
+use itertools::Itertools;
+use std::collections::HashMap;
+use uuid::Uuid;
+
+use async_trait::async_trait;
+
+use iceberg::table::Table;
+use iceberg::Result;
+use iceberg::{
+    Catalog, Error, ErrorKind, Namespace, NamespaceIdent, TableCommit, 
TableCreation, TableIdent,
+};
+use tempfile::TempDir;
+
+use crate::namespace_state::NamespaceState;
+
+/// In-memory catalog implementation.
+#[derive(Debug)]
+pub struct InMemoryCatalog {
+    root_namespace_state: Mutex<NamespaceState>,
+    file_io: FileIO,
+}
+
+impl InMemoryCatalog {
+    /// Creates an in-memory catalog.
+    pub fn new() -> Result<Self> {
+        let root_namespace_state = NamespaceState::new();
+        let file_io = FileIOBuilder::new_fs_io().build()?;
+        let inmemory_catalog = Self {
+            root_namespace_state: Mutex::new(root_namespace_state),
+            file_io,
+        };
+
+        Ok(inmemory_catalog)
+    }
+}
+
+/// Create metadata location from `location` and `version`
+fn create_metadata_location(location: impl AsRef<str>, version: i32) -> 
Result<String> {
+    if version < 0 {
+        Err(Error::new(
+            ErrorKind::DataInvalid,
+            format!(
+                "Table metadata version: '{}' must be a non-negative integer",
+                version
+            ),
+        ))
+    } else {
+        let version = format!("{:0>5}", version);
+        let id = Uuid::new_v4();
+        let metadata_location = format!(
+            "{}/metadata/{}-{}.metadata.json",
+            location.as_ref(),
+            version,
+            id
+        );
+
+        Ok(metadata_location)
+    }
+}
+
+#[async_trait]
+impl Catalog for InMemoryCatalog {
+    /// List namespaces inside the Catalog.
+    async fn list_namespaces(
+        &self,
+        maybe_parent: Option<&NamespaceIdent>,
+    ) -> Result<Vec<NamespaceIdent>> {
+        let root_namespace_state = self.root_namespace_state.lock().await;
+
+        match maybe_parent {
+            None => Ok(root_namespace_state
+                .list_top_level_namespaces()
+                .into_iter()
+                .map(|str| NamespaceIdent::new(str.to_string()))
+                .collect_vec()),
+            Some(parent_namespace_ident) => {
+                let namespaces = root_namespace_state
+                    .list_namespaces_under(parent_namespace_ident)?
+                    .into_iter()
+                    .map(|name| NamespaceIdent::new(name.to_string()))
+                    .collect_vec();
+
+                Ok(namespaces)
+            }
+        }
+    }
+
+    /// Create a new namespace inside the catalog.
+    async fn create_namespace(
+        &self,
+        namespace_ident: &NamespaceIdent,
+        properties: HashMap<String, String>,
+    ) -> Result<Namespace> {
+        let mut root_namespace_state = self.root_namespace_state.lock().await;
+
+        root_namespace_state.insert_new_namespace(namespace_ident, 
properties.clone())?;
+        let namespace = Namespace::with_properties(namespace_ident.clone(), 
properties);
+
+        Ok(namespace)
+    }
+
+    /// Get a namespace information from the catalog.
+    async fn get_namespace(&self, namespace_ident: &NamespaceIdent) -> 
Result<Namespace> {
+        let root_namespace_state = self.root_namespace_state.lock().await;
+
+        let namespace = Namespace::with_properties(
+            namespace_ident.clone(),
+            root_namespace_state
+                .get_properties(namespace_ident)?
+                .clone(),
+        );
+
+        Ok(namespace)
+    }
+
+    /// Check if namespace exists in catalog.
+    async fn namespace_exists(&self, namespace_ident: &NamespaceIdent) -> 
Result<bool> {
+        let guarded_namespaces = self.root_namespace_state.lock().await;
+
+        Ok(guarded_namespaces.namespace_exists(namespace_ident))
+    }
+
+    /// Update a namespace inside the catalog.
+    ///
+    /// # Behavior
+    ///
+    /// The properties must be the full set of namespace.
+    async fn update_namespace(
+        &self,
+        namespace_ident: &NamespaceIdent,
+        properties: HashMap<String, String>,
+    ) -> Result<()> {
+        let mut root_namespace_state = self.root_namespace_state.lock().await;
+
+        root_namespace_state.replace_properties(namespace_ident, properties)
+    }
+
+    /// Drop a namespace from the catalog.
+    async fn drop_namespace(&self, namespace_ident: &NamespaceIdent) -> 
Result<()> {
+        let mut root_namespace_state = self.root_namespace_state.lock().await;
+
+        root_namespace_state.remove_existing_namespace(namespace_ident)
+    }
+
+    /// List tables from namespace.
+    async fn list_tables(&self, namespace_ident: &NamespaceIdent) -> 
Result<Vec<TableIdent>> {
+        let root_namespace_state = self.root_namespace_state.lock().await;
+
+        let table_names = root_namespace_state.list_tables(namespace_ident)?;
+        let table_idents = table_names
+            .into_iter()
+            .map(|table_name| TableIdent::new(namespace_ident.clone(), 
table_name.clone()))
+            .collect_vec();
+
+        Ok(table_idents)
+    }
+
+    /// Create a new table inside the namespace.
+    async fn create_table(
+        &self,
+        namespace_ident: &NamespaceIdent,
+        table_creation: TableCreation,
+    ) -> Result<Table> {
+        let mut root_namespace_state = self.root_namespace_state.lock().await;
+
+        let table_name = table_creation.name.clone();
+        let table_ident = TableIdent::new(namespace_ident.clone(), table_name);
+        let table_exists = root_namespace_state.table_exists(&table_ident)?;
+
+        if table_exists {
+            Err(Error::new(
+                ErrorKind::Unexpected,
+                format!(
+                    "Cannot create table {:?}. Table already exists",
+                    &table_ident
+                ),
+            ))
+        } else {
+            let (table_creation, location) = match 
table_creation.location.clone() {
+                Some(location) => (table_creation, location),
+                None => {
+                    let tmp_dir = TempDir::new()?;

Review Comment:
   I personally believe we should not depend on `TempDir`. As mentioned in the 
previous comment, the `file_io` could be memory or s3. 
   
   If `location` is empty, we can generate one for user.



##########
crates/catalog/inmemory/README.md:
##########
@@ -0,0 +1,27 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+-->
+
+# Apache Iceberg In-Memory Catalog Official Native Rust Implementation
+
+[![crates.io](https://img.shields.io/crates/v/iceberg.svg)](https://crates.io/crates/iceberg-catalog-inmemory)

Review Comment:
   ```suggestion
   
[![crates.io](https://img.shields.io/crates/v/iceberg-catalog-inmemory.svg)](https://crates.io/crates/iceberg-catalog-inmemory)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to