pengxiangyu commented on code in PR #9139:
URL: https://github.com/apache/incubator-doris/pull/9139#discussion_r857065072


##########
be/src/filesystem/local_file_system.cpp:
##########
@@ -0,0 +1,126 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "filesystem/local_file_system.h"
+
+#include <fmt/format.h>
+
+#include <filesystem>
+#include <system_error>
+
+#include "filesystem/read_stream.h"
+#include "filesystem/write_stream.h"
+
+namespace fs = std::filesystem;
+
+namespace doris {
+
+LocalFileSystem::LocalFileSystem(std::string root_path) : _root_path(std::move(root_path)) {}
+
+LocalFileSystem::~LocalFileSystem() = default;
+
+Status LocalFileSystem::exists(const std::string& path, bool* res) const {

Review Comment:
   This class is similar to PosixEnv, and most of its functions are already 
defined in PosixEnv. Is it necessary to create a new class?



##########
be/src/filesystem/s3_file_system.cpp:
##########
@@ -0,0 +1,110 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "filesystem/s3_file_system.h"
+
+#include <aws/s3/S3Client.h>
+#include <aws/s3/model/DeleteObjectRequest.h>
+#include <aws/s3/model/HeadObjectRequest.h>
+
+#include <filesystem>
+
+#include "util/s3_util.h"
+
+namespace fs = std::filesystem;
+
+namespace doris {
+

Review Comment:
   S3StorageBackend (which extends StorageBackend) is already used to manage 
remote file systems.
   Local and remote file systems differ considerably, while all remote file 
systems share similar operations (such as download, upload, and local cache). 
Don't reimplement these operations for every remote file system.
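
To illustrate sharing the common remote operations, here is a minimal sketch. It assumes a hypothetical base class `RemoteFileSystemBase` with a single backend-specific hook, `fetch_object`. None of these names come from the PR or from `StorageBackend`, and `InMemoryRemote` is only a stand-in for a real backend such as S3. The download logic (create the parent directory, write to a temporary file, then rename) lives once in the base class.

```cpp
#include <filesystem>
#include <fstream>
#include <map>
#include <string>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical base class: names are illustrative, not taken from the PR.
class RemoteFileSystemBase {
public:
    virtual ~RemoteFileSystemBase() = default;

    // Shared logic: download `key` into `local_path` through a temporary file,
    // so a reader never observes a partially written file.
    bool download(const std::string& key, const fs::path& local_path) {
        std::string data;
        if (!fetch_object(key, &data)) return false;

        std::error_code ec;
        if (local_path.has_parent_path()) {
            fs::create_directories(local_path.parent_path(), ec);
            if (ec) return false;
        }
        fs::path tmp = local_path;
        tmp += ".tmp";
        {
            std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
            out.write(data.data(), static_cast<std::streamsize>(data.size()));
            out.flush();
            if (!out) return false;
        }
        fs::rename(tmp, local_path, ec);
        return !ec;
    }

protected:
    // Backend-specific part: the only function each remote file system implements.
    virtual bool fetch_object(const std::string& key, std::string* out) = 0;
};

// Stand-in backend for this sketch; a real one would call the remote service.
class InMemoryRemote : public RemoteFileSystemBase {
public:
    void put(const std::string& key, std::string value) { _objects[key] = std::move(value); }

protected:
    bool fetch_object(const std::string& key, std::string* out) override {
        auto it = _objects.find(key);
        if (it == _objects.end()) return false;
        *out = it->second;
        return true;
    }

private:
    std::map<std::string, std::string> _objects;
};
```

With this shape, adding another remote backend means overriding `fetch_object` only; upload and local-cache handling could be shared in the same way.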



##########
be/src/filesystem/file_system.h:
##########
@@ -0,0 +1,73 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "common/status.h"
+#include "filesystem/io_context.h"
+
+namespace doris {
+
+class ReadStream;
+class WriteStream;
+
+struct FileStat {
+    std::string name;
+    size_t size;
+};
+
+class FileSystem {
+public:
+    virtual ~FileSystem() = default;
+
+    // Check if the specified file exists.
+    virtual Status exists(const std::string& path, bool* res) const = 0;
+
+    // Check if the specified file exists and it's a regular file (not a directory or special file type).
+    virtual Status is_file(const std::string& path, bool* res) const = 0;
+
+    // Check if the specified file exists and it's a directory.
+    virtual Status is_directory(const std::string& path, bool* res) const = 0;
+
+    // Get all files under the `path` directory.
+    // If it's not a directory, return error
+    virtual Status list(const std::string& path, std::vector<FileStat>* files) = 0;
+
+    // Delete the directory recursively if it exists and not a regular file.
+    // If it's a file, return error
+    // If the directory doesn't exist, return ok
+    virtual Status delete_directory(const std::string& path) = 0;

Review Comment:
   The behavior should differ depending on whether the path has children. 
Sometimes all the children need to be deleted, but other times an error needs 
to be returned.
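
One way to express both behaviors is an explicit flag. The sketch below assumes a hypothetical `recursive` parameter and a hypothetical `DeleteResult` enum (the PR itself returns `Status`), and uses only `std::filesystem` calls. It keeps the documented rules: a missing directory is OK and a regular file is an error.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical result codes for this sketch; the PR returns doris::Status instead.
enum class DeleteResult { kOk, kNotADirectory, kNotEmpty, kIoError };

// `recursive` is an assumed parameter, not part of the interface in the PR.
inline DeleteResult delete_directory(const std::string& path, bool recursive) {
    std::error_code ec;
    const fs::file_status st = fs::status(path, ec);
    // Check the type before `ec`: status() may also set `ec` when the path is missing.
    if (st.type() == fs::file_type::not_found) return DeleteResult::kOk;
    if (ec) return DeleteResult::kIoError;
    if (!fs::is_directory(st)) return DeleteResult::kNotADirectory;

    if (recursive) {
        // Delete the directory and everything beneath it.
        fs::remove_all(path, ec);
        return ec ? DeleteResult::kIoError : DeleteResult::kOk;
    }

    // Non-recursive: refuse to delete a directory that still has children.
    const bool empty = fs::is_empty(path, ec);
    if (ec) return DeleteResult::kIoError;
    if (!empty) return DeleteResult::kNotEmpty;
    fs::remove(path, ec);
    return ec ? DeleteResult::kIoError : DeleteResult::kOk;
}
```

A caller that must not lose data passes `recursive = false` and handles `kNotEmpty`; a cleanup path passes `true`.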



##########
be/src/filesystem/s3_read_stream.cpp:
##########
@@ -0,0 +1,104 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "filesystem/s3_read_stream.h"
+
+#include <aws/s3/S3Client.h>
+#include <aws/s3/model/GetObjectRequest.h>
+
+#include "filesystem/s3_common.h"
+
+namespace doris {
+
+S3ReadStream::S3ReadStream(std::shared_ptr<Aws::S3::S3Client> client, std::string bucket,
+                           std::string key, size_t offset, size_t read_until_position)
+        : _client(std::move(client)),
+          _bucket(std::move(bucket)),
+          _key(std::move(key)),
+          _offset(offset),
+          _read_until_position(read_until_position) {}
+
+S3ReadStream::~S3ReadStream() {
+    close();
+}
+
+Status S3ReadStream::read_at(size_t position, char* to, size_t req_n, size_t* read_n) {
+    if (closed()) {
+        return Status::IOError("Operation on closed stream");
+    }
+    if (position > _read_until_position) {
+        return Status::IOError("Position exceeds range");
+    }
+    req_n = std::min(req_n, _read_until_position - position);
+    if (req_n == 0) {
+        *read_n = 0;
+        return Status::OK();
+    }
+    Aws::S3::Model::GetObjectRequest req;
+    req.SetBucket(_bucket);
+    req.SetKey(_key);
+    req.SetRange(fmt::format("bytes={}-{}", position, position + req_n - 1));
+    req.SetResponseStreamFactory(AwsWriteableStreamFactory(to, req_n));
+
+    auto outcome = _client->GetObject(req);

Review Comment:
   GetObject will be called for every select, so a local cache file manager is 
needed. Otherwise, every select has to trigger a download, which is too heavy 
and will make many select operations slow. That is not good for users.
   Please take a look at remote_block_manager, which manages local cache files.
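
A minimal sketch of the caching idea follows. It is not the `remote_block_manager` API, which is not shown here. `CachingReader`, `RemoteFetch`, and the block size are hypothetical names, and blocks are cached in memory rather than in local files to keep the example short. The point is that repeated reads of the same range reach the remote store only once.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical callback standing in for a remote range read such as GetObject.
// It returns fewer than `length` bytes (possibly none) at the end of the object.
using RemoteFetch = std::function<std::string(std::size_t offset, std::size_t length)>;

class CachingReader {
public:
    // `block_size` must be greater than zero.
    CachingReader(RemoteFetch fetch, std::size_t block_size)
            : _fetch(std::move(fetch)), _block_size(block_size) {}

    // Read up to `length` bytes starting at `offset`, fetching only the
    // fixed-size blocks that are not cached yet.
    std::string read_at(std::size_t offset, std::size_t length) {
        std::string out;
        while (length > 0) {
            const std::size_t block = offset / _block_size;
            auto it = _blocks.find(block);
            if (it == _blocks.end()) {
                it = _blocks.emplace(block, _fetch(block * _block_size, _block_size)).first;
                ++_remote_calls;
            }
            const std::string& data = it->second;
            const std::size_t in_block = offset - block * _block_size;
            if (in_block >= data.size()) break; // past the end of the object
            const std::size_t n = std::min(length, data.size() - in_block);
            out.append(data, in_block, n);
            offset += n;
            length -= n;
        }
        return out;
    }

    // Number of times the remote store was actually contacted.
    std::size_t remote_calls() const { return _remote_calls; }

private:
    RemoteFetch _fetch;
    std::size_t _block_size;
    std::map<std::size_t, std::string> _blocks; // block index -> cached bytes
    std::size_t _remote_calls = 0;
};
```

A file-backed cache would follow the same lookup-then-fetch flow, with eviction and persistence handled by the cache manager.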



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org