Copilot commented on code in PR #54304: URL: https://github.com/apache/doris/pull/54304#discussion_r2277175642
########## be/test/runtime/plugin/s3_plugin_downloader_test.cpp: ########## @@ -0,0 +1,121 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "runtime/plugin/s3_plugin_downloader.h" + +#include <gtest/gtest-message.h> +#include <gtest/gtest-test-part.h> + +#include <filesystem> +#include <fstream> +#include <string> + +#include "gtest/gtest.h" + +namespace doris { + +class S3PluginDownloaderTest : public ::testing::Test { +protected: +}; + +TEST_F(S3PluginDownloaderTest, TestS3ConfigCreation) { + S3PluginDownloader::S3Config config("http://s3.amazonaws.com", "us-west-2", "test-bucket", + "access-key", "secret-key"); + + EXPECT_EQ("http://s3.amazonaws.com", config.endpoint); + EXPECT_EQ("us-west-2", config.region); + EXPECT_EQ("test-bucket", config.bucket); + EXPECT_EQ("access-key", config.access_key); + EXPECT_EQ("secret-key", config.secret_key); +} + +TEST_F(S3PluginDownloaderTest, TestS3ConfigToString) { + S3PluginDownloader::S3Config config("http://s3.amazonaws.com", "us-west-2", "test-bucket", + "access-key", "secret-key"); + + std::string config_str = config.to_string(); + + // Should contain basic info but mask secret info + EXPECT_TRUE(config_str.find("s3.amazonaws.com") != std::string::npos); + EXPECT_TRUE(config_str.find("us-west-2") != std::string::npos); + EXPECT_TRUE(config_str.find("test-bucket") != std::string::npos); + EXPECT_TRUE(config_str.find("***") != std::string::npos || + config_str.find("null") != + std::string::npos); // Access key should be masked or null + EXPECT_FALSE(config_str.find("access-key") != Review Comment: The logic in this assertion is inverted. `find()` returns `string::npos` when not found, so `find() != string::npos` means the string was found. The assertion `EXPECT_FALSE(config_str.find("access-key") != std::string::npos)` is correct, but the comment suggests it should not appear, which matches the assertion. ########## fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/CreateFunctionCommand.java: ########## @@ -304,6 +307,11 @@ private void analyzeCommon(ConnectContext ctx) throws AnalysisException { } userFile = properties.getOrDefault(FILE_KEY, properties.get(OBJECT_FILE_KEY)); + originalUserFile = userFile; // Keep original jar name for BE Review Comment: [nitpick] The variable `originalUserFile` is introduced but the logic for when to use `userFile` vs `originalUserFile` could be clearer. Consider adding a comment explaining why both variables are needed and when each should be used. ```suggestion // We need both 'userFile' and 'originalUserFile' because: // - 'originalUserFile' preserves the original file path or name as provided by the user. // This is required for the BE (Backend) to locate or download the file as specified. // - 'userFile' may be modified on the FE (Frontend), for example, converted to a real URL // for checksum calculation or other FE-side processing. Any changes to 'userFile' do not // affect 'originalUserFile', ensuring the BE always receives the unmodified path. userFile = properties.getOrDefault(FILE_KEY, properties.get(OBJECT_FILE_KEY)); originalUserFile = userFile; ``` ########## be/src/runtime/user_function_cache.cpp: ########## @@ -381,4 +385,43 @@ std::vector<std::string> UserFunctionCache::_split_string_by_checksum(const std: return result; } + +Status UserFunctionCache::_get_real_url(const std::string& url, std::string* result_url) { + if (url.find(":/") == std::string::npos) { + return _check_and_return_default_java_udf_url(url, result_url); + } + *result_url = url; + return Status::OK(); +} + +Status UserFunctionCache::_check_and_return_default_java_udf_url(const std::string& url, + std::string* result_url) { + const char* doris_home = std::getenv("DORIS_HOME"); + std::string default_url = std::string(doris_home) + "/plugins/java_udf"; + + std::filesystem::path file = default_url + "/" + url; + + // In cloud mode, always try cloud download first (prioritize cloud mode) + if (config::is_cloud_mode()) { + std::string target_path = default_url + "/" + url; + std::string downloaded_path; + Status status = CloudPluginDownloader::download_from_cloud( + CloudPluginDownloader::PluginType::JAVA_UDF, url, target_path, &downloaded_path); + if (status.ok() && !downloaded_path.empty()) { + *result_url = "file://" + downloaded_path; + return Status::OK(); + } else { + LOG(WARNING) << "Failed to download Java UDF from cloud: " << status.to_string(); Review Comment: [nitpick] Similar to the JDBC connector, this warning message should include the UDF URL for better debugging: `LOG(WARNING) << "Failed to download Java UDF '" << url << "' from cloud: " << status.to_string()` ```suggestion LOG(WARNING) << "Failed to download Java UDF '" << url << "' from cloud: " << status.to_string(); ``` ########## be/src/vec/exec/vjdbc_connector.cpp: ########## @@ -653,15 +657,33 @@ std::string JdbcConnector::_check_and_return_default_driver_url(const std::strin // Because in 2.1.8, we change the default value of `jdbc_drivers_dir` // from `DORIS_HOME/jdbc_drivers` to `DORIS_HOME/plugins/jdbc_drivers`, // so we need to check the old default dir for compatibility. - std::filesystem::path file = default_url + "/" + url; - if (std::filesystem::exists(file)) { - return "file://" + default_url + "/" + url; - } else { - return "file://" + default_old_url + "/" + url; + std::string target_path = default_url + "/" + url; + if (std::filesystem::exists(target_path)) { + // File exists in new default directory + *result_url = "file://" + target_path; + return Status::OK(); + } else if (config::is_cloud_mode()) { + // Cloud mode: try to download from cloud to new default directory + std::string downloaded_path; + Status status = CloudPluginDownloader::download_from_cloud( + CloudPluginDownloader::PluginType::JDBC_DRIVERS, url, target_path, + &downloaded_path); + if (status.ok() && !downloaded_path.empty()) { + *result_url = "file://" + downloaded_path; + return Status::OK(); + } + // Download failed, log warning but continue to fallback + LOG(WARNING) << "Failed to download JDBC driver from cloud: " << status.to_string() Review Comment: [nitpick] The warning message logs the status but doesn't include the original URL for context. Consider adding the plugin URL to the log message for better debugging: `LOG(WARNING) << "Failed to download JDBC driver '" << url << "' from cloud: " << status.to_string() << ", fallback to old directory"` ```suggestion LOG(WARNING) << "Failed to download JDBC driver '" << url << "' from cloud: " << status.to_string() ``` ########## be/src/runtime/plugin/s3_plugin_downloader.cpp: ########## @@ -0,0 +1,122 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "runtime/plugin/s3_plugin_downloader.h" + +#include <fmt/format.h> + +#include <filesystem> +#include <memory> +#include <string> +#include <thread> + +#include "common/logging.h" +#include "common/status.h" +#include "io/fs/local_file_system.h" +#include "io/fs/s3_file_system.h" +#include "util/s3_util.h" + +namespace doris { + +std::mutex S3PluginDownloader::_download_mutex; +std::string S3PluginDownloader::S3Config::to_string() const { Review Comment: [nitpick] The `to_string()` method implementation should be moved to the header file or the declaration should be moved to the cpp file to maintain consistency with the class definition location. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
