morningman commented on code in PR #17404: URL: https://github.com/apache/doris/pull/17404#discussion_r1125588284
########## fe/fe-core/src/main/java/org/apache/doris/tablefunction/LocalTableValuedFunction.java: ########## @@ -0,0 +1,119 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.tablefunction; + +import org.apache.doris.analysis.BrokerDesc; +import org.apache.doris.analysis.StorageBackend.StorageType; +import org.apache.doris.catalog.Env; +import org.apache.doris.common.AnalysisException; +import org.apache.doris.system.Backend; +import org.apache.doris.thrift.TBrokerFileStatus; +import org.apache.doris.thrift.TFileType; + +import com.google.common.collect.ImmutableSet; +import com.google.common.collect.Maps; +import org.apache.commons.collections.map.CaseInsensitiveMap; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; + +import java.util.Map; + +/** + * The Implement of table valued function + * local("file" = "/path/to/file.txt", "backend_id" = "be_id"). + */ +public class LocalTableValuedFunction extends ExternalFileTableValuedFunction { + private static final Logger LOG = LogManager.getLogger(LocalTableValuedFunction.class); + + public static final String NAME = "local"; + public static final String FILE_PATH = "file_path"; + public static final String BACKEND_ID = "backend_id"; + + private static final ImmutableSet<String> LOCATION_PROPERTIES = new ImmutableSet.Builder<String>() + .add(FILE_PATH) + .add(BACKEND_ID) + .build(); + + private String filePath; + private long backendId; + + public LocalTableValuedFunction(Map<String, String> params) throws AnalysisException { + Map<String, String> fileFormatParams = new CaseInsensitiveMap(); + locationProperties = Maps.newHashMap(); + for (String key : params.keySet()) { + if (FILE_FORMAT_PROPERTIES.contains(key.toLowerCase())) { + fileFormatParams.put(key, params.get(key)); + } else if (LOCATION_PROPERTIES.contains(key.toLowerCase())) { + locationProperties.put(key.toLowerCase(), params.get(key)); + } else { + throw new AnalysisException(key + " is invalid property"); + } + } + + if (!locationProperties.containsKey(FILE_PATH)) { + throw new AnalysisException(String.format("Configuration '%s' is required.", FILE_PATH)); + } + if (!locationProperties.containsKey(BACKEND_ID)) { + throw new AnalysisException(String.format("Configuration '%s' is required.", BACKEND_ID)); + } + + filePath = locationProperties.get(FILE_PATH); + backendId = Long.parseLong(locationProperties.get(BACKEND_ID)); + + if (!filePath.startsWith("/")) { Review Comment: There is a security reason. If do so, user can visit any file on a Backend. So user should specify a relative path, and we can define a path prefix. and the full path should be `path prefix + relative path`, so that only files in `path prefix` can be read. And more, should check if user use `..` in relative path, it should be forbidden. you can first normalize the path, than check if the full path starts with `path prefix` ########## fe/fe-core/src/main/java/org/apache/doris/tablefunction/ExternalFileTableValuedFunction.java: ########## @@ -324,4 +323,13 @@ private PFetchTableSchemaRequest getFetchTableStructureRequest() throws Analysis return InternalService.PFetchTableSchemaRequest.newBuilder() .setFileScanRange(ByteString.copyFrom(new TSerializer().serialize(fileScanRange))).build(); } + + protected Backend getBackend() { Review Comment: It should return a `List<Backend>`, because for a tvf, it can be send to multi backends to execute. And `local()` tvf is a special case that the size of list is 1 ########## fe/fe-core/src/main/java/org/apache/doris/planner/external/ExternalFileScanNode.java: ########## @@ -202,7 +208,16 @@ public void init(Analyzer analyzer) throws UserException { throw new UserException("Unknown type: " + type); } - backendPolicy.init(); + if (isLocalTVF) { + // for local file, create a LocalTVFBackendPolicy which will always return the + // backend that user specify by "backend_id" = "xxx" + LocalTVFBackendPolicy tvfBackendPolicy = new LocalTVFBackendPolicy(); + tvfBackendPolicy.init(backendId); + backendPolicy = tvfBackendPolicy; + } else { + backendPolicy = new BackendPolicy(); + backendPolicy.init(); Review Comment: This is not a good design, but this is not your problem. we are working on refactoring the `split` interface here #17390, maybe we can hold for a while. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org