gortiz commented on code in PR #10192: URL: https://github.com/apache/pinot/pull/10192#discussion_r1136796477
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/IndexService.java:
##########
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.pinot.segment.spi.index;
+
+import com.google.common.collect.Sets;
+import java.util.HashSet;
+import java.util.Optional;
+import java.util.ServiceLoader;
+import java.util.Set;
+
+
+/**
+ * This is the entry point of the Index SPI.
+ *
+ * Ideally, if we used some kind of injection system, this class should be injected into a Pinot context all classes can
+ * receive when they are built. Given that Pinot doesn't have that, we have to relay on static fields.
+ *
+ * By default, this class will be initialized by reading all ServiceLoader SPI services that implement
+ * {@link IndexPlugin}, adding all the {@link IndexType} that can be found in that way.
+ *
+ * In case we want to change the instance to be used at runtime, the method {@link #setInstance(IndexService)} can be
+ * called.
+ */
+public class IndexService {
+
+  private static volatile IndexService _instance = fromServiceLoader();
+
+  private final Set<IndexType<?, ?, ?>> _allIndexes;
+
+  public IndexService(Set<IndexPlugin<?>> allPlugins) {
+    _allIndexes = Sets.newHashSetWithExpectedSize(allPlugins.size());
+
+    for (IndexPlugin<?> plugin : allPlugins) {
+      _allIndexes.add(plugin.getIndexType());
+    }
+  }
+
+  public static IndexService getInstance() {
+    return _instance;
+  }
+
+  public static void setInstance(IndexService other) {
+    _instance = other;
+  }
+
+  public static IndexService fromServiceLoader() {
+    Set<IndexPlugin<?>> pluginList = new HashSet<>();
+    for (IndexPlugin indexPlugin : ServiceLoader.load(IndexPlugin.class)) {
+      pluginList.add(indexPlugin);
+    }
+    return new IndexService(pluginList);
+  }
+
+  public Set<IndexType<?, ?, ?>> getAllIndexes() {
+    return _allIndexes;
+  }
+
+  public Optional<IndexType<?, ?, ?>> getIndexTypeById(String indexId) {
+    return getAllIndexes().stream().filter(indexType -> indexType.getId().equalsIgnoreCase(indexId)).findAny();
+  }
+
+  public IndexType<?, ?, ?> getIndexTypeByIdOrThrow(String indexId) {
+    return getIndexTypeById(indexId)
+        .orElseThrow(() -> new IllegalArgumentException("Unknown index id: " + indexId));
+  }

Review Comment:
First of all, these methods were renamed in the `index-spi-all-types` branch. I think it was a mistake not to rename them here as well, so I'm going to apply the change. The new names are `get(String indexId)` and `getOrThrow(String indexId)`.

> What are the allowed values for the indexId string? Can you add javadocs to these methods?

I thought it was clear that `indexId` is the result of calling `IndexType.getId()`, but I can add extra javadoc to make that explicit. What may not be clear is where we need these methods. That will be easier to understand later, when they are actually used, but I can also add some of the reasons to the javadoc.

> Do we really need both methods - one that throws and the one which doesn't?

First of all, I don't think it actually hurts to have both methods. The main reason to have them is to handle the case where a caller asks for an index that doesn't exist without producing an NPE. If we only had a single method `get`, it would have to do one of the following:
1. Return a nullable `IndexType`.
2. Throw when there is no index with the given id.
3. Return `Optional<IndexType>`.

Option 1 makes it very easy for the caller to forget that the index may not be loaded. That would produce an NPE the first time the caller tries to use the result, but probably not on the same line where `get` is called, so the source of the problem is harder to find.

Option 2 is very aggressive. Either we add another method to check whether the index is there, or the caller needs to write something like the following to deal with the fact that an index may not be loaded:

```java
boolean indexIsLoaded;
try {
  // Under option 2, get(...) itself throws when the id is unknown.
  IndexService.getInstance().get(indexId);
  indexIsLoaded = true;
} catch (IllegalArgumentException e) {
  indexIsLoaded = false;
}
```

Option 3 is safe at compile time and is the one I'm following here. There are two problems with this approach:
1. An Optional is created each time the method is called with an index that actually exists. This shouldn't be a performance issue, but given that index types do not change in production, we can cache the Optionals if we find that we are allocating too much here.
2. Most of the time we call this method, the index type is a precondition of the calling code, so the caller actually wants to fail if the index type is not there. We would therefore need to repeat the pattern `IndexService.getInstance().get(indexId).orElseThrow()` several times. That is why I added the second method, and if you look at https://github.com/apache/pinot/pull/10184 you will see that most (if not all) calls are to `getOrThrow` (directly, or indirectly through `StandardIndexes`).

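
To make the trade-off concrete, here is a minimal, self-contained sketch of option 3 with the `getOrThrow` convenience method on top, plus the cached-Optional idea from point 1. It is only an illustration, not the code in this PR: `IndexLookupSketch` and `SimpleIndexType` are hypothetical stand-ins for `IndexService` and `IndexType`, the ids `inverted`, `bloom` and `range` are made-up examples, and lower-casing the id is just an approximation of the `equalsIgnoreCase` comparison used in the diff.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for the lookup part of IndexService; not Pinot's actual code. */
final class IndexLookupSketch {

  /** Hypothetical, simplified stand-in for IndexType: only the id matters here. */
  static final class SimpleIndexType {
    private final String _id;

    SimpleIndexType(String id) {
      _id = id;
    }

    String getId() {
      return _id;
    }
  }

  // The Optionals are built once in the constructor, so a successful lookup allocates nothing.
  private final Map<String, Optional<SimpleIndexType>> _byLowerCaseId = new HashMap<>();

  IndexLookupSketch(SimpleIndexType... indexTypes) {
    for (SimpleIndexType indexType : indexTypes) {
      _byLowerCaseId.put(indexType.getId().toLowerCase(Locale.ROOT), Optional.of(indexType));
    }
  }

  /** Option 3: the caller has to handle the "not loaded" case explicitly. */
  Optional<SimpleIndexType> get(String indexId) {
    return _byLowerCaseId.getOrDefault(indexId.toLowerCase(Locale.ROOT), Optional.empty());
  }

  /** Convenience for callers that treat the index type as a precondition. */
  SimpleIndexType getOrThrow(String indexId) {
    return get(indexId).orElseThrow(() -> new IllegalArgumentException("Unknown index id: " + indexId));
  }

  public static void main(String[] args) {
    IndexLookupSketch service =
        new IndexLookupSketch(new SimpleIndexType("inverted"), new SimpleIndexType("bloom"));

    // Checking for presence needs no try/catch, unlike option 2.
    boolean indexIsLoaded = service.get("range").isPresent();
    System.out.println("range loaded: " + indexIsLoaded);

    // Callers with a hard requirement fail fast at the lookup itself.
    SimpleIndexType inverted = service.getOrThrow("INVERTED");
    System.out.println("found: " + inverted.getId());
  }
}
```

With this shape, a missing index surfaces either as an empty Optional that the compiler forces the caller to unwrap, or as an `IllegalArgumentException` at the lookup line itself, rather than as an NPE somewhere later.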
What we can do, if you think that it is interesting, is to change the semantics of the methods. `get` could throw if the index type is not there and a new method `getOptional` or `getNullable` may return Optional or null in case the index is not present.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org