gortiz commented on code in PR #10192: URL: https://github.com/apache/pinot/pull/10192#discussion_r1136848647
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/IndexService.java:
##########
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.pinot.segment.spi.index;
+
+import com.google.common.collect.Sets;
+import java.util.HashSet;
+import java.util.Optional;
+import java.util.ServiceLoader;
+import java.util.Set;
+
+
+/**
+ * This is the entry point of the Index SPI.
+ *
+ * Ideally, if we used some kind of injection system, this class should be injected into a Pinot context all classes can
+ * receive when they are built. Given that Pinot doesn't have that, we have to relay on static fields.
+ *
+ * By default, this class will be initialized by reading all ServiceLoader SPI services that implement
+ * {@link IndexPlugin}, adding all the {@link IndexType} that can be found in that way.
+ *
+ * In case we want to change the instance to be used at runtime, the method {@link #setInstance(IndexService)} can be
+ * called.

Review Comment:

> In general, it is simpler to maintain immutable instances of objects

Totally agree. I try to make objects immutable unless there is a strong reason not to. In this case it is even worse.
We are not making the instance mutable. We are making the static state mutable, which is usually a red flag. I really tried to avoid this, but right now this is the only way I have found to do it.

So there are two different questions here:
1. Why would we need an IndexService that is not created from the classpath?
2. Assuming we need an IndexService that is not created from the classpath, how could we have that feature without modifying static state?

I think this [other discussion](https://github.com/apache/pinot/pull/10192/files/42a5f0ed55636f5d9acfc9125b4596ff40ce670e#r1136583129) is also related to the first question, so I'm going to answer that there and focus here on the second question.

About the (ab)use of static state in Pinot: I think that is a flaw in the current Pinot architecture. Unless I'm wrong, we don't have any way to share state between different parts of the code other than static variables.

TL;DR: I think we need to add some contextual object (and probably some dependency injection) to Pinot in order to be able to make it more complex without abusing static state.

Long story: IMHO Pinot abuses static state (here, in metrics, now in PlanMaker, etc). A mentor I had once told me that one of the most important things he focuses on when reviewing an architectural change in the code is whether several instances of the code can run in the same process. He wasn't talking about Java instances but _service_ instances. In Pinot's case, it would be whether two Pinot servers or two Pinot brokers can run in the same process.

This idea looks quite artificial. What is the business advantage of running two servers in the same process? Probably none! You would probably never want to do that in production! But this idea produces better architectures. For example, if you can run two servers in the same process, testing is much easier.
Also, suddenly you cannot use static variables to store information that changes from one server to the other. If you cannot use static variables for that, you need some kind of Context class that you can inject into your instances in order to get the information related to each server running in the same process. And being forced to use this Context object suddenly makes a lot of problems easier to solve:
* How can each server in the same process view different indexes? It is simple, just store an `IndexService` instance in the Context instead of in a static variable!
* How can each server have its own metrics? It is simple, just store the MetricRegistry in the Context instead of using a static variable! And each meter should have a prefix that identifies the specific server.
* How can I change the way my PlanMaker changes a leaf node depending on whether an index is or is not in the IndexService? Again, add that to the Context.

This Context object doesn't exist in Pinot because Pinot wasn't designed with this architectural idea in mind. And Pinot is a great product without it! But the lack of this concept makes it more difficult to test Pinot or to create complex behaviors. For example, to test the behavior of your code with different sets of index types, you need to change a static variable, which means that running tests in parallel is quite difficult.
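
To make the Context idea a bit more concrete, here is a minimal sketch. Every name in it (`ContextSketch`, `ServerContext`, `SketchIndexService`, `SketchPlanMaker`, the index type ids and the plan strings) is hypothetical and made up for illustration; none of these classes exist in Pinot, and the real `IndexService` API is different. It only shows how two servers in the same process could see different index sets and metric prefixes without any static state:

```java
import java.util.Set;

// All names below are hypothetical; none of these classes exist in Pinot.
final class ContextSketch {

  // Stand-in for the real IndexService: an immutable set of index type ids.
  static final class SketchIndexService {
    private final Set<String> _indexTypeIds;

    SketchIndexService(Set<String> indexTypeIds) {
      _indexTypeIds = Set.copyOf(indexTypeIds);
    }

    boolean hasIndexType(String indexTypeId) {
      return _indexTypeIds.contains(indexTypeId);
    }
  }

  // Per-server context: holds what would otherwise live in static fields.
  static final class ServerContext {
    private final String _serverId;
    private final SketchIndexService _indexService;

    ServerContext(String serverId, SketchIndexService indexService) {
      _serverId = serverId;
      _indexService = indexService;
    }

    SketchIndexService getIndexService() {
      return _indexService;
    }

    // Prefixes a meter name with the id of the server that owns this context.
    String metricName(String meter) {
      return _serverId + "." + meter;
    }
  }

  // A component that receives the context instead of reading static state.
  static final class SketchPlanMaker {
    private final ServerContext _context;

    SketchPlanMaker(ServerContext context) {
      _context = context;
    }

    // Picks a leaf plan depending on whether this server knows the index type.
    String planLeaf(String indexTypeId) {
      return _context.getIndexService().hasIndexType(indexTypeId) ? "index-scan" : "full-scan";
    }
  }

  public static void main(String[] args) {
    // Two "servers" in the same process, each with its own index set.
    ServerContext a = new ServerContext("server-a", new SketchIndexService(Set.of("inverted", "range")));
    ServerContext b = new ServerContext("server-b", new SketchIndexService(Set.of("inverted")));

    System.out.println(new SketchPlanMaker(a).planLeaf("range")); // prints "index-scan"
    System.out.println(new SketchPlanMaker(b).planLeaf("range")); // prints "full-scan"
    System.out.println(a.metricName("queries"));                  // prints "server-a.queries"
  }
}
```

Because nothing in this sketch is static, two tests could build different contexts and run in parallel without interfering with each other, which is the property the static `setInstance` approach cannot give us.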