gortiz commented on code in PR #10192:
URL: https://github.com/apache/pinot/pull/10192#discussion_r1136848647


##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/IndexService.java:
##########
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.pinot.segment.spi.index;
+
+import com.google.common.collect.Sets;
+import java.util.HashSet;
+import java.util.Optional;
+import java.util.ServiceLoader;
+import java.util.Set;
+
+
+/**
+ * This is the entry point of the Index SPI.
+ *
+ * Ideally, if we used some kind of injection system, this class should be injected into a Pinot context all classes can
+ * receive when they are built. Given that Pinot doesn't have that, we have to relay on static fields.
+ *
+ * By default, this class will be initialized by reading all ServiceLoader SPI services that implement
+ * {@link IndexPlugin}, adding all the {@link IndexType} that can be found in that way.
+ *
+ * In case we want to change the instance to be used at runtime, the method {@link #setInstance(IndexService)} can be
+ * called.

Review Comment:
   > In general, it is simpler to maintain immutable instances of objects
   
   Totally agree. I try to make objects immutable unless there is a strong 
reason not to. This case is even worse: we are not making the instance 
mutable, we are making the static state mutable, which is usually a red flag. I 
really tried to avoid this, but right now this is the only way I have found to 
do it.
   
   So there are two different questions here: 
   1. Why would we need an IndexService that is not created from the 
classpath?
   2. Assuming we need an IndexService that is not created from the classpath, 
how could we have that feature without modifying static state?
   
   I think this [other 
discussion](https://github.com/apache/pinot/pull/10192/files/42a5f0ed55636f5d9acfc9125b4596ff40ce670e#r1136583129)
 is also related to the first question, so I'm going to answer that there and 
I'm going to focus here on the second question.
   
   About the (ab)use of static state in Pinot: I think that is a flaw in the 
current Pinot architecture. Unless I'm wrong, static variables are the only way 
we have to share state between different parts of the code.
   
   TL;DR: I think we need to add some contextual object (and probably some 
dependency injection) to Pinot in order to be able to make it more complex 
without abusing static state.
   
   Long story:
   
   IMHO Pinot abuses static state (here, in metrics, now in PlanMaker, etc). 
A mentor I had once told me that one of the most important things he focuses on 
when reviewing an architectural change is whether several instances of the code 
can run in the same process. He wasn't talking about Java instances but 
_service_ instances. In Pinot's case, the question would be whether two Pinot 
servers or two Pinot brokers can run in the same process. 
   
   This idea looks quite artificial. What is the business advantage of running 
two servers in the same process? Probably none! You would probably never want 
to do that in production! But this idea produces better architectures. For 
example, if you can run two servers in the same process, testing is much 
easier. Also, suddenly you cannot use static variables to store information 
that changes from one server to the other. If you cannot use static variables 
for that, you need some kind of Context class that you can inject into your 
instances in order to get the information related to each server running in 
the same process. And being forced to use this Context object suddenly makes a 
lot of problems easier to solve:
   * How can each server in the same process see different indexes? It is 
simple, just store an `IndexService` instance in the Context instead of in a 
static variable!
   * How can each server have its own metrics? It is simple, just store the 
MetricRegistry in the Context instead of using a static variable! And each 
meter should have a prefix that identifies the specific server.
   * How can I make my PlanMaker change a leaf node depending on whether an 
index is in the IndexService or not? Again, add that to the Context.
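   
   To make the bullets above more concrete, here is a minimal sketch of what such a Context could look like. None of these classes exist in Pinot or in this PR: `ServerContext`, `SketchIndexService` and `SketchPlanMaker` are hypothetical names, and index types are reduced to plain strings to keep the example short.

```java
import java.util.Set;

// Hypothetical stand-in for IndexService: it only knows index type names.
final class SketchIndexService {
  private final Set<String> indexTypeNames;

  SketchIndexService(Set<String> indexTypeNames) {
    this.indexTypeNames = Set.copyOf(indexTypeNames);
  }

  boolean hasIndexType(String name) {
    return indexTypeNames.contains(name);
  }
}

// Hypothetical per-server context: immutable and passed through constructors,
// so two servers in the same process can hold different values.
final class ServerContext {
  private final String serverId;
  private final SketchIndexService indexService;

  ServerContext(String serverId, SketchIndexService indexService) {
    this.serverId = serverId;
    this.indexService = indexService;
  }

  SketchIndexService getIndexService() {
    return indexService;
  }

  // Prefixes a meter name with the server id, as suggested for metrics.
  String metricName(String name) {
    return serverId + "." + name;
  }
}

// Hypothetical component that reads the context instead of a static field.
final class SketchPlanMaker {
  private final ServerContext context;

  SketchPlanMaker(ServerContext context) {
    this.context = context;
  }

  String planLeaf(String indexTypeName) {
    return context.getIndexService().hasIndexType(indexTypeName) ? "index-scan" : "full-scan";
  }
}
```

   With something like this, two `ServerContext` instances built with different index type sets can live in the same process, and each `SketchPlanMaker` only sees the one it was given.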
   
   This Context object doesn't exist in Pinot because it wasn't designed with 
this architectural idea in mind. And Pinot is a great product without it! But 
the lack of this concept makes it more difficult to test Pinot or to create 
complex behaviors. For example, to test the behavior of your code using 
different sets of index types, you need to change a static variable, which 
means that running tests in parallel is quite difficult.
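   
   As a rough illustration of the testing point, the sketch below runs two checks concurrently, each with its own set of index types. It is not Pinot code: `IndexTypeSetCheck` and its methods are hypothetical, and the index type set is passed as a plain argument where real code would receive it through some context object. Because no static variable is shared, the two checks cannot interfere with each other.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class IndexTypeSetCheck {
  // Hypothetical code under test: the index type names are an explicit input.
  static boolean supports(Set<String> indexTypeNames, String name) {
    return indexTypeNames.contains(name);
  }

  // Runs two checks in parallel, each one with a different index type set.
  static List<Boolean> runInParallel() {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      Callable<Boolean> withRange = () -> supports(Set.of("inverted", "range"), "range");
      Callable<Boolean> withoutRange = () -> supports(Set.of("inverted"), "range");
      // invokeAll returns the futures in the same order as the submitted tasks.
      List<Future<Boolean>> results = pool.invokeAll(List.of(withRange, withoutRange));
      return List.of(results.get(0).get(), results.get(1).get());
    } catch (Exception e) {
      throw new IllegalStateException("parallel check failed", e);
    } finally {
      pool.shutdown();
    }
  }
}
```

   If both checks had to call something like `setInstance` on a shared static field first, the result of each one would depend on which thread wrote last.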



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

