[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

GitBox Thu, 06 Aug 2020 23:58:28 -0700


sigram commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466853804




##########
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/CreateNewCollectionRequest.java
##########
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * <p>Request for creating a new collection with a given set of shards and 
replication factor for various replica types.
+ * The expected {@link WorkOrder} corresponding to this {@link Request} is 
created using
+ * {@link WorkOrderFactory#createWorkOrderNewCollection}
+ *
+ * <p>Note there is no need at this stage to allow the plugin to know each 
shard hash range for example, this can be handled
+ * by the Solr side implementation of this interface without needing the 
plugin to worry about it (the implementation of this interface on
+ * the Solr side can maintain the ranges for each shard).
+ *
+ * <p>Same goes for the {@link org.apache.solr.core.ConfigSet} name or other 
collection parameters. They are needed for
+ * creating a Collection but likely do not have to be exposed to the plugin 
(this can easily be changed if needed by
+ * adding accessors here, the underlying Solr side implementation of this 
interface has the information).
+ */
+public interface CreateNewCollectionRequest extends Request {
+  /**
+   * <p>The name of the collection to be created and for which placement 
should be computed.
+   *
+   * <p>Compare this method with {@link AddReplicasRequest#getCollection()}, 
there the collection already exists so can be
+   * directly passed in the {@link Request}.
+   *
+   * <p>When processing this request, plugin code doesn't have to worry about 
existing {@link Replica}'s for the collection
+   * given that the collection is assumed not to exist.
+   */
+  String getCollectionName();
+
+  Set<String> getShardNames();
+
+  /**
+   * <p>Properties passed through the Collection API by the client creating 
the collection.
+   * See {@link SolrCollection#getCustomProperty(String)}.
+   *
+   * <p>Given this {@link Request} is for creating a new collection, it is not 
possible to pass the custom property values through
+   * the {@link SolrCollection} object. That instance does not exist yet, and 
is the reason {@link #getCollectionName()} exists
+   * rather than a method returning {@link SolrCollection}...
+   */
+  String getCustomProperty(String customPropertyName);

Review comment:
       Ok.

##########
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java
##########
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * <p>Request for creating one or more {@link Replica}'s for one or more 
{@link Shard}'s of an existing {@link SolrCollection}.
+ * The shard might or might not already exist, plugin code can easily find out 
by using {@link SolrCollection#getShards()}
+ * and verifying if the shard name(s) from {@link #getShardNames()} are there.
+ *
+ * <p>As opposed to {@link CreateNewCollectionRequest}, the set of {@link 
Node}s on which the replicas should be placed
+ * is specified (defaults to being equal to the set returned by {@link 
Cluster#getLiveNodes()}).
+ *
+ * <p>There is no extension between this interface and {@link 
CreateNewCollectionRequest} in either direction
+ * or from a common ancestor for readability. An ancestor could make sense and 
would be an "abstract interface" not intended
+ * to be implemented directly, but this does not exist in Java.
+ *
+ * <p>Plugin code would likely treat the two types of requests differently 
since here existing {@link Replica}'s must be taken
+ * into account for placement whereas in {@link CreateNewCollectionRequest} no 
{@link Replica}'s are assumed to exist.
+ */
+public interface AddReplicasRequest extends Request {
+  /**
+   * The {@link SolrCollection} to add {@link Replica}(s) to. The replicas are 
to be added to a shard that might or might
+   * not yet exist when the plugin's {@link PlacementPlugin#computePlacement} 
is called.
+   */
+  SolrCollection getCollection();
+
+  /**
+   * <p>Shard name(s) for which new replicas placement should be computed. The 
shard(s) might exist or not (that's why this
+   * method returns a {@link Set} of {@link String}'s and not directly a set 
of {@link Shard} instances).
+   *
+   * <p>Note the Collection API allows specifying the shard name or a {@code 
_route_} parameter. The Solr implementation will
+   * convert either specification into the relevant shard name so the plugin 
code doesn't have to worry about this.
+   */
+  Set<String> getShardNames();
+
+  /** Replicas should only be placed on nodes from the set returned by this 
method. */
+  Set<Node> getTargetNodes();

Review comment:
       Ok, but if we specify the semantics of this method more precisely then 
it will be a better guide to the implementors. As it is now it's not clear what 
to return when the target set is the whole cluster.

##########
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java
##########
@@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * <p>Placement decision for a single {@link Replica}. Note this placement 
decision is used as part of a {@link WorkOrder},
+ * it does not directly lead to the plugin code getting a corresponding {@link 
Replica} instance, nor does it require the
+ * plugin to provide a {@link Shard} instance (the plugin code gets such 
instances for existing replicas and shards in the
+ * cluster but does not create them directly for adding new replicas for new 
or existing shards).
+ *
+ * <p>Captures the {@link Shard} (via the shard name), {@link Node} and {@link 
Replica.ReplicaType} of a Replica to be created.
+ */
+public interface ReplicaPlacement {

Review comment:
       It's a common problem today (or at least in 8x) to actually understand 
why the policy made the placement decisions it did. Imagine having to 
investigate it without having a handy reference to the request that caused 
these decisions.
   
   So the purpose of this reference is not for the plugin, it's for the user to 
be able to track why a particular placement decision was made. We could leave 
this to the plugins to implement it but I think it's an important aspect of 
auditing and debugging and we should enforce it on plugins using the interfaces.
   
   Even if a plugin just randomly assigns placements there is always a cause 
for the assignments to be generated. Whether we include the whole Request or 
just an id, I think we should explcitly require this information to be tracked.

##########
File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java
##########
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.io.IOException;
+import java.util.Optional;
+import java.util.Set;
+
+/**
+ * <p>A representation of the (initial) cluster state, providing information 
on which nodes are part of the cluster and a way
+ * to get to more detailed info.
+ *
+ * <p>This instance can also be used as a {@link PropertyValueSource} if 
{@link PropertyKey}'s need to be specified with
+ * a global cluster target.
+ */
+public interface Cluster extends PropertyValueSource {
+  /**
+   * @return current set of live nodes. Never <code>null</code>, never empty 
(Solr wouldn't call the plugin if empty
+   * since no useful could then be done).
+   */
+  Set<Node> getLiveNodes();
+
+  /**
+   * <p>Returns info about the given collection if one exists. Because it is 
not expected for plugins to request info about
+   * a large number of collections, requests can only be made one by one.
+   *
+   * <p>This is also the reason we do not return a {@link java.util.Map} or 
{@link Set} of {@link SolrCollection}'s here: it would be
+   * wasteful to fetch all data and fill such a map when plugin code likely 
needs info about at most one or two collections.
+   */
+  Optional<SolrCollection> getCollection(String collectionName) throws 
IOException;

Review comment:
       Right, I thought of `withCollection`. But there are other cases where we 
may want to avoid placing replicas on nodes with replicas from other 
collections. Eg. at Lucidworks we heavily use several predefined system 
collections for logging and analytics jobs, and we want to avoid putting search 
collections on those nodes if possible. This is currently achieved by totally 
separating the nodes using exclusive policy properties, but a different softer 
mechanism could be better on resource-constrained clusters - but then you need 
to know the placements of other collections...

##########
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java
##########
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Factory used by the plugin to create property keys to request property 
values from Solr.<p>
+ *
+ * Building of a {@link PropertyKey} requires specifying the target (context) 
from which the value of that key should be
+ * obtained. This is done by specifying the appropriate {@link 
PropertyValueSource}.<br>
+ * For clarity, when only a single type of target is acceptable, the 
corresponding subtype of {@link PropertyValueSource} is used instead
+ * (for example {@link Node}).
+ */
+public interface PropertyKeyFactory {
+  /**
+   * Returns a property key to request the number of cores on a {@link Node}.
+   */
+  PropertyKey createCoreCountKey(Node node);
+
+  /**
+   * Returns a property key to request disk related info on a {@link Node}.
+   */
+  PropertyKey createDiskInfoKey(Node node);
+
+  /**
+   * Returns a property key to request the value of a system property on a 
{@link Node}.
+   * @param systemPropertyName the name of the system property to retrieve.
+   */
+  PropertyKey createSystemPropertyKey(Node node, String systemPropertyName);
+
+  /**
+   * Returns a property key to request the value of a metric.<p>
+   *
+   * Not all metrics make sense everywhere, but metrics can be applied to 
different objects. For example
+   * <code>SEARCHER.searcher.indexCommitSize</code> would make sense for a 
given replica of a given shard of a given collection,
+   * and possibly in other contexts.<p>
+   *
+   * @param metricSource The registry of the metric. For example a specific 
{@link Replica}.
+   * @param metricName for example 
<code>SEARCHER.searcher.indexCommitSize</code>.
+   */
+  PropertyKey createMetricKey(PropertyValueSource metricSource, String 
metricName);

Review comment:
       .. and yes, the metrics in `solr.jvm` are a totally different set of 
metrics from those in `solr.node`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

Reply via email to