sigram commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466853804
########## File path: solr/core/src/java/org/apache/solr/cluster/placement/CreateNewCollectionRequest.java ########## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * <p>Request for creating a new collection with a given set of shards and replication factor for various replica types. + * The expected {@link WorkOrder} corresponding to this {@link Request} is created using + * {@link WorkOrderFactory#createWorkOrderNewCollection} + * + * <p>Note there is no need at this stage to allow the plugin to know each shard hash range for example, this can be handled + * by the Solr side implementation of this interface without needing the plugin to worry about it (the implementation of this interface on + * the Solr side can maintain the ranges for each shard). + * + * <p>Same goes for the {@link org.apache.solr.core.ConfigSet} name or other collection parameters. They are needed for + * creating a Collection but likely do not have to be exposed to the plugin (this can easily be changed if needed by + * adding accessors here, the underlying Solr side implementation of this interface has the information). + */ +public interface CreateNewCollectionRequest extends Request { + /** + * <p>The name of the collection to be created and for which placement should be computed. + * + * <p>Compare this method with {@link AddReplicasRequest#getCollection()}, there the collection already exists so can be + * directly passed in the {@link Request}. + * + * <p>When processing this request, plugin code doesn't have to worry about existing {@link Replica}'s for the collection + * given that the collection is assumed not to exist. + */ + String getCollectionName(); + + Set<String> getShardNames(); + + /** + * <p>Properties passed through the Collection API by the client creating the collection. + * See {@link SolrCollection#getCustomProperty(String)}. + * + * <p>Given this {@link Request} is for creating a new collection, it is not possible to pass the custom property values through + * the {@link SolrCollection} object. That instance does not exist yet, and is the reason {@link #getCollectionName()} exists + * rather than a method returning {@link SolrCollection}... + */ + String getCustomProperty(String customPropertyName); Review comment: Ok. ########## File path: solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java ########## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * <p>Request for creating one or more {@link Replica}'s for one or more {@link Shard}'s of an existing {@link SolrCollection}. + * The shard might or might not already exist, plugin code can easily find out by using {@link SolrCollection#getShards()} + * and verifying if the shard name(s) from {@link #getShardNames()} are there. + * + * <p>As opposed to {@link CreateNewCollectionRequest}, the set of {@link Node}s on which the replicas should be placed + * is specified (defaults to being equal to the set returned by {@link Cluster#getLiveNodes()}). + * + * <p>There is no extension between this interface and {@link CreateNewCollectionRequest} in either direction + * or from a common ancestor for readability. An ancestor could make sense and would be an "abstract interface" not intended + * to be implemented directly, but this does not exist in Java. + * + * <p>Plugin code would likely treat the two types of requests differently since here existing {@link Replica}'s must be taken + * into account for placement whereas in {@link CreateNewCollectionRequest} no {@link Replica}'s are assumed to exist. + */ +public interface AddReplicasRequest extends Request { + /** + * The {@link SolrCollection} to add {@link Replica}(s) to. The replicas are to be added to a shard that might or might + * not yet exist when the plugin's {@link PlacementPlugin#computePlacement} is called. + */ + SolrCollection getCollection(); + + /** + * <p>Shard name(s) for which new replicas placement should be computed. The shard(s) might exist or not (that's why this + * method returns a {@link Set} of {@link String}'s and not directly a set of {@link Shard} instances). + * + * <p>Note the Collection API allows specifying the shard name or a {@code _route_} parameter. The Solr implementation will + * convert either specification into the relevant shard name so the plugin code doesn't have to worry about this. + */ + Set<String> getShardNames(); + + /** Replicas should only be placed on nodes from the set returned by this method. */ + Set<Node> getTargetNodes(); Review comment: Ok, but if we specify the semantics of this method more precisely then it will be a better guide to the implementors. As it is now it's not clear what to return when the target set is the whole cluster. ########## File path: solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java ########## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * <p>Placement decision for a single {@link Replica}. Note this placement decision is used as part of a {@link WorkOrder}, + * it does not directly lead to the plugin code getting a corresponding {@link Replica} instance, nor does it require the + * plugin to provide a {@link Shard} instance (the plugin code gets such instances for existing replicas and shards in the + * cluster but does not create them directly for adding new replicas for new or existing shards). + * + * <p>Captures the {@link Shard} (via the shard name), {@link Node} and {@link Replica.ReplicaType} of a Replica to be created. + */ +public interface ReplicaPlacement { Review comment: It's a common problem today (or at least in 8x) to actually understand why the policy made the placement decisions it did. Imagine having to investigate it without having a handy reference to the request that caused these decisions. So the purpose of this reference is not for the plugin, it's for the user to be able to track why a particular placement decision was made. We could leave this to the plugins to implement it but I think it's an important aspect of auditing and debugging and we should enforce it on plugins using the interfaces. Even if a plugin just randomly assigns placements there is always a cause for the assignments to be generated. Whether we include the whole Request or just an id, I think we should explcitly require this information to be tracked. ########## File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java ########## @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.io.IOException; +import java.util.Optional; +import java.util.Set; + +/** + * <p>A representation of the (initial) cluster state, providing information on which nodes are part of the cluster and a way + * to get to more detailed info. + * + * <p>This instance can also be used as a {@link PropertyValueSource} if {@link PropertyKey}'s need to be specified with + * a global cluster target. + */ +public interface Cluster extends PropertyValueSource { + /** + * @return current set of live nodes. Never <code>null</code>, never empty (Solr wouldn't call the plugin if empty + * since no useful could then be done). + */ + Set<Node> getLiveNodes(); + + /** + * <p>Returns info about the given collection if one exists. Because it is not expected for plugins to request info about + * a large number of collections, requests can only be made one by one. + * + * <p>This is also the reason we do not return a {@link java.util.Map} or {@link Set} of {@link SolrCollection}'s here: it would be + * wasteful to fetch all data and fill such a map when plugin code likely needs info about at most one or two collections. + */ + Optional<SolrCollection> getCollection(String collectionName) throws IOException; Review comment: Right, I thought of `withCollection`. But there are other cases where we may want to avoid placing replicas on nodes with replicas from other collections. Eg. at Lucidworks we heavily use several predefined system collections for logging and analytics jobs, and we want to avoid putting search collections on those nodes if possible. This is currently achieved by totally separating the nodes using exclusive policy properties, but a different softer mechanism could be better on resource-constrained clusters - but then you need to know the placements of other collections... ########## File path: solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java ########## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Factory used by the plugin to create property keys to request property values from Solr.<p> + * + * Building of a {@link PropertyKey} requires specifying the target (context) from which the value of that key should be + * obtained. This is done by specifying the appropriate {@link PropertyValueSource}.<br> + * For clarity, when only a single type of target is acceptable, the corresponding subtype of {@link PropertyValueSource} is used instead + * (for example {@link Node}). + */ +public interface PropertyKeyFactory { + /** + * Returns a property key to request the number of cores on a {@link Node}. + */ + PropertyKey createCoreCountKey(Node node); + + /** + * Returns a property key to request disk related info on a {@link Node}. + */ + PropertyKey createDiskInfoKey(Node node); + + /** + * Returns a property key to request the value of a system property on a {@link Node}. + * @param systemPropertyName the name of the system property to retrieve. + */ + PropertyKey createSystemPropertyKey(Node node, String systemPropertyName); + + /** + * Returns a property key to request the value of a metric.<p> + * + * Not all metrics make sense everywhere, but metrics can be applied to different objects. For example + * <code>SEARCHER.searcher.indexCommitSize</code> would make sense for a given replica of a given shard of a given collection, + * and possibly in other contexts.<p> + * + * @param metricSource The registry of the metric. For example a specific {@link Replica}. + * @param metricName for example <code>SEARCHER.searcher.indexCommitSize</code>. + */ + PropertyKey createMetricKey(PropertyValueSource metricSource, String metricName); Review comment: .. and yes, the metrics in `solr.jvm` are a totally different set of metrics from those in `solr.node`. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org