[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
sigram commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467387315 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementException.java ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Exception thrown by a {@link PlacementPlugin} when it is unable to compute placement for whatever reason (except an Review comment: Some kind of policy violation comes to mind, but you're right, it's too vague. OTOH with the Policy engine these kinds of exceptions were invaluable for tracking the reasons for failures (policy rule violation vs. other types of low-level errors), and in that case we could usually get detailed and well-structured (JSON maps) errors. Something to keep in mind. ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Factory used by the plugin to create property keys to request property values from Solr. + * + * Building of a {@link PropertyKey} requires specifying the target (context) from which the value of that key should be + * obtained. This is done by specifying the appropriate {@link PropertyValueSource}. + * For clarity, when only a single type of target is acceptable, the corresponding subtype of {@link PropertyValueSource} is used instead + * (for example {@link Node}). + */ +public interface PropertyKeyFactory { + /** + * Returns a property key to request the number of cores on a {@link Node}. + */ + PropertyKey createCoreCountKey(Node node); Review comment: Ok. ## File path: solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * Request for creating one or more {@link Replica}'s for one or more {@link Shard}'s of an existing {@link SolrCollection}. + * The shard might or might not already exist, plugin code can easily find out by using {@link SolrCollection#getShards()} + * and verifying if the shard name(s) from {@link #getShardNames()} are there. + * + * As opposed to {@link CreateNewCollectionRequest}, the set of {@link Node}s on which the replicas should be placed + * is specified (defaults to being equal to the set returned by {@link Cluster#getLiveNodes()}). + * + * There is no extension between this interface and {@link CreateNewCollectionReq
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467453245 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java ## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Placement decision for a single {@link Replica}. Note this placement decision is used as part of a {@link WorkOrder}, + * it does not directly lead to the plugin code getting a corresponding {@link Replica} instance, nor does it require the + * plugin to provide a {@link Shard} instance (the plugin code gets such instances for existing replicas and shards in the + * cluster but does not create them directly for adding new replicas for new or existing shards). + * + * Captures the {@link Shard} (via the shard name), {@link Node} and {@link Replica.ReplicaType} of a Replica to be created. + */ +public interface ReplicaPlacement { Review comment: Ok, added. Note that the interfaces in `org.apache.solr.cluster.placement` only condition what the plugin code is seeing, not what's present in the objects. 
The `Request` is present in the `WorkOrder` instances and I added it to the interface, but even if it's not in the interface, Solr side code would have it. My thinking (by not putting it in the interface) is that plugin code is creating these instances so it should know which instance is which and doesn't need a method on that instance to find out. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467453739 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlugin.java ## @@ -0,0 +1,41 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Implemented by external plugins to control replica placement and movement on the search cluster (as well as other things + * such as cluster elasticity?) when cluster changes are required (initiated elsewhere, most likely following a Collection + * API call). + */ +public interface PlacementPlugin { Review comment: I believe Solr side would keep the configuration (map) somewhere and just pass a reference to it on every call to the plugin. So cost would be minimal. If the plugin code needs to retrieve a config value, it will basically do a `get` on that map, very low cost here as well. I don't see how a plugin can keep state between invocations with a semantic that's different from a static field, unless we create a notion of higher level context shared between invocations (and different such higher level contexts will not be shared). 
I suggest we skip that aspect for now and reconsider later when we implement real plugins. For now I believe the model in which the plugin is instantiated (no arg constructor, or constructor getting the config map or something simple like that) before every call to the computePlacement method would be sufficient.
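The instantiation model described above can be sketched as follows. This is an illustrative sketch only: Solr holds the parsed configuration map and hands a reference to a freshly constructed plugin before each placement computation, so the per-call cost is a reference copy plus map lookups. The class name `SketchPlacementPlugin` and the `String`-based `computePlacement` signature are invented for the example and are not the actual SOLR-14613 interfaces.

```java
import java.util.Map;

// Solr side keeps the config map; the plugin just holds a shared reference to it.
class SketchPlacementPlugin {
    private final Map<String, String> config; // reference passed in by Solr, cheap

    SketchPlacementPlugin(Map<String, String> config) {
        this.config = config;
    }

    String computePlacement(String shardName) {
        // Reading a config value is a single map get: very low cost per invocation.
        String node = config.getOrDefault("preferredNode", "any");
        return shardName + " -> " + node;
    }
}
```

Because the plugin is re-instantiated before every call, any state it wants to keep between invocations would indeed have to live in a static field or in some higher-level shared context, as the comment notes.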
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467454703 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * Request for creating one or more {@link Replica}'s for one or more {@link Shard}'s of an existing {@link SolrCollection}. + * The shard might or might not already exist, plugin code can easily find out by using {@link SolrCollection#getShards()} + * and verifying if the shard name(s) from {@link #getShardNames()} are there. + * + * As opposed to {@link CreateNewCollectionRequest}, the set of {@link Node}s on which the replicas should be placed + * is specified (defaults to being equal to the set returned by {@link Cluster#getLiveNodes()}). + * + * There is no extension between this interface and {@link CreateNewCollectionRequest} in either direction + * or from a common ancestor for readability. An ancestor could make sense and would be an "abstract interface" not intended + * to be implemented directly, but this does not exist in Java. 
+ * + * Plugin code would likely treat the two types of requests differently since here existing {@link Replica}'s must be taken + * into account for placement whereas in {@link CreateNewCollectionRequest} no {@link Replica}'s are assumed to exist. + */ +public interface AddReplicasRequest extends Request { + /** + * The {@link SolrCollection} to add {@link Replica}(s) to. The replicas are to be added to a shard that might or might + * not yet exist when the plugin's {@link PlacementPlugin#computePlacement} is called. + */ + SolrCollection getCollection(); + + /** + * Shard name(s) for which new replicas placement should be computed. The shard(s) might exist or not (that's why this + * method returns a {@link Set} of {@link String}'s and not directly a set of {@link Shard} instances). + * + * Note the Collection API allows specifying the shard name or a {@code _route_} parameter. The Solr implementation will + * convert either specification into the relevant shard name so the plugin code doesn't have to worry about this. + */ + Set getShardNames(); + + /** Replicas should only be placed on nodes from the set returned by this method. */ + Set getTargetNodes(); Review comment: Used work order -> placement plan
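Since the two request interfaces deliberately share no placement-specific ancestor, plugin code has to branch on the concrete type explicitly. A minimal sketch of that dispatch (using `instanceof`; a visitor would also work) is below. The interface names mirror the proposal, but the empty bodies and the `describe()` helper are purely illustrative.

```java
// Marker interfaces standing in for the proposal's request types.
interface Request {}
interface AddReplicasRequest extends Request {}
interface CreateNewCollectionRequest extends Request {}

class PlacementDispatch {
    static String describe(Request request) {
        if (request instanceof AddReplicasRequest) {
            // Existing replicas must be taken into account for placement.
            return "add replicas, accounting for existing ones";
        } else if (request instanceof CreateNewCollectionRequest) {
            // No replicas are assumed to exist yet.
            return "place replicas for a brand-new collection";
        }
        throw new IllegalArgumentException("unsupported request type: " + request);
    }
}
```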
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467455202 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java Review comment: And also `Request` -> `PlacementRequest`
[jira] [Commented] (SOLR-13933) Cluster mode Stress test suite
[ https://issues.apache.org/jira/browse/SOLR-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173637#comment-17173637 ] Erick Erickson commented on SOLR-13933: --- BTW, this is excellent, we've needed this for a long time. > Cluster mode Stress test suite > --- > > Key: SOLR-13933 > URL: https://issues.apache.org/jira/browse/SOLR-13933 > Project: Solr > Issue Type: Bug >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > > We need a stress test harness based on 10s or 100s of nodes, 1000s of > collection API operations, overseer operations etc. This suite should run > nightly, publish results publicly, so as to help with: > # Uncover stability problems > # Benchmarking (timings, resource metrics etc.) on collection operations > # Indexing/querying performance > # Validate the accuracy of potential improvements > References: > SOLR-10317 > https://github.com/lucidworks/solr-scale-tk > https://github.com/shalinmangar/solr-perf-tools > Lucene benchmarks -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1728: SOLR-14596: equals/hashCode for common SolrRequest classes
ErickErickson commented on a change in pull request #1728: URL: https://github.com/apache/lucene-solr/pull/1728#discussion_r467457703 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/ResponseParser.java ## @@ -49,4 +52,31 @@ public String getVersion() { return "2.2"; } + + @Override + public int hashCode() { +return new HashCodeBuilder() +.append(getWriterType()) +.append(getContentType()) +.append(getVersion()) +.toHashCode(); + } + + @Override + public boolean equals(Object rhs) { +if (rhs == null || getClass() != rhs.getClass()) { + return false; +} else if (this == rhs) { + return true; +} else if (hashCode() != rhs.hashCode()) { + return false; +} + +final ResponseParser rhsCast = (ResponseParser) rhs; Review comment: I'm curious about the comparison between Objects.hash(bunch of params) and the commons builder. I'm not advocating you change to Objects.hash, rather wondering whether there's a particular advantage to using one over the other.
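For comparison with the `HashCodeBuilder` style quoted above, here is a hedged sketch of the JDK-only alternative using `Objects.hash`/`Objects.equals`. For a fixed handful of fields the two approaches are functionally interchangeable; `HashCodeBuilder` offers a fluent API and a configurable seed/multiplier, while `Objects.hash` needs no extra dependency. The `ParserKey` class below is an invented stand-in, not the real `ResponseParser`.

```java
import java.util.Objects;

class ParserKey {
    final String writerType;
    final String contentType;
    final String version;

    ParserKey(String writerType, String contentType, String version) {
        this.writerType = writerType;
        this.contentType = contentType;
        this.version = version;
    }

    @Override
    public int hashCode() {
        // Equivalent in effect to chaining HashCodeBuilder.append() over each field.
        return Objects.hash(writerType, contentType, version);
    }

    @Override
    public boolean equals(Object rhs) {
        if (this == rhs) return true;
        if (rhs == null || getClass() != rhs.getClass()) return false;
        ParserKey other = (ParserKey) rhs;
        // Objects.equals handles nulls, mirroring EqualsBuilder's null-safety.
        return Objects.equals(writerType, other.writerType)
                && Objects.equals(contentType, other.contentType)
                && Objects.equals(version, other.version);
    }
}
```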
[jira] [Created] (SOLR-14724) "multiple values encountered for non multiValued" even after adding a multiValued field
Ishan Chattopadhyaya created SOLR-14724: --- Summary: "multiple values encountered for non multiValued" even after adding a multiValued field Key: SOLR-14724 URL: https://issues.apache.org/jira/browse/SOLR-14724 Project: Solr Issue Type: Task Security Level: Public (Default Security Level. Issues are Public) Reporter: Ishan Chattopadhyaya Attachments: faulty.jsonl Steps to reproduce: # Create a new collection # Remove the dynamic field pattern for *_i # Add a multivalued field "body.algolia.children.children.created_at_i" # Index the attached document that contains multiple values for that field. # Boom! {{"ERROR: [doc=1017960] multiple values encountered for non multiValued field body.algolia.children.children.created_at_i: [1262006144, 1262024153, 1261970915, 1261970358, 1262014651, 1261971697, 1261978873]"}} {code} [ishan@pseries ~] $ curl "http://localhost:18983/solr/admin/collections?action=CREATE&name=hn2&numShards=1"; { "responseHeader":{ "status":0, "QTime":981}, "success":{ "172.17.0.2:8983_solr":{ "responseHeader":{ "status":0, "QTime":351}, "core":"hn2_shard1_replica_n1"}}, "warning":"Using _default configset. Data driven schema functionality is enabled by default, which is NOT RECOMMENDED for production use. 
To turn it off: curl http://{host:port}/solr/hn2/config -d '{\"set-user-property\": {\"update.autoCreateFields\":\"false\"}}'"} [ishan@pseries ~] $ curl -X POST -H 'Content-type:application/json' --data-binary '{ "delete-dynamic-field":{ "name":"*_s" } }' http://localhost:18983/solr/hn2/schema { "responseHeader":{ "status":0, "QTime":389}} [ishan@pseries ~] $ curl -X POST -H 'Content-type:application/json' --data-binary '{ "add-field":{ "name":"body.algolia.children.created_at_i", "type":"pint", "multiValued":true, "stored":true } }' http://localhost:18983/solr/hn2/schema { "responseHeader":{ "status":0, "QTime":360}} [ishan@pseries ~] $ curl -X POST -d @faulty.jsonl -H "Content-Type: application/json" "http://localhost:18983/solr/hn2/update/json/docs?commit=true"; { "responseHeader":{ "rf":1, "status":400, "QTime":1045}, "error":{ "metadata":[ "error-class","org.apache.solr.common.SolrException", "root-error-class","org.apache.solr.common.SolrException"], "msg":"ERROR: [doc=1017960] multiple values encountered for non multiValued field body.algolia.children.children.created_at_i: [1262006144, 1262024153, 1261970915, 1261970358, 1262014651, 1261971697, 1261978873]", "code":400}} {code}
[jira] [Comment Edited] (SOLR-13933) Cluster mode Stress test suite
[ https://issues.apache.org/jira/browse/SOLR-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173556#comment-17173556 ] Ishan Chattopadhyaya edited comment on SOLR-13933 at 8/8/20, 11:56 AM:

I am actively working on this here: https://github.com/TheSearchStack/solr-bench/tree/stress-harness

Here is a sample suite: https://github.com/TheSearchStack/solr-bench/blob/stress-harness/rolling.json
It starts 3 Solr nodes, 1GB RAM each. It has two tasks:
# task1 creates a bunch of collections and indexes some data into them.
# task2 waits for task1 to end, then performs a rolling restart of the cluster (i.e. restarts a node, waits until all replicas on the restarted node are active, and proceeds to the next node until all nodes have been restarted), measuring the timings throughout.

Here is another sample suite: https://github.com/TheSearchStack/solr-bench/blob/stress-harness/workflow.json
It performs various tasks in task1 through task5, including indexing, creating many collections, shard splitting, restarting a node, and validating the number of documents in a collection (this last one is WIP). As defined, some of these tasks run in parallel with each other; some are blocking.

It is my intention to wrap this suite up quickly and start running automated tests for a medium-scale Solr cluster (say, 50-100 nodes, ~2GB RAM each) on each of our branches, publishing results periodically. This will be especially useful for benchmarking the stability of the cluster, with SOLR-13951 and SOLR-14636.
[jira] [Resolved] (SOLR-14724) "multiple values encountered for non multiValued" even after adding a multiValued field
[ https://issues.apache.org/jira/browse/SOLR-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya resolved SOLR-14724. - Resolution: Invalid Damn, I was removing the dynamic field for "*_s" instead of "*_i". I'm closing this issue, but there's definitely something wrong with Solr. I added a field explicitly, but still dynamic field type takes preference. > "multiple values encountered for non multiValued" even after adding a > multiValued field > --- > > Key: SOLR-14724 > URL: https://issues.apache.org/jira/browse/SOLR-14724 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Priority: Major > Attachments: faulty.jsonl > > > Steps to reproduce: > # Create a new collection > # Remove the dynamic field pattern for *_i > # Add a multivalued field "body.algolia.children.children.created_at_i" > # Index the attached document that contains multiple values for that field. > # Boom! {{"ERROR: [doc=1017960] multiple values encountered for non > multiValued field body.algolia.children.children.created_at_i: [1262006144, > 1262024153, 1261970915, 1261970358, 1262014651, 1261971697, 1261978873]"}} > {code} > [ishan@pseries ~] $ curl > "http://localhost:18983/solr/admin/collections?action=CREATE&name=hn2&numShards=1"; > { > "responseHeader":{ > "status":0, > "QTime":981}, > "success":{ > "172.17.0.2:8983_solr":{ > "responseHeader":{ > "status":0, > "QTime":351}, > "core":"hn2_shard1_replica_n1"}}, > "warning":"Using _default configset. Data driven schema functionality is > enabled by default, which is NOT RECOMMENDED for production use. 
To turn it > off: curl http://{host:port}/solr/hn2/config -d '{\"set-user-property\": > {\"update.autoCreateFields\":\"false\"}}'"} > [ishan@pseries ~] $ curl -X POST -H 'Content-type:application/json' > --data-binary '{ > "delete-dynamic-field":{ "name":"*_s" } > }' http://localhost:18983/solr/hn2/schema > { > "responseHeader":{ > "status":0, > "QTime":389}} > [ishan@pseries ~] $ (reverse-i-search)`add-fi': curl -X POST -H > 'Content-type:application/json' --data-binary '{ > "add-field":{ > "name":"body.algolia.children.created_at_i", > "type":"pint", > "multiValued":true, > "stored":true } > }' http://localhost:18983/solr/hn2/schema > { > "responseHeader":{ > "status":0, > "QTime":360}} > [ishan@pseries ~] $ curl -X POST -d @faulty.jsonl -H "Content-Type: > application/json" > "http://localhost:18983/solr/hn2/update/json/docs?commit=true"; > { > "responseHeader":{ > "rf":1, > "status":400, > "QTime":1045}, > "error":{ > "metadata":[ > "error-class","org.apache.solr.common.SolrException", > "root-error-class","org.apache.solr.common.SolrException"], > "msg":"ERROR: [doc=1017960] multiple values encountered for non > multiValued field body.algolia.children.children.created_at_i: [1262006144, > 1262024153, 1261970915, 1261970358, 1262014651, 1261971697, 1261978873]", > "code":400}} > {code}
[jira] [Commented] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173643#comment-17173643 ] Ishan Chattopadhyaya commented on SOLR-13438: - Can't believe this is still not fixed. It needs to be! I was bitten by this as I was experimenting with a new dataset today: https://twitter.com/ichattopadhyaya/status/1292018719548305409. Marking this as a beginner issue. > DELETE collection should remove AUTOCREATED configsets > -- > > Key: SOLR-13438 > URL: https://issues.apache.org/jira/browse/SOLR-13438 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > > Current user experience: > # User creates a collection (without specifying configset), and makes some > schema/config changes. > # He's not happy with how the changes turned out, so he deletes and > re-creates the collection. > # He observes that the previously made settings changes persist. If he is > only aware of Schema and Config APIs and not explicitly aware of the concept > of configsets, this will be un-intuitive for him. > Proposed: > DELETE collection could have a flag that can be enabled to delete the > configset if it has the prefix ".AUTOCREATED".
[jira] [Updated] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya updated SOLR-13438: Labels: newdev (was: ) > DELETE collection should remove AUTOCREATED configsets > -- > > Key: SOLR-13438 > URL: https://issues.apache.org/jira/browse/SOLR-13438 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: newdev > > Current user experience: > # User creates a collection (without specifying configset), and makes some > schema/config changes. > # He's not happy with how the changes turned out, so he deletes and > re-creates the collection. > # He observes that the previously made settings changes persist. If he is > only aware of Schema and Config APIs and not explicitly aware of the concept > of configsets, this will be un-intuitive for him. > Proposed: > DELETE collection could have a flag that can be enabled to delete the > configset if it has the prefix ".AUTOCREATED".
[jira] [Updated] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya updated SOLR-13438: Description: Current user experience: # User creates a collection (without specifying configset), and makes some schema/config changes. # He's/She's not happy with how the changes turned out, so he/she deletes and re-creates the collection. # He/she observes that the previously made settings changes persist. If he/she is only aware of Schema and Config APIs and not explicitly aware of the concept of configsets, this will be un-intuitive for him/her. Proposed: DELETE collection should delete the configset if it has the prefix ".AUTOCREATED" and that configset isn't being shared by any other collection. was: Current user experience: # User creates a collection (without specifying configset), and makes some schema/config changes. # He's not happy with how the changes turned out, so he deletes and re-creates the collection. # He observes that the previously made settings changes persist. If he is only aware of Schema and Config APIs and not explicitly aware of the concept of configsets, this will be un-intuitive for him. Proposed: DELETE collection could have a flag that can be enabled to delete the configset if it has the prefix ".AUTOCREATED". > DELETE collection should remove AUTOCREATED configsets > -- > > Key: SOLR-13438 > URL: https://issues.apache.org/jira/browse/SOLR-13438 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: newdev > > Current user experience: > # User creates a collection (without specifying configset), and makes some > schema/config changes. > # He's/She's not happy with how the changes turned out, so he/she deletes and > re-creates the collection. > # He/she observes that the previously made settings changes persist. 
If > he/she is only aware of Schema and Config APIs and not explicitly aware of > the concept of configsets, this will be un-intuitive for him/her. > Proposed: > DELETE collection should delete the configset if it has the prefix > ".AUTOCREATED" and that configset isn't being shared by any other collection.
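The updated proposal amounts to a two-part condition on collection deletion. A minimal sketch of that rule, with hypothetical names (this is not actual Solr code, and the `.AUTOCREATED` marker check is written loosely since the ticket calls it a prefix while auto-created configset names conventionally carry it as a suffix):

```python
AUTOCREATED_MARKER = ".AUTOCREATED"

def should_delete_configset(configset_name, deleted_collection, configset_usage):
    """Decide whether DELETE collection should also remove the configset.

    configset_usage is assumed to map configset name -> set of collections
    currently using it. Delete only if the configset was auto-created AND
    no collection other than the one being deleted still uses it.
    """
    if AUTOCREATED_MARKER not in configset_name:
        return False  # user-managed configset: never touch it
    remaining = configset_usage.get(configset_name, set()) - {deleted_collection}
    return len(remaining) == 0
```

The sharing check is what distinguishes the updated description from the original flag-based proposal: an auto-created configset referenced by a second collection must survive the delete.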
[jira] [Commented] (SOLR-13933) Cluster mode Stress test suite
[ https://issues.apache.org/jira/browse/SOLR-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173645#comment-17173645 ] Ishan Chattopadhyaya commented on SOLR-13933: - bq. BTW, this is excellent, we've needed this for a long time. Thanks Erick! It is about time that we bring in tight policing to defend against attacks like SOLR-14665. > Cluster mode Stress test suite > --- > > Key: SOLR-13933 > URL: https://issues.apache.org/jira/browse/SOLR-13933 > Project: Solr > Issue Type: Bug >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > > We need a stress test harness based on 10s or 100s of nodes, 1000s of > collection API operations, overseer operations etc. This suite should run > nightly, publish results publicly, so as to help with: > # Uncover stability problems > # Benchmarking (timings, resource metrics etc.) on collection operations > # Indexing/querying performance > # Validate the accuracy of potential improvements > References: > SOLR-10317 > https://github.com/lucidworks/solr-scale-tk > https://github.com/shalinmangar/solr-perf-tools > Lucene benchmarks
[jira] [Commented] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173668#comment-17173668 ] Marcus Eagan commented on SOLR-13438: - [~ichattopadhyaya] It's a constant issue for many users. > DELETE collection should remove AUTOCREATED configsets > -- > > Key: SOLR-13438 > URL: https://issues.apache.org/jira/browse/SOLR-13438 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: newdev > > Current user experience: > # User creates a collection (without specifying configset), and makes some > schema/config changes. > # He's/She's not happy with how the changes turned out, so he/she deletes and > re-creates the collection. > # He/she observes that the previously made settings changes persist. If > he/she is only aware of Schema and Config APIs and not explicitly aware of > the concept of configsets, this will be un-intuitive for him/her. > Proposed: > DELETE collection should delete the configset if it has the prefix > ".AUTOCREATED" and that configset isn't being shared by any other collection.