[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
sigram commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467387315 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementException.java ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Exception thrown by a {@link PlacementPlugin} when it is unable to compute placement for whatever reason (except an Review comment: Some kind of policy violation comes to mind, but you're right, it's too vague. OTOH with the Policy engine these kinds of exceptions were invaluable for tracking the reasons for failures (policy rule violation vs. other types of low-level errors), and in that case we could usually get detailed and well-structured (JSON maps) errors. Something to keep in mind. ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Factory used by the plugin to create property keys to request property values from Solr. + * + * Building of a {@link PropertyKey} requires specifying the target (context) from which the value of that key should be + * obtained. This is done by specifying the appropriate {@link PropertyValueSource}. + * For clarity, when only a single type of target is acceptable, the corresponding subtype of {@link PropertyValueSource} is used instead + * (for example {@link Node}). + */ +public interface PropertyKeyFactory { + /** + * Returns a property key to request the number of cores on a {@link Node}. + */ + PropertyKey createCoreCountKey(Node node); Review comment: Ok. ## File path: solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * Request for creating one or more {@link Replica}'s for one or more {@link Shard}'s of an existing {@link SolrCollection}. + * The shard might or might not already exist, plugin code can easily find out by using {@link SolrCollection#getShards()} + * and verifying if the shard name(s) from {@link #getShardNames()} are there. + * + * As opposed to {@link CreateNewCollectionRequest}, the set of {@link Node}s on which the replicas should be placed + * is specified (defaults to being equal to the set returned by {@link Cluster#getLiveNodes()}). + * + * There is no extension between this interface and {@link CreateNewCollectionReq
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467453245 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java ## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Placement decision for a single {@link Replica}. Note this placement decision is used as part of a {@link WorkOrder}, + * it does not directly lead to the plugin code getting a corresponding {@link Replica} instance, nor does it require the + * plugin to provide a {@link Shard} instance (the plugin code gets such instances for existing replicas and shards in the + * cluster but does not create them directly for adding new replicas for new or existing shards). + * + * Captures the {@link Shard} (via the shard name), {@link Node} and {@link Replica.ReplicaType} of a Replica to be created. + */ +public interface ReplicaPlacement { Review comment: Ok, added. Note that the interfaces in `org.apache.solr.cluster.placement` only condition what the plugin code is seeing, not what's present in the objects. 
The `Request` is present in the `WorkOrder` instances and I added it to the interface, but even if it's not in the interface, Solr side code would have it. My thinking (by not putting it in the interface) is that plugin code is creating these instances so it should know which instance is which and doesn't need a method on that instance to find out. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467453739 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlugin.java ## @@ -0,0 +1,41 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Implemented by external plugins to control replica placement and movement on the search cluster (as well as other things + * such as cluster elasticity?) when cluster changes are required (initiated elsewhere, most likely following a Collection + * API call). + */ +public interface PlacementPlugin { Review comment: I believe Solr side would keep the configuration (map) somewhere and just pass a reference to it on every call to the plugin. So cost would be minimal. If the plugin code needs to retrieve a config value, it will basically do a `get` on that map, very low cost here as well. I don't see how a plugin can keep state between invocations with a semantic that's different from a static field, unless we create a notion of higher level context shared between invocations (and different such higher level contexts will not be shared). 
I suggest we skip that aspect for now and reconsider later when we implement real plugins. For now I believe the model in which the plugin is instantiated (no arg constructor, or constructor getting the config map or something simple like that) before every call to the computePlacement method would be sufficient.
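The instantiation model described above can be sketched as follows. This is an illustrative sketch only: Solr holds the parsed configuration map and hands a reference to a freshly constructed plugin before each placement computation, so the per-call cost is a reference copy plus map lookups. The class name `SketchPlacementPlugin` and the `String`-based `computePlacement` signature are invented for the example and are not the actual SOLR-14613 interfaces.

```java
import java.util.Map;

// Solr side keeps the config map; the plugin just holds a shared reference to it.
class SketchPlacementPlugin {
    private final Map<String, String> config; // reference passed in by Solr, cheap

    SketchPlacementPlugin(Map<String, String> config) {
        this.config = config;
    }

    String computePlacement(String shardName) {
        // Reading a config value is a single map get: very low cost per invocation.
        String node = config.getOrDefault("preferredNode", "any");
        return shardName + " -> " + node;
    }
}
```

Because the plugin is re-instantiated before every call, any state it wants to keep between invocations would indeed have to live in a static field or in some higher-level shared context, as the comment notes.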
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467454703 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * Request for creating one or more {@link Replica}'s for one or more {@link Shard}'s of an existing {@link SolrCollection}. + * The shard might or might not already exist, plugin code can easily find out by using {@link SolrCollection#getShards()} + * and verifying if the shard name(s) from {@link #getShardNames()} are there. + * + * As opposed to {@link CreateNewCollectionRequest}, the set of {@link Node}s on which the replicas should be placed + * is specified (defaults to being equal to the set returned by {@link Cluster#getLiveNodes()}). + * + * There is no extension between this interface and {@link CreateNewCollectionRequest} in either direction + * or from a common ancestor for readability. An ancestor could make sense and would be an "abstract interface" not intended + * to be implemented directly, but this does not exist in Java. 
+ * + * Plugin code would likely treat the two types of requests differently since here existing {@link Replica}'s must be taken + * into account for placement whereas in {@link CreateNewCollectionRequest} no {@link Replica}'s are assumed to exist. + */ +public interface AddReplicasRequest extends Request { + /** + * The {@link SolrCollection} to add {@link Replica}(s) to. The replicas are to be added to a shard that might or might + * not yet exist when the plugin's {@link PlacementPlugin#computePlacement} is called. + */ + SolrCollection getCollection(); + + /** + * Shard name(s) for which new replicas placement should be computed. The shard(s) might exist or not (that's why this + * method returns a {@link Set} of {@link String}'s and not directly a set of {@link Shard} instances). + * + * Note the Collection API allows specifying the shard name or a {@code _route_} parameter. The Solr implementation will + * convert either specification into the relevant shard name so the plugin code doesn't have to worry about this. + */ + Set getShardNames(); + + /** Replicas should only be placed on nodes from the set returned by this method. */ + Set getTargetNodes(); Review comment: Used work order -> placement plan
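Since the two request interfaces deliberately share no placement-specific ancestor, plugin code has to branch on the concrete type explicitly. A minimal sketch of that dispatch (using `instanceof`; a visitor would also work) is below. The interface names mirror the proposal, but the empty bodies and the `describe()` helper are purely illustrative.

```java
// Marker interfaces standing in for the proposal's request types.
interface Request {}
interface AddReplicasRequest extends Request {}
interface CreateNewCollectionRequest extends Request {}

class PlacementDispatch {
    static String describe(Request request) {
        if (request instanceof AddReplicasRequest) {
            // Existing replicas must be taken into account for placement.
            return "add replicas, accounting for existing ones";
        } else if (request instanceof CreateNewCollectionRequest) {
            // No replicas are assumed to exist yet.
            return "place replicas for a brand-new collection";
        }
        throw new IllegalArgumentException("unsupported request type: " + request);
    }
}
```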
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467455202 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java Review comment: And also `Request` -> `PlacementRequest`
[jira] [Commented] (SOLR-13933) Cluster mode Stress test suite
[ https://issues.apache.org/jira/browse/SOLR-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173637#comment-17173637 ] Erick Erickson commented on SOLR-13933: --- BTW, this is excellent, we've needed this for a long time. > Cluster mode Stress test suite > --- > > Key: SOLR-13933 > URL: https://issues.apache.org/jira/browse/SOLR-13933 > Project: Solr > Issue Type: Bug >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > > We need a stress test harness based on 10s or 100s of nodes, 1000s of > collection API operations, overseer operations etc. This suite should run > nightly, publish results publicly, so as to help with: > # Uncover stability problems > # Benchmarking (timings, resource metrics etc.) on collection operations > # Indexing/querying performance > # Validate the accuracy of potential improvements > References: > SOLR-10317 > https://github.com/lucidworks/solr-scale-tk > https://github.com/shalinmangar/solr-perf-tools > Lucene benchmarks -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1728: SOLR-14596: equals/hashCode for common SolrRequest classes
ErickErickson commented on a change in pull request #1728: URL: https://github.com/apache/lucene-solr/pull/1728#discussion_r467457703 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/ResponseParser.java ## @@ -49,4 +52,31 @@ public String getVersion() { return "2.2"; } + + @Override + public int hashCode() { +return new HashCodeBuilder() +.append(getWriterType()) +.append(getContentType()) +.append(getVersion()) +.toHashCode(); + } + + @Override + public boolean equals(Object rhs) { +if (rhs == null || getClass() != rhs.getClass()) { + return false; +} else if (this == rhs) { + return true; +} else if (hashCode() != rhs.hashCode()) { + return false; +} + +final ResponseParser rhsCast = (ResponseParser) rhs; Review comment: I'm curious about the comparison between Objects.hash(bunch of params) and the commons builder. I'm not advocating you change to Objects.hash, rather wondering whether there's a particular advantage to using one over the other.
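For comparison with the `HashCodeBuilder` style quoted above, here is a hedged sketch of the JDK-only alternative using `Objects.hash`/`Objects.equals`. For a fixed handful of fields the two approaches are functionally interchangeable; `HashCodeBuilder` offers a fluent API and a configurable seed/multiplier, while `Objects.hash` needs no extra dependency. The `ParserKey` class below is an invented stand-in, not the real `ResponseParser`.

```java
import java.util.Objects;

class ParserKey {
    final String writerType;
    final String contentType;
    final String version;

    ParserKey(String writerType, String contentType, String version) {
        this.writerType = writerType;
        this.contentType = contentType;
        this.version = version;
    }

    @Override
    public int hashCode() {
        // Equivalent in effect to chaining HashCodeBuilder.append() over each field.
        return Objects.hash(writerType, contentType, version);
    }

    @Override
    public boolean equals(Object rhs) {
        if (this == rhs) return true;
        if (rhs == null || getClass() != rhs.getClass()) return false;
        ParserKey other = (ParserKey) rhs;
        // Objects.equals handles nulls, mirroring EqualsBuilder's null-safety.
        return Objects.equals(writerType, other.writerType)
                && Objects.equals(contentType, other.contentType)
                && Objects.equals(version, other.version);
    }
}
```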
[jira] [Created] (SOLR-14724) "multiple values encountered for non multiValued" even after adding a multiValued field
Ishan Chattopadhyaya created SOLR-14724: --- Summary: "multiple values encountered for non multiValued" even after adding a multiValued field Key: SOLR-14724 URL: https://issues.apache.org/jira/browse/SOLR-14724 Project: Solr Issue Type: Task Security Level: Public (Default Security Level. Issues are Public) Reporter: Ishan Chattopadhyaya Attachments: faulty.jsonl Steps to reproduce: # Create a new collection # Remove the dynamic field pattern for *_i # Add a multivalued field "body.algolia.children.children.created_at_i" # Index the attached document that contains multiple values for that field. # Boom! {{"ERROR: [doc=1017960] multiple values encountered for non multiValued field body.algolia.children.children.created_at_i: [1262006144, 1262024153, 1261970915, 1261970358, 1262014651, 1261971697, 1261978873]"}} {code} [ishan@pseries ~] $ curl "http://localhost:18983/solr/admin/collections?action=CREATE&name=hn2&numShards=1"; { "responseHeader":{ "status":0, "QTime":981}, "success":{ "172.17.0.2:8983_solr":{ "responseHeader":{ "status":0, "QTime":351}, "core":"hn2_shard1_replica_n1"}}, "warning":"Using _default configset. Data driven schema functionality is enabled by default, which is NOT RECOMMENDED for production use. 
To turn it off: curl http://{host:port}/solr/hn2/config -d '{\"set-user-property\": {\"update.autoCreateFields\":\"false\"}}'"} [ishan@pseries ~] $ curl -X POST -H 'Content-type:application/json' --data-binary '{ "delete-dynamic-field":{ "name":"*_s" } }' http://localhost:18983/solr/hn2/schema { "responseHeader":{ "status":0, "QTime":389}} [ishan@pseries ~] $ curl -X POST -H 'Content-type:application/json' --data-binary '{ "add-field":{ "name":"body.algolia.children.created_at_i", "type":"pint", "multiValued":true, "stored":true } }' http://localhost:18983/solr/hn2/schema { "responseHeader":{ "status":0, "QTime":360}} [ishan@pseries ~] $ curl -X POST -d @faulty.jsonl -H "Content-Type: application/json" "http://localhost:18983/solr/hn2/update/json/docs?commit=true"; { "responseHeader":{ "rf":1, "status":400, "QTime":1045}, "error":{ "metadata":[ "error-class","org.apache.solr.common.SolrException", "root-error-class","org.apache.solr.common.SolrException"], "msg":"ERROR: [doc=1017960] multiple values encountered for non multiValued field body.algolia.children.children.created_at_i: [1262006144, 1262024153, 1261970915, 1261970358, 1262014651, 1261971697, 1261978873]", "code":400}} {code}
[jira] [Comment Edited] (SOLR-13933) Cluster mode Stress test suite
[ https://issues.apache.org/jira/browse/SOLR-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173556#comment-17173556 ] Ishan Chattopadhyaya edited comment on SOLR-13933 at 8/8/20, 11:56 AM:

I am actively working on this here: https://github.com/TheSearchStack/solr-bench/tree/stress-harness

Here is a sample suite: https://github.com/TheSearchStack/solr-bench/blob/stress-harness/rolling.json
It starts 3 Solr nodes, 1GB RAM each. It has two tasks:
# task1 creates a bunch of collections and indexes some data into them.
# task2 waits for task1 to end, then performs a rolling restart of the cluster (i.e. restarts a node, waits until all replicas on the restarted node are active, and proceeds to the next node until all nodes have been restarted), measuring the timings throughout.

Here is another sample suite: https://github.com/TheSearchStack/solr-bench/blob/stress-harness/workflow.json
It performs various tasks in task1 through task5, including indexing, creating many collections, shard splitting, restarting a node, and validating the number of documents in a collection (this last one is WIP). As defined, some of these tasks run in parallel with each other; some are blocking.

It is my intention to wrap this suite up quickly and start running automated tests for a medium-scale Solr cluster (say, 50-100 nodes, ~2GB RAM each) on each of our branches, publishing results periodically. This will be especially useful for benchmarking the stability of the cluster, with SOLR-13951 and SOLR-14636.
[jira] [Resolved] (SOLR-14724) "multiple values encountered for non multiValued" even after adding a multiValued field
[ https://issues.apache.org/jira/browse/SOLR-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya resolved SOLR-14724. - Resolution: Invalid Damn, I was removing the dynamic field for "*_s" instead of "*_i". I'm closing this issue, but there's definitely something wrong with Solr. I added a field explicitly, but still dynamic field type takes preference. > "multiple values encountered for non multiValued" even after adding a > multiValued field > --- > > Key: SOLR-14724 > URL: https://issues.apache.org/jira/browse/SOLR-14724 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Priority: Major > Attachments: faulty.jsonl > > > Steps to reproduce: > # Create a new collection > # Remove the dynamic field pattern for *_i > # Add a multivalued field "body.algolia.children.children.created_at_i" > # Index the attached document that contains multiple values for that field. > # Boom! {{"ERROR: [doc=1017960] multiple values encountered for non > multiValued field body.algolia.children.children.created_at_i: [1262006144, > 1262024153, 1261970915, 1261970358, 1262014651, 1261971697, 1261978873]"}} > {code} > [ishan@pseries ~] $ curl > "http://localhost:18983/solr/admin/collections?action=CREATE&name=hn2&numShards=1"; > { > "responseHeader":{ > "status":0, > "QTime":981}, > "success":{ > "172.17.0.2:8983_solr":{ > "responseHeader":{ > "status":0, > "QTime":351}, > "core":"hn2_shard1_replica_n1"}}, > "warning":"Using _default configset. Data driven schema functionality is > enabled by default, which is NOT RECOMMENDED for production use. 
To turn it > off: curl http://{host:port}/solr/hn2/config -d '{\"set-user-property\": > {\"update.autoCreateFields\":\"false\"}}'"} > [ishan@pseries ~] $ curl -X POST -H 'Content-type:application/json' > --data-binary '{ > "delete-dynamic-field":{ "name":"*_s" } > }' http://localhost:18983/solr/hn2/schema > { > "responseHeader":{ > "status":0, > "QTime":389}} > [ishan@pseries ~] $ (reverse-i-search)`add-fi': curl -X POST -H > 'Content-type:application/json' --data-binary '{ > "add-field":{ > "name":"body.algolia.children.created_at_i", > "type":"pint", > "multiValued":true, > "stored":true } > }' http://localhost:18983/solr/hn2/schema > { > "responseHeader":{ > "status":0, > "QTime":360}} > [ishan@pseries ~] $ curl -X POST -d @faulty.jsonl -H "Content-Type: > application/json" > "http://localhost:18983/solr/hn2/update/json/docs?commit=true"; > { > "responseHeader":{ > "rf":1, > "status":400, > "QTime":1045}, > "error":{ > "metadata":[ > "error-class","org.apache.solr.common.SolrException", > "root-error-class","org.apache.solr.common.SolrException"], > "msg":"ERROR: [doc=1017960] multiple values encountered for non > multiValued field body.algolia.children.children.created_at_i: [1262006144, > 1262024153, 1261970915, 1261970358, 1262014651, 1261971697, 1261978873]", > "code":400}} > {code}
[jira] [Commented] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173643#comment-17173643 ] Ishan Chattopadhyaya commented on SOLR-13438: - Can't believe this is still not fixed. It needs to be! I was bitten by this as I was experimenting with a new dataset today: https://twitter.com/ichattopadhyaya/status/1292018719548305409. Marking this as a beginner issue. > DELETE collection should remove AUTOCREATED configsets > -- > > Key: SOLR-13438 > URL: https://issues.apache.org/jira/browse/SOLR-13438 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > > Current user experience: > # User creates a collection (without specifying configset), and makes some > schema/config changes. > # He's not happy with how the changes turned out, so he deletes and > re-creates the collection. > # He observes that the previously made settings changes persist. If he is > only aware of Schema and Config APIs and not explicitly aware of the concept > of configsets, this will be un-intuitive for him. > Proposed: > DELETE collection could have a flag that can be enabled to delete the > configset if it has the prefix ".AUTOCREATED".
[jira] [Updated] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya updated SOLR-13438: Labels: newdev (was: ) > DELETE collection should remove AUTOCREATED configsets > -- > > Key: SOLR-13438 > URL: https://issues.apache.org/jira/browse/SOLR-13438 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: newdev > > Current user experience: > # User creates a collection (without specifying configset), and makes some > schema/config changes. > # He's not happy with how the changes turned out, so he deletes and > re-creates the collection. > # He observes that the previously made settings changes persist. If he is > only aware of Schema and Config APIs and not explicitly aware of the concept > of configsets, this will be un-intuitive for him. > Proposed: > DELETE collection could have a flag that can be enabled to delete the > configset if it has the prefix ".AUTOCREATED".
[jira] [Updated] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya updated SOLR-13438: Description: Current user experience: # User creates a collection (without specifying configset), and makes some schema/config changes. # He's/She's not happy with how the changes turned out, so he/she deletes and re-creates the collection. # He/she observes that the previously made settings changes persist. If he/she is only aware of Schema and Config APIs and not explicitly aware of the concept of configsets, this will be un-intuitive for him/her. Proposed: DELETE collection should delete the configset if it has the prefix ".AUTOCREATED" and that configset isn't being shared by any other collection. was: Current user experience: # User creates a collection (without specifying configset), and makes some schema/config changes. # He's not happy with how the changes turned out, so he deletes and re-creates the collection. # He observes that the previously made settings changes persist. If he is only aware of Schema and Config APIs and not explicitly aware of the concept of configsets, this will be un-intuitive for him. Proposed: DELETE collection could have a flag that can be enabled to delete the configset if it has the prefix ".AUTOCREATED". > DELETE collection should remove AUTOCREATED configsets > -- > > Key: SOLR-13438 > URL: https://issues.apache.org/jira/browse/SOLR-13438 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: newdev > > Current user experience: > # User creates a collection (without specifying configset), and makes some > schema/config changes. > # He's/She's not happy with how the changes turned out, so he/she deletes and > re-creates the collection. > # He/she observes that the previously made settings changes persist. 
If > he/she is only aware of Schema and Config APIs and not explicitly aware of > the concept of configsets, this will be un-intuitive for him/her. > Proposed: > DELETE collection should delete the configset if it has the prefix > ".AUTOCREATED" and that configset isn't being shared by any other collection.
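The updated proposal amounts to a two-part condition on collection deletion. A minimal sketch of that rule, with hypothetical names (this is not actual Solr code, and the `.AUTOCREATED` marker check is written loosely since the ticket calls it a prefix while auto-created configset names conventionally carry it as a suffix):

```python
AUTOCREATED_MARKER = ".AUTOCREATED"

def should_delete_configset(configset_name, deleted_collection, configset_usage):
    """Decide whether DELETE collection should also remove the configset.

    configset_usage is assumed to map configset name -> set of collections
    currently using it. Delete only if the configset was auto-created AND
    no collection other than the one being deleted still uses it.
    """
    if AUTOCREATED_MARKER not in configset_name:
        return False  # user-managed configset: never touch it
    remaining = configset_usage.get(configset_name, set()) - {deleted_collection}
    return len(remaining) == 0
```

The sharing check is what distinguishes the updated description from the original flag-based proposal: an auto-created configset referenced by a second collection must survive the delete.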
[jira] [Commented] (SOLR-13933) Cluster mode Stress test suite
[ https://issues.apache.org/jira/browse/SOLR-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173645#comment-17173645 ] Ishan Chattopadhyaya commented on SOLR-13933: - bq. BTW, this is excellent, we've needed this for a long time. Thanks Erick! It is about time that we bring in tight policing to defend against attacks like SOLR-14665. > Cluster mode Stress test suite > --- > > Key: SOLR-13933 > URL: https://issues.apache.org/jira/browse/SOLR-13933 > Project: Solr > Issue Type: Bug >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > > We need a stress test harness based on 10s or 100s of nodes, 1000s of > collection API operations, overseer operations etc. This suite should run > nightly, publish results publicly, so as to help with: > # Uncover stability problems > # Benchmarking (timings, resource metrics etc.) on collection operations > # Indexing/querying performance > # Validate the accuracy of potential improvements > References: > SOLR-10317 > https://github.com/lucidworks/solr-scale-tk > https://github.com/shalinmangar/solr-perf-tools > Lucene benchmarks
[jira] [Commented] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173668#comment-17173668 ] Marcus Eagan commented on SOLR-13438: - [~ichattopadhyaya] It's a constant issue for many users. > DELETE collection should remove AUTOCREATED configsets > -- > > Key: SOLR-13438 > URL: https://issues.apache.org/jira/browse/SOLR-13438 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: newdev > > Current user experience: > # User creates a collection (without specifying configset), and makes some > schema/config changes. > # He's/She's not happy with how the changes turned out, so he/she deletes and > re-creates the collection. > # He/she observes that the previously made settings changes persist. If > he/she is only aware of Schema and Config APIs and not explicitly aware of > the concept of configsets, this will be un-intuitive for him/her. > Proposed: > DELETE collection should delete the configset if it has the prefix > ".AUTOCREATED" and that configset isn't being shared by any other collection.