[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-07-25 Thread Alan Woodward (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164855#comment-17164855
 ] 

Alan Woodward commented on LUCENE-9439:
---

If the issue is just for term queries against fields without stored positions, 
maybe the simplest thing to do is to allow MatchesIterator to return -1 for 
positions?  I tried removing the `hasPositions` checks in TermWeight, 
SynonymWeight and MultiTermQueryConstantScoreWrapper, and then altering the 
check in TestMatchesIterator#checkNoPositionsMatches to assert that we get an 
iterator, but that its position is -1, and things seem to work.
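
As a rough illustration of the idea (not the attached patch), the sketch below models an iterator over a single term match in a field indexed without positions. `PositionlessMatch` and all of its method names are hypothetical stand-ins loosely modelled on a match iterator, not Lucene's actual API; the only point is that the iterator can still be advanced once and then reports -1 for positions.

```java
// Hypothetical stand-in for a match iterator; NOT Lucene's MatchesIterator.
final class PositionlessMatch {
  private final String field; // field the match occurred on
  private final String term;  // term that matched
  private boolean consumed = false;
  private boolean positioned = false;

  PositionlessMatch(String field, String term) {
    this.field = field;
    this.term = term;
  }

  /** Advances to the single match; returns false once it has been consumed. */
  boolean next() {
    positioned = !consumed;
    consumed = true;
    return positioned;
  }

  /** No positions were indexed, so -1 is reported instead of a real position. */
  int startPosition() {
    checkPositioned();
    return -1;
  }

  int endPosition() {
    checkPositioned();
    return -1;
  }

  String field() {
    return field;
  }

  /** Only valid while the iterator is positioned on the match. */
  String term() {
    checkPositioned();
    return term;
  }

  private void checkPositioned() {
    if (!positioned) {
      throw new IllegalStateException("iterator is not positioned on a match");
    }
  }
}
```

A caller would treat -1 as "matched, but no position available", much as it would for a postings enumeration on a field without positions.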

> Matches API should enumerate hit fields that have no positions (no iterator)
> 
>
> Key: LUCENE-9439
> URL: https://issues.apache.org/jira/browse/LUCENE-9439
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
> Attachments: LUCENE-9439.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I have been fiddling with Matches API and it's great. There is one corner 
> case that doesn't work for me though -- queries that affect fields without 
> positions return {{MatchesUtil.MATCH_WITH_NO_TERMS}} but this constant is 
> problematic as it doesn't carry the field name that caused it (returns null).
> The associated fromSubMatches combines all these constants into one (or 
> swallows them) which is another problem.
> I think it would be more consistent if MATCH_WITH_NO_TERMS was replaced with 
> a true match (carrying field name) returning an empty iterator (or a constant 
> "empty" iterator NO_TERMS).
> I have a very compelling use case: I wrote an "auto-highlighter" that runs on 
> top of Matches API and automatically picks up query-relevant fields and 
> snippets. Everything works beautifully except for cases where fields are 
> searchable but don't have any positions (token-like fields).
> I can work on a patch but wanted to reach out first - [~romseygeek]?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-07-25 Thread Alan Woodward (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-9439:
--
Attachment: LUCENE-9439.patch




[jira] [Comment Edited] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-07-25 Thread Alan Woodward (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164855#comment-17164855
 ] 

Alan Woodward edited comment on LUCENE-9439 at 7/25/20, 10:03 AM:
--

If the issue is just for term queries against fields without stored positions, 
maybe the simplest thing to do is to allow MatchesIterator to return -1 for 
positions?  I tried removing the `hasPositions` checks in TermWeight, 
SynonymWeight and MultiTermQueryConstantScoreWrapper, and then altering the 
check in TestMatchesIterator#checkNoPositionsMatches to assert that we get an 
iterator, but that its position is -1, and things seem to work.

edit: uploaded a patch to illustrate


was (Author: romseygeek):
If the issue is just for term queries against fields without stored positions, 
maybe the simplest thing to do is to allow MatchesIterator to return -1 for 
positions?  I tried removing the `hasPositions` checks in TermWeight, 
SynonymWeight and MultiTermQueryConstantScoreWrapper, and then altering the 
check in TestMatchesIterator#checkNoPositionsMatches to assert that we get an 
iterator, but that it's position is -1, and things seem to work.




[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-07-25 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164957#comment-17164957
 ] 

Dawid Weiss commented on LUCENE-9439:
-

Hmmm. I don't think it is a good idea. If a given query matched but has no 
positions, then an empty iterator seems to me like a better choice than one 
returning useless data. The key is also to return the fields on which the match 
occurred - these are currently not set correctly. 
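
To make the alternative concrete, here is a minimal sketch of a per-document match record that always carries the field name and hands back an empty iterator when the field has no positions. `FieldMatches` and its methods are hypothetical and only model the shape of the proposal; they are not Lucene's Matches class.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of per-document matches keyed by field name.
final class FieldMatches {
  // For each matching field: its match positions (empty if none were indexed).
  private final Map<String, int[]> positionsByField = new LinkedHashMap<>();

  /** Records a match on a field; pass no positions for a positionless field. */
  void addMatch(String field, int... positions) {
    positionsByField.put(field, positions);
  }

  /** Every field that matched, including fields without positions. */
  Set<String> matchedFields() {
    return Collections.unmodifiableSet(positionsByField.keySet());
  }

  /** Null if the field did not match; an empty iterator if it matched without positions. */
  Iterator<Integer> positions(String field) {
    int[] positions = positionsByField.get(field);
    if (positions == null) {
      return null;
    }
    return Arrays.stream(positions).iterator();
  }
}
```

With this shape a highlighter can still enumerate every field that contributed to the hit, and simply finds nothing to iterate over for token-like fields.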




[GitHub] [lucene-solr] limingnihao closed pull request #1690: SOLR-14607: LTR Query, timeAllowed parameter causes a timeout exception with no result

2020-07-25 Thread GitBox


limingnihao closed pull request #1690:
URL: https://github.com/apache/lucene-solr/pull/1690


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [lucene-solr] limingnihao opened a new pull request #1693: SOLR-14607: LTR Query, timeAllowed parameter causes a timeout exception with no result

2020-07-25 Thread GitBox


limingnihao opened a new pull request #1693:
URL: https://github.com/apache/lucene-solr/pull/1693


   …on with no result
   
   
   
   
   # Description
   When timeAllowed is set, SolrQueryTimeoutImpl checks for a timeout each 
time a term is loaded. Once the time limit is exceeded, an 
ExitingReaderException is thrown.
   During scoreFeatures in LTRQuery, ExitingReaderException can occur in two 
stages:
   1. scorer: occurs when a term needs to be loaded for the LeafReaderContext 
while creating the Weight.
   2. score: occurs when a term needs to be loaded because some function calls 
getValue, for example FloatPayloadValueSource.
   So the code can handle this ExitingReaderException and return partial 
results, which is better than returning empty results.
   
   # Solution
   1. In the scorer process, catch ExitingReaderException and return the 
currently loaded document.
   2. In the score process, catch ExitingReaderException and return the 
currently calculated document.
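
   The approach described in the two steps above can be sketched roughly as follows. This is not the code from the pull request: `TimeExceededException` is a hypothetical stand-in for ExitingReaderException, and `PartialScoring` is an invented helper that only shows the "catch the timeout and keep what was computed so far" pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntToDoubleFunction;

// Hypothetical stand-in for Lucene's ExitingReaderException.
final class TimeExceededException extends RuntimeException {
  TimeExceededException(String message) {
    super(message);
  }
}

final class PartialScoring {
  /** Scores documents in order; on timeout, returns the scores computed so far. */
  static List<Double> scoreUntilTimeout(int[] docIds, IntToDoubleFunction scorer) {
    List<Double> scores = new ArrayList<>();
    try {
      for (int docId : docIds) {
        // The scorer may throw once the allowed time has been exceeded.
        scores.add(scorer.applyAsDouble(docId));
      }
    } catch (TimeExceededException e) {
      // Keep what has been computed instead of failing the whole request.
    }
    return scores;
  }
}
```

   A real implementation would presumably also flag the response as partial so that callers can tell a truncated result from a complete one.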
   
   # Tests
   Simulate an ExitingReaderException in the scorer stage and verify that the 
documents loaded so far are returned.
   Simulate an ExitingReaderException in the score stage of a Feature and 
verify that the documents scored so far are returned.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `ant precommit` and the appropriate test suite.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   






[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-07-25 Thread Alan Woodward (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165049#comment-17165049
 ] 

Alan Woodward commented on LUCENE-9439:
---

I wouldn't call it useless data, it works in the same way that PostingsEnum 
does.  And an empty iterator won't let you call getQuery() on it, as you can 
never position it.




[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-07-25 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165112#comment-17165112
 ] 

Dawid Weiss commented on LUCENE-9439:
-

Let me return to it on Monday, Alan. I am away for the weekend. I still think 
conceptually match position iterator returning -1 doesn't make much sense, at 
least from the use case of highlighting. This is different from offsets which 
may not be available (but you still have positions then).




[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-07-25 Thread GitBox


noblepaul commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r460459656



##
File path: solr/core/src/java/org/apache/solr/cloud/gumi/Shard.java
##
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cloud.gumi;
+
+import java.util.Set;
+
+/**
+ * Shard in a {@link SolrCollection}, i.e. a subset of the data indexed in that collection.
+ */
+public interface Shard {
+  /**
+   * 0 numbered index of the {@link Shard} in the {@link SolrCollection}.
+   */
+  int getShardIndex();

Review comment:
   Shards do not have an index. This method must not exist.








[jira] [Created] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-07-25 Thread Noble Paul (Jira)
Noble Paul created SOLR-14680:
-

 Summary: Provide simple interfaces to our concrete SolrCloud 
classes
 Key: SOLR-14680
 URL: https://issues.apache.org/jira/browse/SOLR-14680
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul
Assignee: Noble Paul


All our current implementations of SolrCLoud such as 
# ClusterState
# DocCollection
# Slice
# Replica
etc are concrete classes. Providing alternate implementations or wrappers is 
extremely difficult. 

SOLR-14613 is attempting to create  such interfaces to make their sdk simpler
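
As a loose sketch of why an interface helps here: once callers depend on a small interface instead of a concrete class, a wrapper or alternate implementation is only a few lines. All names below (`ReplicaInfo`, `ConcreteReplica`, `RelocatedReplica`) are hypothetical and are not the interfaces proposed in SOLR-14613 or in this issue.

```java
// Hypothetical read-only view of a replica.
interface ReplicaInfo {
  String name();

  String nodeName();
}

// Stand-in for an existing concrete class that callers depend on directly today.
final class ConcreteReplica implements ReplicaInfo {
  private final String name;
  private final String nodeName;

  ConcreteReplica(String name, String nodeName) {
    this.name = name;
    this.nodeName = nodeName;
  }

  @Override
  public String name() {
    return name;
  }

  @Override
  public String nodeName() {
    return nodeName;
  }
}

// A wrapper that overrides one property and delegates everything else.
final class RelocatedReplica implements ReplicaInfo {
  private final ReplicaInfo delegate;
  private final String newNodeName;

  RelocatedReplica(ReplicaInfo delegate, String newNodeName) {
    this.delegate = delegate;
    this.newNodeName = newNodeName;
  }

  @Override
  public String name() {
    return delegate.name();
  }

  @Override
  public String nodeName() {
    return newNodeName;
  }
}
```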






[jira] [Updated] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-07-25 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14680:
--
Description: 
All our current implementations of SolrCloud such as 
# ClusterState
# DocCollection
# Slice
# Replica
etc are concrete classes. Providing alternate implementations or wrappers is 
extremely difficult. 

SOLR-14613 is attempting to create  such interfaces to make their sdk simpler

The objective is not to have a comprehensive set of methods in these 
interfaces. We will start out with a subset of required interfaces. We 
guarantee is that these interfaces will not delete/change signatures of methods 
in these interfaces. But we may add more methods as and when it suits us

  was:
All our current implementations of SolrCLoud such as 
# ClusterState
# DocCollection
# Slice
# Replica
etc are concrete classes. Providing alternate implementations or wrappers is 
extremely difficult. 

SOLR-14613 is attempting to create  such interfaces to make their sdk simpler

The objective is not to have a comprehensive set of methods in these 
interfaces. We will start out with a subset of required interfaces. We 
guarantee is that these interfaces will not delete/change signatures of methods 
in these interfaces. But we may add more methods as and when it suits us


> Provide simple interfaces to our concrete SolrCloud classes
> ---
>
> Key: SOLR-14680
> URL: https://issues.apache.org/jira/browse/SOLR-14680
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>
> All our current implementations of SolrCloud such as 
> # ClusterState
> # DocCollection
> # Slice
> # Replica
> etc are concrete classes. Providing alternate implementations or wrappers is 
> extremely difficult. 
> SOLR-14613 is attempting to create  such interfaces to make their sdk simpler
> The objective is not to have a comprehensive set of methods in these 
> interfaces. We will start out with a subset of required interfaces. We 
> guarantee is that these interfaces will not delete/change signatures of 
> methods in these interfaces. But we may add more methods as and when it suits 
> us






[jira] [Updated] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-07-25 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14680:
--
Description: 
All our current implementations of SolrCLoud such as 
# ClusterState
# DocCollection
# Slice
# Replica
etc are concrete classes. Providing alternate implementations or wrappers is 
extremely difficult. 

SOLR-14613 is attempting to create  such interfaces to make their sdk simpler

The objective is not to have a comprehensive set of methods in these 
interfaces. We will start out with a subset of required interfaces. We 
guarantee is that these interfaces will not delete/change signatures of methods 
in these interfaces. But we may add more methods as and when it suits us

  was:
All our current implementations of SolrCLoud such as 
# ClusterState
# DocCollection
# Slice
# Replica
etc are concrete classes. Providing alternate implementations or wrappers is 
extremely difficult. 

SOLR-14613 is attempting to create  such interfaces to make their sdk simpler


> Provide simple interfaces to our concrete SolrCloud classes
> ---
>
> Key: SOLR-14680
> URL: https://issues.apache.org/jira/browse/SOLR-14680
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>
> All our current implementations of SolrCLoud such as 
> # ClusterState
> # DocCollection
> # Slice
> # Replica
> etc are concrete classes. Providing alternate implementations or wrappers is 
> extremely difficult. 
> SOLR-14613 is attempting to create  such interfaces to make their sdk simpler
> The objective is not to have a comprehensive set of methods in these 
> interfaces. We will start out with a subset of required interfaces. We 
> guarantee is that these interfaces will not delete/change signatures of 
> methods in these interfaces. But we may add more methods as and when it suits 
> us






[jira] [Updated] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-07-25 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14680:
--
Description: 
All our current implementations of SolrCloud such as 
# ClusterState
# DocCollection
# Slice
# Replica
etc are concrete classes. Providing alternate implementations or wrappers is 
extremely difficult. 

SOLR-14613 is attempting to create  such interfaces to make their sdk simpler

The objective is not to have a comprehensive set of methods in these 
interfaces. We will start out with a subset of required interfaces. Our 
guarantee is that signatures of methods in these interfaces will not be 
deleted/changed. But we may add more methods as and when it suits us.

  was:
All our current implementations of SolrCloud such as 
# ClusterState
# DocCollection
# Slice
# Replica
etc are concrete classes. Providing alternate implementations or wrappers is 
extremely difficult. 

SOLR-14613 is attempting to create  such interfaces to make their sdk simpler

The objective is not to have a comprehensive set of methods in these 
interfaces. We will start out with a subset of required interfaces. We 
guarantee is that these interfaces will not delete/change signatures of methods 
in these interfaces. But we may add more methods as and when it suits us


> Provide simple interfaces to our concrete SolrCloud classes
> ---
>
> Key: SOLR-14680
> URL: https://issues.apache.org/jira/browse/SOLR-14680
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>
> All our current implementations of SolrCloud such as 
> # ClusterState
> # DocCollection
> # Slice
> # Replica
> etc are concrete classes. Providing alternate implementations or wrappers is 
> extremely difficult. 
> SOLR-14613 is attempting to create  such interfaces to make their sdk simpler
> The objective is not to have a comprehensive set of methods in these 
> interfaces. We will start out with a subset of required interfaces. Our 
> guarantee is that signatures of methods in these interfaces will not be 
> deleted/changed. But we may add more methods as and when it suits us.






[GitHub] [lucene-solr] noblepaul opened a new pull request #1694: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes

2020-07-25 Thread GitBox


noblepaul opened a new pull request #1694:
URL: https://github.com/apache/lucene-solr/pull/1694


   






[GitHub] [lucene-solr] noblepaul commented on pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-07-25 Thread GitBox


noblepaul commented on pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#issuecomment-663920053


   Please refer to #1694 as a separate effort to make this simpler.
   
   I would prefer not to give names like `gumi` anywhere. Let's use simple 
English terms like `Assign`, `AssignStrategy` and the package name can be 
`assign`.






[jira] [Updated] (SOLR-11868) Deprecate CloudSolrClient.setIdField, use information from Zookeeper

2020-07-25 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-11868:
--
Summary: Deprecate CloudSolrClient.setIdField, use information from 
Zookeeper  (was: CloudSolrClient.setIdField is confusing, it's really the 
routing field. Should be deprecated.)

> Deprecate CloudSolrClient.setIdField, use information from Zookeeper
> 
>
> Key: SOLR-11868
> URL: https://issues.apache.org/jira/browse/SOLR-11868
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 7.2
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> IIUC idField has nothing to do with the <uniqueKey> field. It's really
> the field used to route documents. Agreed, this is often the "id"
> field, but still
> In fact, over in UpdateRequest.getRoutes(), it's passed as the "id"
> field to router.getTargetSlice() and just works, even though
> getTargetSlice is clearly designed to route on a field other than the
> <uniqueKey> if we didn't just pass null as the "route" param.
> The confusing bit is that if I have a route field defined for my
> collection and want to use CloudSolrClient I have to figure out that I
> need to use the setIdField method to use that field for routing.
>  
> We should deprecate setIdField and refactor how this is used (i.e. 
> getRoutes). Need to beef up tests too I suspect.






[jira] [Commented] (LUCENE-9424) Have a warning comment for AttributeSource.captureState

2020-07-25 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165140#comment-17165140
 ] 

Michael McCandless commented on LUCENE-9424:


Thanks [~zhai7631], the patch looks good.  I'll push soon!

> Have a warning comment for AttributeSource.captureState
> ---
>
> Key: LUCENE-9424
> URL: https://issues.apache.org/jira/browse/LUCENE-9424
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/javadocs
>Reporter: Haoyu Zhai
>Priority: Trivial
> Attachments: LUCENE-9424.patch
>
>
> {{AttributeSource.captureState}} is a powerful method that can be used to 
> store and (later on) restore the current state, but it comes with a cost of 
> copying all attributes in this source and sometimes can be a big cost if 
> called multiple times.
> We could probably add a warning to indicate this cost, as this method is 
> encapsulated quite well and sometimes people who use it won't be aware of the 
> cost.
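
The cost described above can be pictured with a toy model. `ToyAttributeSource` below is a hypothetical, heavily simplified stand-in (Lucene's AttributeSource works differently internally); it only shows that every capture copies all attributes, so repeated captures multiply the work.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; not Lucene's AttributeSource.
final class ToyAttributeSource {
  private final Map<String, String> attributes = new HashMap<>();
  private int copiedValues = 0; // how many attribute values have been copied so far

  void set(String name, String value) {
    attributes.put(name, value);
  }

  String get(String name) {
    return attributes.get(name);
  }

  /** Snapshots all attributes; the cost grows with the number of attributes. */
  Map<String, String> captureState() {
    copiedValues += attributes.size();
    return new HashMap<>(attributes);
  }

  /** Replaces the current attributes with a previously captured snapshot. */
  void restoreState(Map<String, String> state) {
    attributes.clear();
    attributes.putAll(state);
  }

  int copiedValues() {
    return copiedValues;
  }
}
```

In this model, capturing inside a per-token loop would copy every attribute once per token, which is the kind of hidden cost a javadoc warning could point out.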






[jira] [Commented] (SOLR-12845) Add a default cluster policy

2020-07-25 Thread Varun Thacker (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-12845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165144#comment-17165144
 ] 

Varun Thacker commented on SOLR-12845:
--

Hi Houston,

Can you post the results of the test you ran with Ishan's script? As AB 
mentioned on Slack, why was master faster below 80 collections? I'm wondering 
whether you got similar results.

> Have we agreed to revert this commit on 8_6 and 8x? 

AB also mentioned that we should remove these rules. I agree, given that the 
autoscaling framework is going away (SOLR-14656).

 

> Add a default cluster policy
> 
>
> Key: SOLR-12845
> URL: https://issues.apache.org/jira/browse/SOLR-12845
> Project: Solr
>  Issue Type: Improvement
>  Components: AutoScaling
>Reporter: Shalin Shekhar Mangar
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-12845.patch, SOLR-12845.patch, Screenshot from 
> 2020-07-18 21-07-34.png
>
>
> [~varunthacker] commented on SOLR-12739:
> bq. We should also ship with some default policies - "Don't allow more than 
> one replica of a shard on the same JVM" , "Distribute cores across the 
> cluster evenly" , "Distribute replicas per collection across the nodes"
> This issue is about adding these defaults. I propose the following as default 
> cluster policy:
> {code}
> # Each shard cannot have more than one replica on the same node if possible
> {"replica": "<2", "shard": "#EACH", "node": "#ANY", "strict":false}
> # Each collection's replicas should be equally distributed amongst nodes
> {"replica": "#EQUAL", "node": "#ANY", "strict":false} 
> # All cores should be equally distributed amongst nodes
> {"cores": "#EQUAL", "node": "#ANY", "strict":false}
> {code}
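
To make the intent of the first and third rules concrete, here is a minimal Python sketch. It is not Solr's policy engine: `shard_rule_violations`, `core_spread`, and the sample layout are hypothetical, and reading `#EQUAL` as "as even as possible" is an assumption. Since the rules are declared with `"strict":false`, the results are best read as preferences that were not met, not as hard failures.

```python
from collections import Counter


def shard_rule_violations(placements, max_per_node=1):
    """Return (collection, shard, node) keys holding more than max_per_node replicas."""
    counts = Counter(placements)
    return sorted(key for key, n in counts.items() if n > max_per_node)


def core_spread(placements, nodes):
    """Difference between the most and least loaded node, one core per replica."""
    per_node = Counter(node for _, _, node in placements)
    loads = [per_node.get(node, 0) for node in nodes]
    return max(loads) - min(loads)


# Hypothetical layout: one (collection, shard, node) tuple per replica.
placements = [
    ("books", "shard1", "node1"),
    ("books", "shard1", "node1"),  # second replica of the same shard on node1
    ("books", "shard2", "node2"),
    ("books", "shard2", "node3"),
]
nodes = ["node1", "node2", "node3"]

violations = shard_rule_violations(placements)
spread = core_spread(placements, nodes)
```

Here `violations` lists the shard that has two replicas on one node (the `"replica": "<2"` preference), and `spread` reports how uneven the core counts are across nodes.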






[jira] [Comment Edited] (SOLR-12845) Add a default cluster policy

2020-07-25 Thread Varun Thacker (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-12845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165144#comment-17165144
 ] 

Varun Thacker edited comment on SOLR-12845 at 7/26/20, 1:27 AM:


Hi Houston,

Can you post the results of the test you ran with Ishan's script? As AB 
mentioned on Slack, why was master faster below 80 collections? I'm wondering 
whether you got similar results.

> Have we agreed to revert this commit on 8_6 and 8x? 

AB also mentioned that we should remove these rules. I agree, given that the 
autoscaling framework is going away (SOLR-14656).

 


was (Author: varunthacker):
Hi Houston,

Can you post the results of the test you ran with Ishan's script? As AB 
mentioned on Slack, why was master faster below 80 collections? I'm wondering 
whether you got similar results.

> Have we agreed to revert this commit on 8_6 and 8x? 

AB also mentioned that we should remove these rules. I agree, given that the 
autoscaling framework is going away (SOLR-14656).

 

> Add a default cluster policy
> 
>
> Key: SOLR-12845
> URL: https://issues.apache.org/jira/browse/SOLR-12845
> Project: Solr
>  Issue Type: Improvement
>  Components: AutoScaling
>Reporter: Shalin Shekhar Mangar
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-12845.patch, SOLR-12845.patch, Screenshot from 
> 2020-07-18 21-07-34.png
>
>
> [~varunthacker] commented on SOLR-12739:
> bq. We should also ship with some default policies - "Don't allow more than 
> one replica of a shard on the same JVM" , "Distribute cores across the 
> cluster evenly" , "Distribute replicas per collection across the nodes"
> This issue is about adding these defaults. I propose the following as default 
> cluster policy:
> {code}
> # Each shard cannot have more than one replica on the same node if possible
> {"replica": "<2", "shard": "#EACH", "node": "#ANY", "strict":false}
> # Each collection's replicas should be equally distributed amongst nodes
> {"replica": "#EQUAL", "node": "#ANY", "strict":false} 
> # All cores should be equally distributed amongst nodes
> {"cores": "#EQUAL", "node": "#ANY", "strict":false}
> {code}






[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-07-25 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165151#comment-17165151
 ] 

Mark Robert Miller commented on SOLR-14636:
---

I am narrowing in on the non-Nightly test run I'd like to see. Here is the 
latest from the 1-CPU VM on the Mac, starting from a Solr 'gradlew clean' state:
{noformat}
The slowest tests (exceeding 500 ms) during this run:
   4.29s TestStressLucene.testStressLuceneNRT (:solr:core)
   3.51s VersionInfoTest.testMaxDocValuesVersionFromIndex (:solr:core)
   3.12s TestSolrJErrorHandling.testHttpURLConnection (:solr:solrj)
   2.63s TestTimeSource.testEpochTime (:solr:solrj)
   2.62s TestDeleteCollectionOnDownNodes.deleteCollectionWithDownNodes (:solr:core)
   2.45s TestSolrJErrorHandling.testWithXml (:solr:solrj)
   2.37s TestRealTimeGet.testStressGetRealtime (:solr:core)
   2.33s TestRandomDVFaceting.testRandomFaceting (:solr:core)
   2.30s TestSimpleTrackingShardHandler.testSolrXmlOverrideAndCorrectShardHandler (:solr:core)
   2.10s TestReqParamsAPI.test (:solr:core)
BUILD SUCCESSFUL in 8m 51s
{noformat}

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
> Attachments: IMG_5575 (1).jpg
>
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *speed*: ludicrous
> {color:#de350b}NOTE: Just entered a period of likely instability as I clear 
> out the new room of zombies.{color}
> *tests***:
>  * *core*: {color:#00875a}*solid*{color} with *{color:#de350b}ignores{color}*
>  * *solrj*: {color:#00875a}*solid*{color} with {color:#de350b}*ignores*{color}
>  * *test-framework*: {color:#00875a}*solid*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/analysis-extras*: {color:#00875a}*solid*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/analytics*: {color:#00875a}*solid*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/clustering*: {color:#00875a}*solid*{color} with 
> *{color:#de350b}ignores{color}*
>  * *contrib/dataimporthandler*: {color:#00875a}*solid*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/dataimporthandler-extras*: {color:#00875a}*solid*{color} with 
> *{color:#de350b}ignores{color}*
>  * *contrib/extraction*: {color:#00875a}*solid*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/jaegertracer-configurator*: {color:#00875a}*solid*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/langid*: {color:#00875a}*solid*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/prometheus-exporter*: {color:#00875a}*solid*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/velocity*: {color:#00875a}*solid*{color} with 
> {color:#de350b}*ignores*{color}
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[GitHub] [lucene-solr] dsmiley commented on pull request #1672: SOLR-14651: Metrics History could disable better

2020-07-25 Thread GitBox


dsmiley commented on pull request #1672:
URL: https://github.com/apache/lucene-solr/pull/1672#issuecomment-663935133


   I see @markrmiller did something similar at about the same time: 
https://github.com/apache/lucene-solr/commit/a7b9847a7e98c27d74b13ac0a46d3e36945fbba3
 but that commit disables metrics history more completely, not just the RRD 
aspect.  It is also purely for tests, whereas the PR here has a production 
aspect if you choose to configure it to disable the RRD.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


